FIT3084: Forms & CGI
In the previous lecture:
In this lecture:
- Working with server scripts
- Using forms to supply data
Reference:
- Stein, L.D.: How To Set Up and Maintain a Web Site, 2nd edn, Addison
Wesley 1997, Chapter 8.
What are scripts?
- Scripts are external programs run by a server in response to a request from
a web browser.
- Scripts may accept input parameters from a web browser along with the request
to be executed.
- Scripts may return output to be displayed in the browser.
- Scripts running on a web server add to the WWW the ability to synthesize
responses to changing conditions... they add a dynamic aspect to the web.
- Scripts can be written in any language, interpreted (eg. Perl) or compiled
(eg. C).
Common Gateway Interface (CGI)
- CGI is an interface or gateway between server and script.
- A gateway may link between the web server and a database search engine for
example.
- CGI compliant scripts will run on CGI compliant servers.
- Servers running on Unix, VMS, OS/2, Windows NT/95 are CGI compliant.
- Old Macintosh web servers (pre-OS X) are not CGI compliant, but simple measures
allowed scripts to run on them. Current Macintoshes running OS X may also
run the Apache web server, which is CGI compliant.
- Many scripts are front ends to UNIX programs (such as emailers and text
search engines) and so, even though the script itself may run on a non-Unix
machine, the back end program doing the work may not run (or even exist)!
- Have a look at FastCGI for an alternative that extends and enhances the CGI model:
- Enables applications to persist between client requests, eliminating application start up overhead and allowing the application to maintain state between client calls.
- Enables applications to reside on remote systems (rather than having to reside on the same system as the Web server)
A Few Examples
- A counter telling me that I have re-loaded this page times since 14 Sept 98.
- The interface on the Yellow
Pages website.
- The php web site.
Identifying Requests to Execute Scripts
- When a user requests a URL pointing to a script, the server executes the
script.
- The server can identify the URL as a script rather than a document to be
retrieved by
- the directory the URL indicates (frequently .../cgi-bin or a subdirectory)
James, how long until the <A HREF="/cgi-bin/bombTimer">
bomb detonates? </A>
- a unique file extension (frequently .cgi)
James, how long until the <A HREF="blah/bombTimer.cgi">
bomb detonates? </A>
- Frequently, scripts are authorized and installed by the system administrator
to prevent malicious, careless or ignorant folk from installing programs which
may breach security.
- Scripts may be run under a special username (eg. www) with no special privileges.
This may help prevent inadvertent or malicious damage.
- Scripts may be run within a wrapper as the user who owns the script.
Special security checks ensure the server's security is not compromised (cgiwrap
and cgiwrapd - see below for examples).
Passing 'Hard Coded' Parameters to Scripts
- Parameters may be passed to a script directly through the URL.
<A HREF="/cgi-bin/search?James%20Bond">Where is
007?</A>
- The ? is appended to the URL and precedes the query string
which constitutes the argument list passed to the script.
- The %20 escapes the space character in the search string.
- Query strings usually (not necessarily) fall into one of two formats:
- Keyword list in the form:
value1+value2+value3+...
<A HREF="/cgi-bin/search?Secret+Agent+James+Bond">
Where is 007?
</A>
This format is often used for scripts which do word searches.
- Named parameter list in the form:
name1=value1&name2=value2&name3=value3...
<A HREF = "/cgi-bin/search?job=Secret%20Agent&name1=James&name2=Bond">
Where is 007?
</A>
This format is useful for complex data where various options may or may
not be specified depending on conditions and a name must be associated
with each datum to determine its meaning.
- Path information (such as the path to a file to be searched by a script)
may be incorporated into the URL of a script by appending it to the URL.
.../cgi-bin/bombTimer/james0/bombFiles/bomb.txt
After the server has decoded the URL of the script, the additional path information
is passed to the script. (The ? and a query string may be appended
following the additional path information as usual.)
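The two query string formats above can be unpacked with a few lines of C. The following is a minimal sketch, not a complete CGI library: the function names url_decode and get_param are hypothetical, and it assumes parameter names contain no escaped characters.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Decode one URL-encoded string in place: '+' becomes a space and
   %XX becomes the byte with hexadecimal value XX. */
void url_decode(char *s)
{
    char *out = s;
    while (*s) {
        if (*s == '+') {
            *out++ = ' ';
            s++;
        } else if (*s == '%' && isxdigit((unsigned char)s[1])
                             && isxdigit((unsigned char)s[2])) {
            char hex[3] = { s[1], s[2], '\0' };
            *out++ = (char)strtol(hex, NULL, 16);
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}

/* Look up `name` in a named parameter list such as
   "job=Secret%20Agent&name1=James&name2=Bond".
   Copies the decoded value into `value` (at most size-1 characters,
   size must be at least 1) and returns 1 if found, 0 otherwise. */
int get_param(const char *query, const char *name, char *value, size_t size)
{
    size_t name_len = strlen(name);
    const char *p = query;

    while (p && *p) {
        const char *end = strchr(p, '&');
        size_t pair_len = end ? (size_t)(end - p) : strlen(p);

        if (pair_len > name_len && strncmp(p, name, name_len) == 0
                && p[name_len] == '=') {
            size_t val_len = pair_len - name_len - 1;
            if (val_len >= size)
                val_len = size - 1;      /* truncate rather than overflow */
            memcpy(value, p + name_len + 1, val_len);
            value[val_len] = '\0';
            url_decode(value);
            return 1;
        }
        p = end ? end + 1 : NULL;        /* move on to the next pair */
    }
    return 0;
}
```

A keyword list such as Secret+Agent+James+Bond can be passed straight to url_decode, while a named parameter list is searched one name at a time with get_param.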
Passing User-Specified Parameters to Scripts
- User input may be gathered using fill-out forms containing text entry boxes,
radio buttons etc.
- Scripts may create their own fill out forms...
- Script is called without parameters
- Script requires parameters so it creates an input document which is
despatched to the browser.
- User enters the required data into the input document and submits it.
- Browser calls the script, passing it the contents of the input document
as parameters.
- Script processes data and returns result.
- A custom interface may be written to such scripts by creating a fill out
form which collects the necessary data and sends it to the script as parameters
directly.
Have a look at the interface to the Altavista
search engine.
(Have a look at a query in the browser's URL entry box)
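The "script creates its own form" sequence described above might look something like this in C. This is only a sketch under stated assumptions: the function names, the page title and the single "name" field are made up for illustration, and the script simply reports the query it received rather than doing any real processing.

```c
#include <stdio.h>

/* Returns 1 when the script was called without parameters and so must
   send the fill-out form instead of a result. */
int needs_form(const char *query)
{
    return query == NULL || *query == '\0';
}

/* Print text with the characters that are special in HTML escaped,
   so user input cannot inject markup into the page. */
void print_escaped(FILE *out, const char *s)
{
    for (; *s; s++) {
        switch (*s) {
        case '<': fputs("&lt;", out);   break;
        case '>': fputs("&gt;", out);   break;
        case '&': fputs("&amp;", out);  break;
        case '"': fputs("&quot;", out); break;
        default:  fputc(*s, out);       break;
        }
    }
}

/* Write the complete CGI response for the given query string.
   script_url is the URL the form should submit back to. */
void respond(FILE *out, const char *script_url, const char *query)
{
    fputs("Content-type: text/html\n\n", out);
    fputs("<HEAD><TITLE>Agent Lookup</TITLE></HEAD>\n<BODY>\n", out);
    if (needs_form(query)) {
        /* No parameters: send the input document to the browser. */
        fputs("<FORM ACTION=\"", out);
        print_escaped(out, script_url);
        fputs("\" METHOD=GET>\n", out);
        fputs("<P>Secret Agent Name: <INPUT TYPE=\"text\" NAME=\"name\">\n", out);
        fputs("<INPUT TYPE=\"submit\" VALUE=\"Search\">\n</FORM>\n", out);
    } else {
        /* Parameters supplied: process them (here, just report them). */
        fputs("<P>Query received: ", out);
        print_escaped(out, query);
        fputs("\n", out);
    }
    fputs("</BODY>\n", out);
}
```

In a real script, main would call respond(stdout, ...) with the values of the SCRIPT_NAME and QUERY_STRING environment variables, after checking that getenv did not return NULL for the script URL.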
Front End to Named Parameter List Scripts
Here's some of the HTML...
<FORM ACTION="/cgi-bin/order_toys" METHOD=POST>
<P>Secret Agent Name:
<INPUT TYPE="text" NAME="name">
...
<INPUT TYPE="submit" VALUE="Transmit Order to HQ">
<INPUT TYPE="reset" VALUE="Eat Order">
</FORM>
- The form is marked by <FORM> tags...
- The ACTION attribute tells the browser where to send the submitted parameters.
- The METHOD attribute specifies the means by which the browser submits information
to the script. This can be one of two request methods implemented in HTTP
(see Stein p47 for further details):
- The GET command
tells the server to return an entire document to the browser. This is
the command most commonly used when retrieving data from the web. A
script call using GET is made by appending the query string to the
script's URL.
In some cases, the URL may be truncated to 255 characters. Do not use
the GET method if you have a lot of parameters to pass, or some of
the information may not get through to the script.
- The POST command
tells the server to treat a document as an executable and pass it some
information. Using this method the parameters are not appended to the
URL. They are sent to the server in the body of the request, after the
request headers.
The POST method does not suffer from the "truncation problem".
A well written script should handle both POST and GET submissions.
- The INPUT tags denote form elements (text entry boxes, push buttons etc)
- The INPUT tag of type="submit" is a button which places the form data into
a named parameter list. The parameter names are the names of the form elements,
their values are the values of the respective input elements.
- The INPUT tag of type="reset" is a button which... I wonder!?
- Check out the document source to see how some of the other elements are
described. There are more besides! (Refer to an HTML guide)
- Netscape also recognizes a mailto: URL as the value of the ACTION attribute:
<FORM ACTION="mailto:fox.mulder@fbi.org" METHOD=POST>
No prizes for guessing that on submission, this mails the contents of the
form to the address given.
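To make the "named parameter list" built on submission more concrete, here is a rough C sketch of the encoding a browser performs. It is an approximation for illustration, not a browser's actual code: the function names are hypothetical, and the "toy" field in the usage note is invented (only "name" appears in the form above).

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Append the URL-encoded form of `s` to the string in `out`, whose
   buffer holds `size` bytes. Letters, digits and - _ . are copied,
   a space becomes '+', and any other byte becomes %XX.
   Returns 0 on success, -1 if the buffer is too small. */
int append_encoded(char *out, size_t size, const char *s)
{
    size_t len = strlen(out);
    for (; *s; s++) {
        unsigned char c = (unsigned char)*s;
        if (len + 4 > size)              /* room for "%XX" plus the NUL */
            return -1;
        if (isalnum(c) || c == '-' || c == '_' || c == '.') {
            out[len++] = (char)c;
        } else if (c == ' ') {
            out[len++] = '+';
        } else {
            sprintf(out + len, "%%%02X", c);
            len += 3;
        }
        out[len] = '\0';
    }
    return 0;
}

/* Append "name=value" to the list in `out`, preceded by '&' when the
   list already holds a pair. Returns 0 on success, -1 on overflow. */
int append_pair(char *out, size_t size, const char *name, const char *value)
{
    size_t len = strlen(out);
    if (len > 0) {
        if (len + 2 > size)
            return -1;
        out[len] = '&';
        out[len + 1] = '\0';
    }
    if (append_encoded(out, size, name) != 0)
        return -1;
    len = strlen(out);
    if (len + 2 > size)
        return -1;
    out[len] = '=';
    out[len + 1] = '\0';
    return append_encoded(out, size, value);
}
```

Starting from an empty buffer, appending the pair name / "James Bond" and then toy / "pen&laser" builds a string of the same shape as the named parameter lists shown earlier, with the space and the & inside the values escaped so they cannot be confused with separators.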
Remember Clickable Image Maps?
- Originally clickable image maps were implemented using CGI scripts.
- The user clicked on an image, the x,y coordinates of the click were sent
to a script which read them and returned a URL which was then sent to the
web browser which sent the URL back to the server which returned the requested
document.
- No wonder servers began incorporating the functionality of these scripts!
- ... a scheme which was further accelerated by the client side image map!
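The core of such an image map script can be sketched in C as follows. The assumptions here are that the click arrives in the query string as "x,y", and that the hot spots are simple rectangles. The region table, its URLs and the function names are all invented for illustration.

```c
#include <stdio.h>

/* A rectangular hot spot and the document it links to. */
struct region {
    int left, top, right, bottom;
    const char *url;
};

/* Hypothetical hot spots, for illustration only. */
static const struct region regions[] = {
    {   0, 0,  99, 99, "/gadgets/watch.html" },
    { 100, 0, 199, 99, "/gadgets/car.html"   },
};

/* Parse a query string of the form "x,y".
   Returns 1 on success, 0 if the string is missing or malformed. */
int parse_click(const char *query, int *x, int *y)
{
    return query != NULL && sscanf(query, "%d,%d", x, y) == 2;
}

/* Return the URL of the first region containing the click, or a
   default document when the click falls outside every region. */
const char *lookup_region(int x, int y)
{
    size_t i;
    for (i = 0; i < sizeof regions / sizeof regions[0]; i++) {
        const struct region *r = &regions[i];
        if (x >= r->left && x <= r->right && y >= r->top && y <= r->bottom)
            return r->url;
    }
    return "/gadgets/index.html";
}

/* Write a CGI response that redirects the browser to `url`. */
void send_redirect(FILE *out, const char *url)
{
    fprintf(out, "Location: %s\n\n", url);
}
```

The Location header is how a script hands a URL back instead of a document, which matches the round trip described above: the browser receives the URL and then requests it from the server.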
So you see how simple all this CGI script stuff is... here's a really
simple CGI script written in C...
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *query = getenv("QUERY_STRING"); // NULL if the server did not set it

    printf("Content-type: text/html\n");        // tell server MIME type of returned doc.
    printf("\n");                               // blank line ends the header
    printf("<HEAD><TITLE>\n");                  // output HTML header info.
    printf("Echo Script Response\n");
    printf("</TITLE></HEAD>\n");
    printf("<BODY>\n<P>\n");                    // output HTML body echoing the
    printf("%s", query ? query : "");           // environment variable QUERY_STRING
    printf("</BODY>\n");
    return 0;
}
- The above script echoes the input sent to it.
- The script receives its input in the environment variable QUERY_STRING
after submission from a form using METHOD=GET.
- The PATH_INFO environment variable contains any path
information appended to the URL.
Let's call the script using this fill out form and cgiwrap...
Let's call the same script again (from the same form) but this time, the
ACTION attribute of the FORM tag will call the script via cgiwrapd...
An additional note for completeness...
Parameters can be read into a C program where the form applies the POST
method, like this (see the Stanford site from which this info. originates
for details):
"The POST query string is encoded in precisely the same form as the
GET query string, but instead of being passed in the URL and read into
the QUERY_STRING variable, it is given to the CGI program as standard
input, which you can thus read using ANSI functions or regular character
reading functions. The only quirk is that the server will not send
EOF at the end of the data. Instead, the size of the string is passed
in the environment variable CONTENT_LENGTH, which can be accessed using
the normal stdlib.h function:"
char *value;
int length = 0;
value = getenv("CONTENT_LENGTH");
if (value != NULL)                 /* unset when no data was posted */
    sscanf(value, "%d", &length);
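Once the length is known, the form data itself can be read from standard input. The sketch below is one way this might be done, not the method from the Stanford notes: the function names and the 64 KB limit are arbitrary choices made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Upper bound on accepted form data. The value is an arbitrary choice. */
#define MAX_FORM_BYTES 65536L

/* Convert the text of CONTENT_LENGTH to a byte count.
   Returns -1 if the text is missing, malformed, negative or too large. */
long parse_content_length(const char *text)
{
    char *end;
    long n;

    if (text == NULL || *text == '\0')
        return -1;
    n = strtol(text, &end, 10);
    if (*end != '\0' || n < 0 || n > MAX_FORM_BYTES)
        return -1;
    return n;
}

/* Read exactly `length` bytes from `in` into a newly allocated,
   NUL-terminated buffer. Because the server sends no EOF, the read
   stops at the byte count and does not wait for end of input.
   Returns NULL on failure. The caller frees the result. */
char *read_form_data(FILE *in, long length)
{
    char *buf;

    if (length < 0)
        return NULL;
    buf = malloc((size_t)length + 1);
    if (buf == NULL)
        return NULL;
    if (fread(buf, 1, (size_t)length, in) != (size_t)length) {
        free(buf);
        return NULL;
    }
    buf[length] = '\0';
    return buf;
}
```

A script would typically call read_form_data(stdin, parse_content_length(getenv("CONTENT_LENGTH"))) when the REQUEST_METHOD environment variable is POST, and fall back to QUERY_STRING when it is GET. The resulting string has the same encoding in both cases.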
This lecture's key point(s):
- CGI defines a standard way for a web server to pass information from the
client side (browser) to a script.
- Forms provide standard data entry and selection widgets for users to submit
data.
- CGI scripts can generate dynamic or query specific web pages 'on the fly'
©Copyright
Alan Dorin 2008