Forms, forms, forms, forms: we fill 'em out for nearly everything,
from the moment we're born, 'til the moment we die. So what's to
explain all the hoopla and excitement over HTML forms? Simply this:
they make HTML truly interactive.

When you think about it, except for the limited input from users
available through the <isindex> tag, HTML's
interactivity is basically a lot of button pushing: click here, click
there, go here, go there--there's no real user feedback, and it's
certainly not personalized. Applets provide extensive
user-interaction capability, but they can be difficult to write and
are still not standardized for all browsers. Forms, on the other hand,
are supported by almost every browser and make it possible to create
documents that collect and process user input, and formulate
personalized replies.

This powerful mechanism has far-reaching implications, particularly
for electronic commerce. It finishes an online catalog by giving
buyers a way to immediately order products and services. It gives
nonprofit organizations a way to sign up new members. It gives market
researchers a way to collect user data. It gives you an automated way
to interact with your HTML document readers.

Mull over the ways you might want to interact with your readers while
we take a look at both the client- and server-side details of creating
forms.

Unlike with the <isindex> tag, you can put one or more forms
in a single document. And, unlike in an <isindex>
document, users can ignore the embedded forms, reading content and
interacting with the document's links just as with a form-less
document. [the section called "The <isindex> Tag"]

Forms are composed of one or more text input boxes, clickable buttons, multiple-choice
checkboxes, and even pull-down menus and clickable images, all placed
inside the <form> tag. Within a form, you may
also put regular body content, including text and images. The text
is particularly useful for providing instructions to the users on
how to fill out the form and for form element labels and prompts.
And, within the various form elements, you can use JavaScript event
handlers for a variety of effects like testing and verifying form
contents and calculating a running sum.

Once a user fills out the various fields in the form, they click a
special "Submit" button (or, sometimes, press the Return key) to
submit the form to a server. The browser packages up the user-supplied
values and choices and sends them to a server or to an email
address.[1]
The server passes the information along to a supporting program
or application that processes the information and creates a reply,
usually in HTML. The reply may be simply a thank-you, or it might
tell the user how to fill out the form correctly or prompt for
missing fields. The server sends the reply to the browser client,
which then presents it to the user. With emailed forms, the information
is simply put into someone's mailbox; there is no notification
of the form being sent.

[1] Some browsers, Netscape in particular, may also encrypt the
information, securing it from credit-card thieves, for example.
However, the encryption facility must be supported on the server side
as well; contact the browser manufacturer for details.

The server-side data-processing
aspects of forms are not part of the HTML standard; they are defined
by the server's software. While a complete discussion of
server-side forms programming is beyond the scope of this book,
we'd be remiss if we did not include at least a simple
example to get you started. To that end, we've included
at the end of this chapter a few skeletal programs that illustrate
the common styles of server-side forms programming.

You place a form anywhere inside the body of an HTML document with its
elements enclosed by the <form> tag and its
respective end tag </form>. You may, and we
recommend you often do, include regular body content inside a form to
specially label user-input fields and to provide directions, for
example.

Browsers flow the special form elements into
the containing paragraphs as if they were small images embedded
into the text. There aren't any special layout rules for
form elements, so you need to use other HTML elements, like the
<br> and <p> tags, to control
the placement of elements within the text flow. [the section called "The <p> Tag"]
[the section called "The <br> Tag"]

All of the form elements within a
<form> tag make up a single form. When the user submits the form,
the browser sends the values of all of these elements (blank, default,
or user-modified) to the server.

You must define at least two special
form attributes, action and method, which provide the URL of the form's
processing application and the method by which the parameters are sent
to the server. A third, optional attribute, enctype, lets you change how
the parameters get encoded for transmission over the network.

The required
action attribute for the <form>
tag gives the URL of the application that is to receive and process
the form's data.

Most webmasters keep their
forms-processing applications in a special directory on their web
server, usually named cgi-bin, which stands
for Common Gateway Interface binaries.[2] Keeping
these special forms-processing programs and applications in one
directory makes it easier to manage and secure the server.

[2] The Common Gateway Interface (CGI) defines the protocol by which
servers interact with programs that process form data.

A
typical <form> tag with the action attribute looks
like this:

<form action="http://www.kumquat.com/cgi-bin/update">
...
</form>

The example URL tells the browser to contact the server named
www.kumquat.com and pass along the user's
form values to the application named update
located in the cgi-bin directory.
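
How the pieces of that URL map onto server, directory, and application can be sketched in JavaScript. This is only an illustration: the helper name describeAction is hypothetical, and it assumes the application is the last path segment of the URL.

```javascript
// Sketch: split a form's action URL into the parts the text describes.
// describeAction is a hypothetical helper, not part of HTML or any browser API.
function describeAction(actionUrl) {
  const url = new URL(actionUrl);
  const segments = url.pathname.split("/").filter(Boolean);
  return {
    server: url.hostname,                        // the server to contact
    directory: segments.slice(0, -1).join("/"),  // where the application lives
    application: segments[segments.length - 1],  // assumed: last path segment
  };
}

console.log(describeAction("http://www.kumquat.com/cgi-bin/update"));
```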

In
general, if you see a URL that references a document in a directory
named cgi-bin, you can be pretty sure that
the document is actually an application that creates the desired
page dynamically each time it's invoked.

The browser specially encodes the form's
data before it passes that data to the server so it does not become
scrambled or corrupted during the transmission. It is up to the
server to either decode the parameters or to pass them, still encoded,
to the application.

The standard encoding format is
the Internet Media Type named application/x-www-form-urlencoded.
You can change that encoding with the optional enctype attribute
in the <form> tag. The only alternative encoding
formats currently supported are multipart/form-data
and text/plain.

The multipart/form-data
alternative is required for those forms that contain file-selection
fields for upload by the user. The text/plain format should
be used in conjunction with a mailto URL in the action attribute
for sending forms to an email address instead of a server. Unless
your forms need file-selection fields or you must use a mailto URL
in the action attribute, you probably should ignore this attribute
and simply rely upon the browser and your processing server to use
the default encoding type. [the section called "File selection fields"]

The standard encoding--
application/x-www-form-urlencoded--converts
any spaces in the form values to a plus sign (+), nonalphanumeric
characters into a percent sign (%) followed by two hexadecimal
digits that are the ASCII code of the character, and the line breaks
in multiline form data into %0D%0A.

The
standard encoding also includes a name for each field in the form.
(A "field" is a discrete element in the form,
whose value can be nearly anything, from a single number to several
lines of text, such as the user's address.)
Each name is joined to its value by an equal sign (=), and successive
name/value pairs are separated by ampersands (&). If a field has more
than one value, each value is sent as its own name/value pair.
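
The encoding rules just described can be sketched in JavaScript. This is a minimal illustration rather than any browser's actual code. The function names are hypothetical, the text is assumed to be converted to UTF-8 bytes first, and, as an assumption mirroring common browser behavior, the characters * - . _ are left unencoded along with letters and digits. The address value below is made up for the example.

```javascript
// Sketch of application/x-www-form-urlencoded encoding (hypothetical helpers).
function encodeComponent(text) {
  // Normalize every line break to CR LF so it is sent as %0D%0A.
  const normalized = text.replace(/\r\n|\r|\n/g, "\r\n");
  let out = "";
  for (const byte of new TextEncoder().encode(normalized)) {
    const ch = String.fromCharCode(byte);
    if (/[A-Za-z0-9*\-._]/.test(ch)) {
      out += ch;   // letters, digits, and * - . _ pass through unchanged
    } else if (ch === " ") {
      out += "+";  // a space becomes a plus sign
    } else {
      // everything else becomes % followed by two hexadecimal digits
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

// Join each name to its value with "=" and the pairs with "&".
function encodeForm(fields) {
  return fields
    .map(([name, value]) => encodeComponent(name) + "=" + encodeComponent(value))
    .join("&");
}

console.log(encodeForm([
  ["name", "O'Reilly and Associates"],
  ["address", "Line one\nLine two"],  // made-up two-line value
]));
// prints name=O%27Reilly+and+Associates&address=Line+one%0D%0ALine+two
```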

For example, here's what the
browser sends to the server after the user fills out a form with
two input fields labeled name and address; the former field has
just one line of text, while the latter field has several lines
of input:

We've broken the value into two lines for clarity
in this book, but in reality, the browser sends the data in an unbroken
string. The name field is "O'Reilly and Associates"
and the value of the address field, complete with embedded newline
characters, is:

The
multipart/form-data encoding
encapsulates the fields in the form as several parts of a single
MIME-compatible compound document. Each field has its own section
in the resulting file, set off by a standard delimiter. Within each
section, one or more header lines define the name of the field,
followed by one or more lines containing the value of the field.
Since the value part of each section can contain binary data or
otherwise unprintable characters, no character conversion or encoding
occurs within the transmitted data.

This encoding format
is by nature more verbose and longer than the application/x-www-form-urlencoded
format. As such, it can only be used when the method attribute of
the <form> tag is set to post, as described below.

A simple example makes it easy to understand this format.
Here's our previous example, when transmitted as multipart/form-data:

The first line of the transmission defines the delimiter that
will appear before each section of the document. It always consists
of thirty dashes and a long random number that distinguishes it
from other text that might appear in actual field values.

The
next lines contain the header fields for the first section. There
will always be a Content-Disposition field indicating that this
section contains form data and providing the name of the form element
whose value is in this section. You may see other header fields;
in particular, some file-selection fields include a Content-Type
header field that indicates the type of data contained in the file
being transmitted.

After the headers, there is a single
blank line followed by the actual value of the field on one or more
lines. The section concludes with a repeat of the delimiter line
that started the transmission. Another section follows immediately,
and the pattern repeats until all of the form parameters have been
transmitted. The end of the transmission is indicated by an extra
two dashes at the end of the last delimiter line.
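
The layout described above can be sketched in JavaScript. The helper names are hypothetical, the delimiter format follows the description in the text rather than any particular browser, and the sketch covers only simple text fields.

```javascript
// Sketch of the multipart/form-data layout described in the text.
// makeDelimiter and buildMultipart are hypothetical helpers.
function makeDelimiter() {
  // Thirty dashes followed by a long random number.
  const digits = String(Math.floor(Math.random() * 1e15)).padStart(15, "0");
  return "-".repeat(30) + digits;
}

function buildMultipart(fields, delimiter) {
  const lines = [];
  for (const [name, value] of fields) {
    lines.push(delimiter);                                        // starts a section
    lines.push(`Content-Disposition: form-data; name="${name}"`); // section header
    lines.push("");                                               // single blank line
    lines.push(value);                                            // value, sent unconverted
  }
  lines.push(delimiter + "--");  // two extra dashes mark the end
  return lines.join("\r\n") + "\r\n";
}

console.log(buildMultipart([["name", "O'Reilly and Associates"]], makeDelimiter()));
```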

As
we pointed out earlier, use multipart/form-data encoding
only when your form contains a file-selection field. Here's
an example of how the transmission of a file-selection field might
look:

------------------------------146931364513459
Content-Disposition: form-data; name="thefile"; filename="test"
Content-Type: text/plain

First line of the file
...
Last line of the file
------------------------------146931364513459--

The only notable difference is that the Content-Disposition
field contains an extra element, filename, that defines the name
of the file being transmitted. There might also be a Content-Type
field to further describe the file's contents.

Use
the text/plain encoding only when you don't have access to a form-processing
server and need to send the form information by email (the form's
action attribute is a mailto URL; see 10.1.1.13). The conventional
encodings are designed for computer consumption; text/plain
was designed with people in mind.

In this encoding,
each element in the form is placed on a single line, with the name
and value separated by an equal sign. Returning to our name and
address example, the form data would be returned as:

As you can see, the only characters
still encoded in this form are the carriage return and line feed
characters in multiline text input areas. Otherwise, the result
is easily readable and generally parsable by simple tools.
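
As a rough JavaScript sketch of this format, the hypothetical helper below writes one name=value line per field and encodes only line breaks, following the description in the text. Actual browsers may differ in detail.

```javascript
// Sketch of the text/plain form encoding as the text describes it.
// encodeTextPlain is a hypothetical helper.
function encodeTextPlain(fields) {
  return fields
    .map(([name, value]) =>
      // Only line breaks inside a value are encoded, as %0D%0A.
      name + "=" + value.replace(/\r\n|\r|\n/g, "%0D%0A"))
    .join("\r\n");  // one field per line
}

console.log(encodeTextPlain([
  ["name", "Bozo the Clown"],
  ["notes", "Line one\nLine two"],  // made-up multiline field
]));
```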

The
other required attribute for the <form> tag, method, sets
the way the browser sends the form's data to
the server for processing. There are two ways: the POST method and
the GET method.

With the
POST
method, the browser sends the data in two steps: the browser first
contacts the form-processing server specified in the action attribute
and, once contact is made, sends the data to the server separately
from the URL, in the body of the request.

On the server side, POST-style applications
are expected to read the parameters from a standard location once
they begin execution. Once read, the parameters must be decoded
before the application can use the form values. Your particular
server will define exactly how your POST-style applications can
expect to receive their parameters.
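
The decoding step might look like the following JavaScript sketch. It assumes the server has already handed the application the raw, still-encoded parameters as a single string in the standard encoding. How that string is obtained depends on your server and is not shown. The function name is hypothetical.

```javascript
// Sketch: decode standard-encoded form parameters into [name, value] pairs.
// decodeFormBody is a hypothetical helper; reading the body is server-specific.
function decodeFormBody(body) {
  // A plus sign stands for a space; %XX sequences are decoded afterward.
  const decode = (s) => decodeURIComponent(s.replace(/\+/g, " "));
  const fields = [];
  for (const pair of body.split("&")) {
    if (pair === "") continue;  // skip empty segments
    const eq = pair.indexOf("=");
    const rawName = eq === -1 ? pair : pair.slice(0, eq);
    const rawValue = eq === -1 ? "" : pair.slice(eq + 1);
    fields.push([decode(rawName), decode(rawValue)]);
  }
  return fields;
}

console.log(decodeFormBody("name=O%27Reilly+and+Associates&city=Line+one%0D%0ALine+two"));
```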

The
GET method, on the other hand, contacts the form-processing
server and sends the form data in a single transmission step: the
browser appends the data to the form's action URL, separated
by the question mark character.

The common browsers
transmit the form information by either method; some servers receive
the form data by only one or the other method. You indicate which
of the two methods--POST or GET--your forms-processing
server handles with the method attribute in the <form>
tag. Here's the complete tag including the GET transmission
method attribute for the previous form example:

<form method=GET action="http://www.kumquat.com/cgi-bin/update">
...
</form>

Which one should you use if your form-processing server supports both
the POST and GET methods? Here are some rules of thumb:

For best form-transmission performance,
send small forms with a few short fields via the GET method.

Because some server operating systems limit the
number and length of command-line arguments that can be passed to
an application at once, use the POST method to send forms that have
many fields, or ones that have long text fields.

If you are inexperienced in writing server-side
form-processing applications, choose GET. The extra steps involved
in reading and decoding POST-style transmitted parameters, while
not too difficult, may be more work than you are willing to tackle.

If security is an issue, choose POST. GET places
the form parameters directly in the application URL where they easily
can be captured by network sniffers or extracted from a server log
file. If the parameters contain sensitive information like credit
card numbers, you may be compromising your users without their knowledge.
While POST applications are not without their security holes, they
can at least take advantage of encryption when transmitting the
parameters as a separate transaction with the server.

If you want to invoke the server-side application outside
the realm of a form, including passing it parameters, use GET because
it lets you include form-like parameters as part of a URL. POST-style
applications, on the other hand, expect an extra transmission from
the browser after the URL, something you can't do as part
of a conventional <a> tag.

The foregoing
bit of advice warrants some explanation. Suppose you had a simple
form with two elements named x and y. When the values of these elements
are encoded, they look like this:

x=27&y=33

If the form uses method=GET, the URL used to reference
the server-side application looks something like this:

http://www.kumquat.com/cgi-bin/update?x=27&y=33
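
The way a GET submission arrives at that URL can be sketched in JavaScript. The helper name is hypothetical. Note also that encodeURIComponent writes a space as %20 rather than as a plus sign, which is a valid percent-encoding but not identical to what a browser's form submission produces.

```javascript
// Sketch: append encoded form fields to an action URL, GET-style.
// buildGetUrl is a hypothetical helper.
function buildGetUrl(action, fields) {
  const query = fields
    .map(([name, value]) => encodeURIComponent(name) + "=" + encodeURIComponent(value))
    .join("&");
  // The question mark separates the action URL from the parameters.
  return query === "" ? action : action + "?" + query;
}

console.log(buildGetUrl("http://www.kumquat.com/cgi-bin/update", [["x", 27], ["y", 33]]));
// prints http://www.kumquat.com/cgi-bin/update?x=27&y=33
```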

There is nothing to keep you from creating a conventional
<a> tag that invokes the application with any parameter
values you desire, like so:

<a href="http://www.kumquat.com/cgi-bin/update?x=19&y=104">

The only hitch is that the ampersand that separates the parameters
is also the character-entity insertion character. When placed within
the href attribute of the <a> tag, the ampersand
will cause the browser to replace the characters following it with
a corresponding character entity.

To keep this from
happening, you must replace the literal ampersand with its entity
equivalent, either &#38; or &amp;. With
this substitution, our example of the nonform reference to the server-side
application looks like this:

<a href="http://www.kumquat.com/cgi-bin/update?x=19&amp;y=104">
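
If you generate such links from a script, the substitution is easy to automate. The JavaScript below is a sketch under that assumption. The helper names are hypothetical, and the link text "Update" is invented for the example.

```javascript
// Sketch: escape a URL for use inside an href attribute.
// escapeHtml and linkTo are hypothetical helpers.
function escapeHtml(value) {
  return value
    .replace(/&/g, "&amp;")   // must come first, so later entities stay intact
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function linkTo(url, label) {
  return '<a href="' + escapeHtml(url) + '">' + escapeHtml(label) + "</a>";
}

console.log(linkTo("http://www.kumquat.com/cgi-bin/update?x=19&y=104", "Update"));
// prints <a href="http://www.kumquat.com/cgi-bin/update?x=19&amp;y=104">Update</a>
```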

Because of the potential confusion that arises from having
to escape the ampersands in the URL, server implementors are encouraged
to also accept the semicolon as a parameter separator. You might
want to check your server's documentation to see whether it
honors this convention. See Appendix E, Character Entities.

The
name attribute is used to associate
a name with the form. This name can subsequently be used in JavaScript
code to reference and manipulate the form and its input elements.
Unless you plan to control elements of your form with JavaScript,
it is not necessary to include the name attribute. This attribute
is supported only by Netscape.

The
onSubmit attribute
for the <form> tag is a special JavaScript event
handler built into the modern browsers. The value of the event handler,
enclosed in quotation marks, is one or a sequence
of semicolon-separated JavaScript expressions, methods, and function
references that the browser executes just before it actually submits
the data to the form-processing server or sends it to an email address.
[the section called "JavaScript Event Handlers"]

You may use the
onSubmit event for a variety of effects. The most popular is for
a client-side form-verification program that scans the form data
and prompts the user to complete one or more missing elements. Another
popular and much simpler use is to inform users when a mailto URL
form is being processed via email (see 10.1.1.13).
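
The core of such a verification routine can be sketched as follows. The function names are hypothetical, and the form data is assumed to have been collected into a plain object of field names and values. How you gather those values from the page depends on the browser.

```javascript
// Sketch of client-side form verification: find required fields left blank.
// findMissingFields and checkBeforeSubmit are hypothetical helpers.
function findMissingFields(values, requiredNames) {
  return requiredNames.filter((name) => {
    const value = values[name];
    // Treat absent fields and whitespace-only values as missing.
    return value === undefined || String(value).trim() === "";
  });
}

function checkBeforeSubmit(values, requiredNames) {
  const missing = findMissingFields(values, requiredNames);
  return { ok: missing.length === 0, missing };
}

console.log(checkBeforeSubmit({ name: "Bozo the Clown", sex: "" }, ["name", "sex", "income"]));
```

An onSubmit handler built around a check like this would typically alert the user to the missing fields and return false to cancel the submission.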

The
onReset attribute is used just like the onSubmit attribute, except
that the associated JavaScript code is executed only if the user
presses a "Reset" button in the form. This attribute
is supported only by Netscape.

The actual effects of style with <form>
are hard to predict. In general, style properties affect
the body content (text, in particular) that you
may include as part of the form's contents, but <form>
styles do not affect the display characteristics of the form elements.

For instance, you may create a special font face and background
color style for the form. The form's text labels, but not
the text inside a text input form element, will appear in the specified
font face and background color. Similarly, the text labels you put
beside a set of radio buttons will be in the form-specified style,
but not radio buttons themselves.

With the advent of frames,
it is possible to redirect the results of a form to another window
or frame. Simply add the target attribute to your <form>
tag and provide the name of the window or frame to receive the results.

As with the target attribute for the <a>
tag, you can use a number of special names with the target attribute
in the <form> tag to create a new window or to
replace the contents of existing windows and frames. [the section called "The target Attribute for the <a> Tag"]

In a moment we'll
examine each element of a form in detail. Let's first take
a quick look at a simple example to see how forms are put together.
This one (shown in Figure 10.1)
gathers basic demographic information about a user:

The first line of the example starts the form and indicates
we'll be using the POST method for data transmission to
the form-processing server. The form's user-input elements
follow, most of them defined by an <input> tag and its type
attribute. There are three input elements in the simple example, each
contained within its own paragraph, followed by a submission button.

The first element is a conventional text
entry field, letting the user type up to 80 characters, but displaying
only 32 of them at a time. The next element is a multiple-choice
option, which lets the user select only one of two radio buttons.
This is followed by a pull-down menu for choosing one of three options.
The final element is a simple submission button, which, when clicked
by the user, sets the form's processing in motion.

It
is becoming increasingly common to find authors who have no access
to a web server other than to upload their HTML documents. Consequently,
they have no ability to create or manage CGI programs. In fact,
some providers, particularly those hosting space for hundreds or
even thousands of sites, typically disable CGI services to limit
their server's processing load or as a security precaution.
If you are working with one of these sites, forms become a difficult,
if not impossible, proposition.

All is not lost: You
can use a mailto URL as the value of the form's action
attribute. The Netscape browser will automatically email the various
form parameters and values to the address supplied in the URL. The
recipient of the mail can then process the form and take action
accordingly.

For example, by substituting the following
for the <form> tag in our previous example:

<form method=POST action="mailto:chuckandbill@ora.com"
enctype="text/plain"
onSubmit="window.alert('This form is being sent by email, even though it may not appear that anything has happened...')">

the
form data gets emailed to chuckandbill when submitted by the user,
rather than processed by a server. Notice, too, that we have a
simple JavaScript alert message that appears when the browser gets
ready to send out the form data. The alert tells the user not to
expect confirmation that the form data was sent (see Figure 10.2).
Also, unless the user has disabled the warning or you omit the method=POST
attribute, the browser typically warns users that they are
about to send unencrypted ("text/plain")
and therefore insecure information over the network, and gives them
the option to cancel the submission. Otherwise, the form is sent
via email without incident or notification.

The body of the resulting emailed form message
looks something like this:

name=Bozo the Clown
sex=M
income=$50,001 and higher

If you choose to use the mailto form capability, there are
several problems you may have to deal with:

Your forms won't work on
browsers that don't support a mailto URL as a form action.

Some browsers, such as Internet Explorer, do not
properly place the form data into the email message body and may
even open an email dialog, confusing the user.

Unlike with most form CGI scripts, a mailto doesn't
present the user with a confirmation page to assure them that their
form has been processed. After executing the mailto form, the user
is left looking at the form, as if nothing had happened. (Use JavaScript
to overcome this dilemma with an onSubmit or onClick event handler.)
[the section called "JavaScript Event Handlers"]

Your data may arrive in a form that is difficult,
if not impossible, to read, unless you use a readable enctype, such
as text/plain.

In
spite of all this, mailto forms present an attractive alternative
to the web author constrained by a restricted server. Our advice:
use CGI scripts if at all possible, and fall back to mailto URLs
if all else fails.