There are many header types besides Host and User-Agent. They specify things like the date and time the request was sent and the languages the browser would prefer. A typical GET request might have half a dozen headers.

Here’s the important bit for this chapter: the total HTTP request – the URL and the headers – can include data the user typed into form fields. This gives a way for browsers to send data to servers (usually it’s the other way around).

When the user submits a form in a Web browser, the form data travels along with a URL.

Figure 3. Form data attached to URL

So, the form data is sent along with a URL. But which URL? What page does the browser send the data to?

The browser sends form data to a page that is specially written to handle form data. The page knows how to extract the data, and do something with it, like save it to a database. That’s one of the things you do in PHP: write pages that can handle form data.

More on that later. Right now, we’re just looking at the form.

A form example

Let’s create a form like this:

Figure 4. Simple form

There are two text fields, and a button. Clicking the button submits the form. This means that the browser sends the form’s data to the server.

In the URL the browser generates, some of the characters typed into the form have been changed. The spaces became plus signs (+). The comma (,) became %2C.

This is called “URL encoding.” Browsers automatically URL encode data before sending it in a GET or a POST.

Why do browsers do this? URLs use a small character set. A “character set” is a limited group of characters to choose from. For example, when computers were first developed, most used the old US-ASCII character set. It has just the unaccented letters A to Z in upper and lower case, the digits, punctuation (, . ! ; etc.) and a few other things. If you wanted to send characters with accents (like é) or currency symbols (like ¥), you were out of luck.

Other character sets were developed. Perhaps the most common international character set these days is UTF-8. It can represent more than a hundred thousand characters: Cyrillic, Kanji, you name it, it’s probably in UTF-8.

The trouble is, URLs still use US-ASCII. It’s efficient, and is universally supported. So how do you send special characters in a URL?

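The answer: URL encoding. It converts some special characters to a code: a percent sign followed by the character’s value as two hexadecimal digits. For example, it changes commas (,) into %2C, because 2C is the hexadecimal code for a comma in US-ASCII. It also changes spaces into plus signs (+).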
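PHP can show the same conversion. The sketch below assumes PHP 7 or later and uses the built-in urlencode() function on sample values like the ones in this section; the variable names are made up for illustration. Note that urlencode() is stricter than some browsers: it encodes the exclamation mark as well.

```php
<?php
// Sample values, as a user might type them into the form.
$first_name = 'Renata is cute!';
$surname = 'Yes, I agree.';

// urlencode() replaces spaces with + and most punctuation with %XX codes.
echo urlencode($first_name), "\n"; // Renata+is+cute%21
echo urlencode($surname), "\n";    // Yes%2C+I+agree.

// Characters outside US-ASCII become one %XX code per UTF-8 byte.
// "\u{00E9}" is the letter é written as a Unicode escape (PHP 7+).
echo urlencode("\u{00E9}"), "\n";  // %C3%A9
```

The letter é takes two bytes in UTF-8, which is why it turns into two codes.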

Browsers are not consistent in their URL encoding. Here’s how different browsers encode the data in Figure 11:

Firefox: first_name=Renata+is+cute!&surname=Yes%2C+I+agree.

Internet Explorer: first_name=Renata+is+cute%21&surname=Yes%2C+I+agree.

Chrome: first_name=Renata+is+cute!&surname=Yes,+I+agree.

Safari: first_name=Renata+is+cute%21&surname=Yes%2C+I+agree.

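Only Internet Explorer and Safari produce identical output. The browsers differ in whether they encode the exclamation mark (!) and the comma (,).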
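These differences rarely matter to the page that receives the data, because decoding gives the same values either way. As a rough check, the sketch below feeds the four query strings above to PHP's built-in parse_str() function, which decodes a query string into an array. The variable names are chosen for illustration.

```php
<?php
// The query strings from the browser comparison above.
$query_strings = [
    'Firefox'           => 'first_name=Renata+is+cute!&surname=Yes%2C+I+agree.',
    'Internet Explorer' => 'first_name=Renata+is+cute%21&surname=Yes%2C+I+agree.',
    'Chrome'            => 'first_name=Renata+is+cute!&surname=Yes,+I+agree.',
    'Safari'            => 'first_name=Renata+is+cute%21&surname=Yes%2C+I+agree.',
];

$decoded = [];
foreach ($query_strings as $browser => $query) {
    // parse_str() splits the string at & and =, then URL decodes each value.
    parse_str($query, $fields);
    $decoded[$browser] = $fields;
    echo $browser, ': ', $fields['first_name'], ' / ', $fields['surname'], "\n";
}
```

All four lines of output show the same two values: Renata is cute! and Yes, I agree.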

Usually, you won’t need to worry about URL encoding. It happens automatically. But sometimes you’ll need to do it yourself in PHP. More on that later.
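As a small preview of doing it yourself, here is a sketch that uses PHP's built-in http_build_query() function to build an encoded query string from an array. It assumes PHP's default settings. The page name handler.php and the variable names are made up for illustration.

```php
<?php
// Field names and values to send, before any encoding.
$data = [
    'first_name' => 'Renata is cute!',
    'surname'    => 'Yes, I agree.',
];

// http_build_query() URL encodes each value and joins the pairs with &.
$query = http_build_query($data);
echo $query, "\n"; // first_name=Renata+is+cute%21&surname=Yes%2C+I+agree.

// Attach the query string to a (hypothetical) page name with a question mark.
$url = 'handler.php?' . $query;
echo $url, "\n";
```

Building the string this way avoids having to call urlencode() on each value by hand.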

Exercise: Special characters in the address form

Open the page you created for the address form exercise. Type special characters into the fields. See how they are encoded in the URL generated by the form.

Make some notes below on how different characters (like %, $, +, and #) are encoded.