Write Secure Scripts with PHP 4.2!

For the longest time, one of the biggest selling points of PHP as a server-side scripting language was that values submitted from a form were automatically created as global variables for you. As of PHP 4.1, the makers of PHP recommended an alternate means of accessing submitted data. In PHP 4.2, they switched off the old way of doing things! As I’ll explain in this article, these changes have been made in the name of security. Together, we’ll explore the new features of PHP for handling form submissions and other data, and how they can be used to write more secure scripts.

What’s wrong with this picture?

Consider the following PHP script, which grants access to a Web page only if the correct username and password are entered:
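A minimal sketch of such a script, written the way it would have been for PHP 4.0 with register_globals turned on. The credentials 'kevin' and 'secret' and the form markup are placeholders chosen for illustration, and the script only behaves as described on old PHP versions (register_globals was removed in PHP 5.4):

```php
<?php
// Old style: relies on register_globals being ON, so $username, $password
// and $PHP_SELF are expected to exist as ready-made global variables.
// The credentials below are placeholders for illustration.
if ($username === 'kevin' && $password === 'secret') {
    $authorized = true;
}

if (empty($authorized)) {
    // Unauthorized visitors are prompted for their credentials.
    echo '<p>Please enter your username and password:</p>';
    echo '<form action="' . $PHP_SELF . '" method="POST">';
    echo '<p>Username: <input type="text" name="username" /><br />';
    echo 'Password: <input type="password" name="password" /><br />';
    echo '<input type="submit" /></p></form>';
} else {
    // Super-secret HTML content goes here.
    echo '<p>Welcome to the secret page!</p>';
}
```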

Okay, I’m sure about half the readers in the audience just rolled their eyes and said "That’s so stupid — I would never make a mistake like that!" But I guarantee that a good number of you are thinking "Hey, that’s not bad. I should write that down!" And of course there’s always the rather confused minority ("What’s PHP?"). PHP was designed as a "nice and easy" scripting language that beginners can start to use in minutes; it should also protect those beginners from making scary mistakes like the one above.

For the record, the problem with the above script is that you can easily gain access to it without supplying the correct username and password. Simply type the address of the page into your browser with ?authorized=1 tacked on the end. Since PHP automatically creates a variable for every value submitted — either from a form post, the URL query string, or a cookie — this sets $authorized to 1 in the script and plops an unauthorized user right in front of the Colonel’s secret recipe (apologies to the non-junk food eating readers who will not get that joke).
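The attack can be reproduced on a PHP version that no longer has register_globals by imitating the feature with extract(). This is only a simulation: the function name is hypothetical, the credentials are placeholders, and extract() stands in for what PHP used to do automatically with submitted values.

```php
<?php
// Imitates register_globals: every submitted value becomes a variable in
// the current scope. extract() stands in for that behaviour here.
function is_authorized_with_register_globals(array $submitted)
{
    extract($submitted);   // e.g. ['authorized' => '1'] creates $authorized

    // Same check as a naive login script; isset() only avoids warnings.
    if (isset($username, $password)
        && $username === 'kevin' && $password === 'secret') {
        $authorized = true;
    }

    // $authorized was never initialised, so a submitted value survives.
    return !empty($authorized);
}

// Simulated request for page.php?authorized=1
parse_str('authorized=1', $query);
var_dump(is_authorized_with_register_globals($query)); // bool(true)
var_dump(is_authorized_with_register_globals(array())); // bool(false)
```

No username or password is supplied in the first call, yet the check passes, which is exactly the hole described above.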

So, easy fix, right? Just set $authorized to false by default at the top of the script. The problem here is that a fix shouldn’t have been necessary at all! $authorized is a variable created and used entirely within the script; why should the developer have to worry about protecting every single one of his or her variables from being overridden by values submitted by malicious users?
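As a sketch, that fix amounts to one extra line before the login check (still old-style code that assumes register_globals is on; the credentials are placeholders):

```php
<?php
// Initialise the flag first, so a submitted ?authorized=1 that
// register_globals turned into a global variable is overwritten.
$authorized = false;

// Old style: $username and $password arrive as globals from the form.
if ($username === 'kevin' && $password === 'secret') {
    $authorized = true;
}
```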

How does PHP 4.2 change things?

As of PHP 4.2, a fresh PHP installation has the register_globals option turned off by default, so EGPCS values (EGPCS is short for Environment, Get, Post, Cookies, Server — the full range of external variable sources in PHP) are not created as global variables. Yes, this option can still be turned on manually, but the PHP team would prefer it if you didn’t. To comply with their wishes, you’ll need to use an alternate method to get at these values.

Beginning with PHP 4.1, EGPCS values are now available in a set of special arrays:

$_ENV — Contains system environment variables

$_GET — Contains variables in the query string, including from GET forms

$_POST — Contains variables submitted from POST forms

$_COOKIE — Contains all cookie variables

$_SERVER — Contains server variables, such as HTTP_USER_AGENT

$_REQUEST — Contains everything in $_GET, $_POST, and $_COOKIE

$_SESSION — Contains all registered session variables

Prior to PHP 4.1, developers who worked with register_globals turned off (this was also considered a good way to boost PHP performance a little) accessed these values using cumbersome arrays like $HTTP_GET_VARS. These new variable names are not only shorter, but they have some nice new features as well.

First, let’s re-write the broken script from the previous section for use under PHP 4.2 (i.e. with register_globals turned off):
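A sketch of how that rewrite might look. The three assignments at the top are the part that matters; the placeholder credentials and form markup are assumptions for illustration.

```php
<?php
// Explicitly "allow in" the three outside values the script needs.
// A value that was not submitted comes back as null (with a notice or
// warning, depending on the PHP version and error_reporting level).
$username = $_REQUEST['username'];
$password = $_REQUEST['password'];
$PHP_SELF = $_SERVER['PHP_SELF'];

// Placeholder credentials, for illustration only.
if ($username === 'kevin' && $password === 'secret') {
    $authorized = true;
}

if (empty($authorized)) {
    // Unauthorized visitors are prompted for their credentials.
    // PHP_SELF is escaped because it can contain user-supplied path text.
    echo '<p>Please enter your username and password:</p>';
    echo '<form action="' . htmlspecialchars($PHP_SELF) . '" method="POST">';
    echo '<p>Username: <input type="text" name="username" /><br />';
    echo 'Password: <input type="password" name="password" /><br />';
    echo '<input type="submit" /></p></form>';
} else {
    // Super-secret HTML content goes here.
    echo '<p>Welcome to the secret page!</p>';
}
```

With register_globals off, a stray ?authorized=1 in the URL lands in $_GET and $_REQUEST only, so it never touches the script's own $authorized variable.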

Since we’re expecting the username and password to be submitted by the user, we grab these values out of the $_REQUEST array. Using this array allows users to pass in the values by any means at their disposal: through the URL query string (e.g. to allow users to create a bookmark that enters their credentials automatically for them), by a form submission, or as a cookie. If you would prefer to limit the methods by which they can submit their credentials to form submissions only (or more precisely, HTTP POST requests, which can be simulated without a form submission if push comes to shove), you could use the $_POST array instead:
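A minimal sketch of that variant; only the source array changes:

```php
<?php
// Accept credentials from HTTP POST requests only; values in the query
// string or in cookies are ignored.
$username = $_POST['username'];
$password = $_POST['password'];
```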

We also fetch the commonly-used PHP_SELF variable out of the array of server variables; like the form variables, it isn’t created automatically with register_globals disabled. Since it’s only used once in the script, you’ll probably prefer just referencing it directly in your form code:

<form action="<?=htmlspecialchars($_SERVER['PHP_SELF'])?>" method="POST">

Other than ‘allowing in’ these three variables, the script hasn’t changed at all. Turning register_globals off simply forces the developer to be aware of data that comes in from outside (untrusted) sources.

Note for the Nitpickers: The default error_reporting setting of PHP is still E_ALL & ~E_NOTICE, so if the ‘username’ and ‘password’ values haven’t been submitted, attempting to pull them out of the $_REQUEST or $_POST array will not produce an error message. If you’re using a stricter level of error checking on your PHP setup, you’ll need to add a little more code to check if these variables are set first.
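One way to add that check is a small helper built on isset(). The name get_posted() is hypothetical, not a PHP built-in, and the empty-string default is an assumption that suits a login form:

```php
<?php
// Return a submitted POST value, or $default when it was not sent.
// isset() tests the key without raising a notice or warning.
// get_posted() is a hypothetical helper name, not part of PHP.
function get_posted($name, $default = '')
{
    return isset($_POST[$name]) ? $_POST[$name] : $default;
}

$username = get_posted('username');
$password = get_posted('password');
```

This matters more on recent PHP releases, where reading a missing array key is reported under the default settings.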

But doesn’t that mean more typing?

Yes, in simple scripts like the one above, the new way of doing things in PHP 4.2 does often require more typing. But hey, look on the bright side — you could start charging by the keystroke!

Seriously though, the makers of PHP are not entirely insensitive to your pain (I have a half brother who suffers from repetitive stress injuries). A special feature of these new arrays is that, unlike all other PHP variables, they are totally global. How does this help you? Let’s extend our example a little to see.

To allow for multiple pages on the site to require a username/password combination to be viewed, we’ll move our authorization code into an include file (protectme.php) as follows:
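A sketch of what such an include file might contain. The function name authorize_user comes from the discussion that follows; its parameter names, the isset() guards and the form markup are assumptions.

```php
<?php /* protectme.php (illustrative sketch) */

function authorize_user($authuser, $authpass)
{
    // Missing values are treated as empty strings.
    $username = isset($_POST['username']) ? $_POST['username'] : '';
    $password = isset($_POST['password']) ? $_POST['password'] : '';

    // Wrong or missing credentials: show the login form and stop the page.
    if ($username !== $authuser || $password !== $authpass) {
        $self = htmlspecialchars($_SERVER['PHP_SELF']);
        echo '<p>Please enter your username and password:</p>';
        echo '<form action="' . $self . '" method="POST">';
        echo '<p>Username: <input type="text" name="username" /><br />';
        echo 'Password: <input type="password" name="password" /><br />';
        echo '<input type="submit" /></p></form>';
        exit();
    }
}
```

A protected page would then begin with require('protectme.php'); followed by a call such as authorize_user('kevin', 'secret'); before producing any output of its own.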

Nice and simple, right? Now here’s a challenge for the especially eagle-eyed and experienced — what’s missing from the authorize_user function?

What’s missing is the declaration of $_POST inside the function to bring it in from the global scope! In PHP 4.0, with register_globals turned on, you’d have had to add a line of code to get access to the $username and $password variables inside the function:
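A sketch of that older form, assuming register_globals has already created $username and $password in the global scope (the login form is reduced to a one-line placeholder):

```php
<?php
function authorize_user($authuser, $authpass)
{
    // Import the variables that register_globals created globally.
    // Without this line they would be undefined inside the function.
    global $username, $password;

    if ($username !== $authuser || $password !== $authpass) {
        echo '<p>Please enter your username and password.</p>';
        exit();
    }
}
```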

In PHP, unlike in other languages with similar syntax, variables outside a function are not automatically available inside the function. You need to specifically bring them in with the global line demonstrated above.

With register_globals turned off in PHP 4.0 to improve security, you would have used the $HTTP_POST_VARS array to obtain the values submitted from your form, but you would still have had to import that array from the global scope:
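A sketch of that version. $HTTP_POST_VARS was populated by PHP 4 itself and no longer exists in PHP 5.4 and later, so this is shown for comparison only; the login form is again a one-line placeholder.

```php
<?php
function authorize_user($authuser, $authpass)
{
    // $HTTP_POST_VARS was an ordinary global array in PHP 4, so it had to
    // be imported like any other global variable.
    global $HTTP_POST_VARS;

    $username = $HTTP_POST_VARS['username'];
    $password = $HTTP_POST_VARS['password'];

    if ($username !== $authuser || $password !== $authpass) {
        echo '<p>Please enter your username and password.</p>';
        exit();
    }
}
```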

But in PHP 4.1 or later, the special $_POST variable (like the rest of the special variables listed in the previous section) is always available in all scopes. This is why the $_POST variable didn’t need to be declared global at the top of the function:
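The difference is easy to see side by side. In this small demonstration the names $site_name and show_scope() are made up for illustration, and a value is written into $_POST by hand to stand in for a submitted form:

```php
<?php
$site_name = 'Example Site';     // ordinary global variable
$_POST['username'] = 'kevin';    // stand-in for a submitted form value

function show_scope()
{
    // Special array: visible here without any "global" declaration.
    $from_post = isset($_POST['username']) ? $_POST['username'] : null;

    // Ordinary global: not visible unless imported with "global".
    $from_global = isset($site_name) ? $site_name : null;

    return array($from_post, $from_global);
}

var_dump(show_scope()); // [0] => string "kevin", [1] => NULL
```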

How does this affect sessions?

The introduction of the special $_SESSION array actually helps to simplify session code. Instead of registering global variables as session variables and then having to keep track of which variables are registered when, simply refer to all your session variables as $_SESSION['varname'].

Let’s consider another authorization example. This time, it will use sessions to mark a user as authorized for the remainder of his or her stay on your site. First, the PHP 4.0 version (with register_globals enabled):
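A sketch of such a script in the old style. It depends on register_globals and on session_register(), both of which were removed in PHP 5.4, so it is shown for comparison only; the credentials and page text are placeholders.

```php
<?php
// PHP 4.0 style: needs register_globals ON and session_register().
session_start();

// $username and $password arrive as globals from the login form.
if ($username === 'kevin' && $password === 'secret') {
    $authorized = true;
    // Mark the global $authorized as a session variable, so it is
    // restored as a global on the user's later requests.
    session_register('authorized');
}

if (empty($authorized)) {
    echo '<p>Please enter your username and password.</p>';
} else {
    echo '<p>Welcome back to the secret page!</p>';
}
```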

Now, spot the security hole. As before, adding ?authorized=1 to the end of the URL bypasses the security measures and grants access to the page contents. The developer probably thought of $authorized as a session variable, and missed the fact that the same variable could easily be set by user input.

Here’s how the script looks when we add our special arrays (PHP 4.1) and switch off register_globals (PHP 4.2):
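A sketch of the updated script, with the same placeholder credentials. The isset() guards are an addition for stricter error_reporting levels and are not required by the technique itself.

```php
<?php
session_start();

// Credentials come from the POST request only.
$username = isset($_POST['username']) ? $_POST['username'] : '';
$password = isset($_POST['password']) ? $_POST['password'] : '';

if ($username === 'kevin' && $password === 'secret') {
    // Set the session variable directly; no registration step is needed.
    $_SESSION['authorized'] = true;
}

// Only the session array is consulted, so ?authorized=1 has no effect.
if (empty($_SESSION['authorized'])) {
    echo '<p>Please enter your username and password.</p>';
} else {
    echo '<p>Welcome back to the secret page!</p>';
}
```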

See? Much more straightforward! Instead of registering a normal variable as a session variable, we set the session variable (in the $_SESSION array) directly, and then use it the same way. There’s no more confusion as to which variables are session variables and you’ll notice the code is slightly shorter too!

Summary

In this article I explained the reasoning behind recent changes to the PHP scripting language. In PHP 4.1, a set of special arrays was added to the language to access external data values. These arrays are available in any scope to make external data access more convenient. In PHP 4.2, register_globals was turned off by default to encourage migration to the new arrays and to reduce the tendency of inexperienced developers to write insecure PHP scripts.

Kevin began developing for the Web in 1995 and is a highly respected technical author. Kev is a world-renowned author, speaker and JavaScript expert. He has a passion for making web technology easy to understand by anyone. Yes, even you!