Introduction

Web programming is probably why you're reading this book. It's why the first version of PHP was written and what continues to make it so popular today. With PHP, it's easy to write dynamic web programs that do almost anything. Other chapters cover various PHP capabilities, like graphics, regular expressions, database access, and file I/O. These capabilities are all part of web programming, but this chapter focuses on some web-specific concepts and organizational topics that will make your web programming stronger.

Recipe 8.2, Recipe 8.3, and Recipe 8.4 show how to set, read, and delete cookies. A cookie is a small text string that the server instructs the browser to send along with requests the browser makes. Normally, HTTP requests aren't "stateful"; each request can't be connected to a previous one. A cookie, however, can link different requests by the same user. This makes it easier to build features such as shopping carts or to keep track of a user's search history.

Recipe 8.5 shows how to redirect users to a different web page than the one they requested. Recipe 8.6 explains the session module, which lets you easily associate persistent data with a user as they move through your site. Recipe 8.7 demonstrates how to store session information in a database, which increases the scalability and flexibility of your web site. Discovering the features of a user's browser is shown in Recipe 8.8. Recipe 8.9 shows the details of constructing a URL that includes a GET query string, including proper encoding of special characters and handling of HTML entities.

The next two recipes demonstrate how to use authentication, which lets you protect your web pages with passwords. PHP's special features for dealing with HTTP Basic authentication are explained in Recipe 8.10. Sometimes it's a better idea to roll your own authentication method using cookies, as shown in Recipe 8.11.

The three following recipes deal with output control. Recipe 8.12 shows how to force output to be sent to the browser. Recipe 8.13 explains the output buffering functions. Output buffers enable you to capture output that would otherwise be printed or delay output until an entire page is processed. Automatic compression of output is shown in Recipe 8.14.

Recipe 8.15 to Recipe 8.20 cover error handling topics, including controlling where errors are printed, writing custom functions to handle error processing, and adding debugging assistance information to your programs. Recipe 8.19 includes strategies for avoiding the common "headers already sent" error message, such as using the output buffering discussed in Recipe 8.13.

The next four recipes show how to interact with external variables: environment variables and PHP configuration settings. Recipe 8.21 and Recipe 8.22 discuss environment variables, while Recipe 8.23 and Recipe 8.24 discuss reading and changing PHP configuration settings. If Apache is your web server, you can use the techniques in Recipe 8.25 to communicate with other Apache modules from within your PHP programs.

Recipe 8.26 demonstrates a few methods for profiling and benchmarking your code. By finding where your programs spend most of their time, you can focus your development efforts on the code whose improvement gives your users the most noticeable speed-up.

This chapter also includes two programs that assist in web site maintenance. Program Recipe 8.27 validates user accounts by sending an email message with a customized link to each new user. If the user doesn't visit the link within a week of receiving the message, the account is deleted. Program Recipe 8.28 monitors requests in real time on a per-user basis and blocks requests from users that flood your site with traffic.

Setting Cookies

Problem

You want to set a cookie.

Solution

Use setcookie( ) :

setcookie('flavor','chocolate chip');

Discussion

Cookies are sent with the HTTP headers, so setcookie( ) must be called before any output is generated.

You can pass additional arguments to setcookie( ) to control cookie behavior. The third argument to setcookie( ) is an expiration time, expressed as an epoch timestamp. For example, this cookie expires at noon GMT on December 3, 2004:

setcookie('flavor','chocolate chip',1102075200);

If the third argument to setcookie( ) is missing (or empty), the cookie expires when the browser is closed. Also, many systems can't handle a cookie expiration time greater than 2147483647, because that's the largest epoch timestamp that fits in a 32-bit integer, as discussed in the introduction to Chapter 3.

The fourth argument to setcookie( ) is a path. The cookie is sent back to the server only when pages whose path begins with the specified string are requested. For example, the following cookie is sent back only to pages whose path begins with /products/:

setcookie('flavor','chocolate chip',0,'/products/');

The page that sets this cookie doesn't have to have a URL that begins with /products/, but the cookie is sent back only to pages that do.

The fifth argument to setcookie( ) is a domain. The cookie is sent back to the server only when pages whose hostname ends with the specified domain are requested. For example, the first cookie in the following code is sent back to all hosts in the example.com domain, but the second cookie is sent only with requests to the host jeannie.example.com:
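
As a sketch of the two calls just described, assuming both cookies should expire when the browser closes (expiration 0) and apply to every path on the site (the path / is an assumption made only for illustration):

```php
<?php
// Sent back to every host whose name ends in .example.com
setcookie('flavor', 'chocolate chip', 0, '/', '.example.com');

// Sent back only with requests to the host jeannie.example.com
setcookie('flavor', 'chocolate chip', 0, '/', 'jeannie.example.com');
```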

If the first cookie's domain was just example.com instead of .example.com, it would be sent only to the single host example.com (and not www.example.com or jeannie.example.com).

The last optional argument to setcookie( ) is a flag that, if set to 1, instructs the browser to send the cookie only over an SSL connection. This can be useful if the cookie contains sensitive information, but remember that the data in the cookie is stored in the clear on the user's computer.

Different browsers handle cookies in slightly different ways, especially with regard to how strictly they match path and domain strings and how they determine priority between different cookies of the same name. The setcookie( ) page of the online manual has helpful clarifications of these differences.

Reading Cookie Values

Problem

You want to read the value of a cookie that has already been set.

Solution
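
A minimal sketch, assuming the cookie is named flavor as in the earlier setcookie( ) examples: look the value up in the $_COOKIE superglobal array, checking first that the browser actually sent it. The message text is made up for the example.

```php
<?php
// $_COOKIE holds the cookies the browser sent with this request.
if (isset($_COOKIE['flavor']) && is_string($_COOKIE['flavor'])) {
    // Cookie values come from the user, so escape them before printing.
    $message = 'You ate a ' . htmlspecialchars($_COOKIE['flavor']) . ' cookie.';
} else {
    $message = 'No flavor cookie was sent.';
}
print $message;
```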

Discussion

A cookie's value isn't available in $_COOKIE during the request in which the cookie is set. In other words, the setcookie( ) function doesn't alter the value of $_COOKIE. On subsequent requests, however, each cookie is stored in $_COOKIE. If register_globals is on, cookie values are also assigned to global variables.

When a browser sends a cookie back to the server, it sends only the value. You can't access the cookie's domain, path, expiration time, or secure status through $_COOKIE because the browser doesn't send that to the server.

To print the names and values of all cookies sent in a particular request, loop through the $_COOKIE array:
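
One way to write that loop, as a sketch that assumes every cookie value is a plain string (array-style cookie names would need an extra check):

```php
<?php
foreach ($_COOKIE as $cookie_name => $cookie_value) {
    if (is_string($cookie_value)) {
        // Escape both parts, since the browser controls their content.
        print htmlspecialchars("$cookie_name = $cookie_value") . "<br>\n";
    }
}
```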

Deleting Cookies

Problem

You want to delete a cookie so a browser doesn't send it back to the server.

Solution

Call setcookie( ) with no value for the cookie and an expiration time in the past:

setcookie('flavor','',time()-86400);

Discussion

It's a good idea to make the expiration time a few hours or an entire day in the past, in case your server and the user's computer have unsynchronized clocks. For example, if your server thinks it's 3:06 P.M. and a user's computer thinks it's 3:02 P.M., a cookie with an expiration time of 3:05 P.M. isn't deleted by that user's computer even though the time is in the past for the server.

The call to setcookie( ) that deletes a cookie must use the same arguments (except for value and time) as the call that set the cookie, so include the path, domain, and secure flag if necessary.
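
For instance, assuming a cookie was set with the path /products/ and the domain .example.com (values chosen here only for illustration), the deleting call repeats both. In practice the two calls would run in different requests; they appear together here only for comparison.

```php
<?php
// The call that set the cookie (it expires when the browser closes).
setcookie('flavor', 'chocolate chip', 0, '/products/', '.example.com');

// The call that deletes it: empty value, expiration a day in the past,
// and the same path and domain as the call above.
setcookie('flavor', '', time() - 86400, '/products/', '.example.com');
```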

Redirecting to a Different Location

Problem

You want to automatically send a user to a new URL. For example, after successfully saving form data, you want to redirect a user to a page that confirms the data.

Solution

Before any output is printed, use header( ) to send a Location header with the new URL:

header('Location: http://www.example.com/');

Discussion

If you want to pass variables to the new page, you can include them in the query string of the URL:

header('Location: http://www.example.com/?monkey=turtle');

The URL that you are redirecting a user to is retrieved with GET. You can't redirect someone to retrieve a URL via POST. You can, however, send other headers along with the Location header. This is especially useful with the Window-target header, which indicates a particular named frame or window in which to load the new URL:
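
A sketch of that combination, assuming the target frame or window is named main (a hypothetical name). Window-target is not part of the HTTP standard, so treat browser support as something to verify rather than assume.

```php
<?php
// Name the frame or window that should load the new page ...
header('Window-target: main');
// ... then send the redirect itself.
header('Location: http://www.example.com/');
```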

Discussion

To start a session automatically on each request, set session.auto_start to 1 in php.ini. With session.auto_start, there's no need to call session_start( ).

The session functions keep track of users by issuing them cookies with randomly generated session IDs. If PHP detects that a user doesn't accept the session ID cookie, it automatically adds the session ID to URLs and forms. For example, consider this code that prints a URL:

print '<a href="train.php">Take the A Train</a>';

If sessions are enabled, but a user doesn't accept cookies, what's sent to the browser is something like:

<a href="train.php?PHPSESSID=2eb89f3344520d11969a79aea6bd2fdd">Take the A Train</a>

In this example, the session name is PHPSESSID and the session ID is 2eb89f3344520d11969a79aea6bd2fdd. PHP adds those to the URL so they are passed along to the next page. Forms are modified to include a hidden element that passes the session ID. Redirects with the Location header aren't automatically modified, so you have to add a session ID to them yourself using the SID constant:
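
A sketch of such a redirect, assuming session_start( ) has already been called and using a made-up destination URL. Newer PHP releases discourage passing session IDs in URLs, so check that SID is still available in the version you run.

```php
<?php
$redirect_url = 'http://www.example.com/airplane.php';

// Append the session ID only when PHP has defined a non-empty SID
// and the browser did not send back the session cookie.
if (defined('SID') && SID !== '' && !isset($_COOKIE[session_name()])) {
    $redirect_url .= '?' . SID;
}

header("Location: $redirect_url");
```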

The session_name( ) function returns the name of the cookie that the session ID is stored in, so this code appends the SID constant to $redirect_url only if the constant is defined and the session cookie isn't set.

By default, PHP stores session data in files in the /tmp directory on your server. Each session is stored in its own file. To change the directory in which the files are saved, set the session.save_path configuration directive in php.ini to the new directory. You can also call session_save_path( ) with the new directory to change directories, but you need to do this before accessing any session variables.

Discussion

One of the most powerful aspects of the session module is its abstraction of how sessions get saved. The session_set_save_handler( ) function tells PHP to use different functions for the various session operations such as saving a session and reading session data. The pc_DB_Session class stores the session data in a database. If this database is shared between multiple web servers, users' session information is portable across all those web servers. So, if you have a bunch of web servers behind a load balancer, you don't need any fancy tricks to ensure that a user's session data is accurate no matter which web server they get sent to.

To use pc_DB_Session, pass a data source name (DSN) to the class when you instantiate it. The session data is stored in a table called php_session whose structure is:

The pc_DB_Session::_write( ) method uses a MySQL-specific SQL command, REPLACE INTO, which updates an existing record or inserts a new one, depending on whether there is already a record in the database with the given id field. If you use a different database, modify the _write( ) function to accomplish the same task. For instance, delete the existing row (if any), and insert a new one, all inside a transaction:
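
A sketch of that portable alternative, written against PDO rather than whatever database layer pc_DB_Session itself uses. The function name pc_session_write( ) and the column names data and last_access are assumptions for illustration; only the table name php_session and the id field come from the text above. The sketch also assumes the PDO handle throws exceptions on errors.

```php
<?php
// Hypothetical helper: replace a session row without MySQL's REPLACE INTO.
// Assumes $dbh uses PDO::ERRMODE_EXCEPTION.
function pc_session_write(PDO $dbh, string $id, string $data): bool
{
    $dbh->beginTransaction();
    try {
        // Remove the existing row, if any ...
        $delete = $dbh->prepare('DELETE FROM php_session WHERE id = ?');
        $delete->execute([$id]);

        // ... then insert the fresh copy.
        $insert = $dbh->prepare(
            'INSERT INTO php_session (id, data, last_access) VALUES (?, ?, ?)'
        );
        $insert->execute([$id, $data, time()]);

        return $dbh->commit();
    } catch (PDOException $e) {
        // Undo the delete if the insert failed.
        $dbh->rollBack();
        return false;
    }
}
```

Wrapping both statements in one transaction means another request never sees the moment between the delete and the insert.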

Once you download a browser capability file, you need to tell PHP where to find it by setting the browscap configuration directive to the pathname of the file. If you use PHP as a CGI, set the directive in the php.ini file:

browscap=/usr/local/lib/browscap.txt

If you use Apache, you need to set the directive in your Apache configuration file:

php_value browscap "/usr/local/lib/browscap.txt"

Many of the capabilities get_browser( ) finds are shown in Table 8-1. For user-configurable capabilities such as javascript or cookies though, get_browser( ) just tells you if the browser can support those functions. It doesn't tell you if the user has disabled the functions. If JavaScript is turned off in a JavaScript-capable browser or a user refuses to accept cookies when the browser prompts him, get_browser( ) still indicates that the browser supports those functions.

Discussion

The query string has spaces encoded as +. Special characters such as # are hex-encoded as %23 because the ASCII value of # is 35, which is 23 in hexadecimal.
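
A small sketch that produces a query string with both kinds of encoding. The variable names, values, and the /muppet/select.php path are made up for this example.

```php
<?php
$vars = array('name'                 => 'Oscar the Grouch',
              'color'                => 'green',
              'favorite_punctuation' => '#');

$safe_vars = array();
foreach ($vars as $name => $value) {
    // Encode both the name and the value of each pair.
    $safe_vars[] = urlencode($name) . '=' . urlencode($value);
}

$url = '/muppet/select.php?' . join('&', $safe_vars);
print $url;
// /muppet/select.php?name=Oscar+the+Grouch&color=green&favorite_punctuation=%23
```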

Although urlencode( ) prevents any special characters in the variable names or values from disrupting the constructed URL, you may have problems if your variable names begin with the names of HTML entities. Consider this partial URL for retrieving information about a stereo system:

/stereo.php?speakers=12&cdplayer=52&amp=10

The HTML entity for ampersand (&) is &amp; so a browser may interpret that URL as:

/stereo.php?speakers=12&cdplayer=52&=10

To prevent embedded entities from corrupting your URLs, you have three choices. The first is to choose variable names that can't be confused with entities, such as _amp instead of amp. The second is to convert characters with HTML entity equivalents to those entities before printing out the URL. Use htmlentities( ) :
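
As a sketch of the htmlentities( ) approach, applied to the stereo URL from above (the link text is made up):

```php
<?php
$url = '/stereo.php?speakers=12&cdplayer=52&amp=10';

// Each & becomes &amp; in the HTML source, so the browser decodes the
// attribute back to the original URL instead of swallowing "&amp".
$safe_url = htmlentities($url);
print '<a href="' . $safe_url . '">Stereo System</a>';
// <a href="/stereo.php?speakers=12&amp;cdplayer=52&amp;amp=10">Stereo System</a>
```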

You may run into trouble with any GET method URLs that you can't explicitly construct with semicolons, such as a form with its method set to GET, because your users' browsers use & as the argument separator.

Because many browsers don't support using ; as an argument separator, the easiest way to avoid problems with entities in URLs is to choose variable names that don't overlap with entity names. If you don't have complete control over variable names, however, use htmlentities( ) to protect your URLs from entity decoding.

See Also

Using HTTP Basic Authentication

Problem

You want to use PHP to protect parts of your web site with passwords. Instead of storing the passwords in an external file and letting the web server handle the authentication, you want the password verification logic to be in a PHP program.

Solution

The $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'] global variables contain the username and password supplied by the user, if any. To deny access to a page, send a WWW-Authenticate header identifying the authentication realm as part of a response with status code 401:
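
A sketch of such a response, assuming the realm is called My Website and that credentials only have to be present (real code would also verify them). The helper name pc_deny_access( ) is hypothetical, and the command-line check exists only so the sketch can be run outside a web server.

```php
<?php
// Hypothetical helper: challenge the browser for credentials and stop.
function pc_deny_access(string $realm): void
{
    header('WWW-Authenticate: Basic realm="' . $realm . '"');
    header('HTTP/1.0 401 Unauthorized');
    print 'You need to enter a valid username and password.';
    exit;
}

// In a web request that carries no credentials, issue the challenge.
// (Skipped on the command line, where there is no browser to prompt.)
if (PHP_SAPI !== 'cli' && !isset($_SERVER['PHP_AUTH_USER'])) {
    pc_deny_access('My Website');
}
```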

Discussion

When a browser sees a 401 header, it pops up a dialog box for a username and password. Those authentication credentials (the username and password), if accepted by the server, are associated with the realm in the WWW-Authenticate header. Code that checks authentication credentials needs to be executed before any output is sent to the browser, since it might send headers. For example, you can use a function such as pc_validate( ) , shown in Example 8-2.
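
One possible shape for such a function, as a sketch. The hard-coded $users array and its passwords are placeholders for illustration only; a real site would more likely check hashed passwords stored in a database.

```php
<?php
function pc_validate($user, $pass)
{
    // Placeholder credentials: replace with a real lookup.
    $users = array('david' => 'fadj&32',
                   'adam'  => '8HEj838');

    // No credentials supplied, or an unknown username.
    if (!is_string($user) || !is_string($pass) || !isset($users[$user])) {
        return false;
    }

    // hash_equals() compares in constant time to avoid timing leaks.
    return hash_equals($users[$user], $pass);
}
```

A caller would pass it $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'] (or null when they are absent) and send the 401 response whenever it returns false.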

Replace the contents of the pc_validate( ) function with appropriate logic to determine if a user entered the correct password. You can also change the realm string from "My Website" and the message that gets printed if a user hits "cancel" in their browser's authentication box from "You need to enter a valid username and password."

HTTP Basic authentication can't be used if you're running PHP as a CGI. If you can't run PHP as a server module, you can use cookie authentication, discussed in Recipe 8.11.

Another issue with HTTP Basic authentication is that it provides no simple way for a user to log out, other than to exit the browser. The PHP online manual has a few suggestions for log out methods that work with varying degrees of success with different server and browser combinations at http://www.php.net/features.http-auth.

There is a straightforward way, however, to force a user to log out after a fixed time interval: include a time calculation in the realm string. Browsers use the same username and password combination every time they're asked for credentials in the same realm. Changing the realm name forces the browser to ask the user for new credentials. For example, this forces a log out every night at midnight:
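
A sketch of building such a realm, assuming the base realm name My Website. Because the date is part of the string, the realm changes when the server's date rolls over.

```php
<?php
// The realm includes today's date, so yesterday's credentials no longer
// belong to the realm the browser is asked about.
$realm = 'My Website for ' . date('Y-m-d');
$challenge = 'WWW-Authenticate: Basic realm="' . $realm . '"';

// When credentials are missing or rejected, send the challenge
// together with the 401 status:
//   header($challenge);
//   header('HTTP/1.0 401 Unauthorized');
```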

You can also have a user-specific timeout without changing the realm name by storing the time that a user logs in or accesses a protected page. The pc_validate() function in Example 8-3 stores login time in a database and forces a log out if it's been more than 15 minutes since the user last requested a protected page.

See Also

Using Cookie Authentication

Problem

You want more control over the user login procedure, such as presenting your own login form.

Solution

Store authentication status in a cookie or as part of a session. When a user logs in successfully, put their username in a cookie. Also include a hash of the username and a secret word so a user can't just make up an authentication cookie with a username in it:

You can use the same pc_validate( ) function from Recipe 8.10 to verify the username and password. The only difference is that you pass it $_REQUEST['username'] and $_REQUEST['password'] as the credentials instead of $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW']. If the password checks out, send back a cookie that contains the username and a hash of the username combined with a secret word. The hash prevents a user from faking a login just by sending a cookie with a username in it.

Once the user has logged in, a page just needs to verify that a valid login cookie was sent in order to do special things for that logged-in user:
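
Both halves of this scheme can be sketched as follows. The helper names pc_make_login_cookie( ) and pc_check_login_cookie( ), the cookie name login, and the secret word are assumptions for illustration, and the sketch picks hash_hmac( ) with SHA-256 as its hash function, which is a choice made here rather than something the recipe prescribes.

```php
<?php
$secret_word = 'if i ate spinach';   // placeholder: keep the real one private

// Build the cookie value: username, a comma, then a keyed hash of the username.
function pc_make_login_cookie($username, $secret_word)
{
    return $username . ',' . hash_hmac('sha256', $username, $secret_word);
}

// Return the username if the cookie value is genuine, or false otherwise.
function pc_check_login_cookie($cookie, $secret_word)
{
    // Split on the last comma, because a username may contain commas.
    $pos = strrpos($cookie, ',');
    if ($pos === false) {
        return false;
    }
    $username = substr($cookie, 0, $pos);
    $hash     = substr($cookie, $pos + 1);
    $expected = hash_hmac('sha256', $username, $secret_word);
    return hash_equals($expected, $hash) ? $username : false;
}

// After a successful login:
//   setcookie('login', pc_make_login_cookie($user, $secret_word));

// On later pages, trust the cookie only if the hash matches.
if (isset($_COOKIE['login']) && is_string($_COOKIE['login'])) {
    $username = pc_check_login_cookie($_COOKIE['login'], $secret_word);
    if ($username !== false) {
        print 'Welcome, ' . htmlspecialchars($username) . '.';
    }
}
```

Without the secret word a user can't compute a matching hash, so changing the username in the cookie makes the check fail.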

If you use the built-in session support, you can add the username and hash to the session and avoid sending a separate cookie. When someone logs in, set an additional variable in the session instead of sending a cookie:

Using cookie or session authentication instead of HTTP Basic authentication makes it much easier for users to log out: you just delete their login cookie or remove the login variable from their session. Another advantage of storing authentication information in a session is that you can link users' browsing activities while logged in to their browsing activities before they log in or after they log out. With HTTP Basic authentication, you have no way of tying the requests with a username to the requests that the same user made before they supplied a username. Looking for requests from the same IP address is error-prone, especially if the user is behind a firewall or proxy server. If you are using sessions, you can modify the login procedure to log the connection between session ID and username:

This example writes a message to the error log, but it could just as easily record the information in a database that you could use in your analysis of site usage and traffic.

One danger of using session IDs is that sessions are hijackable. If Alice guesses Bob's session ID, she can masquerade as Bob to the web server. The session module has two optional configuration directives that help you make session IDs harder to guess. The session.entropy_file directive contains a path to a device or file that generates randomness, such as /dev/random or /dev/urandom. The session.entropy_length directive holds the number of bytes to be read from the entropy file when creating session IDs.

No matter how hard session IDs are to guess, they can also be stolen if they are sent in clear text between your server and a user's browser. HTTP Basic authentication also has this problem. Use SSL to guard against network sniffing, as described in Recipe 14.11.

Solution

Discussion

The flush( ) function sends all output that PHP has internally buffered to the web server, but the web server may have internal buffering of its own that delays when the data reaches the browser. Additionally, some browsers don't display data immediately upon receiving it, and some versions of Internet Explorer don't display a page until they've received at least 256 bytes. To force IE to display content, print blank spaces at the beginning of the page:
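
As a sketch, assuming a page that announces a slow operation before starting it (the message text is made up):

```php
<?php
// Pad the start of the page to reach Internet Explorer's 256-byte threshold.
$padding = str_repeat(' ', 256);
print $padding;

print 'Finding identical snowflakes...';
flush();   // push what has been printed so far toward the browser

// ... the slow work happens here, then the rest of the page is printed ...
```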

See Also

Buffering Output to the Browser

Problem

You want to start generating output before you're finished sending headers or cookies.

Solution

Call ob_start( ) at the top of your page and ob_end_flush( ) at the bottom. You can then intermix commands that generate output and commands that send headers. The output won't be sent until ob_end_flush( ) is called:

<?php ob_start(); ?>
I haven't decided if I want to send a cookie yet.
<?php setcookie('heron','great blue'); ?>
Yes, sending that cookie was the right decision.
<?php ob_end_flush(); ?>

Discussion

You can pass ob_start( ) the name of a callback function to process the output buffer with that function. This is useful for postprocessing all the content in a page, such as hiding email addresses from address-harvesting robots:
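
A sketch of such a callback. The function name mangle_email( ), the sample sentence, and the regular expression (which is deliberately loose and will not match every valid address) are assumptions for illustration.

```php
<?php
// Callback: receives the buffered page and returns the text to send.
function mangle_email($buffer)
{
    // Replace the domain of anything that looks like an email address.
    return preg_replace('/([^@\s]+)@([-a-z0-9]+\.)+[a-z]{2,}/i', '$1@...', $buffer);
}

ob_start('mangle_email');
print 'I would not like spam sent to ronald@example.com!';
ob_end_flush();
// I would not like spam sent to ronald@...!
```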

Compressing Web Output with gzip

Problem

You want to send compressed content to browsers that support automatic decompression.

Solution

Add this setting to your php.ini file:

zlib.output_compression=1

Discussion

Browsers tell the server that they can accept compressed responses with the Accept-Encoding header. If a browser sends Accept-Encoding: gzip or Accept-Encoding: deflate, and PHP is built with the zlib extension, the zlib.output_compression configuration directive tells PHP to compress the output with the appropriate algorithm before sending it back to the browser. The browser uncompresses the data before displaying it.

You can adjust the compression level with the zlib.output_compression_level configuration directive:

See Also

Hiding Error Messages from Users

Problem

You don't want PHP error messages to be visible to users.

Solution

Set the following values in your php.ini or web server configuration file:

display_errors = off
log_errors = on

These settings tell PHP not to display errors as HTML to the browser but to put them in the server's error log.

Discussion

When log_errors is set to on, error messages are written to the server's error log. If you want PHP errors to be written to a separate file, set the error_log configuration directive with the name of that file:

error_log = /var/log/php.error.log

If error_log is set to syslog, PHP error messages are sent to the system logger using syslog(3) on Unix and to the Event Log on Windows NT.

There are lots of error messages you want to show your users, such as telling them they've filled in a form incorrectly, but you should shield your users from internal errors that may reflect a problem with your code. There are two reasons for this. First, these errors appear unprofessional (to expert users) and confusing (to novice users). If something goes wrong when saving form input to a database, check the return code from the database query and display a message to your users apologizing and asking them to come back later. Showing them a cryptic error message straight from PHP doesn't inspire confidence in your web site.

Second, displaying these errors to users is a security risk. Depending on your database and the type of error, the error message may contain information about how to log in to your database or server and how it is structured. Malicious users can use this information to mount an attack on your web site.

For example, if your database server is down, and you attempt to connect to it with mysql_connect( ), PHP generates the following warning:

<br>
<b>Warning</b>: Can't connect to MySQL server on 'db.example.com' (111) in
<b>/www/docroot/example.php</b> on line <b>3</b><br>

If this warning message is sent to a user's browser, he learns that your database server is called db.example.com and can mount an attack on it.

Discussion

Every error generated has an error type associated with it. For example, if you try to array_pop( ) a string, PHP complains that "This argument needs to be an array," since you can only pop arrays. The error type associated with this message is E_NOTICE, a nonfatal runtime problem.

By default, the error reporting level is E_ALL & ~E_NOTICE, which means all error types except notices. The & is a bitwise AND, and the ~ is a bitwise NOT. However, the php.ini-recommended configuration file sets the error reporting level to E_ALL, which is all error types.
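
Because the error types are bit flags, they combine with PHP's bitwise operators. A short sketch of setting the level and testing individual bits:

```php
<?php
// Everything except notices: clear the E_NOTICE bit from E_ALL.
$level = E_ALL & ~E_NOTICE;
error_reporting($level);

// Testing a single bit shows which types are included.
$reports_warnings = ($level & E_WARNING) !== 0;  // true
$reports_notices  = ($level & E_NOTICE) !== 0;   // false
```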

Error messages flagged as notices are runtime problems that are less serious than warnings. They're not necessarily wrong, but they indicate a potential problem. One example of an E_NOTICE is "Undefined variable," which occurs if you try to use a variable without previously assigning it a value:
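
The two cases can be sketched as follows, assuming a made-up three-element $array. Note that PHP 8 and later report an undefined variable as a warning rather than a notice; the behavior of the loop itself is the same.

```php
<?php
$array = array('a', 'b', 'c');

// First case: $html has no value before the loop, so the first
// append triggers the "Undefined variable" message.
foreach ($array as $value) {
    $html .= $value;
}
$first = $html;

// Second case: $html starts as the empty string, so no message appears.
$html = '';
foreach ($array as $value) {
    $html .= $value;
}
$second = $html;
```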

In the first case, the first time through the foreach, $html is undefined. So, when you append to it, PHP lets you know you're appending to an undefined variable. In the second case, the empty string is assigned to $html above the loop to avoid the E_NOTICE. The previous two code snippets produce identical results because an undefined variable is treated as the empty string when you append to it. The E_NOTICE can be helpful because, for example, you may have misspelled a variable name:

A custom error-handling function can parse errors based on their type and take an appropriate action. A complete list of error types is shown in Table 8-2.

Table 8-2. Error types

Value  Constant           Description                                               Catchable
1      E_ERROR            Nonrecoverable error                                      No
2      E_WARNING          Recoverable error                                         Yes
4      E_PARSE            Parser error                                              No
8      E_NOTICE           Possible error                                            Yes
16     E_CORE_ERROR       Like E_ERROR but generated by the PHP core                No
32     E_CORE_WARNING     Like E_WARNING but generated by the PHP core              No
64     E_COMPILE_ERROR    Like E_ERROR but generated by the Zend Engine             No
128    E_COMPILE_WARNING  Like E_WARNING but generated by the Zend Engine           No
256    E_USER_ERROR       Like E_ERROR but triggered by calling trigger_error( )    Yes
512    E_USER_WARNING     Like E_WARNING but triggered by calling trigger_error( )  Yes
1024   E_USER_NOTICE      Like E_NOTICE but triggered by calling trigger_error( )   Yes
2047   E_ALL              Everything                                                n/a

Errors labeled catchable can be processed by the function registered using set_error_handler( ) . The others indicate such a serious problem that they're not safe to be handled by users, and PHP must take care of them.

Discussion

A custom error handling function can parse errors based on their type and take the appropriate action. See Table 8-2 in Recipe 8.16 for a list of error types.

Pass set_error_handler( ) the name of a function, and PHP forwards all errors to that function. The error handling function can take up to five parameters. The first parameter is the error type, such as 8 for E_NOTICE. The second is the message thrown by the error, such as "Undefined variable: html". The third and fourth arguments are the name of the file and the line number in which PHP detected the error. The final parameter is an array holding all the variables defined in the current scope and their values.

For example, in this code $html is appended to without first being assigned an initial value:
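
A sketch of such a script, assuming a handler named pc_error_handler( ) and a made-up $form array. The fifth parameter is optional here because current PHP versions no longer pass it, and the error number and message wording also differ between PHP versions.

```php
<?php
function pc_error_handler($errno, $error, $file, $line, $context = array())
{
    print "[ERROR][$errno][$error][$file:$line]\n";
    // Older PHP versions pass the variables in scope as a fifth argument.
    if ($context) {
        print_r($context);
    }
    return true;   // skip PHP's built-in error handler
}

error_reporting(E_ALL);
set_error_handler('pc_error_handler');

$form = array('one', 'two');
foreach ($form as $line) {
    $html .= "<b>$line</b>";   // $html was never given an initial value
}
```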

When the "Undefined variable" error is generated, pc_error_handler( ) prints:

[ERROR][8][Undefined variable: html][err-all.php:16]

After the initial error message, pc_error_handler( ) also prints a large array containing all the globals, environment, request, and session variables.

Errors labeled catchable in Table 8-2 can be processed by the function registered using set_error_handler( ). The others indicate such a serious problem that they're not safe to be handled by users and PHP must take care of them.

Discussion

An HTTP message has a header and a body, which are sent to the client in that order. Once you begin sending the body, you can't send any more headers. So, if you call setcookie( ) after printing some HTML, PHP can't send the appropriate Set-Cookie header.

Also, remove trailing whitespace in any include files. When you include a file with blank lines outside <?php ?> tags, the blank lines are sent to the browser. Use trim( ) to remove leading and trailing blank lines from files:
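
A sketch of that cleanup for a single file. The helper name pc_trim_file( ) and the choice to keep a .bak copy beside the original are assumptions for illustration.

```php
<?php
// Hypothetical helper: strip leading and trailing whitespace from a file,
// keeping a backup copy next to it. Returns true on success.
function pc_trim_file($file)
{
    $contents = file_get_contents($file);
    if ($contents === false) {
        return false;
    }
    // Back up the original before overwriting it.
    if (!copy($file, "$file.bak")) {
        return false;
    }
    return file_put_contents($file, trim($contents)) !== false;
}
```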

Instead of processing files on a one-by-one basis, it may be more convenient to do so on a directory-by-directory basis. Recipe 19.8 describes how to process all the files in a directory.

If you don't want to worry about blank lines disrupting the sending of headers, turn on output buffering. Output buffering prevents PHP from immediately sending all output to the client. If you buffer your output, you can intermix headers and body text with abandon. However, it may seem to users that your server takes longer to fulfill their requests since they have to wait slightly longer before the browser displays any output.

Discussion

Debugging code is a necessary side-effect of writing code. There are a variety of techniques to help you quickly locate and squash your bugs. Many of these involve including scaffolding that helps ensure the correctness of your code. The more complicated the program, the more scaffolding needed. Fred Brooks, in The Mythical Man-Month, guesses that there's "half as much code in scaffolding as there is in product." Proper planning ahead of time allows you to integrate the scaffolding into your programming logic in a clean and efficient fashion. This requires you to think out beforehand what you want to measure and record and how you plan on sorting through the data gathered by your scaffolding.

One technique for sifting through the information is to assign different priority levels to different types of debugging comments. Then the debug function prints a comment only if its priority is higher than the current priority level.
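
A sketch of that idea. The constant names, the pc_debug( ) function, the threshold variable, and the sample messages are all hypothetical.

```php
<?php
// Hypothetical priority levels: higher numbers are more important.
define('PC_DEBUG_INFO', 1);
define('PC_DEBUG_WARN', 2);
define('PC_DEBUG_FATAL', 3);

// A comment must rank higher than this level to be printed.
$pc_debug_level = PC_DEBUG_INFO;

// Print the comment if its priority is high enough; report whether it was.
function pc_debug($message, $priority)
{
    global $pc_debug_level;
    if ($priority > $pc_debug_level) {
        print "[debug:$priority] $message\n";
        return true;
    }
    return false;
}

pc_debug('Loaded configuration', PC_DEBUG_INFO);   // suppressed
pc_debug('Database is slow', PC_DEBUG_WARN);       // printed
```

Raising or lowering $pc_debug_level in one place then controls how much debugging output the whole program produces.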

Reading Environment Variables

Problem

You want to get the value of an environment variable.

Solution

Discussion

Environment variables are named values associated with a process. For instance, in Unix, you can check the value of $_ENV['HOME'] to find the home directory of a user:

print $_ENV['HOME']; // user's home directory
/home/adam

Early versions of PHP automatically created PHP variables for all environment variables by default. As of 4.1.0, php.ini-recommended disables this because of speed considerations; however, php.ini-dist continues to enable environment variable loading for backward compatibility.

The $_ENV array is created only if the value of the variables_order configuration directive contains E. If $_ENV isn't available, use getenv( ) to retrieve an environment variable:

$path = getenv('PATH');

The getenv( ) function isn't available if you're running PHP as an ISAPI module.

Setting Environment Variables

Problem

You want to set an environment variable in a script or in your server configuration. Setting environment variables in your server configuration on a host-by-host basis allows you to configure virtual hosts differently.

Solution

To set an environment variable in a script, use putenv( ) :

putenv('ORACLE_SID=ORACLE'); // configure oci extension

To set an environment variable in your Apache httpd.conf file, use SetEnv:

SetEnv DATABASE_PASSWORD password

Discussion

An advantage of setting variables in httpd.conf is that you can set more restrictive read permissions on it than on your PHP scripts. Since PHP files need to be readable by the web-server process, this generally allows other users on the system to view them. By storing passwords in httpd.conf, you can avoid placing a password in a publicly available file. Also, if you have multiple hostnames that map to the same document root, you can configure your scripts to behave differently based on the hostnames.

For example, you could have members.example.com and guests.example.com. The members version requires authentication and allows users additional access. The guests version provides a restricted set of options, but without authentication:

Reading Configuration Variables

Problem

You want to get the value of a PHP configuration setting.

Solution

Use ini_get( ) :

// find out the include path:
$include_path = ini_get('include_path');

Discussion

To get all configuration variable values in one step, call ini_get_all( ) . It returns the variables in an associative array, and each array element is itself an associative array. The second array has three elements: a global value for the setting, a local value, and an access code:
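
A short sketch that looks at one entry of that array, using include_path as the example setting:

```php
<?php
$settings = ini_get_all();

// Each entry is itself an array with the keys
// global_value, local_value, and access.
print_r($settings['include_path']);
```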

The global_value is the value set from the php.ini file; the local_value is adjusted to account for any changes made in the web server's configuration file, any relevant .htaccess files, and the current script. The value of access is a numeric constant representing the places where this value can be altered. Table 8-3 explains the values for access. Note that the name access is a little misleading in this respect, as the setting's value can always be checked, but not always adjusted.

Table 8-3. Access values

Value   PHP constant      Meaning
1       PHP_INI_USER      Any script, using ini_set( )
2       PHP_INI_PERDIR    Directory level, using .htaccess
4       PHP_INI_SYSTEM    System level, using php.ini or httpd.conf
7       PHP_INI_ALL       Everywhere: scripts, directories, and the system

A value of 6 means the setting can be changed at both the directory and system levels, as 2 + 4 = 6. In practice, there are no variables modifiable only in PHP_INI_USER or PHP_INI_PERDIR, and all variables are modifiable in PHP_INI_SYSTEM, so everything has a value of 4, 6, or 7.
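
Because the access values combine by addition of powers of two, they can be tested with bitwise AND. The sketch below defines its own constants from the numeric values in Table 8-3; the names ACCESS_USER, ACCESS_PERDIR, ACCESS_SYSTEM, and access_levels( ) are assumptions for this example:

```php
<?php
// Numeric access values from Table 8-3, defined locally so the sketch
// does not depend on which constant names a given PHP version exposes.
const ACCESS_USER   = 1;
const ACCESS_PERDIR = 2;
const ACCESS_SYSTEM = 4;

// Return the places where a setting with the given access value can be changed.
function access_levels(int $access): array
{
    $levels = [];
    if ($access & ACCESS_USER)   { $levels[] = 'script'; }
    if ($access & ACCESS_PERDIR) { $levels[] = 'directory'; }
    if ($access & ACCESS_SYSTEM) { $levels[] = 'system'; }
    return $levels;
}

print implode(', ', access_levels(6)) . "\n";   // directory, system
```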

You can also get variables belonging to a specific extension by passing the extension name to ini_get_all( ):

By convention, the variables for an extension are prefixed with the extension name and a period. So, all the session variables begin with session. and all the Java variables begin with java., for example.
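
A minimal sketch of the per-extension call, using session only because the text mentions it. The extension_loaded( ) guard is an addition so the example does not fail on a build without that extension:

```php
<?php
// Ask only for the settings that belong to one extension.
$ext = 'session';
$settings = extension_loaded($ext) ? ini_get_all($ext) : [];

foreach ($settings as $name => $details) {
    // print each setting name with its current (local) value
    print "$name = " . var_export($details['local_value'], true) . "\n";
}
```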

Since ini_get( ) returns the current value for a configuration directive, if you want to check the original value from the php.ini file, use get_cfg_var( ) :
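
The difference can be sketched as follows. The precision setting is an arbitrary choice; get_cfg_var( ) returns false when php.ini does not set the directive at all:

```php
<?php
// Change a setting for this request only.
ini_set('precision', '10');

$current  = ini_get('precision');      // reflects the ini_set() call above
$original = get_cfg_var('precision');  // value from php.ini, or false if php.ini does not set it

print "current: $current\n";
print 'php.ini: ' . var_export($original, true) . "\n";
```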

Setting Configuration Variables

Problem

You want to change the value of a PHP configuration setting.

Solution

Use ini_set( ) :

// add a directory to the include path; PATH_SEPARATOR is ':' on Unix and ';' on Windows
ini_set('include_path', ini_get('include_path') . PATH_SEPARATOR . '/home/fezzik/php');

Discussion

Configuration variables are not permanently changed by ini_set( ). The new value lasts only for the duration of the request in which ini_set( ) is called. To make a persistent modification, alter the values stored in the php.ini file.

It isn't meaningful to alter certain variables, such as asp_tags or register_globals, because by the time you call ini_set( ) to modify the setting, it's too late to change the behavior the setting affects. If a variable can't be changed, ini_set( ) returns false.

However, it is useful to alter configuration variables in certain pages. For example, if you're running a script from the command line, set html_errors to off.
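
The following sketch checks the return value of ini_set( ) and applies the command-line case just described. The precision setting is an arbitrary choice, and pc_no_such_directive is a made-up name used only to show a failing call:

```php
<?php
// ini_set() returns the previous value on success and false on failure,
// so compare against false explicitly.
$old = ini_set('precision', '12');
if ($old === false) {
    print "precision could not be changed\n";
}

// A name that is not a real directive (made up for this sketch) also fails.
$failed = ini_set('pc_no_such_directive', '1');

// Turn off HTML in error messages when running from the command line.
if (PHP_SAPI === 'cli') {
    ini_set('html_errors', '0');
}
```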

Solution

Discussion

When Apache processes a request from a client, it goes through a series of steps; PHP plays only one part in the entire chain. Apache also remaps URLs, authenticates users, logs requests, and more. While processing a request, each handler has access to a set of key/value pairs called the notes table. The apache_note( ) function provides access to the notes table to retrieve information set by handlers earlier on in the process and leave information for handlers later on.

For example, if you use the session module to track users and preserve variables across requests, you can integrate this with your log file analysis so you can determine the average number of page views per user. Use apache_note( ) in combination with the logging module to write the session ID directly to the access_log for each request:
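
A minimal sketch of both halves, under stated assumptions: the note key session_id, the helper note_session_id( ), and the log format nickname with_session are all invented for this example, and apache_note( ) exists only when PHP runs as an Apache module:

```php
<?php
// Copy the session ID into Apache's notes table under the (assumed) key
// "session_id". The function reports whether the note could be written.
function note_session_id(string $sessionId): bool
{
    if (!function_exists('apache_note')) {
        return false;   // not running as an Apache module
    }
    apache_note('session_id', $sessionId);
    return true;
}

// In a page that has already called session_start():
//   note_session_id(session_id());
//
// A matching httpd.conf fragment could append the note to each log entry:
//   LogFormat "%h %l %u %t \"%r\" %>s %b %{session_id}n" with_session
//   CustomLog logs/access_log with_session
```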

The trailing n tells Apache to use a variable stored in its notes table by another module.

If PHP is built with the --enable-memory-limit configuration option, it stores the peak memory usage of each request in a note called mod_php_memory_usage. Add the memory usage information to a LogFormat with:

The Benchmark_Iterate::get( ) method returns an associative array. The mean element of this array holds the mean execution time for each iteration of the function. The iterations element holds the number of iterations. The execution time of each iteration of the function is stored in an array element with an integer key. For example, the time of the first iteration is in $results[1], and the time of the 37th iteration is in $results[37].

To automatically record the elapsed execution time after every line of PHP code, use the declare construct and the ticks directive:

The ticks directive allows you to execute a function on a repeatable basis for a block of code. The number assigned to ticks is how many statements go by before the functions that are registered using register_tick_function( ) are executed.

In the previous example, we register a single function and have the profile( ) function execute for every statement inside the declare block. If there are two elements in $_SERVER['argv'], profile( ) is executed four times: once for each time through the foreach loop, and once each time the print strlen($arg) line is executed.
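
A comparable sketch, written for a current PHP version, is shown below. It loops over a fixed array instead of $_SERVER['argv'] so it can run anywhere, and it stores timestamps in a global array named $tick_times, which is an assumption of this example. The exact number of ticks can vary between PHP versions, so the code does not rely on it:

```php
<?php
$tick_times = [];

// Record the time at every tick.
function profile(): void
{
    global $tick_times;
    $tick_times[] = microtime(true);
}

register_tick_function('profile');

// profile() runs after every statement inside this block.
declare(ticks=1) {
    $total = 0;
    foreach (['alpha', 'beta'] as $arg) {
        $total += strlen($arg);
    }
}

unregister_tick_function('profile');

print count($tick_times) . " ticks recorded\n";
```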

You can also set things up to call two functions every three statements:

If you want to execute an object method, pass the object and the name of the method encapsulated within an array. This lets register_tick_function( ) know you're referring to an object instead of a function.

Call unregister_tick_function( ) to remove a function from the list of tick functions:
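
The pieces above can be sketched together: two tick handlers called every three statements, one of them an object method, followed by the removal of the plain function. The names count_tick( ) and TickCounter are hypothetical:

```php
<?php
$function_calls = 0;

function count_tick(): void
{
    global $function_calls;
    $function_calls++;
}

class TickCounter
{
    public $calls = 0;

    public function tick(): void
    {
        $this->calls++;
    }
}

$counter = new TickCounter();
register_tick_function('count_tick');          // a plain function
register_tick_function([$counter, 'tick']);    // an object method, passed as an array

// Both handlers run once for every three statements in this block.
declare(ticks=3) {
    $sum = 0;
    for ($i = 1; $i <= 10; $i++) {
        $sum += $i;
    }
}

unregister_tick_function('count_tick');        // count_tick() no longer runs on ticks
```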

Program: Website Account (De)activator

When users sign up for your web site, it's helpful to know that they've provided you with a correct email address. To validate the email address they provide, send an email to the address they supply when they sign up. If they don't visit a special URL included in the email after a few days, deactivate their account.

This system has three parts. The first is the notify-user.php program that sends an email to a new user and asks them to visit a verification URL, shown in Example 8-4. The second, shown in Example 8-5, is the verify-user.php page that handles the verification URL and marks users as valid. The third is the delete-user.php program that deactivates accounts of users who don't visit the verification URL after a certain amount of time. This program is shown in Example 8-6.

Here's the SQL to create the table that user information is stored in:

You probably want to store more information than this about your users, but this is all that's needed to verify them. When creating a user's account, save information to the users table, and send the user an email telling them how to verify their account. The code in Example 8-4 assumes that the user's email address is stored in the variable $email.
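
The verification string and URL could be produced along these lines. This is a sketch for a current PHP version, not the code from Example 8-4; the base URL, the query parameter names, and both function names are assumptions:

```php
<?php
// Build a random verification string.
function make_verify_string(): string
{
    return bin2hex(random_bytes(8));   // 16 hexadecimal characters
}

// Build the URL to mail to the user (base URL and parameter names are assumed).
function make_verify_url(string $email, string $verify_string): string
{
    $base = 'http://www.example.com/verify-user.php';
    return $base . '?' . http_build_query([
        'email'         => $email,
        'verify_string' => $verify_string,
    ]);
}

$email = 'user@example.com';
$verify_string = make_verify_string();
$url = make_verify_url($email, $verify_string);
// Save $email and $verify_string to the users table, then mail $url to the user.
```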

The user's verification status is updated only if the email address and verify string provided match a row in the database that has not already been verified. The last step is the short program that deletes unverified users after the appropriate interval, as shown in Example 8-6.

Run this program once a day to scrub the users table of users that haven't been verified. If you want to change how long users have to verify themselves, adjust the value of $window, and update the text of the email message sent to users to reflect the new value.
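
The effect of $window can be illustrated with the date arithmetic alone. The sketch below does not touch a database; the value of 7 days and the function should_delete( ) are assumptions for this example:

```php
<?php
// Number of days a user has to verify an account (the value 7 is an assumption).
$window = 7;

// True when an account created at $created_on (a Unix timestamp) is still
// unverified and older than the window.
function should_delete(int $created_on, bool $verified, int $window, int $now): bool
{
    $cutoff = $now - $window * 86400;   // 86400 seconds in a day
    return !$verified && $created_on < $cutoff;
}

$eight_days_ago = time() - 8 * 86400;
var_dump(should_delete($eight_days_ago, false, $window, time()));
```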

Program: Abusive User Checker

Shared memory's speed makes it an ideal way to store data different web server processes need to access frequently when a file or database would be too slow. Example 8-7 shows the pc_Web_Abuse_Check class, which uses shared memory to track accesses to web pages in order to cut off users that abuse a site by bombarding it with requests.

To use this class, call its check_abuse( ) method at the top of a page, passing it the username of a logged-in user:

// get_logged_in_user_name() is a function that finds out if a user is logged in
if ($user = get_logged_in_user_name( )) {
    $abuse = new pc_Web_Abuse_Check( );
    if ($abuse->check_abuse($user)) {
        exit;
    }
}

The check_abuse( ) method secures exclusive access to the shared memory segment in which information about users and traffic is stored with the get_lock( ) method. If the current user is already on the list of abusive users, it releases its lock on the shared memory, prints out an error page to the user, and returns true. The error page is defined in the class's constructor.

If the user isn't on the abusive user list, and the current page (stored in $_SERVER['PHP_SELF']) isn't on a list of pages to exclude from abuse checking, the count of pages that the user has looked at is incremented. The list of pages to exclude is also defined in the constructor. By calling check_abuse( ) at the top of every page and putting pages that don't count as potentially abusive in the $exclude array, you ensure that an abusive user will see the error page even when retrieving a page that doesn't count towards the abuse threshold. This makes your site behave more consistently.

The next section of check_abuse( ) is responsible for adding users to the abusive users list. If more than $this->recalc_seconds seconds have passed since the last time it added users to the abusive users list, it looks at each user's pageview count. Any users whose count is over $this->pageview_threshold are added to the abusive users list, and a message is put in the error log. The code that sets $this->data['traffic_start'] if it's not already set is executed only the very first time check_abuse( ) is called. After adding any new abusive users, check_abuse( ) resets the count of users and pageviews and starts a new interval until the next time the abusive users list is updated. After releasing its lock on the shared memory segment, it returns false.
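
The recalculation step can be sketched without shared memory. This is a simplified, in-memory illustration and not the code of Example 8-7: the function recalc_abusive( ) and the user_traffic key are assumptions, while traffic_start and abusive_users follow the names used in the text:

```php
<?php
// Take the data array, apply the recalculation rules, and return the new array.
function recalc_abusive(array $data, int $now, int $recalc_seconds, int $pageview_threshold): array
{
    if (!isset($data['traffic_start'])) {
        $data['traffic_start'] = $now;            // very first call
    }
    if ($now - $data['traffic_start'] > $recalc_seconds) {
        foreach ($data['user_traffic'] ?? [] as $user => $pageviews) {
            if ($pageviews > $pageview_threshold) {
                // the class described in the text also writes to the error log here
                $data['abusive_users'][$user] = $pageviews;
            }
        }
        $data['user_traffic']  = [];              // start a new interval
        $data['traffic_start'] = $now;
    }
    return $data;
}
```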

All the information check_abuse( ) needs for its calculations, such as the abusive user list, recent pageview counts for users, and the last time abusive users were calculated, is stored inside a single associative array, $data. This makes reading the values from and writing the values to shared memory easier than if the information were stored in separate variables, because only one call each to shm_get_var( ) and shm_put_var( ) is necessary.

The pc_Web_Abuse_Check class blocks abusive users, but it doesn't provide any reporting capabilities or a way to add or remove specific users from the list. Example 8-8 shows the abuse-manage.php program, which lets you manage the abusive user data.

Example 8-8 prints out information about current user page view counts and the current abusive user list, as shown in Figure 8-1. It also lets you add or remove specific users from the list and clear the whole list.

Figure 8-1. Abusive users

When it removes users from the abusive users list, instead of:

unset($abuse->data['abusive_users'][$_REQUEST['user']])

it sets the following to 0:

$abuse->data['abusive_users'][$_REQUEST['user']]

This still causes check_abuse( ) to return false, but it allows the page to explicitly note that the user was on the abusive users list but was removed. This is helpful to know in case a user that was removed starts causing trouble again.
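
The difference between the two approaches shows up when you test the array afterward. In this sketch the user names and counts are invented, and a user is treated as blocked only when the stored value is nonzero, which matches the behavior described above:

```php
<?php
// Hypothetical list: each abusive user maps to a nonzero value.
$abusive_users = ['mallory' => 250, 'trudy' => 180];

unset($abusive_users['mallory']);   // all trace of mallory is gone
$abusive_users['trudy'] = 0;        // trudy stays on record, but is no longer blocked

foreach (['mallory', 'trudy'] as $user) {
    $blocked    = !empty($abusive_users[$user]);
    $was_listed = array_key_exists($user, $abusive_users);
    print "$user: blocked=" . var_export($blocked, true)
        . ' was_listed=' . var_export($was_listed, true) . "\n";
}
```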

When a user is added to the abusive users list, instead of recording a pageview count, the script records the time the user was added. This is helpful in tracking down who manually added the user to the list, or why.

If you deploy pc_Web_Abuse_Check and this maintenance page on your server, make sure that the maintenance page is protected by a password or otherwise inaccessible to the general public. Obviously, this code isn't very helpful if abusive users can remove themselves from the list of abusive users.

Notes

Before PHP 4.2.0, this behavior had to be explicitly enabled by building PHP with the --enable-trans-sid configuration setting.