
Implementing Audio CAPTCHA

By David Summer, December 10, 2007

Source code accompanies this article.

David uses sound to make CAPTCHA an equal opportunity security device.

Roll Tape

When I created my audio CAPTCHA, my goals were to keep things simple by using only a small amount of code with no outside dependencies. I also wanted to keep the instructions and user interaction as simple as possible. A solution using PHP seemed best for this, with some JavaScript to handle browser independence and keyboard input.

The main file, index.htm (Listing One), begins by starting a PHP session with a call to session_start(). Sessions are used in PHP as a persistence mechanism, letting data be passed from one page to another. (For more information on PHP sessions, see us2.php.net/session.)

After starting the session, the PHP script loads an array with four random digits, each from 0 to 9. These digits serve as the basis for the randomly generated audio and are used to verify the subsequent user entry. The four digits are then concatenated and stored in the $_SESSION global array. Here I make use of PHP's associative array feature, naming the array index "captchaAnswer".
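The digit-generation step can be sketched as follows. This is only an illustration of the logic, written in JavaScript to match the other examples; in the article this step runs server-side in PHP, and it must stay on the server for the CAPTCHA to mean anything. The function name makeCaptchaAnswer and the plain session object (a stand-in for $_SESSION) are assumptions, not names from the listing.

```javascript
// Hypothetical helper: returns a string of `count` random decimal digits.
// `rand` is injectable so the function can be exercised deterministically.
function makeCaptchaAnswer(count = 4, rand = Math.random) {
  const digits = [];
  for (let i = 0; i < count; i++) {
    digits.push(Math.floor(rand() * 10)); // integer from 0 to 9
  }
  return digits.join(""); // concatenate the digits into one string
}

// Stand-in for PHP's $_SESSION array (assumption for illustration only).
const session = {};
session["captchaAnswer"] = makeCaptchaAnswer();
```

Storing the answer as a string rather than a number keeps any leading zeros, so a sequence such as 0, 4, 2, 7 is kept as four characters.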

When the page is loaded, a JavaScript Init() function is called. This simply makes sure that the text box is free of any previous entry. The next two functions determine the user's browser and OS. If the user is running Internet Explorer under Windows, I want to use an embedded Windows Media Player to play the sound files.
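One way the browser and OS checks might look is sketched below. The function names are hypothetical and the tests are simple user-agent string matches, which is an assumption about the approach rather than a copy of the listing. User-agent sniffing is also easy to fool, so treat this as a rough heuristic.

```javascript
// Hypothetical helpers; the names are not taken from the article's listing.

// Classic Internet Explorer identifies itself with "MSIE " in the
// user-agent string; IE 11 dropped that token but still reports "Trident/".
function isInternetExplorer(userAgent) {
  return /MSIE |Trident\//.test(userAgent);
}

// Windows user-agent strings contain the word "Windows".
function isWindows(userAgent) {
  return /Windows/.test(userAgent);
}

// The embedded Windows Media Player is only used for IE on Windows.
function useEmbeddedMediaPlayer(userAgent) {
  return isInternetExplorer(userAgent) && isWindows(userAgent);
}
```

In a page, the argument would typically be navigator.userAgent.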

A JavaScript function KeyCheck() is called when the browser gets an onkeyup message. I use the key up rather than the key down because the key down can be generated more than once if the user holds a key down for a certain length of time.

KeyCheck() gets the key code, which identifies the key that the user pressed and released. It gets this code via either the window.event object that is built into IE or the event argument that is passed to the handler in Mozilla-based browsers.
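The two event sources can be handled in one small function, sketched here. The name getKeyCode and the win parameter are assumptions made so the logic can run outside a browser; in a page, win would simply be window.

```javascript
// Hypothetical helper: returns the numeric key code for a key event,
// or null if no event object is available.
//   e   - the event argument passed to the handler (Mozilla-based browsers)
//   win - the window object, whose `event` property IE fills in
function getKeyCode(e, win) {
  const evt = e || (win && win.event);
  if (!evt) {
    return null;
  }
  // keyCode is the usual property; `which` is a fallback on some browsers.
  return evt.keyCode || evt.which || null;
}
```

A handler would then be wired up along the lines of document.onkeyup = function (e) { KeyCheck(getKeyCode(e, window)); }, assuming KeyCheck() accepts the code as its argument.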

I chose to use the P key to start the audio. There is no need to check for a lowercase P because the key code reported on key up identifies the key itself rather than the character typed, so the code is the same whether or not Shift is held. I could have used almost any key here. I wanted to use the Spacebar, because that is what's used in many audio software packages for starting or stopping playback. But the Spacebar is already a browser hot key for page down, so I settled for P as in "Play."

When the P key up is detected, KeyCheck() clears any data that may be in the text box. Then it plays the audio file in one of two ways, depending on the browser and the OS. Under IE/Windows, the embedded Windows Media Player is used. The Media Player has an object model that allows script control. Calling the Player's controls.play() starts playback.

If the user is not on IE/Windows, the script uses whatever the user has set as the default media player. When I tested this, I found that when I used the HTML embed tag, the dynamically created sound file did not play. Using the iframe tag instead, with a 0 size frame, plays the file.
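The zero-size iframe approach might be sketched like this. The helper name buildAudioFrameHtml and the frameborder attribute are assumptions; only the iframe tag, the zero size, and PlaySound.php come from the article.

```javascript
// Hypothetical helper: returns markup for a zero-size iframe that loads
// `url`, which hands the sound file to the user's default media player.
function buildAudioFrameHtml(url) {
  return (
    '<iframe src="' + url + '" width="0" height="0" frameborder="0"></iframe>'
  );
}
```

In the page, the result would be assigned to the innerHTML of a container element, for example document.getElementById("audioFrame").innerHTML = buildAudioFrameHtml("PlaySound.php"), where the id "audioFrame" is a placeholder.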

When the audio has started, I shift the focus to the input text box. This sequence (activating the audio on the P key hit, shifting the focus to the input text box, and accepting the Enter key as a signal that the entry is complete) allows for a mouse-free user experience.
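The whole key-up sequence can be summarized in a short sketch. It is not the article's KeyCheck() listing: the function name handleKeyUp and the ui object, which bundles the page-specific actions so the logic can be run without a browser, are assumptions for illustration.

```javascript
const KEY_P = 80; // key code for the P key, the same for "p" and "P"

// Hypothetical dispatcher. `ui` supplies the page-specific actions:
//   useEmbeddedPlayer - true on IE/Windows
//   clearInput()      - empties the text box
//   playEmbedded()    - e.g. calls the Media Player's controls.play()
//   playInFrame(url)  - loads the sound in a zero-size iframe
//   focusInput()      - moves keyboard focus to the text box
// Returns true if the key was handled.
function handleKeyUp(keyCode, ui) {
  if (keyCode !== KEY_P) {
    return false; // ignore every key except P
  }
  ui.clearInput();
  if (ui.useEmbeddedPlayer) {
    ui.playEmbedded();
  } else {
    ui.playInFrame("PlaySound.php");
  }
  ui.focusInput(); // the user can now type the digits
  return true;
}
```

The Enter key needs no special handling in this sketch, because pressing Enter in a form's text box submits the form by default.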

Below KeyCheck() is code that embeds the Windows Media Player if the user is on IE/Windows. The url parameter is given the value of PlaySound.php. This is the file that generates the audio. Also, autoStart is False so the audio won't start on its own when the page is first loaded. You may want to change this value to True depending on how you are presenting the CAPTCHA.
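The embedding markup could be generated along these lines. This is a sketch under assumptions: the helper name buildMediaPlayerHtml and the id "Player" are hypothetical, and the classid shown is the one used by the Windows Media Player 7 and later ActiveX control, which may differ from what the listing uses.

```javascript
// Hypothetical helper: returns <object> markup that embeds the Windows
// Media Player ActiveX control (IE on Windows only).
//   url       - the sound source, e.g. "PlaySound.php"
//   autoStart - true to begin playing as soon as the page loads
function buildMediaPlayerHtml(url, autoStart) {
  return (
    '<object id="Player" classid="CLSID:6BF52A52-394A-11d3-B153-00C04F79FAA6">' +
    '<param name="URL" value="' + url + '">' +
    '<param name="autoStart" value="' + (autoStart ? "True" : "False") + '">' +
    "</object>"
  );
}
```

With the markup in place, script can reach the control by its id and start playback with its controls.play() method.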

After the JavaScript section comes the form data. The form takes the user input for the CAPTCHA and calls the CaptchaSubmit.php file to verify the input data. Lastly, I include a div section to contain the HTML frame that may be generated in KeyCheck().
