Chances are, you've been CAPTCHA-ed. Since the Internet spam problem reached epidemic proportions several months ago, an increasing number of Web-based e-mail services and antispam applications have started using CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) to combat the online bots spammers so often use to carry out their dirty deeds.

Since 1950, when British mathematician Alan Turing wrote an article called "Computing Machinery and Intelligence" for an Oxford philosophy journal, people have applied the phrase Turing test to any experiment in which subjects must distinguish between man and machine by exchanging information with the unknown entity. Such tests strive to determine whether a computer exhibits human intelligence, indicated, in Turing's view, by the computer successfully fooling subjects into believing it's human. CAPTCHA turns the game around, with the machine separating humans from computers.

When you sign up for a Web-based e-mail account on Yahoo!, for example, the site displays a small rectangular graphic containing a short, well-known word, such as coat, manage, or worry. The word is obscured, but you can just make it out. As part of the account registration, you are asked to type the word in an adjacent box. This is a CAPTCHA process. A human can read the word and correctly type it in the box, whereas a bot can't, ostensibly. In this way, Yahoo! hopes to prevent spammers from registering thousands of e-mail addresses and using them to broadcast unwanted advertisements.

At the other end of the e-mail chain, Internet users often install software that uses CAPTCHA to weed out incoming spam from legitimate correspondence. When a message arrives, such apps will mail the sender a challenge message with a built-in reverse Turing test (reverse because a machine applies the test). The most famous, a tool called MailFrontier, asks senders to identify the number of cats in a digital photograph.

CAPTCHAs are blocking all sorts of online misbehavior. The Ticketmaster Web site, for instance, posts CAPTCHAs to prevent scalpers from buying enormous numbers of tickets to sell at a profit. The trouble is, diligent spammers find ways around such tests, rolling out new and improved bots. For example, circumventing MailFrontier's CAPTCHA isn't difficult. After all, only so many cats can fit in a photograph. With the Yahoo! CAPTCHA, bots can use optical character recognition to identify hidden words. And since the Yahoo! implementation uses common English words, a bot must identify only the first one or two letters and guess the rest.
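The dictionary shortcut just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word list is made up, the function name candidates is hypothetical, and the sketch presumes some OCR step has already recovered the first letters. It is not a description of how any actual bot works.

```python
# Hypothetical word list standing in for the common English words
# a word-based CAPTCHA might draw from (an assumption for illustration).
WORDS = ["coat", "manage", "worry", "cone", "market", "window"]

def candidates(prefix, length=None, words=WORDS):
    """Return the words consistent with the letters an OCR pass could read.

    prefix -- the leading letters that were recognized
    length -- optional word length, if the bot could count the characters
    """
    return [w for w in words
            if w.startswith(prefix) and (length is None or len(w) == length)]

print(candidates("wo"))     # only one word fits, so the guess is certain
print(candidates("co"))     # two words fit, so the bot has a 50-50 guess
```

The smaller the vocabulary, the fewer letters the bot needs before only one candidate remains.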

To make life more difficult for spammers, scientists at the Palo Alto Research Center (PARC) have designed a new breed of CAPTCHA called BaffleText. The CAPTCHA is similar to the one Yahoo! uses in that it asks you to identify words buried in digital graphics, but it can't be read by today's OCR technology. PARC scientists begin with a graphic in which the embedded word is easily visible, then use a host of techniques to degrade the image progressively.

The scientists may print out the image and scan it back in or apply a technique called thresholding (transferring the image from color to black and white and back again), which significantly changes gray levels. They may just add random noise to the image. "It often looks like the image has undergone a shark attack," says PARC scientist Henry Baird, who heads the project. "Basically, we make the image worse and worse and worse until OCR systems crap out trying to read it." But even if an OCR engine were able to discern a letter or two, it couldn't guess the remainder of the word, because BaffleText uses only nonsense words.
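For readers curious what this kind of degradation looks like in practice, here is a minimal Python sketch of two of the steps mentioned above, thresholding and random noise. It is a simplified stand-in, not PARC's actual pipeline: the image is a small grid of gray levels from 0 to 255, and the function names, the cutoff of 128, and the flip probability are all assumptions chosen for illustration.

```python
import random

def threshold(image, cutoff=128):
    """Map each gray level (0-255) to pure black (0) or pure white (255)."""
    return [[255 if px >= cutoff else 0 for px in row] for row in image]

def add_noise(image, flip_prob, rng):
    """Invert each black-or-white pixel with probability flip_prob."""
    return [[255 - px if rng.random() < flip_prob else px for px in row]
            for row in image]

def degrade(image, rounds=3, flip_prob=0.05, seed=0):
    """Threshold once, then add noise repeatedly so each round is worse."""
    rng = random.Random(seed)  # seeded so the result is repeatable
    out = threshold(image)
    for _ in range(rounds):
        out = add_noise(out, flip_prob, rng)
    return out

# A tiny 3x4 "image" of gray levels, made up for illustration.
img = [[10, 200, 130, 90],
       [250, 40, 127, 128],
       [0, 255, 60, 180]]

print(threshold(img))  # every pixel is now either 0 or 255
print(degrade(img))    # the same image after three rounds of noise
```

Raising rounds or flip_prob makes the image progressively harder to read, which mirrors the idea of degrading it until OCR fails while a person can still make out the word.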

A team of computer scientists at UC Berkeley led by Jitendra Malik has broken the Yahoo! CAPTCHA, but BaffleText has proven insurmountable thus far. Meanwhile, PARC has done many user tests with the system, and humans can read the degraded images with ease.

But BaffleText could make using the Net even more difficult for the visually impaired. CAPTCHAs are already a big problem for blind Web users. "If you're using standard screen-reading programs such as speech-output technology to read a Web site, those programs have no way to access the information contained in a CAPTCHA," says Dan Aunspach, chief engineer for the Virginia Department for the Blind and Visually Impaired.

Aunspach adds, "Normally we tell Web designers to add HTML tags to their image files to describe the image for screen-reading programs, but you can't do that with CAPTCHAs, because that would allow bots to get around them." BaffleText would cause the same problem for blind users and hinder people with partial visual impairments. "The more difficult you make a CAPTCHA to read, the more problems it's going to cause for people with limited vision," Aunspach says. "A lot of people have a loss of central vision, and they have to use peripheral vision for sight. It's very hard for them to read and decode a CAPTCHA image."

PARC has yet to sell the technology for commercial use, but plans are in the works. Spammers beware.
