Background

This article continues the topic of image-based CAPTCHAs started in my previous article, The Image-Based CAPTCHA. Please read it first to get a general idea of this article's subject.

Image-Based Bot Detector

I made a control, named ImageBasedBotDetector, that implements the idea stated in the previous article. Its code differs from the example posted before mainly in that the code that renders the control is separated from the code that renders the CAPTCHA image (the latter is replaced with an HttpHandler). To use it, set TemplateImageFolder to the path where the template images are located, and register the HttpHandler in web.config.
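A classic ASP.NET handler registration in web.config might look like the following sketch. The verb, path, and type names here are illustrative assumptions, not the control's actual identifiers; use the values shipped with the control:

```xml
<system.web>
  <httpHandlers>
    <!-- Hypothetical path and type; substitute the names from the download -->
    <add verb="GET" path="CaptchaImage.axd"
         type="ImageBasedBotDetector.CaptchaImageHandler, ImageBasedBotDetector" />
  </httpHandlers>
</system.web>
```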

ImageBasedBotDetector can be used either as a finished control, or as a base control if you want to add your own functionality: the GetTemplateImage and DrawCustom methods are virtual and can be overridden. An example of use can be found in demo project #1.

A few answers to your comments

After the previous article was published, I received many comments concerning the reliability of this type of CAPTCHA. I'll try to systematize and answer them.

It is possible to make a system that can recognize images and find the distorted part.

Quite possible, no doubt. If it is possible to distinguish between a real person and a fake one in a photo, then it can be done for specific distortions in a picture. Most text-based CAPTCHAs are also crackable in 90-99% of cases, but that does not prevent their general use. The fact is that no system can crack everything; a specific implementation is required in each concrete case. Specific implementations mean resources and money. So don't worry if your site is not as popular as Yahoo! or Google.

A solution that relies on JavaScript is not robust because the script can be disabled.

True, but this applies to only about 4% of visitors. It is hard to imagine a present-day web application that offers site-to-visitor interaction and does not use JavaScript. Besides, if you use ASP.NET controls, you have most probably already excluded these visitors anyway.

Insufficient accessibility.

The control now offers three types of distortion: Stretched, Random, and Volute (see picture).

Also, you can implement your own distortion if you set DistortionType to Custom and override the DrawCustom method.

The possibility of downloading the template images and comparing them to the image generated by the CAPTCHA.

To avoid pixel-to-pixel comparison, the original image is slightly distorted too. As for more complicated comparisons, see paragraph #1.
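A minimal sketch of why the slight distortion helps, using NumPy with a toy gradient image: even a one-pixel shift defeats a naive exact comparison against a downloaded template, while the picture stays visually almost unchanged.

```python
import numpy as np

def pixel_identical(a, b):
    """Exact pixel-to-pixel comparison - the naive attack the distortion defeats."""
    return a.shape == b.shape and np.array_equal(a, b)

# Toy 'template': an 8x8 horizontal gradient.
template = np.tile(np.arange(8, dtype=np.uint8), (8, 1))

# Slightly 'distorted' copy: every row shifted one pixel (with wrap-around).
distorted = np.roll(template, 1, axis=1)

# The two images look nearly the same to a human, but exact comparison fails.
```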

Besides, reconstruction of the template image set can be avoided if you don't use a fixed image set at all. It sounds strange, but it is actually rather simple. The basic principle is that although images of specific themes are required, there is no need to collect them manually. You can use a search engine instead, for example, Google Images. Select a theme, for example, "Landscape", choose appropriate search keywords, and then override the GetTemplateImage method.
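The principle can be sketched as follows. The theme-to-keywords map and function names are hypothetical illustrations, not part of the control; an overridden GetTemplateImage would perform the actual image-search request, which is omitted here:

```python
import random

# Hypothetical theme-to-keywords map; the real keyword set is your own choice.
THEME_KEYWORDS = {
    "Landscape": ["mountain landscape", "forest landscape", "seaside landscape"],
}

def pick_search_query(theme, rng=random):
    """Pick a random search query for the theme, so the effective template
    set is unbounded and cannot be downloaded in advance by an attacker."""
    return rng.choice(THEME_KEYWORDS[theme])

# An overridden GetTemplateImage would then fetch one result image for this
# query from an image search service (network call omitted in this sketch).
```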

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Comments and Discussions

As far as a bot is concerned, this algorithm is analogous to a CAPTCHA whose challenge is "find the character X in this picture". It can be cracked by a generic CAPTCHA-cracking algorithm used for text-based CAPTCHAs:
1. Edge detect by shifting 1px and then negating the difference.
2. OCR to detect the distortion design.
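Step 1 above can be sketched with NumPy: shifting the image one pixel and taking the absolute difference leaves flat regions dark and highlights boundaries, including the outline of a distorted region.

```python
import numpy as np

def shift_diff_edges(img):
    """Edge detection by shifting the image 1px and taking the absolute
    difference of the shifted copy and the original (step 1 above)."""
    img = img.astype(np.int16)                      # avoid uint8 wrap-around
    dx = np.abs(img - np.roll(img, 1, axis=1))      # horizontal 1px shift
    dy = np.abs(img - np.roll(img, 1, axis=0))      # vertical 1px shift
    return np.maximum(dx, dy).astype(np.uint8)

# A flat image produces no edges; a step edge shows up at the boundary.
flat = np.full((8, 8), 100, dtype=np.uint8)
step = flat.copy()
step[:, 4:] = 200
```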

Unfortunately, with this algorithm, the harder it is for a bot to see a particular image, the harder it is for a human to see it as well, thus defeating the purpose of the CAPTCHA.

As for avoiding image comparison, Hausdorff image matching would make images identifiable even if they are slightly distorted. Storing the image in a database wouldn't make any difference; many bots use the MSIE DOM API to retrieve data from web pages in practically the same way that users see them.
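A minimal sketch of the Hausdorff idea: treating each image as a set of (edge) points, the Hausdorff distance between an original and a slightly distorted copy stays small, so a small threshold still identifies the template. This is a toy point-set version, not a full image matcher:

```python
import numpy as np

def directed_hausdorff(a, b):
    """Max over points of a of the distance to the nearest point of b."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return d.min(axis=1).max()

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets (e.g. edge pixels)."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Edge points of an 'original' and a copy with each point nudged by <= 1px.
original = np.array([[0.0, 0.0], [0.0, 10.0], [10.0, 0.0]])
distorted = original + np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
```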