Both of these approaches have their pros and cons, of course. This article will only deal with the second technique: verifying that the data you are receiving is coming from an actual human being and not a robot or script. A CAPTCHA is a way of testing input to ensure that you're dealing with a human. Now, there are a lot of ways to build a CAPTCHA, as documented in this MSDN article on the subject, but I will be focusing on a visual data entry CAPTCHA.

So, this article will document how to turn a set of existing ASP.NET web pages into a simple, drag and drop ASP.NET server control -- with a number of significant enhancements along the way.

Implementation

The first thing I had to deal with was the image generated by the CAPTCHA class. This was originally done with a dedicated .aspx form -- something that won't exist for a server control. How could I generate an image on the fly? After some research, I was introduced to the world of HttpModules and HttpHandlers. They are extremely powerful -- and a single HttpHandler solves this problem neatly.

All we need is a small Web.config modification in the <system.web> section:
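
A minimal sketch of such a registration is shown here. The path CaptchaImage.aspx and the class name CaptchaImageHandler come from this article; the namespace and assembly name (WebControlCaptcha) are assumptions and need to match whatever your compiled control DLL actually uses.

```xml
<!-- Sketch only: namespace and assembly name are assumed; adjust to your build -->
<system.web>
  <httpHandlers>
    <!-- Route GET requests for the virtual page CaptchaImage.aspx to the handler class -->
    <add verb="GET" path="CaptchaImage.aspx"
         type="WebControlCaptcha.CaptchaImageHandler, WebControlCaptcha" />
  </httpHandlers>
</system.web>
```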

This handler defines a special page named CaptchaImage.aspx. Now, this "page" doesn't actually exist. When a request for CaptchaImage.aspx occurs, it will be intercepted and handled by a class that implements the IHttpHandler interface: CaptchaImageHandler. Here's the relevant code section:

Public Sub ProcessRequest(ByVal context As System.Web.HttpContext) _
    Implements System.Web.IHttpHandler.ProcessRequest
    Dim app As HttpApplication = context.ApplicationInstance

    '-- get the unique GUID of the captcha; this must be passed in via querystring
    Dim strGuid As String = Convert.ToString(app.Request.QueryString("guid"))
    Dim ci As CaptchaImage

    If strGuid = "" Then
        '-- mostly for display purposes when in design mode
        '-- builds a CAPTCHA image with all default settings
        '-- (this won't reflect any design time changes)
        ci = New CaptchaImage
    Else
        '-- get the CAPTCHA from the ASP.NET cache by GUID
        ci = CType(app.Context.Cache(strGuid), CaptchaImage)
        app.Context.Cache.Remove(strGuid)
    End If

    '-- write the image to the HTTP output stream as an array of bytes
    ci.Image.Save(app.Context.Response.OutputStream, _
        Drawing.Imaging.ImageFormat.Jpeg)

    '-- let the browser know we are sending an image,
    '-- and that things are 200 A-OK
    app.Response.ContentType = "image/jpeg"
    app.Response.StatusCode = 200
    app.Response.End()
End Sub

A new CAPTCHA image will be generated, and the image streamed directly to the browser from memory. Problem solved!

However, there's another problem. There has to be communication between the HttpHandler responsible for displaying the image, and the web page hosting the control -- otherwise, how would the calling control know what the randomly generated CAPTCHA text was? If you view source on the rendered control, you'll see that a GUID is passed in through the querystring:
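
The rendered tag looks roughly like the sketch here. The GUID value is a placeholder, the width and height simply echo the control's default CaptchaWidth and CaptchaHeight, and the alt attribute is an assumption, so don't read the exact attributes as the control's literal output.

```html
<!-- Illustrative only: the guid value is a placeholder, not real output -->
<img src="CaptchaImage.aspx?guid=00000000-0000-0000-0000-000000000000"
     width="180" height="50" alt="CAPTCHA" />
```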

This GUID (globally unique identifier) is the key used to access a CAPTCHA object that the control originally stored in the ASP.NET Cache. The CaptchaControl.GenerateNewCaptcha method is where that happens.

It may seem a little strange, but it works great! The sequence of ASP.NET events is as follows:

1. Page is rendered.

2. Page calls CaptchaControl1.OnPreRender. This generates a new GUID and a new CAPTCHA object reflecting the control properties. The resulting CAPTCHA object is stored in the Cache by GUID.

3. Page calls CaptchaControl1.Render; the special <img> tag URL is written to the browser.

4. Browser attempts to retrieve the special <img> tag URL.

5. CaptchaImageHandler.ProcessRequest fires. It retrieves the GUID from the querystring, the CAPTCHA object from the Cache, and renders the CAPTCHA image. It then removes the Cache object.

Note that there is a little cleanup involved at the end. If, for some reason, the control renders but the image URL is never retrieved, there would be an orphan CAPTCHA object in the Cache. This can happen, but should be rare in practice -- and our Cache entry only has a 20-minute lifetime anyway.

One mistake I made early on was storing the actual CAPTCHA text in the ViewState. The ViewState is not encrypted and can be easily decoded! I've switched to ControlState for the GUID, which is essential for retrieving the shared CAPTCHA object from the Cache -- but by itself, the GUID is useless.

CaptchaControl Properties

The CaptchaControl is a good ASP.NET citizen, and properly implements all the default ASP.NET Server Control properties. It also has a few properties of its own:

| Property | Default | Description |
|---|---|---|
| CacheStrategy | HttpRuntime | For security reasons, the CAPTCHA text is never sent to the client; it is only stored on the server. It can be stored in Session (web-farm friendly) or HttpRuntime (very fast, but local to one webserver). |
| CaptchaBackgroundNoise | Low | Amount of background noise to add to the CAPTCHA image. Ranges from None to Extreme. |
| CaptchaChars | A-Z, 1-9 | A whitelist of characters to use when building CAPTCHA text. A character will be picked randomly from this string. By default, I omit some characters likely to be confused, such as O, 0, I, 1, 8, B, etc. |
| CaptchaFont | "" | Font family to use for the CAPTCHA text. If not provided, a random installed font will be chosen for each character. A font whitelist is maintained internally so only known legible fonts will be used (e.g., not WingDings). |
| CaptchaFontWarping | Low | Level of warping used on each character of the CAPTCHA text. Ranges from None to Extreme. |
| CaptchaHeight | 50 | Default height of the CAPTCHA image, in pixels. |
| CaptchaLength | 5 | Number of characters used in the randomly generated CAPTCHA text. |
| CaptchaLineNoise | None | Amount of "scribble" line noise to add to the CAPTCHA image. Ranges from None to Extreme. |
| CaptchaMaxTimeout | 90 | Number of seconds that the CAPTCHA will remain valid and stored in the cache after it is generated. |
| CaptchaMinTimeout | 3 | Minimum number of seconds the user must wait before entering a CAPTCHA. |
| CaptchaWidth | 180 | Default width of the CAPTCHA image, in pixels. |
| UserValidated | False | After postback, returns True if the user entered text that matches the randomly generated CAPTCHA text. Note that the standard IValidator interface is implemented as well. |
| LayoutStyle | Horizontal | Determines if the text and input box are to the right of, or below, the image. Allows greater layout flexibility. |

Many of these properties have to do with the inherent tradeoff between human readability and machine readability. The harder a CAPTCHA is for OCR software to read, the harder it will be for us human beings, too! For illustration, compare these two CAPTCHA images:

The CAPTCHA on the left is generated with all "medium" settings, which are a reasonable tradeoff between human readability and OCR machine readability. The CAPTCHA on the right uses a lower CaptchaFontWarping, and a smaller CaptchaLength. If the risk of someone writing OCR scripts to defeat your CAPTCHA is low, I strongly urge you to use the easier-to-read CAPTCHA settings. Remember, just having a CAPTCHA at all raises the bar quite high.

The CaptchaTimeout property (listed above as CaptchaMaxTimeout) was added later to alleviate concerns about CAPTCHA farming. It is possible to "pay" humans to solve harvested CAPTCHAs by re-displaying a CAPTCHA on another site and giving the user free MP3s or access to pornography if they solve it. However, this technique takes time, and it doesn't work if the CAPTCHA expires after a short time limit.

Conclusion

Many thanks to BrainJar for creating his simple yet effective CAPTCHA image class. Now that I've wrapped it up into an ASP.NET server control, it should be easier than ever to simply drop on a web form, set a few properties, and start defeating spammers at their own game!

There are many more details and comments in the demonstration solution provided at the top of the article, so check it out. And please don't hesitate to provide feedback, good or bad! I hope you enjoyed this article. If you did, you may also like my other articles.

History

Monday, November 8, 2004 - Published.

Friday, December 17, 2004 - Version 1.1

added UserValidationEvent

changed defaults to be less aggressive (more user friendly)

added LayoutStyle property for choice of horizontal or vertical layout

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

About the Author

My name is Jeff Atwood. I live in Berkeley, CA with my wife, two cats, and far more computers than I care to mention. My first computer was the Texas Instruments TI-99/4a. I've been a Microsoft Windows developer since 1992; primarily in VB. I am particularly interested in best practices and human factors in software development, as represented in my recommended developer reading list. I also have a coding and human factors related blog at www.codinghorror.com.

Is there something special that needs to be done to get this to work under IIS7? It worked perfectly when I tested it in debug mode in VS2008, but now that I've compiled the project and moved it to my testing web server (Windows 2008 R2 / IIS7), the captcha image isn't being generated.

All right then: getting the captcha control (CC1) functioning as it should was one thing, but then how does one "use it"?

After some time I realized the following:
CC1 in the DLL communicates with the Validation control (V1).
At CC1 code entry, V1 checks the entry validity. V1 sets CC1.UserValidated = true if the CC1 code is correct.
A submit button simply has to check CC1.UserValidated.

The next issue was IIS7 integrated mode and HttpHandler requests.
I was required to enter this line in Web.config as shown below:
<validation validateIntegratedModeConfiguration="false" />

This enabled captcha image generation on both my local development server and the remote IIS 7 server at discountasp.net (DASP.net). It appears that I have an issue with integrated mode on IIS 7 on DASP.net. Haven't sorted that out yet.
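
For context, the validation element belongs in the <system.webServer> section. The sketch here shows how it might sit next to an integrated-mode handler registration. The handler name attribute value and the MSCaptcha namespace and assembly are assumptions based on the names mentioned in this thread, so adjust them to your own build.

```xml
<system.webServer>
  <!-- Stops IIS 7 integrated mode from raising an error when it finds
       <httpHandlers> entries under <system.web> -->
  <validation validateIntegratedModeConfiguration="false" />
  <handlers>
    <!-- Integrated-mode counterpart of the <httpHandlers> entry;
         the "name" attribute is required here (value is arbitrary) -->
    <add name="CaptchaImage" verb="GET" path="CaptchaImage.aspx"
         type="MSCaptcha.CaptchaImageHandler, MSCaptcha" />
  </handlers>
</system.webServer>
```

In integrated mode IIS reads handlers from <system.webServer>, which may be why the image works locally under the Visual Studio development server but not on the remote host.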

Thanks to CodeProject and Wumpus1 for providing a Captcha solution for .Net.

If everything works locally but fails on your website, replace "MSCaptcha.CaptchaImageHandler" with "MSCaptcha.captchaImageHandler"; that is, type captchaImageHandler with a lowercase "c".

Relying on a guid passed via a querystring opens this approach up to injection.
An attacker could decode one guid, then keep passing the same guid in a querystring.
You should either
- keep the guid (or captcha text) on the server, or
- only allow a guid to be used once

I think it's appropriate to note that there isn't always a direct correlation between human readability and computer readability. I've seen many CAPTCHA systems that are very difficult for humans to read but are really easy for computers. The most notable mistake in this area is random colors and many small dots or lines (they need to be the same thickness as the letters or they'll be removed by a bot in one fast command leaving only easily segmentable letters).

To be honest, this CAPTCHA can be broken with fairly high accuracy using only 10 lines of my CBL scripting language (http://code.google.com/p/captcha-breaking-library/). This system could be greatly strengthened just by moving the letters closer together, morphing/rotating the characters more, and possibly making the random lines the same thickness as the average letter.

I tried to place the Captcha Control on a page in my site. Everything seemed to work perfectly right from the start except that the image didn't show. I double checked everything and it all looked fine, debug didn't even give any useful info about the problem. After a lot of frustration, I discovered it was the deny rule that was preventing it. Add this code to your Web.Config to solve that:
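
A sketch of the kind of override that addresses this, assuming the handler path is CaptchaImage.aspx as in the article and that a site-wide authorization deny rule is what blocks the image request:

```xml
<configuration>
  <!-- Let every visitor fetch the CAPTCHA image, even when the rest
       of the site denies anonymous users -->
  <location path="CaptchaImage.aspx">
    <system.web>
      <authorization>
        <allow users="*" />
      </authorization>
    </system.web>
  </location>
</configuration>
```

The <location> element is a child of <configuration>, alongside the main <system.web> section, so it goes outside your existing <system.web> block.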