Using WolframAlpha to Hack Text CAPTCHA

I’ve been playing around with the Text CAPTCHA demo page and wondered how good WolframAlpha is at logic questions. As it turns out, Wolfram is pretty smart! However, since a CAPTCHA requires an exact answer, some of the results from WolframAlpha are logically correct but not exact matches. If someone wanted to use WolframAlpha to crack the text CAPTCHA technology, they could build in filters and such to narrow down answers to what the CAPTCHA is likely looking for.

Out of 10 demo questions, 3 failed and 7 were correct (although 4 of those 7 had the correct answer but would fail a CAPTCHA unless the exact answer was parsed out). Here are the results:

Text CAPTCHA: What is seven hundred and forty four as a number? WolframAlpha: NumberQ[744] Result: ALMOST

This entry was posted on Wednesday, November 10th, 2010 at 1:15 pm and is filed under Security.

18 comments

Text CAPTCHA: What is seven hundred and forty four as a number? WolframAlpha: NumberQ[744] Result: ALMOST Text CAPTCHA: The 7th letter in the word “central” is? WolframAlpha: the word Result: FAILED Text CAPTCHA: Which word in this sentence is all IN c…

True. I didn’t fiddle with the demo page long enough to notice that. If you check out the discussion on Hacker News, you’ll see what some people are saying about the accuracy of WolframAlpha tested against more questions.

You could always take it a step further and render each textCAPTCHA question as an image before serving it. Then it would be considerably more difficult (though not impossible w/ OCR technology) to just plug it into a search engine.

Really though, CAPTCHAs and anti-CAPTCHA measures are just another kind of arms race that will never cease.

I agree that making it an image would make it harder, but that also raises the question: are we trying to make the CAPTCHAs easier for users to decipher or harder for bots to crack?

Text CAPTCHA wants to address both, which makes sense, but the types of questions that are asked need to remain fairly simple for users and also have absolute answers. So there is some give-and-take there I think.

I imagine that questions are the right direction, but they need to be structured in a more abstract way. Someone on Hacker News commented that a question about the contents of an image would be harder for bots, since they would have to both comprehend the question and use detection algorithms on the image to figure out its contents.

I find it interesting that, in the ALMOST cases, the answers from WolframAlpha are two words long.

The spammer just needs to take this into account and, when receiving a two-word answer from WolframAlpha, try only one of the words. If it fails, save the other in a database and wait for the question to appear again.

Spammers would love to have this success ratio with traditional captchas.
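The retry idea described above could be sketched roughly like this in Python. Everything here is hypothetical: the class name `TwoWordGuesser`, the in-memory dictionary standing in for the "database", and the example question and answer strings are placeholders, not real WolframAlpha output or any real API.

```python
from typing import Dict, List


class TwoWordGuesser:
    """Hypothetical helper: try one word of a multi-word answer, remember the rest."""

    def __init__(self) -> None:
        # Maps a question to the answer words that have not failed yet.
        # A plain dict stands in for the database mentioned in the comment.
        self.untried: Dict[str, List[str]] = {}

    def guess(self, question: str, engine_answer: str) -> str:
        """Return the next candidate word for this question ('' if none are left)."""
        if question not in self.untried:
            self.untried[question] = engine_answer.split()
        candidates = self.untried[question]
        return candidates[0] if candidates else ""

    def report_failure(self, question: str) -> None:
        """Drop the word that was just tried so the next attempt uses the other one."""
        candidates = self.untried.get(question)
        if candidates:
            candidates.pop(0)


guesser = TwoWordGuesser()
first_try = guesser.guess("placeholder question", "alpha beta")
guesser.report_failure("placeholder question")  # the CAPTCHA rejected first_try
second_try = guesser.guess("placeholder question", "alpha beta")
```

On the second appearance of the same question the guesser falls back to the other word, which is all the comment's strategy needs.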

@Saiyine – Exactly. Using WolframAlpha directly is not going to be accurate. But understanding the types of responses they give and parsing the correct answer out of it can be easy. Adding a couple other algorithms to the process will definitely help… for the ones that WolframAlpha comes close to answering correctly.

There’s another very important criterion for CAPTCHAs: they must be easy to generate computationally. If they’re not easy to generate, then you get a small number of questions that the bot can cache.

It seems to me that a big limitation of text CAPTCHA is the limited range of query templates. You can write parsers for all of those templates pretty easily, defeating the system. Even if you only get 20% CAPTCHA hit rates, that just reduces the number of accounts you can get per IP address in a given amount of time. But you’re still “in”, fundamentally.
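To illustrate how little a per-template parser might take, here is a minimal Python sketch for one template, the "Nth letter in the word" question quoted earlier on this page. The regular expression and the function name `solve_letter_question` are assumptions made for illustration; the real service may phrase this template in other ways that this pattern would miss.

```python
import re
from typing import Optional

# Assumed phrasing of the template; other wordings would need their own patterns.
LETTER_TEMPLATE = re.compile(
    r'The (\d+)(?:st|nd|rd|th) letter in the word ["“]?(\w+)["”]? is\?',
    re.IGNORECASE,
)


def solve_letter_question(question: str) -> Optional[str]:
    """Return the requested letter, or None if the question does not fit the template."""
    match = LETTER_TEMPLATE.search(question)
    if match is None:
        return None
    position, word = int(match.group(1)), match.group(2)
    # Positions in the question are 1-based; reject out-of-range requests.
    if not 1 <= position <= len(word):
        return None
    return word[position - 1]


answer = solve_letter_question("The 7th letter in the word “central” is?")
```

A bot would presumably keep a list of such parsers and fall through to a guess (or a search engine) when none of them matches.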

I posted a comment on the Hacker News thread about that. Questions will come in certain sentence structures that become predictable. I think that, if text CAPTCHAs are to be harder for a bot to guess, the subject matter has to be much more abstract. For instance, instead of asking “what is the second color in the sequence blue, apple, banana, yellow, green, orange?” (which can easily be learned and guessed by a bot) you need to ask something along the lines of “how many people will have warm hands if you hand out 3 hats, 5 pairs of gloves, and 2 scarves?”…. Or at least, maybe that gets closer?
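As a rough illustration of why the colour-sequence question is easy for a bot, here is a small Python sketch. The colour list, the ordinal table, and the function name `solve_colour_question` are all assumptions made up for this example; a real bot would need larger word lists and more templates.

```python
import re
from typing import Optional

# Assumed word lists, deliberately small; not taken from the Text CAPTCHA service.
KNOWN_COLOURS = {"red", "orange", "yellow", "green", "blue",
                 "purple", "white", "black", "pink", "brown"}
ORDINALS = {"first": 0, "1st": 0, "second": 1, "2nd": 1,
            "third": 2, "3rd": 2, "fourth": 3, "4th": 3, "last": -1}


def solve_colour_question(question: str) -> Optional[str]:
    """Pick the Nth colour word in the question, or None if it cannot be answered."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    # The first ordinal word in the question decides which colour is wanted.
    index = next((ORDINALS[t] for t in tokens if t in ORDINALS), None)
    # Keep only the tokens that are known colours, in order of appearance.
    colours = [t for t in tokens if t in KNOWN_COLOURS]
    if index is None or not colours or index >= len(colours):
        return None
    return colours[index]


answer = solve_colour_question(
    "what is the second color in the sequence blue, apple, banana, yellow, green, orange?"
)
```

Note that the sketch counts “orange” as a colour rather than a fruit, which is the kind of ambiguity a more abstract question could exploit.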

Maybe a list of requirements/criteria needs to be established against which each text CAPTCHA question is vetted. Then again, it’s just a CAPTCHA, so does it have to work 100%?

Interesting how good Wolfram is at decoding the logic. I’m the author of the textcaptcha service. I think it’s certainly true that textual captchas will never be as strong as their image counterparts, but they do serve a purpose as a middle ground. I would hesitate about using them in a mission-critical situation as they form a weaker defence. For the most part CAPTCHAs are used as obstacles to spam rather than hard-line defence: whether this usage is justified is debatable and I believe there is a grey area.

Rendering the questions as an image is not very helpful: it negates the reason to use logic at all (why not just a word?), you would need to provide an audio alternative, and it is not much of an obstacle since OCR is easy.

I’ll have to have a think about the question construction, but there is a delicate line to walk: questions must not become simply too confusing (or take too long) for everybody to actually understand and solve. I have thought about trying to grade question strength and allowing people to specify a question strength when they ask for questions: stronger questions would be harder to break but probably more difficult for real users to understand, and the decision would depend on your audience.

I think you are right that CAPTCHAs are not meant as an absolute means of protecting an application, but as a way to slow down spammers and whoever else. I like the idea of text CAPTCHAs being easier for people to decipher, and from an academic standpoint, playing more with sentence structure for increased strength could be interesting. However, the effort that would go into that might not be practical for your cause. Like you said, CAPTCHAs are meant as an obstacle, not anything more.

180 million questions is impressive! I look forward to seeing how the methodology for creating questions evolves. Good luck!

@David – Yeah, it does some arbitrary guesswork, but could be easily improved. However WolframAlpha have implemented it, it would be interesting to see if they use something like Text CAPTCHA’s database as “practice” in tuning their algorithms….

The colour one is just a coincidence; it doesn’t work at all. Try changing the question to ask for the 1st, 3rd, 4th, first, or last colour in that list; it answers “yellow” to everything. It even answers “yellow” if you ask it “The purple colour in purple, yellow, arm, white and blue is?”!