How robots and spiders are causing issues, and how to stop them. We can also talk about CAPTCHAs (Completely Automated Public Turing Test To Tell Computers And Humans Apart): their use, their compliance issues, porn proxies, PWNtcha and other ways to defeat them.

Recently I ran into a very powerful piece of software that can act like a real human.
It can spam guestbooks and forums of various types, such as phpBB, YaBB, and BulletinBoard.

The url is www.botmaster.net

I found out about it when a bad guy started spamming my forum: http://forum.flashband.net

For now I have had to disable the forum.

This botware is able to recognize CAPTCHA image validation and check email for "Activating account" messages; it can automatically create thousands of accounts and can overflow the database with the help of its multithreading features.

See its demo features:
http://www.botmaster.net/movies/XFull.htm
http://www.botmaster.net/movies/XDemo.htm

How does it defeat CAPTCHAs? Does it have its own OCR module? It might be useful to check the log files after an attack by this tool. Maybe there's a pattern.
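To make the log-checking idea concrete, here is a minimal sketch of what looking for a pattern could mean in practice. It assumes the web server writes Apache-style common or combined log lines; the function name `summarize_posts` and the sample data are made up for illustration, and nothing here is specific to how this particular tool behaves.

```python
import re
from collections import Counter

# Assumed layout (Apache common/combined log format):
# ip - - [time] "METHOD path proto" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def summarize_posts(lines):
    """Count POST requests per client IP and per User-Agent string."""
    by_ip, by_agent = Counter(), Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None or m.group("method") != "POST":
            continue  # skip unparsable lines and anything that is not a form post
        by_ip[m.group("ip")] += 1
        by_agent[m.group("agent") or "-"] += 1
    return by_ip, by_agent

if __name__ == "__main__":
    # Hypothetical sample lines; in practice you would iterate over the log file.
    sample = [
        '10.0.0.1 - - [01/Jan/2007:10:00:00 +0000] "POST /posting.php HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
        '10.0.0.2 - - [01/Jan/2007:10:00:02 +0000] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    ]
    ips, agents = summarize_posts(sample)
    print(ips.most_common(5))
    print(agents.most_common(5))
```

An unusually high count for one address or one User-Agent string is the kind of pattern worth a closer look, though a tool that spreads its requests over many addresses would likely not show up in the per-IP count alone.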

What about pure JS CAPTCHAs? I was designing a drag-and-drop CAPTCHA for my company some weeks ago - something like "drag the ball into the circle". Drag and drop is an action that even Selenium can't record or play back.

jungsonn is exactly right. I'd move towards a custom CAPTCHA in that case. You could also make the page dynamic and more difficult to submit to automatically: just changing it around randomly or adding another page once in a while can throw off bots. Watch its patterns and circumvent them by randomly changing the required pattern.

A trick I use regularly is making a hidden form with name, comment, and a submit button, which points to a bot trap or a logging-only script. Put it in a div layer set to visibility:hidden, display:none. You get the idea.

Below that goes your actual, true form with the CAPTCHA (visible, of course).

It's interesting to see that bots get stuck on the first form and choke on it.
I've logged quite a few bots now, and they all go to /dev/null forever.
I thought about proxy-forwarding them to the FBI.gov web form processors, but I don't know yet. I'm going to get them sooner or later. ^^
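The hidden-form trap described in this post could be sketched roughly as follows. This is only an illustration under assumptions: the two action paths, the `handle_submission` function, and the `captcha_ok` flag are hypothetical names, and the actual web framework and CAPTCHA check are left out.

```python
import logging

logger = logging.getLogger("bottrap")

# Hypothetical action URLs; any two distinct paths would do.
TRAP_PATH = "/guestbook-post"   # target of the hidden decoy form
REAL_PATH = "/comment-submit"   # target of the visible form with the CAPTCHA

# The decoy comes first in the page source and is hidden from human visitors.
DECOY_FORM = (
    '<div style="visibility:hidden; display:none">'
    f'<form method="post" action="{TRAP_PATH}">'
    '<input name="name"><textarea name="comment"></textarea>'
    '<input type="submit"></form></div>'
)

def handle_submission(path, form, client_ip, captcha_ok=False):
    """Classify a POSTed form as 'trapped', 'accepted', or 'rejected'."""
    if path == TRAP_PATH:
        # A human never sees this form, so whatever posts here is logged and dropped.
        logger.warning("bot trap hit from %s, fields: %r", client_ip, sorted(form))
        return "trapped"
    if path == REAL_PATH and captcha_ok:
        return "accepted"
    return "rejected"

if __name__ == "__main__":
    print(handle_submission(TRAP_PATH, {"name": "x", "comment": "spam"}, "10.0.0.1"))
    print(handle_submission(REAL_PATH, {"name": "me"}, "10.0.0.2", captcha_ok=True))
```

The trap handler does nothing with the submitted data except log it, which matches the "logging script only" idea: the bot gets a normal-looking response and nothing ever reaches the database.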

Botmaster uses proxies, and in fact I found the proxy list that Botmaster uses. I came across it through some good searching and Google dorking, but I didn't save the list. Anyhow, it was pretty huge.

Still, I think that when you juggle the contents of your source, it's pretty hard for them to regex against it. For example, use multiple hidden random signup forms and rotate them around in the source.
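One way the rotating-forms idea might look in code is sketched below. Everything in it is an assumption made for illustration: the secret key, the `/signup` and `/trap` paths, and the helper names are hypothetical. Field names for the real form are derived with an HMAC from a per-visitor token so the server can recompute them, and the decoy forms are shuffled into a random position on every page load.

```python
import hashlib
import hmac
import random
import secrets

SECRET_KEY = b"change-me"  # hypothetical server-side secret, never sent to clients

def field_name(token, logical_name):
    """Derive an unpredictable field name that the server can recompute later."""
    msg = f"{token}:{logical_name}".encode()
    return "f_" + hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()[:12]

def build_forms(token, decoys=3, rng=random):
    """Return HTML with one real signup form mixed in among hidden decoy forms."""
    real = (
        '<form method="post" action="/signup">'
        f'<input type="hidden" name="token" value="{token}">'
        f'<input name="{field_name(token, "email")}">'
        '<input type="submit"></form>'
    )
    forms = [real]
    for _ in range(decoys):
        junk = secrets.token_hex(6)  # throwaway field name, different every time
        forms.append(
            '<div style="display:none"><form method="post" action="/trap">'
            f'<input name="f_{junk}"><input type="submit"></form></div>'
        )
    rng.shuffle(forms)  # the real form lands at a different position each time
    return "\n".join(forms)

def read_field(token, form, logical_name):
    """Look up a submitted value by its logical name, or None if it is missing."""
    return form.get(field_name(token, logical_name))

if __name__ == "__main__":
    print(build_forms(secrets.token_hex(8)))
```

A bot that has hard-coded a field name or the position of the form finds neither stable between page loads. The inline `display:none` is itself something a bot could match on, so in practice the hiding would probably need to vary as well.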

Take away the uniqueness, and you take away their chance and ease of finding and targeting you.