Der's blag

Some shtuff

A captcha is a simple, modern Turing test for everyday use: easy for a human, but a hard nut to crack for a bot or a simple neural network.
You can throw your AI at it too, but surely it can be solved with a few lines of code, can't it?

Captcha example: [audio clip]

There are a few ways to do it:

Use Google's servers for recognition (slow, boring, not free).

Use an audio recognition package. I tried two of them and could not figure out how to use either.

Turn the audio into a spectrogram. Because the voice is synthetic, each sound looks the same every time it occurs.
This way, we can treat the spectrogram like an ordinary image captcha.
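To make the spectrogram idea concrete, here is a minimal sketch in pure Python. It is not the tool used to produce the images in this post; the function name `spectrogram`, the frame size, and the hop size are all assumptions chosen for illustration, and a real pipeline would use an FFT library instead of this slow naive DFT.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame lists the magnitude per frequency bin."""
    # A Hann window reduces leakage between neighbouring frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        # Naive DFT over the non-negative frequencies only (bins 0..frame_size/2).
        mags = []
        for k in range(frame_size // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Synthetic test tone (an assumption, not captcha audio):
# 8 full cycles per 64-sample frame, so the energy should land in bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = spectrogram(tone)
peak_bins = [frame.index(max(frame)) for frame in spec]
```

Each frame becomes one image column and each bin one pixel row, which is why identical synthetic sounds produce identical-looking shapes.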

At the bottom of the spectrogram image you can see some low-frequency noise. We have to crop it out, which is really easy with ImageMagick. Note also that the sounds vary in length.
Now you can use your favourite captcha solver or continue reading.
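For illustration only, the same cleanup can be sketched in Python instead of ImageMagick. The sketch assumes the image is a list of rows of grayscale values (0 = black, 255 = white) with the low frequencies at the bottom; the helper names `remove_low_band` and `binarize`, the number of noise rows, and the threshold are hypothetical and would need tuning on real spectrograms.

```python
def remove_low_band(image, noise_rows):
    """Drop the bottom `noise_rows` rows of a row-major image (top row first)."""
    if noise_rows <= 0:
        return [row[:] for row in image]
    return [row[:] for row in image[:-noise_rows]]


def binarize(image, threshold=128):
    """Map dark pixels (signal) to 1 and light pixels (background) to 0."""
    return [[1 if value < threshold else 0 for value in row] for row in image]


# Tiny made-up image: the last row plays the role of the low-frequency noise.
raw = [
    [255, 255, 255],
    [255,   0, 255],
    [  0,   0,   0],
]
clean = binarize(remove_low_band(raw, noise_rows=1))
```

After this step a "black pixel" is simply a 1, which keeps the later splitting logic short.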

The next step is separating the individual sounds. I used an old script of mine for this (you can find it as split.py in files.tar.gz). It simply iterates over every image column and returns the regions whose columns contain at least one black pixel.
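The column scan described above can be sketched like this. It is not the actual split.py from files.tar.gz, just a minimal version of the same idea; it assumes a binary image stored as a list of rows where 1 means a black pixel, and the name `split_columns` is made up for the example.

```python
def split_columns(image):
    """Return (start, end) column ranges, end exclusive, that contain black pixels."""
    if not image:
        return []
    width = len(image[0])
    regions = []
    start = None  # first column of the region currently being scanned, if any
    for x in range(width):
        has_black = any(row[x] for row in image)
        if has_black and start is None:
            start = x                    # a new region begins here
        elif not has_black and start is not None:
            regions.append((start, x))   # an empty column closes the region
            start = None
    if start is not None:
        regions.append((start, width))   # region touching the right edge
    return regions


# Two blobs separated by empty columns.
sample = [
    [0, 1, 1, 0, 0, 1],
    [0, 0, 1, 0, 0, 0],
]
regions = split_columns(sample)
```

Each returned range can then be cropped out as a separate image, one per sound.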

There was one problem with it: the sound “nine” was sometimes split incorrectly on the spectrogram (two images instead of one). I could have fixed that easily, even without editing the Python script, but the result was 80% successful, so I did not bother.
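One possible way to handle the split "nine" is to merge regions that are separated by only a few empty columns. This is an assumption on my part for illustration, not the fix hinted at above (that one did not require touching the script); `merge_close_regions` and `max_gap` are hypothetical names, and a suitable gap value would have to be found by looking at real spectrograms.

```python
def merge_close_regions(regions, max_gap):
    """Merge sorted (start, end) column ranges whose gap is at most `max_gap`."""
    merged = []
    for start, end in regions:
        if merged and start - merged[-1][1] <= max_gap:
            # Gap to the previous region is small: treat both as one sound.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# Made-up ranges: the first two are 2 columns apart, the third is far away.
pieces = [(0, 10), (12, 20), (40, 50)]
joined = merge_close_regions(pieces, max_gap=3)
```

The trade-off is that too large a `max_gap` would glue two neighbouring sounds together, so the value has to stay below the shortest real pause.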