Using machine learning to predict gender

This all started with a simple question: could we train an algorithm to determine if a Twitter account belonged to a man or a woman? With that in mind, we ran a simple data categorization job, fired up our brand new CrowdFlower AI feature, and tried to answer just that.

What we found was, well, pretty damn interesting. But no spoilers. We’ll get to all that in a second. Let’s take a step back and start at the beginning.

Here’s how we did it

To run any CrowdFlower job, you of course need data, in this case tweets. The first challenge with a question like this is exactly what sort of tweets do you pull? To put it another way: if you fetch social data about, say, an especially seedy…