Predicting Crime, 140 Characters at a Time

A UVA professor works at predicting crime patterns via Twitter

In the not-too-distant future, your local police department may be using an unlikely source to solve crimes: Twitter. If so, they may have UVA research assistant professor Matt Gerber to thank.

Gerber, 32, along with colleagues in the Systems and Information Engineering Department, developed a computer model that uses information pulled from Twitter to predict where certain types of crime are likely to occur.

And for 19 of 25 types of crime that occurred in metropolitan Chicago last year, the computer model’s predictions were more accurate than predictions based on historical crime patterns alone.

“There are still giant research questions to be figured out along the way,” Gerber says, “but it’d be fantastic if we could get this into practical use, effecting positive outcomes.”

After arriving at UVA in 2011, Gerber joined the University’s Predictive Technology Lab, focusing on crime analysis. Not long afterward, he began to wonder whether Twitter could improve crime-prediction models.

Initially, Gerber and his colleagues pulled tweets from local news accounts. But they realized that news accounts offered only reactive information; to find predictive patterns in the data, they needed tweets from individual Twitter users. Criminals don’t typically log on to Twitter and talk about the crimes they’re going to commit, so Gerber and his team began looking at the activities of everyday users, who tweet about where they’re going out on Saturday night or which sporting events they’ll attend this weekend.

“What we’re building on is the idea that those activities—you and your friends going out to the bar tonight—you aren’t planning to commit a crime necessarily, but a lot of people going to a bar correlates with certain types of crimes,” Gerber says.

Focusing on the Chicago area, Gerber and his team collected more than 1.5 million GPS-tagged tweets posted there from January to March of 2013. They divided Chicago into geographical squares, which they called “neighborhoods.”
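For readers curious how GPS-tagged tweets might be sorted into square "neighborhoods," here is a minimal sketch in Python. It is an illustration only, not the team's code: the cell size, the grid origin and the function names are all assumptions, since the article does not give them.

```python
import math
from collections import defaultdict

# Assumed values for illustration; the article does not state them.
CELL_KM = 1.0                          # side length of each square, in km
ORIGIN_LAT, ORIGIN_LON = 41.6, -87.95  # hypothetical southwest corner of the grid
KM_PER_DEG_LAT = 111.32                # approximate km per degree of latitude


def cell_for(lat, lon):
    """Return the (row, col) index of the grid square containing a point."""
    # A degree of longitude shrinks with latitude; approximate with the origin's.
    km_per_deg_lon = KM_PER_DEG_LAT * math.cos(math.radians(ORIGIN_LAT))
    row = math.floor((lat - ORIGIN_LAT) * KM_PER_DEG_LAT / CELL_KM)
    col = math.floor((lon - ORIGIN_LON) * km_per_deg_lon / CELL_KM)
    return row, col


def group_by_cell(tweets):
    """Group (lat, lon, text) tweets by the grid square they fall in."""
    cells = defaultdict(list)
    for lat, lon, text in tweets:
        cells[cell_for(lat, lon)].append(text)
    return dict(cells)
```

Each key in the result is one square, and each value is the list of tweet texts posted from inside it.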

They grouped the tweets from a particular neighborhood and ran a computer-driven analysis to determine what was being discussed. Next, they took that output and scored how strong a topic was within that neighborhood.
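The article does not say which analysis the team ran to work out what was being discussed. As a rough stand-in, the sketch below scores a topic by the share of a neighborhood's words that match a hand-picked keyword list. The topics, the keywords and the function name are hypothetical, and a real system would more likely use a statistical topic model than fixed keywords.

```python
import re
from collections import Counter

# Hypothetical topics and keywords, chosen only for illustration.
TOPIC_KEYWORDS = {
    "nightlife": {"bar", "club", "drinks"},
    "sports": {"game", "stadium", "tickets"},
}


def topic_strengths(texts, topics=TOPIC_KEYWORDS):
    """Score each topic as the fraction of all words that match its keywords."""
    counts = Counter()
    total_words = 0
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            total_words += 1
            for topic, keywords in topics.items():
                if word in keywords:
                    counts[topic] += 1
    if total_words == 0:
        # No words at all: every topic gets a strength of zero.
        return {topic: 0.0 for topic in topics}
    return {topic: counts[topic] / total_words for topic in topics}
```

Run on the tweets from one square, this returns one number per topic, which can then serve as that neighborhood's score for the topic.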

They then looked at where certain crimes, such as theft, tended to occur and built a computer model of the correlation between the two. After building the model on data from January, Gerber had it generate predictions for February. He and his team then compared those predictions with what actually happened and found that, for the majority of crime types, the model predicted more accurately than historical crime-pattern data alone.
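The comparison described above, a Twitter-informed prediction against a history-only prediction, can be sketched as follows. This is a simplified illustration under stated assumptions: the accuracy measure (the share of next-month crimes that land in the top-ranked squares), the hand-set weight and all of the function names are hypothetical, and the article does not describe the form of the team's actual model.

```python
def top_cells(scores, k):
    """Return the k squares with the highest predicted risk."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def hit_rate(scores, actual_counts, k):
    """Share of actual crimes that fell inside the k highest-scored squares."""
    total = sum(actual_counts.values())
    if total == 0:
        return 0.0
    flagged = top_cells(scores, k)
    hits = sum(n for cell, n in actual_counts.items() if cell in flagged)
    return hits / total


def combined_scores(past_counts, topic_strength, weight):
    """Add a weighted topic score to each square's past crime count.

    The weight is set by hand here; a real model would learn it from data.
    """
    cells = set(past_counts) | set(topic_strength)
    return {
        cell: past_counts.get(cell, 0) + weight * topic_strength.get(cell, 0.0)
        for cell in cells
    }
```

Calling hit_rate once with January's crime counts as the scores and once with the combined scores, using the same February counts and the same k, gives the two figures to compare.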

“With some crime types, the accuracy goes up significantly when using Twitter, but it depends on the type of crime,” Gerber says. The biggest accuracy improvements came in the areas of criminal damage, gambling and stalking.

Given the early stages of his findings, police departments aren’t signing up for his technology just yet. “We have to show improvements in the accuracy,” Gerber says. “We need to demonstrate that if you used our approach [it could] create improvements in response time to crime, or reductions in crime in ‘hot spots.’ Ultimately, if it were actually used by crime analysts, that’d be fantastic.”