The 2 Reasons Data May Not Solve America’s Crime Problem

According to a 1987 Rand Corporation study, data surveys can accurately predict the likelihood of repeat offenses as much as 70% of the time. That figure assumes the surveys are used correctly, but regardless of the margin of error, these kinds of survey-based prediction systems have become increasingly popular at American prisons.

A February article from the Associated Press says states are using such systems to “drive down prison populations, reduce recidivism, and save billions of dollars.” The results of such initiatives have been mixed. On the one hand, states have made significant savings by reforming their prison systems. At the same time, certain cases blow holes in the theory of efficient data-driven systems.

For example, the story highlights the case of 58-year-old Milton Thomas, a parolee who is accused of raping a 71-year-old woman in Alabama. Before the rape accusation, Thomas had been jailed several times since 2008 for nonviolent crimes, such as check fraud. He was deemed low-risk for committing another crime based on a questionnaire used to assess a criminal’s likelihood of committing repeat offenses. If the allegations are true, then this is one of the 30% of cases where data surveys fail to accurately predict a repeat offense.

The idea of using data to predict crimes has broad appeal for a number of reasons. For starters, criminal incidents follow patterns: they tend to share common factors, such as location and method. Data can help officials uncover those patterns and take appropriate corrective action.

Data analysis can also help reduce overall criminal enforcement costs. According to a report by the Justice Policy Institute, North Carolina could save $560 million through 2017 by using data to assess criminals and effectively deploy law enforcement resources. According to the AP news story, North Carolina saved $84 million between 2011 and 2014 by using data to reduce its prison population by more than 3,000.

Finally, data can be used as part of an overall strategy to reduce incarceration rates across the United States. For example, comprehensive information about crime patterns, coupled with an on-the-ground assessment, can provide valuable information about the likelihood of a crime occurring again in a particular neighborhood. This is important because the U.S. has the highest incarceration rate in the world, with about 1.6 million prisoners, or 500 prisoners per 100,000 residents, and the cost of maintaining this prisoner population is enormous. Comparable countries average incarceration rates of about 100 per 100,000 residents, according to the Population Reference Bureau.

But the science of data analysis is still evolving, and as the case of Milton Thomas shows, there’s a lot of room to grow. Here are two of the major issues facing criminal data analytics.

1. Data collection methods are not verifiable

Surveys are used to collect data and predict the likelihood of a criminal committing a future crime. But the nature and intent of the questions leave such surveys open to manipulation and mistakes in interpretation.

As an example, the surveys depend on a criminal’s memory of vital incidents and events in his or her life. But that recollection could be faulty or purposely incorrect.

Thomas, the parolee in the AP story, claimed to have been younger than 25 when he was first arrested. This would have put him in the high-risk category for criminals. But court records indicated that he was in his 30s when he was first arrested. As a result, he ended up in the low-risk category.

Establishing robust mechanisms to verify and validate the information that criminals provide will go a long way toward making this data more effective.
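One way to picture such a verification step is the short Python sketch below. It is only an illustration, not a description of any state's actual system: the `Answer` record, the `find_discrepancies` helper, the question names, and every number in it are made up for the example. The idea is simply to compare each self-reported answer with the official record, where one exists, and flag mismatches for a human to review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Answer:
    """One questionnaire item (hypothetical structure, for illustration only)."""
    question: str
    self_reported: int
    on_record: Optional[int]  # None when no official record is available


def find_discrepancies(answers: List[Answer], tolerance: int = 0) -> List[Answer]:
    """Return answers whose self-reported value differs from the official
    record by more than `tolerance`. Items with no record are skipped,
    because there is nothing to check them against."""
    return [
        a for a in answers
        if a.on_record is not None
        and abs(a.self_reported - a.on_record) > tolerance
    ]


# Made-up sample data; these are not figures from any real case file.
answers = [
    Answer("age_at_first_arrest", self_reported=24, on_record=31),
    Answer("prior_convictions", self_reported=3, on_record=3),
    Answer("times_moved_last_5_years", self_reported=2, on_record=None),
]

for a in find_discrepancies(answers):
    print(f"Verify '{a.question}': reported {a.self_reported}, "
          f"record shows {a.on_record}")
```

A check like this does not decide which value is right. It only routes conflicting answers to a person before a risk category is assigned.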

2. Correlation does not always mean causation

According to Marc Goodman, founder of the Future Crimes Institute, data “replicates” what a police officer already knows through experience. “Data analysis is more useful when it can reveal more complex information that police officials may not be able to figure out on their own,” he said. In other words, data is useless without correlations.

As mentioned earlier, the surveys consist of a series of questions. For example, Arkansas’s questionnaire has a hundred questions that affix a score to information such as an offender’s education, family, income, job status, history of moving, and parents’ criminal record. The ostensible intent, as one can guess, is to derive causation from correlation.

There are two problems with this methodology.

First, the questions are not holistic enough. They are designed to assess the possibility of a future crime that is similar to a previous one. But Thomas’ record, which was nonviolent earlier, allegedly took a violent turn later.

Societal changes, such as the rise of the Internet, give would-be offenders relatively easy access to material, in the form of websites, that can enable such transformations. What’s more, the prevalence of anonymous names and pseudonyms on the Internet makes it relatively easy for potential criminals to commit crimes or to transform their online and offline identities.

Second, the questions perpetuate stereotypes about a criminal’s background without identifying specific causes. For example, the assessment questionnaire in Arkansas uses the number of times that someone has moved as a data point. This may not be a good indicator of a propensity toward crime. After all, a number of legitimate professions require frequent relocation.
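The concern about the moving question can be made concrete with a small Python sketch of an additive score. Everything in it is an assumption made for illustration: the article does not give the Arkansas questionnaire's actual items, point values, or cutoff, so `WEIGHTS`, `POINTS_PER_MOVE`, and `HIGH_RISK_THRESHOLD` are hypothetical. The point is only that a score built this way counts moves without knowing why they happened.

```python
# Hypothetical point values; the real questionnaire's weights are not
# given in the article.
WEIGHTS = {
    "no_high_school_diploma": 2,
    "unemployed": 2,
    "parent_has_criminal_record": 1,
}
POINTS_PER_MOVE = 1      # hypothetical: each move adds one point
HIGH_RISK_THRESHOLD = 5  # hypothetical cutoff between "low" and "high"


def risk_score(answers: dict, moves: int) -> int:
    """Add up the points for every item answered True, plus points per move."""
    score = sum(points for item, points in WEIGHTS.items() if answers.get(item))
    return score + POINTS_PER_MOVE * moves


def risk_category(score: int) -> str:
    """Map a numeric score to a coarse category using the hypothetical cutoff."""
    return "high" if score >= HIGH_RISK_THRESHOLD else "low"


same_answers = {
    "no_high_school_diploma": False,
    "unemployed": False,
    "parent_has_criminal_record": True,
}

# Two people with identical answers and the same number of moves. One moved
# for a job that requires relocation; the score has no way to record that.
relocated_for_work = risk_score(same_answers, moves=4)
moved_for_other_reasons = risk_score(same_answers, moves=4)

print(relocated_for_work, risk_category(relocated_for_work))
print(moved_for_other_reasons, risk_category(moved_for_other_reasons))
```

Under these made-up weights, both people receive the same score and the same category, which is the stereotype problem in miniature: the correlation is counted, and the cause is never asked about.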

Criminals are products of multiple factors and circumstances in society. Data can help illuminate and identify these circumstances. But that data should be tempered with human assessment before final judgments are made about the likelihood that an offender will commit another crime.