Predictive policing, with roots in business analytics, relies on using advanced technological tools and data analysis to take proactive measures to “pre-empt” crime. Regarded as a refinement of “intelligence-led policing” - which came to the US from the UK where it has led police to focus on research-based approaches rather than responding to service calls - predictive policing appears to take another large step away from community policing and accountability.

Image: Predictive Policing in Santa Cruz, Ca.

Predictive policing has been closely identified with the Los Angeles Police Department, whose Chief of Detectives Charlie Beck defines it in these terms:

"With new technology, new business processes, and new algorithms, predictive policing is based on directed, information-based patrol; rapid response supported by fact-based prepositioning of assets; and proactive, intelligence-based tactics, strategy, and policy. The predictive-policing era promises measurable results, including crime reduction; more efficient police agencies; and modern, innovative policing."(1)

It essentially applies the Total Information Awareness approach to policing:

"Advanced analytics includes the systematic review and analysis of data and information using automated methods. Through the use of exploratory graphics in combination with advanced statistics, machine learning tools, and artificial intelligence, critical pieces of information can be identified and extracted from large repositories of data. By probing data in this manner, it is possible to prove or disprove hypotheses while discovering new or previously unknown information. In particular, unique or valuable relationships, trends, patterns, sequences, and affinities in the data can be identified and used proactively to categorize or anticipate additional actions or information. Simply stated, advanced analytics includes the use and exploitation of mathematical techniques and processes that can be used to confirm things that we already know or think that we know, as well as discover new or previously unknown patterns, trends, and relationships in the data."

A feedback loop of injustice

The predictive policing model is deceptive and problematic because it presumes that data inputs and algorithms are neutral, and therefore that whatever the computer spits out will give police objective, discrimination-free leads on where to send officers or deploy other resources. This couldn't be further from the truth.

As Ronald Bailey wrote for Reason, "The accuracy of predictive policing programs depends on the accuracy of the information they are fed." Many crimes aren't reported at all, and when it comes to the drug war, we know for certain that police don't enforce the law equally.

Take marijuana arrests as an example. We know that black people and Latinos are arrested, prosecuted and convicted for marijuana offenses at rates astronomically higher(3) than their white counterparts, even if we adjust for income and geography. We also know that whites smoke marijuana at about the same rate as blacks(4) and Latinos.

Therefore we know that marijuana laws are not applied equally across the board: Blacks and Latinos are disproportionately targeted for associated arrests, while whites are arrested at much lower rates for smoking or selling small amounts of marijuana.

Now consider that these arrest data are put into computer programs instructed to spit out information to officers about where to target police patrols -- what's called predictive policing. The returned intelligence telling police departments where to target their patrols is supposedly accurate because arrest data fed into a computer algorithm produced it.

But if historical arrest data shows that the majority of arrests for marijuana crimes in a city are made in a predominantly black area rather than in a predominantly white area, predictive policing algorithms working off of this problematic data will recommend that officers deploy resources to the predominantly black area -- even if there is other information to show that people in the white area violate marijuana laws at about the same rate as their black counterparts.

If an algorithm is only fed unjust arrest data, it will simply repeat the injustice by advising the police to send yet more officers to patrol the black area. In that way, predictive policing creates a feedback loop of injustice.
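The loop described above can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not a model of any real department's or vendor's software: the function `simulate_patrols`, the area names, the 80/20 starting split and the offense rate of 0.5 are all hypothetical. It assumes that patrols are allocated in proportion to past recorded arrests, and that new arrests in an area equal the number of patrols sent there multiplied by the underlying offense rate, which is set to be identical in both areas.

```python
# Hypothetical toy model: all names and numbers are illustrative assumptions,
# not data from any real police department or vendor algorithm.

def simulate_patrols(offense_rate, initial_arrests, total_patrols=100, rounds=5):
    """Repeatedly allocate patrols in proportion to cumulative recorded arrests.

    New arrests in an area are modeled as patrols * offense_rate: an offense
    only becomes an arrest record if officers are present to observe it.
    Returns the per-round patrol allocations and the final arrest totals.
    """
    arrests = dict(initial_arrests)  # copy so the caller's data is not mutated
    allocations = []
    for _ in range(rounds):
        total_arrests = sum(arrests.values())
        # The "prediction": send patrols where past arrests were recorded.
        patrols = {
            area: total_patrols * count / total_arrests
            for area, count in arrests.items()
        }
        allocations.append(patrols)
        # New arrests depend on where officers were sent, feeding the next round.
        for area, officers in patrols.items():
            arrests[area] += officers * offense_rate[area]
    return allocations, arrests


# Both areas break the law at the same underlying rate...
offense_rate = {"area_a": 0.5, "area_b": 0.5}
# ...but the historical record is skewed 80/20 by past enforcement choices.
initial_arrests = {"area_a": 80, "area_b": 20}

allocations, final_arrests = simulate_patrols(offense_rate, initial_arrests)
for round_number, patrols in enumerate(allocations, start=1):
    print(round_number, patrols)
print(final_arrests)
```

Under these assumptions, the 80/20 patrol split never corrects itself, even though the two areas have identical offense rates. Each round adds four times as many arrests in the first area as in the second, so the growing arrest record appears to confirm the original allocation. A rule that sent all patrols to the single highest-ranked area would widen the gap rather than merely preserve it.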

A thought experiment may help elucidate the problem.

It's something of a cultural axiom in the United States that high-powered bankers and lawyers have a taste for expensive cocaine and prostitutes. But because these illegal activities take place in boardrooms and fancy hotels instead of on street corners in poor neighborhoods, and because the people involved are powerful, with ties to political leaders and access to piles of money, there are very few arrests for those crimes in this demographic.

If police arrested lots of bankers and lawyers for cocaine use and for hiring expensive sex workers, we might see predictive policing algorithms sending cops to patrol rich suburbs or fancy downtown hotels. Instead, the algorithms simply reproduce the unjust policing system we've got and, dangerously, add a veneer of 'objectivity' to it. The information came out of a computer, after all, so it must be accurate!

Law officers like to say that predictive policing helps them dodge questions about racism and unequal policing. But data isn't neutral, and neither are the algorithms tasked to sort through and make sense of those pieces of information.

Total Information Awareness (TIA) was renamed “Terrorism Information Awareness” after the original name was condemned as sounding too Orwellian.

TIA, a $240 million program, first came to public attention in a New York Times piece by John Markoff dated November 9, 2002. The article reported that through Total Information Awareness, intelligence and law enforcement officials would be given “instant access to information from Internet mail and calling records to credit card and banking transactions and travel documents, without a search warrant.” TIA would not just enable the government to develop “cradle to grave dossiers” on known individuals. It would also (in theory) have the ability to detect terrorists and their plots by subjecting massive troves of electronic information to data mining techniques.

After TIA was publicly unmasked, it faced withering criticism from across the political spectrum. In the words of the right-wing libertarian Cato Institute, this “power to generate a comprehensive data profile on any US citizen” involved “the specter of the East German secret police and communist Cuba’s block watch system” (G. Healy, “Beware of Total Information Awareness,” January 20, 2003).

Data mining programs that were part of DARPA’s TIA before it was de-funded by Congress have been transferred to the National Security Agency. They include:

Automated Detection, Identification and Tracking of Deceptive Terrorist Activity, developed by 21st Century Technologies, Inc. of Austin, Texas (USA Today, July 20, 2006).

Research “into the mass harvesting of the information that people post about themselves on social networks… it could harness advances in internet technology…to combine data from social networking websites with details such as banking, retail and property records, allowing the NSA to build extensive, all-embracing personal profiles of individuals” (New Scientist, June 9, 2006).

Analysis, Dissemination, Visualization, Insight and Semantic Enhancement (ADVISE), which would “troll a vast sea of information, including audio and visual and identify suspicious people, places, and other elements based on their links and behavioral patterns” (The Washington Post, February 28, 2007).

According to a June 2006 report by the Government Accountability Office, there were at least 199 TIA-style data mining projects funded by the government that trawled through huge amounts of information in hopes of finding links or patterns to locate suspicious activity.

On January 10, 2007, Senator Patrick Leahy (D-VT) held a Senate Judiciary Committee hearing about the privacy implications of these data mining programs. He reported that apart from the programs based at the NSA, there were 14 different government data mining programs run by the Departments of Defense, Justice, Homeland Security and Health. “Although billed as counterterrorism tools, the overwhelming majority of these data mining programs use, collect, and analyze personal information about ordinary American citizens…a mistake in a government data base could cost a person his or her job, sacrifice their liberty, and wreak havoc on their life and reputation.”