What We Know About the Computer Formulas Making Decisions in Your Life

By Lauren Kirchner / ProPublica

This piece originally ran on ProPublica.

We reported Thursday on a study of Uber’s dynamic pricing scheme that investigated Uber’s surge pricing patterns in Manhattan and San Francisco and showed riders how they could potentially avoid higher prices. The study’s authors finally shed some light on Uber’s “black box,” the algorithm that automatically sets prices but that is inaccessible to both drivers and riders.

That’s just one of a nearly endless number of algorithms we use every day. The formulas influence far more than your Google search results or Facebook newsfeed. Sophisticated algorithms are now being used to make decisions in everything from criminal justice to education.

But when big data uses bad data, discrimination can result. Federal Trade Commission chairwoman Edith Ramirez recently called for “algorithmic transparency,” since algorithms can contain “embedded assumptions that lead to adverse impacts that reinforce inequality.”

Here are a few good stories that have contributed to our understanding of this relatively new field.

The Wall Street Journal staff (including Julia Angwin, now a reporter at ProPublica) showed that Staples was giving online customers different prices for the same products depending on how close those customers were to competitors’ stores. Offering different prices to different customers is not illegal, the article points out. “But using geography as a pricing tool can also reinforce patterns that e-commerce had promised to erase: prices that are higher in areas with less competition, including rural or poor areas. It diminishes the Internet’s role as an equalizer.”

Chicago’s police department is at the forefront of “predictive policing”—the idea that police can prevent crimes using a combination of mathematical analysis and careful interventions. Chicago’s “heat list” analyzes residents’ social networks and criminal records to identify people who are most at risk of either perpetrating or falling victim to future violence. (A TechCrunch piece this year discussed some of the thorny problems of bias that this raises.)

A recent Carnegie Mellon study found that Google was showing ads for high-paying jobs to more men than women. Another study from Harvard showed that Google searches for “black-sounding” names yielded suggestions for arrest-record sites more often than other types of names. Algorithms are often described as “neutral” and “mathematical,” but as these experiments suggest, they can also reproduce and even reinforce bias.

The Internet erupted in anger after images of African Americans on Google Photos and Flickr were automatically tagged as “gorillas.” The Chronicle found two underlying issues: the data that programmers use to “teach” algorithmic software matters, and so does the diversity of the Silicon Valley companies that do the teaching. “Not enough photos of African Americans were fed into the program that it could recognize a black person. And there probably weren’t enough black people involved in testing the program to flag the issue before it was released.”

“Risk assessment” scores are being used at different stages of the criminal justice system to help evaluate whether defendants and inmates will commit crimes in the future. The formulas include things like a person’s age, employment history, and even the criminal records of family members. But is it fair to score people based not only on their own past criminal behavior but also on statistics about other people who fit the same profile? And should these scores be used to help determine their sentences?

Volkswagen recently admitted to rigging the software in millions of its diesel cars to cheat on emissions tests. The New York Times points out that some new cars now contain computer software that’s more complex than the Large Hadron Collider’s. Along with increased convenience and safety, the endless lines of code also make it hard for regulators to keep up.

Researchers analyzed hundreds of thousands of military records to create an algorithm that they say the U.S. Army can use to find the soldiers who are at the greatest risk of committing violent crimes. “For men, who accounted for the vast majority of both soldiers and offenders, 24 factors were found to be at play. Those most at risk were young, poor, ethnic minorities with low ranks, disciplinary trouble, a suicide attempt and a recent demotion.”

Massachusetts’ child welfare system is considering adopting “predictive analytics” software to help caseworkers identify the children and families who are at the greatest risk of abuse. Higher “risk scores” are assigned to people with more extensive criminal records, previous drug addictions, previous mental health problems, and other factors. Critics of the plan, like the ACLU’s Kade Crockford, argue that this technology risks “disproportionately ensnaring the poor and parents of color.”

In a notable example of reporters keeping algorithms accountable, a FiveThirtyEight analysis found that Fandango was skewing movie ratings upward. The site, which sells movie tickets, “uses a five-star rating system in which almost no movie gets fewer than three stars.” Confronted with these results, Fandango said that this was due to an error in its “rounding algorithm,” and promised to fix it.