Book Review: Weapons of Math Destruction by Cathy O’Neil

Normally the books I review for insideBIGDATA play the role of cheerleader for our focus on technologies like big data, data science, machine learning, AI and deep learning. They typically promote the notion that utilizing enterprise data assets to their fullest extent will lead to the improvement of people’s lives. A good example is the review I wrote early last year on “The Master Algorithm.” But after reading “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy,” by Cathy O’Neil, I can see that there’s another important perspective that should be considered.

According to O’Neil, Weapons of Math Destruction or WMDs can be characterized by three features: opacity, scale, and the damage the model causes. WMDs can be summarized in the following ways:

An algorithm based on mathematical principles that implements a scoring system that evaluates people in various ways.

A WMD is widely used in determining life-affecting circumstances like the amount of credit a person can access, job assessments, car insurance premiums, and many others.

A common characteristic of WMDs is that they’re opaque and unaccountable: people aren’t able to understand the process by which they are being scored and cannot contest the results when they’re wrong.

WMDs cause destructive “feedback loops” that undermine the algorithm’s original goals, which in most cases are positive in intent.

The book includes a compelling series of case studies that demonstrate how WMDs can surface as a result of applying big data technologies to everyday life. Here are some examples:

Algorithms used by judges to make sentencing decisions based on recidivism rates.

Algorithms that filter out job candidates from minimum-wage jobs.

Micro-targeting algorithms used in politics that allow campaigns to send tailored messages to individual voters.

“The math-powered applications powering the data economy were based on choices made by fallible human beings. Some of these choices were no doubt made with the best intentions. Nevertheless, many of these models encoded human prejudice, misunderstanding, and bias into the software systems that increasingly managed our lives. Like gods, these mathematical models were opaque, their workings invisible to all but the highest priests in their domains: mathematicians and computer scientists. Their verdicts, even when wrong or harmful, were beyond dispute or appeal. And they tended to punish the poor and the oppressed in our society, while making the rich richer.”

O’Neil certainly has the math street cred to write this book. She is a data scientist and author of the popular blog mathbabe.org. She earned a Ph.D. in mathematics from Harvard and taught at Barnard College before moving to the private sector and working for the hedge fund D. E. Shaw. O’Neil started the Lede Program in Data Journalism at Columbia and is the author of a very nice book that I like to recommend to newbie data scientists, “Doing Data Science.”

O’Neil’s main premise is that many algorithms are not inherently fair just because they have a mathematical basis. Instead, they amount to opinions embedded in code. But as a special kind of model, a WMD hides its foundational assumptions in an impenetrable black box. Models like these obscure the source and kind of their input information and model parameters, they rely on proxy data instead of directly observable inputs, and they create invisible feedback loops that make their effects nearly inescapable. The mathematicians who create the algorithms often are unaware of the biases introduced, and sometimes the opaque and powerful algorithms are in effect “secret laws.” O’Neil makes the case that laws protecting consumers and citizens in general are behind the times with respect to the digital age and need to be updated.
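To make the feedback-loop idea concrete, here is a minimal toy sketch in Python. It is not taken from the book: the function name `simulate_feedback_loop`, the threshold of 600, and the penalty and recovery amounts are all hypothetical values chosen purely for illustration. It assumes a scoring model in which a low score leads to worse terms, which in turn push the next score lower still.

```python
def simulate_feedback_loop(initial_score, rounds=5, threshold=600.0,
                           penalty=20.0, recovery=5.0):
    """Return the score trajectory for one person over several rounds.

    All parameters are hypothetical. A score below `threshold` triggers
    worse terms (modeled as a fixed penalty), which lowers the next
    score; a score at or above it slowly improves.
    """
    scores = [float(initial_score)]
    for _ in range(rounds):
        score = scores[-1]
        if score < threshold:
            # Low score -> worse terms -> more hardship -> lower score.
            score -= penalty
        else:
            # Acceptable score -> normal terms -> gradual improvement.
            score += recovery
        scores.append(score)
    return scores


if __name__ == "__main__":
    # Two people who start only 20 points apart end up far apart.
    print(simulate_feedback_loop(610))
    print(simulate_feedback_loop(590))
```

In this toy setup, two people who start 20 points apart on either side of the threshold drift steadily away from each other, even though nothing about their underlying behavior differs. Real scoring systems are far more complex, but the sketch shows why a model's output feeding back into its own inputs can make its effects hard to escape.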

As a quant on Wall Street, O’Neil was eager to put her math skills to use. But soon she realized that the hedge fund she was working for was betting against people’s retirement funds, and she became deeply disillusioned. She felt the way math was being used was immoral, and once she left Wall Street, she joined the Occupy movement. O’Neil is uniquely qualified to talk about the social and political implications of this kind of math given her deep knowledge of modeling techniques and her insider’s perspective on how companies are using them.

I recently encountered a WMD myself. I attended a local Meetup group here in Los Angeles that was all about the virtues of big data and machine learning. One speaker, a representative from a large, well-known electronic gaming company, told the attendees how he used data to affect the behavior of gamers. The WMD nature became clear when this data science team manager proudly explained how his company developed technology to “addict kids to their game products” using psychological techniques coupled with algorithms. I came away thinking something wasn’t quite right, and it wasn’t until I read O’Neil’s book that I understood he was admitting to using a WMD.

As a data scientist myself, reading O’Neil’s book was eye opening. I reflected on all the data science projects I’ve worked on and wondered how many of them have evolved into WMDs. Moving forward, I intend to do a deeper dive on projects I may engage in. I think all data scientists should do the same. In light of potential harm, it might be a good idea for data scientists to take a “do no harm” oath, sort of like a Hippocratic oath for data.

My only concern with the book is that it never really explains that mathematics, as an objective science, cannot itself be blamed for contributing to social inequality. In reality, organizations intent on maximizing profits might choose to misuse mathematical models, but big data and algorithms are not inherently predisposed to perpetuating inequality. The tool is not to blame; the user of the tool is.

As a supplement to this book review, here is a nice presentation by the author as one of the “Talks at Google” series where she sounds an alarm on the mathematical models that pervade modern life and threaten to rip apart our social fabric.

I think all data scientists should read Weapons of Math Destruction in order to add an important filter for judging how their work may be misused. The book became available in paperback on September 15, 2017 and includes a special afterword that looks at the failure of algorithms used by news outlets to accurately predict the 2016 presidential election results and the role of Facebook’s algorithms in helping Russian intelligence agencies to spread “fake news” to American voters in an attempt to sway the election (17 U.S. intelligence agencies concluded that Russia interfered in the presidential election).

Contributed by Daniel D. Gutierrez, Managing Editor and Resident Data Scientist for insideBIGDATA. In addition to being a tech journalist, Daniel is also a data science consultant, author, and educator, and sits on a number of advisory boards for various start-up companies.

Comments

Interesting review. How does the “Weapons of Math Destruction” book compare to, say, “Data Science and Predictive Analytics” (https://www.springer.com/us/book/9783319723464)? It appears as if the former is a bit more philosophical, whereas the latter is a bit more practice-oriented. Any thoughts? Thanks for your comments and review.
