To Debias or Not to Debias?

This blog calls into question the widely accepted technique of removing sensitive data from algorithms as a way of reducing bias and discrimination.

Artificial intelligence (AI) technologies built into complex social systems such as criminal justice, lending, recruiting and insurance are becoming more and more widespread. When it comes to understanding the implications of AI in such contexts, people tend to fall into one of two camps:

AI enthusiasts see algorithmic decision-making as fundamentally augmenting human judgement. According to this view, because computer models are free of personal prejudice and arbitrariness, they are best placed to produce predictions that are rational, impartial and objective, regardless of their application;

AI detractors consider algorithmic reasoning fundamentally flawed, inevitably reflecting and amplifying human prejudice and inequality. Much of the discussion in this space points to “progress traps” that, instead of empowering people to shape possible futures, present a dystopian course of events in which powerless humans are taken over by the machines.

Much ado about bias

The more widespread the use of AI becomes, the more evident it is that serious problems can arise from the mismanagement of algorithmic bias.

Many predictive policing tools, for instance, have been found caught in runaway feedback loops of discrimination. Common responses to this issue are individual adjustments to data inputs and corrections to model design, eventually leading to the creation of “neutral” models in which protected variables are omitted, or at least controlled for.
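The feedback-loop mechanism can be sketched with a toy simulation. It assumes a deliberately crude “send every patrol to the district with the most recorded arrests” policy; the district names and all numbers are invented for illustration. Even though both districts have identical underlying crime rates, the historically over-policed district keeps attracting patrols, which keeps generating arrests there, which keeps justifying the patrols:

```python
# Toy model of a runaway feedback loop in predictive policing.
# Both districts have the SAME true crime rate; district A merely starts
# with more recorded arrests. All figures are illustrative.

TRUE_CRIME_RATE = {"A": 0.1, "B": 0.1}   # identical underlying crime
arrests = {"A": 60.0, "B": 40.0}         # skewed historical record
TOTAL_PATROLS = 100

history = []
for _ in range(20):
    # Greedy policy: send every patrol to the predicted "hot spot",
    # i.e. the district with the most recorded arrests so far.
    hot = max(arrests, key=arrests.get)
    arrests[hot] += TOTAL_PATROLS * TRUE_CRIME_RATE[hot]
    history.append(arrests["A"] / (arrests["A"] + arrests["B"]))

print(f"District A's share of recorded arrests: {history[-1]:.2f}")
```

District A ends the run with the overwhelming majority of recorded arrests while district B's record never changes, despite equal true crime. Real deployments are less extreme than this winner-takes-all policy, but the direction of the distortion is the same.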

Here, I will refer to this process as “debiasing”, and I will question it for the following reasons:

On one hand, smudging away certain dimensions of the data can skew the AI tool’s representation of the real world, especially when relevant variables are omitted. This often undermines the validity of its predictions in the first place;

On the other hand, debiasing appears to address only the symptoms of bias, not its causes. Besides, doesn’t it seem curious to address social inequality by telling our algorithm that it doesn’t exist?
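The first concern can be made concrete with a toy example: a simple majority-rule predictor loses accuracy the moment a relevant field is removed. The feature names and records below are entirely invented for illustration:

```python
# Toy illustration of the accuracy cost of omitting a relevant variable.
from collections import Counter, defaultdict

def majority_accuracy(rows, feature_idx):
    """Accuracy of predicting the majority label per feature combination."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in feature_idx)
        groups[key].append(row[-1])
    # Each group is predicted as its most common label.
    correct = sum(max(Counter(labels).values()) for labels in groups.values())
    return correct / len(rows)

# Hypothetical records: (postcode_band, payment_history, defaulted)
rows = [(0, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1),
        (1, 0, 1), (1, 0, 1), (1, 1, 1), (1, 1, 0)]

acc_full     = majority_accuracy(rows, (0, 1))  # both features used
acc_debiased = majority_accuracy(rows, (1,))    # postcode_band removed
print(acc_full, acc_debiased)
```

Dropping the postcode field (a potential proxy for a protected attribute) halves the model's edge over random guessing here, without doing anything about why postcode and default were correlated in the first place.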

It is not biased algorithms, but broader societal inequalities that drive discrimination in the real world.

What algorithms bring to life are patterns that need to be interpreted. And often, it is AI’s specific use cases and applications that should be questioned and debated, rather than the data used to build them.

In 2016, COMPAS, an algorithmic tool used in a number of US courtrooms to predict recidivism risk scores (i.e. whether and when a convicted person will break the law again), was found, according to a ProPublica study, to predict that African American defendants were almost twice as likely to re-offend as white defendants. In addition, according to this analysis, white defendants were mislabelled as low risk more often than black defendants.

A number of responses followed, some addressing methodological issues in ProPublica’s study, others noting how competing notions of fairness constrain probabilistic models, making it impossible for a risk score to satisfy all fairness criteria for black and white defendants at the same time.
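The impossibility at the heart of this debate can be shown with toy arithmetic. One well-known result (due to Chouldechova) is that if a risk score has equal precision and equal true-positive rate for two groups whose base rates of re-offending differ, their false-positive rates cannot also be equal. A minimal sketch, with all figures made up for illustration:

```python
# If precision (PPV) and true-positive rate (TPR) are held equal across
# two groups with different base rates, the false-positive rate is forced
# to differ. All numbers below are illustrative.

def false_positive_rate(n, base_rate, ppv, tpr):
    """Derive the FPR implied by group size, base rate, PPV and TPR."""
    positives = n * base_rate          # people who truly re-offend
    tp = positives * tpr               # correctly flagged as high risk
    predicted_pos = tp / ppv           # total flagged, given precision
    fp = predicted_pos - tp            # wrongly flagged
    negatives = n - positives          # people who do not re-offend
    return fp / negatives

# Same classifier behaviour (PPV = 0.75, TPR = 0.5), different base rates:
fpr_a = false_positive_rate(n=100, base_rate=0.6, ppv=0.75, tpr=0.5)
fpr_b = false_positive_rate(n=100, base_rate=0.3, ppv=0.75, tpr=0.5)
print(f"FPR group A: {fpr_a:.3f}, FPR group B: {fpr_b:.3f}")
```

The higher-base-rate group ends up with a false-positive rate more than three times that of the other group, even though the score treats both groups “equally” on the other two criteria. Which of the three measures counts as fairness is a value judgement, not a modelling choice.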

One thing in this debate is certain: US arrests are not race-neutral. On the contrary, evidence indicates that African-American people are disproportionately targeted in policing. As a result, US arrest record statistics are heavily shaped by this inequality.

Debiasing the COMPAS tool could therefore be considered a totally acceptable trade-off - if only the cost in algorithmic accuracy actually contributed to the fairness of its application.

However, more often than not, the result of excluding sensitive information is simply that we get an imperfect tool implementing imperfect actions.

What is sensitive data? It depends on the context...

We’ve already seen how apparently innocuous data, such as addresses, can in fact be used in discriminatory ways. Similarly, lifestyle information, such as smoking or drinking habits, is usually not considered sensitive, but could become so if used to discriminate against employees in the workplace. Even more so if such information is gathered without individual consent.

If we decide that we need to remove fields from the data, we should think not only about the nature of the bias, but also about the why, when and how - and we should agree on what data counts as sensitive.

Once we recognise the importance of context, we should worry less about what goes into algorithms and more about the consequences of their use.

Instead of fixating on modifying input data or on classifying its degree of sensitivity, we should consider algorithms as parts of larger systems with specific tasks to accomplish. COMPAS had the great value of further exposing the disparity in treatment African American people receive from US police. However, any use of COMPAS to measure recidivism is, in my view, problematic. This is because it is a probabilistic tool - but the consequences of its application, especially in cases of false findings, can have far-reaching effects on the lives of people charged with crimes.

Accepting the use of AI tools to decide the future of individuals would implicitly mean accepting that a quantitative, positivist approach can be applied to human behaviour. However, the extremely complex reality of constantly changing factors interacting with and influencing human relations and reactions makes it incredibly difficult, if not impossible, to make accurate, nontrivial predictions.

Therefore, before asking ourselves whether race, gender and so on should or should not be included in algorithms, we need to discuss whether these are the right tools to use at all in cases that directly affect outcomes for individuals.
