Three very different sources of bias in AI, and how to fix them

Since our Science paper came out it's been evident that people are surprised that machines can be biased. They assume machines are necessarily neutral and objective, which is true in the limited sense that there is no machine perspective or ethics. But to the extent an artefact is an element of our culture, it will always reflect that culture's biases.

I think the problem is that people mistake computation for math. Math really is pure: it has certain truth, it's eternal, and it would be the same without any particular sentient species looking at it. That's because math is an abstraction that doesn't exist in the real world. Computation is a physical process. It takes time, energy, and space, so it is resource constrained. This is true whether you are talking about natural or artificial intelligence. From a computational perspective there's little difference between these.

See Joy Buolamwini's awesome work on fighting the second kind of AI bias I mention here.

People are smart because we are able to exploit the selected "best of" other people's computations. We are super good at communicating (at least as animal species go), and we communicate the best and most useful ways of thinking. The reason AI is making so much progress right now is that we've figured out how to transfer and represent what's already been computed by our culture or our biology into AI. That's the first source of AI bias: unintentionally uploading the implicit human biases that pervade our culture. That's what we demonstrated with our Science paper. There's no real way to fix this without fixing our culture first, so we need to compensate for it when we design our systems.

Prior to our paper coming out, the main source of AI bias people talked about was seen as a consequence of the lack of diversity among AI developers. That is the second source of AI bias: poorly selected training data for machine learning, or poorly reasoned rules. An example is training face recognition only on Caucasian faces. This differs from the first source in that it's easier to address, though I don't think it's just a matter of hiring diverse programmers -- we can all occasionally make sexist errors, even women. Nevertheless, when we detect these errors we can fix them, so we can expect these types of AI bias to improve iteratively and, hopefully, eventually disappear. Basically, solving this problem comes down to adequately testing systems.
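
To make "adequately testing" a little more concrete, here is a minimal sketch of one possible check: measuring a system's error rate separately for each group of people in a test set, rather than as a single overall number. The function names, the group labels, and the toy records are all made up for illustration; a real test would need a representative, carefully labelled data set.

```python
from collections import defaultdict


def error_rate_by_group(records):
    """Return {group: fraction of records where prediction != label}.

    `records` is an iterable of (group, true_label, predicted_label).
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, label, prediction in records:
        totals[group] += 1
        if prediction != label:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


def max_error_gap(rates):
    """Largest difference in error rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical test records, invented purely for illustration:
# (group, true label, label predicted by the system under test).
toy_records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 0), ("group_b", 0, 0),
]

rates = error_rate_by_group(toy_records)
gap = max_error_gap(rates)
print(rates, gap)
```

A single overall accuracy figure would hide the difference between the two groups in this toy data; reporting per-group rates is what makes the problem visible, and so fixable.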

But what I worry about most is actually the least-sexy kind of AI bias. Many people are trying to make out that AI is now self-learning, or at least all programmed via machine learning. I heard some fairly famous AI experts making that strikingly false claim at the Aspen Ideas Festival. No algorithm spontaneously generates a software system or a robot. Every intelligent artefact has system design behind it. Quite a lot of the algorithms that affect people's lives are just macros someone programmed in a spreadsheet -- macros they may claim are proprietary. And the third source of AI bias is evil programmers. Or corporations, or governments. Someone sits down and says "I'm a white nationalist and I want other races to get less money." The way to deal with this is to insist on the right to explanation, on due process. All algorithms that affect people's lives should be subject to audit.

That isn't to say there can be no trade secrets. Medicine has tons of trade secrets and IP, but it also has government oversight. That's what AI needs now.
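
One way an audit might coexist with trade secrets is to probe a system purely from the outside. The sketch below is an assumption-laden illustration, not a description of any real auditing regime: it feeds a decision rule pairs of applications that are identical except for one protected attribute, and flags cases where the outcome changes. The names `paired_audit` and `opaque_rule`, the field names, and the applicants are all hypothetical.

```python
def paired_audit(decide, applicants, attribute, values):
    """Flag applicants whose outcome changes when only `attribute` changes.

    `decide` is treated as a black box: the auditor calls it but never
    reads its source.
    """
    flagged = []
    for applicant in applicants:
        outcomes = {}
        for value in values:
            probe = dict(applicant)      # copy, so the original is untouched
            probe[attribute] = value     # vary only the protected attribute
            outcomes[value] = decide(probe)
        if len(set(outcomes.values())) > 1:
            flagged.append((applicant, outcomes))
    return flagged


# Hypothetical stand-in for a proprietary spreadsheet macro. An auditor
# would not see this code, only its outputs.
def opaque_rule(applicant):
    amount = applicant["income"] // 10
    if applicant["group"] == "b":
        amount //= 2                     # deliberately discriminatory rule
    return amount


# Invented applicants, for illustration only.
toy_applicants = [
    {"income": 1000, "group": "a"},
    {"income": 2000, "group": "b"},
]

flagged = paired_audit(opaque_rule, toy_applicants, "group", ["a", "b"])
for applicant, outcomes in flagged:
    print(applicant, outcomes)
```

A probe like this only catches rules that use the protected attribute directly. A rule that discriminates through a proxy, such as a postcode, would pass it, which is one reason outside testing is no substitute for a right to explanation.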

I've been invited to meetings by architects and lawyers recently, and it's amazing to see how those disciplines utterly take for granted cooperation with government, engagement in policy, training in legal responsibilities and so forth. ICT is just lagging behind. We've been affecting people's lives as fundamentally as architects for decades now. It's time our discipline matured and we accepted what that means in terms of accountability.
