Algorithms can be more accountable than people

At an academic meeting recently, I was surprised to hear some social scientists accept as obviously correct the claim that involving “algorithms” in decision-making, instead of sticking with good old-fashioned human decision-making, necessarily reduces accountability and increases the risk of bias. I tend to believe the opposite, that making processes algorithmic improves our ability to understand why they give the results they do. Let me explain why.
Consider a process to decide who receives some award or benefit, and suppose we want to make sure the process is not biased against some disadvantaged group, which I’ll call Group G. If a person just makes the decision, we can ask them whether they were fair to members of Group G. Or we can ask them why they decided the way they did. Either way, they can simply lie about their true motivation and process, constructing a story that is consistent with non-discrimination; or they might honestly believe their decision was fair even though it reflected unconscious bias. At the risk of massive understatement: history teaches that this kind of bias in human decision-making is difficult to prevent.

An algorithm, by contrast, cannot hide from everyone the details of how it reached its decision. If you want to know that an algorithm didn’t use information about a person’s Group G status, you can verify that the Group G status wasn’t provided to the algorithm. Or, if you prefer, you can re-run the algorithm with the Group G status field changed, to see if the result would have been different. Or you can collect statistics on whether certain parts of the algorithm have a disparate impact on Group G members as compared to the rest of the population.
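
The flip test described above can be sketched in a few lines. Everything here is hypothetical — the decision rule, the field names, and the threshold are stand-ins for whatever the audited algorithm actually does:

```python
# A toy decision rule standing in for the algorithm under audit.
# The applicant is a dict of fields; "group_g" is the protected status.
def decide(applicant):
    score = 0.6 * applicant["test"] + 0.4 * applicant["interview"]
    return score >= 70

def counterfactual_check(applicant):
    """Re-run the decision with the Group G field flipped and
    report whether the outcome stayed the same."""
    flipped = dict(applicant, group_g=not applicant["group_g"])
    return decide(applicant) == decide(flipped)

applicant = {"test": 80, "interview": 65, "group_g": True}
print(counterfactual_check(applicant))  # True: status did not affect the result
```

The same harness extends naturally to the other two checks: withhold the `group_g` field entirely and confirm the algorithm still runs, or call `decide` over a whole population and compare outcome rates across groups.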

This is not to say that everything about algorithms is easy. There are plenty of hard problems in understanding algorithms, both in theory and in practice. My point is merely that if you want to understand how a decision was made, or you want to build in protections to make sure the decision process has certain desirable properties, you’re better off working with an algorithm than with a human decision, because the algorithm can tell you how it got from inputs to outputs.

When people complain that algorithms aren’t transparent, the real problem is usually that someone is keeping the algorithm or its input data secret. What makes the process non-transparent is that the result is emitted without explanation—which is a non-transparent approach no matter what is behind the curtain, a person or a machine.

Of course, a company might be justified legally in keeping their algorithm secret from you; and it might be good business for them to do so. Regardless, it’s important to recognize that non-transparency is a choice they are making and not a consequence of the fact that they’re using computation.

If accountability is important to us—and I think it should be—then we should be developing ways to reconcile transparency with partial secrecy, so that a company or government agency can keep some aspects of their process secret when that is justified, while making other aspects transparent. Transparency needn’t be an all-or-nothing choice.

Comments

I think you’ve got your asocial nerd hat jammed on a little too tight today. Even when the algorithms are transparent, one big reason for their use is to justify inflexibility — “why should we make an exception for you, once we start tweaking The Algorithm for you, when do we stop? Rules are rules”. But algorithms are not necessarily neutral, fair, rational, or even good; often their main purpose is just to add legitimacy to authority, or to help convince people that we do in fact live in the Best of All Possible Worlds.

Sure. Algorithms are used to justify inflexibility and power imbalance. That’s exactly why we need to deflate the idea that there is something magical about “algorithms” that justifies secrecy or non-accountability.

_Your_ application of an algorithm is only as accountable to me, as _you_ are accountable to me. If I have no power over you, then whether you use human whim or pitiless algorithm isn’t going to change the fact that I can’t do anything about your decision. Similarly, if I _do_ have power over you, then any “justification for inflexibility” that you give me, isn’t likely to stay my hand.

My aunt was a biology professor. I remember one time when she came to me and said “Here are the test and homework scores for everyone in my class. Tell me what weight to assign to each so that these specified students get these grades.”

So published algorithms *can* add transparency. But you may also need to know how the algorithm was chosen, and especially how the parameters in the algorithm were selected.
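
The rigging in the anecdote amounts to solving a small linear system: fix the grades you want certain students to get, then solve for the weights that produce them. A minimal sketch, with two invented students and two weights (all numbers hypothetical):

```python
# Two hypothetical students whose final grades we want to pin.
# Each row: (test score, homework score, desired grade).
students = [(70.0, 95.0, 85.0),
            (90.0, 60.0, 75.0)]

# Solve the 2x2 system  w_test*t + w_hw*h = grade  by Cramer's rule.
(t1, h1, g1), (t2, h2, g2) = students
det = t1 * h2 - t2 * h1
w_test = (g1 * h2 - g2 * h1) / det
w_hw = (t1 * g2 - t2 * g1) / det

print(round(w_test, 3), round(w_hw, 3))  # → 0.466 0.552
```

The resulting weights look entirely innocuous on a syllabus — which is the commenter’s point: the published algorithm is transparent, but the process that chose its parameters is not.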

Not just the parameters, the whole algorithm (which is, in a sense, a set of meta-parameters). See, for example, elliptic curves vs. factoring. Or any of zillions of opaque derivatives that were rigged to ensure profit for one party. Or the results of all those obfuscated-programming contests.

Algorithms can be transparent if the person looking at them from the outside has access to the algorithms, all the underlying data sets, and at least the computational and intellectual resources of the people who created the algorithm. Otherwise they tend to be a way to reduce accountability without coming right out and saying so.

If algorithms are just an excuse to impose an intended outcome, or to avoid accountability, then we only make things worse by having hand-wringing discussions about how algorithms are inherently difficult to govern or inherently resistant to accountability.

I think the point is perhaps the difference between “inherently” (intrinsically, as a fundamental property) and something like “extraordinarily likely to the point of practically inevitable in a given social/political context which is the relevant discussion” (empirically highly correlated to an extremely high degree in a particular situation). To continue the “discrimination” analogy, when politicians and pundits talk(ed) of “state’s rights”, I suppose somewhere someone could be found who really was concerned with that topic as a governmental theory. But almost every time, it was a euphemism for supporting segregation. So was “state’s rights” _inherently_ about segregation? No, not in the most abstract and theoretical sense. But in practice, in context, as a phrase it was and is highly entwined with racial politics and various justifications thereof. So it would be kind of missing what was going on, to object like “If state’s rights are just an excuse to impose Jim Crow, then we only make things worse by having hand-wringing discussions about how state’s rights are inherently racist or inherently segregationist”. Formally it’s not “wrong” linguistically, but it’s arguably _overly_ pedantic in stripping out real-world context and then objecting based on the version without that relevant real-world context.

If someone said in a political debate “State’s rights is racism!”, it’d be a very problematic reply to respond along the lines of “No, no, that’s making things worse, because it’s asserting an identity between a political theory of government which is neutral, versus a system of irrational prejudice, so those obviously cannot be the same. Poor thinking doesn’t help the cause of equality. Perhaps you mean to say that while state’s rights is of course a neutral and non-racist and historically distinguished principle, the particular use of it by the politician in question is an unjustified attempt to support segregation, which we do agree is immoral”. See the problem? The latter phrasing is much more literally accurate and academically correct, but also arguably belaboring at length an implicit context.

Classification done by machine learning can be downright opaque. Yes, you could re-run the classifier with different input to see if the result would be different in a particular case, but it might be near-impossible to figure out if there’s subtle discrimination against an entire group. Perhaps being a member of Group G reduces your chances of receiving the benefit by something small like 1%. That’s hard to detect, especially if membership in Group G is not a specific input to the classifier but something that might be inferred from a combination of other inputs.

The silver lining is that the same opacity makes it hard for someone to game a machine learning classifier.

Freedom to Tinker is hosted by Princeton's Center for Information Technology Policy, a research center that studies digital technologies in public life. Here you'll find comment and analysis from the digital frontier, written by the Center's faculty, students, and friends.