Accountable Algorithms

Ethan Zuckerman had an interesting reaction to his first experience with the TSA Pre-Check program, which lets frequent flyers go through a much shorter and less elaborate procedure at airport security checkpoints. Ethan’s concerns about unfairness are worth pondering, but I want to focus here on his call for more openness about the algorithm that selects people for enhanced search.

Public processes often involve algorithms, and the public has an interest in the openness of these processes. Today I want to expand on what we mean when we talk about this kind of openness. In my next post, I’ll work through a specific example, taken from airport security, and show how we can improve the public accountability of that algorithm.
When we talk about making an algorithmic public process open, we mean two separate things. First, we want transparency: the public knows what the algorithm is. Second, we want the execution of the algorithm to be accountable: the public can check to make sure that the algorithm was executed correctly in a particular case. Transparency is addressed by traditional open government principles; but accountability is different.

Sometimes accountability is easy. If the algorithm is deterministic and fully public, and all of the inputs to the algorithm are known, then accountability is trivial: just run the algorithm yourself, and check that your result matches the output of the public process. But accountability can be more challenging if the algorithm involves randomness, or if one or more of the inputs to the algorithm is (legitimately) secret.

An example of a randomized algorithm comes from airport security. The algorithm might say that 5% of people will be selected at random for search. If officials tell you that you’re one of the unlucky 5%, how can you tell if you were really selected at random, as opposed to being singled out for a reason outside the official algorithm? An accountable algorithm would let you verify that things were done by the book.

Note that accountability does not require that you know in advance that you are going to be selected. In the airport setting, it’s important for security that people are surprised by their selection. What accountability requires is that you can verify afterward that your selection (or non-selection) was legit. To put it another way, an algorithm can be unpredictable yet still be accountable.

What about secret inputs? Again, we have an example from airport security. You can be selected for search because you’re on what I’ll call the “searchlist,” which designates a set of people who are considered higher-risk, so that they are always selected for enhanced search, though they’re not considered dangerous enough to be on the no-fly list. The searchlist is secret, but it is an input into the selection algorithm.

Here, an accountable algorithm would require the authorities to commit to a specific searchlist, but without telling the public what it was. Then the accountability mechanism would ensure that, if you were selected because you were allegedly on the searchlist, you could verify that the searchlist to which the authorities committed did indeed contain your name–but you could not learn anything more about the contents of the searchlist. This kind of accountability is possible, using cryptographic methods.

In practice, the search selection algorithm would probably say that a person is always selected if they are on the searchlist, and selected with 5% probability if they’re not on the searchlist. Now, accountability says that you should be able to tell that your selection was correct, but you shouldn’t be able to tell whether your selection was due to the search list or to random selection. This too turns out to be possible, combining the two accountability mechanisms described above, plus a bit more cryptography.

Accountable algorithms show up elsewhere besides airport security. For example, electronic voting systems often use post-election audits, which need to be accountable so that the public can be sure that the audit was done according the approved procedures.

In my next post, I’ll work through an example, to show how an algorithm can be made accountable.

Comments

> In practice, the search selection algorithm would probably say that a person is always selected if they are on the searchlist, and selected with 5% probability if they’re not on the searchlist. Now, accountability says that you should be able to tell that your selection was correct, but you shouldn’t be able to tell whether your selection was due to the search list or to random selection.

Can’t an attacker differentiate between the two of these in relatively short order regardless of the cryptography involved, by repeatedly probing the system?

Yes, as you say, an adversary can probe the system and figure out within a few trials whether or not he is on the searchlist. A more precise statement of the property we want is that the adversary cannot learn anything about why the algorithm produced the output it did, except for what is revealed inherently by the output itself.

You can make it harder for the adversary to infer his status if you lower the odds that somebody on the searchlist will be searched to something less than 100%. There is an inherent tradeoff between wanting to search people on the searchlist more often, and wanting it to be difficult for the adversary to figure out whether he is on the searchlist.

In practice, this does not seem workable. Doesn’t the TSA need to be able to decide to search/harass people who are not randomly selected without publishing that they have done so? For instance, local police surveiling a suspect who heads to an airport (let’s assume they have a warrant for various searches). The police would like the TSA to search the suspect without informing the suspect he is being specially selected. With a process like this, the suspect would be able to determine what happened later.

If the scenario you described is allowed by the rules, it can be incorporated into the accountability framework. The basic idea would be that local police can ask the central authority to add somebody to a special section of the searchlist. Then accountability would mean that nobody gets searched through this mechanism unless there was an actual request from the local police.

This means that if somebody claims “I was searched at the airport as retaliation for dating the police chief’s ex-wife” there would be a record of the request from local police, or evidence that there was no such request.

To put it another way, the mechanism you describe could be incorporated, and it would make sure that if the sort of thing you describe did happen, it wasn’t done “off the books.”

So how do you solve the infinite-regress problem? If people don’t trust the TSA algorithm (the implemented one, not the spec) then how are you going to get them to trust the accountability part? It, too, may necessarily have secret inputs (including not only inputs whose content is a secret but also inputs whose very existence is secret), so any verification is at some point going to be based on trust that the official doing the verifying isn’t just lying about the whole thing.

In theory, crypto could buy you an assurance, at least, that the algorithm that produced your result is actually the algorithm that the official claims was used. But even that (in the airport case where it’s 5% or 100%) could be subject to replay attacks and the like. I’d hate to miss my flight trying to go over the code.

The point is to arrange things so that any member of the public can do the verification; and so that if somebody claims that the authorities deviated from the announced algorithm then that claim of deviation can also be verified by any member of the public. Cryptography is an important tool for making this possible.

And note that the method allows after-the-fact checking, so that you can rush off to your flight and then do the verification later. Since this is really an auditing procedure, it works fine if most people don’t bother to check. If you suspect a problem, or if you’re just feeling skeptical, you can check; and if enough people check then any systematic deviation will probably be noticed.

This is a good idea in principle, but it doesn’t address any of the real problems with the TSA.

1) While the TSA aren’t really police, they are effectively unaccountable to you and me just as if they were. Whether they violate your body, tear apart or rob your luggage, or just deny you your flight for “bad attitude” (or all three), you can’t arrest them or sue them, so they will never suffer a penalty and you will never get compensation for your losses. An algorithm that lets you determine whether or not they followed some predefined rule in selecting people to inspect is not going to change any of those facts.

2) TSA only guards against one narrow type of threat — the hijacking of a passenger airliner — and that threat has already been solved, as demonstrated aboard Flight 93. Thus TSA doesn’t protect us from anything, and everybody knows it.

And 3) The 9/11 attack would not have been possible in the first place if TSA’s predecessors had not ensured, by declaring all aircraft “gun-free zones”, that only the bad guys would be armed. Think about it. Where else would six people with box-cutter knives have been able to take and hold a crowd of over 100 strangers hostage? Let’s allow people with carry permits from their home states to be armed aboard planes, and send the feds back to DC.

Freedom to Tinker is hosted by Princeton's Center for Information Technology Policy, a research center that studies digital technologies in public life. Here you'll find comment and analysis from the digital frontier, written by the Center's faculty, students, and friends.