Ben Laurie on Selective Disclosure (Part 2)

In introducing people to his Selective Disclosure paper, Google's Ben Laurie says:

In fact, there are many ways CardSpace could violate the laws, but there is one which it is currently inherently incapable of satisfying, which is the 4th law – the law of directed identity – which says, once you've fought your way through the jargon, that your data should not be linkable.

Ben doesn't like the term “omni-directional identifier”, which is certainly his prerogative. But speaking of jargon, note that according to the Oxford English Dictionary, Ben's word “linkable” doesn't actually exist…

Meanwhile “omni-directional” is a real word from the field of telecommunications (isn't that what we are working in?) It means “of equal sensitivity or power in all directions”. Similarly, “unidirectional” means “moving or operating in a single direction”.

So I think the Fourth Law is precise enough:

A universal identity system must support both omni-directional identifiers for use by public entities and unidirectional identifiers for use by private entities, thus facilitating discovery while preventing unnecessary release of correlation handles.

The word “discovery” is again used in the technical sense of making it possible to locate and rendezvous with other entities. The goal here is to embrace the range of use cases I discussed in part one.

Ben argues that CardSpace breaks this Fourth Law. In this he is wrong. I suspect he is confusing CardSpace, which is just a way of selecting between identity providers, and the identity providers themselves. Let's get that really straight.

CardSpace is one component of the Identity Metasystem – by which I mean a distributed mesh of parties asserting and depending upon identity. CardSpace informs the user about the identity providers that can satisfy a given relying party request, and provides a mechanism to tell the user what information must be released to the relying party in order to gain admission to the site. But the identity selector does not itself make assertions or sign them cryptographically, or do anything that introduces linkability or traceability. If linkability is introduced, it is because of the choice of the identity provider people use with the system, and of the cryptographic systems they employ.

Privacy characteristics of the self-asserted Identity Provider

While CardSpace is an identity selector, we have built it to include a self-asserted identity provider as a way to bootstrap the system. So what are the privacy characteristics of this provider?

It emits no identifiers that allow the identity of the user on one system to be linked to the identity of the user on any other.

A different signing key is used at each site visited so keys and signatures cannot be correlated.

Relying parties cannot collude with this identity provider because it is run by the user.

So the Identity Provider that ships with CardSpace does not break the Fourth Law either.

Managed Card Providers

Now let's talk about managed card providers, and think about other systems such as Liberty or Shibboleth or OpenID or SAML or WS-Federation in browser mode.

These systems always identify the relying party to the identity provider. They do so because they need to tell the identity provider how to redirect the client's browser back to the relying party. So they are what I call panoptical by design. There is zero collusion required for the identity provider to know what is being asserted where – that knowledge is designed right into the system.
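To make the redirect point concrete, here is a rough sketch of a browser-redirect request in the style of OpenID 1.x. The `openid.mode` and `openid.return_to` parameter names come from that protocol; the hostnames and the two helper functions are hypothetical, and a real request carries more parameters than shown here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_redirect(idp_endpoint: str, return_to: str) -> str:
    """Relying party side: send the browser to the identity provider."""
    query = urlencode({
        "openid.mode": "checkid_setup",
        "openid.return_to": return_to,  # where the browser must come back to
    })
    return f"{idp_endpoint}?{query}"


def relying_party_seen_by_idp(request_url: str) -> str:
    """Identity provider side: the return address names the relying party."""
    params = parse_qs(urlsplit(request_url).query)
    return urlsplit(params["openid.return_to"][0]).hostname


# Hypothetical hosts, for illustration only.
url = build_redirect("https://idp.example/auth", "https://shop.example/login/done")
seen = relying_party_seen_by_idp(url)
```

No collusion is involved: the identity provider learns `seen` simply by reading the request it needs in order to do its job.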

CardSpace breaks with this paradigm, and I hope Ben comes to recognize this.

To support use cases where we do NOT want linkability, CardSpace hides the identity of the relying party from the identity provider. If we had built it without this feature, it too would have been “panoptical by design”. But we didn't do that. We built it to conform with the Fourth Law. In other words, CardSpace does not provide unnecessary handles to facilitate linkability.

How does CardSpace hide the identity of the relying party? It associates some random information – unknown to the identity provider – with each Information Card. Then it hashes this random information (let's call it a “salt”) with the identity of the site being visited. That is conveyed to the identity provider instead of the identity of the site. We call it the “Client Pseudonym”. Unlike a Liberty Alliance client pseudonym, the identity provider doesn't know what relying party a client pseudonym is associated with.

The identity provider can use this value to determine that the user is returning to some site she has visited before, but has no idea which site that would be. Two users going to the same site would have cards containing different random information. Meanwhile, the Relying Party does not see the client pseudonym and has no way of calculating what client pseudonym is associated with a given user.
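The properties just described can be sketched in a few lines. This is only an illustration of the idea, not the actual CardSpace computation: the choice of SHA-256, the 32-byte salt, the string encoding of the site identity, and the function names are all assumptions made for the example.

```python
import hashlib
import secrets


def new_card_salt() -> bytes:
    """Random per-card value, kept on the client and never sent to the identity provider."""
    return secrets.token_bytes(32)  # length is an assumption


def client_pseudonym(card_salt: bytes, site_identity: str) -> str:
    """Hash the card's salt together with the identity of the site being visited."""
    h = hashlib.sha256()  # hash function is an assumption
    h.update(card_salt)
    h.update(site_identity.encode("utf-8"))
    return h.hexdigest()


# Two users, each holding a card with its own random salt (hypothetical sites).
alice_salt = new_card_salt()
bob_salt = new_card_salt()

alice_at_shop = client_pseudonym(alice_salt, "https://shop.example")
alice_at_shop_again = client_pseudonym(alice_salt, "https://shop.example")
alice_at_bank = client_pseudonym(alice_salt, "https://bank.example")
bob_at_shop = client_pseudonym(bob_salt, "https://shop.example")
```

Here `alice_at_shop` equals `alice_at_shop_again`, so the identity provider can tell that the user is returning to some site she has visited before. `alice_at_bank` and `bob_at_shop` are different values, and without the salt the identity provider cannot work backwards from any of them to the site.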

The question now becomes that of how identity providers behave. Given that suddenly they have no visibility onto the relying party, is linkability still possible? I'll discuss this next.

5 thoughts on “Ben Laurie on Selective Disclosure (Part 2)”

I appreciate your clarification of how CardSpace hides the identity of the relying party from the identity provider, as this was not at all clear to me from anything I'd read to date on CardSpace. In fact, if anything the introductory documentation implies the opposite: here's a quote from “Introducing Windows CardSpace” (perhaps the basis for Ben's assumption?):

It's also important to note what's not in an information card: sensitive data about this identity. For example, an information card created by a credit card company would not contain the user's credit card number. While this kind of sensitive information might appear as a claim in a security token created by an identity provider, it is always stored at the identity provider's system. When sent in a security token, this information is typically encrypted, making it inaccessible to both attackers and CardSpace.

How can the identity provider encrypt anything in a way which can only be read by the relying party if the identity provider does not know who the relying party is?

Furthermore, if the identity provider doesn't know who the relying party is, then why does CardSpace need to send a “Client Pseudonym” at all? What good does that do?

To the best of my understanding the linkability concern with CardSpace is in the case where the relying party and identity provider are in cahoots with each other, comparing timestamped requests. Because CardSpace requests a fresh claim from the IP each time it needs to provide one to an RP, there is room for IP-RP misbehavior. As you made clear, though, this is by no means unique to CardSpace.

Gee – you have won the “best formatted comment” award hands down – and your points are great too! They will help me clarify.

It is hard to get the subtlety of what we have done across – and even harder to control all the “collateral”. I hope I haven't appeared dismissive of Ben's comments, which I really value, especially to the extent that our messaging hasn't been clear enough.

I know we all share the same goals. And I appreciate both you and Ben helping to clarify these issues.

I am also very curious about why a Client Pseudonym is sent, and why it is computed that way. The idea of hashing with a salt is used to make brute force (rainbow table) attacks against a hash more difficult. The salt only helps you when it is kept secret or when the hashed data (other than the salt) has some entropy. In the latter case, the type of security this provides is less than you would typically hope for, especially when you are hashing low entropy data like an RP's name.

But if we are in the former case, you aren't given the salt, and the hash is meaningless. You can't check that you were given a meaningful hash because you don't know what salt to use. Why not make the Client Pseudonym a random nonce (that you reuse for that particular RP) and forget about hashing the RP's name? That would make sense, given that the IP is not supposed to be able to check (i.e. hash and compare) the RP's name later on.

It seems that the Client Pseudonym must have another use beyond this case…