Commentary

The EFF SSL Observatory is a project to investigate the certificates used to secure all of the sites encrypted with HTTPS on the Web. We have downloaded a dataset of all of the publicly-visible SSL certificates, and will be making that data available to the research community in the near future.

This is exciting. I knocked together a less ambitious version of this last year, but the EFF guys are doing it like grown-ups, and are getting some interesting data.

Numbers-wise, they’re in the right ballpark, as far as I can tell. Their numbers (1-2m CA-signed certs) coarsely match ones I’ve seen from private sources. I’ve heard from a few CAs that public-crawl estimates tend to err 50-80% low, since they miss intranet dark matter, but at least the EFF’s numbers track other public crawls. Given that their collection tools and data are going to be made public, that’s a really big deal. Previously, I haven’t been able to get this kind of data without paying for it or collecting it myself. If the database is actively maintained and updated, this will be a great resource for research.

Their analysis of CA certificate usage is also interesting. I’d like to see more work done here, and in particular I’d like to see how CA usage breaks down between the Mozilla root store and others. We spend considerable effort managing our root store, and recently removed a whole pile of CA certificates that were idle. In some places, the paper seems to claim that fully half of trusted CAs are never used, but in other places, the number of active roots they count exceeds the size of our entire root program. I understand why they blurred the line for the initial analysis, but it would be swell to see it broken out.

As they mention, there are legit reasons for root certs to be idle, particularly future-proofing. We have several elliptic curve roots, and some large-modulus RSA roots, that are waiting for technology to catch up before they become active issuers; they also give CAs a panic switch in the case of an Interesting Mathematical Result. That feels okay to me. On the other hand, if there are certs that are simply redundant, it would be great to know, so that we can have that conversation with the relevant CAs and understand whether they need to stay active.

This is exactly what I hoped would come of my crawler last year, but they’ve done a much more thorough job. We’ve seen an uptick in research interest in SSL over the last few years. Having a high quality data source to poke when testing a hunch is going to make it easier to spot trends, positive or otherwise. Interesting work, folks; keep it going!

Kathleen Wilson works for the Mozilla Corporation, and manages our queue of incoming certificate authority requests. She coordinates the information we need from the CAs, shepherds them through our public review process and, if approved, files the bugs to get them into the product.

Q: Holy crap! One person does all of that? Is she superhuman?

It has been proven by science. She is 14% unobtainium by volume.

Q: That’s really awesome, but I am a terrible, cynical person and require ever-greater feats of amazing to maintain any kind of excitement.

She came into a root program with a long backlog and sparse contact information, and has reduced the backlog, completely updated our contact information, and is now collecting updated audit information for every CA, to be renewed yearly.

Q: Hot damn! She’s like some kind of awesome meta-factory that just produces new factories which each, in turn, produce awesome!

I know, right? She has also now removed several CAs that have grown inactive, or for which up-to-date audits cannot be found. They’ll be gone as of Firefox 3.6.7. They’re already gone on trunk.

Q: Wait, what?

Yeah – you can check out the bug if you like. I’m not positive, but I think this might represent one of the first times that multiple trust anchors have ever been removed from a shipping browser. It’s almost certainly the largest such removal.

Q: I don’t know what to say. Kathleen completes Mozilla. It is inconceivable to me that there could be anything more!

In her spare time, she’s coordinating with the CAs in our root program around the retirement of the MD5 hash algorithm, which should be a good practice run for the retirement of 1024-bit RSA (and eventually, in the moderately distant but foreseeable future, SHA-1).

She has invented a device that turns teenage angst into arable land suitable for agriculture.

Fully 2 of the above statements are true!

Q: All I can do is whimper.

Not true! You can also help! Kathleen ensures that every CA in our program undergoes a public review period where others can pick apart their policy statements or issuing practices and ensure that we are making the best decisions in terms of who to trust, and she’d love you to be a part of that.

For those who haven’t seen it, scam-detectives.co.uk has a really interesting 3-part interview with a former Nigerian scammer.

Scam-Detective: A reader has asked me to talk to you about face to face scams. Were you ever involved in meeting a victim, or was all of your contact by email?

John: I never met a victim, but I was involved in a couple of Wash-Wash scams.

Scam-Detective: Wash Wash scams? What does that involve?

John: We would tell the victim that we had a trunk full of money, millions of dollars. One victim met some of my associates in a hotel in Amsterdam, where he was shown a box full of black paper. He was told that the money had been dyed black to get through customs, and that it could be cleaned with a special chemical that was very expensive. My associates showed him how this worked with a couple of $100 bills from the top of the box, which they rinsed with some liquid to remove the black dye. Of course the rest of the bills were only black paper, but the victim saw real money. He handed over $27,000 (about £17,000) to buy the chemicals and was told to return to the hotel later that day to pick up the cash. Of course when he came back, there was nobody there. He couldn’t report it to anybody because if it had been real it would have been illegal, so he would have gotten himself into trouble.

We build tools in Firefox like stale-plugin warnings and malware blocking to help protect our users, to neuter the technological attacks they may encounter on the web. But we also try, and need to keep trying, to build tools that inform our users so that they can make better decisions. Our phishing warnings and certificate errors try to do this, but mostly by scaring users away from specific attack situations. I hope we’ll continue to build tools like Larry which try to give people some affirmative context as well, to lend some nuance to their sense of place online. I want us to help our users know when they’re on Main Street, and when they’re in an alley.

I know: People get conned in the real world, too, and certainly no browser UI is going to save you from an email-based scam. Stories like this, though, are just specific instances of what I believe to be a more universal principle:

the biggest security risk most people face is misplaced trust

John: Some of the blame has to go to the victims. They wanted the money too because they were greedy. Lots of times I would get emails telling me that they wanted more money than I was offering because of the money they were having to send. They could afford to lose the money.

Scam-Detective: John, I think you have been basically honest with me so far. Please don’t stop that now. You know as well as I do that not all of your victims were motivated by greed. I have seen plenty of scam emails that talk about dying widows who want to give their money to charity, or young people who are in refugee camps and need help to get out. You targeted vulnerable, charitable people as well as greedy businessmen, didn’t you? You didn’t care whether they could afford it or not, did you?

John: Ok, you are right. I am not proud of it but I had to feed my family.

If you have ideas for how we can help users place their trust online more deliberately and carefully: please comment here, or build an addon, or file a bug.

I don’t know if there are people out there who like the way they sound in audio recordings, or look on video. I certainly don’t. I don’t think it’s a self-image issue, either; and I know I’m not alone. My recorded voice lacks the resonance I experience internally, and my recorded image just looks… mouthier (?!) than I imagine myself to be. I don’t even know what that means.

Proposed:

Nightingale’s Corollary to the Uncanny Valley Hypothesis: The depth of one’s psychological attachment to, and familiarity with, one’s own image, amplifies feelings of canniness or uncanniness. This can result in greater than average affinity for moderately dissimilar representations (c.f. the popularity of “realistic cartoon avatar” generators, or caricature artists), but also particularly heightened sensitivity to minor dissimilarities.

[Discuss. Cite examples.]

The Point (i.e. Where You Should Have Started Reading)

I bring this up because the inimitable duo of Alix and Rainer recently took some of my scattered ramblings and knit them together into an educational piece on some of the security features in Firefox. I think they did a lovely job:

In very much related news, Drew worked with Alix and Rainer to put together a video that talks about some of Firefox’s privacy features. I find it much easier to listen to Drew’s calm, matter of fact, “we did awesome stuff, and want you to know about it” delivery. I suspect you will, as well.

To a first approximation, I think you can gauge how much people think about software quality by how highly they value deletion. While most rookie developers are chiefly interested in building rather than in tearing down (for what I hope are obvious reasons), great throbbing brains like Graydon speak about deletion with the kind of reverence that I presume cardinals reserve for only the coolest of popes.

In what history will likely judge as a vain attempt to impress him, then, I recently landed bug 513147, deletion of the now antiquated “Properties” dialog that used to be available on right-clicking things like images and links. Not because it was useless (every feature is someone’s baby, and is added for a reason) but because it wasn’t useful enough, to enough people, to justify the cost.

50kb of code in our product that is poorly understood, not often used, and not covered by unit tests is not free. When bugs show up, it takes longer than it should to fix them. If a security bug were to show up (which is always a risk when content mixes with chrome, however remote it may seem) it would be particularly expensive for us to reload that context into our brains to fix it.

Deleting it isn’t free either, of course – there are 4 extensions that build off that dialog that will need to be updated, and there may be some who use it regularly who will be disappointed. But the forces of software (inertia, squeaky wheels, cynicism and inertia) bias so heavily towards keeping code in the tree that we should all try to take clear deletion opportunities when they come up. Not capriciously, not without sensitivity to the impact it can have, but with recognition that the hidden cost to keeping them is also large and… hidden.

It is in the spirit of this sensitivity that we, on the Firefox team, have tagged this bug and others like it: [killthem]. What else do you think should go? (And please, be gentle. Remember, every feature is someone’s baby.)

[Update: Geoff Lankow has taken the code that used to be built in, and made it into an add-on, which I think is fantastic. As I said to him, and as I said above, my assertion has never been that the code was useless, just that it wasn’t useful enough to justify its cost in the core product. An add-on is a great place for functionality like that, and I thank Geoff for his work.]

While talking to press in North America and Europe about Firefox 3.5 (you’re already running it, right?) one topic that really resonated with people was the way we pushed on privacy in this release.

I think, initially, some people viewed our private browsing mode as a checklist feature. Even though we’d been working on it since before Firefox 3, it wasn’t strong enough for us to ship until 3.5, and in the interim other browsers implemented versions of the same functionality. I really like the way we’ve done it, and there seem to be significant differences between the various browsers’ implementations, but regardless of all that I also don’t think that any private browsing mode is a complete solution.

Private browsing mode assumes that you will always know ahead of time that you’re about to do privacy-sensitive things. In Firefox 3.5, we tried to match more closely the way people actually use the browser, and sometimes that means they need to clean up after the fact – forgetting a slice of time, or a particular site. It also means that sometimes they want their browser to remember things, sensitive bookmarks for example, but not publicize those in the location bar. People’s use of a web browser in 2009 is more nuanced than:

Public / Private

Alex Faaborg has done a fantastic job detailing many of the privacy features in the latest release of Firefox. I’d encourage you all to check it out.

A couple weeks ago I was attending a panel discussion at the Computers, Freedom and Privacy conference in DC (featuring our very own Mike “Gillette Mach 3” Shaver) when Betsy, from Google Economics, started talking about their behaviour-based advertising.

She was making a point about how Google gives users control over the kind of ads they see, and she mentioned this:

I think I always knew that the “Ads by Google” text at the bottom of ads was clickable – I’ve probably even clicked it. Historically though, it’s just been a sales pitch for would-be advertisers and content authors. Now, when you click on it (go on, there’s one at the bottom of this post), there’s a link to your very own “Ad Preferences Manager.”

This page tells you what Google thinks you’re interested in based on the browsing habits it’s observed, and hence what kinds of ads it wants to show you (seriously, go check it out). It also gives you the option to add/remove interests, or opt out entirely.

Betsy, from Google, was talking about how they had been trying to really get the word out to people about this interface, so that people could control their ad experience. I wasn’t sure whether that message was reaching people – even people who might care about the information advertisers collect.

A couple of questions, then:

Did you know about this page?

Do the contents there surprise you? How accurate are they?

How does it all make you feel? Are you more comfortable, knowing that you have some control? Or are you less comfortable, seeing the profile laid out like that?

When I blogged about my database of SSL certs from the top 1M Alexa sites, it got much more reaction than I expected. It’s nice to have peers in this microcosm of nerdspace.

Easily the most often requested improvement was to include intermediates in the database. People wanted to see which issuers had a bunch of subordinate CAs and which issued right from the root. They wanted to see what kind of key sizes and algorithms CAs chose, and how they compared to the key sizes and algorithms used in regular site certs.

I’ve gone and re-crawled to gather that information now, and you can download the zipped db (509M). It’s still an SQLite3 database, though I’ve changed the schema a bit, with certificates now stored in their own table. Let me know in the comments/email if you need help working with the data.

The schema, if you can call it that, was 100% expediency over forethought, so I would welcome any suggestions on DB organization and performance tweaking. I have done no optimizing, so low-hanging fruit abounds; a complicated query can take more than a day right now, so your suggestions will have visible effects!
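For anyone poking at a database like this, the usual first low-hanging fruit is indexing the columns that show up in WHERE clauses and JOINs, then checking the query plan. Here’s a minimal sketch in Python’s sqlite3 module; the table and column names (“certs”, “issuer”, “key_size”) are hypothetical stand-ins, so adjust them to whatever the real schema uses, and the tiny in-memory table stands in for the 509M download:

```python
import sqlite3

# An in-memory db with a few sample rows, standing in for the real
# certificates table. Table/column names here are illustrative only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE certs (issuer TEXT, key_size INTEGER)")
cur.executemany(
    "INSERT INTO certs VALUES (?, ?)",
    [("CA One", 1024), ("CA One", 2048), ("CA Two", 2048)],
)

# Without an index, every lookup is a full table scan; indexing the
# filtered column turns scans into searches.
cur.execute("CREATE INDEX idx_certs_keysize ON certs(key_size)")

# ANALYZE gathers statistics so the query planner chooses sensibly
# when more than one index could apply.
cur.execute("ANALYZE")

# EXPLAIN QUERY PLAN shows whether a query uses the index
# ("SEARCH ... USING ... INDEX") or falls back to a scan ("SCAN").
plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM certs WHERE key_size < 2048"
).fetchall()
print(plan)

count = cur.execute(
    "SELECT COUNT(*) FROM certs WHERE key_size < 2048"
).fetchone()[0]
print(count)  # one sub-2048-bit cert in the sample data
conn.close()
```

On a multi-gigabyte file the same two steps (CREATE INDEX on the hot columns, then ANALYZE) are often the difference between a day-long query and a minutes-long one, at the cost of some disk and one-time index-build time.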