Sunday, February 26, 2012

For a while, Firefox has included malware and phishing protection to
keep our users safe on the web. Recently, Gian-Carlo Pascutto made some
significant improvements to Firefox's support for this feature,
resulting in much more efficient operation and use of the Safe Browsing
API.

Privacy in the Safe Browsing API

I want to take a little time to explain how this feature works and why I
like it from a privacy perspective: Firefox can check whether or not a
web site is on the Safe Browsing blacklist without actually telling the
API what the web site is called.

At a high level, using
this API to find URLs on the "bad" list is like asking your friend to
identify whether or not he likes things you show him through a dirty
window. Say you hold up an apple to the dirty window and your
friend on the other side sees a fuzzy image of what you're holding. It
looks round and red and pretty small, but he's not sure what it is.
Your friend looks at his list of things he doesn't like and says he
likes everything like that except for plums and red tennis balls. While
he still does not know exactly what you're holding, you can know for
sure he likes the apple.

More technically, this uses a
hash function to turn web URLs into numbers. For practical purposes,
each number corresponds to exactly one URL. For each site you visit,
Firefox hashes the URL and
sends the first part of the resulting number to the Safe Browsing API.
The API responds with any values on the list of bad URLs that start with
the value it received. When Firefox gets the list of "bad" site hash
values that match the first part, it looks to see if the entire hash is
in the list. Based on whether or not it's in the provided list of bad
stuff, Firefox can determine whether the URL is on the Safe Browsing
blacklist or not.
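
For the code-minded, here is a minimal sketch of that exchange in Python. It is an illustration under stated assumptions, not Firefox's actual implementation: I'm assuming SHA-256 as the hash and a 4-byte prefix, and FakeSafeBrowsingServer, url_hash and is_blacklisted are made-up names that stand in for the real service and client.

```python
import hashlib

PREFIX_BYTES = 4  # assumed prefix length for this sketch


def url_hash(url: str) -> bytes:
    """Turn a URL into a fixed-size number (here: a SHA-256 digest)."""
    return hashlib.sha256(url.encode("utf-8")).digest()


class FakeSafeBrowsingServer:
    """Hypothetical stand-in for the API; it only ever sees prefixes."""

    def __init__(self, bad_urls):
        self._bad_hashes = {url_hash(u) for u in bad_urls}
        self.seen_prefixes = []  # everything the "server" learns from clients

    def full_hashes_for(self, prefix: bytes):
        self.seen_prefixes.append(prefix)
        # Return every bad hash that starts with the prefix it was sent.
        return {h for h in self._bad_hashes if h.startswith(prefix)}


def is_blacklisted(url: str, server: FakeSafeBrowsingServer) -> bool:
    full = url_hash(url)
    # Only the first part of the hash leaves the browser.
    candidates = server.full_hashes_for(full[:PREFIX_BYTES])
    # The comparison against the entire hash happens locally.
    return full in candidates
```

The point to notice is that the server's seen_prefixes list only ever holds short prefixes; the full hash, and the URL behind it, never leave the client.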

Consider this hypothetical example of two sites and their (fake) hash values:

Site                       Hash Value
http://mozilla.com         1339
http://phishingsite.com    1350

When you visit http://mozilla.com, Firefox
calculates the hash of the URL, which is 1339. It then asks the Safe
Browsing API what bad sites it knows about that start with "13". The
API returns a list of numbers including "1350". Firefox checks that
list, sees that 1339 (http://mozilla.com) is not in it, and concludes
that the site must be okay.

If you repeat the same procedure with
http://phishingsite.com, the same prefix "13" is sent to the API, and
the same list of bad sites (including 1350) is returned. In this case,
however, the site's hash is "1350" so Firefox knows it's on the list of
bad sites and gives you a warning.
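
The same walk-through can be written as a toy script. The hash values are the fake ones from the example above, the two-character prefix matches the "13" in the text, and api_lookup and check are hypothetical helper names used only for this illustration.

```python
# Fake hash values from the example; real hashes are much longer.
FAKE_HASHES = {
    "http://mozilla.com": "1339",
    "http://phishingsite.com": "1350",
}

# What the pretend API knows: hashes of bad sites only.
BAD_HASHES = {"1350"}


def api_lookup(prefix: str) -> list:
    """Pretend API call: return all bad hashes starting with the prefix."""
    return sorted(h for h in BAD_HASHES if h.startswith(prefix))


def check(url: str) -> tuple:
    """Return (prefix sent to the API, whether the URL is on the bad list)."""
    full_hash = FAKE_HASHES[url]
    prefix = full_hash[:2]          # the only thing sent over the network
    matches = api_lookup(prefix)    # e.g. ["1350"]
    return prefix, full_hash in matches  # final comparison is done locally


for site in FAKE_HASHES:
    sent, is_bad = check(site)
    print(site, "sent:", sent, "blocked:", is_bad)
```

Both visits send the identical prefix, so from the API's side the two requests look the same; only Firefox knows which full hash it was holding.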

For you techies and
geeks out there: yeah, I'm glossing over a few protocol details, but the
gist is that you don't need to tell Google exactly where you browse in
return for the bad-stuff blocking.

Keeping the Safe Browsing Service Running Smoothly

Google hosts the Safe Browsing service on the same infrastructure as
many of their other services. They need to ensure that our users aren't
blocked from accessing the malware and phishing blacklists, and they
need to invest in the right resources to keep the service operating
well. One of the mechanisms they need for this quality-of-service
assurance is a cookie, so the first request Firefox makes to the Safe
Browsing API results in a Google cookie being set.

I know that not everyone likes that cookie, but Google needs it to make
sure their service is working well, so I've been working with them to
ensure that they can use it for quality-of-service metrics but not to
track you around the web. The most straightforward way to do this is to
split the Firefox cookie jar into two: one for the web and one for the
Safe Browsing feature. It's not there yet, but with a little
engineering work, in a future version of Firefox that cookie will only
be used for Safe Browsing, and not sent with every request to Google as
you browse the web.
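
To make the two-jar idea concrete, here is a rough sketch in Python. It only models the concept; it is not how Firefox stores cookies, and SplitCookieJar, the context names and the qos_id cookie are all hypothetical.

```python
class SplitCookieJar:
    """Toy model: cookies are stored per context, never shared between them."""

    def __init__(self):
        # One independent jar for ordinary browsing, one for Safe Browsing.
        self._jars = {"web": {}, "safebrowsing": {}}

    def set_cookie(self, context: str, host: str, name: str, value: str) -> None:
        self._jars[context].setdefault(host, {})[name] = value

    def cookies_for(self, context: str, host: str) -> dict:
        # A request only ever sees cookies from its own context.
        return dict(self._jars[context].get(host, {}))


jar = SplitCookieJar()
# The Safe Browsing request sets a (hypothetical) quality-of-service cookie.
jar.set_cookie("safebrowsing", "google.com", "qos_id", "abc123")

print(jar.cookies_for("safebrowsing", "google.com"))  # cookie is available here
print(jar.cookies_for("web", "google.com"))           # nothing sent while browsing
```

With this separation, the cookie can still serve quality-of-service measurement for Safe Browsing requests without riding along on ordinary page loads.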

While Firefox has been using Safe Browsing for a while, Google has
started experimenting with a couple of new features in Safe Browsing
for additional malware and phishing filtering. Both features are quite
new, and it's not yet clear how effective they are or what percentage
of my browsing history would be traded for this improvement. Both
involve sending whole URLs to Google, and departing from Firefox's
current privacy-preserving state requires evidence of a significant
gain in protection. When Google measures and shares how much protection
their pilot deployment in Chrome gains, we can take a deeper look and
consider whether these new features are worth it.

For now, Firefox users are getting a lot of protection for very little
in return, and there does seem to be good reason for Google to use
cookies with Safe Browsing. We are always looking for things we can do
to give Firefox users the best of both privacy and security.