Firefox Sync:

Then and Now and Soon

Brian Warner, Mozilla Identity

warner@mozilla.com

Hi everybody, thanks for coming out.
My name is Brian Warner, I work in the Mozilla Identity group, and today I'm
gonna talk about the cryptography in Firefox Sync, which is the feature that
keeps bookmarks, tabs, passwords, and other state synched between various
Firefox browsers on your laptop, desktop, and mobile device.
I'm going to talk about the protocols we've used, the problems we've observed
with them, and the compromises that are guiding our next generation of
protocols.

Browser Data Synchronization

This story starts back in 2007.
At that point, to keep bookmarks in sync between multiple browsers, you had
to use a third-party extension like FoxMarks, or just keep all your URLs in a
third-party server like delicious where they weren't really bookmarks. And
there were even fewer options for integrated management of passwords, open
tabs, themes, preferences, form-fill data, history, and the other bits of
state that browsers keep track of.
Most of the third-party options at that time held your data in plaintext on
their servers. This made a lot of people nervous, and there was demand for
some kind of encryption.

Firefox Sync (née Weave)

Firefox extension by Mozilla Labs, 2007-2010

username + password + passphrase

So in late 2007, the Mozilla Labs team created an extension called Weave to
fix this. By the time I joined in 2009, it was working pretty well,
synchronizing all the major data types, including open tabs, which was pretty
novel at the time. If you're reading something in a tab at work, you don't
even have to bookmark it, you can just walk away from your desk, and then
pick right back up where you left off when you get home to your other
browser. It also synched history, which is great because that's used to drive
the URL autocompletion feature, so you can type a couple letters of the URL
or site title and get the same recommendations everywhere.
Weave also synchs Firefox's password-manager database, which makes it easier
to use better passwords that aren't so memorable. Which is, of course, the
first step towards getting rid of them altogether.
So both because we're synching passwords, and because which sites you're
visiting is a private matter that we take very seriously, encryption was a
high priority very early on. The Weave team implemented this with an extra
passphrase, beyond the usual account password, that never left the client
device. It was used to derive a master encryption key. I'm going to skip over
the details, there was some more crypto in between this master key and the
actual ciphertext that got uploaded, but when I reviewed it in 2009, I was
pretty satisfied with that part.

Now, we all know how lousy passwords are, so we were worried about people
managing this secret phrase, and were looking for an alternative. And about this
time, someone told me about the J-PAKE algorithm. This is a member of the
PAKE family: password authenticated key exchange algorithms, which let two
parties safely bootstrap from a low-entropy password up to a strong session
key, without giving an eavesdropper any information about the key. Even an
active man-in-the-middle only gets one guess. You're basically spending
roundtrips to buy protection against offline attacks.
The crypto community has a lot of experience with PAKE algorithms, but they
aren't widely deployed. SRP is the best known, and even so it's hard to find
examples of it being used in the wild. SRP is an "augmented"
PAKE, which means the server doesn't store the user's password, but only
something derived from it. J-PAKE is a "balanced" PAKE, which means both
sides start with the same password.

Credential Transfer

So it was kind of crazy and experimental, but that's what Labs is all about,
so the neat crazy thing we did in 2010 was to glue J-PAKE into Weave.
The first time you set up sync, your one device creates a long random
passphrase, and uses that to encrypt your data. Then, when it comes time to
set up the *second* device, we use J-PAKE to safely transfer your credentials
from the old machine to the new one. One of the devices displays a short
random transfer code, and you type it into the other device. The two devices
use that code as a PAKE password, and establish a strong session key. Then
they encrypt everything necessary to set up Sync in that session, including
the passphrase.
We describe this as "pairing" your devices together, like pairing a bluetooth
keyboard with your computer. I know, it's not as sexy as Weil pairing or Tate
pairing.
We thought of a few corner cases, like where you're trying to pair two
desktops together and you can't be in two places at once. So we kind of waved
our hands and figured we could expose the encryption key in some advanced
prefs panel, so you could manually copy and paste it, if really necessary.
But the net result is that you don't have to remember a strong password to
get strong security. Your data is end-to-end encrypted, with a key that lives
only on your own devices: you never even see it.

Sync 1.3, now with J-PAKE

included in Firefox 4.0 (March 2011)
So in March of 2011, we shipped Firefox 4.0 with a J-PAKE-based Weave as a
standard feature. We renamed it Firefox Sync to make it sound more official.
We released Firefox for Android at the same time, and we made a big deal
about keeping your desktop and your mobile phone browsers in sync.
When you set up a device, you get a screen like this one.
The final version used a 12-character one-time sync code. The first four
characters give you 20 bits of a channel ID, to make sure you're doing PAKE
with the right browser, among all the browsers that are currently trying to
connect. The other 40 bits are the J-PAKE shared secret. We could probably
have gone with a shorter code, but we wanted to be conservative.
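The arithmetic behind the 12-character code can be sketched as follows. This is only an illustration: the 32-symbol alphabet is an assumption (the talk says only that four characters carry 20 bits, which implies 5 bits per character), the function names are hypothetical, and the code is generated locally here purely for demonstration, since how the real channel ID was allocated is not described.

```python
import secrets
from typing import Tuple

# Hypothetical 32-symbol alphabet (5 bits per character). The real character
# set is not given in the talk; this one just drops the look-alikes l, o, 0, 1.
ALPHABET = "abcdefghijkmnpqrstuvwxyz23456789"
BITS_PER_CHAR = 5
CHANNEL_CHARS = 4  # 4 * 5 = 20 bits: selects the rendezvous channel
SECRET_CHARS = 8   # 8 * 5 = 40 bits: the low-entropy J-PAKE shared secret


def new_sync_code() -> str:
    """Generate a random 12-character code (illustrative only)."""
    total = CHANNEL_CHARS + SECRET_CHARS
    return "".join(secrets.choice(ALPHABET) for _ in range(total))


def split_sync_code(code: str) -> Tuple[str, str]:
    """Split a typed-in code into (channel_id, pake_secret)."""
    if len(code) != CHANNEL_CHARS + SECRET_CHARS:
        raise ValueError("sync code must be 12 characters")
    if any(c not in ALPHABET for c in code):
        raise ValueError("sync code contains an unexpected character")
    return code[:CHANNEL_CHARS], code[CHANNEL_CHARS:]


def bits(chars: str) -> int:
    """Number of bits carried by a run of code characters."""
    return len(chars) * BITS_PER_CHAR
```

Only the 40-bit secret half would be fed into J-PAKE as the password. The channel half just tells the two browsers where to find each other, so it does not need to stay secret from the server.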

Awesome!

great security, even against the server

no passwords to remember

Then on your second browser, you just paste the code into this box.
In theory, this should have been awesome. There are no passwords to remember,
no keys to manage, but all your data is encrypted so well that even our
servers don't have the slightest chance to see it. Adding a new device looks
a lot like pairing a bluetooth keyboard: all you have to do is type in a few
characters.
I'm still pretty proud of this scheme, and everyone I've ever described this
to absolutely loves it.

Not So Awesome

But what we learned is that a lot of people who actually *used* it got really
really confused. Apparently I need to get out more and talk to someone other
than crypto nerds. These are some recent quotes from SUMO, the Firefox
support forums.
I love this first one: "I cannot for the life of me figure out firefox sync".
Some of the problems are kinda shallow, but some of them are a lot deeper.
If we unpack these problems, it's clear that the account creation process is
kind of deceptive. When I was working on the protocol, I had this sort of
platonic ideal in my head of no passwords at all, where the setup process on
the first device would have a single button that just said "enable sync",
with no other inputs.

Problem #1: incomplete transition

pairing replaced passphrase

but email/password was left in

Our new fancy pairing scheme replaced the pass*phrase*, but for a variety of
reasons, when we built this, we retained the email address and account
pass*word*.
The reasons made sense at the time. We wanted the ability to contact our
users in case something went wrong, so we wanted that email address. And we
wanted to give folks the ability to delete their whole account easily, to be
forgotten, which needs some sort of account identifier and a way to prove
control over it.
And then the longer-term vision was to create a larger Account thing and use
it for more services than just keeping data in sync.
But since this looks just like the signin dialog on other systems, people
assumed that Sync worked just like those other ones, and were tricked into
thinking that this password was important. In fact, the account creation
dialog was the only place you ever really used this password.

Problem #2: no single-device recovery

When we looked at the stats, we learned that over half of the people using
this only had one device connected. They weren't pairing with anything, they
weren't doing end-to-end encryption because there was only one end. They were
doing "end" encryption. They were getting zero value out of it. So something
was pretty wrong.
And when those folks lost their one device, when their hard drive got erased
or whatever, they wanted to recover their data, and hit the "Set Up Sync"
button, and they'd see this dialog, and they'd hit the "I Have an Account"
button, and get..
.. this weird pairing thing, this is the first time they'd ever seen this,
and they'd say, WTF. And eventually they'd see the little note in the corner,
"I don't have the device with me", which we put there to deal with the
desktop-to-desktop case, but they'd be thinking "of course I don't have the
device with me, that's the laptop that my five-year-old dropped into the
fishtank", so they'd push that button..
And they'd get a page that looks *almost* like the one they were expecting,
and they'd dutifully type in their email address and password, and then
discover that there's this weird "Sync Key" thing that they didn't know
about. That was the raw encryption key, which is possible to get at if you
work at it, but it's not obvious, because we didn't think people would really
need it.
"I didn't know I was supposed to write that thing down"
"Why didn't you tell me that"
"encryption is stupid. wah."
It's not obvious that you should ever get to this thing, because, pairing is cool, right?

Solving the Wrong Problem

We built Sync: connecting your devices to each other

incidentally provided an elegant security solution

But people wanted a backup service: connecting their device to a
server

They used Sync anyways, with bad results.

The net result is that, basically, we built the wrong thing. We have this
sort of elegant solution, but for the wrong problem. It serves a few people's
needs well, but leaves an awful lot of people bewildered and confused.
We designed this pairing thing to work fine for 2, 3, 4, multiple devices,
and didn't really expect that people with just one device would ever use it.
We kind of expected that they'd look at the description of what it does,
they'd say "oh, that's not for me, what a shame, I'll go use these other
services".
But they used it anyways, lots of them, with just one device, and they
expected it to work differently, and they were disappointed. We built a
*Sync*, a synchronization service, but what people really wanted was
something more like a backup service.
Also, just after we shipped this, some other products came out on the market
using this word "sync" in a very different way than we had.

New (contradictory) constraints

instructions: "Fix Sync!". Make it:

"secure"

recoverable-by-password

recoverable-by-email

use one password, not two

make it look more like a "normal" account system

So the Identity group, my team, was given the task to "fix Sync" and replace
this with something that most people could actually use. As usual, if you ask
four people what you should build, you get five different answers, and we got
a bunch of conflicting requirements.
We have this spreadsheet of user stories that were all mutually exclusive.
Some people want their data to be "secure", which is pretty vague, but if you
tell them it's so secure that even the server can't break in, that's usually
good enough for them. Other folks want to get their data back with just a
password, which is probably guessable, so it's not as secure as a
full-strength device-managed key. And other folks want to get their data back
even if they forget their password, which is clearly incompatible with being
secure against the server.
But, the overall product direction we received was to make it look like a
"normal" sync system, which means we start with an email address and a
password, and as designers, we have to make it as secure as possible within
those constraints.

New SRP-based Design

So we huddled around the whiteboard for a while, and eventually came up with
this. I'll walk through the pieces.
The client starts with the user's email address and password, at the top
left. And the goal here is to wind up with a session token that lets it talk
to the storage servers, and the encryption keys it needs to encrypt and
decrypt the synchronized browser data, down in the bottom left.

Data-Protection Classes

class A: recoverable by email

class B: recoverable only by password

To resolve the contradiction between wanting something to be secure against
the server, and to be recoverable if you forget your password, we defined two
different classes of data protection, with the idea that users could choose
which one they wanted to use. Maybe they'd put their saved passwords into the
more secure one, and their bookmarks into the recoverable one. We didn't know
what the default would be, or what the UI would look like, but we started by
making sure the protocol would accommodate both.
So we defined the first protection class, "class A", to mean data that can be
recovered if you can prove control over your account, even if you forget the
password. This typically means that you can click on an unguessable link in a
challenge email. The server holds a master key "kA", from which other keys
are derived to encrypt the class-A data. The server knows this key, but the
storage machines don't, so at least we've been able to keep the storage boxes
outside of the security perimeter.
Then we defined "class-B" to mean data that requires knowledge of the
password to access. This data is encrypted with something derived from this
master "kB" key. The server isn't allowed to learn kB. It's only allowed to
hold a wrapped form that's protected by the user's password.
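The difference between the two classes can be sketched roughly like this. It is a simplification under stated assumptions: the XOR wrapping, the PBKDF2 stand-in for the password-derived key, and all names (`ServerRecord`, `create_account`, and so on) are hypothetical, chosen only to show that the server can hand back kA on its own but can only return kB in a form that needs the password to open.

```python
import hashlib
import secrets
from dataclasses import dataclass


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return bytes(x ^ y for x, y in zip(a, b))


def password_key(email: str, password: str) -> bytes:
    # Stand-in for a password-derived unwrapping key (assumption: the real
    # derivation is the stretching step the talk describes separately).
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), email.encode("utf-8"), 1000, dklen=32
    )


@dataclass
class ServerRecord:
    kA: bytes          # class A: the server knows this key outright
    wrapped_kB: bytes  # class B: the server only holds a wrapped form


def create_account(email: str, password: str):
    """Return (what the server stores, the kB only the client ever sees)."""
    kA = secrets.token_bytes(32)
    kB = secrets.token_bytes(32)
    record = ServerRecord(kA=kA, wrapped_kB=xor(kB, password_key(email, password)))
    return record, kB


def recover_by_email(record: ServerRecord) -> bytes:
    """Class A: after an email challenge the server can simply return kA."""
    return record.kA


def recover_by_password(record: ServerRecord, email: str, password: str) -> bytes:
    """Class B: the wrapped value is useless without the password."""
    return xor(record.wrapped_kB, password_key(email, password))
```

A forgotten password therefore costs the user everything protected by kB, while the class-A data survives an account reset.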

Client-Side Key-Stretching

client does not reveal password to server

To make that safe, we need to make sure the server doesn't learn the user's password.
There's sort of a spectrum of what it means to "know a password". If you know
the actual raw password, that's like level zero. Knowing a hash of the
password is like level 1. And then using more and more expensive
key-stretching hashes is like level 10, 20, 4000, etc.
So to protect the user's password, we start the whole process by running it
through about a full second of scrypt, in a configuration that uses about 64
meg of RAM, to get a stretch that's not something you could easily speed up
with a GPU or an ASIC. This happens on the client side, before talking to the
server. We salt the scrypt step with the user's email address, to avoid
waiting for the round trip before starting the process.
We use HKDF to derive two independent values from the stretched password. One
of them is held in reserve to unwrap the class-B key later. The other is used
as the password in an SRP protocol.
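The stretch-then-split step can be sketched roughly as follows. This is a minimal illustration, not the shipped code: the scrypt parameters (N=2^16, r=8, p=1) are assumptions chosen only because 128 * N * r bytes works out to the 64 MB figure, the HKDF `info` labels are made up, and the function names are hypothetical. It relies on `hashlib.scrypt`, which needs a Python build linked against OpenSSL 1.1 or later.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def stretch(email: str, password: str, n: int = 2**16, r: int = 8, p: int = 1) -> bytes:
    """Memory-hard stretch, salted with the email so no round trip is needed.

    scrypt needs about 128 * n * r bytes, i.e. 64 MiB with these defaults.
    """
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=email.encode("utf-8"),
        n=n, r=r, p=p,
        maxmem=256 * n * r,  # headroom above the 128 * n * r working set
        dklen=32,
    )


def derive_client_keys(email: str, password: str, **scrypt_params):
    """Return (srp_password, unwrap_b_key): two independent values."""
    stretched = stretch(email, password, **scrypt_params)
    srp_password = hkdf_sha256(stretched, b"example/srpPW")       # hypothetical label
    unwrap_b_key = hkdf_sha256(stretched, b"example/unwrapBKey")  # hypothetical label
    return srp_password, unwrap_b_key
```

Because the two outputs use different `info` strings, learning one of them (for example, whatever the server can infer from the SRP exchange) does not reveal the other, which is what lets the second value be held in reserve for unwrapping kB.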

SRP

By using SRP instead of sending the password or a derivative over the wire,
we deny eavesdroppers any attack on the password or keys that get exchanged
later. All these messages are exchanged inside a TLS connection, of course,
but it makes me feel better to know that we aren't relying upon that.
If someone manages to steal a copy of the server database, they can use the
SRP verifier as an oracle to perform a dictionary attack. For each password
guess they want to test, they run it through scrypt, derive the SRP password,
use that to derive the SRP verifier, and then compare it against the
database. They could also use the unwrap-B key to make the same check, by
unwrapping the server's class-B key and seeing if it makes sense. Both attacks
require the expensive scrypt stretch for each guess, making them relatively
expensive.
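The shape of that offline attack can be sketched as below. This is a toy under loud assumptions: `verifier_for` is a plain hash used as a stand-in for the real SRP verifier (which is a modular exponentiation, not shown here), the scrypt cost is turned far down so the example runs quickly, and the names are hypothetical. The point is only that every guess costs one full stretch.

```python
import hashlib
import hmac


def stretch(email: str, password: str, n: int = 2**10) -> bytes:
    # Deliberately cheap parameters for the demo; a real deployment would use
    # a much larger n so that each guess is expensive.
    return hashlib.scrypt(
        password.encode("utf-8"), salt=email.encode("utf-8"), n=n, r=8, p=1, dklen=32
    )


def verifier_for(email: str, password: str) -> bytes:
    # Stand-in for the SRP verifier: NOT real SRP, just "something the server
    # stores that is derived from the stretched password".
    return hashlib.sha256(b"verifier:" + stretch(email, password)).digest()


def dictionary_attack(email: str, stolen_verifier: bytes, guesses):
    """Try each guess against a stolen verifier.

    Returns (matching_password_or_None, number_of_stretches_performed).
    """
    stretches = 0
    for guess in guesses:
        stretches += 1  # one full scrypt run per guess, no shortcut
        if hmac.compare_digest(verifier_for(email, guess), stolen_verifier):
            return guess, stretches
    return None, stretches
```

With a one-second stretch, a word list of a million candidates costs the attacker about a million CPU-seconds per account, and the per-user salt means none of that work carries over to the next account.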
If the server turns completely evil, or the client is talking to the wrong
server because of a DNS or TLS attack, they only get one guess for each login
attempt.
So we designed this thing, figured out all the corner cases with password
changes and account resets, made sure there were no holes that would enable
an attack in some weird situations. The server never gets an attack better
than the scrypt stretch, and external attackers like eavesdroppers get almost
nothing. The weakest point is account creation, when you have to send the SRP
verifier up to the server. For that step we're relying upon TLS, but
everything else we could actually run safely over unencrypted HTTP.
It was pretty awesome. We implemented the whole thing, we got everything
working, but we kind of got some grumbling and angst about it.

Pushback

full spec looks pretty complex

SRP is underspecified: scary

implementing our own SRP (in JavaScript): scary

can't do server-side stretching with SRP verifier

slow clients, JS clients: performance worries

scrypt RAM usage vs small phones: OOM Killer

The full spec, when you include all the cases like changing passwords,
resetting the account after you forget the password, verifying email address,
etc, when you draw it all out, it looks pretty complex.
SRP is kind of underspecified: you have to pick a group, you have to decide
how to represent the group elements, and our environment meant we had to
write our own implementation, which always makes people nervous.
I like SRP, but one limitation is that you can't do any extra stretching of
the verifier, so all your protection against dictionary attacks has to come
from the client. And there are lots of slow clients out there. It'd be nicer
to use the consistent CPU power in the datacenter to help protect the user,
but with SRP and a single password you just can't do it.
And to make that client-side stretch meaningful, something that you can't
short-circuit with a GPU or an ASIC, you need to give scrypt a significant
amount of RAM. And 64 meg of RAM can be enough to kick your app out of memory
altogether, especially on a small android phone.
(The more crypto you use, the more people get nervous that they can't possibly
implement it correctly. Which is an interesting constraint.)

Scope Creep

new requirement: generalized accounts

auth-only, same password

don't care about encryption keys

login from arbitrary browsers

Finally, while we were building it, the scope of the project expanded to
include a more generalized accounts system, which could be used for more than
just Sync. There would be new clients which only care about authentication,
using the same password to prove control of an account, but not caring about
the encryption keys.
This meant the sync password, which our design so carefully protected from
eavesdroppers and server-side attacks, would be used from ordinary web pages
on arbitrary browsers, not just Firefox. And more developers would be writing
compatible clients, meaning more people to complain about the complexity of
the system.

"onepw" design

So we went back to the drawing board and put this together. So far it's been
holding up.
This design removes SRP, and does minimal stretching on the client side.
The client does a thousand rounds of PBKDF, again salted by the email
address, then derives two values as before. But instead of using one of them
as an SRP password, it just sends it to the server directly in a parameter
called authPW. The server immediately does a longer scrypt stretch with it,
and derives two values of its own. One is used to compare against the
database: this is the standard hashed-password scheme. The other is used to
decrypt the outer wrapping of a doubly-wrapped class-B key stored in the
database. The server returns the single-wrapped result to the client, who
uses their second derivative to unwrap it and obtain kB.
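The flow just described can be sketched end to end as follows. Treat it as a hedged illustration rather than the real protocol: PBKDF2-HMAC-SHA256 is assumed for "PBKDF", XOR is assumed for both wrapping layers, the per-account random salt for the server-side scrypt is an assumption, and the `info` labels, the `Server` class, and the scrypt cost are all hypothetical.

```python
import hashlib
import hmac
import secrets


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256 and an all-zero salt."""
    prk = hmac.new(b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def xor(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return bytes(x ^ y for x, y in zip(a, b))


def client_stretch(email: str, password: str):
    """Client side: 1000 rounds of PBKDF2 salted by email, then split in two."""
    quick = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), email.encode("utf-8"), 1000, dklen=32
    )
    return hkdf_sha256(quick, b"example/authPW"), hkdf_sha256(quick, b"example/unwrapBKey")


def server_stretch(auth_pw: bytes, salt: bytes, n: int = 2**14):
    """Server side: longer scrypt stretch of authPW, then split in two."""
    big = hashlib.scrypt(auth_pw, salt=salt, n=n, r=8, p=1, maxmem=256 * n * 8, dklen=32)
    return hkdf_sha256(big, b"example/verifyHash"), hkdf_sha256(big, b"example/outerWrap")


class Server:
    def __init__(self):
        self.db = {}  # email -> (salt, verify_hash, doubly_wrapped_kB)

    def create_account(self, email: str, auth_pw: bytes, wrapped_kb: bytes) -> None:
        salt = secrets.token_bytes(32)
        verify_hash, outer_key = server_stretch(auth_pw, salt)
        self.db[email] = (salt, verify_hash, xor(wrapped_kb, outer_key))

    def login(self, email: str, auth_pw: bytes) -> bytes:
        salt, verify_hash, doubly_wrapped = self.db[email]
        candidate, outer_key = server_stretch(auth_pw, salt)
        if not hmac.compare_digest(candidate, verify_hash):
            raise PermissionError("bad password")
        return xor(doubly_wrapped, outer_key)  # still wrapped once, for the client
```

On the client, `xor(server.login(email, auth_pw), unwrap_b_key)` yields kB. A client that only needs to authenticate can stop after computing authPW and never touch the unwrap key.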

"passive" attack

This scheme is just as strong as the previous one against a "passive server
attack", in which someone gets a copy of the database and decides to run an
offline dictionary attack against its contents. They must do the full scrypt
stretch for each guess they want to test.

"active" attack

But it's weaker against what you might call an "active server attack", in
which the bad guy gets control of the server while it's processing a login
request. That kind of attacker learns the authPW value, so they only have to
do the shorter PBKDF stretch for each guess they want to test. Someone who
can forge a TLS certificate gets the same kind of attack.

just auth

But notice that a client who merely wants to prove control over the account,
who doesn't care about getting the encryption keys, can be pretty simple. The
PBKDF stretch is pretty easy, and runs quickly even in JavaScript. The worst
case for this is probably an old copy of Internet Explorer on a slow
computer, since the JavaScript engine in that one was pretty bad, or a
really low-powered mobile phone. But 1000 rounds is pretty light.
Does that much stretching really help? Talk to me afterwards, it's a really
interesting question.
So this is the scheme we're busy building right now. Our target is to get
this released in Firefox 29, which ships at the end of April. We think it'll
provide enough usability and security for the majority of our users.

future directions

Ship it!: Firefox 29, late April 2014

Reintroduce Pairing

2FA

We've built up a pretty.. enthusiastic following for our initial
full-strength no-compromises security model, what you might call the
"popemobile" level of security. So we're trying to figure out how to retain
that level of security for the folks that want it.
One approach is to just use a good password: if it won't fall to a dictionary
attack, then this protocol will keep your data safe.
But we're also trying to figure out how to re-introduce pairing, in a way
that doesn't confuse people. We think we can do better this time, especially
if we start by having the user type in their email address and password. We
can use that to safely determine which other devices are configured and then
pop up the pairing dialog at the right time. In addition, we can make the
pairing code shorter because we're already protected by the normal account
password.
And of course we also have plans to add two-factor authentication into this
scheme.