It’s been a great four and a half years at Mozilla, where I’ve had the privilege to work with the wonderful and brilliant people in Labs, Jetpack, Identity, and most recently Cloud Services. I’m grateful to you all.

Now it’s time for me to move on. This Friday will be my last day in the office (but certainly not as a Mozillian!), and this blog will probably be closed down or frozen at that time. You can reach me at warner@lothar.com, and my home blog lives at http://www.lothar.com/blog .

Mozilla is an amazing place, and will always be in my heart. Thank you all for everything!

(This wraps up a two-part series on recent changes in Firefox Sync, based on my presentation at RealWorldCrypto 2014. Part 1 was about problems we observed in the old Sync system. Part 2 is about the protocol which replaced it.)

Last time I described the user difficulties we observed with the pairing-based Sync we shipped in Firefox 4.0. In late April, we released Firefox 29, with a new password-based Sync setup process. In this post, I want to describe the protocol we use in the new system, and its security properties.

(For the cryptographic details, you can jump directly to the full technical definition of the protocol, which we’ve nicknamed “onepw”, since there is now just “one password” to protect both account access and your encrypted data.)

(This begins a two-part series on upcoming changes in Firefox Sync, based on my presentation at RealWorldCrypto 2014. Part 1 is about problems we observed in the old system. Part 2 will be about the system which replaces it.)

In March of 2011, Sync made its debut in Firefox 4.0 (after spending a couple of years as the Weave add-on). Sync is the feature that lets you keep bookmarks, preferences, saved passwords, and other browser data synchronized between all your browsers and devices (home desktop, mobile phone, work computer, etc).

Our goal for Sync was to make it secure and easy to share your browser state among two or more devices. We wanted your data to be encrypted, so that only your own devices could read it. We weren’t satisfied with just encrypting during transmission to our servers (aka “data-in-flight”), or just encrypting it while it was sitting on the server’s hard drives (aka “data-at-rest”). We wanted proper end-to-end encryption, so that even if somebody broke into the servers, or broke SSL, your data would remain secure.

Proper end-to-end encryption typically requires manual key management: you would be responsible for copying a large randomly-generated encryption key (like cs4am-qaudy-u5rps-x/qca-hu63l-8gjkl-28tky-6whlt-fn0) from your first device to the others. You could make this easier by using a password instead, but that ease-of-use comes at a cost: short, easy-to-remember passwords aren’t very secure. If an attacker could guess your password, they could get your data.
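To make that tradeoff concrete, here is a minimal sketch that generates a random key, renders it in hyphen-separated groups, and compares its guess space with that of a short password. The 16-byte key size, the base32 alphabet, the 5-character grouping, and the helper names are all assumptions chosen for illustration; this is not the format Sync actually used.

```python
import base64
import math
import secrets


def random_key_string(num_bytes=16, group=5):
    """Return a random key as lowercase base32 text in hyphenated groups.

    The key size and grouping are illustrative assumptions, not Sync's
    real key format.
    """
    raw = secrets.token_bytes(num_bytes)
    text = base64.b32encode(raw).decode("ascii").rstrip("=").lower()
    return "-".join(text[i:i + group] for i in range(0, len(text), group))


def entropy_bits(alphabet_size, length):
    """Bits of entropy in a uniformly random string over an alphabet."""
    return length * math.log2(alphabet_size)


key = random_key_string()
print(key)
print("random 16-byte key: %d bits" % (8 * 16))
print("8 random lowercase letters: %.1f bits" % entropy_bits(26, 8))
```

Even a password of eight truly random lowercase letters falls far short of the random key, and passwords that people pick themselves are usually easier to guess than that.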

We didn’t like that tradeoff, so we designed an end-to-end encryption system that didn’t use passwords. It worked by “pairing”, which means that every time you add a new device, you have to introduce it to one of your existing devices. For example, you could pair your home computer with your phone, and now both devices could see your Sync data. Then later, you’d pair your phone with your work computer, and now all three devices could synchronize together.

I’ve been working on the security design for the next version of Firefox Sync, which is the bit that keeps your bookmarks/history/saved-passwords/etc synchronized between Firefoxes on all your various devices. The working title is “PiCL”, which stands for “Profile In the CLoud”. In the coming year, this will be deployed to roughly 500 million Firefox users.

I’m looking for feedback on our design. It involves key-stretching (PBKDF2 and scrypt), secure handling of password-derived keys, SRP, and a healthy distrust of SSL. If you’re interested, read on!
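For readers unfamiliar with key-stretching, the following is a rough sketch of the general idea using PBKDF2 from the Python standard library. The function name, hash choice, iteration count, salt size, and output length are placeholder assumptions; they are not the parameters of the design described here.

```python
import hashlib
import os


def stretch_password(password, salt, iterations=100_000, length=32):
    """Derive a fixed-length key from a password with PBKDF2-HMAC-SHA256.

    The iteration count and output length are placeholder values, not the
    parameters of the actual design.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=length
    )


salt = os.urandom(16)  # a fresh random salt; its size is also an assumption
key = stretch_password("correct horse battery staple", salt)
print(key.hex())
```

The point of the many iterations is to make each password guess expensive for an attacker, while the legitimate user pays that cost only once per login.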

I’m delighted to see that the new code is roughly 20x faster than the previous version, without using processor-specific non-portable assembly language. The old “ref” code, on my 2008 laptop (2.53GHz Core2Duo), makes signatures in 2ms and verifies them in 7ms. The “ref10” code signs in 120us and verifies in 307us. That’s over 8300 signatures per second! The ref10 version also includes the batch-verification function, which (thanks to some tricks in the design of Ed25519) makes it faster to verify many signatures at once. Interestingly, this requires random numbers on the *verification* side (since it’s doing statistical verification: if the attacker knew which random numbers you were going to use, they could craft a set of messages that would appear valid when checked by the batch verifier, but were invalid when checked individually).
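The speedup and throughput figures follow directly from the timings quoted above. A small sketch of the arithmetic (the variable and function names are mine):

```python
def ops_per_second(seconds_per_op):
    """Convert a per-operation latency into a single-core throughput."""
    return 1.0 / seconds_per_op


# Timings quoted above, in seconds.
ref_sign, ref_verify = 2e-3, 7e-3
ref10_sign, ref10_verify = 120e-6, 307e-6

print("sign speedup:   %.1fx" % (ref_sign / ref10_sign))
print("verify speedup: %.1fx" % (ref_verify / ref10_verify))
print("signatures/sec: %d" % ops_per_second(ref10_sign))
print("verifies/sec:   %d" % ops_per_second(ref10_verify))
```

Signing comes out a little under 20x faster and verification a little over, which averages out to the rough figure given above.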

Naturally, this release came exactly one day after I finally published python-ed25519 1.0 :-). But 1.1 will have the speedups.

What’s a good way to manage version numbers in a Python project? I don’t mean:

where should it be stored, so that other code can find it. PEP 8 tells us to use __version__, and distutils tells us to call setup() with a version= argument. The embedded string is particularly useful to report or record a version in bug reports.

what format it should take: PEP 386 describes a format (N.N[.N]+[{a|b|c|rc}N[.N]+][.postN][.devN]) that enables comparison, so packaging tools can evaluate things like “dependency > 1.2.0”. (I happen to find this format really limiting, and this tool doesn’t necessarily produce PEP386-compliant strings, but that’s not what this post is about.)
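As an aside, the PEP 386 shape quoted above can be checked with a regular expression. This is a simplified sketch written from that quoted pattern, not the reference implementation from the PEP, and the function name and sample strings are hypothetical.

```python
import re

# Simplified pattern following the shape
# N.N[.N]+[{a|b|c|rc}N[.N]+][.postN][.devN]
VERSION_RE = re.compile(
    r"\d+\.\d+(\.\d+)*"          # N.N followed by any number of .N
    r"((a|b|c|rc)\d+(\.\d+)*)?"  # optional pre-release tag
    r"(\.post\d+)?"              # optional post-release marker
    r"(\.dev\d+)?"               # optional development marker
)


def looks_like_pep386(version):
    """Return True if the whole string fits the simplified pattern."""
    return VERSION_RE.fullmatch(version) is not None


for candidate in ["1.2.0", "1.2rc1", "1.0.post3.dev4", "1.2.0-14-gabc1234"]:
    print(candidate, looks_like_pep386(candidate))
```

The last sample is a made-up string in the style of a version-control description, the kind of thing that carries useful information but does not fit the format.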

What I do mean is:

how does the right version string get into the code?

what does a release manager need to do when it’s time to make a new release?

The traditional approach, ages old, is to have a static string embedded in the code. Each time you’re about to make a new release, you make a commit which updates this string. It’s nice and simple, but has some problems:

But actually building them is a hassle, because the NaCl build process is so idiosyncratic. It consists of a 500-line undocumented shell script named “do”, and running it gets you 25 minutes of 100% CPU that executes in stony silence (all progress messages are redirected to a logfile). If you can wait that long, and think to explore the directory afterwards, you’ll be rewarded with a build/HOSTNAME/ directory that contains a libnacl.a and a set of header files that are pretty easy to use. What’s actually going on behind the scenes is that the script is exhaustively compiling and testing a large matrix of compiler flags (-O vs -O3 vs -O3 -funroll-loops), ABI variants, and alternative implementations. The goal is apparently to:

select the fastest possible implementation and compiler options, using any assembly-language tricks specific to the processor (SSE3, etc)

make sure the unit tests pass

construct a performance report to send back to the authors

Unfortunately, this doesn’t play well with other build systems that might want to embed a copy, such as Python’s distutils, because:

some compiler flags (-fPIC) are needed to build the .so files that Python can load at runtime: distutils knows what these are, “do” doesn’t

when building e.g. Debian packages, the results will be used on other machines, so processor-specific optimizations aren’t ok (you might build your packages on a machine with some feature that’s not present on the machines that use those packages, so stick with least-common-denominator).

running “do” requires a Bourne Shell interpreter, standard on Unix systems but not so obvious on Windows. distutils knows how to compile things on Windows, but you have to tell it the source files, and it will run the compiler itself.

having a separate “./do” compile step means that “setup.py build” is not enough, which means that easy_install won’t work, making it hard to use pynacl as a dependency in virtualenv or pip environments.

In addition, waiting 25 minutes for an otherwise small and elegant library to build is just a drag.

How do Ed25519 keys work?

There are several different implementations of the Ed25519 signature system, and they each use slightly different key formats. While writing python-ed25519, I wanted to validate it against the upstream known-answer-tests, so I had to figure out how to convert those keys into a format that my code could use.
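Part of the reason formats can differ is that an implementation may store the short random seed, values derived from it, or some combination. As a sketch of the derivation step itself, here is the standard Ed25519 seed expansion as specified in RFC 8032, using only the standard library. It is not taken from python-ed25519 or from any of the implementations compared in this post, and the function name is my own.

```python
import hashlib


def expand_seed(seed):
    """Expand a 32-byte Ed25519 seed into (scalar, prefix) per RFC 8032.

    The scalar is the private multiplier for the base point; the prefix
    is mixed into the per-signature nonce.
    """
    if len(seed) != 32:
        raise ValueError("seed must be exactly 32 bytes")
    h = hashlib.sha512(seed).digest()
    a = bytearray(h[:32])
    a[0] &= 248   # clear the low three bits
    a[31] &= 127  # clear the top bit
    a[31] |= 64   # set the second-highest bit
    scalar = int.from_bytes(bytes(a), "little")
    return scalar, h[32:]


scalar, prefix = expand_seed(bytes(32))  # all-zero seed, for demonstration only
print(hex(scalar))
print(prefix.hex())
```

Because everything else can be recomputed from the seed, a format that stores only those 32 bytes is sufficient; formats that also carry derived values trade space for speed.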

Ed25519 is an implementation of Schnorr Signatures on a particular elliptic curve (Curve25519) that enables very high speed operations. It also has a few nice features to make the algorithm safer and easier to use.

I’ve published some MIT-licensed Python bindings to djb++’s portable C implementation of this signature scheme. They’re available here:

There are amd64-specific assembly versions that run even faster, in just a few hundred microseconds, and for bulk operations you can do batch verification faster than one-at-a-time verification. So you can perform thousands of operations per second with this algorithm (and hundreds with this particular implementation).

It’s very exciting to finally have short+fast signatures (and also, through Curve25519, key-agreement and encryption): it opens up a lot of new possibilities. When public-key encryption was first invented, keys took so long to generate that folks assumed that each human would have just one: all sorts of mental baggage was built up around this restriction (ideas like never sharing signing keys, keys representing people, and the need to distribute keys separately from fingerprints). When we can easily generate a new key for each message or object or operation, we can let go of some of those psychological fetters and build something new.

(Note that “Curve25519” uses the same basic curve equation, but only provides Diffie-Hellman key agreement [and, by extension, public-key encryption]. It can’t be used to create signatures that can be verified by third parties: for that you need Ed25519. A portable Curve25519 implementation can be found in curve25519-donna, which includes a Python binding that I also wrote.)