Background

Randomness in Cryptography

As we’ve discussed in the past, cryptography relies on the ability to generate random numbers that are both unpredictable and kept secret from any adversary.

But “random” is a pretty tricky term; it’s used in many different fields to mean slightly different things. And like all of those fields, its use in cryptography is very precise. In some fields, a process is random simply if it has the right statistical properties. For example, the digits of pi are said to be random because all sequences of numbers appear with equal frequency (“15” appears as frequently as “38”, “426” appears as frequently as “297”, etc). But for cryptography, this isn’t enough - random numbers must be unpredictable.

To understand what unpredictable means, it helps to consider that all cryptography is based on an asymmetry of information. If you’re trying to securely perform some cryptographic operation, then what you’re concerned about is that somebody - an adversary - will try to break your security. The only thing that distinguishes you from the adversary is that you know some things that the adversary does not, and the job of cryptography is to ensure that this asymmetry of information is enough to keep you secure.

Let’s consider a simple example. Imagine that you and a friend are trying to go to a movie, but you don’t want your nemesis to figure out which movie you’re going to go to (lest she show up and thwart your movie watching!). This week, it’s your turn to choose the movie. Once you’ve made your choice, you’re going to need to send a message to your friend telling him which movie you’ve chosen, but you’re going to need to ensure that even if your nemesis intercepts the message, she won’t be able to figure out what it says.

You devise the following scheme: since there are only two movies available to watch at the moment, you label one A and the other B. Then, while in the presence of your friend, you flip a coin. You agree on the following table outlining which message that you will send depending on your choice of what movie to watch and whether the coin comes up heads (H) or tails (T). Later, once you’ve made up your mind about which movie to see, you’ll use this table to send an encrypted message to your friend telling him which movie you’ve chosen.

If you were to decide on movie B, and the coin came up heads, you would send the message, “In Hertford, Hereford, and Hampshire, hurricanes hardly ever happen.” Since your friend knows that the coin came up heads - he was there when it happened - he knows that you must have decided on movie B. But consider it from your nemesis’ perspective. She doesn’t know the result of the coin toss - all she knows is that there’s a 50% chance that the coin came up heads, and a 50% chance that it came up tails. Thus, seeing the message “In Hertford, Hereford, and Hampshire, hurricanes hardly ever happen” doesn’t help her at all! There’s a 50% chance that the coin came up heads (implying movie B), and a 50% chance that it came up tails (implying movie A). She doesn’t know anything more than she knew before!

Let’s return now to the concept of unpredictability. Imagine that the result of the coin toss was completely predictable - say your nemesis planted a trick coin that always comes up heads on the first toss, tails on the second, heads on the third, and so on. Since she would know that there was a 100% chance of the first coin toss coming up heads, then regardless of which message you sent, she would know which movie you were going to see. While the trick coin still exhibits some basic properties of “randomness” as the term is used in the field of statistics - it comes up heads as often as it comes up tails - it’s predictable, which makes it useless for cryptography. The takeaway: when we say random in the context of cryptography, we mean unpredictable.

Randomness in Computing

Unfortunately for cryptographers, if there’s one thing computers are good at, it’s being predictable. They can execute the same code a million times, and so long as they are given the same inputs each time, they’ll always come up with the same outputs. This is very good for reliability, but it’s tricky when it comes to cryptography - after all, we need unpredictability!

The solution to this problem is cryptographically-secure pseudorandom number generators (CSPRNGs). CSPRNGs are algorithms which, provided an input which is itself unpredictable, produce a much larger stream of output which is also unpredictable. This stream can be extended indefinitely, producing as much output as required at any time in the future. In other words, if you were to flip a coin a number of times (a process which is known to be unpredictable) and then use the output of those coin flips as the input to a CSPRNG, an adversary who wasn’t able to predict the output of those coin flips would also be unable to predict the output of the CSPRNG - no matter how much output was consumed from the CSPRNG.

But even though CSPRNGs are a very powerful tool, they’re only one half of the equation - they still need an unpredictable input to operate. But as we said, computers aren’t unpredictable, so aren’t we back at square one? Well, not quite. It turns out that computers do usually have sources of unpredictability that they can use, but they’re pretty slow. What we can do is combine that slow process of gathering unpredictable input with a CSPRNG, which can take that input and produce a much larger amount of input more quickly, and we can satisfy all of our randomness needs!

But where can a computer get such unpredictable input, even slowly? The answer is the real world. While computers provide a nice simplification of the world for programmers to live in, real, physical computers still exist in the real, physical world. And that world is unpredictable. Computers have all sorts of ways of taking input from the real world - temperature sensors, keyboards, network interfaces, etc. All of these provide the ability to take measurements of the real world, and all of those measurements have some degree of inherent inaccuracy. As we’ll explain in a moment, that inaccuracy is the same thing as unpredictability, and this can be used! Common sources of unpredictable randomness include measuring the temperature of the CPU with high - and thus inaccurate - precision, measuring the timing of keystrokes on the keyboard with high precision, etc. To see how this can be used to produce unpredictable randomness, consider the question, “what’s the temperature of the room I’m sitting in right now?” You can probably estimate to within a couple of degrees - say, somewhere between 70 and 75 degrees Fahrenheit. But you probably have no idea what the temperature is to 2 decimal places of accuracy - is it 73.42 degrees or 73.47 degrees? By measuring the temperature with high precision and then using only the low-order digits of the measured value, you can get highly unpredictable randomness just by observing the world around you. And so can computers.

So, to recap:

Randomness used in cryptography needs to be unpredictable.

Computers can slowly get a small amount of unpredictable randomness by measuring their environment.

Computers can greatly extend this randomness by using a CSPRNG which can quickly turn it into a large amount of unpredictable randomness.

Hedging Your Bets

If there’s one thing cryptographers are wary about, it’s certainty. Cryptographic systems are routinely shown to be less secure than they were originally thought to be, and we’re constantly updating our understanding of what algorithms are safe to use in what scenarios.

So it shouldn’t be any surprise that cryptographers like to hedge their bets by using more security than they believe necessary in case it turns out that one of their assumptions was wrong. It’s sort of like the cryptographer’s version of the engineering practice of designing buildings to withstand far more weight or wind or heat than they think will arise in practice.

When it comes to randomness, this hedging often takes the form of mixing. Unpredictable random values have the neat property that if they’re mixed in the right way with more unpredictable random values, then the result is at least as unpredictable as either of the inputs. That means that if you mix a random value which is highly unpredictability with a random value which is somewhat predictable, the result will be a highly unpredictable value.

This mixing property is useful because it allows you to mix unpredictable random values from many sources, and if you later discover that one of those sources was less unpredictable than you’d originally thought, it’s still OK - the other sources come to the rescue.

LavaRand

At Cloudflare, we have thousands of computers in data centers all around the world, and each one of these computers needs cryptographic randomness. Historically, they got that randomness using the default mechanism made available by the operating system that we run on them, Linux.

But being good cryptographers, we’re always trying to hedge our bets. We wanted a system to ensure that even if the default mechanism for acquiring randomness was flawed, we’d still be secure. That’s how we came up with LavaRand.

The view from the camera

LavaRand is a system that uses lava lamps as a secondary source of randomness for our production servers. A wall of lava lamps in the lobby of our San Francisco office provides an unpredictable input to a camera aimed at the wall. A video feed from the camera is fed into a CSPRNG, and that CSPRNG provides a stream of random values that can be used as an extra source of randomness by our production servers. Since the flow of the “lava” in a lava lamp is very unpredictable,1 “measuring” the lamps by taking footage of them is a good way to obtain unpredictable randomness. Computers store images as very large numbers, so we can use them as the input to a CSPRNG just like any other number.

We're not the first ones to do this. Our LavaRand system was inspired by a similar system first proposed and built by Silicon Graphics and patented in 1996 (the patent has since expired).

Hopefully we’ll never need it. Hopefully, the primary sources of randomness used by our production servers will remain secure, and LavaRand will serve little purpose beyond adding some flair to our office. But if it turns out that we’re wrong, and that our randomness sources in production are actually flawed, then LavaRand will be our hedge, making it just a little bit harder to hack Cloudflare.

Today has been a big day for Cloudflare, as we became a public company on the New York Stock Exchange (NYSE: NET). To mark the occasion, we decided to bring our favorite entropy machines to the floor of the NYSE....