Tarsnap critical security bug

Tarsnap versions 1.0.22 through 1.0.27 have a critical security bug. It
may be possible for me, Amazon, or US government agencies with access to
Amazon's datacenters to decrypt data stored with those versions of
Tarsnap. This is an absolutely unacceptable compromise of Tarsnap's
security principles, and I sincerely apologize to everyone affected.

There's a lot to say about this, and it's entirely possible that I'll
miss covering some important points in this post; if I've missed
something, please email me or post a comment below and I'll do my best
to add the necessary information.

The bug

Tarsnap archives data by first converting it into a series of "chunks"
of average size 64 kB; next compressing and encrypting each chunk; and
finally uploading those chunks. The encryption is performed using a
per-session AES-256 key in CTR mode.

In versions 1.0.22 through 1.0.27 of Tarsnap, the CTR nonce value is
not incremented after each chunk is encrypted. (The CTR counter is
correctly incremented after each 16 bytes of data is processed, but
this counter is reset to zero for each new chunk.) As a result, every
chunk in a session is encrypted with the same keystream.
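
As a minimal sketch of why the missing increment matters, the toy code below builds a CTR-style keystream from a key, a per-chunk nonce, and a per-block counter. It is not Tarsnap's code: SHA-256 stands in for AES-256 purely so the example runs with the standard library, and the names keystream, encrypt_chunks, and increment_nonce are hypothetical.

```python
import hashlib

BLOCK = 16  # bytes of keystream produced per counter value, as in AES-CTR


def keystream(key: bytes, nonce: int, length: int) -> bytes:
    """Toy CTR keystream. SHA-256 is a stand-in for AES (illustration only)."""
    out = b""
    counter = 0  # block counter; it restarts at zero for every chunk
    while len(out) < length:
        block_input = key + nonce.to_bytes(8, "big") + counter.to_bytes(8, "big")
        out += hashlib.sha256(block_input).digest()[:BLOCK]
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_chunks(key: bytes, chunks: list, increment_nonce: bool) -> list:
    """Encrypt each chunk; increment_nonce=False mimics the bug described above."""
    nonce = 0
    encrypted = []
    for chunk in chunks:
        encrypted.append(xor(chunk, keystream(key, nonce, len(chunk))))
        if increment_nonce:
            nonce += 1
    return encrypted


key = b"k" * 32
chunks = [b"A" * 40, b"A" * 40]  # two identical chunks make the reuse visible

buggy = encrypt_chunks(key, chunks, increment_nonce=False)
fixed = encrypt_chunks(key, chunks, increment_nonce=True)

print(buggy[0] == buggy[1])  # True: same nonce, same keystream
print(fixed[0] == fixed[1])  # False: each chunk gets its own keystream
```

With the nonce held fixed, the only thing that varies between chunks is the block counter, and that restarts at zero each time, so chunk N and chunk N+1 are XORed with identical bytes.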

How the bug happened

Up to version 1.0.21 of Tarsnap, AES-CTR was used in two places: First,
to encrypt each chunk of data; and second, in the Tarsnap client-server
protocol. In version 1.0.22 of Tarsnap, I introduced passphrase-protected
key files, which used AES-CTR encryption (with a key computed using
scrypt).

In order to simplify the Tarsnap code — and in the hopes of
reducing the potential for bugs — I took this opportunity to
"refactor" the AES-CTR code into a new file (lib/crypto/crypto_aesctr.c
in the Tarsnap source code) and modified the existing places where
AES-CTR was used to take advantage of these routines.

It was at this point that the bug slipped into the chunk-encryption
code (crypto_file_enc in lib/crypto/crypto_file.c): the
encr_aes->nonce++ turned into encr_aes->nonce, and as a result the
same nonce value was used repeatedly. (The other
places where Tarsnap uses AES-CTR — in the client-server protocol
and in the handling of passphrase-protected key files — are not
affected by this bug.)

Impact of the bug

As stated above: It may be possible for me, Amazon, or US government
agencies with access to Amazon's datacenters to decrypt data stored with
affected versions of Tarsnap. Other individuals/agencies are unlikely to
be able to decrypt data for the simple reason of being unable to access
the encrypted data: Amazon Web Services is considered to be sufficiently
secure to handle medical records and credit cards, and while I often
remind people that regulatory compliance is
not at all the same thing as security, in this case I think they align
fairly accurately. (Note that since the Tarsnap client-server protocol
is encrypted, being able to intercept Tarsnap client-server traffic does
not provide an attacker with access to the data.)

There are two ways of decrypting AES-CTR data when the nonce is reused:
By comparing two ciphertexts, or by using a known plaintext. In the
first case, the ciphertexts A xor C and B xor C are
compared to yield the exclusive OR of the two plaintexts, A xor
B. If the plaintexts are English text or otherwise have a small
amount of entropy, this is usually enough to allow both plaintexts to be
extracted — in fact, this is one of the methods which was used
by British codebreakers in the Second World War. However, the blocks
which Tarsnap encrypts do not have low entropy: Tarsnap compresses its
chunks of data before encrypting them. While the compression is not
perfect (there are, for instance, some predictable header bits), I do
not believe that enough information is leaked to make such a
ciphertext-only attack feasible.
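
The identity behind the two-ciphertext comparison can be checked in a few lines. This is only a sketch: the keystream C here is random bytes rather than real AES-CTR output, the two plaintexts are made-up examples, and the plaintexts are left uncompressed, unlike Tarsnap's chunks.

```python
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


plain_a = b"meet me at the usual place"  # plaintext A (hypothetical)
plain_b = b"the password is hunter2 ok"  # plaintext B (hypothetical)

# C: one keystream reused for both messages, which is the effect of nonce reuse.
shared_keystream = secrets.token_bytes(max(len(plain_a), len(plain_b)))

cipher_a = xor(plain_a, shared_keystream)  # A xor C
cipher_b = xor(plain_b, shared_keystream)  # B xor C

# The keystream cancels out: (A xor C) xor (B xor C) == A xor B.
print(xor(cipher_a, cipher_b) == xor(plain_a, plain_b))  # True
```

The attacker never sees C, yet ends up holding A xor B. Whether that is enough to separate A from B depends on how predictable the plaintexts are, which is why compression before encryption matters here.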

Given a known plaintext, however — that is, if the attacker knows
any block of data which was encrypted — then the attack is
trivial: They need only compare the plaintext against the corresponding
ciphertext block to recover the AES-CTR keystream, which can then be
used to decrypt other blocks of data. If Tarsnap is used to perform
complete system backups, there will be many such plaintexts —
files belonging to the operating system and the Tarsnap binary itself
are obvious examples — but if Tarsnap is used selectively then
it is possible that the attacker will have no such plaintext at his
disposal.
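
A sketch of the known-plaintext case, under the same simplifications (random bytes stand in for the reused AES-CTR keystream, and both plaintexts are invented for illustration):

```python
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


known_plain = b"contents of a well-known operating system file"
secret_plain = b"private data from the same session"

# Both chunks are encrypted with the same keystream because the nonce repeats.
reused_keystream = secrets.token_bytes(64)
cipher_known = xor(known_plain, reused_keystream)
cipher_secret = xor(secret_plain, reused_keystream)

# The attacker holds both ciphertexts and knows one plaintext.
recovered_keystream = xor(known_plain, cipher_known)
recovered_secret = xor(cipher_secret, recovered_keystream)

print(recovered_secret)  # b'private data from the same session'
```

The recovered keystream is only as long as the known plaintext, so a short known file exposes only the matching leading bytes of every other chunk from that session.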

Because Tarsnap uses per-session AES keys for encrypting blocks of data,
this bug affects only data uploaded using affected versions of Tarsnap,
and the known-plaintext attack will only endanger data uploaded during
the same session as the known plaintext; so it is possible
that an attacker would be able to decrypt some data but not all.

What Tarsnap users should do

Tarsnap users should immediately upgrade to version 1.0.28.

Tarsnap users who wish to re-encrypt their stored data should register
a new machine using tarsnap-keygen, upload their data using
the newly generated keys, and then delete the old data by running
tarsnap --nuke with the old keys. (Note that creating a new
archive with the same set of keys will not cause data to be re-encrypted
and uploaded, since Tarsnap's de-duplication will recognize the
duplicated data.) Anyone wishing to do this should contact me via email
so that I can provide a Tarsnap account credit to cover the bandwidth fees
which would otherwise be charged. (Of course, if the US
government wants your data, re-encrypting it and deleting the old
version from Tarsnap won't force them to delete any copies they have
made — but it might help you if the US government doesn't realize
that it wants your data yet.)

Tarsnap users who wish to stop using Tarsnap should delete their
stored data by running tarsnap --nuke and contact me via email for a
refund.

Tarsnap users with any other questions or concerns should contact me
via email, Twitter, IRC, or any other convenient form of communication.

What I'm doing about this

After being contacted on Friday afternoon and confirming the bug, I
immediately re-checked all of the Tarsnap cryptographic code; I found no
other bugs. Of course, this can't guarantee that there are no subtle
issues lurking; but at least it makes it very unlikely that other
similarly obvious problems exist.

I've also added "double-check all changes to critical security code,
even if they are 'cosmetic' or 'refactoring' changes" to my pre-release
checklist. When I wrote the original chunk-encryption code, I reviewed
my work very carefully to make sure that I got it right — and it
was right for two years, until I accidentally introduced this bug while
making what I thought was an insignificant change. This is an important
lesson to learn: Mistakes can happen any time a piece of code is
modified.

Finally, I am instituting a Tarsnap bug bounty (complete details to
follow in a later blog post). This bug was found and reported to me by
someone who was reading the Tarsnap source code purely out of curiosity
— I'm a great fan of curiosity, but I've also learned that money
can help to encourage curiosity. While I hope that this is the last
time I have to pay out a bounty for a security bug, if there are other
bugs I hope this bounty will result in them being found sooner rather
than later.

Final remarks

I will not attempt to decrypt and read your data. Amazon claims that it
does not inspect Amazon Web Services users' data. And the US government
is theoretically bound by a constitution which prohibits unreasonable
searches. This is all, however, entirely irrelevant: The entire point
of Tarsnap's security is to remove the need for such guarantees. You
shouldn't need to trust me; you shouldn't need to trust Amazon; and you
most certainly shouldn't need to trust the US government.

This was a very easy mistake to make. Anyone could have made it. It
was also a very easy mistake to find. I should have found it, 19 months
ago, before releasing version 1.0.22 of Tarsnap. I didn't, and I'm
sorry.

I'd like to thank Taylor R Campbell for bringing this bug to my
attention.

Q&A

Some questions I've been hearing, aggregated here so that people can
stop asking them:

Is the updated Tarsnap in the FreeBSD ports tree?
Yes. It wasn't when this announcement first went up, but I committed
the update at 21:23 UTC.

Is there any way to download all the data for a machine, re-encrypt,
and re-upload?
This is theoretically possible, but needs some new code to be written,
and I didn't want to delay announcing this bug for the time it would
take to write that code. If you don't want to take the 'upload a new
archive and nuke the old ones' approach (e.g., if you have important
history to keep), you'll have to wait a few days at least.

UPDATE: This can be done using the new tarsnap-recrypt utility
in version 1.0.29 of the Tarsnap client code.

So are my keys compromised now?
This bug affected data stored on Tarsnap, not the keys used to encrypt
it. If you delete all your data and then re-upload, it will be encrypted
securely -- the only reason to need new keys is if you have data already
stored and need to make sure that Tarsnap's deduplication doesn't prevent
the data from being re-uploaded.

One caveat to this: If your tarsnap keys were in an archive you stored,
they might be compromised that way.

I'm not worried about you, Amazon, or the US government reading my
data; all I'm concerned about is keeping it safe from script kiddies.
Do I need to worry about this?
Script kiddies aren't going to be able to access Tarsnap's backing storage
on S3, so this issue shouldn't affect you. (Whether your lack of worry
about me, Amazon, and the US government is justified is another matter,
but that's for you to judge, not me.)

I don't want to create new keys; can I keep my existing key file and
nuke first then upload?
Yes. The purpose of creating a new key is to ensure that new data isn't
deduplicated against old (insecurely encrypted) data, so if you delete
all the old data first you'll be fine. Unless, of course, your computer
dies between deleting the old archive and uploading the new ones...

Inequality in Equalland

Life in the nation of Equalland (population 80 million) is idyllic.
Boring, but idyllic. By all measures, it is a wonderful place to live:
Zero infant mortality; 100% high school graduation; 100% college
graduation; zero unemployment; zero income inequality; a steadily rising
stock market; no poverty; etc. There is one measure which raises some
eyebrows, however: The wealthiest 20% of households own well over 50% of
the nation's wealth.

Every resident of Equalland has the same life story. From birth until
age 18, they live with their parents, earning nothing and spending
nothing — their parents cover all their needs. At age 18, they
graduate from high school, become independent of their parents, and go
to college. The government of Equalland funds the post-secondary system
well enough that it can provide free tuition to students, but the college
students of Equalland want to study full-time, so they take out student
loans to cover their living expenses. At age 22 they graduate from
college, get married, and have children (in Equalland, women always give
birth to pairs of twins, one male and one female, in order to keep the
gender ratio fixed at 1:1). They immediately find jobs, and work until
age 65, all earning the same constant (inflation-indexed) salary,
gradually saving up enough money (which they invest in the stock market)
to pay for their retirement. At age 65 they retire, and they gradually
spend their retirement savings until age 80, when they die peacefully in
their sleep and their few dollars of remaining retirement savings are
spent on funeral costs.

To provide a more concrete picture of the household economics of
Equalland, here are some more numbers (dollar values are
inflation-adjusted, i.e., expressed in constant "2011 dollars"):

The stock market rises at a consistent rate of inflation + 4% each year.

Student loans carry with them an interest rate equal to the stock
market's growth rate (since both are zero-risk investments, there is
no reason for them to have different rates).

While at college, each student spends $15,000 per year.

From age 22 until 65, every person earns $50,000 per year (while on
parental leave or when sick, enlightened government policies replace
100% of their income).

Every 4-person household (parents aged between 22 and 40) spends
$90,000 per year, thus saving $10,000 per year towards retirement.

After their kids leave home, household expenses drop by $10,000 per
year (kids are expensive!), and thus parents start saving $20,000 per
year towards their retirement.

Upon retiring, their expenses drop slightly further, to $75,000 per
year, due to a lack of employment-related costs (e.g., bus/train fares).

Upon dying when they (simultaneously) reach age 80, each couple has
$5,452 remaining, which covers their funeral costs.

From these numbers, we can obtain a complete picture of the wealth of
Equalland:

The most indebted households in Equalland — $130,133 in debt, to
be precise — are those formed by 22 year olds; not only have they
spent 4 years taking student loans while studying at college, but they
have also just married, thus doubling their per-household debt.
The average household, in contrast, has a comfortable $206,080 in
retirement savings.
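
The $130,133 debt at age 22 and the $5,452 left at age 80 can be reproduced from the figures above under one assumption the text does not state: that the 4% real return compounds continuously and that income and spending flow evenly through the year. The sketch below follows a single couple through each life stage; the helper name grow is hypothetical.

```python
import math

RATE = 0.04  # real return on stocks and real interest on student loans


def grow(balance: float, cash_flow: float, years: float, rate: float = RATE) -> float:
    """Balance after `years` of continuous compounding with a steady annual cash flow."""
    factor = math.exp(rate * years)
    return balance * factor + cash_flow * (factor - 1) / rate


# Ages 18-22: each student borrows $15,000 per year; marriage pools two debts.
student_debt = grow(0, -15_000, 4)
at_22 = 2 * student_debt

# Ages 22-40: household income $100,000, spending $90,000, so $10,000 saved per year.
at_40 = grow(at_22, 10_000, 18)

# Ages 40-65: spending drops to $80,000, so $20,000 saved per year.
at_65 = grow(at_40, 20_000, 25)

# Ages 65-80: no income, $75,000 spent per year.
at_80 = grow(at_65, -75_000, 15)

print(round(at_22))  # -130133
print(round(at_80))  # 5452
```

Under this assumption the couple is still slightly in debt at age 40, so all of their eventual retirement wealth accumulates during the 25 years after their children leave home.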

But what of the wealthiest? Out of the 33 million households in
Equalland, the wealthiest 20% — 6.6 million households —
are aged between 58 years 5 months and 71 years 8 months: In short,
those who are either about to retire or recently retired. Their
average wealth is $693,182 — over triple the average wealth
— and between them, they hold 4.58 trillion dollars out of
Equalland's total 7.13 trillion dollars of household wealth... or
slightly over 64%. So much for equality.

Obviously no such country exists, and most countries have significantly
higher wealth inequality — in the US, for example, the top 20%
of households own 84% of the wealth. But consider this: Equalland
is an idealized scenario. If the stock market didn't rise consistently,
or some people lived beyond age 80, or some people had significant
medical costs in their final years — or, god forbid, there was
any variability in how much individuals earned — then the top 20%
of the population would need to hold more than 64% of the country's
wealth just to maintain their standard of living after retiring.

Is there too much inequality in the world? Sure. Is all inequality
bad? Not if you hope to retire some day.