Continuous Deployment

This morning Steve gave a presentation on context automation and
Kynetx at the Utah Technology Council's CTO P2P forum. The
presentation was great and the audience asked a lot of good
questions. One thing that came up (I don't even remember why) was
the subject of continuous deployment. I decided I'd pull a few URLs
out of my head and put them in a blog post for people to mull over.

The first URL I think of when I consider continuous deployment is code.flickr.com. If you've never
been there, the bottom of the page lists when the last deployment of
Flickr was, how many deployments have happened in the last week, and
who was involved (pictures). Here's a screenshot:

When I first saw that I was astounded. Sometimes there are a dozen
or more deployments in a single day. The questions that spring to
mind: why? and how? Both are answered in a few posts by Timothy
Fitz.

In "Continuous Deployment," Timothy discusses the concept. Basically,
it comes down to "fail fast." Deploying a few small changes is less
likely to break something, and when something does break, you'll know
more quickly what caused the problem and be able to correct it.

There are, of course, some problems to solve to get that done.
First, you need a good, thorough test suite. Timothy points out that
you also need tests that run fast and execute reliably.

The test suite Timothy is describing takes 4.4 machine hours to
execute. That's a lot of testing. To make it run fast enough to
deploy continuously, they have a buildbot that runs tests across 36
machines in parallel.
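
To get a feel for those numbers, here is a rough sketch of the arithmetic, plus one simple way a suite could be split across machines. The function names and the greedy longest-first sharding are my own assumptions for illustration; Timothy's posts don't say how their buildbot actually divides the work.

```python
import heapq


def ideal_wall_clock_minutes(machine_hours: float, machines: int) -> float:
    """Best-case wall-clock time if the work splits perfectly evenly."""
    return machine_hours * 60 / machines


def shard_tests(durations: dict[str, float], machines: int) -> list[list[str]]:
    """Hypothetical sharding: hand each test, longest first,
    to whichever machine currently has the least work."""
    heap = [(0.0, i) for i in range(machines)]  # (load in seconds, machine index)
    heapq.heapify(heap)
    shards: list[list[str]] = [[] for _ in range(machines)]
    by_length = sorted(durations.items(), key=lambda kv: kv[1], reverse=True)
    for name, seconds in by_length:
        load, i = heapq.heappop(heap)
        shards[i].append(name)
        heapq.heappush(heap, (load + seconds, i))
    return shards


# 4.4 machine hours over 36 machines: a bit over 7 minutes at best.
print(round(ideal_wall_clock_minutes(4.4, 36), 1))
```

In practice the real number will be higher than that best case, since no split is perfectly even and the slowest machine sets the pace.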

The point about test reliability is important too. Intermittently
failing tests will ruin this process. Timothy says:

When I say reliable, I don't mean "they can fail once in a thousand test runs." I mean "they must not fail more often than once in a million test runs." We have around 15k test cases, and they're run around 70 times a day. That's a million test cases a day. Even with a literally one in a million chance of an intermittent failure per test case we would still expect to see an intermittent test failure every day. It may be hard to imagine writing rock solid one-in-a-million-or-better tests that drive Internet Explorer to click ajax frontend buttons executing backend apache, php, memcache, mysql, java and solr. I am writing this blog post to tell you that not only is it possible, it's just one part of my day job.
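
Timothy's numbers are easy to check on the back of an envelope. The sketch below uses the figures from the quote (15k test cases, 70 runs a day) and assumes every test case fails independently with the same small probability, which is a simplification on my part.

```python
import math


def expected_flaky_failures(test_cases: int, runs_per_day: int, p_flaky: float) -> float:
    """Expected number of spurious failures per day."""
    return test_cases * runs_per_day * p_flaky


def prob_at_least_one(test_cases: int, runs_per_day: int, p_flaky: float) -> float:
    """Chance of seeing at least one spurious failure in a day,
    assuming independent executions with the same failure rate."""
    executions = test_cases * runs_per_day
    # 1 - (1 - p)^n, computed in a numerically stable way for tiny p.
    return -math.expm1(executions * math.log1p(-p_flaky))


# One in a million: about one spurious failure a day.
print(expected_flaky_failures(15_000, 70, 1e-6))
# One in a thousand: on the order of a thousand spurious failures a day.
print(expected_flaky_failures(15_000, 70, 1e-3))
# Even at one in a million, roughly a 65% chance of at least one per day.
print(round(prob_at_least_one(15_000, 70, 1e-6), 2))
```

At one in a thousand, nobody would trust a red build, which is exactly why the bar has to be so high.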

I love this whole idea. I've lived the life of infrequent
deployments and it will suck the soul right out of your engineering
and ops teams. That's why when we started up Kynetx, I was
determined to not repeat those mistakes. Our system is not as
sophisticated as the one Timothy describes, but my goal is to get
there, and we set specific goals for the things that need to happen
along the way.

You may not be able to get to 50 deployments a day overnight, but you
can increase the frequency of deployment and prioritize the
development efforts necessary to increase that frequency. Set some
goals and take your life back.