Joel Test for Continuous Delivery

There’s a bit of a cottage industry forming around updating the Joel Test or producing more purpose-specific variations of it. If you’re not familiar with the Joel test, it’s Joel Spolsky’s “irresponsibly quick” way of measuring the quality of a development team. I find these fascinating to read, and wanted to contribute one in the latter category— a Joel Test specifically for continuous delivery (the practice of delivering software in short cycles that can be released at any time).

I should really note that this list is even more specific than that— it’s focused almost exclusively on continuous delivery for web applications. Mobile is a different ballgame entirely, one that I hope to write more about in the future. Nevertheless, pulling this list together helped me crystallize my own thoughts on the processes we use here at LaunchDarkly, and more importantly why we do some of the things we do.

So here’s my take on the Joel Test, continuous delivery edition.

1. Do you use a distributed version control system?

As Joel himself noted, the distributed part is not that interesting. Every commonly used DVCS workflow uses a centralized repository anyway. What is interesting about DVCS is that it makes branching and merging cheap. Which is interesting because…

2. Do you use feature branches?

3. Do you do pull-request based code reviews and continuous integration with branch builds?

4. Do you have a master (or release) branch that’s always deployable?

…if there were a prime directive for continuous delivery, it would be that master must always be deployable. The reason it’s so important is that it eliminates the need to build any kind of communication or coordination process around deploying. You can’t deploy multiple times a day if you have to @all mention your team in your Slack or HipChat room and wait for everyone to give the all-clear.

If you can maintain the invariant that master (or whatever your release branch is named) is always deployable, it means that you have a lot of the right processes in place. It’s a necessary, but not sufficient condition for achieving continuous delivery.

Feature branches, pull-request based code reviews, and branch builds are techniques that help keep master deployable. They ensure that no code can be merged into master without being reviewed and tested adequately. Any workflow that merges code before these steps have been completed violates the prime directive.
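The merge gate described above can be sketched as a tiny predicate. This is an illustrative sketch, not any real VCS host's API; the `PullRequest` type and field names are hypothetical stand-ins for the checks a branch-protection rule would enforce:

```python
from dataclasses import dataclass

@dataclass
class PullRequest:
    branch: str
    approved: bool   # at least one completed code review (hypothetical field)
    ci_passed: bool  # the branch build is green (hypothetical field)

def can_merge_to_master(pr: PullRequest) -> bool:
    """Allow a merge only after review and a passing branch build,
    so that master stays deployable at all times."""
    return pr.approved and pr.ci_passed

# A reviewed branch with a green build may merge...
assert can_merge_to_master(PullRequest("feature/login", True, True))
# ...but skipping either step violates the prime directive.
assert not can_merge_to_master(PullRequest("feature/wip", False, True))
assert not can_merge_to_master(PullRequest("feature/red", True, False))
```

In practice this predicate lives in your hosting tool (e.g. as a protected-branch rule) rather than in application code, but the invariant it encodes is the same.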

5. Do you deploy to production daily?

In 2010, this might have been every month. Now, weekly deploys are standard, and many teams strive for daily deploys. Cutting-edge teams deploy multiple times per day (Flickr, for example, deploys 30-50 times per day). That’s an ambitious target— one that requires significant cultural changes and (at the time of this writing) bespoke tooling.

6. Do you have a staging environment?

Your staging environment needs to be as close to your production environment as possible. Production runs under https? So should staging. Use PubNub, Fastly, or other cloud services in production? So should staging. Otherwise, bugs will slip through to production because of environmental differences. Staging should be (semi-)stable. It should have some nines in its uptime. How many is up to you, but there should be at least one nine in there. Preferably a leading nine.
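To make "nines" concrete, here's a quick back-of-the-envelope helper (a sketch, nothing more) that converts a count of leading nines into allowed downtime per year:

```python
def max_downtime_hours(nines: int) -> float:
    """Allowed downtime per year, in hours, for a given number of
    leading nines of uptime: one nine = 90%, two nines = 99%, etc."""
    return 10 ** (-nines) * 365 * 24

# One nine (90% uptime) still permits roughly 876 hours of downtime
# a year, i.e. about 36.5 days; two nines cut that to about 87.6 hours.
```

Even a single nine is a real commitment for a staging environment, which is why it's worth stating explicitly.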

Also, note that one staging environment is the bare minimum. Ideally, anyone on the team should be able to spin up a production-like environment that others can see with one button click, with complete production or production-like data. Once code is pushed to a branch, anyone on the team (especially non-developers) should be able to see how that code behaves in a production-like environment within minutes.

On one of my previous teams, we built a system called Rainicorn that helped us achieve this, and it was absolutely game-changing. It is one of the systems I wished I could have taken with me when I started LaunchDarkly.

7. Do you use feature flags?

No matter how sophisticated your staging environment and tests are, production traffic is a different beast entirely. Feature flags unlock the ability to do dark launches, allowing you to mitigate risk by rolling out features gradually. Feature flags separate deployment from rollout. These steps are often conflated in the continuous deployment pipeline— the best teams understand that these are distinct steps, and use feature flags to separate the two.
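The deploy/rollout separation can be illustrated with a minimal percentage-rollout sketch. This is not LaunchDarkly's implementation, just the common pattern: hash the user into a stable bucket so the same user keeps getting the same answer as the rollout ramps up. The flag key and user IDs are hypothetical:

```python
import hashlib

def flag_enabled(flag_key: str, user_id: str, rollout_percent: int) -> bool:
    """Deterministically bucket a user into [0, 100) by hashing the
    flag key and user ID together, then compare against the rollout
    percentage. Code for the feature is deployed; this gate controls
    who actually sees it."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent

# At 0% nobody sees the feature even though the code is live in
# production; at 100% everyone does. Ramping from 0 to 100 is the
# rollout, decoupled from the deploy.
assert not flag_enabled("new-dashboard", "user-42", 0)
assert flag_enabled("new-dashboard", "user-42", 100)
```

Hashing (rather than random sampling per request) matters: a user who saw the feature yesterday shouldn't lose it today just because the dice rolled differently.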

8. Do you have cross-functional teams (dev, design, product)?

If you want to deliver new code and new value to your users daily, you need to build a feedback loop in which design, product, and dev incorporate user feedback back into the product as rapidly as possible. Unless your teams are sitting together and collaborating efficiently, it's difficult to make the feedback loop tight enough.

9. Do you have user analytics on production?

This ties back to having an effective feedback loop. Moving quickly often means taking bets on what should be built and how— and some of those bets are going to be wrong. If you don’t have adequate data, it’s impossible to refine and adjust to what’s happening. Users are constantly telling you what’s working and what’s not simply by using your product— but if you don’t have user analytics (e.g. Mixpanel or KISSMetrics) set up, you’re not listening.

10. Can anyone on the team deploy to production with a button push?

Remember, a team should be a cross-functional group— which means that to hit this mark, your designers, your PMs, everyone should be able to deploy to production.

Few teams achieve this, and in my experience there is usually some combination of three factors in play:

1. Tooling problems

2. Quality concerns

3. Compliance / legal restrictions

The most common issue I’ve seen with putting a play button on production deploys is that it’s too hard to build the tooling necessary to do so. It often does require a significant technical investment, usually some combination of eliminating technical debt and shoring up deployment infrastructure. Thankfully, deployment tooling has gotten massively better over the past few years. I thought about naming some specific examples but they all boil down to Docker anyway. So I’ll just go with Docker. Docker Docker Docker.

I can’t help or say much about (3), other than to note that you should push back on compliance restrictions if you can. In some cases, overly conservative recommendations are made without understanding the impact they have on productivity. Does Sarbanes-Oxley mean that only your dedicated release engineer can deploy to production? I don’t know, and I’m not a lawyer, but I’m betting Docker can help here. Maybe microservices too.

John Kodumal is the CTO and co-founder of LaunchDarkly. LaunchDarkly helps software teams launch, measure, and control their features. Previously, John was a development manager at Atlassian and an architect at Coverity. He has a Ph.D. from UC Berkeley in programming languages and type systems, and a BS from Harvey Mudd College. He is constantly trying to improve LaunchDarkly’s continuous delivery workflow.