Thoughts on building software from a wayward Hawaiian.

12/20/2010

What I learned about Continuous Deployment

While at QCon I took a great class from the fellows at ThoughtWorks on Continuous Deployment. The single most important thing I learned from the class was that if something is painful, do it a lot! Having just sat through at least three war rooms in the last two months to manage deployments, I can attest that this statement is true!

However, when I mentioned this late one night to the team, I got a lot of blank looks and confusion. Given our current environment and processes, I don't blame them for the skepticism. In fact, there was a lot of "if we did this all the time, none of us would be here for long." They are right in more ways than they know! The trick with doing something painful a lot is that you should learn what the painful parts are and work to make them pain free.

The biggest part of the problem is the monumental amount of change that has gone into our last few releases. Starting with the first release of our beta site, we tried to move to a three-week release schedule. For the most part we did; however, the problem was that the time from code complete on the first release to code complete for the second and third releases was months. We held off from putting a lot of change into our second release and looked at the third release as the one that introduced most of the business-facing changes. What ended up happening is that we crammed months of work into a three-week release window. Not fun, and not the point of the three-week cycle. If we had made use of the techniques the continuous deployment crowd is pushing, we could have greatly reduced the painfulness of our process.

The ways to get to a short or continuous release cycle are:

- Automate testing through the whole stack, including acceptance testing
- Create feature flags to allow functionality to be released but turned off
- Keep everything in mainline; don't branch
- Keep configuration in SCM with your code, but deploy it separately
- Use a green/blue, red/black, or A/B style deployment paradigm

At Edmunds we have started to automate testing; however, we have a ways to go. We have automated unit (fine-grained) tests, automated UI tests using Selenium, and mostly automated load tests using JMeter. However, we are missing coarser-grained, integration-style testing, and our ability to create automated acceptance tests needs work. There is a mythical land where all acceptance tests are fully automated; once one has arrived in this land, one can push code with the confidence that nothing has broken.

I am still trying to figure out how to get to that mythical land. We are setting an 80% code coverage target for our unit tests. Next we need to build up our integration tests. My current thinking is that we build integration tests based on pre-production and production bugs and incidents. As these types of issues come up, the idea would be to create tests that reproduce the problem and then prove that the problem has been fixed. Our Selenium tests have pretty good coverage, but they do not account for business-user acceptance criteria. BDD-style frameworks such as JBehave may help; however, this is the area I know the least about.
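To make the bug-driven idea concrete, here is a minimal sketch of turning a fixed incident into a permanent regression check. The incident, class, and method names are all hypothetical, invented for illustration:

```java
// Sketch of a bug-driven regression test: when a production incident is
// fixed, capture it as an automated check so it cannot regress silently.
// The zero-price bug and INC-1234 below are made-up examples.
public class PriceFormatterRegressionTest {

    // Hypothetical code under test: prices of zero used to render as ""
    // instead of "$0" before the fix.
    static String formatPrice(int dollars) {
        return "$" + dollars;
    }

    public static void main(String[] args) {
        // Reproduces hypothetical incident INC-1234: zero price rendered blank.
        if (!formatPrice(0).equals("$0")) {
            throw new AssertionError("regression: zero price no longer renders as $0");
        }
        System.out.println("regression check passed");
    }
}
```

Over time, a suite built this way documents exactly the failures that have actually hurt you in production.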

One of the goals for my group next year will be to work with our QA staff to figure out how we are going to address testing to eliminate part of our deployment pain.

Part of the goal of continuous deployment is that every build could be pushed to production. Even with good test coverage, there are times when you need to delay the release of a feature because it is not ready for some reason. To allow any build to be deployed to production, one needs a mechanism by which to turn these features off. Feature flags are a concept that allows one to do this. Using a configuration file, or other mechanism, one would simply put checks in the code to determine whether a feature should be on or off. Such an implementation feels heavyweight. I think we need to spend some time figuring out how to make a more elegant implementation than a bunch of hard-coded if/else statements. Perhaps we could use annotations? For us, the simplest might be to use some of the A/B functionality we have built into our CMS. By switching out templates or modules on the site we could, basically, build new functionality as an extension of an existing module and switch the module include based on feature flag configuration files. We have done something like this in the past, using flags to enable or disable features.
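In its simplest form, the configuration-file approach looks something like the sketch below. The class name, property names, and file contents are illustrative only, not a description of any system we actually have:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Minimal feature-flag lookup backed by a properties file.
// In practice the Properties would be loaded from an external config
// file that is deployed separately from the code.
public class FeatureFlags {
    private final Properties flags;

    public FeatureFlags(Properties flags) {
        this.flags = flags;
    }

    // A feature is OFF unless the configuration explicitly turns it on,
    // so a build with a half-finished feature is still safe to deploy.
    public boolean isOn(String feature) {
        return Boolean.parseBoolean(flags.getProperty(feature, "false"));
    }

    public static void main(String[] args) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader("new.inventory.module=true\nnew.checkout.flow=false"));
        FeatureFlags flags = new FeatureFlags(p);

        // The hard-coded if/else check the post talks about:
        if (flags.isOn("new.inventory.module")) {
            System.out.println("rendering new inventory module");
        } else {
            System.out.println("rendering legacy inventory module");
        }
    }
}
```

It works, but every flagged feature adds another branch like the one in main, which is exactly why a template/module swap in the CMS feels more attractive.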

Another goal for my team next year will be to design a feature flag system for our front end applications to use.

Once you have feature flags and comprehensive tests, you can stop using feature branches. Eliminating feature branches removes the complexity of managing multiple lines of our code base and gives us a higher probability of having builds that are ready for production. We have already begun the move toward mainline-only development.

An area we have spent a lot of time on, and still need more work on, is configuration. We have leveraged DNS for certain configuration items using DNS TXT records. Using TXT records, we let our systems know what environment they are running in. A simple use case of environment-name-based configuration is URL rewriting. For all non-production environments we use a URL like qa-www.edmunds.com, while production is www.edmunds.com. To make this switch, we have a simple tag library that prepends the environment name + "-" to all URLs and images. Prior to this setup, we had a tool that would switch your hosts file based on what environment you wanted to look at. We also make extensive use of configuration files outside of our code base and use Spring to wire in the configuration if an external file is found. We try our best to do configuration by exception using a combination of these two techniques.

Where we need to improve is on the SCM part of the configuration. Many of the one-off files are kept on machines and are not checked in. During deployments this practice leads to confusion and often results in missed configuration steps. We will be moving configuration into SCM and working on deploying the configuration as part of our roll-out. Our use of DNS names for environments should help with this, as our configuration could be pulled down automatically. Additionally, our use of ZooKeeper could provide a more robust configuration system; we just need to build it :-)
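The environment-prefix trick the tag library performs can be sketched in a few lines. The class and method names here are hypothetical (our real implementation is a JSP tag library), and I am assuming the environment name comes from the DNS TXT lookup described above:

```java
// Sketch of environment-based URL rewriting: prepend "<env>-" to the host
// for non-production environments. Class and method names are illustrative.
public class EnvUrlRewriter {
    private final String environment; // e.g. "qa"; from a DNS TXT record in practice

    public EnvUrlRewriter(String environment) {
        this.environment = environment;
    }

    public String rewrite(String host) {
        if ("prod".equals(environment)) {
            return host;                 // production: www.edmunds.com stays as-is
        }
        return environment + "-" + host; // e.g. qa-www.edmunds.com
    }

    public static void main(String[] args) {
        System.out.println(new EnvUrlRewriter("qa").rewrite("www.edmunds.com"));
        System.out.println(new EnvUrlRewriter("prod").rewrite("www.edmunds.com"));
    }
}
```

Because the rewrite is driven entirely by the environment name, the same build artifact works unchanged in every environment, which is the whole point of configuration by exception.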

The last piece of the puzzle is something we are currently working on (in fact, I am in a war room right now launching the first B version of our server farm). Having two full production stacks is a luxury to be sure; however, it gives us the ability to fail back rapidly in case of a problem with a build. Again, make something painful less painful! By having an A/B set of servers we can deploy to one, give it traffic, and watch for failure. In the case of failure, we can easily fall back. In addition, we do not have to worry about making each deployment as safe, so we can build fast upgrade processes.
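The deploy-then-fail-back logic reduces to a pool swap. This is a toy model of the idea, not our actual tooling; the class name and health-check wiring are invented for illustration:

```java
import java.util.function.Predicate;

// Toy model of an A/B (blue/green) switch: route traffic to the active
// pool, and swap back to the previous pool if the new build is unhealthy.
public class BlueGreenSwitch {
    private String active;   // pool currently taking traffic, e.g. "A"
    private String standby;  // pool holding the previous known-good build

    public BlueGreenSwitch(String active, String standby) {
        this.active = active;
        this.standby = standby;
    }

    // After deploying the new build to the standby pool, swap it in.
    public void cutOver() {
        String previous = active;
        active = standby;
        standby = previous;
    }

    // If the newly active pool fails its health check, swapping again
    // restores the old build: the fast fail-back the post describes.
    public void failBackIfUnhealthy(Predicate<String> isHealthy) {
        if (!isHealthy.test(active)) {
            cutOver();
        }
    }

    public String activePool() {
        return active;
    }

    public static void main(String[] args) {
        BlueGreenSwitch farm = new BlueGreenSwitch("A", "B");
        farm.cutOver();                          // new build on B takes traffic
        farm.failBackIfUnhealthy(pool -> false); // B fails its health check
        System.out.println("serving from pool " + farm.activePool()); // back on A
    }
}
```

Because the old build keeps running untouched on the standby pool, a fail-back is just this swap rather than a redeploy, which is why the upgrade process itself no longer has to be cautious.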

Doing something painful more often is a recipe for disaster if you keep doing what you have always done. However, if you use the practice as a mechanism to improve, it will be a great benefit. I look forward, over the course of 2011, to implementing more of these practices and moving us closer to the nirvana that potentially awaits ;-)
