On the Road to Continuous Delivery

Continuous delivery is a hot topic. A lot of people are talking about it, but real-world implementations are scarce. I lucked out at my last assignment, at SVT (Swedish Public Television), where I got the chance to work on implementing a continuous delivery pipeline.

When I started, the project had delivered once and was gearing up for its second delivery. Representatives from each team met, and we decided to aim for an (at the time) aggressive schedule of one release per week! Our first “fast” release would go out in January, and we would continue from there.

It would be nice to say that this worked out well and we were continuously delivering from then on, but this blog entry is about our road to continuous delivery, so my story starts here!

Our first “fast” release took almost two weeks to deploy. We struggled with rolling forward and how to integrate it with testing. We ran into collisions when some people were fixing issues while development was ongoing. We also had a huge backlog of features to test, since the previous release had been deployed almost three months earlier. All of this took a lot of time and effort. Once the release was in production, we set up release retrospectives to help us improve our releases.

Our situation after that first release:

No branching. SVT had previously experienced major problems with long-lived team branches that were difficult to integrate and release. So we weren’t going to branch.

Roll forward. We were also really serious about following the idea of roll forward. If there’s a problem with the release, we don’t roll back, we fix it and try again.

Automated release scripts. Our releases were all scripted. We didn’t want any manual involvement during the releases. The teams had started the project by creating a system of Fabric scripts to handle deployment of the application to all the different environments.
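The project’s actual Fabric scripts aren’t shown here, but the release-directory-plus-symlink pattern such deploy scripts commonly automate can be sketched in plain shell (all paths below are invented for illustration):

```shell
# Minimal roll-forward deploy sketch (hypothetical paths, not SVT's scripts):
# each version gets its own directory, and "current" is flipped atomically,
# so rolling forward is just another flip to a newer release directory.
set -e
base=$(mktemp -d)                              # stand-in for the app's install root
mkdir -p "$base/releases/1.0" "$base/releases/1.1"

ln -sfn "$base/releases/1.0" "$base/current"   # first release goes live
ln -sfn "$base/releases/1.1" "$base/current"   # roll forward: flip, don't roll back

readlink "$base/current"                       # now points at releases/1.1
```

The same script can then run unchanged against every environment, which is exactly why environment differences (as described below) hurt so much when they exist.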

Automated tests. The goal was to have as many automated Selenium tests as possible running against our application.

It sounds like a pretty good plan.

Unfortunately, it wasn’t so easy to get it working right away. Our attempts after that first two-week release did result in weekly releases, but we were working against the following factors:

The combination of no branches and roll forward meant that we were fixing code in the same branch in which features were being developed, so we were testing a moving target throughout the release cycle. Teams were also eager to get “just one more thing” in, so we faced a higher-than-usual rate of change on mainline the day before a release.

Our scripted releases didn’t work so well in reality. All the test environments up to staging were in-house environments. Unfortunately, staging and production were hosted externally, and their setup was very different. The scripts that ran flawlessly up to staging had a habit of failing the day before the release due to environmental differences. The situation was further aggravated because we didn’t “own” staging and couldn’t deploy to it outside the release schedule.

The goal, supported by management, was to have automated tests for all major functionality. Unfortunately, it was difficult to write these tests so that they consistently returned the same results in all environments. Our tests were so detailed that minor changes to the UI made them fail, and they depended on a specific content setup that was sometimes difficult to achieve. The result was that people ignored the multitude of failures, and the automated test status didn’t really tell us much about the state of the release.

By the time the migration project ended we had succeeded in getting releases out every week, with the possibility of more releases on demand. How did we do it?

We evolved our no branching/roll forward strategy. We started out by introducing a code freeze, then we progressed to a release branch. We finally settled on tagging mainline and making that tag the release version. If the tag contained any blocking defects, the release was canceled. This removed all the time we had been wasting trying to save failed releases. This strategy also motivated the teams to ensure we had better code quality going into the release. Code was now guaranteed to be tested and working before we tagged.
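The tag-then-release flow can be illustrated with plain git commands (the repository and tag names are invented; this is a sketch of the idea, not the project’s actual tooling):

```shell
# Demo in a throwaway repo: mainline keeps moving, the release is a fixed tag.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email ci@example.com && git config user.name ci

git commit -q --allow-empty -m "feature work on mainline"
git tag -a release-2013-01 -m "release candidate: tested mainline state"

git commit -q --allow-empty -m "more feature work after the tag"
# The release is built from the tag, untouched by later commits on mainline:
git rev-parse "release-2013-01^{commit}"
```

Because the tag is immutable, a canceled release costs nothing: the tag is simply abandoned and a new one is cut once mainline is green again.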

We also got buy-in to deploy twice daily to the staging environment, ensuring that our automated deploy scripts worked in a production-like environment. (SVT’s goal was to move all the hosting in-house, so we didn’t spend too much time trying to get the scripts to match exactly for all environments.) At the same time we improved the deployment process so that we really had a one-click deploy to production.

The teams started focusing more on the Selenium tests. We implemented a strategy of importing the data necessary for running the tests into the test environment, and we improved the way the tests could create the data they needed. Small changes, including ensuring data integrity as well as a dedicated automated testing environment, helped us tremendously. We also instituted a more disciplined manual testing process. The Product Owners identified the features that were “must-haves” for a release to be a GO, and that’s what we focused on instead of nitty-gritty pixel details.
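The “tests create the data they need” idea can be sketched as follows. `FakeContentApi` and the field names are invented stand-ins for whatever content service the real tests would call; the point is only the pattern of seeding data per test:

```python
# Sketch of self-contained test data setup (all names hypothetical):
# before driving the UI, a test seeds exactly the content it will assert on,
# instead of depending on whatever happens to exist in the environment.

class FakeContentApi:
    """Stand-in for a real content service client; stores items in memory."""
    def __init__(self):
        self.items = {}

    def create_article(self, slug, title, published=True):
        self.items[slug] = {"title": title, "published": published}
        return slug

def seed_for_headline_test(api):
    """Create the one article the headline test expects to find."""
    return api.create_article("ui-test-article", "Known headline")

api = FakeContentApi()
slug = seed_for_headline_test(api)
# A Selenium test would now load the page for `slug` and assert on the title;
# here we just show that the data the test needs is guaranteed to exist.
print(api.items[slug]["title"])
```

With this setup, a test no longer fails because some shared fixture drifted; each run starts from a known content state.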

These changes allowed us to get to a stage where we could release on demand. I left SVT once the migration project was complete, but the improvements have continued. In addition to moving the hosting of the staging and production environments in-house, the deployments are now being handled by Chef. The automated test coverage was expanded (we were still spending too much time on routine manual tests that should have been automated) and JBehave was introduced.

I have to say that this was the most positive release experience I’ve had on a project. I really enjoyed working with the other devops-minded developers throughout this journey, and wouldn’t want to work any other way. The key is to automate as much as possible, and to make sure the environments are as similar as possible.

I enjoyed working on these things as well, and I have to admit it’s cool to read about what we’ve accomplished!

At the moment we’re working on a performance baseline to assert that we don’t see performance degradation over time. This is tricky, for the same reasons you succinctly stated above with regard to the Selenium tests. It might be worthy of an article of its own, to be honest.

Yassal, firstly, an interesting and insightful article. How did you handle non-functional testing, e.g. performance, security, etc.? Was that not required for the kind of project you were doing, or was it handled outside the scope of the continuous delivery?

Thanks! Short answer: security and performance testing were not part of the continuous delivery process (not automated); they were part of the development process. As you can see from Ola’s response above, it’s still a work in progress!