DevOps: A Gentle Introduction

DevOps is a blend of the words Development and Operations (i.e., bringing developers and operators closer), but DevOps is much more than that:

“DevOps is a set of practices intended to reduce the time between committing a change to a system and the change being placed into normal production, while ensuring high quality.”

In typical software development projects, the time from checking in a code change to releasing it takes ages! I have worked in organisations where a release happened only once every two to three years. With DevOps, it is lightning fast: releases can happen even hundreds of times per day (as at Amazon). That’s awesome, isn’t it? You can think of it as cyclic vs. continuous delivery. The cyclic delivery of earlier times took weeks, months, or even years, whereas with the continuous delivery of modern times a release takes days or even just an hour or so.

There are many drivers for DevOps. Many organizations strive for quicker releases to realise business needs. The wider availability of virtualization and cloud-based platforms has made quicker releases possible. The increased availability of data centre automation and configuration management tools has also made DevOps possible.

DevOps can be seen as a natural evolution of agile and lean methods.

Agile focuses on bridging the gap between user requirements and their realisation, i.e., development and testing. DevOps focuses on bridging the gap between the developers and the operations people. So, in addition to users’ functional and non-functional requirements, DevOps focuses on operational and business readiness.

Now let us look at some core underlying principles behind DevOps. DevOps encourages systems thinking, i.e., looking at how the entire system works instead of at silos (like development teams, IT operations teams, etc.). Rapid delivery is possible only when we can “fail fast”, and hence DevOps amplifies feedback loops to surface quality problems as early as possible, which is enabled through automation. A culture of continual experimentation and learning is needed for effective adoption of DevOps practices.

The deployment pipeline is what reduces the time between check-in and release. When a change is committed, automated acceptance and capacity tests are run. If needed, manual tests are performed to ensure that the software can be released. In other words, deployment pipelines are structured to enable reliable software releases through build, test, and deployment automation.
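To make the idea concrete, here is a minimal sketch of a pipeline as an ordered list of stages that stops at the first failure. The stage names follow the paragraph above, but the checks themselves (and the commit name "slow-change") are made-up placeholders; a real pipeline would call a build tool, a test runner, and a deployment tool instead.

```python
from typing import Callable, List, Tuple

# A stage is a (name, check) pair; the check returns True when the stage passes.
Stage = Tuple[str, Callable[[str], bool]]


def run_pipeline(commit: str, stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run the stages in order and stop at the first failure (fail fast)."""
    log: List[str] = []
    for name, check in stages:
        passed = check(commit)
        log.append(f"{name}: {'ok' if passed else 'FAILED'}")
        if not passed:
            return False, log  # later stages never run for a bad commit
    return True, log


# Hypothetical checks for illustration only.
stages: List[Stage] = [
    ("build", lambda commit: True),
    ("acceptance tests", lambda commit: True),
    ("capacity tests", lambda commit: commit != "slow-change"),
    ("deploy", lambda commit: True),
]

released, log = run_pipeline("slow-change", stages)
```

Here the commit fails the capacity tests, so the deploy stage is never reached and the failure is reported right after check-in.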

DevOps advocates a set of processes including:

Treating operators as first class citizens

Making developers more responsible for incident handling

Enforcing deployment practices uniformly across both developers and operators

Using continuous deployment

Developing infrastructure code using the same processes as application code

A very important concept in DevOps is how to deploy new releases (typically in cloud environments). Two basic all-or-nothing strategies are blue/green deployments and rolling upgrades.

In Blue/Green (Red/Black) deployments: leave the N instances running version A as they are, allocate and provision N new instances with version B, then switch to version B and release the version A instances.

Rolling Upgrade: allocate one instance, provision it with version B, and release one version A instance. Repeat N times until all instances are updated.
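The two strategies can be compared with a small simulation. This is only a toy model, assuming a fleet is just a list of version labels; the function names are invented for this sketch, and no real cloud API is involved.

```python
from typing import List


def blue_green(n: int) -> List[List[str]]:
    """Return the fleet at each step of a blue/green switch of n instances."""
    blue = ["A"] * n
    green = ["B"] * n      # allocate and provision N new instances
    return [
        blue,              # only version A exists
        blue + green,      # both fleets exist side by side
        green,             # switched to B, version A instances released
    ]


def rolling_upgrade(n: int) -> List[List[str]]:
    """Return the fleet after each one-instance replacement."""
    fleet = ["A"] * n
    states = [list(fleet)]
    for i in range(n):
        fleet[i] = "B"     # one B instance in, one A instance out
        states.append(list(fleet))
    return states


bg_states = blue_green(2)
rolling_states = rolling_upgrade(3)
```

The model shows the trade-off: blue/green briefly needs twice as many instances, while a rolling upgrade passes through intermediate states in which versions A and B run at the same time.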

In deployments, it is important to be able to disable or enable new or partially developed features. This is done through feature toggles. The key idea is to differentiate between installing a new version and activating a new version. In the feature toggle approach, you develop version B with the new code under the control of a feature toggle. You then install each instance of version B with the new code toggled off. When all of the instances of version A have been replaced with instances of version B, you activate the new code by switching the toggle on.
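A minimal sketch of the install-versus-activate distinction, assuming an in-memory toggle store. The class, the feature name "new_discount", and the checkout function are all hypothetical; a real system would usually read toggles from shared configuration so that every instance sees the change at once.

```python
from typing import List, Set


class FeatureToggles:
    """In-memory toggle store, for illustration only."""

    def __init__(self) -> None:
        self._enabled: Set[str] = set()

    def activate(self, name: str) -> None:
        self._enabled.add(name)

    def is_enabled(self, name: str) -> bool:
        return name in self._enabled


toggles = FeatureToggles()


def checkout_total(prices: List[int]) -> int:
    """Version B code: the new discount is installed but guarded by a toggle."""
    total = sum(prices)
    if toggles.is_enabled("new_discount"):  # hypothetical feature name
        total -= total // 10                # 10% off, integer arithmetic
    return total


before = checkout_total([10, 20])  # toggle off: behaves like version A
toggles.activate("new_discount")   # activation happens without a new install
after = checkout_total([10, 20])   # toggle on: new code path is used
```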

Now let us discuss two important concepts related to testing and deploying new features: canary testing and A/B testing.

Canaries are a small number of instances of a new version placed in production in order to perform live testing in a production environment. Canaries are observed closely to determine whether the new version introduces any logical or performance problems. If it does not, the new version is rolled out globally. If it does, the canaries are rolled back. The word canary refers to the bird. In the early days, before technology could be used to detect dangerous gases in coal mines, miners used canaries to detect leaks of dangerous gases. In the same way, we can use canary testing to see how the new version performs in the production environment. If the new version fails there, it can be quickly rolled back.
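The roll out or roll back decision can be sketched as a comparison between the canaries and the current version. The metrics chosen here (error rate and 95th percentile latency) and the thresholds are assumptions for illustration; each team has to decide what "observed closely" means for its own system.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    error_rate: float      # fraction of failed requests
    p95_latency_ms: float  # 95th percentile latency in milliseconds


def canary_decision(baseline: Metrics, canary: Metrics,
                    max_error_increase: float = 0.01,
                    max_latency_ratio: float = 1.2) -> str:
    """Return 'roll out' or 'roll back' by comparing canaries with the old version."""
    if canary.error_rate > baseline.error_rate + max_error_increase:
        return "roll back"  # logical problems: more failures than before
    if canary.p95_latency_ms > baseline.p95_latency_ms * max_latency_ratio:
        return "roll back"  # performance problems: noticeably slower
    return "roll out"


current = Metrics(error_rate=0.01, p95_latency_ms=200)
healthy = canary_decision(current, Metrics(error_rate=0.012, p95_latency_ms=210))
faulty = canary_decision(current, Metrics(error_rate=0.05, p95_latency_ms=210))
```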

In A/B testing, also known as split testing or bucket testing, we place two variations of a feature in the live environment (for example, two versions of a web page). When the variations are shown to different visitors, we learn which variation performs better. Interestingly, this way of comparing variations comes from old marketing practices. For example, email marketing campaigns send the same content in different variations to see which one gets better conversions or responses from the recipients. The same practice is applied in the DevOps context to evaluate the attractiveness of the design of new features.
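A rough sketch of the two moving parts: assigning each visitor to a variation, and comparing the outcomes. The experiment name and the numbers are invented for illustration, and picking the higher conversion rate is a simplification; a real experiment would also check that the difference is statistically significant.

```python
import hashlib
from typing import Dict, Tuple


def assign_variant(visitor_id: str, experiment: str = "landing-page") -> str:
    """Deterministically place a visitor in bucket 'A' or 'B'."""
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def conversion_rate(conversions: int, visitors: int) -> float:
    return conversions / visitors if visitors else 0.0


def better_variant(results: Dict[str, Tuple[int, int]]) -> str:
    """results maps variant -> (conversions, visitors)."""
    return max(results, key=lambda v: conversion_rate(*results[v]))


# Made-up numbers: variation B converts 45 of 1000 visitors, A only 30 of 1000.
winner = better_variant({"A": (30, 1000), "B": (45, 1000)})
```

Hashing the visitor id keeps the assignment stable, so a returning visitor always sees the same variation.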

What should we do when a newly deployed feature has bugs or other problems (e.g., poor performance)?

There are two main strategies when we find bugs in newly deployed code: roll back (i.e., undo the deployment) or roll forward (i.e., replace the feature with a fixed version). Rolling back is the practical choice in most scenarios, because once we know about a problem we want to ensure that no more users suffer from it. However, in some scenarios, especially non-critical ones, one can also roll forward by fixing the defect and deploying the fixed version.

In DevOps, deployment tools are widely used. We could “bake” machine images or use “recipes” for standard configurations. DevOps engineers manage the recipes. Another interesting approach is treating “infrastructure as code”. Note that such scripts need to be managed with the same kind of processes as source code (e.g., versioned in configuration management systems).
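The idea behind a recipe can be sketched as a desired state that lives in version control plus a function that works out what must change on a machine. This is a toy model, not how any particular tool works; the package names and versions are illustrative only.

```python
from typing import Dict, List

# Desired state, versioned alongside application code (illustrative values).
DESIRED: Dict[str, str] = {"nginx": "1.24", "python3": "3.11"}


def plan(desired: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """List the actions needed to bring a machine to the desired state."""
    actions: List[str] = []
    for pkg, version in sorted(desired.items()):
        if pkg not in actual:
            actions.append(f"install {pkg} {version}")
        elif actual[pkg] != version:
            actions.append(f"upgrade {pkg} {actual[pkg]} -> {version}")
    for pkg in sorted(set(actual) - set(desired)):
        actions.append(f"remove {pkg}")  # anything not in the recipe goes
    return actions


actions = plan(DESIRED, {"nginx": "1.22", "telnet": "0.17"})
```

Because the desired state is plain data in a file, a change to the infrastructure can be reviewed, versioned, and rolled back like any other code change.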

A wide range of tools is used for automated deployment, including the conventional approach of using shell scripts.

For infrastructure configuration management, tools such as Apache ZooKeeper, Noah, Chef, and Puppet are commonly used. For environment virtualisation, tools such as Vagrant, Xen, and KVM are widely used. Capistrano is often used for deployment automation.

There are many challenges to deal with when introducing DevOps in an organisational context. Let us look at a couple of them. Given the rapid deployments, how do we integrate security audits into continually changing codebases? This challenge is important to address given that security attacks are becoming more common than ever before. Another key challenge is dealing with culture change in the organization when it adopts DevOps: both developers and operators may resist DevOps practices because these move them out of their comfort zone and change the way they are used to working. For example, developers may be frustrated to learn that they have to care about monitoring and handling customer queries in addition to developing the code. Members of the operations team may have to learn how to automate more, which requires programming and scripting, and they often resist because they feel forced to learn and use these skills.

However, the benefits of DevOps are significant. Adopting DevOps can quicken delivery, with a considerably shorter time from a market need to its realisation in the software. DevOps practices also typically result in better quality because of shorter feedback cycles and increased automation. Finally, adopting DevOps increases organisational effectiveness because of close collaboration between development, testing, and operations teams.

“Deployment celebrations should be about the value of the new features, not joyful relief that nothing went horribly wrong” - Rebecca Parsons (Chief Technology Officer, ThoughtWorks)