DevOps: Agile Moves to Operations

In the world of constantly changing technology, a debate is raging. On one side is the push for flexibility and faster deployments, at the cost of unexpected downtime and issues. On the other side is the desire and need for stability. The debate? How to bridge the gap between the two sides: developers and operations. The answer: DevOps.

DevOps seeks to align developers and operations with the business need for continuous delivery, not just of code but of infrastructure. This naturally scares operations, and for good reason. DevOps.com provides additional information as well as resources on the concept.

A short synopsis of the problem and the process: on one side of IT you have a silo of developers who want to code; they want to create, they want to build. Their end game is deployment. The more features and changes the developers deploy, the more business objectives they meet.

On the other side you have a silo of operations. Operations is charged with keeping production up, as well as the development, test, and integration environments. Over the years operations has learned that the less you change, the more stable the environment, and the less often they are called in the middle of the night or during their child’s birthday party on the weekend. Those of us who have spent our careers working in operations are all too familiar with problematic changes. Deployments are among the biggest pain points. Faster and more frequent deployments introduce more potential issues. It’s easy to see why operations desires a stable environment. The stable environment meets business objectives by achieving the SLA (service level agreement).

There are two completely separate teams, one in each silo. On one hand is continuous change, and on the other is stability, which equates to less frequent change. Neither approach is wrong, but together they create friction. An us-against-them mentality tends to prevail. When a deployment occurs and problems ensue, fingers begin pointing. Frustration looms over operations as they try to fight the fire, while developers are quick to point out that it worked when they tested it in their environment. Operations doesn’t understand how items are deployed with issues; obviously, not all the necessary testing was completed. Developers can’t understand why the configurations are different.

How did we build these silos? Over the years operations and development have been viewed as separate. Operations developed its own set of processes. Think ITIL (IT Infrastructure Library), often teetering on the verge of death by process. The name of the game: stability. Building, configuring, and testing infrastructure takes time. Operations is charged with system uptime and maintaining SLAs that can be as high as 99%.
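
To put an availability target in concrete terms, here is a minimal sketch that converts an SLA percentage into the downtime it allows. The function name `allowed_downtime_minutes` and the 30-day measurement window are assumptions made for illustration; real SLAs define their own windows and exclusions.

```python
# Hypothetical helper: converts an availability target into a downtime budget.
# The 30-day default window is an assumption; actual SLA contracts vary.
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Return the minutes of downtime permitted over `days` at `sla_percent` availability."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    return (1 - sla_percent / 100) * days * MINUTES_PER_DAY


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% over 30 days allows {minutes:.1f} minutes of downtime")
```

At 99% the budget is roughly 432 minutes (a little over seven hours) in a 30-day month, and each additional nine cuts it by a factor of ten. With a budget that small, it is easier to see why operations treats every deployment as a risk.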

The business wants continuous uptime while increasing the number of features and bug fixes rolled out quickly to meet customer desires (demands). Developers answer with their own process: Agile. They code, test, and deploy. The name of the game is small changes in a small window, continuously flowing into production.

From an operations view, the reality of fast continuous delivery is instability. With every deployment, problems occur: some right away, others weeks or months down the road. Sometimes the problems don’t manifest themselves until another release is deployed. From the developers’ view, continuous delivery is impeded by archaic processes that delay releases for weeks.

DevOps seeks to break down the silos, with operations and developers working as one team with the same objectives and common terminology. The important part is that this is not a technology but an attempt to fix a business process so that IT aligns with the business strategy.

What’s important from an operations standpoint is that decreasing delivery time doesn’t necessarily mean increasing issues. Software-defined data centers promise to allow for quick, continuous deployments while protecting the stability of the environment, with vendors such as Amazon and Puppet Labs, to name two. In my opinion, we have turned a corner: from a time when uptime was crucial and faults were barely tolerated to one of greater tolerance as we increase our pace. Just how far that will continue remains to be seen.

There’s an ongoing conversation on antifragility. The theory is proposed by Nassim Nicholas Taleb in his book Antifragile: Things That Gain from Disorder. The premise is simple: “.. is that category of things that not only gain from chaos but need it in order to survive and flourish.” The question is, how does this apply to complex systems within the realm of IT?

Before we can change our behavior within operations, we have to change our culture, and not just in operations but within development as well. Instead of two separate entities existing in our silos, throwing work over the wall to other groups, we need to become one team aligned with the business strategy. The starting point is conversations and rebuilding the culture.