The state of operations health in the world of DevOps

One of the compelling and important conclusions in the DORA report is that improving your operations through the use of DevOps practices and methodologies is a continuous process. In this article, Ophir Ronen, Director of Digital Operations Insights at PagerDuty, explains why improving operations must become a core attribute of company culture.

At PagerDuty, we believe the best way to truly understand the health of your employees is to leverage the real-time human data that is already flowing through your systems. PagerDuty’s platform for action and real-time IT Operations orchestration consists of multiple facets and interlocking capabilities.

My Expert Services Team focuses on measurably improving the efficiency, operations health, and operational maturity of our customers. We do this via our Operations Health Management Service (OHMS), which uses the data generated by our customers to quantify three things: operator health, systemic efficiency, and operational maturity. This dataset is distilled algorithmically into easily understood scores that range from 0 to 100 across those three primary areas. These scores provide you with a true understanding of the on-call load and its impact on your teams. This crucial information empowers managers and HR professionals to diagnose and rectify employee operations health issues before they become serious enough to cause valuable employees to leave your organization.
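To make the idea of a 0 to 100 score concrete, here is a minimal sketch of how raw on-call load might be mapped onto such a scale. It is not the OHMS algorithm, which is not described in this article. The `OnCallWeek` fields, the weights, and the `worst_case` ceiling are all hypothetical values chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class OnCallWeek:
    """Hypothetical weekly on-call load for one responder."""
    notifications: int            # all pages received during the week
    off_hours_notifications: int  # pages outside normal working hours
    sleep_interruptions: int      # pages during assumed sleeping hours


def health_score(week: OnCallWeek, worst_case: float = 100.0) -> float:
    """Map a weighted on-call load onto a 0-100 scale (100 = healthiest).

    The weights and the worst_case ceiling are illustrative assumptions,
    not values taken from PagerDuty.
    """
    load = (
        1.0 * week.notifications
        + 2.0 * week.off_hours_notifications
        + 4.0 * week.sleep_interruptions
    )
    # Cap the load so an extreme week bottoms out at 0 instead of going negative.
    capped = min(load, worst_case)
    return round(100.0 * (1.0 - capped / worst_case), 1)


if __name__ == "__main__":
    quiet_week = OnCallWeek(notifications=5, off_hours_notifications=1, sleep_interruptions=0)
    rough_week = OnCallWeek(notifications=40, off_hours_notifications=20, sleep_interruptions=10)
    print(health_score(quiet_week), health_score(rough_week))
```

In this sketch, interruptions that are more disruptive to the responder carry heavier weights, so two weeks with the same number of pages can produce very different scores.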

DORA, the DevOps Research and Assessment team, released a “State of DevOps” report*, which reveals some intriguing trends across industries. For example, the report states that their “analysis shows that any team in any industry, whether subject to a high degree of regulatory compliance or not—across all industry verticals—has the ability to achieve a high degree of software delivery performance. We classify teams into high, medium, and low performers and find that they exist in all organization types and industry verticals.”

This finding is consistent with our observations, with one significant divergence: while it is true that high, medium, and low performers exist in all organization types and industry verticals, we have clear evidence that operations health varies significantly across industries. To illustrate this, we segment over 11,000 companies, from small startups to the upper strata of the Fortune 100, using the North American Industry Classification System (NAICS), as shown in the image below:

Additionally, the DORA report found that the top 7 percent of performers recover from major incidents 2,604 times faster than the bottom 15 percent, and that such periods of incident resolution are “intensive and exhausting.” Most interestingly, the report also pointed out that when such incidents happen multiple times a year, “they can quickly take over the work of the team so that unplanned work becomes the norm, leading to burnout, an important consideration for teams and leaders.”

In our recently completed attrition study involving 50,000 responders, we showed that on-call pain (unplanned work) has a statistically significant impact on employee attrition. As operations health degrades (as measured by our Operations Health Score algorithm), employee attrition rates increase across the global population of responders. Employees who suffer extreme on-call pain, as measured by operations health parameters, cannot and do not endure it for a sustained period of time. The image below shows the relationship between operations health and attrition risk, and how that risk turns into actual employee attrition.

Our data further indicates that no responder will tolerate extreme notification measures for a sustained period of more than 18 months, as shown below. With the red-hot demand for talented IT operations responders, the ability to accurately diagnose and address the causes of employee attrition is the cornerstone of effective human capital management.

One of the compelling and important conclusions in the DORA report is that improving your operations through the use of DevOps practices and methodologies is a continuous process. Furthermore, to succeed, the focus on improving operations must become a core attribute of company culture and be led by a strong executive sponsor.

At PagerDuty, we aim to measurably improve your operations with our industry-leading digital operations platform, data science–powered telemetry and analysis, and guidance from our Expert Services Team. For more information, please reach out to us at insights@pagerduty.com.

Ophir Ronen is a serial Internet entrepreneur who started his career as a co-founder of Internap Network Services. He has started five companies, including EE HQ, which PagerDuty acquired in 2015. At PagerDuty, he founded Event Intelligence, the Operations Command Console, and the Operations Health Management Service (OHMS). Ophir excels at using data science to improve operations and guiding teams of creative people to successful outcomes. He holds eight patents and enjoys building and sharing knowledge.