Latency SLOs Done Right

Wednesday, October 2, 2019 - 11:30–12:00

Heinrich Hartmann, Circonus

Abstract:

Latency is a key indicator of service quality, and it is important to measure and track. However, measuring latency correctly is not easy. In contrast to familiar metrics like CPU utilization or request counts, the "latency" of a service is not easily expressed in numbers. Percentile metrics have become a popular means of measuring request latency, but they have several shortcomings, especially when it comes to aggregation. The situation is particularly dire if we want to use them to specify Service Level Objectives (SLOs) that quantify performance over longer time horizons. In this talk we will explain these pitfalls and suggest three practical methods for implementing effective Latency SLOs.

Heinrich Hartmann is the Analytics Lead at Circonus. He drives the development of analytics methods that transform monitoring data into actionable information as part of the Circonus monitoring platform. In his prior life, Heinrich pursued an academic career as a mathematician. He later transitioned into computer science and worked as a consultant for a number of companies and research institutions.