Organizations—big and small—have started to realize just how crucial system and application reliability is to their business. At the same time, they’ve also learned just how difficult it is to maintain that reliability while iterating at the speed demanded by the marketplace. Site Reliability Engineering (SRE) is a proven approach to this challenge.
SRE is a large and rich topic to discuss. Google led the way with Site Reliability Engineering, the wildly successful O’Reilly book that described Google’s creation of the discipline and the implementation that has allowed them to operate at a planetary scale. Inspired by that earlier work, this book explores a very different part of the SRE space.
The more than two dozen chapters in Seeking SRE bring you into some of the important conversations going on in the SRE world right now. Listen as engineers and other leaders in the field discuss different ways of implementing SRE and SRE principles in a wide variety of settings; how SRE relates to other approaches like DevOps; the specialities on the cutting edge that will soon be commonplace in SRE; best practices and technologies that make practicing SRE easier; and, finally, what people have to say about the important but rarely discussed human side of SRE.

Introduction
I. SRE Implementation
1. Context Versus Control in SRE
2. Interviewing Site Reliability Engineers
3. So, You Want to Build an SRE Team?
4. Using Incident Metrics to Improve SRE at Scale
5. Working with Third Parties Shouldn’t Suck
6. How to Apply SRE Principles Without Dedicated SRE Teams
7. SRE Without SRE: The Spotify Case Study
8. Introducing SRE in Large Enterprises
9. From SysAdmin to SRE in 8,963 Words
10. Clearing the Way for SRE in the Enterprise
11. SRE Patterns Loved by DevOps People Everywhere
12. DevOps and SRE: Voices from the Community
13. Production Engineering at Facebook
II. Near Edge SRE
14. In the Beginning, There Was Chaos
15. The Intersection of Reliability and Privacy
16. Database Reliability Engineering
17. Engineering for Data Durability
18. Introduction to Machine Learning for SRE
III. SRE Best Practices and Technologies
19. Do Docs Better: Integrating Documentation into the Engineering Workflow
20. Active Teaching and Learning
21. The Art and Science of the Service-Level Objective
22. SRE as a Success Culture
23. SRE Antipatterns
24. Immutable Infrastructure and SRE
25. Scriptable Load Balancers
26. The Service Mesh: Wrangler of Your Microservices?
IV. The Human Side of SRE
27. Psychological Safety in SRE
28. SRE Cognitive Work
29. Beyond Burnout
30. Against On-Call: A Polemic
31. Elegy for Complex Systems
32. Intersections Between Operations and Social Activism
33. Conclusion
Index