alphalist.CTO Podcast – For CTOs and Technical Leaders
Embrace the Site Reliability Mindset with Alex Ewerlöf, Sr. Staff Engineer @ Volvo Cars 🚗 and SRE thought leader. Understand how the different aspects of site reliability work together 🪢 and when you need a DevOps vs. Platform Team vs. SRE. Find out how to set your own Service Level Indicators (SLI), Service Level Objectives (SLO), and Service Level Agreements (SLA), as laid out by the creator of the Service Level Calculator.
BROUGHT TO YOU BY DoiT
(00:00:00) Intro
(00:04:04) Who is Alex
(00:05:14) The Nerd Journey: From Assembling Computers to SRE
(00:06:58) The Evolution of a Site Reliability Engineer
(00:07:55) SRE vs. DevOps vs. Platform Engineering
(00:08:32) Washing Machine vs. Laundry Room: The Challenge of Standardization in Large Organizations
(00:10:56) Platform vs. SRE (if you are not Google)
(00:12:37) Common Platform without Premature Standardisation
(00:13:55) Software in Car Companies
(00:14:40) Volvo's Strategic Shift Towards In-House Software Development
(00:15:06) Swedish Tech Scene: Why Volvo now has offices in Stockholm
(00:16:53) Central Platform for Software in the Automotive Industry?
(00:18:35) Role of CSO: Chief Software Officer
(00:20:09) When the platform is the Product (e.g. Cars)
(00:21:42) Implementing Service Levels: A Guide to SLIs, SLOs, and SLAs
(00:22:20) What is SLI (Service Level Indicator)
(00:22:40) What is SLO (Service Level Objective)
(00:23:20) What is SLA (Service Level Agreement)
(00:24:24) Leveraging SLOs for autonomous teams in complex products
(00:26:08) Relationship between SLI and other Engineering Metrics
(00:28:04) Getting Started with Service Levels
(00:28:48) STEP 1. Understand SLI/SLOs
(00:29:10) STEP 2. Workshop: Identify your Service and What Matters
(00:30:02) Setting up SLIs and Alerting for SLIs
(00:31:02) STEP 3. Calculate your SLI
(00:31:44) STEP 4. Empower team with Good On-Call Practices
(00:32:17) Getting Buy-In Across the Organisation
(00:33:33) The Significance of Choosing the Right SLIs
(00:34:06) Ways to Measure Availability
(00:35:43) On-Call Management: Team vs. Centralized Approaches
(00:38:22) Downtime of External Vendors
(00:40:47) Why SLI needs to come from consumers
(00:43:01) SLOs: Setting Realistic SLOs and Avoiding Common Pitfalls
(00:44:05) Meaning of 'Objective' differs in OKR and SLO
(00:46:18) Financial Incentive to Fail Less?
(00:48:43) Biggest Mistakes
(00:49:30) No Blame Game - Public Metrics Need Cultural Fit
(00:51:50) Advice to Younger Self
(00:53:01) Stay Curious
(00:53:30) Don't Confuse Confidence with Competence
(00:54:47) Outro
Alex Ewerlöf is a Sr. Staff Engineer at Volvo Cars and a Site Reliability Engineering thought leader. He is the author of Reliability Engineering Mindset, the creator of the Service Level Calculator, and regularly shares insights on his website, AlexEwerlof.com. Get 40% off his book with this coupon code.
As the cloud landscape has evolved, so have its challenges. The shift from adopting to optimising public cloud infrastructure has forced born-in-the-cloud digital natives to grapple not only with growing technical complexities but also with the intricacies of cost management and evolving best practices.
DoiT addresses these challenges head-on with an intelligent product portfolio and market-leading cloud expertise that equips engineering and finance teams to understand cloud costs in the context of their business, maximise savings with minimal effort, and make costs more predictable.
Need help making sense of (and optimising) your cloud costs? Set up a call with a DoiT expert to learn about gaining access to DoiT’s FinOps and infrastructure specialists.