No Brian (Gracely) for a Sunday Perspective this week. We have Brian Singer (@brian_singer, CPO @nobl9inc) talking about Service Level Objectives (SLOs): what they are, why they matter, and how to use SLOs to focus on innovation vs. technical debt.
SHOW: 613
CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw
CHECK OUT OUR NEW PODCAST - "CLOUDCAST BASICS"
SHOW NOTES:
Topic 1 - Brian, welcome to the show! We had Alex on the podcast last year and look forward to continuing the SLO conversation. Give everyone a brief introduction.
Topic 2 - If someone isn’t familiar with SLOs (Service Level Objectives), how do you define them? Why do they matter? What problem do they solve? How are they different from SLAs?
Topic 3 - Is this a shift from chasing maximum reliability to treating errors as a “budget”? How can you manage a certain window of unreliability and still keep customers happy?
Topic 4 - How do you create SLOs? Who creates them? Is this an SRE connecting to existing systems, or does it require new tooling and plumbing? Does it fit into an existing GitOps workflow, for instance as SLOs-as-Code? Are there automation triggers that fire when conditions are met?
Topic 5 - How does an SRE know which metrics matter? I would imagine not all downtime is equal. How does this correlate with business KPIs? Do you fine-tune over time?
Topic 6 - The big question is always the focus on technical debt vs. innovation. Does this help, and if so, how?
FEEDBACK?