Managing reliability means being available when things go wrong. But how do you make on-call time productive? While at NDC in London, Richard talked to Lesley Cordero about her work with the New York Times on reliability management teams. Lesley talks about how putting regular sprint work into on-call time causes more problems than it solves: the quality of work suffers, and people get frustrated. Better to focus on preventive work, which is more contemplative. Even better to have an array of preventive efforts that can be worked on over time. The goal is to have fewer outages and more reliability, and that means being able to communicate reliability needs to leadership. Document all the things!
Recorded January 25, 2023