A veteran of early Twitter's fail whale wars, Dmitriy joins the show to chat about the time when 70% of the Hadoop cluster got accidentally deleted, the financial reality of writing a book, and how to navigate acquisitions.
Segments: (00:00:00) The Infamous Hadoop Outage (00:02:36) War Stories from Twitter's Early Days (00:04:47) The Fail Whale Era (00:06:48) The Hadoop Cluster Shutdown (00:12:20) “First Restore the Service Then Fix the Problem. Not the Other Way Around.” (00:14:10) War Rooms and Organic Decision-Making (00:16:16) The Importance of Communication in Incident Management (00:19:07) That Time When the Data Center Caught Fire (00:21:45) The "Best Email Ever" at Twitter (00:25:34) The Importance of Failing (00:27:17) Distributed Systems and Error Handling (00:29:49) The Missing README (00:33:13) Agile and Scrum (00:38:44) The Financial Reality of Writing a Book (00:43:23) Collaborative Writing Is Like Open-Source Coding (00:44:41) Finding a Publisher and the Role of Editors (00:50:33) Defining the Tone and Voice of the Book (00:54:23) Acquisitions from an Engineer's Perspective (00:56:00) Integrating Acquired Teams (01:02:47) Technical Due Diligence (01:04:31) The Reality of System Implementation (01:06:11) Integration Challenges and Gotchas
Show Notes: - Dmitriy Ryaboy on Twitter: https://x.com/squarecog - The Missing README: https://www.amazon.com/Missing-README-Guide-Software-Engineer/dp/1718501838 - Chris Riccomini on how to write a technical book: https://cnr.sh/essays/how-to-write-a-technical-book
Stay in touch: - Make Ronak's day by signing up for our newsletter to get our favorites parts of the convo straight to your inbox every week :D https://softwaremisadventures.com/
Music: Vlad Gluschenko — Forest License: Creative Commons Attribution 3.0 Unported: https://creativecommons.org/licenses/by/3.0/deed.en