
The GeekNarrator

How would you design a database on Object Storage?

68 min • 2 December 2024

Join Kaivalya Apte and Simon Hørup Eskildsen from Turbopuffer as they talk about the complexities of building a database on top of object storage. Discover the key challenges, the nuances of various storage formats, and the critical trade-offs involved. Learn from Simon's rich experience, from his time at Shopify to creating Turbopuffer. This episode covers everything from approaches to write-ahead logs to multi-tenancy and object storage advancements. Perfect for database enthusiasts and those keen on first-principles thinking!

00:00 Introduction
00:17 Simon's Background and Journey to Turbopuffer
02:42 Challenges in Database Scalability
04:21 Experimenting with Vector Databases
05:02 Cost Implications of Vector Databases
05:52 Architectural Considerations for Search Workloads
07:39 Building a Database on Object Storage
16:14 Designing a Simple Database on Object Storage
26:01 Handling Multiple Writers and Consistency
31:26 Trade-offs in Write Operations
32:36 Optimizing MySQL Write Performance
34:03 Batching Writes in Object Storage
35:08 Time-Based vs Size-Based Batching
36:32 Understanding Amplification in Databases
42:26 Challenges with Cold Queries
44:02 Building and Persisting B-Trees
50:53 Separating Workloads in Databases
56:07 Multi-Tenancy Challenges
01:00:39 Choosing Storage Formats
01:06:10 Key Innovations in Object Storage Databases

Important links:
- https://github.com/sirupsen/napkin-math (numbers)
- https://turbopuffer.com/
- https://turbopuffer.com/architecture
- https://sirupsen.com/napkin/problem-10-mysql-transactions-per-second
- https://sirupsen.com (my blog, napkin math)
- https://sirupsen.com/subscribe (napkin math newsletter)
- https://github.com/rkyv/rkyv (rkyv, Rust)

Become a member of The GeekNarrator to get access to member-only videos, notes and a monthly 1:1 with me.

Like building stuff? Try out CodeCrafters and build amazing real-world systems like Redis, Kafka, and SQLite. Use the link below to sign up and get 40% off a paid subscription.
https://app.codecrafters.io/join?via=geeknarrator

If you like this episode, please hit the like button and share it with your network. Also, please subscribe if you haven't yet.

Database internals series: https://youtu.be/yV_Zp0Mi3xs

Popular playlists:
- Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
- Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
- Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
- Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN

Stay Curious! Keep Learning!
