The likes of LinkedIn and Uber use Pinot to power some astonishingly high-scale queries against realtime data. The numbers alone would make an impressive case study. But behind the headline lies a fascinating set of architectural decisions and constraints that make those numbers possible. So how does Pinot work? How does it process queries? How are the various roles split across a cluster? And, equally important, what does it *not* try to achieve?
Joining me to go through the nuts and bolts of how Pinot handles SQL queries is Tim Berglund, veteran technology explainer of the realtime-data world. He takes us through Pinot step by step, covering the roles of brokers, servers, controllers and minions as we build up the picture of a query engine that's interesting in theory and massively performant in practice.
–
Apache Pinot: https://pinot.apache.org/
Apache Pinot Docs: https://docs.pinot.apache.org/
StarTree: https://startree.ai/
Event Driven Design episode with Bobby Calderwood: https://youtu.be/V7vhSHqMxus
Tim on Twitter: https://twitter.com/tlberglund
Kris on Mastodon: http://mastodon.social/@krisajenkins
Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/
Kris on Twitter: https://twitter.com/krisajenkins
–
#podcast #softwaredevelopment #apachepinot #database #dataengineering #sql