Hey 👋,
In this podcast we talk about distributed stream processing at high scale with Maximilian Michels (https://www.linkedin.com/in/maximilianmichels/), an expert in data-intensive applications and an open source maintainer for Apache Flink, Apache Beam, and other cool technologies.
We cover an introduction to stream processing at high scale:
- What is stream processing?
- How does it compare with batch processing and event processing?
- What tools are available to perform stream processing?
- Use cases where we need stream processing.
- Flink vs Kafka, and some factors to decide between the two. See https://docs.confluent.io/platform/current/streams/index.html (Kafka Streams) and https://flink.apache.org/ (Flink).
- Why is exactly-once processing a hard problem, and how do Kafka and Flink approach it? Is it similar to exactly-once delivery?
- Do we really need exactly once?
- Why does data locality matter?
- Should we make network calls from the stream processor or use a join instead? How do we decide?
And many other important considerations based on real production experience operating stream processing pipelines at very high scale.
I hope you like the podcast. Please like, share and subscribe 🙏 The GeekNarrator channel needs your love and support 😃
Regards, The GeekNarrator