From Problem to Requirements to Architecture.
In this episode, Nicolay Gerold and Jon Erich Kemi Warghed discuss the landscape of data engineering, sharing insights on selecting the right tools, implementing effective data governance, and leveraging powerful concepts like software-defined assets. They discuss the challenges of keeping up with the ever-evolving tech landscape and offer practical advice for building sustainable data platforms. Tune in to discover how to simplify complex data pipelines, unlock the power of orchestration tools, and ultimately create more value from your data.
Key Takeaways:
Jon Erik Kemi Warghed:
Nicolay Gerold:
Chapters
00:00 The Problem with the Modern Data Stack: Too many tools and buzzwords
00:57 How to Choose the Right Tools: Considerations for startups and large companies
03:13 Evaluating Open Source Tools: Background checks and due diligence
07:52 Defining Data Governance: Transparency and understanding of data
10:15 The Importance of Data Governance: Challenges and solutions
12:21 Data Governance Tools: dbt and Dagster
17:05 The Impact of Dagster: Software-defined assets and declarative thinking
19:31 The Power of Software Defined Assets: How Dagster differs from Airflow and Mage
21:52 State Management and Orchestration in Dagster: Real-time updates and dependency management
26:24 Why Use Orchestration Tools?: The role of orchestration in complex data pipelines
28:47 The Importance of Tool Selection: Thinking about long-term sustainability
31:10 When to Adopt Orchestration: Identifying the need for orchestration tools