Data Engineering Podcast

The Evolution of DataOps: Insights from DataKitchen's CEO

54 min • August 4, 2024
Summary
In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Chris Bergh, CEO of DataKitchen, to discuss his ongoing mission to simplify the lives of data engineers. Chris explains the challenges that data engineers face, such as constant system failures, the need for rapid changes, and high customer demands. He delves into the concept of DataOps, its evolution, and the misappropriation of related terms like data mesh and data observability, and he emphasizes the importance of focusing on processes and systems rather than just tools to improve data engineering workflows. Chris also introduces DataKitchen's open-source tools, DataOps TestGen and DataOps Observability, designed to automate data quality validation and monitor data journeys in production.
Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.
  • Your host is Tobias Macey and today I'm interviewing Chris Bergh about his tireless quest to simplify the lives of data engineers
Interview
  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what DataKitchen is and the story behind it?
  • You helped to define and popularize "DataOps", which then went through a journey of misappropriation similar to "DevOps", and has since faded in use. What is your view on the realities of "DataOps" today?
  • Out of the popularized wave of "DataOps" tools came subsequent trends in data observability, data reliability engineering, etc. How have those cycles influenced the way that you think about the work that you are doing at DataKitchen?
  • The data ecosystem went through a massive growth period over the past ~7 years, and we are now entering a cycle of consolidation. What are the fundamental shifts that we have gone through as an industry in the management and application of data?
  • What are the challenges that never went away?
  • You recently open sourced the dataops-testgen and dataops-observability tools. What are the outcomes that you are trying to produce with those projects?
  • What are the areas of overlap with existing tools and what are the unique capabilities that you are offering?
  • Can you talk through the technical implementation of your new observability and quality testing platform?
  • What does the onboarding and integration process look like?
  • Once a team has one or both tools set up, what are the typical points of interaction that they will have over the course of their workday?
  • What are the most interesting, innovative, or unexpected ways that you have seen dataops-observability/testgen used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on promoting DataOps?
  • What do you have planned for the future of your work at DataKitchen?
Contact Info
Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?
Links
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA