Coffee Sessions #22 with Carl Steinbach of LinkedIn, Deep in the Heart of Data.
//Bio
Carl is a Senior Staff Software Engineer and currently the Tech Lead for LinkedIn's Grid Development Team. He is a contributor to Emerging Architectures for Modern Data Infrastructure.
//Other links referenced by Carl:
https://rise.cs.berkeley.edu/wp-content/uploads/2017/03/CIDR17.pdf
https://www.youtube.com/watch?v=-xIai_FvcSk&ab_channel=WePayEngineering
https://softwareengineeringdaily.com/2019/10/23/linkedin-data-platform-with-carl-steinbach/
https://www.slideshare.net/linkedin/carl-steinbach-open-source
https://dreamsongs.com/RiseOfWorseIsBetter.html
https://engineering.linkedin.com/blog/2017/03/a-checkup-with-dr--elephant--one-year-later
https://engineering.linkedin.com/
https://engineering.linkedin.com/blog/2018/11/using-translatable-portable-UDFs
https://a16z.com/2020/10/15/the-emerging-architectures-for-modern-data-infrastructure/
--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Carl on LinkedIn: https://www.linkedin.com/in/carlsteinbach/
Timestamps:
[00:00] Introduction to Carl Steinbach
[00:44] Carl's background
[04:51] Breakdown of Transpiler
[10:55] Advantages of Decoupling the Execution Layer
[15:25] Differences between UDFs (user-defined functions) and Views
[18:45] How do you ensure the reproducibility of these Views?
[23:58] Data structure evolution
[27:55] Are Data Lakes and Data Warehouses fundamentally different things, or are they on a path towards convergence?
[33:37] It's inevitable that people will start doing machine learning on databases
[36:01] Who gets permission on what, especially when it comes to data and how sensitive things can be?
[41:27] Security aspect of data
[43:40] Does it require a level of abstraction on top of the data of the file system?
[45:48] Why do we go back and forth, and what sets this trend?