Deep in the heart of data // Carl Steinbach // MLOps Coffee Sessions #22

Coffee Sessions #22 with Carl Steinbach of LinkedIn, Deep in the Heart of Data.

// Bio
Carl is a Senior Staff Software Engineer and currently the Tech Lead for LinkedIn's Grid Development Team. He is a contributor to Emerging Architectures for Modern Data Infrastructure.

// Other links referenced by Carl:
https://rise.cs.berkeley.edu/wp-content/uploads/2017/03/CIDR17.pdf
https://www.youtube.com/watch?v=-xIai_FvcSk&ab_channel=WePayEngineering
https://softwareengineeringdaily.com/2019/10/23/linkedin-data-platform-with-carl-steinbach/
https://www.slideshare.net/linkedin/carl-steinbach-open-source
https://dreamsongs.com/RiseOfWorseIsBetter.html
https://engineering.linkedin.com/blog/2017/03/a-checkup-with-dr--elephant--one-year-later
https://engineering.linkedin.com/
https://engineering.linkedin.com/blog/2018/11/using-translatable-portable-UDFs
https://a16z.com/2020/10/15/the-emerging-architectures-for-modern-data-infrastructure/

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Carl on LinkedIn: https://www.linkedin.com/in/carlsteinbach/

Timestamps:
[00:00] Introduction to Carl Steinbach
[00:44] Carl's background
[04:51] Breakdown of Transpiler
[10:55] Advantages of Decoupling the Execution Layer
[15:25] Differences between UDFs (user-defined functions) and Views
[18:45] How do you ensure the reproducibility of these Views?
[23:58] Data structure evolution
[27:55] Are Data Lakes and Data Warehouses fundamentally different things, or are they on a path towards convergence?
[33:37] It's inevitable that people will start doing machine learning on databases
[36:01] Who gets permission on what, especially when it comes to data and how sensitive things can be?
[41:27] Security aspect of data
[43:40] Does it require a level of abstraction on top of the data of the file system?
[45:48] Why do we go back and go forward which sets this trend?

About the Podcast

Relaxed conversations about getting AI into production, in whatever shape that may take (agentic, traditional ML, LLMs, vibes, etc.)