MLOps Coffee Sessions #122 with Hannes Hapke, Machine Learning Engineer at Digits Financial, Inc.: Scaling Similarity Learning at Digits, co-hosted by Vishnu Rachakonda.
// Abstract
Machine Learning in a product is a double-edged sword. It can make a product more useful but it depends on assumed and strictly defined behavior from users.
Hannes walks through the entirety of Digits' machine learning pipeline: how they implemented it, what its elements are, what the learning looks like, and what the tooling looks like.
Hannes maps out what good data hygiene looks like, not only from the machine learning perspective but also from the software engineering, design, backend engineering, and data engineering perspectives.
// Bio
Hannes was the first ML engineer at Digits, where he built the MLOps foundation for their ML team. His interest in production machine learning ranges from building ML pipelines to scaling similarity-based ML to process millions of banking transactions daily.
Prior to Digits, Hannes implemented ML solutions for a number of applications, including retail, health care, and ERP companies.
He co-authored two machine learning books:
* Building Machine Learning Pipelines (O'Reilly)
* NLP in Action (Manning)
// MLOps Jobs board
https://mlops.pallet.xyz/jobs
// MLOps Swag/Merch
https://mlops-community.myshopify.com/
// Related Links
--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Hannes on LinkedIn: https://www.linkedin.com/in/hanneshapke/
Timestamps:
[00:00] Introduction to Hannes Hapke
[01:37] Takeaways
[02:40] Design supercharges machine learning
[05:48] Building Machine Learning Pipelines book
[08:09] Updating the edition
[09:37] Abstract away
[11:52] Approach of crossover
[16:04] Training-serving skew
[20:42] Tools using continuous integration and deployment
[25:25] Human in the loop touch point
[27:44] Data backfilling update
[30:06] Work and Products of Digits
[32:26] Digit Boost
[35:30] The first machine learning engineer
[39:55] Structured data in good shape, good data processing perspective, concept-educated teams
[43:33] Digits is hiring!
[43:55] Machine Learning struggles
[47:10] Design decision
[49:49] Data or machine learning literacy
[51:30] Data Hygiene
[52:49] Rapid fire questions
[54:47] Wrap up