Round 3 analyzing the Google paper "Continuous Delivery and Automation Pipelines in ML"
// Show Notes
Data Science Steps for ML
Data extraction: You select and integrate the relevant data from various data sources for the ML task.
Data analysis: You perform exploratory data analysis (EDA) to understand the available data for building the ML model. This process leads to the following:
Understanding the data schema and characteristics that are expected by the model.
Identifying the data preparation and feature engineering that are needed for the model.
Data preparation: The data is prepared for the ML task. This preparation involves cleaning the data and splitting it into training, validation, and test sets. You also apply data transformations and feature engineering for the model that solves the target task. The output of this step is the data splits in the prepared format.
Model training: The data scientist implements different algorithms with the prepared data to train various ML models, and subjects the implemented algorithms to hyperparameter tuning to get the best-performing ML model. The output of this step is a trained model.
Model evaluation: The model is evaluated on a holdout test set to assess its quality. The output of this step is a set of metrics that describe the quality of the model.
Model validation: The model is confirmed to be adequate for deployment, meaning that its predictive performance is better than a certain baseline.
Model serving: The validated model is deployed to a target environment to serve predictions. This deployment can be one of the following:
Microservices with a REST API to serve online predictions.
A model embedded in an edge or mobile device.
Part of a batch prediction system.
Model monitoring: The model's predictive performance is monitored so that a new iteration of the ML process can be invoked when needed.
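To make the middle of this list concrete, here is a minimal sketch of data preparation, model training with hyperparameter tuning, model evaluation, and model validation in plain Python. None of it comes from the paper: the synthetic dataset, the tiny nearest-neighbor model, the candidate values for k, and the majority-class baseline are all assumptions chosen so the example runs with the standard library only. In practice you would likely reach for a library such as scikit-learn.

```python
import random


def make_dataset(n_rows=200, seed=0):
    """Synthetic stand-in for extracted data: (feature, label) pairs."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        x = rng.random()
        label = x + rng.gauss(0, 0.05) > 0.5  # noisy threshold rule (assumed)
        rows.append((x, label))
    return rows


def split_data(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Data preparation: shuffle, then split into train/validation/test."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


def knn_predict(train_rows, x, k):
    """Predict by majority vote among the k nearest training points."""
    neighbors = sorted(train_rows, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in neighbors)
    return votes * 2 > k


def accuracy(train_rows, eval_rows, k):
    """Fraction of eval_rows the k-NN model labels correctly."""
    hits = sum(knn_predict(train_rows, x, k) == label for x, label in eval_rows)
    return hits / len(eval_rows)


def run_pipeline(candidate_ks=(1, 3, 5, 7)):
    train_rows, val_rows, test_rows = split_data(make_dataset())

    # Model training: tune the hyperparameter k on the validation set.
    best_k = max(candidate_ks, key=lambda k: accuracy(train_rows, val_rows, k))

    # Model evaluation: compute a metric on the holdout test set.
    test_accuracy = accuracy(train_rows, test_rows, best_k)

    # Model validation: compare against a baseline (here, always predicting
    # the majority class of the test set; the choice of baseline is assumed).
    positives = sum(label for _, label in test_rows)
    baseline = max(positives, len(test_rows) - positives) / len(test_rows)

    return {
        "best_k": best_k,
        "test_accuracy": test_accuracy,
        "baseline_accuracy": baseline,
        "validated": test_accuracy > baseline,
    }
```

Calling run_pipeline() returns the chosen k, the test metric, the baseline, and a validated flag that would gate deployment. Keeping the test set out of the tuning loop is the point of the three-way split: the validation set picks the hyperparameter, and the test set is only touched once for the final metric.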
The level of automation of these steps defines the maturity of the ML process, which reflects the velocity of training new models given new data or training new models given new implementations. The paper describes three levels of MLOps, starting from the most common level, which involves no automation, up to automating both ML and CI/CD pipelines.
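As one small illustration of what automating a step can look like, here is a hedged sketch of the model monitoring step deciding when to invoke a new iteration. The PerformanceMonitor class, its window size, and its accuracy threshold are hypothetical names and values made up for this example; the paper does not prescribe any of them.

```python
from collections import deque


class PerformanceMonitor:
    """Tracks rolling accuracy of served predictions (hypothetical helper)."""

    def __init__(self, window_size=100, min_accuracy=0.8):
        # Only the most recent `window_size` outcomes are kept.
        self.outcomes = deque(maxlen=window_size)
        self.min_accuracy = min_accuracy

    def record(self, prediction, actual):
        """Store whether a served prediction matched the observed outcome."""
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once the window is full and accuracy falls below the threshold."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.min_accuracy
```

A scheduler or pipeline trigger could poll needs_retraining() and start a new training run when it returns True. Waiting for a full window before raising the flag is a design choice that avoids retraining on a handful of early misses.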
In the rest of the conversation, we talk about maturity levels 0 and 1. In the next session, we will talk about level 2.
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/