Building Repeatable and Reliable ML Pipelines with a CI/CD System

May 5, 2023 | #MachineLearning, #HomePage

MLOps teams are responsible for building, automating, and monitoring an integrated machine learning (ML) system. Building a repeatable and reliable pipeline is a fundamental MLOps practice that requires mature expertise and adherence to MLOps best practices.

Operating an ML system in production requires managing many complex elements and a rigorous CI/CD process. This paper gives an overview of the Continuous Integration and Continuous Delivery (CI/CD) practices required in MLOps environments, a critical process when serving ML models.

MACHINE LEARNING CI/CD PIPELINES

Building a machine learning CI/CD pipeline means automating the building, testing, and deployment of a pipeline to target environments: an automated, repeatable, and reliable training pipeline that automatically deploys a prediction pipeline.

Managing MLOps workflows at scale may involve many training experiments and model versions. CI/CD techniques reduce cycle time and the risks associated with delivering new pipelines. Cycle time is the time from the first change to the code until that change is running in production, and reducing it is a central goal when building a CI/CD pipeline.
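As a rough illustration of the metric, cycle time can be computed from two timestamps: when a change is committed and when it is running in production. The function name and the timestamps below are hypothetical and only show the arithmetic.

```python
from datetime import datetime, timedelta


def cycle_time(committed_at: datetime, deployed_at: datetime) -> timedelta:
    """Time elapsed between committing a change and running it in production."""
    if deployed_at < committed_at:
        raise ValueError("deployment cannot precede the commit")
    return deployed_at - committed_at


# Hypothetical timestamps for a single change.
committed = datetime(2023, 5, 1, 9, 0)
deployed = datetime(2023, 5, 1, 15, 30)
print(cycle_time(committed, deployed))  # 6:30:00
```

Tracking this value per change over time is one simple way to see whether pipeline automation is actually shortening delivery.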

MLOps engineers work to optimize all the components of ML pipelines with a CI/CD approach to fit all parts together in the delivery process applicable to each use case.

ML systems are similar to other software systems but require attention to specific aspects of the MLOps cycle: continuous integration of source code under version control, unit testing, integration testing, and continuous delivery of packages.

CI/CD for machine learning differs from CI/CD for other software development. ML systems center on developing predictive models that must be operated continuously. An automated ML pipeline follows a rigorous process for testing, evaluating, and approving ML models.

RELIABLE AND REPEATABLE ML MODEL TRAINING

Teams of developers and data scientists perform data analysis and processing (extraction, analysis, preparation), experimentation, training, testing, model development, and validation.

Developers do more experimental work because model development is a trial-and-error process: they experiment with algorithms, parameter configurations, and modeling techniques, with a focus on reusing pipelines.

MLOps teams transform the manual sequence of steps of a pipeline used to create an ML model and its experimentation details into a reusable pipeline. You can build a reusable pipeline by automating the steps and running multiple iterations with different configurations and values from other data sources to create different predictive services. The code created to build the pipeline and its components is the result of this process.

Training operationalization refers to the process of creating and deploying training pipelines. Training pipeline workflows include data ingestion, validation, transformation, model training, and hyperparameter tuning.
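One way to picture a reusable training pipeline is as an ordered list of step functions that share a context and can be rerun with different configurations. The sketch below is an assumption-laden toy, not a real framework: the step names follow the stages listed above, while the data, the `scale` parameter, and the threshold "model" are hypothetical stand-ins. Hyperparameter tuning is left out for brevity.

```python
from statistics import mean
from typing import Callable

Context = dict  # shared state passed from step to step
Step = Callable[[Context], Context]


def ingest(ctx: Context) -> Context:
    # Toy stand-in for data ingestion: (feature, label) pairs.
    ctx["rows"] = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (None, 1)]
    return ctx


def validate(ctx: Context) -> Context:
    # Drop rows with a missing feature and fail fast if nothing is left.
    ctx["rows"] = [row for row in ctx["rows"] if row[0] is not None]
    if not ctx["rows"]:
        raise ValueError("no valid rows after validation")
    return ctx


def transform(ctx: Context) -> Context:
    # Scale features by a factor taken from the run configuration.
    scale = ctx["config"]["scale"]
    ctx["rows"] = [(x * scale, y) for x, y in ctx["rows"]]
    return ctx


def train(ctx: Context) -> Context:
    # Toy "model": a decision threshold at the mean feature value.
    ctx["model"] = {"threshold": mean(x for x, _ in ctx["rows"])}
    return ctx


def run_pipeline(steps: list[Step], config: dict) -> Context:
    """Run every step in order with a fresh context for this configuration."""
    ctx: Context = {"config": config}
    for step in steps:
        ctx = step(ctx)
    return ctx


STEPS = [ingest, validate, transform, train]

# The same pipeline code, executed with two different configurations.
for cfg in ({"scale": 1.0}, {"scale": 10.0}):
    print(cfg, run_pipeline(STEPS, cfg)["model"])
```

Because the steps are plain functions, the same list can be unit tested step by step and then executed unchanged by a scheduler or CI job.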

CONTINUOUS TRAINING OF AN ML PIPELINE IN A CI/CD SYSTEM

The process for building training pipelines and components can be replicated by setting up a CD process for the pipeline code, running automated tests, managing artifacts (tag and store), and deploying them to a target environment.

Continuous training (CT) is about training, retraining, and serving ML models. An automated ML pipeline is deployed into production and executed automatically and repeatedly to train models with new data. It is configured to run as a scheduled job or when triggered by specific events, such as model decay or the arrival of new data.
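The triggers mentioned above (a schedule, new data, model decay) can be expressed as a small decision function. This is a minimal sketch: the `PipelineStatus` fields and every threshold are hypothetical values chosen for illustration, and a real system would read them from monitoring and orchestration tools.

```python
from dataclasses import dataclass


@dataclass
class PipelineStatus:
    hours_since_last_run: float  # time since the last training run
    new_rows: int                # rows that arrived since the last run
    live_accuracy: float         # accuracy measured on recent production data


def retrain_reasons(
    status: PipelineStatus,
    *,
    schedule_hours: float = 24.0,  # assumed schedule interval
    min_new_rows: int = 1000,      # assumed "enough new data" threshold
    min_accuracy: float = 0.90,    # assumed floor before decay is flagged
) -> list[str]:
    """Return every trigger that currently calls for retraining."""
    reasons = []
    if status.hours_since_last_run >= schedule_hours:
        reasons.append("schedule")
    if status.new_rows >= min_new_rows:
        reasons.append("new data")
    if status.live_accuracy < min_accuracy:
        reasons.append("model decay")
    return reasons


print(retrain_reasons(PipelineStatus(2.0, 50, 0.95)))     # []
print(retrain_reasons(PipelineStatus(30.0, 5000, 0.80)))  # ['schedule', 'new data', 'model decay']
```

Returning the list of reasons, rather than a bare boolean, keeps a record of why each training run was started.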

The output of a continuous ML training pipeline is a newly trained model that provides a prediction service to the organization. New validated ML models are stored in the model registry and become available for use in other experiments or similar cases. The model registry keeps track of model provenance (origin and history), so models can be traced (lineage tracking) and audited.
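To make the idea of provenance concrete, here is a minimal in-memory sketch of a model registry. It is not the API of any particular registry product; the class names, fields, and the example identifiers (`snapshot-2023-05-01`, `a1b2c3d`, and so on) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ModelVersion:
    name: str
    version: int
    data_snapshot: str  # identifier of the training data used
    code_commit: str    # revision of the pipeline code used
    metrics: dict       # validation results recorded at registration
    registered_at: datetime


class ModelRegistry:
    """Toy registry that keeps every version so history can be audited."""

    def __init__(self) -> None:
        self._versions: dict[str, list[ModelVersion]] = {}

    def register(self, name: str, data_snapshot: str,
                 code_commit: str, metrics: dict) -> ModelVersion:
        history = self._versions.setdefault(name, [])
        entry = ModelVersion(name, len(history) + 1, data_snapshot,
                             code_commit, dict(metrics),
                             datetime.now(timezone.utc))
        history.append(entry)
        return entry

    def latest(self, name: str) -> ModelVersion:
        return self._versions[name][-1]

    def lineage(self, name: str) -> list[tuple[int, str, str]]:
        # (version, data snapshot, code commit) for every registered version.
        return [(v.version, v.data_snapshot, v.code_commit)
                for v in self._versions.get(name, [])]


registry = ModelRegistry()
registry.register("churn", "snapshot-2023-05-01", "a1b2c3d", {"accuracy": 0.91})
registry.register("churn", "snapshot-2023-05-08", "e4f5a6b", {"accuracy": 0.93})
print(registry.latest("churn").version)  # 2
print(registry.lineage("churn"))
```

Storing the data snapshot and code revision next to each version is what allows a model to be traced back to its origin or reproduced later.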

Metadata, artifacts, files, parameters, data statistics, test results, validation results, evaluations, checkpoints, and logs are tracked and stored in repositories during the training process.

ML MODEL DEPLOYMENT (CI/CD) (MODEL SERVING)

MLOps teams execute model deployment by fetching ML models from the pipeline model registry and the pipeline code from the repository to run prediction services.

Building prediction services from an ML model requires clear objectives for the desired business outcome. It also requires running automated tests suited to the use case (canary releases, A/B testing, etc.) to ensure the model performs as expected (better than the model in production) with real production data before releasing it to the target environment.
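A canary or A/B test needs two pieces: a stable way to split traffic between the production model and the candidate, and a rule for deciding whether the candidate did better. The sketch below shows one possible shape for both. The function names, the 10% canary fraction, and the minimum accuracy gain are assumptions made for illustration; a real decision would normally also involve a statistical significance check.

```python
import hashlib


def assign_variant(request_id: str, canary_fraction: float = 0.1) -> str:
    """Deterministically route a fixed share of requests to the candidate."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return "candidate" if bucket < canary_fraction * 10_000 else "production"


def accuracy(outcomes: list[bool]) -> float:
    if not outcomes:
        raise ValueError("no outcomes recorded")
    return sum(outcomes) / len(outcomes)


def should_promote(candidate: list[bool], production: list[bool],
                   min_gain: float = 0.01) -> bool:
    """Promote only if the candidate beats production by at least min_gain."""
    return accuracy(candidate) - accuracy(production) >= min_gain


# Hypothetical per-request outcomes (True means the prediction was correct).
candidate_outcomes = [True] * 9 + [False]
production_outcomes = [True] * 8 + [False] * 2
print(should_promote(candidate_outcomes, production_outcomes))  # True
```

Hashing the request identifier keeps the assignment stable: the same request (or user) always sees the same model for the duration of the experiment.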

ML models are then deployed against live data, and teams keep tracking, monitoring, and analyzing performance while continuously evaluating data changes, configurations, parameters, event logs, etc. As the data changes, a machine learning model might become outdated, skewed, or subject to drift, and it may need to be retrained and replaced by a new model.
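One common way to quantify a change in input data is the population stability index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. The article does not prescribe a specific drift metric, so the PSI is an assumption here, as are the bin count and the often-quoted rule of thumb that values above roughly 0.2 deserve attention.

```python
import math


def population_stability_index(reference: list[float], live: list[float],
                               bins: int = 10) -> float:
    """PSI between a reference sample and a live sample of one feature."""
    if not reference or not live:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def bin_fractions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps the logarithm defined for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = bin_fractions(reference), bin_fractions(live)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical feature values: training data versus shifted production data.
training_values = [i / 100 for i in range(100)]
shifted_values = [0.5 + i / 200 for i in range(100)]
print(population_stability_index(training_values, training_values))  # 0.0
print(population_stability_index(training_values, shifted_values) > 0.2)  # True
```

A check like this can run on a schedule against recent production inputs and feed the retraining decision.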

An ML model is integrated into production environments so that enterprise systems can send it data and receive predictions in return. Continuous integration assumes frequent model versions and implementation changes, which require continuous testing: code validation, schema checks, latency checks, and model and data validation steps.
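Schema and latency checks of this kind can be written as small functions that a CI job runs on every change. The sketch below is illustrative only: the field names in `EXPECTED_SCHEMA`, the latency budget, and the stand-in `predict` callable are hypothetical.

```python
import time
from typing import Callable

# Hypothetical input schema for a prediction request.
EXPECTED_SCHEMA = {"age": float, "income": float}


def validate_schema(record: dict, schema: dict) -> list[str]:
    """Return a list of schema violations (empty when the record is valid)."""
    errors = []
    for name, expected_type in schema.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            errors.append(f"wrong type for {name}")
    for name in record:
        if name not in schema:
            errors.append(f"unexpected field: {name}")
    return errors


def within_latency_budget(predict: Callable[[dict], object], record: dict,
                          budget_seconds: float) -> bool:
    """Time a single prediction call and compare it with the budget."""
    start = time.perf_counter()
    predict(record)
    return time.perf_counter() - start <= budget_seconds


print(validate_schema({"age": 30.0, "income": 52000.0}, EXPECTED_SCHEMA))  # []
print(validate_schema({"age": "thirty"}, EXPECTED_SCHEMA))
print(within_latency_budget(lambda r: 0.0, {"age": 30.0, "income": 1.0}, 1.0))
```

In practice a latency check would time many calls and compare a percentile rather than a single measurement, but the structure of the gate stays the same.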

MLOps engineers develop machine learning CI/CD pipeline practices—such as maintaining consistency between development and production environments, version control, and end-to-end automation—that are needed for scaling ML systems.

Online experimentation techniques are used to compare the running model with the previous version and then decide whether to release it.

Once the ML model is deployed, it can start a prediction service (responding with predictions) using REST APIs, gRPC, streaming systems (event-processing pipelines), batch ETL systems, and IoT devices.
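As a minimal sketch of the REST option, the example below exposes a toy model behind a `/predict` endpoint using only the Python standard library. The endpoint path, the JSON payload shape, and the fixed weights standing in for a trained model are all assumptions made for illustration; production services typically use a dedicated serving framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features: list[float]) -> float:
    # Toy stand-in for a trained model: a fixed linear score.
    weights = [0.5, -0.25]
    return sum(w * x for w, x in zip(weights, features))


def handle_predict(body: bytes) -> tuple[int, dict]:
    """Parse a JSON request body and return (HTTP status, response payload)."""
    try:
        payload = json.loads(body)
        features = [float(x) for x in payload["features"]]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": 'expected JSON like {"features": [1.0, 2.0]}'}
    return 200, {"prediction": predict(features)}


class PredictionHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, result = handle_predict(self.rfile.read(length))
        data = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000) -> None:
    """Start the service on localhost; call serve() to run it."""
    HTTPServer(("127.0.0.1", port), PredictionHandler).serve_forever()


print(handle_predict(b'{"features": [4.0, 2.0]}'))  # (200, {'prediction': 1.5})
```

Keeping the request logic in `handle_predict`, separate from the HTTP handler, makes it possible to test the service in CI without opening a network port.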

MLOps engineers continuously monitor ML models in production to detect data changes, concept drift, and changes in model performance.

DEVELOPMENT AND DEPLOYMENT PRACTICES (CI/CD)

Building repeatable and reliable ML pipelines can only be supported and led by agile leadership that is knowledgeable of Continuous Integration and Continuous Delivery (CI/CD) practices.

MLOps engineers focus on delivering software quickly, with efficiency and reliability, as they use techniques and best practices that interface with ML components.

An ML deployment pipeline can be repeatable as long as the building, deployment, testing, and releasing process is automated. ML engineers may take different approaches but work under the same principles.

Fully automated pipelines help to avoid error-prone steps in deployment. Manual deployments, by contrast, are expensive and time consuming: they require more documentation, waste time in debugging, depend on individual experts, and occupy highly skilled (and costly) engineers.

CI/CD best practices integrate testing, deployment, and release tasks into the agile development process, so that work advances through a sequence of continuous tests, which reduces risk during release.

All aspects of the environments are configured automatically from the version control system (VCS) (configuration management), allowing the tracking and recreation (rollback) of any component of infrastructure, ML metadata, development artifacts, source code, configurations, and production environments.

Feedback and collaboration are critical factors in building CI/CD pipelines. Establishing a feedback loop helps shorten the MLOps cycle by validating that the trained model meets quality standards and can be used to create a prediction service that satisfies the business criteria.

Feedback is triggered by behavior changes in components of the ML system. This feedback allows for adjustments in the training pipeline. Changes in any of the training pipeline components are captured and tested.

KEY TAKEAWAY

Building a repeatable and reliable ML pipeline requires a rigorous CI/CD process that interfaces with the machine learning components.

Manual processes are expensive, time consuming, and prone to errors. MLOps engineers optimize all the ML components to fit their purpose and decide on the approach applicable to specific use cases.

An automated CI/CD system helps build training models and replicate them to build predictive services (prediction pipelines) that scale and are highly adaptive to new conditions.

About Us: Krasamo is a mobile-first Machine Learning and consulting company focused on the Internet-of-Things and Digital Transformation.
