machine learning operations (MLOps)

By Cameron Hashemi-Pour, Site Editor

What is machine learning operations (MLOps)?

Machine learning operations (MLOps) is the development and use of machine learning models by development operations (DevOps) teams. MLOps adds discipline to the development and deployment of machine learning models, making the development process more reliable and productive.

MLOps encompasses a set of processes that machine learning developers use to build, deploy, and continuously monitor and train their models. It's at the heart of machine learning engineering, and it blends artificial intelligence (AI) and machine learning techniques with DevOps and data engineering practices.

There are many steps needed before an ML model is ready for production, and several players are involved. The MLOps development philosophy is relevant to IT pros who develop ML models, deploy the models and manage the infrastructure that supports them. Producing iterations of ML models requires collaboration and skill sets from multiple IT groups, such as data science teams, software engineers and ML engineers.

Development of deep learning and other ML models is considered inherently experimental, and failures are often part of the process in real-world use cases. The discipline is still evolving, and it's understood that sometimes even a successful ML model might not function the same way from one day to the next.

How MLOps works

MLOps implements the machine learning lifecycle: the stages that an ML model must pass through to become production-ready. The lifecycle is made up of the following four cycles:

Data cycle. The data cycle entails gathering and preparing data for training. First, raw data is culled from appropriate sources. Then, techniques such as feature engineering are used to transform, manipulate and organize the raw data into labeled data that's ready for model training.
Model cycle. In this cycle, the model is trained on the prepared data. Once a model is trained, it's important to track future versions of it as it moves through the rest of the lifecycle. Tools such as the open source MLflow can simplify this version tracking.
Development cycle. Here the model is further developed, tested and validated so that it can be deployed to a production environment. Deployment can be automated using continuous integration/continuous delivery pipelines that reduce the number of manual tasks.
Operations cycle. The operations cycle is a monitoring process that ensures the production model continues working and is retrained to improve performance over time. MLOps can automatically retrain an ML model either on a set schedule or when triggered by an event, such as a model performance metric falling below a certain threshold.
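The two retraining triggers described in the operations cycle, a set schedule and a performance metric falling below a threshold, can be sketched as a small decision function. This is a minimal illustration, not a production design: the `RetrainPolicy` class, the `should_retrain` function, the 0.90 accuracy floor and the 30-day maximum model age are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RetrainPolicy:
    """Hypothetical policy; the defaults are illustrative assumptions."""
    min_accuracy: float = 0.90                     # event trigger: metric floor
    max_model_age: timedelta = timedelta(days=30)  # schedule trigger


def should_retrain(policy, current_accuracy, last_trained, now):
    """Return (decision, reason) for a monitored production model."""
    # Event-based trigger: the monitored metric fell below the threshold.
    if current_accuracy < policy.min_accuracy:
        return True, "metric_below_threshold"
    # Schedule-based trigger: the model is older than the allowed age.
    if now - last_trained >= policy.max_model_age:
        return True, "schedule_elapsed"
    return False, "healthy"


policy = RetrainPolicy()
now = datetime(2023, 10, 31)
print(should_retrain(policy, 0.87, datetime(2023, 10, 20), now))
print(should_retrain(policy, 0.95, datetime(2023, 9, 1), now))
print(should_retrain(policy, 0.95, datetime(2023, 10, 20), now))
```

In practice, a check like this would likely run inside a scheduler or monitoring job that starts the training pipeline whenever the decision is true.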

Figure: The five essential steps involved in creating a machine learning model.

Why is MLOps necessary?

Machine learning models aren't built once and then forgotten; they require continuous training so that they improve over time. That's where MLOps comes in. It provides the ongoing training and constant monitoring needed to ensure ML models operate successfully.

MLOps documents reliable processes and creates safeguards to consistently mitigate failures and reduce development time, creating better models. MLOps uses repeatable processes in the same way businesses use workflows for organization and consistency. In addition, MLOps automation ensures time isn't wasted on tasks that are repeated each time new models are built.

What are the benefits of MLOps?

MLOps provides a range of benefits, such as the following:

Speed and efficiency. MLOps automates many of the repetitive tasks in ML development and within the ML pipeline, such as the initial data preparation procedures. This approach reduces development time and cuts down on human-induced errors in the models.

Figure: The pragmatic business benefits that machine learning offers enterprises.

Scalability. ML models often must be scaled to handle increased workloads, larger data sets and new features. To provide scalability, MLOps uses technology such as containerized software and data pipelines that can handle large amounts of data efficiently.
Reliability. MLOps model testing and validation catch problems in the development phase, increasing reliability early on. Operations processes also ensure models comply with policies that an organization has in place. Ongoing monitoring reduces risks such as data drift, in which the accuracy of a model deteriorates over time because the data it sees in production has diverged significantly from the data it was trained on.
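One way a monitoring job might check for the data drift mentioned above is to compare the distribution of a feature in training data against recent production data. The sketch below uses the population stability index (PSI), a common drift statistic. The `psi` function, the bin count and the toy data are assumptions made for illustration, and the 0.25 alert level is a widely quoted rule of thumb rather than a fixed standard.

```python
import math


def psi(expected, actual, bins=10):
    """Population stability index between two numeric samples.

    Bin edges are taken from the `expected` (training) sample.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the training range into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Toy data: the production feature has shifted upward by 3 units.
training = [i / 100 for i in range(1000)]
production = [x + 3 for x in training]

DRIFT_ALERT = 0.25  # assumed alert level (rule of thumb)
print("no shift:", psi(training, training) > DRIFT_ALERT)
print("shifted: ", psi(training, production) > DRIFT_ALERT)
```

A score above the alert level could serve as the kind of event that triggers retraining.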

MLOps challenges

MLOps might be far more streamlined and efficient than traditional approaches, but it's not without its challenges. They include the following:

Staffing. The same data scientists responsible for developing ML algorithms might not be the most effective at deploying them. They also might not be best equipped to explain how to use the algorithms to software developers. Some of the best MLOps teams embrace the idea of cognitive diversity -- the inclusion of people who have different approaches to problem-solving and offer unique perspectives because they think differently.
Costliness. MLOps can be costly, given the need to build an infrastructure that encompasses many new tools and the resources required for data analysis as well as model and employee training. This is especially true of large-scale machine learning projects with lots of dependencies and feedback loops. It is important for an organization interested in these projects to assess whether MLOps is the best approach.
Imperfect processes. While MLOps processes are designed to reduce errors, some mistakes still occur and require human intervention.
Cyberattacks. Malicious actors are a threat given the large amount of data that MLOps infrastructures store and process. Strong cybersecurity measures are required to minimize the risk of data breaches or leaks.

MLOps vs. DevOps

The most obvious similarity between DevOps and MLOps is the emphasis on streamlining design and production processes. However, the clearest difference between the two is that DevOps is focused on meeting software vendors' business goals by producing the most up-to-date versions of software applications for customers as quickly as possible. MLOps is instead focused on surmounting the challenges that are unique to machine learning to produce, optimize and sustain a model.

DevOps typically involves development teams that program, test and deploy software apps into production. MLOps aims to do the same with ML models, but with a handful of additional phases. These include extracting raw data for analysis, data preparation, model training, evaluating model performance, and finally, continuous monitoring and training.
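The additional phases can be pictured as a chain of small, repeatable steps. The following is a toy sketch only: the function names, the hard-coded records and the one-parameter model are hypothetical stand-ins for real data sources and training code. It also evaluates on the training data to stay short, whereas a real pipeline would use held-out data.

```python
from statistics import mean


def extract():
    """Stand-in for pulling raw records from a source system (toy data)."""
    return [
        {"x": "1", "y": "2.0"},
        {"x": "2", "y": "4.1"},
        {"x": None, "y": "6.0"},  # incomplete record
        {"x": "4", "y": "7.9"},
    ]


def prepare(rows):
    """Data preparation: drop incomplete rows and cast strings to floats."""
    return [
        (float(r["x"]), float(r["y"]))
        for r in rows
        if r["x"] is not None and r["y"] is not None
    ]


def train(data):
    """Model training: least-squares fit of y = slope * x through the origin."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)


def evaluate(slope, data):
    """Model evaluation: mean absolute error of the fitted line."""
    return mean(abs(y - slope * x) for x, y in data)


def run_pipeline():
    data = prepare(extract())
    slope = train(data)
    return {"rows_used": len(data), "slope": slope, "mae": evaluate(slope, data)}


print(run_pipeline())
```

Because each phase is a separate function, a CI/CD system could rerun the same pipeline on new data without manual steps, and a monitoring phase could track the reported error over time.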

Table: A comparison of DevOps and MLOps practices, which have different goals and objectives.

Standard practices for MLOps

There are many useful practices to which MLOps teams adhere. The following strategies can help guide a successful ML project to completion and reduce its likelihood of failure:

An application programming interface from an existing AI service can be used to simplify or expedite MLOps in various ways. For example, this approach is used to retrieve data from external data sources or for automated testing of ML models.
MLOps professionals often run parallel model development processes so that if one model fails, they still have others in progress.
Pre-trained models are used to show proof of concept.
Generalized algorithms showing some success are further trained for a specific task. For example, a logistic regression algorithm can be trained to predict the likelihood of future events.
Publicly available data sources are used to bridge gaps in model training data, provide new data and prevent model drift.
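To make the logistic regression example above concrete, the sketch below trains a one-feature model with plain gradient descent and uses it to estimate the likelihood of an event. It is a minimal illustration under stated assumptions: the toy data set, the learning rate and the epoch count are invented for the example, and a real project would more likely use an established ML library than hand-written training code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every example under the current parameters.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical, standardized feature values with a binary outcome
# (1 = the event occurred, 0 = it did not).
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(xs, ys)


def predict(x):
    """Estimated probability that the event occurs for feature value x."""
    return sigmoid(w * x + b)


print("low feature value: ", round(predict(-2.0), 3))
print("high feature value:", round(predict(2.0), 3))
```

The model outputs a probability rather than a hard label, which is what makes logistic regression a natural fit for likelihood-style predictions.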

There are four different types of ML training approaches. Supervised machine learning is the most common, but there is also unsupervised learning, semi-supervised learning and reinforcement learning.

This was last updated in October 2023
