
Discovering: MLOps

Episode 1: A symbiotic dance of two highly valued fields in contemporary data-shifting environments

16 min read · Jun 4, 2023

In two of my previous AI/ML-oriented articles, I’ve briefly touched on the subject of MLOps. In this article, which can be considered a follow-up to them, I’ll try to introduce this relatively new concept, one that could help anyone interested in the topic transition from their existing machine learning practices to enterprise-level standards.

Right from the start I should come clean and say that my initial thoughts upon hearing about machine learning operations (MLOps) were mistaken. I thought it was a merging of ML and traditional DevOps. While that is roughly true, and one wouldn’t be wrong for making such a statement, the real ‘thing’ is a bit more complex than that, and it involves aspects that are much closer to idiomatic machine learning workflows than to traditional software development principles. So, what is MLOps?

MLOps represents an end-to-end, looping cycle of data science, machine learning and developer operations (DevOps) practices and principles for an optimized system development cycle. By utilizing an automated process that combines components ranging from data extraction and preparation, through model training, to deployment and monitoring, one can significantly shorten the model development cycle and uphold high quality standards.

In this Discovering series of articles, we’ll cover the basic definitions of the key concepts around MLOps, its benefits and principles, and some of the challenges that have emerged in the past few years. Along the way, there will be a sprinkle of an example project. So, let’s move on with introducing some key concepts.

What is DevOps?

DevOps stands for “development and operations” and represents a set of principles, tools, culture and methodologies whose end goal is to bridge gaps between traditional teams such as QA, development and IT infrastructure when producing new solutions and maintaining existing ones. It is meant to speed up the product development and deployment lifecycle while maintaining a certain quality threshold. Typical steps in a DevOps workflow include everything from task planning and designing, through code commits, linting and testing of that code, to deployment, delivery and monitoring of the whole solution.

DevOps introduces the concepts of continuous integration and continuous deployment, effectively preventing system anomalies from slipping through the different stages of the product lifecycle. This manifests in better overall product quality and general service availability, since there is no need to compromise existing, running production services with potentially problematic states. More and more tools and concepts are being introduced that expand the reach of DevOps to cover as much as possible of the product workflow, from the conception phase to the well-developed deployment and monitoring phases. Next, we’ll look at some key principles surrounding DevOps.

Automation

What’s there to say about this, perhaps most important, aspect of DevOps? Automation is the key that makes every other aspect of this process possible by enforcing the effectiveness and repeatability of each task. Let’s be frank: as software developers, data scientists or machine learning engineers, we are prone to making mistakes when doing repeatable tasks. Automation can help us mitigate the risks of manually writing, triggering and executing tasks. We can simply write code, usually in the form of scripts, that reacts to external factors influencing the whole product lifecycle.
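
To make this less abstract, here is a minimal sketch of such a reactive script: it polls a drop folder and triggers a retraining run when new data lands. The folder path and the retrain.py script are hypothetical placeholders, not part of any specific toolchain:

```python
import subprocess
import sys
import time
from pathlib import Path

DATA_DIR = Path("data/incoming")          # hypothetical drop folder for new data
RETRAIN = [sys.executable, "retrain.py"]  # hypothetical retraining script

def latest_mtime(folder: Path) -> float:
    """Return the most recent modification time of any file in the folder."""
    return max((f.stat().st_mtime for f in folder.glob("*")), default=0.0)

def watch(poll_seconds: int = 60) -> None:
    """Poll for new data and react by triggering a retraining run."""
    seen = latest_mtime(DATA_DIR)
    while True:
        current = latest_mtime(DATA_DIR)
        if current > seen:
            print("new data detected, triggering retraining")
            subprocess.run(RETRAIN, check=False)
            seen = current
        time.sleep(poll_seconds)

if __name__ == "__main__":
    watch()
```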

Architecture

Choosing the right system architecture can be the deciding factor when cost reduction, scaling and maintenance are crucial driving forces for a solution. Closed architectures are, by definition, very isolated and dependent on API gateways to provide communication between various components and, even more importantly, between the various stages of the complete development-to-operations cycle. With good system architecture, we can predict which parts of the system provide the most value toward the overall product success criteria, and introduce entry points into which we can plug and interact during each of the aforementioned stages of a typical DevOps workflow.

There’s no right or wrong answer when choosing an overall system architecture; there are only architectures that fulfill the acceptance criteria. What has proved successful are architectures that allow the implementation details of components to be abstracted from the rest of the system through some form of contracts. Contracts let the business logic of a solution stay independent of the concrete implementations behind them. A huge benefit of this kind of approach is that, at any stage of the DevOps cycle, we can replace anything, from libraries to the tooling used in the solution.
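
As a small illustration of such contracts, here is a hedged Python sketch; the ModelStore interface and its in-memory implementation are purely invented for the example:

```python
from abc import ABC, abstractmethod

class ModelStore(ABC):
    """Contract: business logic depends on this interface, not on a vendor."""

    @abstractmethod
    def save(self, name: str, payload: bytes) -> None: ...

    @abstractmethod
    def load(self, name: str) -> bytes: ...

class LocalModelStore(ModelStore):
    """One concrete implementation; swappable without touching callers."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def save(self, name: str, payload: bytes) -> None:
        self._blobs[name] = payload

    def load(self, name: str) -> bytes:
        return self._blobs[name]

def publish_model(store: ModelStore, name: str, payload: bytes) -> None:
    """Business logic written only against the contract."""
    store.save(name, payload)

publish_model(LocalModelStore(), "churn-model-v1", b"weights...")
```

Swapping LocalModelStore for, say, a cloud-backed implementation would not touch publish_model at all, which is the whole point of the contract.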

Continuous Integration/Continuous Delivery (CI/CD)

In the software development process, it is common to have some sort of centralized source code repository that serves as a single source of truth for the whole project. Usually, that repository is hosted on a cloud service like GitHub or GitLab. When software developers merge their latest code changes, automated services can be configured to listen for those changes and trigger events such as build and testing processes. Besides reducing the time and errors involved in manually triggering those kinds of events, this means issues can be caught very early in the development cycle, much sooner than any production release, and, if all defined quality criteria are successfully met, a constant release schedule can be maintained.
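
As a rough illustration, the kind of gate script a CI service might run on every merge could look like this; the choice of flake8 and pytest is an assumption for the example, not a requirement of any CI product:

```python
import subprocess
import sys

# Illustrative quality gates a CI service might run on every merge.
GATES = [
    ("lint", [sys.executable, "-m", "flake8", "src"]),
    ("unit tests", [sys.executable, "-m", "pytest", "tests"]),
]

def run_gates() -> int:
    """Run each gate in order; a nonzero exit fails the whole pipeline."""
    for name, cmd in GATES:
        print(f"running gate: {name}")
        if subprocess.run(cmd).returncode != 0:
            print(f"gate '{name}' failed")
            return 1
    print("all gates passed; safe to build a release")
    return 0

if __name__ == "__main__":
    sys.exit(run_gates())
```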

Inter-Team Communication

Since DevOps throws such a wide net over the development cycle, different teams can be part of it, and inter-team communication is crucial for any product’s success. What DevOps ensures is that no single team works on a product in isolation. A seamless transition is made between the end and beginning stages of each team’s work, with clear, practical contracts between them that act like gateways for other teams. The end goal is to ensure minimal to no defects leaking from one team’s stage to another’s, which could otherwise lead to a waterfall effect where one defect propagates through all teams right up to the final production deployment. There are also sub-practices focusing on clear inter-team communication, such as ChatOps. An understanding of accountability and a good sense of team roles lead to a clearer vision of what the end product needs to be.

Infrastructure as Code (IaC)

IaC can be seen as a management layer for provisioning data centers, where most configuration is relayed directly to the underlying system resources using machine-readable configuration files. The goal is to abstract the hardware layer for the benefit of more consistent, testable, repeatable and automated deployment and management of the infrastructure layer.
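
To make the machine-readable configuration idea concrete, here is a toy sketch that reads a desired-state description and prints a provisioning plan. A real IaC tool such as Terraform or Pulumi would instead diff this desired state against live infrastructure and apply only the changes; the resource types and fields below are invented for illustration:

```python
import json

# A machine-readable description of the desired infrastructure.
# Resource names and fields here are purely illustrative.
CONFIG = json.loads("""
{
  "resources": [
    {"type": "vm",      "name": "training-node",  "cpus": 16, "memory_gb": 64},
    {"type": "storage", "name": "dataset-bucket", "size_gb": 500}
  ]
}
""")

def plan(config: dict) -> None:
    """Show what a provisioning tool would create from the config file."""
    for res in config["resources"]:
        print(f"+ create {res['type']} '{res['name']}' ({res})")

plan(CONFIG)
```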

Monitoring and Logging

At the end, we have to know how our system is performing: how resources are being utilized, what the service health status is, and whether there are warnings or errors. We can monitor how individual updates and changes affect the overall state of a solution and whether there is a need to roll back to a previous acceptable state. Logging and monitoring also provide concrete data for potential stakeholders, enabling them to make key strategic decisions when moving forward with the project. Logging helps us identify issues early on and shortens team response times for addressing those issues, enabling the rest of the system to maintain its availability throughout the process.
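
Here is a minimal sketch of that kind of health reporting using Python’s standard logging module; the thresholds and sampled values are made up for the example:

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
log = logging.getLogger("inference-service")

def check_health(cpu_percent: float, error_rate: float) -> str:
    """Map raw metrics to a health status and log at the matching level."""
    if error_rate > 0.05:
        log.error("error rate %.1f%% above threshold", error_rate * 100)
        return "unhealthy"
    if cpu_percent > 80:
        log.warning("cpu at %.0f%%, consider scaling out", cpu_percent)
        return "degraded"
    log.info("all checks passed")
    return "healthy"

# Values are made up for illustration; a real service would sample them.
print(check_health(cpu_percent=87.0, error_rate=0.01))
```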

With these principles in place, companies and individuals alike can create environments where the coding, building, testing and deployment stages happen more frequently and, most importantly, more consistently.

MLOps vs. DevOps

With the DevOps part covered in the previous segment, we can proceed with differentiating MLOps from DevOps. It’s only natural to assume that MLOps is an extension of DevOps; they share 50% of their names, don’t they? We should tread carefully, because this kind of thinking can be very misleading. Yes, MLOps shares the same culture and most of the practices introduced in DevOps, but where they differ is in where their end results live. While DevOps is mostly focused on delivering an end product, MLOps focuses on machine learning models, their versioning and overall data management.

MLOps can definitely be seen as an extension of DevOps that specifically addresses the issues and challenges of deploying and managing models within the AI/machine learning ecosystem for production environments. MLOps aims to cover the complete ML lifecycle, including research, data extraction and validation, model training, deployment, monitoring and inner-loop refactoring with iterative improvements. With MLOps, we are combining ML techniques with DevOps practices to achieve the level of effectiveness, inter-team communication and consistency in the ML world that DevOps has already proven in software development.

A visual representation of the two can loosely be observed in the following diagrams:

DevOps vs. MLOps Venn diagrams

Both MLOps and DevOps put an emphasis on continuous improvement and continuous learning. One obvious differentiating point between them is their respective domain knowledge areas. We’ve already mentioned that DevOps focuses on the software development lifecycle, but MLOps, in addition to software development, requires knowledge of machine learning, data science and statistical analysis. Each of those additional fields is extremely vast in scope, and engineers who successfully juggle all of them are extremely valuable to industry giants.

There’s also a certain level of experimentation and iteration looping involved when trying to improve upon a certain model type in MLOps. The same can definitely be said about a software solution in DevOps, but there, experimentation resembles performance optimization, where the cost-to-effectiveness ratio drastically drops for each gained performance point. The complete opposite is true in ML: fine-tuning a model can yield exponentially more usability and return on investment (ROI), and the norm is that at least one model iteration improvement is required to satisfy production criteria.

With MLOps being an extension of DevOps, versioning plays an even more important role. Data and model versioning are just as important as code versioning in the world of ML. At the end of an ML lifecycle, in the monitoring and logging stage, we have to consider all the factors that came into play when creating the output model. What was the input data, what was the data source, what parsing algorithm was applied to it, to what extent was the data cleaned up before being released as a training set? These are just potential versioning opportunities regarding the model training data, but we also have model versioning aspects to consider, such as the fine-tuning hyperparameters. It’s now easy to understand how much more intricate MLOps can be.
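
One lightweight way to capture this lineage, sketched below under the assumption that the training script can emit a manifest file, is to fingerprint the data with a content hash and record it next to the hyperparameters. The file names are hypothetical:

```python
import hashlib
import json
from pathlib import Path

def fingerprint(path: Path) -> str:
    """Content hash of a data file: same bytes, same version."""
    return hashlib.sha256(path.read_bytes()).hexdigest()[:12]

def record_lineage(data_file: Path, hyperparams: dict, out: Path) -> None:
    """Persist everything needed to reproduce this training run."""
    manifest = {
        "data_file": str(data_file),
        "data_version": fingerprint(data_file),
        "hyperparams": hyperparams,
    }
    out.write_text(json.dumps(manifest, indent=2))

# Hypothetical file names; any training script could emit such a manifest.
data = Path("train.csv")
data.write_text("feature,label\n1.0,0\n2.0,1\n")
record_lineage(data, {"learning_rate": 0.01, "epochs": 20}, Path("run_manifest.json"))
print(Path("run_manifest.json").read_text())
```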

Needless to say, much of the tooling and infrastructure is considerably different than in standard DevOps. Everything from model serving and distributed training to monitoring frameworks needs to be purpose-built with ML in mind. With good architectural oversight, we can mitigate the cost of adopting different frameworks and tools by investing the effort in specialized API gateways that plug into our existing tooling.

The last important aspect to consider when comparing these two development cultures is ethics. Data privacy and model fairness play key roles in certain industries, such as healthcare. This is a particularly difficult topic to address, and specialized teams are required to create test cases and validations when targeting certain markets and user groups. Not every model and data source can be applied to niche domains where sensitive topics are present. Early efforts in publicly available AI/ML solutions produced less-than-ideal results when no ethical filters were applied to them. We now strive to make solutions that can be adapted to more versatile scenarios.

Why do We Need MLOps?

The current state of things in the world of machine learning and data science is not suitable for fast-track change requests and ever-shifting markets. More complex models, by some thinking the more valued models, require an extensive amount of time to train, validate and ship to market. Teams centered around AI/ML typically execute data experiments, data preparation, model training and final evaluation manually. This can be extremely time consuming, labor intensive, inflexible, not reusable and, in the end, error prone. A prevailing thought, at the time of writing this article, is that machine learning needs to be around ten times faster, from the initial steps of data gathering to market delivery! Sounds incredible, doesn’t it? MLOps has emerged with the sole purpose of bringing order to this very problematic state of things where AI/ML is concerned.

Data

For an optimal MLOps experience, the gathering, analysis and cleanup of the input data is fundamental. The quality of a resulting model is directly tied to the quality of the data it is trained on. A complete subsection of MLOps is centered just around the data pipeline.
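
As a tiny sketch of the kind of quality gate such a pipeline might apply before data ever reaches training (the column names and rules are invented for the example):

```python
import pandas as pd

def validate_and_clean(df: pd.DataFrame) -> pd.DataFrame:
    """Basic quality gates before data is released to training."""
    required = {"age", "income"}
    missing = required - set(df.columns)
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    df = df.drop_duplicates()
    df = df.dropna(subset=["age", "income"])        # no silent imputation here
    df = df[(df["age"] >= 0) & (df["age"] <= 120)]  # drop impossible values
    return df

raw = pd.DataFrame({
    "age": [25, 25, None, 300, 41],
    "income": [40_000, 40_000, 55_000, 60_000, 72_000],
})
print(validate_and_clean(raw))
```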

Model and Data Drift

What is usually omitted when talking about MLOps are the concepts of data drift and model drift. Model drift refers to the degradation of a model’s predictions as the real-world relationships it learned change over time, while data drift means the distribution of the input data we’re interested in has shifted. Both model and data drift should have listening services attached to change events that trigger automatic rebuilds or retraining of models.
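
One common and simple way to detect data drift on a numeric feature is a two-sample statistical test. Below is a sketch using the Kolmogorov-Smirnov test from SciPy, with synthetic data standing in for the reference and live distributions:

```python
import numpy as np
from scipy.stats import ks_2samp

def data_drifted(reference: np.ndarray, incoming: np.ndarray,
                 alpha: float = 0.01) -> bool:
    """Two-sample Kolmogorov-Smirnov test on one feature.

    A small p-value means the incoming distribution differs from the
    reference the model was trained on, so a retrain should be triggered.
    """
    _, p_value = ks_2samp(reference, incoming)
    return p_value < alpha

rng = np.random.default_rng(seed=42)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)
live_feature = rng.normal(loc=0.6, scale=1.0, size=5_000)  # shifted mean

if data_drifted(train_feature, live_feature):
    print("data drift detected: trigger model retraining")
```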

Project Structure

While it might not seem as important as some of the other steps mentioned in this section, project structure can significantly boost the overall productivity of all teams involved. It makes the onboarding of new team members fully transparent and ensures that everyone is on the same page when resolving new issues and tackling new challenges. A good project structure, along with good documentation and planning, provides a common language that everyone involved can speak.
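
As one illustration, a team might even script its preferred skeleton so that every repository starts from the same shape; the layout below is a common convention, not a standard:

```python
from pathlib import Path

# One common layout for ML projects; the exact folders are a convention
# choice, not a standard every team must follow.
LAYOUT = [
    "data/raw", "data/processed",
    "notebooks",            # exploratory experiments
    "src/features", "src/models", "src/pipelines",
    "tests",
    "configs",              # versioned training/deployment configs
]

def scaffold(root: Path) -> None:
    """Create the skeleton so every project starts from the same shape."""
    for folder in LAYOUT:
        (root / folder).mkdir(parents=True, exist_ok=True)
    (root / "README.md").touch()

scaffold(Path("my-ml-project"))
```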

Experimentation

Why is playing around with your data so valuable? Software developers have debugging; machine learning engineers and data scientists have experimentation. A lot can be learned from comparing experimental cycles. Depending on the client requirements, the experimental phase can be significantly prolonged if no optimized steps are applied. ML engineers need to define specific criteria that enable them to quickly iterate through different data sets and input parameters in order to shorten feedback loops and improve their workflow. The time spent waiting days or months for a model to be trained should be reduced as much as possible. Not only will this help keep costs down, it will also change the overall perspective on what an ML workflow should look like.

Some aspects to consider when looking for reduction points in experimentation:

  • Feature selection — Which features are we currently focusing on?
  • Algorithm selection — Switching between different algorithms should be as seamless as possible. This step requires quite a bit of knowledge from the engineering team
  • Hyperparameter tuning — We should be able to quickly pass in new parameters when constructing new models (see the sketch after this list)
  • Model fitting — Model fitting refers to measuring how well the trained model adapts to new data similar to the data it was trained on. This is a fine balance, since a model must walk the line between overfitting and underfitting. It is crucial in defining the relationships and patterns between the input, output and target data
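
Here is a minimal sketch of such a tuning loop: a small, explicit hyperparameter grid evaluated with cross-validation on synthetic data, so each experiment stays cheap and traceable. The grid values are arbitrary for the example:

```python
from itertools import product

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(seed=0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # synthetic target for the demo

# Small, explicit search space so each experiment run is cheap and traceable.
grid = {"C": [0.01, 0.1, 1.0], "penalty": ["l2"]}

results = []
for C, penalty in product(grid["C"], grid["penalty"]):
    model = LogisticRegression(C=C, penalty=penalty, max_iter=1_000)
    score = cross_val_score(model, X, y, cv=5).mean()
    results.append({"C": C, "penalty": penalty, "cv_accuracy": score})

for run in sorted(results, key=lambda r: r["cv_accuracy"], reverse=True):
    print(run)
```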

Metadata Versioning

Throughout this section we can observe that a lot of considerations come into play, and there’s a lot of room to differentiate the various aspects of the MLOps lifecycle with versioning and labeling. Each cog in the intricate engine that is MLOps can have its own distinct version that can potentially be reused to enhance attributes of the final deployment outcome. Each of those cogs should be interchangeable with a component of the same type that differs only in the metadata that defines it. For the final monitoring, logging and analytical steps to be complete, we need as much insight as possible into the complete roadmap, from the input data and its parameters through the model hyperparameters, in order to form calculated judgments when comparing deployed models. Every one of the aforementioned experimental runs needs its metrics tracked, and this applies equally to source code versioning in the software development aspects of MLOps.
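
Tracking tools make much of this painless. As a hedged example with MLflow (one of the tools mentioned later in this article), assuming a tracking server is configured and using illustrative parameter names:

```python
import mlflow

# Assumes a local or remote MLflow tracking server is configured;
# parameter and metric names are illustrative.
mlflow.set_experiment("churn-model")

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("data_version", "a1b2c3d4e5f6")
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("epochs", 20)
    mlflow.log_metric("val_accuracy", 0.87)
    mlflow.set_tag("git_commit", "deadbeef")  # tie the run to source code
```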

Learning From Mistakes

The complete culture and set of practices around MLOps have arisen as answers to previous mistakes made by ML engineers and data scientists. Some issues are hard to uncover in traditional ML workflows, and it’s easy to get carried away when working on a solution. Be it new algorithms, tools or inter-team communication, we tend to get distracted and compromise our initial vision or workflow, resulting in some form of issues or errors. With a good foundation in MLOps cycles, we can pinpoint many of the problematic areas relatively quickly and address them in a timely manner, without stopping the complete pipeline unless it’s some critical issue that affects everything in our solution. We can also much more effectively define checkpoints in and around the ML lifecycle with custom linting, testing and static code analysis tools. Only when we identify the errors in our solutions can we learn from them and further improve the very next iteration with fast market delivery.

Automated Model Deployment and Validation

Just the mere mention of the words ‘manual deployment’ can send shivers down the spines of software developers, and for good reason. No matter how careful we are, the room for error increases with each manual tinkering with model deployment and validation. To increase efficiency and achieve significant time savings, we need to introduce automation into our ML lifecycles. And that’s not even the end of it: with automation, we can talk about scaling our solutions based on the resources at hand. Standardization and consistency are also introduced in automated pipelines, with version control, monitoring and quality assurance closely following, all good practices that make difficult workloads more manageable. While I haven’t experimented with a lot of tooling for automated packaging and deployment of ML workflows, some tools came up a lot in my research on this topic. Tools such as MLflow, Apache Airflow, Google Cloud AI Platform, Azure ML and Kubeflow (a Kubernetes extension) all provide good out-of-the-box functionality to integrate with our solutions. While it might be tempting for an individual or company to create their own custom solution for automating model deployment and validation, we should be careful, since the pitfalls can be significant if we do not fully grasp the scale of the problems surrounding the ML lifecycle. It’s best to research each of those tools to see what best suits your own project budget and criteria.
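
To make this concrete, here is a hedged MLflow sketch: train a model, log it, then apply a simple automated gate so that only models above an accuracy bar get registered for deployment. It assumes an MLflow tracking server with a model registry is configured; the threshold and registered model name are invented:

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1_000).fit(X, y)

with mlflow.start_run() as run:
    acc = model.score(X, y)
    mlflow.log_metric("train_accuracy", acc)
    mlflow.sklearn.log_model(model, artifact_path="model")

    # Simple automated gate: only models above the bar get registered
    # for deployment. The threshold and model name are assumptions.
    if acc >= 0.9:
        mlflow.register_model(
            model_uri=f"runs:/{run.info.run_id}/model",
            name="churn-classifier",
        )
```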

Ethical Checkpoints

What happens when new models are trained on proprietary or licensed data? How many data scrapers are running through the countless data stores hosted on the web each day? I’ve briefly touched upon this subject in a previous section, but it’s worth emphasizing the importance of this MLOps component, since it’s something very specific to this domain.

User data privacy and security should be among the top priorities when we deal with such data sources for our models. Proper anonymization, de-identification and even data encryption should be made part of our MLOps workflows. Specialized auditing mechanisms can be implemented to ensure that our security measures are up to the latest industry standards.
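
As a small sketch of the anonymization step, here is one way to pseudonymize direct identifiers with salted one-way hashes before the data enters a training pipeline. The column names are invented, and a real deployment would manage the salt as a secret:

```python
import hashlib

import pandas as pd

def pseudonymize(df: pd.DataFrame, pii_columns: list[str],
                 salt: str) -> pd.DataFrame:
    """Replace direct identifiers with salted one-way hashes.

    Hashing keeps rows linkable for analysis without exposing the raw
    identity; a real deployment would manage the salt as a secret.
    """
    df = df.copy()
    for col in pii_columns:
        df[col] = df[col].map(
            lambda v: hashlib.sha256((salt + str(v)).encode()).hexdigest()[:16]
        )
    return df

users = pd.DataFrame({
    "email": ["ana@example.com", "bo@example.com"],
    "age": [34, 29],
})
print(pseudonymize(users, pii_columns=["email"], salt="rotate-me"))
```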

We should also mention the environmental impact that model training can have. Depending on the model type and training data, training can consume vast amounts of computing power and resources in general. A parallel can be drawn to the impact blockchain mining had on the graphics card (GPU) market and the environment overall. At the time of this writing, AI/ML is commanding significant resources, especially some of the more intensive LLMs.

Monitoring

And how do we know that our models are working as expected? How do they differ between incremental versions, and are they even available to our clients and customers? Of course, we can monitor them upon delivery. Monitoring and logging of interactions, input and response data provide us with the metric data we can later analyze in order to identify potential missteps, performance bottlenecks and other ways we could improve our models for the next step on our solution’s roadmap. Monitoring metrics also enables us to showcase results to potential stakeholders and work on the feedback we receive.
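
A minimal sketch of that kind of monitoring: wrap the prediction function so every call emits a structured log line with the inputs, output and latency, which downstream dashboards can aggregate. The stand-in model and version tag are illustrative:

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("model-monitor")

def monitored(predict_fn):
    """Wrap any predict function to log inputs, outputs and latency."""
    def wrapper(features: dict):
        start = time.perf_counter()
        prediction = predict_fn(features)
        latency_ms = (time.perf_counter() - start) * 1_000
        # Structured logs are easy to aggregate into dashboards later.
        log.info(json.dumps({
            "features": features,
            "prediction": prediction,
            "latency_ms": round(latency_ms, 2),
            "model_version": "v3",   # illustrative version tag
        }))
        return prediction
    return wrapper

@monitored
def predict(features: dict) -> int:
    """Stand-in model; a real one would load trained weights."""
    return int(features["score"] > 0.5)

predict({"score": 0.72})
```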

Workflow

A typical MLOps workflow would look something like the following flowchart:

MLOps workflow — SDRemthix

Benefits of MLOps

We’ll keep this section short and to the point. This is a sort-of summary of the previously mentioned attributes of MLOps, but it’s nice to hear them again in list form:

  • Scalability and Management — All parties involved adhere to the same culture and practices of creating more manageable and maintainable systems. Solutions with scalability and maintenance planned from the very beginning have a better chance of delivering on product requirements. Progress can at first seem slow, especially in the very early stages of the ML lifecycle, but it’s the foundation on which we’ll rest the succeeding layers of our solution
  • Reusability and Reproducibility — The simple truth is that we need to reproduce the outcomes of our model training under various scenarios. Reproducibility can help us better understand how the underlying data is being manipulated by the model and whether it satisfies our progress criteria. As for reusability, we can apply SOLID principles coupled with DRY programming to ensure that components are as reusable and versatile as our business domain requires them to be. It also helps us become better overall developers and produce high-value results
  • CI/CD — We need to know how to package, control and monitor our solutions, and where to place logic gateways between the coupled components of our ML solution, so that the output is as production-ready as possible. Automated steps will help each team focus on what matters most in our solution and let the infrastructure tooling do the rest. When properly set up, of course
  • Health and Governance — MLOps enables us not only to observe our models in action, but also to rapidly adjust them with various hyperparameters (depending on the model and algorithm) in order to fine-tune them effectively and ready them for production deployment
  • Ethical AI Practices — From the start, MLOps introduces the ethical component into our workflows, ensuring that sensitive real-world information impacts this industry only when and where we need it. At the time of writing this article, we are witnessing more and more AI solutions whose specific purpose is detecting other AI-generated content and reporting user rights violations or plagiarism, for instance. Early identification of possibly sensitive areas in our training and input data can elevate our ML solution and make it stand out in an emerging, competitive market

Final Note

We’ll continue with this Discovering series in the next article/episode, where we’ll look into an example MLOps project.

Are you already familiar with MLOps? What are your thoughts on the current state of machine learning delivery? As always, your feedback and thoughts are greatly valued, so feel free to leave a comment, question or message me.

Thank you for reading, and happy engineering!
