Machine Learning Operations: A Mapping Study

Abhijit Chakraborty, Suddhasvatta Das, Kevin Gary
School of Computing and Augmented Intelligence, Arizona State University
[email protected], [email protected], [email protected]

Abstract—Machine learning and AI have recently been embraced by many companies. Machine Learning Operations (MLOps) refers to the use of continuous software engineering processes, such as DevOps, in the deployment of machine learning models to production. Nevertheless, not all machine learning initiatives successfully transition to the production stage, owing to the multitude of intricate factors involved. This article discusses the issues that exist in several components of the MLOps pipeline, namely the data manipulation pipeline, the model building pipeline, and the deployment pipeline. A systematic mapping study is performed to identify the challenges that arise in an MLOps system, categorized by different focus areas. Using this data, realistic and applicable recommendations are offered for tools or solutions that can be used to address them. The main value of this work is that it maps distinctive challenges in MLOps to the recommended solutions outlined in our study. These guidelines are not specific to any particular tool and are applicable to both research and industrial settings.

Keywords—MLOps, Model Creation, Data Management, Model Deployment, Systematic Mapping Studies
I. INTRODUCTION

Machine Learning (ML) has become an important process to leverage the potential of data and allows businesses to be more innovative [1], efficient [2], and sustainable [3]. It has emerged alongside big data technologies and high-performance computing, creating new opportunities to analyze and understand data-intensive processes in various operational environments. The learning process in machine learning involves the ability of machines to learn from experience and perform tasks without being strictly programmed. However, as highlighted in [4], managing and maintaining machine learning systems poses unique challenges compared to traditional software systems. The success of many production ML applications in real-world settings falls short of expectations [5]. Many ML projects fail, with many ML proofs of concept never progressing as far as production [6]. From a research perspective, this does not come as a surprise: the ML community has focused extensively on building ML models, but not on (a) building production-ready ML products and (b) providing the necessary coordination of the resulting, often complex, ML system components and infrastructure, including the roles required to automate and operate an ML system in a real-world setting [7]. For instance, in many industrial applications, data scientists still manage ML workflows manually to a great extent, resulting in many issues during the operation of the respective ML solution [8]. As the field of artificial intelligence and machine learning continues to evolve, the need for efficient and reliable deployment of these technologies has become increasingly crucial.

The adoption of DevOps principles in software engineering has enabled developers to deliver their products more efficiently and at greater scale, and this trend is now being applied to machine learning projects, resulting in the emergence of MLOps. Machine Learning Operations, or MLOps, has emerged as a critical discipline that aims to address the unique challenges associated with running and maintaining machine learning systems in production environments.

A. DevOps for ML Systems - MLOps

DevOps is a subset of software engineering focused on tightening the coupling between the development and operation of software systems. DevOps principles advocate for end-to-end automation [17], which is expressed through the use of version control systems, automated build and deploy pipelines, etc. Some motivating factors for automation are shortening the time to delivery, increasing reproducibility, and reducing time spent on automatable processes [18]. DevOps for Machine Learning, named MLOps, is a subset of SE for ML and a superset/extension of DevOps, focusing on adopting DevOps practices when developing and operating ML systems [19]. [20] defines that "ML Ops is a cross-functional, collaborative, continuous process that focuses on operationalizing data science by managing statistical, data science, and machine learning models as reusable, highly available software artifacts, via a repeatable deployment process." [20] identifies four main steps of MLOps: Build, Manage, Deploy and Integrate, and Monitor. Fig. 1 shows the MLOps pipeline proposed in [21].

Fig 1. MLOps Pipeline [21]

B. MLOps

MLOps is the collection of techniques and tools for the deployment of ML models in production [35]. It encompasses the combination of DevOps and machine learning processes.
DevOps [36] represents the set of practices that minimize the time needed for a software release, reducing the gap between software development and operations [37], [38]. The two main principles of DevOps are Continuous Integration (CI) and Continuous Delivery (CD). Continuous integration is the practice by which software development organizations integrate code written by developer teams at frequent intervals. They constantly test their code and make small improvements each time based on the errors and weaknesses that result from the tests. This shortens the software development cycle [39]. Continuous delivery is the practice according to which there is constantly a new version of the software under development ready to be installed for testing, evaluation, and then production. With this practice, the software releases resulting from continuous integration, with their improvements and new features, reach the end users much faster [40]. After the broad acceptance of DevOps and the practices of "continuous software development" in general [41], [36], the need to apply the same principles to machine learning models became imperative [12].

MLOps attempts to automate machine learning processes using DevOps practices and approaches, seeking the rapid automated delivery benefits of DevOps. MLOps specifically attempts to apply CI and CD principles within the ML model development, integration, and deployment phases [37]. Although this seems straightforward, in reality it is not, because a machine learning model is not independent but is part of a wider software system, and it consists not only of code but also of data. As the data is constantly changing, the model must repeatedly be retrained on the new data that emerges. For this reason, MLOps introduces a new practice, in addition to CI and CD: Continuous Training (CT), which aims to automatically retrain the model where needed. From the above, it becomes clear that, compared to DevOps, MLOps is much more complex and incorporates additional procedures involving data and models [45], [33], [42].
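To make the CT loop concrete, the following is a minimal sketch, assuming a stored reference sample from training time and incoming live batches; the Kolmogorov-Smirnov test, the significance level, and the choice of retraining model are illustrative assumptions of ours, not prescriptions from the cited works.

```python
# Illustrative Continuous Training (CT) trigger; thresholds and model are assumptions.
import numpy as np
from scipy.stats import ks_2samp
from sklearn.linear_model import LogisticRegression

DRIFT_P_VALUE = 0.01  # assumed significance level for declaring feature drift


def drift_detected(reference: np.ndarray, live: np.ndarray) -> bool:
    """Flag drift when any feature's live distribution departs from the reference sample."""
    return any(
        ks_2samp(reference[:, j], live[:, j]).pvalue < DRIFT_P_VALUE
        for j in range(reference.shape[1])
    )


def continuous_training_step(model, reference_X, live_X, live_y):
    """Retrain only when incoming data no longer matches the training distribution."""
    if drift_detected(reference_X, live_X):
        model = LogisticRegression(max_iter=1000).fit(live_X, live_y)  # retrain on fresh data
    return model
```

In practice the surveyed tools wrap this control flow in richer drift tests and pipeline orchestration, but the loop is the same: monitor, compare against a reference, and retrain when the comparison fails.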
One of the key challenges in MLOps is the operationalization of the complex lifecycle of machine learning models in production. Unlike traditional software systems, which are typically governed by a well-defined set of instructions, machine learning models are constantly evolving and require careful monitoring and maintenance to ensure their continued performance and reliability [16]. As Sculley et al. [12] noted, "the long-term costs in ML systems" can be significant, with issues such as managing the power configuration for multiple models, tracking experiment results, and monitoring the entire production pipeline. Existing studies have shown that the operationalization of ML is an area that presents practitioners with real challenges [14]. Operationalization, in the context of this paper, consists of taking a trained and evaluated ML model to a serving state in the intended production environment, including necessary support functions such as monitoring. Tackling operationalization challenges requires adopting good practices and utilizing suitable tooling or solutions.

Considering the intricate challenges and evolving nature of MLOps, embracing a research approach that can effectively explore the wide range and profound aspects of the area is essential. Given this, we have chosen to conduct a Systematic Mapping Study (SMS) as our research methodology. This choice is supported by numerous compelling arguments that are in line with our aims.

An MLOps system includes various pipelines [27]. Commonly, a data manipulation pipeline (DM), a model creation pipeline (MC), and a deployment pipeline (MD) are mandatory. Each of these pipelines must be compatible with the others, in a way that optimizes flow and minimizes errors.

A. MLOps Pipelines

The MLOps (Machine Learning Operations) pipeline is a collective term for a set of three different pipelines that are essential for managing the end-to-end lifecycle of machine learning models, from development and training to deployment and monitoring. These pipelines help ensure that machine learning models are scalable, reliable, and maintainable in production environments. Each pipeline is described in detail below.

Data Management Pipeline (DM): Involves gathering, cleaning, and preprocessing the data needed to train and evaluate machine learning models. It may also involve feature engineering to extract relevant information from raw data.

Model Creation Pipeline (MC): Machine learning models are trained on the prepared data. This includes selecting appropriate algorithms, tuning hyperparameters, and evaluating model performance using techniques like cross-validation. After training, models are evaluated on validation datasets to assess performance metrics such as accuracy, precision, recall, or F1 score. This step helps ensure that models generalize well to unseen data.

Model Deployment Pipeline (MD): Once a model meets the desired performance criteria, it is deployed to production environments where it can make predictions on new data. Deployment involves packaging the model into a deployable format, integrating it with existing systems, setting up APIs for inference, and monitoring its stability.

Sustainability Pipeline (Sustainability): Focuses on integrating sustainability considerations into the development, deployment, and maintenance of machine learning models. Sustainability in this context encompasses environmental, social, and economic aspects, aiming to minimize negative impacts and maximize positive outcomes throughout the ML pipeline lifecycle. It refers to the sustainability factors of each of the three pipelines above.
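As an illustration of how the DM, MC, and MD stages hand work off to one another, the following is a minimal scikit-learn sketch; the file name, label column, and artifact path are assumptions made for the example, not details taken from the surveyed studies.

```python
# Minimal DM -> MC -> MD sketch (file name, columns, and paths are assumptions).
import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split


def data_management(path: str = "raw_data.csv") -> pd.DataFrame:
    """DM: gather, clean, and lightly preprocess the raw records."""
    return pd.read_csv(path).dropna()  # basic cleaning


def model_creation(df: pd.DataFrame, label: str = "label"):
    """MC: train, then evaluate on a held-out validation split."""
    X, y = df.drop(columns=[label]), df[label]
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
    model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
    print("validation F1:", f1_score(y_val, model.predict(X_val), average="macro"))
    return model


def model_deployment(model, artifact_path: str = "model.joblib") -> str:
    """MD: package the model so a serving layer can load it behind an inference API."""
    joblib.dump(model, artifact_path)
    return artifact_path


if __name__ == "__main__":
    model_deployment(model_creation(data_management()))
```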
This study focuses on 1) researching ML operationalization in depth, rather than as part of a broader study of SE for ML, and 2) putting more focus on tooling/solutions and infrastructure by identifying novelty in how current solutions are proposed to be used and what is reported to need further research. These objectives are fundamental for understanding the complex dynamics of MLOps and essential for guiding future academic and practical efforts. With this motivation, we framed the following research questions:

• RQ1: What are the research trends in MLOps, how many studies cover them, and how do they converge to different MLOps pipelines?

• RQ2: What kind of novelty or new ideas do these studies contribute in relation to the pipelines?

II. RELATED WORK

ML models have become increasingly prevalent in virtually all fields of business and research in the past decade. [15] reported
that half of the organizations questioned had used artificial intelligence in some commercial operations. Despite extensive research on training and evaluating machine learning models, the primary challenge for many companies and practitioners lies not in discovering new algorithms and optimizations for training, but rather in effectively deploying models to production to generate tangible business value. Most companies are still in the very early stages of incorporating ML into their business processes [16].

Considering that MLOps is a relatively young topic, it is not surprising that there are not many review publications. In this section, we first present the review articles and then discuss some of the most significant and impactful work for each task in the MLOps life cycle. Following a basic overview of MLOps by Goyal [25], Zhao [26] examines the academic literature about machine learning in production in order to determine the significance of MLOps. In addition, Zhou et al. [27] focus on the usage of resources over the whole life cycle of MLOps. Beyond reviews, a great number of papers discuss the applications of MLOps in a variety of fields. Examples include the MLOps approach to cloud-native data pipeline design by Pölöskei [28], the application of MLOps to the prediction of lifestyle diseases by Reddy et al. [29], and SensiX++: Bringing MLOps and Multi-tenant Model Serving to Sensory Edge Devices by Min et al. [30].

In terms of the various phases of MLOps, Mäkinen et al. [31] highlight the significance of MLOps in the area of data science. Their findings are based on a survey that gathered answers from 331 experts in 63 different countries. Regarding the data manipulation task, Renggli et al. [32] explain the relevance of data quality for an MLOps system while demonstrating how different characteristics of data quality propagate through the various phases of machine learning development. For the MLOps cycle, Ruf and colleagues [33] investigate the role that MLOps tools play, as well as the connections between them for each activity; additionally, they provide a formula for selecting the best open-source tools currently available. Klaise and colleagues [34] discuss monitoring and its associated issues, using current examples of production-ready solutions built with open-source tools. Finally, Tamburri [45] discusses the current difficulties and trends, with a particular emphasis on explainability and sustainability.

The contribution of this paper to the broader body of knowledge is two-fold. The first contribution is an aggregated view that categorizes the research trends under the different MLOps pipelines, together with the novel uses of existing tools and solutions in each of these pipelines proposed by existing research; this helps readers understand the potential trends. The second is to identify the gaps that the community needs to address in future research in order to develop effective strategies for operationalizing the MLOps pipelines.

III. METHODS

Systematic mapping studies (SMS), also known as scoping studies, aim to provide a comprehensive picture of a study topic by categorizing and quantifying the contributions within the categories of that area ([4], [9]). This approach entails a comprehensive analysis of the current body of literature in order to gain a deep understanding of the topic's range and the various forms of publication. Scholars in many domains use systematic mapping research, which adheres to certain principles and techniques [4]. An SMS is especially suited to offering an exhaustive overview of the research landscape, and thereby to identifying knowledge gaps that need additional research. Additionally, an SMS aligns with our goal of categorizing current information and assembling guidelines that will influence future research endeavors. During this study, we followed the systematic mapping recommendations described by [9], which apply this methodology specifically to the field of software engineering.

Following the method of Webster & Watson [22] and Kitchenham et al. [23], the mapping study was carried out using a four-step process: 1) identifying the research questions, 2) performing an extensive search for relevant literature, 3) choosing research of superior quality that matches the predetermined requirements, and 4) analyzing and consolidating data from the chosen research to uncover recurring themes and patterns.

A. Study Selection

Library Scan: The primary aim of this phase is to identify studies that are in line with the research topics previously stated. This section provides a comprehensive description of the selection process carried out by the lead investigator and later evaluated by the other authors. The year 2015 marked a pivotal moment in the evolution of Machine Learning Operations (MLOps), a crucial discipline that bridges the gap between the development and deployment of machine learning (ML) models [12]. Prior to 2015, the landscape of ML tooling and frameworks was fragmented, with organizations struggling to effectively manage the lifecycle of their machine learning systems [24]. However, the rising prominence of MLOps in 2015 signaled a transformative shift in the way organizations approached the challenges inherent in productionizing machine learning models. One of the key drivers behind the increased focus on MLOps in 2015 was the growing recognition of the unique challenges posed by deploying and maintaining machine learning systems in production environments [10].

First, we constructed a brute-force search query on Google Scholar for "Machine learning Ops" that returned more than 26,100 results. We decided to refine our search; the subsequent string was "(Machine learning Operations) AND (Machine learning Ops OR MLOps)", which brought the number of results down to 3,460. With further filtering on 'review articles', the final number stood at 269. Initially, our search was conducted on Google Scholar (which, by default, comprehensively covers the most significant databases). Subsequently, we applied semantically identical search strings to three other databases, namely the ACM Digital Library, IEEE Xplore, and Scopus. Google Scholar does not allow refined search strings that limit the search to just abstracts/titles, whereas the other three databases provide this level of refinement, so we utilized it to generate a focused list of studies. While Scopus and IEEE returned relevant results, ACM returned nothing, so the search criterion was modified from 'Title' to 'Anywhere' in the document; after applying the year filter, it yielded 289 articles. Lastly, given the number of studies retrieved from our database search, we screened them in order of relevance,
limiting our review to the result pages in each database (ten studies per page) up to the point where the relevance of the studies on a page had noticeably declined to one or no relevant documents. The search strings used for each database were:

• ACM: [[All: "machine learning operations"] OR [[All: "machine learning operations"] OR [All: "mlops"]] AND [E-Publication Date: (01/01/2015 TO 12/31/2024)]]

• IEEE: "Document Title": ("Machine learning Operations" AND ("Machine learning Ops" OR "MLOps")) AND "Publication Year": 2015-2024

• Scopus: TITLE-ABS-KEY ("Machine learning Operations" AND ("Machine learning Ops " OR "MLOps")) AND PUBYEAR > 2015 AND PUBYEAR < 2024

• Google Scholar: "Machine learning Operations" AND ("Machine learning Ops " OR "MLOps") AND (Year: 2015-2024) AND (Article Type: Review only)

Title and Abstract: All abstracts were read, and if a paper did not present information on MLOps it was excluded. This reduced the number of studies to 81. This was done concurrently with the next step.

Duplicates Removed: Publications present in multiple databases were removed manually; these comprised 34 duplicate studies. This step helped us identify the unique individual works, which came down to 49 studies from the 81 obtained in the previous step.

Full-Text Scan: The final selection of papers was based on a thorough review of the studies from the previous step, carried out by the authors. We focused on the research questions pertaining to MLOps and the overall quality of the papers. Assessing paper quality can be subjective [13], so we looked at factors such as where and when a paper was published, how often it has been cited (if it was not published in recent years), and the reputation of the venue and authors. Papers that are themselves secondary studies of MLOps concepts were not considered as part of this study.

Fig 2. Search Process

B. Data Extraction

Based on the full-text scan performed in the previous step, 32 papers were examined in this study and categorized into the three MLOps pipelines used in Table 3, which were deductively chosen from [11]. While reviewing the studies, we also observed some relevant research on sustainable ML pipelines, which covers the explainability and sustainability of the former three MLOps pipelines. This information is mapped to the various pipelines in Table 1.

TABLE 1 Metadata of Included Studies

Study Reference   Year   MLOps Pipelines
[43]              2023   MD
[44]              2017   DM
[45]              2020   Sustainability
[46]              2021   DM, MD
[47]              2022   MD
[48]              2023   DM, MD
[49]              2022   Sustainability
[50]              2022   DM, MC, MD
[51]              2022   MD
[52]              2023   DM
[53]              2024   MC
[54]              2022   MD, MC
[55]              2022   MC
[56]              2021   MD
[57]              2022   MD
[58]              2023   MD
[59]              2024   DM, MD
[60]              2023   DM
[61]              2024   Sustainability
[62]              2022   DM, MD
[63]              2021   MD
[64]              2021   MD
[65]              2023   DM, MD
[66]              2023   MD
[67]              2024   MD
[68]              2022   DM, MD
[69]              2020   MD
[70]              2021   MD
[71]              2022   DM, MD
[72]              2023   DM, MD
[73]              2023   MD
[74]              2022   MD

C. Data Aggregation

After identifying and labeling the papers, we aggregated counts of the number of publications per year over our timeframe, as shown in Fig. 3.

Fig 3. Publications vs. Year
Based on the above chart, it is evident that most of the research was performed during 2022 and 2023, as depicted in Fig. 3. This may be due either to the availability of new modelling methods or to novelty introduced within the research community. It is also interesting to observe that during this phase the most-studied pipelines are the model deployment and data management pipelines.

IV. RESULTS AND DISCUSSION

This section provides an overview of the results obtained from the SMS, including a detailed analysis of the quantitative data for each research question.

• RQ1: What are the research trends in MLOps, how many studies cover them, and how do they converge to different MLOps pipelines?

The data shown in Table 2 and Fig. 4 demonstrate a strong and growing interest within the MLOps community, indicating that MLOps has become an important field of research. Specifically, the research trend is concentrated in the model deployment (MD) category of MLOps: there is significant effort focused on addressing the issues of operationalizing machine learning models in production. This is evident, and it confirms that interest has increased with the rising adoption of MLOps in fields such as manufacturing, mining, and marketing. Most of the scientific literature has concentrated on the Model Deployment category, namely on model monitoring, managing deployment pipelines, and operations and feedback loops. These are important for integrating a model into an existing production system. Unlike a software deployment, where the user can test and optimize the new code for production needs prior to merging it, in machine learning it is highly unlikely that the deployed model's characteristics, such as performance, bias, and accuracy, can be completely tested and optimized beforehand, because the data used for this purpose is limited and controlled (collected according to specific trends or patterns), whereas in production the model has to cope with, learn from, and optimize itself on many different patterns of data. Hence, monitoring a production model is an important step of deployment: it provides the necessary inputs through the feedback loop so the model can learn and improve. This is clearly apparent, since a significant proportion of the literature, over 37%, was dedicated solely to this particular problem. Model creation and data management are the other two pipelines, completed prior to model deployment; together, but without the model deployment pipeline, they cannot ensure that a machine learning system is functioning as intended, especially in a dynamic environment like production. Since these steps are performed during initial requirements analysis and model creation, and since the teams involved in these pipeline stages often work in siloed environments with minimal interaction, continuous real-time monitoring of system activity, together with automatic reaction, is essential for ensuring long-term system stability [5].
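As a minimal illustration of the monitoring and feedback loop argued for above, the sketch below keeps a rolling window of resolved predictions and flags when live quality drops; the window size and alert threshold are illustrative assumptions, not values taken from the surveyed studies.

```python
# Illustrative production-monitoring loop (window size and threshold are assumptions).
from collections import deque
from sklearn.metrics import accuracy_score

WINDOW = 500            # number of most recent labelled predictions to evaluate
ALERT_THRESHOLD = 0.85  # minimum acceptable rolling accuracy

recent = deque(maxlen=WINDOW)  # (prediction, delayed ground-truth label) pairs


def record_outcome(prediction, label) -> bool:
    """Store one resolved prediction and report whether retraining should be triggered."""
    recent.append((prediction, label))
    if len(recent) < WINDOW:
        return False                       # not enough evidence yet
    preds, labels = zip(*recent)
    rolling_acc = accuracy_score(labels, preds)
    return rolling_acc < ALERT_THRESHOLD   # True -> feed back into the retraining pipeline
```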
TABLE 2: RESEARCH TRENDS IN DIFFERENT MLOPS PIPELINES

MLOps Pipeline      Research Trend                          Study References
Data Management     Data access and management              [48,50,62,65,72]
Data Management     Shortage of diverse data samples        [50,52,71]
Data Management     Data cleaning and validation            [44,48,50,51,60]
Data Management     Data labeling                           [44,50,59,68]
Model Creation      Feature selection                       [50,74]
Model Creation      Calculation of performance metrics      [50,53,54,55,71]
Model Creation      Algorithm & hyper-parameter selection   [50,55,74]
Model Creation      Model evaluation                        [50,54,55,59,62,71]
Model Creation      Experiment tracking                     [46,50,55,59]
Model Deployment    Model monitoring                        [46,47,48,50,51,54,57,63,66,68,69]
Model Deployment    Managing deployment pipelines           [46,50,54,56,58,64,65,66,70,71,72,73]
Model Deployment    Operations and feedback loops           [43,48,50,51,58,67,69,71,74]
Model Deployment    Incompatibilities between dev & prod    [50,51,54,72]
Sustainability      Complexity as infrastructure grows      [45,62]

Fig 4: Number of Studies per Year based on MLOps pipelines.

Although Data Management and Model Creation are acknowledged as fundamental, the comparatively low occurrence of studies examining these topics suggests that they may not be as urgent in contemporary MLOps practice, or that they are in earlier phases of research development, even though they remain important. Nevertheless, their presence in the study corpus indicates a broad understanding that MLOps encompasses more than just deploying machine learning models to production; it also includes model building and data management.

There has been a noticeable increase in both publishing output and the demand for research in several fields in 2022. This may reflect the emergence of novel machine learning technology or a reaction to industry demand for enhanced operational procedures in machine learning implementations. Based on this summary, Model Deployment (MD) attracts the predominant interest in the field of MLOps among the academic community, with research articles proliferating year over year; it comprises 52% of the studies reviewed in this work. Next is Data Management (DM), which occupies 26% of the studies. With the rise of big data technologies and computer vision, implementing ML models to obtain tangible outcomes has become an area of interest, and hence many of the challenges reported are faced in these areas.
The study of combined research findings from various years reveals that the field is actively striving to find a balance between the flexibility of machine learning model creation and the speed of agile software engineering. The many methodologies used by researchers demonstrate a research community that is flexible and quick to address the difficulties posed by machine learning settings. Although Model Deployment is the main focus of the study, the discipline acknowledges the significance of a holistic approach that encompasses Data Management and Model Creation, despite the fact that these areas need greater representation in the existing body of literature. This indicates a research community that is attentive to the current needs of MLOps and foresees its future direction.

• RQ2: What kind of novelty or new ideas do these studies contribute in relation to the pipelines?

The solutions and tools mentioned for these research trends across all sources are aggregated and categorized in Table 3. These solutions give insight into how they are proposed to be used to overcome MLOps operationalization challenges in industry and academia.

TABLE 3. MLOPS PIPELINES RESEARCH TRENDS AND NOVELTY MAPPING

MLOps Pipeline      Research Trend                          Novelty & Studies
Data Management     Data access and management              Data lake [50,62,72]
Data Management     Shortage of diverse data samples        Resampling, augmentation [52]
Data Management     Data cleaning and validation            Data scrubbing [48,51]
Data Management     Data labeling                           Use a trained model to label [50]
Model Creation      Feature selection                       Feature store [53,55]
Model Creation      Calculation of performance metrics      River [54]
Model Creation      Algorithm & hyper-parameter selection   AutoML [11]
Model Creation      Model evaluation                        Deepchecks [55,71]
Model Creation      Experiment tracking                     DVC [46]
Model Deployment    Model monitoring                        Model picker, Evidently AI [54,68,69]
Model Deployment    Managing deployment pipelines           Efficient automated pipelines right from development [46,50]
Model Deployment    Operations and feedback loops           Kubeflow [43]
Model Deployment    Incompatibilities between dev & prod    Conduct feasibility study [54]

The table depicts the growing interest in MLOps, and specifically in the areas of data management and model deployment, as critical parts beyond the standalone model creation pipeline work. Although feedback must be passed back to the model creation pipeline for continuous training and testing, it is clear that operationalization is mostly influenced by model deployment and by the availability of data for the team working on it. These factors affect machine learning project efficiency, scalability, and reliability: effective data intake pipelines merge data from several sources into a single repository, and data validation, cleansing, and enrichment, which are part of data management, are essential for model reliability, integrity, security, and compliance, with rules maintained via data governance policies.

Versioning datasets tracks changes and ensures experiment repeatability. Preprocessing raw data into features for model training is crucial. Model health depends on monitoring performance and reporting predictions, errors, and other indicators. Versioning models allows tracking, comparing, and rolling back iterations.

Streamlined data management and automated model deployment pipelines speed up model development and productionization. Effective data management and model deployment allow ML operations to scale to bigger datasets and inference loads. Scalable architectures allow models to produce forecasts in real time for many users. Reliable data management and deployment pipelines decrease failures and maintain performance, and reliable systems boost confidence in ML and decrease production downtime.

Data management and model deployment are thus essential to MLOps, improving machine learning project efficiency, scalability, dependability, reproducibility, and cost.
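To illustrate the dataset and model versioning practice referred to above, here is a small, tool-agnostic sketch that records a content hash of the training data alongside each model artifact; dedicated tools listed in Table 3, such as DVC or a model registry, provide the same guarantees with much richer workflows, and the file names here are assumptions.

```python
# Tool-agnostic versioning sketch (file names are assumptions); purpose-built tools
# such as DVC or a model registry offer the same traceability with richer features.
import hashlib
import json
import time


def file_digest(path: str) -> str:
    """Content hash of a file, used as its version identifier."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def register_model(model_path: str, dataset_path: str, registry: str = "registry.json") -> dict:
    """Append an immutable record linking a model artifact to the data it was trained on."""
    record = {
        "model": model_path,
        "model_version": file_digest(model_path),
        "dataset_version": file_digest(dataset_path),
        "created": time.strftime("%Y-%m-%dT%H:%M:%S"),
    }
    try:
        with open(registry) as f:
            entries = json.load(f)
    except FileNotFoundError:
        entries = []
    entries.append(record)
    with open(registry, "w") as f:
        json.dump(entries, f, indent=2)
    return record
```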
V. LIMITATIONS

Every mapping study has limitations; we address ours here.

External Validity: This is the generalizability of the findings of a scientific study beyond its specific setting. Put simply, it refers to the degree to which the findings of a study may be applied or transferred to different contexts, individuals, stimuli, and time periods. Generalizability pertains to the extent to which a predetermined sample may be applied to a wider population, whereas transportability refers to the extent to which one sample can be applied to another target population. To establish external validity, we diligently determined the "scope" of the study, that is, the extent to which the theory or argument of this study can be applied and its restrictions. We made a concerted effort to include a diverse range of studies covering various aspects of MLOps. However, it is important to acknowledge that there are other variations in research trends, and some regions might not fully represent global practices.

Construct Validity: Construct validity concerns whether the methods and measures accurately reflect the studied phenomena. Our systematic mapping study aimed to provide a comprehensive overview of MLOps research trends. The diverse definitions and practices within MLOps may have led to variations in how it is conceptualized and implemented. Furthermore, researcher bias and the reliance on published literature, excluding unpublished or non-peer-reviewed studies, also threaten the comprehensiveness of our findings.

Conclusion Validity: This threat concerns the ability to draw accurate conclusions from the data. Our study's conclusions are based on the analysis of 32 studies, which, while comprehensive, might cover only part of the relevant research. The potential for publication bias, where only positive or significant results are published, could skew our findings. Furthermore, given the evolving nature of MLOps and of its implementation practices, the study may need to be revisited in the future to include new research.

Additional Limitations: One notable limitation is the potential for selection bias in our study selection process, as the inclusion criteria and databases used may have inadvertently excluded relevant studies. We also recognize that the rapid evolution of MLOps means that our study may not capture the most recent developments in the field.
VI. CONCLUSIONS

The most effective method for incorporating machine learning models into production is to apply MLOps. Each year, a greater number of businesses use these strategies, and more research is conducted in this field. A fully developed MLOps system that employs continuous training has the potential to lead us to machine learning models that are both more efficient and more realistic. Additionally, selecting the appropriate tools for each individual task is an ongoing issue. Although there are a great number of papers pertaining to the many tools, it is not simple to adhere to the principles and apply them in the most effective manner. There are times when we are forced to decide between flexibility and resilience, each of which comes with its own set of advantages and disadvantages. Monitoring is the last step, and it is one of the most important points of interest that must be considered. Monitoring the condition of the whole system in terms of sustainability, robustness, fairness, and explainability is, in our opinion, the most important factor in developing mature, automated, robust, and efficient MLOps systems. Considering this, it is of the utmost importance to create models and methods capable of enabling this sort of monitoring, such as explainable machine learning models. Future work would address building explainable machine learning models for production.

VII. FUTURE WORK

There is a considerable research gap in the sustainability and monitoring of MLOps pipelines: the development of comprehensive frameworks and standardized metrics for assessing and benchmarking the environmental impact of machine learning models throughout their lifecycle.

Standardized Sustainability Metrics: While there is growing awareness of the need for sustainable AI practices, there is a lack of standardized metrics for measuring the environmental impact of machine learning models. Research is needed to develop comprehensive frameworks that consider factors such as energy consumption, carbon emissions, and resource utilization across different stages of the MLOps pipeline.
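As a rough indication of what such a metric could look like, the sketch below estimates training emissions from average power draw, runtime, and a grid carbon-intensity factor; all numeric values are illustrative assumptions rather than standardized figures.

```python
# Back-of-the-envelope emissions estimate (all numbers below are illustrative assumptions).
def training_emissions_kgco2(avg_power_watts: float,
                             runtime_hours: float,
                             grid_kgco2_per_kwh: float) -> float:
    """Energy (kWh) = power (kW) * time (h); emissions = energy * grid carbon intensity."""
    energy_kwh = (avg_power_watts / 1000.0) * runtime_hours
    return energy_kwh * grid_kgco2_per_kwh


# Example: a 300 W accelerator running for 12 hours on a 0.4 kgCO2/kWh grid.
print(round(training_emissions_kgco2(300, 12, 0.4), 2), "kg CO2")  # -> 1.44 kg CO2
```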
Lifecycle Assessment Tools: Existing tools for assessing the environmental impact of machine learning models often focus on specific stages of the lifecycle, such as training or inference. There is a need for integrated lifecycle assessment tools that provide a holistic view of the environmental footprint of MLOps pipelines, from data collection to model deployment and decommissioning.

Real-time Monitoring and Feedback: Real-time monitoring and feedback mechanisms are essential for detecting inefficiencies and optimizing sustainability in MLOps pipelines. Research is needed to develop automated monitoring tools that provide real-time insights into energy consumption, carbon emissions, and resource utilization, enabling proactive optimization and decision-making.

Addressing these research gaps will contribute to the development of more sustainable and environmentally responsible MLOps pipelines, enabling organizations to minimize their carbon footprint and contribute to a more sustainable future.

REFERENCES

[1] M. Aykol, P. Herring and A. Anapolsky, "Machine learning for continuous innovation in battery technologies", Nature Rev. Mater., vol. 5, no. 10, pp. 725-727, Jun. 2020.
[2] M. K. Gourisaria, R. Agrawal, G. M. Harshvardhan, M. Pandey and S. S. Rautaray, "Application of machine learning in industry 4.0", in Machine Learning: Theoretical Foundations and Practical Applications, Cham, Switzerland: Springer, pp. 57-87, 2021.
[3] A. D. L. Heras, A. Luque-Sendra and F. Zamora-Polo, "Machine learning technologies for sustainability in smart cities in the post-COVID era", Sustainability, vol. 12, no. 22, p. 9320, Nov. 2020.
[4] Petersen, K., Feldt, R., Mujtaba, S., & Mattsson, M. (2008). Systematic mapping studies in software engineering. In 12th International Conference on Evaluation and Assessment in Software Engineering (Vol. 17, p. 1); B. Rieder, Engines of Order: A Mechanology of Algorithmic Techniques. Amsterdam, Netherlands: Amsterdam Univ. Press, 2020.
[5] R. Kocielnik, S. Amershi and P. N. Bennett, "Will you accept an imperfect AI?: Exploring designs for adjusting end-user expectations of AI systems", Proc. CHI Conf. Hum. Factors Comput. Syst., pp. 1-14, May 2019.
[6] R. van der Meulen and T. McCall, Gartner Says Nearly Half of CIOs Are Planning to Deploy Artificial Intelligence, Dec. 2018. [Online]. Available: https://www.gartner.com/en/newsroom/press-releases/2018-02-13-gartner-says-nearly-half-of-cios-are-planning-to-deploy-artificial-intelligence
[7] A. Posoldova, "Machine learning pipelines: From research to production", IEEE Potentials, vol. 39, no. 6, pp. 38-42, Nov. 2020.
[8] L. E. Lwakatare, I. Crnkovic, E. Rånge and J. Bosch, "From a data science driven process to a continuous delivery process for machine learning systems", in Product-Focused Software Process Improvement, Springer, vol. 12562, pp. 185-201, 2020.
[9] Kitchenham, B., Brereton, P., & Budgen, D. (2010, May). The educational value of mapping studies of software engineering literature. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1 (pp. 589-598).
[10] Zhao, Y. (2021). Machine learning in production: A literature review. Tech. Rep.
[11] Symeonidis, G., Nerantzis, E., Kazakis, A., & Papakostas, G. A. (2022, January). MLOps - definitions, tools and challenges. In 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC) (pp. 0453-0460). IEEE.
[12] Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner, D., ... & Dennison, D. (2015). Hidden technical debt in machine learning systems. Advances in Neural Information Processing Systems, 28.
[13] Kitchenham, B. A., Budgen, D., & Brereton, O. P. (2010, April). The value of mapping studies - A participant-observer case study. In 14th International Conference on Evaluation and Assessment in Software Engineering (EASE). BCS Learning & Development.
[14] Paleyes, A., Urma, R.-G., & Lawrence, N. D. (2020). Challenges in deploying machine learning: A survey of case studies. The ML-Retrospectives, Surveys & Meta-Analyses Workshop, NeurIPS 2020, arXiv:2011.09926. [Online]. Available: https://ui.adsabs.harvard.edu/abs/2020arXiv201109926P
[15] The state of AI in 2020. (2020). [Online]. Available: https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/global-survey-the-state-of-ai-in-2020
[16] Schlögl, S., Postulka, C., Bernsteiner, R., & Ploder, C. (2019). Artificial Intelligence Tool Penetration in Business: Adoption, Challenges and Fears. Springer International Publishing, 259-270. https://doi.org/10.1007/978-3-030-21451-7_22
[17] Ebert, C., Gallardo, G., Hernantes, J., & Serrano, N. (2016). DevOps. IEEE Software, 33(3), 94-100. https://doi.org/10.1109/ms.2016.68
[18] Nicolau de França, B. B., Jeronimo, H., & Travassos, G. H. (2016). Characterizing DevOps by Hearing Multiple Voices. ACM Press. https://doi.org/10.1145/2973839.2973845
[19] Soh, J., & Singh, P. (2020). Machine Learning Operations. In Data Science Solutions on Azure. Apress, 259-279. https://doi.org/10.1007/978-1-4842-6405-8_8
[20] Sweenor, D., Hillion, S., Rope, D., Kannabiran, D., Hill, T., & O'Connell, M. (2020). ML Ops: Operationalizing Data Science. O'Reilly Media. [Online]. Available: https://go.oreilly.com/queensland-university-of-technology/library/view/-/9781492074663/?ar7
[21] MLOps pipeline. (2022). [Online]. Available: https://ml-ops.org/content/mlops-principles
[22] J. Webster and R. Watson, "Analyzing the past to prepare for the future: Writing a literature review", MIS Quart., vol. 26, no. 2, pp. 8-23, 2002. [Online]. Available: https://www.jstor.org/stable/4132319
[23] B. Kitchenham, O. P. Brereton, D. Budgen, M. Turner, J. Bailey and S. Linkman, "Systematic literature reviews in software engineering - A systematic literature review", Inf. Softw. Technol., vol. 51, no. 1, pp. 7-15, Jan. 2009.
[24] Lochmiller, C. (2021). Conducting Thematic Analysis with Qualitative Data. The Qualitative Report (Jun. 2021). https://doi.org/10.46743/2160-3715/2021.5008
[25] A. Goyal, "Machine learning operations", International Journal of Information Technology Insights Transformations [ISSN: 2581-5172 (online)], vol. 4, 2020.
[26] Y. Zhao, Machine learning in production: A literature review, 2021. [Online]. Available: https://scholar.google.com/
[27] Y. Zhou, Y. Yu and B. Ding, "Towards MLOps: A case study of ML pipeline platform", Proceedings - 2020 International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2020), pp. 494-500, Oct. 2020.
[28] I. Pölöskei, "MLOps approach in the cloud-native data pipeline design", Acta Technica Jaurinensis, 2021. [Online]. Available: https://acta.sze.hu/index.php/acta/article/view/581
[29] M. Reddy, B. Dattaprakash, S. S. Kammath, S. KN and S. Manokaran, "Application of MLOps in prediction of lifestyle diseases", SPAST Abstracts, vol. 1, 2021.
[30] C. Min, A. Mathur, U. G. Acer, A. Montanari and F. Kawsar, SensiX++: Bringing MLOps and multi-tenant model serving to sensory edge devices, Sep. 2021. [Online]. Available: https://arxiv.org/abs/2109.03947v1
[31] S. Mäkinen, H. Skogström, V. Turku, E. Laaksonen and T. Mikkonen, Who needs MLOps: What data scientists seek to accomplish and how can MLOps help?, 2021.
[32] C. Renggli, L. Rimanic, N. M. Gürel, B. Karlaš, W. Wu, C. Zhang, et al., A data quality-driven view of MLOps, Feb. 2021. [Online]. Available: https://arxiv.org/abs/2102.07750v1
[33] P. Ruf, M. Madan, C. Reich and D. Ould-Abdeslam, "Demystifying MLOps and presenting a recipe for the selection of open-source tools", Applied Sciences, vol. 11, p. 8861, 2021.
[34] J. Klaise, A. V. Looveren, C. Cox, G. Vacanti and A. Coca, Monitoring and explainability of models in production, Jul. 2020. [Online]. Available: https://arxiv.org/abs/2007.06299v1
[35] S. Alla and S. K. Adari, "What is MLOps?", in Beginning MLOps with MLFlow, Springer, pp. 79-124, 2021.
[36] S. Sharma, The DevOps Adoption Playbook: A Guide to Adopting DevOps in a Multi-Speed IT Enterprise, IBM Press, pp. 34-58.
[37] B. Fitzgerald and K.-J. Stol, "Continuous software engineering: A roadmap and agenda", Journal of Systems and Software, vol. 123, pp. 176-189, Jan. 2017.
[38] N. Gift and A. Deza, Practical MLOps: Operationalizing Machine Learning Models, O'Reilly Media, Inc., 2020.
[39] E. Raj, MLOps Using Azure Machine Learning: Rapidly Test, Build, and Manage Production-Ready Machine Learning Life Cycles at Scale, Packt Publishing Limited, pp. 45-62, 2021.
[40] I. Karamitsos, S. Albarhami and C. Apostolopoulos, "Applying DevOps practices of continuous automation for machine learning", Information, vol. 11, p. 363, 2020.
[41] B. Fitzgerald and K.-J. Stol, "Continuous software engineering and beyond: Trends and challenges", Proceedings of the 1st International Workshop on Rapid Continuous Software Engineering (RCoSE 2014), vol. 14, 2014.
[42] M. Treveil and the Dataiku Team, Introducing MLOps: How to Scale Machine Learning in the Enterprise, p. 185, 2020.

INCLUDED STUDIES

[43] Kreuzberger, D., Kühl, N., & Hirschl, S. (2023). Machine learning operations (MLOps): Overview, definition, and architecture. IEEE Access.
[44] Kumar, A., Boehm, M., & Yang, J. (2017, May). Data management in machine learning: Challenges, techniques, and systems. In Proceedings of the 2017 ACM International Conference on Management of Data (pp. 1717-1722).
[45] Tamburri, D. A. (2020, September). Sustainable MLOps: Trends and challenges. In 2020 22nd International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC) (pp. 17-23). IEEE.
[46] John, M. M., Olsson, H. H., & Bosch, J. (2021, September). Towards MLOps: A framework and maturity model. In 2021 47th Euromicro Conference on Software Engineering and Advanced Applications (SEAA) (pp. 1-8). IEEE.
[47] Lima, A., Monteiro, L., & Furtado, A. P. (2022). MLOps: Practices, maturity models, roles, tools, and challenges - A systematic literature review. ICEIS (1), 308-320.
[48] Diaz-De-Arcaya, J., Torre-Bastida, A. I., Zarate, G., Minon, R., & Almeida, A. (2023). A joint study of the challenges, opportunities, and roadmap of MLOps and AIOps: A systematic survey. ACM Computing Surveys, 56(4), 1-30.
[49] Mboweni, T., Masombuka, T., & Dongmo, C. (2022, July). A systematic review of machine learning DevOps. In 2022 International Conference on Electrical, Computer and Energy Technologies (ICECET) (pp. 1-6). IEEE.
[50] Siddappa, P. (2022). Towards addressing MLOps pipeline challenges: Practical guidelines based on a multivocal literature review.
[51] Kolltveit, A. B., & Li, J. (2022, May). Operationalizing machine learning models: A systematic literature review. In Proceedings of the 1st Workshop on Software Engineering for Responsible AI (pp. 1-8).
[52] Singh, P. (2023). Systematic review of data-centric approaches in artificial intelligence and machine learning. Data Science and Management.
[53] Bachinger, F., Zenisek, J., & Affenzeller, M. (2024). Automated machine learning for industrial applications - Challenges and opportunities. Procedia Computer Science, 232, 1701-1710.
[54] Shivashankar, K., & Martini, A. (2022, August). Maintainability challenges in ML: A systematic literature review. In 2022 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA) (pp. 60-67). IEEE.
[55] Mohseni, S., Wang, H., Xiao, C., Yu, Z., Wang, Z., & Yadawa, J. (2022). Taxonomy of machine learning safety: A survey and primer. ACM Computing Surveys, 55(8), 1-38.
[56] Xie, Y., Cruz, L., Heck, P., & Rellermeyer, J. S. (2021, May). Systematic mapping study on the machine learning lifecycle. In 2021 IEEE/ACM 1st Workshop on AI Engineering - Software Engineering for AI (WAIN) (pp. 70-73). IEEE.
[57] Calefato, F., Lanubile, F., & Quaranta, L. (2022, September). A preliminary investigation of MLOps practices in GitHub. In Proceedings of the 16th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (pp. 283-288).
[58] Barry, M., Bifet, A., & Billy, J. L. (2023, May). StreamAI: Dealing with challenges of continual learning systems for serving AI in production. In 2023 IEEE/ACM 45th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) (pp. 134-137). IEEE.
[59] Shankar, S., Garcia, R., Hellerstein, J. M., & Parameswaran, A. G. (2024). "We have no idea how models will behave in production until production": How engineers operationalize machine learning. Proceedings of the ACM on Human-Computer Interaction, 8(CSCW1), 1-34.
[60] Tu, D., He, Y., Cui, W., Ge, S., Zhang, H., Han, S., ... & Chaudhuri, S. (2023, August). Auto-Validate by-History: Auto-program data quality constraints to validate recurring data pipelines.
[61] Chadli, K., Botterweck, G., & Saber, T. (2024, April). The environmental cost of engineering machine learning-enabled systems: A mapping study. In Proceedings of the 4th Workshop on Machine Learning and Systems (pp. 200-207).
[62] Baumann, N., Kusmenko, E., Ritz, J., Rumpe, B., & Weber, M. B. (2022, October). Dynamic data management for continuous retraining. In Proceedings of the 25th International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings (pp. 359-366).
[63] Mäkinen, S., Skogström, H., Laaksonen, E., & Mikkonen, T. (2021, May). Who needs MLOps: What data scientists seek to accomplish and how can MLOps help?. In 2021 IEEE/ACM 1st Workshop on AI Engineering - Software Engineering for AI (WAIN) (pp. 109-112). IEEE.
[64] Garg, S., Pundir, P., Rathee, G., Gupta, P. K., Garg, S., & Ahlawat, S. (2021, December). On continuous integration/continuous delivery for automated deployment of machine learning models using MLOps. In 2021 IEEE Fourth International Conference on Artificial Intelligence and Knowledge Engineering (AIKE) (pp. 25-28). IEEE.
[65] Heydari, M., & Rezvani, Z. (2023, November). Challenges and experiences of Iranian developers with MLOps at enterprise. In 2023 7th Iranian Conference on Advances in Enterprise Architecture (ICAEA) (pp. 26-32). IEEE.
[66] Bodor, A., Hnida, M., & Najima, D. (2023, November). From development to deployment: An approach to MLOps monitoring for machine learning model operationalization. In 2023 14th International Conference on Intelligent Systems: Theories and Applications (SITA) (pp. 1-7). IEEE.
[67] Sirisha, N., Kiran, A., Arshad, M., & Mounika, M. (2024, January). Automating ML models using MLOps. In 2024 International Conference on Advancements in Smart, Secure and Intelligent Computing (ASSIC) (pp. 1-5). IEEE.
[68] Symeonidis, G., Nerantzis, E., Kazakis, A., & Papakostas, G. A. (2022, January). MLOps - definitions, tools and challenges. In 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC) (pp. 0453-0460). IEEE.
[69] Zhou, Y., Yu, Y., & Ding, B. (2020, October). Towards MLOps: A case study of ML pipeline platform. In 2020 International Conference on Artificial Intelligence and Computer Engineering (ICAICE) (pp. 494-500). IEEE.
[70] Granlund, T., Kopponen, A., Stirbu, V., Myllyaho, L., & Mikkonen, T. (2021, May). MLOps challenges in multi-organization setup: Experiences from two real-world cases. In 2021 IEEE/ACM 1st Workshop on AI Engineering - Software Engineering for AI (WAIN) (pp. 82-88). IEEE.
[71] Testi, M., Ballabio, M., Frontoni, E., Iannello, G., Moccia, S., Soda, P., & Vessio, G. (2022). MLOps: A taxonomy and a methodology. IEEE Access, 10, 63606-63618.
[72] John, M. M., Olsson, H. H., Bosch, J., & Gillblad, D. (2023, December). Exploring trade-offs in MLOps adoption. In 2023 30th Asia-Pacific Software Engineering Conference (APSEC) (pp. 369-375). IEEE.
[73] Hegedűs, C., & Varga, P. (2023, October). Tailoring MLOps techniques for Industry 5.0 needs. In 2023 19th International Conference on Network and Service Management (CNSM) (pp. 1-7). IEEE.
[74] Subramanya, R., Sierla, S., & Vyatkin, V. (2022). From DevOps to MLOps: Overview and application to electricity market forecasting. Applied Sciences, 12(19), 9851.