Training Future Machine Learning Engineers: A Project-Based Course on MLOps


FOCUS: AI IN SOFTWARE ENGINEERING EDUCATION AND TRAINING

Filippo Lanubile, University of Bari
Silverio Martínez-Fernández, Universitat Politècnica de Catalunya
Luigi Quaranta, University of Bari

// In this paper, we present an overview of a project-based course on MLOps by showcasing a couple of sample projects developed by our students. Additionally, we share the lessons learned from offering the course at two different institutions. //

Digital Object Identifier 10.1109/MS.2023.3310768
Date of current version: 22 February 2024

AS MACHINE LEARNING (ML)-based systems become increasingly complex, there is a growing demand from the industry for ML engineers, also known as artificial intelligence (AI) engineers.1 ML engineers are professionals—trained in both software engineering (SE) and ML—who can handle the end-to-end process of building and maintaining production-ready ML components; to accomplish this, they leverage a varied array of practices and tools generally recognized under the umbrella term of MLOps.

Despite the growing job market demand, at present, there is almost no offer of specialized academic curricula for ML engineers (with a few notable exceptions, e.g., the Masters of AI Engineering offered at Carnegie Mellon University (CMU)). Indeed, most AI university programs—mainly concerned with teaching state-of-the-art ML techniques—miss the opportunity to train future ML specialists on building production-grade components out of ML prototypes. For this reason, aspiring ML engineers need to look beyond university programs to learn the new craft and, indeed, several specialized online courses have been published lately, compensating for the absence of academic options.

Even so, it is still challenging for companies to recruit qualified ML engineers, with costly repercussions on their ability to bring ML products to the market. For instance, in a recent survey conducted by Algorithmia, many firms acknowledged facing significant challenges in the deployment of their models, despite the substantial effort and significant investments.2 Christian Kästner—author of the book Machine Learning in Production3 and lecturer of the homonymous course at CMU—argues that "'software engineering for ML' is more of an education problem than a research problem":4 most blocking challenges that practitioners experience in the field would be solved by empowering data scientists with software engineering knowledge. In the same line of thought, as software engineering educators, we believe there is a pressing need to train

IEEE SOFTWARE | PUBLISHED BY THE IEEE COMPUTER SOCIETY | 0740-7459/24 © 2024 IEEE
ML specialists on SE best practices and tools. Hence—looking forward to having a specialized degree program on ML engineering offered at our universities—we have started exploring the feasibility and outcomes of teaching MLOps fundamentals in a three-month university course. See "Industrial Trends on MLOps."

INDUSTRIAL TRENDS ON MLOPS

ML engineers have been actively participating in MLOps communities, fostering mutual support as they tackle work challenges together. They have also contributed crowd-sourced lists of MLOps resources and tools, like "Awesome MLOps" (https://github.com/visenger/awesome-mlops). "MLOps Community" (https://home.mlops.community/) is a prominent example of an online hub where MLOps professionals gather to share their practical experiences, learn new skills, and collaborate on projects. Visiting the community website is a great way to stay updated on the latest trends in MLOps. For instance, at present, the community is primarily focused on the challenge of managing large language models in production.

Regarding MLOps tools, the latest trends can also be inferred by checking curated lists like MLOps.toys (https://mlops.toys). Notably, several of the tools available there are publicly contributed on GitHub, which indicates an increasing and needed involvement of the open source community in this space.

Designing a Project-Based Course on MLOps
In 2021, we decided to explore if our students from the master's program on computer science at the University of Bari (Uniba), Italy, and the bachelor's program on data science at the Universitat Politècnica de Catalunya (UPC), Barcelona, Spain, could successfully build and deploy ML-based components while training on MLOps fundamentals. With this goal in mind, we designed a project-based course on MLOps,7 focused on the demonstration of and hands-on experience with MLOps solutions for the end-to-end development of ML-based components. We offered the first edition of the course at Uniba in Fall 2021. Upon collecting encouraging feedback from the students, in Fall 2022, we offered a second edition at both Uniba and UPC. During this second edition, we decided to collect data from the students to evaluate our course, answering the following questions:

Q1: How well does our MLOps course align with student expectations?
Q2: How do students perceive the usefulness of the content and teaching methodology employed in our course?

Course Evaluation
As general evidence of course effectiveness, we considered the ability of the students to carry out their project activities end-to-end, meeting all deadlines. To measure how the students perceived the benefits and, more in general, the experience of project-based learning of MLOps, we conducted a survey-based feedback study. Specifically, we administered a first survey at the beginning of the course—to gauge the students' initial knowledge and expectations—and a second survey at project delivery—to evaluate final impressions. (The two surveys used in this study are available at 10.5281/zenodo.8026803.)

We summarized quantitative answers with descriptive statistics and open-ended ones with thematic analysis. In the latter case, we adopted a focused coding approach: After familiarizing ourselves with the collected data, we defined a tentative set of codes. Then, we selectively coded relevant segments of text, constantly refining the initial codes and taking note of the most interesting excerpts. Finally, we grouped codes into themes and reviewed the analysis results with the whole team.

Course Design
In both editions of the course, we focused on six core skills typically expected from ML engineers:

1. scoping a real-world ML problem and coordinating teamwork
2. ensuring ML pipeline reproducibility
3. fostering quality assurance (QA)
4. developing an application programming interface (API) for ML
5. delivering an ML component
6. keeping the feedback loop.

We asked our students to work in teams of three to five people to turn a prototypical ML model into a production-ready ML component. Here, by "production-ready," we mean a component that can be easily integrated into a production-grade

system and effortlessly maintained over time. As such, we expect it to:

• be the product of a reproducible build process that can be fully automated with CI/CD tools
• have production-grade quality, i.e., be properly tested and checked with QA tools
• expose a cross-platform API and be packaged in a portable way.

One of the challenges we faced while designing the course was the selection of tools to exemplify MLOps implementation. Not only are the related practices still consolidating and far from being standardized, but also the multitude of available MLOps tools keeps evolving at a stunningly fast pace (see Figure 1). To reach our final selection, for each MLOps practice, we considered the following criteria: our picks had to be 1) preferably open source, 2) popular in the MLOps community, 3) well-documented, and 4) easy to learn (see "Industrial Trends on MLOps"). We left our students free to explore other options anyway and make their own informed decisions, regardless of our choices.

FIGURE 1. This figure roughly depicts the vast technology landscape of MLOps. We group a small sample of the existing tools by the skills addressed in our course; for each skill, our picks are highlighted in color, while possible alternatives are in grayscale. The tools shown, by skill:
• Scoping an ML Problem and Coordinating Teamwork: Model and Dataset Cards; Trello; Slack
• Ensuring ML Pipeline Reproducibility: Cookiecutter DS; Git; DVC; MLflow
• Fostering QA: Pylint; Pynblint; Pytest; Code Carbon; GE
• API Development: FastAPI; OpenAPI
• Component Delivery: Docker; Compose; GitHub Actions
• Keeping the Feedback Loop: Better Uptime; Prometheus; Grafana

We organized the course and projects into six milestones, corresponding to the aforementioned ML engineering skills. In the following paragraphs, we will go through each skill, motivating its importance and showing how a couple of student teams applied the related practices in their work. Their projects are just representative examples of several other projects developed by our students. Being based on particular ML models—freely selected by the teams at the course start—each project posed specific challenges and inspired distinct solutions. Often, the students went beyond our demonstrations, adopting additional or alternative tools to meet their specific project requirements. For the benefit of all, we asked each team to report their experience to the class in biweekly retrospective meetings.

Scoping an ML problem and coordinating teamwork. At the beginning of the course, we asked all teams to set up communication and collaboration platforms to coordinate their work. Then, we tasked them with scoping a real-world problem to be solved with ML and selecting (or building) a prototypical model.

Effective communication is crucial in collaborative software development. Defining clear guidelines in this regard helps team members stay consistent in how they share information, for the benefit of team awareness and information retrieval. In class, we demonstrated the use of Microsoft Teams (https://www.microsoft.com/en-us/microsoft-teams/group-chat-software) (Uniba) and Slack (https://slack.com) (UPC) for synchronous communication, while for project coordination, we demoed a Kanban-style board using Trello (https://trello.com).

Once all teams had selected or built their model, we demonstrated how to document it using "model cards." This lightweight approach to model specification, originally proposed by Mitchell et al.8 and lately popularized by Hugging Face (https://huggingface.co/docs/hub/model-cards), consists of templates providing a structured description of models, including the ML algorithm, training dataset, and use cases.
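To make this concrete, here is a minimal sketch of a model card in the Hugging Face style. The section layout loosely follows the template fields mentioned above; the metadata values, dataset size, and metrics are invented for illustration and do not describe any student project.

```markdown
---
# YAML metadata header (all values hypothetical)
license: mit
tags: [image-classification, cnn]
datasets: [handwritten-symbols]
---

# Model Card: symbol-classifier

## Model Details
Convolutional neural network classifying handwritten symbols
in 28 × 28 grayscale images.

## Training Data
Labeled images of isolated handwritten symbols (size and source
would be documented here).

## Intended Use
Symbol recognition in scanned handwritten notes; not intended
for full formula parsing.

## Metrics
Accuracy and macro-averaged F1 on a held-out test split.
```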

A similar approach can be employed for the specification of datasets (https://huggingface.co/docs/hub/datasets-cards). We chose this solution because, with sustainable effort, it allows for concise reporting of all relevant aspects of an ML project.

Example projects. Here, we briefly introduce a couple of sample projects—as reported in the corresponding model cards—that will be referenced throughout this article.

Math Symbol CNN (MS), a team of students from Uniba, built a computer vision (CV) system for the classification of mathematical symbols (e.g., +, sin, and log) in low-resolution images (i.e., 28 × 28 pixels) of handwritten text. To this aim, they leveraged the refined version of a convolutional neural network built for another course.

Crystal Gazers (CG), a team of students from UPC, built a natural language processing (NLP) application aimed at predicting the omitted word in a sentence based on the context provided by surrounding words. The students employed a transformer model trained from scratch on a dataset of Wikipedia articles in Catalan.

Ensuring ML pipeline reproducibility. Reproducibility is a key requirement in ML projects: not only is it important to get consistent performances—in production as in the lab—but also to enable the recovery and timely retraining of deployed models. However, achieving reproducibility in ML is challenging. We address this topic in our second course milestone, aimed at providing students with the knowledge and skills required to build reproducible ML pipelines.

A first step toward reproducibility is embracing version control. Concerning code, in class, we exemplified the use of git—the de facto standard version control system—with GitHub, the most popular platform for git repository hosting. Also, we recommended using the GitHub flow,9 a lightweight, branch-based workflow for collaborative software development. Despite knowing git, several students admitted to not using version control in data science projects. For instance, they would normally ignore Jupyter notebooks, as the related diffs are hard to read. We emphasized the importance of versioning all code artifacts and recommended using modern editors or specialized Jupyter extensions, like nbdime, for improved notebook diff display.

On the other hand, versioning data is more challenging than versioning code. Different data formats (e.g., text, images) require specialized versioning mechanisms; moreover, storing and retrieving data are harder due to the larger file sizes. In class, we showed how these challenges can be overcome using specialized tools; a popular example is DVC (https://dvc.org), an open source platform used to version large data files (datasets and models) and back them up to cloud remotes.

Experiment tracking is another reproducibility keystone. Being able to trace back experimental decisions is crucial to identify and reproduce the best experimental paths. To support this practice, we demonstrated MLflow Tracking (https://mlflow.org), a popular open source solution. Besides the Tracking module, the "MS" team leveraged MLflow's Registry module to save models in a centralized store; moreover, they employed DagsHub (https://dagshub.com/about)—a cloud hosting platform for data science projects—as a remote for both DVC and MLflow. Other teams preferred tracking their experiments with Tensorboard, mainly because of its tight integration with Tensorflow.

Fostering QA. Previous research has found that the quality of code in experimental ML artifacts is generally poor, especially in the case of computational notebooks.10,11 Similarly, model performance is known to be largely affected by the quality of training data, which is far from ideal in real-world scenarios. Our third course milestone focuses on QA, aiming to provide the students with practical guidance for quality improvement.

To ensure production-grade quality for artifacts developed in the lab, data scientists need to modularize and test their code and check it with static analyzers. Our students straightforwardly incorporated the recommended QA tools into their pipelines. For instance, after consolidating experimental notebooks into a pipeline of Python scripts, the "MS" team used Pylint to statically analyze their code and a combination of Pytest and unittest to test it.
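As an illustration of this step, the sketch below shows the kind of Pytest unit test a team might write once notebook code has been consolidated into plain Python functions. The `scale_pixels` function and its tests are hypothetical examples of ours, not code from the student projects.

```python
# test_preprocessing.py - run with `pytest`.
# Hypothetical preprocessing step: scale 8-bit pixel values to [0, 1].

def scale_pixels(pixels):
    """Scale a list of 8-bit pixel intensities to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]

def test_scale_pixels_range():
    # All scaled values must stay within the unit interval.
    scaled = scale_pixels([0, 128, 255])
    assert min(scaled) >= 0.0
    assert max(scaled) <= 1.0

def test_scale_pixels_preserves_length():
    # Scaling must not drop or add pixels.
    assert len(scale_pixels(list(range(10)))) == 10
```

Pytest discovers `test_*` functions automatically, so such checks can run unchanged in a CI pipeline.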

In addition, "CG" checked their repository with Pynblint,12 a specialized static analyzer for Jupyter notebooks. Besides, to optimize energy efficiency, they tracked the CO2 emissions of their pipelines using Code Carbon.

As with versioning, quality assurance is more challenging for data than for code. Due to the variety of existing data formats, there is no tool covering all possibilities. In class, we demonstrated Great Expectations (GE), an open source framework allowing the definition of assertions on various properties of tabular data. However, since none of the teams had trained their model on tabular data, it was challenging for them to find workarounds. Some students resorted to testing only preprocessed data with GE (e.g., "CG" used GE to check if the tokens extracted from a text were all integers). In contrast, other teams preferred using alternative solutions (e.g., the "MS" team used Deepchecks for its native support of image data).

Concerning model QA, we showed how to complement the use of quantitative metrics, like precision and recall, with behavioral model testing. Behavioral tests assess the behavior of models when applied to specific categories of input data. In class, we exemplified them in the NLP domain, as inspired by Ribeiro et al.13 Interestingly, some teams like "MS" showed how the same idea can be applied to different domains, like CV.

API development for ML. To enable their seamless integration into larger systems, ML models typically expose their predictive capabilities through web APIs. We devote the fourth milestone of our course to showing how to wrap ML models with Representational State Transfer APIs using FastAPI. We selected this particular framework for its shallow learning curve, but also for its compliance with the OpenAPI standard. Most teams followed our recommendation and adopted FastAPI. Conversely, a few groups resorted to alternative solutions; for instance, "CG" used Amazon Web Services (AWS) API Gateway—an AWS-managed service for web API development—to expose HTTP endpoints for their model. The students could also build a demo application to demonstrate their API. Despite not being trained in web development, several of them could build a client web app in no time with special-purpose front-end frameworks like Gradio (e.g., "MS") and Streamlit (e.g., "CG").

Component delivery. Another crucial set of skills required of ML engineers concerns the delivery of ML-based components. Beyond exposing endpoints, models need to be packaged in a portable way and automatically deployed in cloud-based production environments.5 In our fifth course milestone, we show how to achieve this using containerization and continuous integration and continuous deployment (CI/CD) technologies.

Being the de facto standard, we used Docker to exemplify software containerization. The students found it relatively easy to understand and apply. For instance, "MS" straightforwardly employed Docker and Docker Compose to implement a four-component microservices architecture. However, some students had a hard time setting up containers for models requiring a GPU at inference time. Some of them identified suitable base images to leverage full model performance with GPUs, while others packaged a simplified version of their model as a workaround.

Next, we showed how to automate the whole build and deployment process with CI/CD tools. Due to its seamless integration with GitHub, we used GitHub Actions to demonstrate this practice. However, some teams preferred using the facilities offered by their cloud provider. For instance, "MS" deployed their multicontainer system to Okteto and leveraged its native support for Docker Compose builds; differently, "CG" employed AWS facilities to run their components in EC2 instances.

Keeping the feedback loop. To ensure service availability and performance after deployment, it is crucial to continuously monitor ML-enabled components. A monitoring system should track both the resource consumption of ML components as well as the performance of the ML models themselves, as they are typically subject to performance degradation over time.
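To illustrate the underlying idea, the following stdlib-only Python sketch renders metrics in the plain-text format that Prometheus scrapes from a /metrics endpoint. The metric names are hypothetical, and a real project would rely on the official prometheus_client library rather than hand-rolling this.

```python
# Minimal /metrics endpoint in the Prometheus plain-text style.
# Metric names are hypothetical; use the prometheus_client library in practice.
from http.server import BaseHTTPRequestHandler, HTTPServer

METRICS = {
    "predictions_total": 0,        # number of predictions served
    "prediction_errors_total": 0,  # number of failed predictions
}

def render_metrics(metrics):
    """Render a metrics dict as one `name value` line per metric."""
    return "".join(f"{name} {value}\n" for name, value in metrics.items())

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            body = render_metrics(METRICS).encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

# To serve: HTTPServer(("", 8000), MetricsHandler).serve_forever()
```

Prometheus would then be configured to scrape the endpoint periodically, and Grafana would plot the resulting time series.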

By setting up a monitoring system, ML engineers keep the feedback loop alive and can replace their models in a timely fashion when needed. Hence, we dedicated the final milestone of our course to monitoring practices for ML-based systems.

All teams were able to set up a monitoring system for their ML-based component. They mostly followed the examples provided in class, based on two popular open source solutions often used in tandem to track system metrics (Prometheus) and visualize them in a dashboard (Grafana).

Lessons Learned
All teams could successfully turn their model prototype into a production-ready ML component. Also, they all coped well with the project deadlines and managed to deploy their product to the cloud. In the following paragraphs, we will examine the feedback collected from the students. The lessons learned by analyzing their course experience will help us improve the next course editions.

Expectations of the Students
At the beginning of the course, we assessed the prior knowledge and learning expectations of our students. We conducted an anonymous survey, collecting 51 responses (23 at Uniba and 28 at UPC).

To start, we asked each student to self-report their experience in both SE and ML using a five-point Likert scale, where 1 represents a "very poor" experience and 5 an "excellent" experience. Consistently across Uniba and UPC, the students exhibited greater confidence in ML, with 86.28% reporting an "average" or "above average" experience. Specifically, almost all of them (94.12%) had had previous experience with CV and most of them with NLP (e.g., 78.43% had worked on text classification). Conversely, only 58.8% of the students indicated an "average"/"above average" experience in SE.

Then, we checked what the students were expecting to learn from the course. More than half of them (29) anticipated learning "engineering practices to build production-ready ML-based systems." Eighteen of them mentioned the application of specific software engineering best practices to ML (e.g., versioning or containerization):

I am interested in understanding containerization, which is currently a very popular solution that I have never had the opportunity (and the time) to experiment with.

Eight students expected training on top-notch technologies to support the building process of ML-enabled systems; differently, five of them anticipated learning engineering best practices for better management of data science projects (9):

I expect to learn how to improve the way an ML project is carried out from the beginning to the end, through the different stages.

Five students expected to extend their knowledge of software engineering; conversely, six of them thought they were going to learn more about machine learning. Finally, only three students mentioned expecting to know more about the best practices for collaboration in ML projects, and just a couple about big data management and privacy. Overall, our course was able to meet by design most of the student expectations, the main exceptions being extended knowledge about ML, and big data management and privacy.

Perceived Usefulness of the Course Contents and Teaching Methodology
By the end of the course—and before the final exam—we asked the students to provide final feedback. Once again, we administered an anonymous survey; this time we collected 44 responses in total (18 at Uniba and 26 at UPC).

To begin, we assessed student agreement on the usefulness of the MLOps practices presented in class using a five-point Likert scale ranging from "Strongly disagree" (1) to "Strongly agree" (5). The most "strongly agreed" practices were code versioning (83.33%), API design for ML (61.90%), and experiment tracking (59.09%). Likewise, we surveyed student opinions about our teaching methodology. Most of them found the project-based nature of the course (63.64% "Strongly agree," 29.55% "Agree") and teamwork (52.27% "Strongly agree," 29.55% "Agree") helpful to learn. 84.09% of the students considered the project feasible, and 90.91% agreed or strongly agreed about the appropriateness of the project milestones. Conversely, 15.91% of the students were neutral about the appropriateness of workload distribution, while 22.73% disagreed or strongly disagreed.

These encouraging results were confirmed by our manual analysis of the open-ended answers. We learned that most of the students (29) found the course useful and were willing to reuse some or most of the proposed practices and tools in their future projects.

ABOUT THE AUTHORS

FILIPPO LANUBILE is a full professor of computer science and head of the Department of Informatics and leads the Collaborative Development Research Group at the University of Bari, 70125 Bari, Italy. His research interests include human factors in software engineering, collaborative software development, software engineering for artificial intelligence/machine learning systems, social computing, and emotion detection. Lanubile received his M.Sc. in computer science from the University of Bari. Contact him at [email protected].

SILVERIO MARTÍNEZ-FERNÁNDEZ is an assistant professor at the Universitat Politècnica de Catalunya-BarcelonaTech, 08034 Barcelona, Spain. His research interests include empirical software engineering, software engineering for artificial intelligence (AI)/machine learning-based systems, and green AI. Martínez-Fernández received his Ph.D. in computer science from the Universitat Politècnica de Catalunya. He is a Member of IEEE. Contact him at [email protected].

LUIGI QUARANTA is a research fellow in the Collaborative Development Research Group at the University of Bari, 70125 Bari, Italy. His research interests include software engineering for artificial intelligence/machine learning-based systems, MLOps, and computational notebooks. Quaranta received his Ph.D. in computer science from the University of Bari. Contact him at [email protected].

I knew about some of these practices before, but never actually implemented them. Having to do so was useful and taught me a lot for future projects.

A couple of them claimed they had already started to do so by the end of the course.

I have already started applying what we have learned during this course to other ML projects. This kind of practice has solved a lot of problems that I encountered while developing ML models over the past year.

Eight students highlighted specific tools, or categories thereof, they found particularly useful while developing their project, e.g., collaborative versioning with git and GitHub (4), experiment tracking with MLflow (3), the Cookiecutter project structure (2), or building data pipelines with DVC. A couple were willing to reuse especially the tools offering support for reproducibility.

Finally, five students reported learning the advantages of using SE best practices when building ML-based systems. All in all, these results show that students already acquainted with ML enjoy learning state-of-the-art engineering practices and tools to improve their ML workflow.

Suggestions for Improvement
Seven students expressed criticisms about the course or recommended changes for future editions. For instance, a couple of them would have appreciated more guidance on the use of Git (and GitHub) or the deployment of an ML-based component to the cloud.

Three students complained about the workload of the course, which they found too heavy:

I enjoyed the course, although it has taken most of my study time.

or about the general overhead of applying software engineering practices to ML projects:

All of this is important, and the subject has made me realize it. However, applying these practices doubles the time spent to develop a project.

Besides, a couple of students reported not being happy with some of the recommended tools; in particular, they complained about Great Expectations, either because its use is redundant in their project ("[Great Expectations] is used to ensure data quality standards, but by preprocessing data before using it for training we already ensure them.") or because it does not scale to larger projects ("several of these tools are incomplete, in the sense that they can only be used in relatively small projects; for example, Great Expectations…").
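To give a flavor of the data assertions being discussed, here is a stdlib-only sketch in the spirit of the check the "CG" team implemented (verifying that extracted tokens are all integers). The helper names are our own invention and do not reflect the Great Expectations API.

```python
# Minimal data-quality checks in the spirit of Great Expectations.
# Helper names are hypothetical; the real GE API differs.

def expect_all_integers(values):
    """Check that every value is an int (e.g., token ids after preprocessing)."""
    return all(isinstance(v, int) for v in values)

def expect_values_between(values, low, high):
    """Check that every value falls within [low, high]."""
    return all(low <= v <= high for v in values)

token_ids = [101, 2054, 2003, 102]  # hypothetical tokenized sentence
assert expect_all_integers(token_ids)
assert expect_values_between(token_ids, 0, 30_000)
```

Checks of this kind can run as a pipeline stage right after preprocessing, failing the build when an assertion is violated.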

Appreciation for the Course
Finally, we examined the willingness of the students to recommend our course to their colleagues using a four-point Likert scale ranging from "Definitely not" to "Definitely." Most of the students declared they were likely (36.36%) or definitely likely (54.55%) to promote the class. Eleven of them also expressed explicit appreciation for the course in the open-ended items of the survey, confirming it to be a necessary addition to traditional academic curricula.

Nowadays, ML-based systems are everywhere, and it is necessary to have this course. It would be great if it could be extended into a 9-credit course.

I found it really useful. I think having this type of subject in our degree is crucial. I have used and I will use what I have learned.

In this article, we shared our experience in designing and delivering a project-based university course on MLOps aimed at training future ML engineers. After examining the practices addressed and a selection of tools used by our students to overcome engineering challenges in the development of their project works, we shared their feedback on the course. From our experience, we learned that students already acquainted with ML are eager to know more about engineering best practices for ML and that core competencies required of ML engineers can be successfully taught over the course of a semester.

Acknowledgment
This work was supported in part by the NRRP Initiative – Next Generation EU ("FAIR – Future Artificial Intelligence Research," code PE00000013, CUP H97G22000210007); the Complementary National Plan PNC-I.1 ("DARE—DigitAl lifelong pRevEntion initiative," code PNC0000002, CUP B53C22006420001); and the project TED2021-130923B-I00, funded by MCIN/AEI/10.13039/501100011033 and the European Union Next Generation EU/PRTR.

References
1. I. Ozkaya, "An AI engineer versus a software engineer," IEEE Softw., vol. 39, no. 6, pp. 4–7, Nov. 2022, doi: 10.1109/MS.2022.3161756.
2. "2020 state of enterprise machine learning," Algorithmia, 2020. [Online]. Available: https://www.coriniumintelligence.com/2020-state-of-enterprise-machine-learning-algorithmia-whitepaper-download
3. C. Kästner, "Machine learning in production," Medium, Feb. 2021. Accessed: Jan. 31, 2023. [Online]. Available: https://ckaestne.medium.com/machine-learning-in-production-book-overview-63be62393581
4. MSR'22 Keynote: From Models to Systems: Rethinking the Role of Software Engineering for ML. (May 20, 2022). Accessed: Feb. 9, 2023. [Online Video]. Available: https://www.youtube.com/watch?v=_m-m90S_4Gg
5. D. Sato, A. Wider, and C. Windheuser, "Continuous delivery for machine learning: Automating the end-to-end lifecycle of machine learning applications," Martin Fowler, Sep. 2019. [Online]. Available: https://martinfowler.com/articles/cd4ml.html
6. Q. Lu, L. Zhu, X. Xu, Z. Xing, and J. Whittle, "Towards responsible AI in the era of ChatGPT: A reference architecture for designing foundation model-based AI systems," May 2023. [Online]. Available: http://arxiv.org/abs/2304.11090
7. F. Lanubile, S. Martínez-Fernández, and L. Quaranta, "Teaching MLOps in higher education through project-based learning," in Proc. IEEE/ACM 45th Int. Conf. Softw. Eng.: Software Engineering Education and Training (ICSE-SEET), Melbourne, Australia, 2023, pp. 95–100, doi: 10.1109/ICSE-SEET58685.2023.00015.
8. M. Mitchell et al., "Model cards for model reporting," in Proc. Conf. Fairness, Accountability, Transparency, Atlanta, GA, USA: ACM, Jan. 2019, pp. 220–229, doi: 10.1145/3287560.3287596.
9. "GitHub flow." GitHub. Accessed: Jan. 31, 2023. [Online]. Available: https://docs.github.com/en/get-started/quickstart/github-flow
10. J. F. Pimentel, L. Murta, V. Braganholo, and J. Freire, "A large-scale study about quality and reproducibility of Jupyter notebooks," in Proc. 16th Int. Conf. Mining Softw. Repositories, 2019, pp. 507–517, doi: 10.1109/MSR.2019.00077.
11. J. Wang, L. Li, and A. Zeller, "Better code, better sharing: On the need of analyzing Jupyter notebooks," in Proc. ACM/IEEE 42nd Int. Conf. Softw. Eng., New Ideas Emerg. Results, New York, NY, USA: ACM, 2020, pp. 53–56, doi: 10.1145/3377816.3381724.
12. L. Quaranta, F. Calefato, and F. Lanubile, "Pynblint: A static analyzer for Python Jupyter notebooks," in Proc. IEEE/ACM 1st Int. Conf. AI Eng., Softw. Eng. AI (CAIN), May 2022, pp. 48–49, doi: 10.1145/3522664.3528612.
13. M. T. Ribeiro, T. Wu, C. Guestrin, and S. Singh, "Beyond accuracy: Behavioral testing of NLP models with CheckList," 2020, arXiv:2005.04118.

MARCH/APRIL 2024 | IEEE SOFTWARE 67
