Machine Learning Dev Ops Engineer Nanodegree Program Syllabus

This document provides an overview of a Nanodegree program focused on machine learning DevOps engineering. The program consists of four courses that teach skills for deploying machine learning models in production environments, including: 1) Implementing production-ready Python code for deploying models outside of managed cloud environments. 2) Engineering automated data workflows for continuous training and validation within CI/CD pipelines. 3) Deploying scalable machine learning pipelines in production environments using tools like FastAPI, Heroku, and GitHub Actions, with DVC for version control. 4) Automating model scoring, drift checks, and monitoring after deployment. The program aims to provide an advanced skill set for streamlining the deployment of machine learning models through automation.

Uploaded by

Cylub

INDIVIDUAL LEARNERS

SCHOOL OF ARTIFICIAL INTELLIGENCE

Machine Learning
DevOps Engineer
Nanodegree Program Syllabus
Overview
This program focuses on the software engineering fundamentals needed to successfully streamline the deployment of data
and machine learning models in a production-level environment. Learners will build the DevOps skills required to automate
the various aspects and stages of machine learning model building and monitoring over time.

Learning Objectives

A graduate of this program will be able to:

• Implement production-ready Python code/processes for deploying ML models outside of cloud-based
environments facilitated by tools such as AWS SageMaker, Azure ML, etc.

• Engineer automated data workflows that perform continuous training (CT) and model validation within a
CI/CD pipeline based on updated data versioning.

• Create multi-step pipelines that automatically retrain and deploy models after data updates.

• Track model summary statistics and monitor model online performance over time to prevent
model degradation.

Program information

Estimated Time: 4 months at 10 hrs/week*

Skill Level: Advanced

Prerequisites

A well-prepared learner should have prior experience with Python and machine learning.

Required Hardware/Software

Learners need access to a 64-bit computer, at least 8GB of RAM, and administrator account permissions sufficient to install
programs including Anaconda with Python 3.x and supporting packages.

*The length of this program is an estimation of total hours the average student may take to complete all required
coursework, including lecture and project time. If you spend about 5-10 hours per week working through the program, you
should finish within the time provided. Actual hours may vary.

Course 1

Clean Code Principles


Develop skills that are essential for deploying production machine learning models. First, learners will put coding best
practices on autopilot by learning how to use PyLint and AutoPEP8. Then they will further expand their Git and GitHub skills to
work with teams. Finally, they will learn best practices associated with testing and logging used in production settings in order
to ensure their models can stand the test of time.

Course Project

Predict Customer Churn with Clean Code


In this project, learners will implement their learnings to identify credit card customers that are most
likely to churn. The completed project will include a Python package for a machine learning project that
follows coding (PEP8) and engineering best practices for implementing software (modular, documented,
and tested). The package will also have the flexibility of being run interactively or from the command-line
interface (CLI). This project will give learners practice using their skills for testing, logging, and coding best
practices from the lessons. It will also introduce them to a problem data scientists across companies face all
the time: How do we identify (and later intervene with) customers who are likely to churn?

Lesson 1: Coding Best Practices

• Write clean, modular, and well-documented code.

• Refactor code for efficiency.

• Follow PEP8 standards.

• Automate use of PEP8 standards using PyLint and AutoPEP8.

Lesson 2: Working with Others Using Version Control

• Work independently using Git and GitHub.

• Work with teams using Git and GitHub.

• Create branches for isolating changes in Git and GitHub.

• Open pull requests for making changes to production code.

• Conduct and receive code reviews using best practices.

Lesson 3: Production Ready Code

• Correctly use try-except blocks to identify errors.

• Create unit tests to test programs.

• Track actions and results of processes with logging.

• Identify model drift and when automated or non-automated retraining should
be used to make model updates.
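
As a rough illustration of the Lesson 3 topics, the sketch below combines a try-except block, logging, and a unit-test-style check. The function name `import_data`, the logger name, and the CSV columns are hypothetical and are not taken from the course materials.

```python
import csv
import io
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s - %(message)s")
logger = logging.getLogger("churn_example")  # hypothetical logger name


def import_data(file_obj):
    """Read CSV rows from a file-like object and return them as a list of dicts."""
    try:
        rows = list(csv.DictReader(file_obj))
        if not rows:
            raise ValueError("file contains no data rows")
        logger.info("SUCCESS: loaded %d rows", len(rows))
        return rows
    except ValueError as err:
        # Log the failure, then re-raise so the caller can decide what to do.
        logger.error("ERROR: could not load data: %s", err)
        raise


def test_import_data():
    """Unit-test-style check: good input loads, header-only input is rejected."""
    good = io.StringIO("customer_id,churn\n1,0\n2,1\n")
    assert len(import_data(good)) == 2

    try:
        import_data(io.StringIO("customer_id,churn\n"))
    except ValueError:
        pass  # expected: a header with no rows is rejected
    else:
        raise AssertionError("empty input should raise ValueError")


test_import_data()
```

In a real project the test function would usually live in its own file and be collected by a test runner rather than called directly.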

Course 2

Building a Reproducible Model Workflow


This course empowers the learners to be more efficient, effective, and productive in modern, real-world ML projects by
adopting best practices around reproducible workflows. In particular, it teaches the fundamentals of MLOps and how to: a)
create a clean, organized, reproducible, end-to-end machine learning pipeline from scratch using MLflow b) clean and validate
the data using pytest c) track experiments, code, and results using GitHub and Weights & Biases d) select the best-performing
model for production and e) deploy a model using MLflow. Along the way, it also touches on other technologies like
Kubernetes, Kubeflow, and Great Expectations and how they relate to the content of the class.

Course Project

Build an ML Pipeline for Short-Term Rental Prices in NYC


Learners will write a machine learning pipeline to solve the following problem: A property management
company is renting rooms and properties in New York for short periods on various rental platforms.
They need to estimate the typical price for a given property based on the price of similar properties. The
company receives new data in bulk every week, so the model needs to be retrained with the same cadence,
necessitating a reusable pipeline. The students will write an end-to-end pipeline covering data fetching,
validation, segregation, training and validation, testing, and release. They will run it on an initial data sample, and
then re-run it on a new data sample simulating a new data delivery.

Lesson 1: Machine Learning Pipelines

• Learn MLOps fundamentals.

• Version data and artifacts.

• Write an ML pipeline component.

• Link together ML components.

Lesson 2: Data Exploration & Preparation

• Execute and track the exploratory data analysis (EDA).

• Clean and preprocess the data.

• Segregate (split) datasets.

Lesson 3: Data Validation

• Use pytest with parameters for reproducible and automatic data tests.

• Perform deterministic and non-deterministic data tests.
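
To make the two kinds of data test more concrete, here is a minimal sketch in plain Python. The sample rows, column names, thresholds, and function names are all assumptions made for illustration. The mean-shift check is only a simple stand-in for a statistical test; the course itself may use a proper hypothesis test.

```python
import statistics

# Hypothetical rental listings; column names are assumptions for illustration.
reference = [{"price": 100.0, "room_type": "Private room"},
             {"price": 150.0, "room_type": "Entire home/apt"},
             {"price": 120.0, "room_type": "Private room"}]
new_sample = [{"price": 110.0, "room_type": "Private room"},
              {"price": 140.0, "room_type": "Entire home/apt"},
              {"price": 125.0, "room_type": "Shared room"}]

ALLOWED_ROOM_TYPES = {"Private room", "Entire home/apt", "Shared room"}


def check_price_range(rows, min_price, max_price):
    """Deterministic test: every price must fall inside a fixed range."""
    return all(min_price <= row["price"] <= max_price for row in rows)


def check_room_types(rows):
    """Deterministic test: categorical values must come from a known set."""
    return all(row["room_type"] in ALLOWED_ROOM_TYPES for row in rows)


def check_mean_shift(ref_rows, new_rows, tolerance):
    """Statistical-style test: the mean price may move by at most `tolerance`,
    expressed as a fraction of the reference mean."""
    ref_mean = statistics.mean(row["price"] for row in ref_rows)
    new_mean = statistics.mean(row["price"] for row in new_rows)
    return abs(new_mean - ref_mean) / ref_mean <= tolerance


# With pytest, these cases would typically be supplied through parametrization;
# a plain loop keeps this sketch runnable without extra packages.
for min_price, max_price in [(10.0, 350.0), (50.0, 200.0)]:
    assert check_price_range(new_sample, min_price, max_price)
assert check_room_types(new_sample)
assert check_mean_shift(reference, new_sample, tolerance=0.2)
```

Passing the thresholds in as parameters, rather than hard-coding them, is what lets the same tests be rerun unchanged when a new data delivery arrives.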

Lesson 4: Training, Validation & Experiment Tracking

• Tame the chaos with experiment, code, and data tracking.

• Track experiments with W&B.

• Validate and choose best-performing model.

• Export model as an inference artifact.

• Test final inference artifact.

Lesson 5: Release & Deploy

• Release pipeline code.

• Options for deployment and how to deploy a model.

Course 3

Deploying a Scalable ML Pipeline in Production


This course teaches learners how to robustly deploy a machine learning model into production. En route to that goal, they will
learn how to put the finishing touches on a model by taking a fine-grained approach to model performance, checking bias, and
ultimately writing a model card. They will also learn how to version control their data and models using Data Version Control
(DVC). The last piece in preparation for deployment will be learning continuous integration and continuous deployment, which
will be accomplished using GitHub Actions and Heroku, respectively. Finally, they will learn how to write a fast, type-checked, and
auto-documented API using FastAPI.

Course Project

Deploying a Machine Learning Model on Heroku with FastAPI

In this project, learners will deploy a machine learning model on Heroku. The learners will use Git and DVC
to track their code, data, and model while developing a simple classification model on the Census Income
dataset. After developing the model the learners will finalize the model for production by checking its
performance on slices and writing a model card encapsulating key knowledge about the model. They will
put together a continuous integration and continuous deployment framework and ensure their pipeline
passes a series of unit tests before deployment. Lastly, an API will be written using FastAPI and will be
tested locally. After successful deployment the API will be tested live using the requests module.

After completion, the learner will have a working API that is live in production, a set of tests, model card,
and full CI/CD framework. On its own, this project can be used as a portfolio piece, but also any of the
constituent pieces can be applied to other projects, e.g. continuous integration, to further flesh them out.

Lesson 1: Performance Testing & Preparing a Model for Production

• Analyze slices of data when training and testing models.

• Probe a model for bias using common frameworks such as Aequitas.

• Write model cards that explain the purpose, provenance, and pitfalls of a model.
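
Slice analysis can be sketched in a few lines: hold one categorical feature fixed at each of its values and compute a metric on just those rows. The example below is a minimal sketch using accuracy. The field names `label` and `pred`, the function name, and the sample records are made up for illustration.

```python
from collections import defaultdict


def accuracy_by_slice(records, feature):
    """Return {slice_value: accuracy} for one categorical feature.

    Each record is a dict holding the feature value, the true label under
    "label", and the model output under "pred" (hypothetical field names).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        key = rec[feature]
        total[key] += 1
        correct[key] += int(rec["label"] == rec["pred"])
    return {key: correct[key] / total[key] for key in total}


# Made-up predictions, for illustration only.
records = [
    {"education": "Bachelors", "label": 1, "pred": 1},
    {"education": "Bachelors", "label": 0, "pred": 0},
    {"education": "HS-grad", "label": 1, "pred": 0},
    {"education": "HS-grad", "label": 0, "pred": 0},
]

for value, acc in accuracy_by_slice(records, "education").items():
    print(f"education={value}: accuracy={acc:.2f}")
```

A model that looks acceptable overall can still perform poorly on one slice, which is the kind of finding a model card is meant to record.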

Lesson 2: Data & Model Versioning

• Version control data, models, etc. locally using DVC.

• Set up remote storage for use with DVC.

• Create pipelines and track experiments with DVC.

Lesson 3: CI/CD

• Follow software engineering principles by automating, testing, and versioning code.

• Set up continuous integration using GitHub Actions.

• Set up continuous deployment using Heroku.

Lesson 4: API Deployment with FastAPI

• Write an API for machine learning inference using FastAPI.

• Deploy a machine learning inference API to Heroku.

• Write unit tests for APIs using the requests module.

Course 4

Automated Model Scoring & Monitoring


This course will help learners automate the DevOps processes required to score and re-deploy ML models. After model
deployment, learners will set up regular scoring processes, learn to reason carefully about model drift, and learn whether
models need to be retrained and re-deployed. They will learn to diagnose operational issues with models, including data
integrity and stability problems, timing problems, and dependency issues. Finally, they will learn to set up automated reporting
with APIs.

Course Project

A Dynamic Risk Assessment System


In this project, learners will make predictions about attrition risk in a fabricated dataset. They’ll set up
automated processes to ingest data and score, re-train, and re-deploy ML models that predict attrition risk.
They’ll write scripts to automatically check for new data and check for model drift. They’ll also set up APIs
that allow users to access model results, metrics, and diagnostics. After completing this project, learners
will have a full end-to-end, automated ML project that performs risk assessments. This project can be a
useful addition to learners’ portfolios, and the concepts they apply in the project can be applied to business
problems across a variety of industries.

Lesson 1: Model Training & Deployment

• Ingest data.

• Automatically train models.

• Deploy models to production.

• Keep records about processes.

• Automate processes using cron jobs.

Lesson 2: Model Scoring & Model Drift

• Automatically score ML models.

• Keep records of model scores.

• Check for model drift using several different model drift tests.

• Determine whether models need to be retrained and re-deployed.
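
As one possible reading of "several different model drift tests", the sketch below compares a new score against a recorded history in two ways. Both tests, their names, the two-standard-deviation threshold, and the scores are assumptions for illustration and are not confirmed by the syllabus.

```python
import statistics


def raw_comparison_drift(new_score, previous_scores):
    """Flag drift when the new score is worse than every previous score."""
    return new_score < min(previous_scores)


def parametric_drift(new_score, previous_scores, num_stdevs=2.0):
    """Flag drift when the new score falls more than `num_stdevs` sample
    standard deviations below the mean of the previous scores."""
    mean = statistics.mean(previous_scores)
    stdev = statistics.stdev(previous_scores)
    return new_score < mean - num_stdevs * stdev


# Made-up F1 scores from earlier scoring runs (higher is better).
history = [0.71, 0.74, 0.72, 0.75]
latest = 0.70

print("raw comparison:", raw_comparison_drift(latest, history))
print("parametric:", parametric_drift(latest, history))
```

On this made-up history the two tests disagree: the raw comparison flags drift because 0.70 is below every earlier score, while the parametric test does not because 0.70 is still within two standard deviations of the mean. Deciding whether to retrain and re-deploy therefore depends on which test, and which threshold, a team chooses.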

Lesson 3: Diagnosing & Fixing Operational Problems

• Check data integrity and stability.

• Check for dependency issues.

• Check for timing issues.

• Resolve operational issues.
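
Two of these checks are easy to sketch with the standard library: the share of missing values per column as a data integrity signal, and the wall-clock time of a processing step as a timing signal. The function names, columns, and rows below are hypothetical.

```python
import time


def missing_share(rows, columns):
    """Return the fraction of missing (None) values for each column."""
    return {col: sum(row.get(col) is None for row in rows) / len(rows)
            for col in columns}


def timed(func, *args, **kwargs):
    """Run `func` and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Made-up customer rows, for illustration only.
rows = [{"tenure": 12, "spend": 40.0},
        {"tenure": None, "spend": 55.5},
        {"tenure": 30, "spend": None},
        {"tenure": 7, "spend": 20.0}]

shares, seconds = timed(missing_share, rows, ["tenure", "spend"])
print(shares, f"computed in {seconds:.6f}s")
```

Recording these numbers on every run gives a baseline, so that a sudden jump in missing values or run time stands out.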

Lesson 4: Model Reporting & Monitoring with APIs

• Create API endpoints that enable users to access model results, metrics, and
diagnostics.

• Set up APIs with multiple, complex endpoints.

• Call APIs and work with their results.

Meet your instructors.

Joshua Bernhard
Data Scientist at Thumbtack

Josh has been sharing his passion for data for nearly a decade at all levels of university and as
a data science instructor for coding bootcamps. He’s used data science for work ranging from
cancer research to process automation.

Giacomo Vianello
Principal Data Scientist at Cape Analytics

Giacomo Vianello is an end-to-end data scientist with a passion for state-of-the-art but practical
technical solutions. He is principal data scientist at Cape Analytics, where he develops AI
systems to extract intelligence from geospatial imagery, bringing cutting-edge AI solutions to
the insurance and real estate industries.

Justin Clifford Smith, PhD
Senior Data Scientist at Optum

Justin is a senior data scientist at Optum, where he works to make healthcare more efficient with
natural language processing and machine learning. Previously he was a data scientist at the
US Census Bureau. His doctorate is from the University of California, Irvine, where he studied
theoretical physics.

Bradford Tuckfield
Data Scientist & Writer

Bradford Tuckfield is a data scientist and writer. He has worked on applications of data science
in a variety of industries. He’s the author of Dive Into Algorithms, forthcoming with No Starch
Press.

Ulrika Jägare
Head of AI/ML Strategy Execution at Ericsson

Ulrika has been with Ericsson for 21 years in various leadership roles, 11 of them
in the data and AI space. Ulrika holds a master of science degree from the University of Lund in
Sweden and is also the author of seven published books in data science.

Udacity’s learning
experience

Hands-on Projects

Open-ended, experiential projects are designed to reflect actual workplace challenges. They aren’t
just multiple choice questions or step-by-step guides, but instead require critical thinking.

Quizzes

Auto-graded quizzes strengthen comprehension. Learners can return to lessons at any time during
the course to refresh concepts.

Knowledge

Find answers to your questions with Knowledge, our proprietary wiki. Search questions asked by
other students, connect with technical mentors, and discover how to solve the challenges that
you encounter.

Custom Study Plans

Create a personalized study plan that fits your individual needs. Utilize this plan to keep track of
movement toward your overall goal.

Workspaces

See your code in action. Check the output and quality of your code by running it on interactive
workspaces that are integrated into the platform.

Progress Tracker

Take advantage of milestone reminders to stay on schedule and complete your program.

Our proven approach for building
job-ready digital skills.

Experienced Project Reviewers

Verify skills mastery.


• Personalized project feedback and critique includes line-by-line code review from
skilled practitioners with an average turnaround time of 1.1 hours.

• Project review cycle creates a feedback loop with multiple opportunities for
improvement—until the concept is mastered.

• Project reviewers leverage industry best practices and provide pro tips.

Technical Mentor Support

24/7 support unblocks learning.


• Learning accelerates as skilled mentors identify areas of achievement and potential
for growth.

• Unlimited access to mentors means help arrives when it’s needed most.

• An average question response time of 2 hours or less ensures that skills development stays on track.

Personal Career Services

Empower job-readiness.

• Access to a GitHub portfolio review that can give you an edge by highlighting your
strengths and demonstrating your value to employers.*

• Get help optimizing your LinkedIn and establishing your personal brand so your profile
ranks higher in searches by recruiters and hiring managers.

Mentor Network

Highly vetted for effectiveness.


• Mentors must complete a 5-step hiring process to join Udacity’s selective network.

• After passing an objective and situational assessment, mentors must demonstrate
communication and behavioral fit for a mentorship role.

• Mentors work across more than 30 different industries and often complete a Nanodegree
program themselves.

*Applies to select Nanodegree programs only.

Learn more at
www.udacity.com/online-learning-for-individuals →

01.06.23 | V1.0
