ML Midterm Cheatsheet
2. Data preparation
Key steps in the process: • Data labeling • EDA • Preprocessing & Feature Engineering • Splitting • Augmentation
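As a rough illustration of the splitting step, here is a minimal sketch using only the Python standard library. The function name `split_dataset`, the 70/15/15 ratios, and the fixed seed are assumptions made for this example, not values prescribed by the cheatsheet.

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle rows and cut them into train / validation / test lists.

    The fractions and the seed are illustrative defaults (assumptions).
    """
    shuffled = list(rows)                    # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)    # seeded shuffle keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]        # whatever remains goes to the test set
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Shuffling before cutting avoids a split that follows the original ordering of the data; the three parts never overlap, so the test set stays unseen during training.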
EDA (exploratory data analysis) is a cyclical process that can be carried out at any step of the ML project’s life
cycle; it answers important questions about the data and makes it easier to extract insights.
A multiple-layer neural network has one or more hidden layers between the input and output
layers. A single-layer neural network has only 1 hidden layer between the input
and output layers. The perceptron is the simplest form of neural network: a linear
algorithm in ML, usually used for supervised learning of binary classification.
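To make the perceptron concrete, the following is a minimal sketch of the classic perceptron learning rule trained on the logical AND function, a linearly separable binary classification task. The helper names (`predict`, `train_perceptron`), the learning rate, and the epoch count are assumptions chosen for illustration.

```python
def predict(weights, bias, x):
    """Linear score followed by a step function: output 1 if the score is positive."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0

def train_perceptron(samples, labels, lr=1.0, epochs=20):
    """Perceptron rule: nudge weights and bias by lr * error on each mistake.

    lr and epochs are illustrative defaults (assumptions).
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)   # -1, 0, or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Logical AND: only (1, 1) belongs to the positive class.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]

weights, bias = train_perceptron(X, y)
preds = [predict(weights, bias, x) for x in X]
print(preds)
```

Because the perceptron is linear, this works only when a straight line (or hyperplane) can separate the two classes; a function such as XOR would not converge with this rule.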
MLOps = ML + DevOps
A sequence of steps implemented to deploy an ML model to the production environment. It is easy to
create ML models that can predict on the data we feed them, but challenging to create models that
are reliable, fast, accurate, and usable by a large number of users.
Tools used for ML pipeline:
(1) Flask: create APIs as interfaces to models, (2) MLflow: model registry, (3)
GitHub: code version control, (4) Data Version Control (DVC): version control
of the datasets and building pipelines, (5) Cookiecutter: project templates
6. Explainable AI
XAI allows human users to comprehend and trust the results and output created by ML algorithms.
Attribution problem: attribute a model’s prediction on an input to features of the input.
Attributions do not explain: ● Feature interactions ● What
training examples influenced the prediction ● Global properties of the model
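One simple way to see both the attribution idea and its blind spot for feature interactions is ablation (occlusion): replace one feature at a time with a baseline value and record how much the prediction changes. This is only one of several attribution methods and is not necessarily the one the course has in mind; the two toy models, the input, and the all-zero baseline below are assumptions made up for the sketch.

```python
def ablation_attributions(model, x, baseline):
    """Score each feature by the prediction drop when it is swapped for its baseline."""
    full = model(x)
    scores = []
    for i in range(len(x)):
        ablated = list(x)
        ablated[i] = baseline[i]          # remove feature i only
        scores.append(full - model(ablated))
    return scores

def linear_model(x):
    # Toy model with no interactions (hypothetical coefficients).
    return 1.0 + 2.0 * x[0] - 3.0 * x[1]

def interaction_model(x):
    # Toy model whose output is purely an interaction between the two features.
    return x[0] * x[1]

x = [1.0, 2.0]
baseline = [0.0, 0.0]

linear_scores = ablation_attributions(linear_model, x, baseline)
interaction_scores = ablation_attributions(interaction_model, x, baseline)

# For the linear model the scores add up to the total change in prediction.
print(linear_scores, linear_model(x) - linear_model(baseline))
# For the interaction model each feature is credited with the whole effect,
# so the scores no longer add up to the total change.
print(interaction_scores, interaction_model(x) - interaction_model(baseline))
```

In the linear case each score equals the coefficient times the feature's change from baseline. In the interaction case the per-feature numbers cannot say that the effect exists only when both features are present together, which is the sense in which attributions do not explain feature interactions.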