
Machine Learning

Chapter 03
Machine Learning Systems
Prepared by: Ziad Doughan
Email: [email protected]



Working with Real Data

In ML projects, it is best to experiment with real-world data, not artificial datasets. There are thousands of open datasets to choose from.
Example:
The California Housing Prices dataset is based on data from the 1990 California census.



Look at the Big Picture

The first step is to understand how machine learning is going to be used to build an intelligent model.
Example:
The housing prices data includes metrics such as the
population, median income, and median housing price for
each block group in California.
Block groups, called “districts”, are the smallest geographical
unit for which the US Census Bureau publishes sample data.
The model should learn to predict the median housing price in
any district, given all the other metrics.



Frame the Problem

The first question to ask is: "What exactly is the business objective?"
Building a model is probably not the end goal; the real target is: "How does the company expect to use and benefit from this model?"
Knowing the objective is important because it will determine:
• how you frame the problem,
• which algorithms you will select,
• which performance measure you will use to evaluate your
model,
• and how much effort you will spend tweaking it.



Data Pipeline

A data pipeline is a sequence of data processing components. Pipelines are very common in ML, since there is a lot of data to manipulate and many transformations to apply.



Current System Assessment

The next question to ask is: "What does the current solution look like, if any?"
The current situation will often give you a reference for
performance, as well as insights on how to solve the problem.
Example: The district housing prices are currently estimated
manually by experts.
A team gathers up-to-date information about a district, and
when they cannot get the median housing price, they
estimate it using complex rules.



Current System Assessment

This is costly and time-consuming, and their estimates are not great.
In cases where they manage to find out the actual median
housing price, they often realize that their estimates were off
by more than 20%.
This is why the company thinks that it would be useful to train
a model to predict a district’s median housing price, given
other data about that district.



Dataset Assessment

The census data looks like a great dataset to exploit for this
purpose, since it includes the median housing prices of
thousands of districts, as well as other data.
With all this information, you are now ready to start designing
your system.
First, you need to frame the problem:
• Is it supervised, unsupervised, or Reinforcement Learning?
• Is it a classification, a regression, or something else?
• Should you use batch or online learning techniques?



Dataset Assessment

Let's return to the housing prices example:
It is clearly a typical supervised learning task, since we have labeled training examples.
It is also a typical univariate regression problem, since we are
only trying to predict a single value for each district.
Finally, there is no continuous flow of data coming into the
system, no particular need to adjust to rapidly changing data,
and the data is small enough to fit in memory, so plain batch
learning should do just fine.



Performance Measure

Your next step is to select a performance measure.


A typical performance measure for regression problems is the
Root Mean Square Error (RMSE).
It gives an idea of how much error the system typically makes
in its predictions, with a higher weight for large errors.
The equation of the RMSE is:
$$\mathrm{RMSE}(\mathbf{X}, h) = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(h\left(\mathbf{x}^{(i)}\right) - y^{(i)}\right)^2}$$
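As a quick sanity check, RMSE can be computed directly with NumPy. A minimal sketch, using hypothetical prediction and label arrays (not from the housing data):
>>> import numpy as np
>>> y_true = np.array([3.0, -0.5, 2.0, 7.0])        # hypothetical labels
>>> y_pred = np.array([2.5, 0.0, 2.0, 8.0])         # hypothetical predictions
>>> print(np.sqrt(np.mean((y_pred - y_true) ** 2))) # RMSE
0.6123724356957945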



ML Notations

This equation introduces several notations:


• m is the number of instances in the dataset.
• $x^{(i)}$ is a vector of all the feature values of the i-th instance.
• $y^{(i)}$ is its label, i.e., the desired output value.
• $\mathbf{X}$ is a matrix containing all the feature values of all the instances.
• h is the system's prediction function, or hypothesis.
• $\hat{y}^{(i)} = h(x^{(i)})$ is the predicted value ("y-hat").



Mean Absolute Error

RMSE is the preferred performance measure for regression tasks. But in some contexts, another function may be preferable.
For example, suppose there are many outliers. In that case, you should consider using the mean absolute error (MAE), also called the average absolute deviation:
$$\mathrm{MAE}(\mathbf{X}, h) = \frac{1}{m}\sum_{i=1}^{m}\left|h\left(\mathbf{x}^{(i)}\right) - y^{(i)}\right|$$

Mathematically, both the RMSE and the MAE are ways to measure the distance between two vectors: the vector of predictions and the vector of target values.
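A small sketch showing why RMSE is more sensitive to outliers than MAE, using a toy array of prediction errors (one of them extreme):
>>> import numpy as np
>>> errors = np.array([1.0, 1.0, 1.0, 10.0])        # toy errors, one outlier
>>> print(np.mean(np.abs(errors)))                  # MAE
3.25
>>> print(round(np.sqrt(np.mean(errors ** 2)), 3))  # RMSE, dominated by the outlier
5.074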



Preparing the Development Infrastructure

Create the Workspace:


• First, you will need to have Python installed.
• Next, you need to create a workspace directory for your
code and datasets.
• Then, you will need some Python modules: NumPy,
pandas, Matplotlib, Scikit-Learn ...
• Finally, make sure to upgrade your modules.



Creating an Isolated Environment

It is strongly recommended to work in an isolated environment. This allows you to work on different projects without having conflicting library versions.
In Python you can create such an isolated environment, and every time you want to use a specific environment, all you have to do is activate it.
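For example, with Python's built-in venv module. A typical sketch for Linux/macOS (on Windows, activation is env\Scripts\activate instead):
$ python3 -m venv env                     # create the isolated environment
$ source env/bin/activate                 # activate it for the current shell
$ pip install --upgrade numpy pandas matplotlib scikit-learn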



Getting the Data

In typical environments your data would be available in a relational database and spread across multiple tables, documents, or files.
To access it, you would first need to get your credentials and
access authorizations and familiarize yourself with the data
schema.
In simpler projects however, you just need to create or
download a single compressed file, like housing.tgz, which
contains a comma-separated values (CSV) file called
housing.csv with all the data.
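A minimal download-and-extract sketch, with a hypothetical DOWNLOAD_URL standing in for wherever housing.tgz is actually hosted:
>>> import tarfile, urllib.request
>>> DOWNLOAD_URL = "https://example.com/housing.tgz"   # hypothetical location
>>> urllib.request.urlretrieve(DOWNLOAD_URL, "housing.tgz")
>>> with tarfile.open("housing.tgz") as tgz:
...     tgz.extractall(path="datasets")                # yields datasets/housing.csv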



Always Take a Look at the Data

Each row represents one district. There are ten attributes: longitude, latitude, housing_median_age, total_rooms, total_bedrooms, population, households, median_income, median_house_value, and ocean_proximity.
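Loading the file with pandas and peeking at the first rows (the path assumes the extraction sketch above):
>>> import pandas as pd
>>> housing = pd.read_csv("datasets/housing.csv")
>>> housing.head()                                # first five districts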



Always Take a Look at the Data

The info() method is useful to get a quick description of the data. All attributes are numerical, except ocean_proximity, whose type is object.
You can find out what categories exist and how many districts belong to each category.
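For example:
>>> housing.info()                               # column types and missing-value counts
>>> housing["ocean_proximity"].value_counts()    # the categories and their frequencies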







Data Histograms

These attributes have very different scales, so we will later need to apply feature scaling to them.
Many histograms are tail-heavy: they extend much farther to the right of the median than to the left.
This may make it hard for some ML algorithms to detect patterns, so we will later transform these attributes to have more bell-shaped distributions.
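The histograms themselves can be drawn with pandas; a sketch:
>>> import matplotlib.pyplot as plt
>>> housing.hist(bins=50, figsize=(20, 15))      # one histogram per numerical attribute
>>> plt.show()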



Data Histograms

The median income attribute is not expressed in US dollars (USD). After checking the data, you find that the numbers represent roughly tens of thousands of dollars (e.g., 3 means about $30,000). Working with preprocessed attributes is common in Machine Learning.



Data Histograms

The housing median age and the median house value were capped. Your ML model may learn that prices never go beyond that limit. If you need precise predictions even beyond $500,000, then you have two options:
a. Collect proper labels for the districts whose labels were capped.
b. Remove those districts from the dataset.



Create a Test Set

Creating a test set is theoretically simple: pick some instances randomly, typically 20% of the dataset (or less if your dataset is very large), and set them aside.
However, you must make sure that your ML model never gets to see the test set during training.
Example: for a small dataset (up to ~100,000 records), use 70% training set / 30% test set, or 60% training set / 20% development set / 20% test set.
For a big dataset, however, 98% training set / 1% development set / 1% test set can be enough.
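With Scikit-Learn, a purely random 80/20 split is one line (random_state fixes the seed so the same test set is produced on every run):
>>> from sklearn.model_selection import train_test_split
>>> train_set, test_set = train_test_split(housing, test_size=0.2, random_state=42)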



Visualize the Data to Gain Insights

You can create a scatterplot to visualize the data.



Data Scatterplot

Setting the alpha option to 0.1 makes it much easier to visualize the places where there is a high density of data points.
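A sketch of such a scatterplot with pandas:
>>> housing.plot(kind="scatter", x="longitude", y="latitude", alpha=0.1)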



Data Scatterplot

Red is expensive, blue is cheap, and larger circles indicate areas with a larger population.
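One possible way to produce such a plot (circle size taken from the population column, color from the median house value; a sketch):
>>> import matplotlib.pyplot as plt
>>> housing.plot(kind="scatter", x="longitude", y="latitude", alpha=0.4,
...              s=housing["population"] / 100, label="population",
...              c="median_house_value", cmap="jet", colorbar=True,
...              figsize=(10, 7))
>>> plt.legend()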



Looking for Correlations

You can easily compute the standard correlation coefficient (also called Pearson's r) between every pair of attributes using the corr() method.
The correlation coefficient ranges from −1 to 1. When it is close to 1, it means that there is a strong positive correlation.
Example: the median house value tends to go up when the median income goes up.
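A sketch (numeric_only=True skips the categorical ocean_proximity column on recent pandas versions):
>>> corr_matrix = housing.corr(numeric_only=True)
>>> corr_matrix["median_house_value"].sort_values(ascending=False)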





Prepare the Data for ML

You should write functions for these data preparation steps:


• This will allow you to reproduce these transformations
easily on any dataset.
• You will gradually build a library of transformation
functions that you can reuse in future projects.
• You can use these functions in your live system to
transform the new data before feeding it to your
algorithms.
• This will make it possible for you to easily try various
transformations and see which combination of
transformations works best.
Data Cleaning

Most ML algorithms cannot work with missing features, so let's take care of them.
We saw earlier that the total_bedrooms attribute has some missing values.
To fix this, you have three options (see the pandas sketch below):
1. Get rid of the corresponding districts.
2. Get rid of the whole attribute.
3. Set the missing values to some value (zero, the mean, the median, etc.).
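In pandas, the three options map to one line each (each call returns a modified copy; a sketch):
>>> housing.dropna(subset=["total_bedrooms"])    # option 1: drop the districts
>>> housing.drop("total_bedrooms", axis=1)       # option 2: drop the whole attribute
>>> median = housing["total_bedrooms"].median()
>>> housing["total_bedrooms"].fillna(median)     # option 3: fill in the median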



Handling Text and Categorical Attributes

Let's look at the ocean_proximity values for the first 10 instances:

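A sketch; and since most ML algorithms prefer numbers over text, a common follow-up is one-hot encoding:
>>> housing["ocean_proximity"].head(10)          # the raw text categories
>>> from sklearn.preprocessing import OneHotEncoder
>>> cat_encoder = OneHotEncoder()
>>> housing_cat_1hot = cat_encoder.fit_transform(housing[["ocean_proximity"]])
>>> cat_encoder.categories_                      # the learned category list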


Feature Scaling and Normalization

What is feature scaling and normalization?
Standardization (the z-score) is the operation that transforms a feature to have a mean of 0 and a variance of 1. The z-score equation is simply: z = (x − μ) / σ
where x is the feature value, μ is the mean of x, and σ is the standard deviation of x:
$$\mu = \frac{1}{m}\sum_{i=1}^{m} x^{[i]}, \qquad \sigma = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(x^{[i]} - \mu\right)^2}$$
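In Scikit-Learn, standardization is provided by StandardScaler; a minimal sketch on a single attribute:
>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> income_scaled = scaler.fit_transform(housing[["median_income"]])
>>> print(income_scaled.mean(), income_scaled.std())   # approximately 0 and 1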



Feature Scaling and Normalization

For a dataset with multiple features, we generally have different scales for each one. This situation could give some of the features a greater influence on the final results.



Feature Scaling and Normalization

Min-Max normalization, also called feature scaling, performs a linear transformation that maps all the scaled data into the range [0, 1].
The formula to achieve this is the following:

$$x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}}$$

Min-max normalization preserves the relationships among the original values. We end up with smaller standard deviations, which can suppress the effect of outliers.
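Scikit-Learn provides this as MinMaxScaler; a sketch:
>>> from sklearn.preprocessing import MinMaxScaler
>>> min_max_scaler = MinMaxScaler()              # default feature_range is (0, 1)
>>> income_01 = min_max_scaler.fit_transform(housing[["median_income"]])
>>> print(income_01.min(), income_01.max())      # 0.0 and 1.0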



Feature Scaling and Normalization

Why do we need feature scaling and normalization?
With few exceptions, ML algorithms don't perform well when the input numerical attributes have very different scales. In this case, the model could give some of the features a greater influence on the final results.
By applying feature scaling and normalization, the algorithm won't focus too much on the extreme errors when striving to minimize the error, and it will obtain a more general solution.



Data Discretization

This is the process of converting continuous data into a set of intervals, which makes the data easier to analyze.
This method is also called a data reduction mechanism, as it transforms a large dataset into a set of categorical data.
Data discretization can be classified into two types:
• supervised, where the class information is used, and
• unsupervised, using a 'top-down splitting' or 'bottom-up merging' strategy.
For example, the values of the age attribute can be replaced by interval labels such as (0-10, 11-20, …) or (kid, youth, adult, senior), as in the sketch below.
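A sketch with pandas, using toy age values:
>>> import pandas as pd
>>> ages = pd.Series([4, 15, 32, 70])            # toy age values
>>> pd.cut(ages, bins=[0, 10, 20, 60, 100],
...        labels=["kid", "youth", "adult", "senior"])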
Select and Train a Model

You framed the problem, you got the data and explored it, you sampled a training set and a test set, and you wrote transformation pipelines to preprocess the data.
You are now ready to select and train a Machine Learning model. Let's first train a Linear Regression model:
>>> from sklearn.linear_model import LinearRegression
>>> lin_reg = LinearRegression()
>>> lin_reg.fit(housing_prepared, housing_labels)
Done! You now have a working Linear Regression model.



Select and Train a Model

Let’s try it out on a few instances from the training set:


>>> some_data = housing.iloc[:5]
>>> some_labels = housing_labels.iloc[:5]
>>> some_data_prepared = full_pipeline.transform(some_data)
>>> print("Predictions:", lin_reg.predict(some_data_prepared))
Predictions: [ 210644.6045 317768.8069 210956.4333 59218.9888
189747.5584]
>>> print("Labels:", list(some_labels))
Labels: [286600.0, 340600.0, 196900.0, 46300.0, 254500.0]



Select and Train a Model

It works, although the predictions are not exactly accurate. Let's measure this regression model's RMSE using the mean_squared_error() function:
>>> import numpy as np
>>> from sklearn.metrics import mean_squared_error
>>> housing_predictions = lin_reg.predict(housing_prepared)
>>> lin_mse = mean_squared_error(housing_labels, housing_predictions)
>>> lin_rmse = np.sqrt(lin_mse)
>>> lin_rmse
68628.19819848922
A typical prediction error of $68,628 is not very satisfying.



Select and Train a Model

Let's train a Decision Tree Regressor. This is a powerful model, capable of finding complex nonlinear relationships in the data. The code is:
>>> from sklearn.tree import DecisionTreeRegressor
>>> tree_reg = DecisionTreeRegressor()
>>> tree_reg.fit(housing_prepared, housing_labels)
Now that the model is trained, let’s evaluate it on the training set:
>>> housing_predictions = tree_reg.predict(housing_prepared)
>>> tree_mse = mean_squared_error(housing_labels, housing_predictions)
>>> tree_rmse = np.sqrt(tree_mse)
>>> tree_rmse
0.0
No error at all? Is the model overfitting?



Better Evaluation Using Cross-Validation

One way to evaluate the Decision Tree model is to split the training set into a smaller training set and a validation set, then train your model on the smaller set and evaluate it on the validation set.
It's a bit of work, but nothing too difficult, and it would work well.
A great alternative is to use K-fold cross-validation: randomly split the training set into 10 distinct subsets called folds, then train and evaluate your model 10 times, picking a different fold for evaluation every time and training on the other 9 folds.



Better Evaluation Using Cross-Validation

The result is an array containing the 10 evaluation scores:

>>> from sklearn.model_selection import cross_val_score
>>> scores = cross_val_score(tree_reg, housing_prepared, housing_labels,
...                          scoring="neg_mean_squared_error", cv=10)
>>> tree_rmse_scores = np.sqrt(-scores)
>>> display_scores(tree_rmse_scores)

Scores: [70194.33680785 66855.16363941 72432.58244769 70758.73896782
 71115.88230639 75585.14172901 70262.86139133 70273.6325285
 75366.87952553 71231.65726027]
Mean: 71407.68766037929
Standard deviation: 2439.4345041191004
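Note that display_scores() is not a library function; a minimal helper matching the output above would be:
>>> def display_scores(scores):
...     print("Scores:", scores)
...     print("Mean:", scores.mean())
...     print("Standard deviation:", scores.std())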



Better Evaluation Using Cross-Validation

Let's compute the same scores for the Linear Regression model:
>>> lin_scores = cross_val_score(lin_reg, housing_prepared, housing_labels,
...                              scoring="neg_mean_squared_error", cv=10)
>>> lin_rmse_scores = np.sqrt(-lin_scores)
>>> display_scores(lin_rmse_scores)
Scores: [66782.73843989 66960.118071 70347.95244419
74739.57052552 68031.13388938 71193.84183426
64969.63056405 68281.61137997 71552.91566558
67665.10082067]
Mean: 69052.46136345083
Standard deviation: 2731.674001798348
NOTE: the Decision Tree model is overfitting so badly that it performs worse than the Linear Regression model.



Fine-Tune Your Model

Let's assume that you now have a shortlist of promising models. You now need to fine-tune them.
You can do this by varying the hyperparameters when building your models on your training data.
There are two approaches to do this:
• Grid Search
• Random Search



Grid Search

One option would be to test hyperparameter combinations manually, until you find a great combination of hyperparameter values.
This would be very tedious work, and you may not have time to explore many combinations.
Some Python libraries automate it (e.g., Scikit-Learn's GridSearchCV):
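A sketch, assuming a Random Forest model and an illustrative (hypothetical) grid of values to try:
>>> from sklearn.model_selection import GridSearchCV
>>> from sklearn.ensemble import RandomForestRegressor
>>> param_grid = {"n_estimators": [10, 30, 100],     # hypothetical values to try
...               "max_features": [4, 6, 8]}
>>> grid_search = GridSearchCV(RandomForestRegressor(), param_grid, cv=5,
...                            scoring="neg_mean_squared_error")
>>> grid_search.fit(housing_prepared, housing_labels)
>>> grid_search.best_params_                         # best combination found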



Random Search

When the hyperparameter search space is large, it is often preferable to use Random Search instead (e.g., RandomizedSearchCV).
This method evaluates a given number of random combinations by selecting a random value for each hyperparameter at every iteration:
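A sketch, again assuming a Random Forest and hypothetical sampling ranges:
>>> from sklearn.model_selection import RandomizedSearchCV
>>> from sklearn.ensemble import RandomForestRegressor
>>> from scipy.stats import randint
>>> param_distribs = {"n_estimators": randint(10, 200),   # sample from a range
...                   "max_features": randint(2, 9)}
>>> rnd_search = RandomizedSearchCV(RandomForestRegressor(), param_distribs,
...                                 n_iter=10, cv=5,
...                                 scoring="neg_mean_squared_error", random_state=42)
>>> rnd_search.fit(housing_prepared, housing_labels)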



Random Search

This approach has two main benefits:


• If you let the randomized search run for, say, 1,000
iterations, this approach will explore 1,000 different values
for each hyperparameter (instead of just a few values per
hyperparameter with the grid search approach).
• Simply by setting the number of iterations, you have more
control over the computing budget you want to allocate to
hyperparameter search.



Ensemble Methods

Another way to fine-tune your system is to try to combine the models that perform best.
The group (the "ensemble") will often perform better than the best individual model.
Example: a Random Forest performs better than the individual Decision Trees it is built from.
This method is especially effective if the individual models make very different types of errors.
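For instance, with Scikit-Learn (a sketch reusing the prepared training data from earlier):
>>> from sklearn.ensemble import RandomForestRegressor
>>> forest_reg = RandomForestRegressor(n_estimators=100, random_state=42)
>>> forest_reg.fit(housing_prepared, housing_labels)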



Evaluate Your System on the Test Set

After tweaking your models for a while, you eventually have a system that performs sufficiently well.
Now is the time to evaluate the final model on the test set, as in the sketch below:
1. get the features and the labels from your test set,
2. run your full pipeline to transform the data (transform only, never fit on the test set), and
3. evaluate the final model on the test set.
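A sketch of these three steps, reusing names from the earlier snippets (grid_search, full_pipeline, and the test_set split):
>>> import numpy as np
>>> from sklearn.metrics import mean_squared_error
>>> final_model = grid_search.best_estimator_
>>> X_test = test_set.drop("median_house_value", axis=1)   # step 1: features
>>> y_test = test_set["median_house_value"].copy()         #         and labels
>>> X_test_prepared = full_pipeline.transform(X_test)      # step 2: transform only
>>> final_predictions = final_model.predict(X_test_prepared)
>>> final_rmse = np.sqrt(mean_squared_error(y_test, final_predictions))  # step 3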



Confusion Matrix

When performing classification predictions, there are four types of outcomes that can occur:
• True positives: are when you predict an observation
belongs to a class and it does belong to that class.
• True negatives: are when you predict an observation does
not belong to a class and it does not belong to that class.
• False positives: occur when you predict an observation
belongs to a class when it does not.
• False negatives: occur when you predict an observation
does not belong to a class when in fact it does.



Confusion Matrix

The following confusion matrices are examples for two binary classification models.
You would generate such matrices to analyze each model's predictions.
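For instance, with Scikit-Learn (toy labels and predictions; rows are true classes, columns are predicted classes):
>>> from sklearn.metrics import confusion_matrix
>>> y_true = [1, 0, 1, 1, 0, 0]     # toy ground-truth labels
>>> y_pred = [1, 0, 0, 1, 0, 1]     # toy model predictions
>>> confusion_matrix(y_true, y_pred)
array([[2, 1],
       [1, 2]])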



Evaluation Metrics – Accuracy

The three main metrics used to evaluate a classification model are accuracy, precision, and recall.
Accuracy is defined as the percentage of correct predictions for the test data. It can be calculated easily by dividing the number of correct predictions by the total number of predictions:

$$\mathrm{accuracy} = \frac{\text{correct predictions}}{\text{all predictions}}$$



Evaluation Metrics – Precision & Recall

Precision is defined as the fraction of relevant examples (true positives) among all the examples which were predicted to belong to a certain class:

$$\mathrm{precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$$

Recall is defined as the fraction of examples which were predicted to belong to a class with respect to all the examples that truly belong to the class:

$$\mathrm{recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$$



Evaluation Metrics – f-score

Ultimately, it's nice to have one number to evaluate a machine learning model. Thus, it makes sense to combine the precision and recall metrics into the F-score:

$$F_\beta = (1 + \beta^2) \times \frac{\mathrm{precision} \times \mathrm{recall}}{\beta^2 \times \mathrm{precision} + \mathrm{recall}}$$

The β parameter allows us to control the trade-off of importance between precision and recall: β < 1 focuses more on precision, while β > 1 focuses more on recall.
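These metrics are all available in Scikit-Learn; a sketch reusing the toy arrays from the confusion-matrix example (precision and recall are equal here, so every F-score is too):
>>> from sklearn.metrics import precision_score, recall_score, fbeta_score
>>> y_true = [1, 0, 1, 1, 0, 0]             # toy labels
>>> y_pred = [1, 0, 0, 1, 0, 1]             # toy predictions
>>> precision_score(y_true, y_pred)         # 2 TP / (2 TP + 1 FP)
0.6666666666666666
>>> recall_score(y_true, y_pred)            # 2 TP / (2 TP + 1 FN)
0.6666666666666666
>>> fbeta_score(y_true, y_pred, beta=2)     # beta > 1 leans toward recall
0.6666666666666666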
Evaluation Metrics – Mean Squared Error

Mean Squared Error is simply defined as the average of the squared differences between the predicted output and the true output.
Squared error is commonly used because it is indifferent to whether the prediction was too high or too low; it just reports that the prediction was wrong, and by how much.

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{true}^{(i)} - y_{pred}^{(i)}\right)^2$$



Launch, Monitor, and Maintain Your System

You can wrap the model within a web service. This makes it easier to upgrade your model to new versions without interrupting the main application.
Another popular strategy is to deploy your model on the cloud, for example using Google Cloud AI Platform.
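A minimal web-service sketch, assuming (hypothetically) that the trained model was saved to model.pkl with joblib and that Flask is used as the web framework:
import joblib
from flask import Flask, request, jsonify

app = Flask(__name__)
model = joblib.load("model.pkl")             # hypothetical saved model file

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]   # a list of feature values
    return jsonify(prediction=model.predict([features]).tolist())

# run locally with: app.run(port=5000)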



Launch, Monitor, and Maintain Your System

But deployment is not the end of the story. You also need to
write monitoring code to check the performance and trigger
alerts when it drops.
Even a model trained to classify pictures of cats and dogs may
need to be retrained regularly.
Not because cats and dogs will mutate overnight, but because
cameras keep changing, along with image formats, sharpness,
brightness, and size ratios.
Moreover, people may love different breeds next year, or they
may decide to dress their pets with tiny hats—who knows?



Launch, Monitor, and Maintain Your System

If the data keeps evolving, you should probably automate the process of updating your datasets and retraining your model regularly:
1. Collect fresh data regularly and label it.
2. Write a script to train the model and fine-tune the hyperparameters automatically (run it every day or every week, for example).
3. Write another script that evaluates both the new and the previous models, and deploys the new model only if performance has not decreased.



Launch, Monitor, and Maintain Your System

Finally, make sure you keep backups of every model you create, so you can roll back to a previous model quickly in case the new model starts failing badly.
Having backups also makes it possible to easily compare new models with previous ones.
Similarly, you should keep backups of every version of your datasets, so that you can roll back to a previous dataset if the new one ever gets corrupted.
Having backups of your datasets also allows you to evaluate any model against any previous dataset.



End of Chapter 03

