
Machine Learning

Chapter 03
Machine Learning Systems
Prepared by: Ziad Doughan
Email: [email protected]



Working with Real Data

In ML projects, it is best to experiment with real-world data, not artificial datasets. There are thousands of open datasets to choose from.
Example:
The California Housing Prices dataset is based on data from the 1990 California census.



Look at the Big Picture

The first step is to understand how machine learning is going to be used to build an intelligent model.
Example:
The housing prices data includes metrics such as the
population, median income, and median housing price for
each block group in California.
Block groups, called “districts”, are the smallest geographical
unit for which the US Census Bureau publishes sample data.
The model should learn to predict the median housing price in
any district, given all the other metrics.



Frame the Problem

The first question to ask is: "What exactly is the business objective?"
Building a model is probably not the end goal; the real target is: "How does the company expect to use and benefit from this model?"
Knowing the objective is important because it will determine:
• how you frame the problem,
• which algorithms you will select,
• which performance measure you will use to evaluate your
model,
• and how much effort you will spend tweaking it.



Data Pipeline

A data pipeline is a sequence of data processing components. Pipelines are very common in ML, since there is a lot of data to manipulate and many transformations to apply.



Current System Assessment

The next question to ask is: "What does the current solution look like, if any?"
The current situation will often give you a reference for
performance, as well as insights on how to solve the problem.
Example: The district housing prices are currently estimated
manually by experts.
A team gathers up-to-date information about a district, and
when they cannot get the median housing price, they
estimate it using complex rules.



Current System Assessment

This is costly and time-consuming, and their estimates are not great.
In cases where they manage to find out the actual median
housing price, they often realize that their estimates were off
by more than 20%.
This is why the company thinks that it would be useful to train
a model to predict a district’s median housing price, given
other data about that district.



Dataset Assessment

The census data looks like a great dataset to exploit for this
purpose, since it includes the median housing prices of
thousands of districts, as well as other data.
With all this information, you are now ready to start designing
your system.
First, you need to frame the problem:
• Is it supervised, unsupervised, or Reinforcement Learning?
• Is it a classification, a regression, or something else?
• Should you use batch or online learning techniques?



Dataset Assessment

Let's return to the housing prices example:
It is clearly a typical supervised learning task, since we have labeled training examples.
It is also a typical univariate regression problem, since we are
only trying to predict a single value for each district.
Finally, there is no continuous flow of data coming into the
system, no particular need to adjust to rapidly changing data,
and the data is small enough to fit in memory, so plain batch
learning should do just fine.



Performance Measure

Your next step is to select a performance measure.


A typical performance measure for regression problems is the
Root Mean Square Error (RMSE).
It gives an idea of how much error the system typically makes
in its predictions, with a higher weight for large errors.
The equation of the RMSE is:
$$\mathrm{RMSE}(\mathbf{X}, h) = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(h\left(\mathbf{x}^{(i)}\right) - y^{(i)}\right)^2}$$
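As a quick sanity check, RMSE can be computed directly with NumPy. A minimal sketch, using hypothetical prediction and label arrays (not from the housing data):
>>> import numpy as np
>>> y_true = np.array([3.0, -0.5, 2.0, 7.0])        # hypothetical labels
>>> y_pred = np.array([2.5, 0.0, 2.0, 8.0])         # hypothetical predictions
>>> print(np.sqrt(np.mean((y_pred - y_true) ** 2))) # RMSE
0.6123724356957945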



ML Notations

This equation introduces several notations:


• m is the number of instances in the dataset.
• $x^{(i)}$ is a vector of all the feature values of the i-th instance.
• $y^{(i)}$ is its label, i.e., the desired output value.
• $\mathbf{X}$ is a matrix containing all the feature values of all the instances.
• h is the system's prediction function, or hypothesis.
• $\hat{y}^{(i)} = h(x^{(i)})$ is the predicted value ("y-hat").



Mean Absolute Error

RMSE is the preferred performance measure for regression tasks. But in some contexts, another function may be preferable.
For example, suppose there are many outliers. In that case, you should consider using the mean absolute error (MAE), also called the average absolute deviation:
$$\mathrm{MAE}(\mathbf{X}, h) = \frac{1}{m}\sum_{i=1}^{m}\left|h\left(\mathbf{x}^{(i)}\right) - y^{(i)}\right|$$

Mathematically, both the RMSE and the MAE are ways to measure the distance between two vectors: the vector of predictions and the vector of target values.
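A small sketch showing why RMSE is more sensitive to outliers than MAE, using a toy array of prediction errors (one of them extreme):
>>> import numpy as np
>>> errors = np.array([1.0, 1.0, 1.0, 10.0])        # toy errors, one outlier
>>> print(np.mean(np.abs(errors)))                  # MAE
3.25
>>> print(round(np.sqrt(np.mean(errors ** 2)), 3))  # RMSE, dominated by the outlier
5.074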



Preparing the Development Infrastructure

Create the Workspace:


• First, you will need to have Python installed.
• Next, you need to create a workspace directory for your
code and datasets.
• Then, you will need some Python modules: NumPy,
pandas, Matplotlib, Scikit-Learn ...
• Finally, make sure to upgrade your modules.



Creating an Isolated Environment

It is strongly recommended to work in an isolated environment. This allows you to work on different projects without having conflicting library versions.
In Python you can create such an isolated environment, and every time you want to use a specific environment, all you have to do is activate it.
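For example, with Python's built-in venv module. A typical sketch for Linux/macOS (on Windows, activation is env\Scripts\activate instead):
$ python3 -m venv env                     # create the isolated environment
$ source env/bin/activate                 # activate it for the current shell
$ pip install --upgrade numpy pandas matplotlib scikit-learn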



Getting the Data

In typical environments your data would be available in a relational database and spread across multiple tables, documents, or files.
To access it, you would first need to get your credentials and
access authorizations and familiarize yourself with the data
schema.
In simpler projects however, you just need to create or
download a single compressed file, like housing.tgz, which
contains a comma-separated values (CSV) file called
housing.csv with all the data.
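A minimal download-and-extract sketch, with a hypothetical DOWNLOAD_URL standing in for wherever housing.tgz is actually hosted:
>>> import tarfile, urllib.request
>>> DOWNLOAD_URL = "https://example.com/housing.tgz"   # hypothetical location
>>> urllib.request.urlretrieve(DOWNLOAD_URL, "housing.tgz")
>>> with tarfile.open("housing.tgz") as tgz:
...     tgz.extractall(path="datasets")                # yields datasets/housing.csv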



Always Take a Look at the Data

Each row represents one district. There are ten attributes: longitude, latitude, housing_median_age, total_rooms, total_bedrooms, population, households, median_income, median_house_value, and ocean_proximity.
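Loading the file with pandas and peeking at the first rows (the path assumes the extraction sketch above):
>>> import pandas as pd
>>> housing = pd.read_csv("datasets/housing.csv")
>>> housing.head()                                # first five districts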



Always Take a Look at the Data

The info() method is useful to get a quick description of the data. All attributes are numerical, except ocean_proximity, whose type is object.
You can find out what categories exist and how many districts belong to each category.
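For example:
>>> housing.info()                               # column types and missing-value counts
>>> housing["ocean_proximity"].value_counts()    # the categories and their frequencies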







Data Histograms

These attributes have very different scales, so we will later need to apply feature scaling to them.
Many histograms are tail-heavy: they extend much farther to the right of the median than to the left.
This may make it hard for some ML algorithms to detect patterns, so we will later transform these attributes to have more bell-shaped distributions.
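The histograms themselves can be drawn with pandas; a sketch:
>>> import matplotlib.pyplot as plt
>>> housing.hist(bins=50, figsize=(20, 15))      # one histogram per numerical attribute
>>> plt.show()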



Data Histograms

The median income attribute is not expressed in US dollars (USD). After checking the data, you find that the numbers represent roughly tens of thousands of dollars (e.g., 3 means about $30,000). Working with preprocessed attributes is common in Machine Learning.



Data Histograms

The housing median age and the median house value were capped. Your ML model may learn that prices never go beyond that limit. If you need precise predictions even beyond $500,000, then you have two options:
a. Collect proper labels for the districts whose labels were capped.
b. Remove those districts from the dataset.



Create a Test Set

Creating a test set is theoretically simple: pick some instances randomly, typically 20% of the dataset (or less if your dataset is very large), and set them aside.
However, you must make sure that your ML model never gets to see the test set during training.
Example: for a small dataset (up to ~100,000 records), use 70% training set / 30% test set, or 60% training set / 20% development set / 20% test set.
For a big dataset, however, 98% training set / 1% development set / 1% test set can be enough.
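With Scikit-Learn, a purely random 80/20 split is one line (random_state fixes the seed so the same test set is produced on every run):
>>> from sklearn.model_selection import train_test_split
>>> train_set, test_set = train_test_split(housing, test_size=0.2, random_state=42)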



Visualize the Data to Gain Insights

You can create a scatterplot to visualize the data.



Data Scatterplot

Setting the alpha option to 0.1 makes it much easier to visualize the places where there is a high density of data points.
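A sketch of such a scatterplot with pandas:
>>> housing.plot(kind="scatter", x="longitude", y="latitude", alpha=0.1)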



Data Scatterplot

Red is expensive, blue is cheap, and larger circles indicate areas with a larger population.
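One possible way to produce such a plot (circle size taken from the population column, color from the median house value; a sketch):
>>> import matplotlib.pyplot as plt
>>> housing.plot(kind="scatter", x="longitude", y="latitude", alpha=0.4,
...              s=housing["population"] / 100, label="population",
...              c="median_house_value", cmap="jet", colorbar=True,
...              figsize=(10, 7))
>>> plt.legend()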



Looking for Correlations

You can easily compute the standard correlation coefficient (also called Pearson's r) between every pair of attributes using the corr() method.
The correlation coefficient ranges from −1 to 1. When it is close to 1, it means that there is a strong positive correlation.
Example: the median house value tends to go up when the median income goes up.
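A sketch (numeric_only=True skips the categorical ocean_proximity column on recent pandas versions):
>>> corr_matrix = housing.corr(numeric_only=True)
>>> corr_matrix["median_house_value"].sort_values(ascending=False)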





Prepare the Data for ML

You should write functions for these data preparation steps:


• This will allow you to reproduce these transformations
easily on any dataset.
• You will gradually build a library of transformation
functions that you can reuse in future projects.
• You can use these functions in your live system to
transform the new data before feeding it to your
algorithms.
• This will make it possible for you to easily try various
transformations and see which combination of
transformations works best.
Data Cleaning

Most ML algorithms cannot work with missing features, so let's take care of them.
We saw earlier that the total_bedrooms attribute has some missing values.
To fix this, you have three options (see the pandas sketch below):
1. Get rid of the corresponding districts.
2. Get rid of the whole attribute.
3. Set the missing values to some value (zero, the mean, the median, etc.).
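In pandas, the three options map to one line each (each call returns a modified copy; a sketch):
>>> housing.dropna(subset=["total_bedrooms"])    # option 1: drop the districts
>>> housing.drop("total_bedrooms", axis=1)       # option 2: drop the whole attribute
>>> median = housing["total_bedrooms"].median()
>>> housing["total_bedrooms"].fillna(median)     # option 3: fill in the median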



Handling Text and Categorical Attributes

Let's look at the ocean_proximity values for the first 10 instances:

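A sketch; and since most ML algorithms prefer numbers over text, a common follow-up is one-hot encoding:
>>> housing["ocean_proximity"].head(10)          # the raw text categories
>>> from sklearn.preprocessing import OneHotEncoder
>>> cat_encoder = OneHotEncoder()
>>> housing_cat_1hot = cat_encoder.fit_transform(housing[["ocean_proximity"]])
>>> cat_encoder.categories_                      # the learned category list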


Feature Scaling and Normalization

What is feature scaling and normalization?
Standardization (the z-score) is the operation that transforms a feature to have a mean of 0 and a variance of 1. The z-score equation is simply: z = (x − μ) / σ
where x is the feature value, μ is the mean of x, and σ is the standard deviation of x:
$$\mu = \frac{1}{m}\sum_{i=1}^{m} x^{[i]}, \qquad \sigma = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(x^{[i]} - \mu\right)^2}$$
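In Scikit-Learn, standardization is provided by StandardScaler; a minimal sketch on a single attribute:
>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> income_scaled = scaler.fit_transform(housing[["median_income"]])
>>> print(income_scaled.mean(), income_scaled.std())   # approximately 0 and 1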



Feature Scaling and Normalization

For a dataset with multiple features, we generally have different scales for each one. This situation could give some of the features a greater influence on the final results.



Feature Scaling and Normalization

Min-Max normalization, also called feature scaling, performs a linear transformation that maps all the scaled data into the range [0, 1].
The formula to achieve this is the following:

$$x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}}$$

Min-max normalization preserves the relationships among the original values. We end up with smaller standard deviations, which can suppress the effect of outliers.
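Scikit-Learn provides this as MinMaxScaler; a sketch:
>>> from sklearn.preprocessing import MinMaxScaler
>>> min_max_scaler = MinMaxScaler()              # default feature_range is (0, 1)
>>> income_01 = min_max_scaler.fit_transform(housing[["median_income"]])
>>> print(income_01.min(), income_01.max())      # 0.0 and 1.0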



Feature Scaling and Normalization

Why do we need feature scaling and normalization?
With few exceptions, ML algorithms don't perform well when the input numerical attributes have very different scales. In this case, the model could give some of the features a greater influence on the final results.
By applying feature scaling and normalization, the algorithm won't focus too much on the extreme errors when striving to minimize the error, and it will obtain a more general solution.



Data Discretization

This is the process of converting continuous data into a set of intervals, which makes the data easier to analyze.
This method is also called a data reduction mechanism, as it transforms a large dataset into a set of categorical data.
Data discretization can be classified into two types:
• supervised, where the class information is used, and
• unsupervised, using a 'top-down splitting' or 'bottom-up merging' strategy.
For example, the values of the age attribute can be replaced by interval labels such as (0-10, 11-20, …) or (kid, youth, adult, senior), as in the sketch below.
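A sketch with pandas, using toy age values:
>>> import pandas as pd
>>> ages = pd.Series([4, 15, 32, 70])            # toy age values
>>> pd.cut(ages, bins=[0, 10, 20, 60, 100],
...        labels=["kid", "youth", "adult", "senior"])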
Select and Train a Model

You framed the problem, you got the data and explored it, you sampled a training set and a test set, and you wrote transformation pipelines to preprocess the data.
You are now ready to select and train a Machine Learning model. Let's first train a Linear Regression model:
>>> from sklearn.linear_model import LinearRegression
>>> lin_reg = LinearRegression()
>>> lin_reg.fit(housing_prepared, housing_labels)
Done! You now have a working Linear Regression model.



Select and Train a Model

Let’s try it out on a few instances from the training set:


>>> some_data = housing.iloc[:5]
>>> some_labels = housing_labels.iloc[:5]
>>> some_data_prepared = full_pipeline.transform(some_data)
>>> print("Predictions:", lin_reg.predict(some_data_prepared))
Predictions: [ 210644.6045 317768.8069 210956.4333 59218.9888
189747.5584]
>>> print("Labels:", list(some_labels))
Labels: [286600.0, 340600.0, 196900.0, 46300.0, 254500.0]



Select and Train a Model

It works, although the predictions are not exactly accurate. Let's measure this regression model's RMSE using the mean_squared_error() function:
>>> import numpy as np
>>> from sklearn.metrics import mean_squared_error
>>> housing_predictions = lin_reg.predict(housing_prepared)
>>> lin_mse = mean_squared_error(housing_labels, housing_predictions)
>>> lin_rmse = np.sqrt(lin_mse)
>>> lin_rmse
68628.19819848922
A typical prediction error of $68,628 is not very satisfying.



Select and Train a Model

Let's train a Decision Tree Regressor. This is a powerful model, capable of finding complex nonlinear relationships in the data. The code is:
>>> from sklearn.tree import DecisionTreeRegressor
>>> tree_reg = DecisionTreeRegressor()
>>> tree_reg.fit(housing_prepared, housing_labels)
Now that the model is trained, let’s evaluate it on the training set:
>>> housing_predictions = tree_reg.predict(housing_prepared)
>>> tree_mse = mean_squared_error(housing_labels, housing_predictions)
>>> tree_rmse = np.sqrt(tree_mse)
>>> tree_rmse
0.0
No error at all? Is the model overfitting?



Better Evaluation Using Cross-Validation

One way to evaluate the Decision Tree model is to split the training set into a smaller training set and a validation set, then train your model on the smaller set and evaluate it on the validation set.
It's a bit of work, but nothing too difficult, and it would work well.
A great alternative is to use K-fold cross-validation: randomly split the training set into 10 distinct subsets called folds, then train and evaluate your model 10 times, picking a different fold for evaluation every time and training on the other 9 folds.



Better Evaluation Using Cross-Validation

The result is an array containing the 10 evaluation scores:

>>> from sklearn.model_selection import cross_val_score
>>> scores = cross_val_score(tree_reg, housing_prepared, housing_labels,
...                          scoring="neg_mean_squared_error", cv=10)
>>> tree_rmse_scores = np.sqrt(-scores)
>>> display_scores(tree_rmse_scores)

Scores: [70194.33680785 66855.16363941 72432.58244769 70758.73896782
 71115.88230639 75585.14172901 70262.86139133 70273.6325285
 75366.87952553 71231.65726027]
Mean: 71407.68766037929
Standard deviation: 2439.4345041191004
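Note that display_scores() is not a library function; a minimal helper matching the output above would be:
>>> def display_scores(scores):
...     print("Scores:", scores)
...     print("Mean:", scores.mean())
...     print("Standard deviation:", scores.std())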



Better Evaluation Using Cross-Validation

Let's compute the same scores for the Linear Regression model:
>>> lin_scores = cross_val_score(lin_reg, housing_prepared, housing_labels,
...                              scoring="neg_mean_squared_error", cv=10)
>>> lin_rmse_scores = np.sqrt(-lin_scores)
>>> display_scores(lin_rmse_scores)
Scores: [66782.73843989 66960.118071 70347.95244419
74739.57052552 68031.13388938 71193.84183426
64969.63056405 68281.61137997 71552.91566558
67665.10082067]
Mean: 69052.46136345083
Standard deviation: 2731.674001798348
NOTE: the Decision Tree model is overfitting so badly that it performs worse than the Linear Regression model.



Fine-Tune Your Model

Let's assume that you now have a shortlist of promising models. You now need to fine-tune them.
You can do this by varying the hyperparameters when building your models on your training data.
There are two approaches to do this:
• Grid Search
• Random Search



Grid Search

One option would be to test hyperparameter combinations manually, until you find a great combination of hyperparameter values.
This would be very tedious work, and you may not have time to explore many combinations.
Some Python libraries automate it (e.g., Scikit-Learn's GridSearchCV):
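A sketch, assuming a Random Forest model and an illustrative (hypothetical) grid of values to try:
>>> from sklearn.model_selection import GridSearchCV
>>> from sklearn.ensemble import RandomForestRegressor
>>> param_grid = {"n_estimators": [10, 30, 100],     # hypothetical values to try
...               "max_features": [4, 6, 8]}
>>> grid_search = GridSearchCV(RandomForestRegressor(), param_grid, cv=5,
...                            scoring="neg_mean_squared_error")
>>> grid_search.fit(housing_prepared, housing_labels)
>>> grid_search.best_params_                         # best combination found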



Random Search

When the hyperparameter search space is large, it is often preferable to use Random Search instead (e.g., RandomizedSearchCV).
This method evaluates a given number of random combinations by selecting a random value for each hyperparameter at every iteration:
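A sketch, again assuming a Random Forest and hypothetical sampling ranges:
>>> from sklearn.model_selection import RandomizedSearchCV
>>> from sklearn.ensemble import RandomForestRegressor
>>> from scipy.stats import randint
>>> param_distribs = {"n_estimators": randint(10, 200),   # sample from a range
...                   "max_features": randint(2, 9)}
>>> rnd_search = RandomizedSearchCV(RandomForestRegressor(), param_distribs,
...                                 n_iter=10, cv=5,
...                                 scoring="neg_mean_squared_error", random_state=42)
>>> rnd_search.fit(housing_prepared, housing_labels)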



Random Search

This approach has two main benefits:


• If you let the randomized search run for, say, 1,000
iterations, this approach will explore 1,000 different values
for each hyperparameter (instead of just a few values per
hyperparameter with the grid search approach).
• Simply by setting the number of iterations, you have more
control over the computing budget you want to allocate to
hyperparameter search.



Ensemble Methods

Another way to fine-tune your system is to try to combine the models that perform best.
The group (the "ensemble") will often perform better than the best individual model.
Example: a Random Forest performs better than the individual Decision Trees it is built from.
This method is especially effective if the individual models make very different types of errors.
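For instance, with Scikit-Learn (a sketch reusing the prepared training data from earlier):
>>> from sklearn.ensemble import RandomForestRegressor
>>> forest_reg = RandomForestRegressor(n_estimators=100, random_state=42)
>>> forest_reg.fit(housing_prepared, housing_labels)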



Evaluate Your System on the Test Set

After tweaking your models for a while, you eventually have a system that performs sufficiently well.
Now is the time to evaluate the final model on the test set, as in the sketch below:
1. get the features and the labels from your test set,
2. run your full pipeline to transform the data (transform only, never fit on the test set), and
3. evaluate the final model on the test set.
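A sketch of these three steps, reusing names from the earlier snippets (grid_search, full_pipeline, and the test_set split):
>>> import numpy as np
>>> from sklearn.metrics import mean_squared_error
>>> final_model = grid_search.best_estimator_
>>> X_test = test_set.drop("median_house_value", axis=1)   # step 1: features
>>> y_test = test_set["median_house_value"].copy()         #         and labels
>>> X_test_prepared = full_pipeline.transform(X_test)      # step 2: transform only
>>> final_predictions = final_model.predict(X_test_prepared)
>>> final_rmse = np.sqrt(mean_squared_error(y_test, final_predictions))  # step 3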



Confusion Matrix

When performing classification predictions, there are four types of outcomes that can occur:
• True positives: are when you predict an observation
belongs to a class and it does belong to that class.
• True negatives: are when you predict an observation does
not belong to a class and it does not belong to that class.
• False positives: occur when you predict an observation
belongs to a class when it does not.
• False negatives: occur when you predict an observation
does not belong to a class when in fact it does.



Confusion Matrix

The following confusion matrices are examples for two binary classification models.
You would generate such matrices to analyze each model's predictions.
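For instance, with Scikit-Learn (toy labels and predictions; rows are true classes, columns are predicted classes):
>>> from sklearn.metrics import confusion_matrix
>>> y_true = [1, 0, 1, 1, 0, 0]     # toy ground-truth labels
>>> y_pred = [1, 0, 0, 1, 0, 1]     # toy model predictions
>>> confusion_matrix(y_true, y_pred)
array([[2, 1],
       [1, 2]])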



Evaluation Metrics – Accuracy

The three main metrics used to evaluate a classification model are accuracy, precision, and recall.
Accuracy is defined as the percentage of correct predictions for the test data. It can be calculated easily by dividing the number of correct predictions by the total number of predictions:

$$\mathrm{accuracy} = \frac{\text{correct predictions}}{\text{all predictions}}$$



Evaluation Metrics – Precision & Recall

Precision is defined as the fraction of relevant examples (true positives) among all the examples which were predicted to belong to a certain class:

$$\mathrm{precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$$

Recall is defined as the fraction of examples which were predicted to belong to a class with respect to all the examples that truly belong to the class:

$$\mathrm{recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$$



Evaluation Metrics – f-score

Ultimately, it's nice to have one number to evaluate a machine learning model. Thus, it makes sense to combine the precision and recall metrics into the F-score:

$$F_\beta = (1 + \beta^2) \times \frac{\mathrm{precision} \times \mathrm{recall}}{\beta^2 \times \mathrm{precision} + \mathrm{recall}}$$

The β parameter allows us to control the trade-off of importance between precision and recall: β < 1 focuses more on precision, while β > 1 focuses more on recall.
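These metrics are all available in Scikit-Learn; a sketch reusing the toy arrays from the confusion-matrix example (precision and recall are equal here, so every F-score is too):
>>> from sklearn.metrics import precision_score, recall_score, fbeta_score
>>> y_true = [1, 0, 1, 1, 0, 0]             # toy labels
>>> y_pred = [1, 0, 0, 1, 0, 1]             # toy predictions
>>> precision_score(y_true, y_pred)         # 2 TP / (2 TP + 1 FP)
0.6666666666666666
>>> recall_score(y_true, y_pred)            # 2 TP / (2 TP + 1 FN)
0.6666666666666666
>>> fbeta_score(y_true, y_pred, beta=2)     # beta > 1 leans toward recall
0.6666666666666666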
Evaluation Metrics – Mean Squared Error

Mean Squared Error is simply defined as the average of the squared differences between the predicted output and the true output.
Squared error is commonly used because it is indifferent to whether the prediction was too high or too low; it just reports that the prediction was wrong, and by how much.

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{true}^{(i)} - y_{pred}^{(i)}\right)^2$$



Launch, Monitor, and Maintain Your System

You can wrap the model within a web service. This makes it easier to upgrade your model to new versions without interrupting the main application.
Another popular strategy is to deploy your model on the cloud, for example using Google Cloud AI Platform.
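A minimal web-service sketch, assuming (hypothetically) that the trained model was saved to model.pkl with joblib and that Flask is used as the web framework:
import joblib
from flask import Flask, request, jsonify

app = Flask(__name__)
model = joblib.load("model.pkl")             # hypothetical saved model file

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]   # a list of feature values
    return jsonify(prediction=model.predict([features]).tolist())

# run locally with: app.run(port=5000)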



Launch, Monitor, and Maintain Your System

But deployment is not the end of the story. You also need to
write monitoring code to check the performance and trigger
alerts when it drops.
Even a model trained to classify pictures of cats and dogs may
need to be retrained regularly.
Not because cats and dogs will mutate overnight, but because
cameras keep changing, along with image formats, sharpness,
brightness, and size ratios.
Moreover, people may love different breeds next year, or they
may decide to dress their pets with tiny hats—who knows?



Launch, Monitor, and Maintain Your System

If the data keeps evolving, you should probably automate the process of updating your datasets and retraining your model regularly:
1. Collect fresh data regularly and label it.
2. Write a script to train the model and fine-tune the hyperparameters automatically (run it every day or every week, for example).
3. Write another script that evaluates both the new and the previous models, and deploys the new model only if performance has not decreased.



Launch, Monitor, and Maintain Your System

Finally, make sure you keep backups of every model you create, so you can roll back to a previous model quickly in case the new model starts failing badly.
Having backups also makes it possible to easily compare new models with previous ones.
Similarly, you should keep backups of every version of your datasets, so that you can roll back to a previous dataset if the new one ever gets corrupted.
Having backups of your datasets also allows you to evaluate any model against any previous dataset.



End of Chapter 03

