Machine Learning Models: by Mayuri Bhandari


Types of Machine Learning
Supervised Learning
Supervised learning is when you provide the machine with labeled training data to perform a specific task.
In the training data, you feed the machine many similar examples, and the computer predicts the answer for each one.
You then give the computer feedback on whether its prediction was right or wrong.
Supervised learning is task-specific, and that's why it's quite common.
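The loop described above — predict, receive feedback, adjust — can be sketched with a tiny perceptron trained on labeled examples of the logical AND function. This is a minimal illustration, not part of the slides; the data, learning rate, and model are all assumptions made for the example.

```python
# Minimal supervised-learning sketch: a perceptron learning AND.
# The labels act as the "feedback" described above.
def train_perceptron(examples, epochs=10, lr=0.1):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), label in examples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = label - pred          # feedback: was the prediction right?
            w[0] += lr * error * x1       # adjust weights toward the label
            w[1] += lr * error * x2
            b += lr * error
    return w, b

def predict(w, b, x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

# Labeled training data for the AND function
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(data)
```

After training, the model reproduces the labels it was supervised with.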
Unsupervised Learning
As the name suggests, unsupervised learning is the opposite of supervised learning.
In this case, you don't provide the machine with any labeled training data.
The machine has to reach conclusions from the data on its own, without any labels.
It is somewhat more challenging to implement than supervised learning.
It is used for clustering data and for finding anomalies.
It is also quite popular, as it is data-driven.
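As a sketch of clustering without labels, here is a minimal one-dimensional k-means in plain Python. The data and the initialization scheme are illustrative assumptions; no labels are given, and the algorithm groups points by proximity alone.

```python
# Minimal unsupervised-learning sketch: 1-D k-means clustering.
def kmeans_1d(points, k, iters=20):
    # Illustrative initialization: spread centroids across the sorted data
    pts = sorted(points)
    centroids = [pts[i * (len(pts) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        # assign each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # move each centroid to the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

data = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]
centroids, clusters = kmeans_1d(data, k=2)
```

With this data the two centroids settle near the two obvious groups, even though no group labels were ever supplied.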
Reinforcement Learning
Reinforcement learning is quite different from the other types of machine learning (supervised and unsupervised).
The relationship between the data and the machine is quite different from the other machine learning types as well.
In reinforcement learning, the machine learns from its mistakes.
You give the machine a specific environment in which it can perform a given set of actions, and it learns by trial and error.
Although reinforcement learning is quite challenging to implement, it finds applications in many industries.
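The trial-and-error idea can be sketched with tabular Q-learning in a made-up corridor environment: states 0 to 4, with a reward only at the right end. The environment, constants, and epsilon-greedy policy are all illustrative assumptions, not anything specified in the slides.

```python
import random

# Minimal reinforcement-learning sketch: Q-learning in a tiny corridor.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]  # move left, move right

def step(state, action):
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma = 0.5, 0.9
random.seed(0)
for episode in range(200):
    state, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit, sometimes explore (trial and error)
        if random.random() < 0.2:
            a = random.randrange(2)
        else:
            a = 0 if Q[state][0] > Q[state][1] else 1
        nxt, reward, done = step(state, ACTIONS[a])
        # update the estimate from the mistake or success just observed
        Q[state][a] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][a])
        state = nxt

policy = [0 if q[0] > q[1] else 1 for q in Q]
```

After enough episodes the learned policy is "move right" in every non-terminal state, discovered purely from rewards rather than labeled examples.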
Semi-supervised Learning
A semi-supervised learning problem starts with a series of labeled data points as well as some data points for which labels are not known.
The goal of a semi-supervised model is to classify some of the unlabeled data using the labeled information set.
In other words, it aims to make effective use of all of the available data, not just the labeled data as in supervised learning.
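One common semi-supervised approach (not named in the slides) is self-training: repeatedly pseudo-label the unlabeled point closest to the labeled set and fold it into the labeled data. Here is a minimal one-dimensional sketch with illustrative data.

```python
# Minimal semi-supervised sketch: self-training with nearest-neighbour
# pseudo-labels. The data and labels are made up for illustration.
def self_train(labeled, unlabeled):
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # pick the unlabeled point closest to any labeled point
        best = min(remaining,
                   key=lambda u: min(abs(u - x) for x, _ in labeled))
        # pseudo-label it with its nearest labeled neighbour's label
        _, lab = min(labeled, key=lambda pair: abs(best - pair[0]))
        labeled.append((best, lab))
        remaining.remove(best)
    return labeled

labeled = [(0.0, "A"), (10.0, "B")]     # the few labeled points
unlabeled = [1.0, 2.0, 9.0, 8.0]        # points without known labels
result = dict(self_train(labeled, unlabeled))
```

The two labels propagate outward through the unlabeled points, so all of the data ends up usable, not just the initial labeled pair.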
Components of Generalization Error
Bias
Variance
Underfitting
Overfitting
Bias Error
Bias is defined as the average difference between a model's predictions and the true values.
Bias measures the deviation between the expected output of our model and the real values, so it indicates the fit of our model.
High bias results in under-fitting the data.
A high bias means our learning algorithm is missing important trends among the features.
High-bias algorithms are easier to learn but less flexible; because of this, they have lower predictive performance on complex problems.
Data is almost always noisy in reality, so some error is inevitable; this is called the irreducible error.
Variance
Variance measures the amount that the outputs of our model would change if a different training dataset were used.
A model is said to have high variance if its predictions are sensitive to small changes in the training data.
Generally, non-parametric machine learning algorithms that have a lot of flexibility have high variance.
Bias-Variance Trade-off
Example: Bias-Variance Trade-off
(Figure: the same model fit on training dataset 1 and training dataset 2, one fit illustrating high bias and the other high variance.)
How to achieve the Trade-off
Dimensionality reduction
Regularisation in linear models
Using mixture models and ensemble learning
Choosing the optimal value of K in KNN
Total Error
Total Error = Bias² + Variance + Irreducible Error
error(x) = bias(x)² + variance(x) + noise(x)
Bias(x) = E[f̂(x)] − f(x)
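The decomposition can be checked numerically for a single input x. The prediction values below are made up, standing in for the outputs of the same model trained on several different datasets; the target is taken as noiseless, so the irreducible term is zero.

```python
# Numeric sketch of the bias-variance decomposition at one input x.
preds = [2.8, 3.2, 3.0, 3.4, 2.6]   # model outputs for x across datasets
true_value = 4.0                     # noiseless target f(x)

mean_pred = sum(preds) / len(preds)
bias = mean_pred - true_value
variance = sum((p - mean_pred) ** 2 for p in preds) / len(preds)
mse = sum((p - true_value) ** 2 for p in preds) / len(preds)

# bias**2 + variance reproduces the mean squared error exactly
# (the noise term is zero here because the target is noiseless)
```

Here the mean prediction is 3.0, so bias = −1.0 and variance = 0.08, and bias² + variance = 1.08 matches the mean squared error.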
Underfitting and Overfitting
Overfitting: good performance on the training data, poor results on other data.
Underfitting: poor performance on the training data and poor results on other data as well.
Underfitting implies that the model still has capacity to learn, so you would simply train for more iterations or collect more data.
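The contrast can be sketched with two deliberately bad models: a "memorizer" that overfits by storing the training points exactly, and a constant predictor that underfits by ignoring the input entirely. All the data here is made up for illustration.

```python
# Illustrative sketch: an overfit memorizer vs an underfit constant model,
# evaluated on training data and on held-out data.
train = [(1, 2.0), (2, 4.1), (3, 5.9), (4, 8.2)]   # roughly y = 2x
test = [(5, 10.1), (6, 11.8)]

def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Overfitting: memorize training points exactly, guess 0 elsewhere
memory = dict(train)
def overfit(x):
    return memory.get(x, 0.0)

# Underfitting: always predict the training mean, ignoring x entirely
mean_y = sum(y for _, y in train) / len(train)
def underfit(x):
    return mean_y
```

The memorizer scores a perfect zero error on the training set but fails badly on the held-out points, while the constant model is poor on both, mirroring the two definitions above.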
A learning system cycle
1. Ideation
The following prerequisites are essential for a successful
ideation:
1. Clear requirements regarding business objectives and
scope
2. Availability of historical data
3. Understanding of end-to-end IT infrastructure
requirements
2. Development
Once key metrics that correspond to the business
objectives are agreed upon and historical data is acquired,
the data scientist can start developing the initial model.
Data scientists have a wide array of tools available to solve
their puzzles:
1. Transforming data to a more useful format
2. Analysis of data to guide modeling approach
3. Writing of the actual machine learning model code
4. Creating numbers and visuals for initial reports towards
stakeholders
3. Production
When the development phase is over, the developed model
needs to be put in production to start generating value.
The complexity of getting a model in production depends
on the context of the problem, the autonomy of data
science teams and the overall maturity of the organization.
The context of the problem consists of a number of factors:
1. Data flow at prediction time
2. Sensitivity of the data
3. Maximum acceptable latency of delivery
4. Maintenance
Once a model is deployed, there are a number of
measures that can be taken to improve robustness and
quality of the machine learning model.
These measures can be roughly divided into four
areas. We call this post-production process
maintenance.
1. Lineage
2. Monitoring
3. Comparison
4. Model Drift
Evaluation Metrics : Accuracy
It is the ratio of the number of correct predictions to the total number of input samples.
It works well only if there are equal numbers of samples belonging to each class; otherwise, classification accuracy can give a false sense of achieving high performance.
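A short sketch of why accuracy misleads on imbalanced classes: with made-up data of 95 negatives and 5 positives, a classifier that always predicts the majority class still scores 95% accuracy while missing every positive.

```python
# Accuracy = correct predictions / total samples.
def accuracy(y_true, y_pred):
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

y_true = [0] * 95 + [1] * 5          # imbalanced: 95 negatives, 5 positives
y_pred = [0] * 100                   # always predict the majority class

acc = accuracy(y_true, y_pred)       # high accuracy despite a useless model
```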
Confusion Matrix
A confusion matrix, as the name suggests, gives us a matrix as output and describes the complete performance of the model.
Let's assume we have a binary classification problem, with samples belonging to two classes: YES or NO. We also have our own classifier, which predicts a class for a given input sample. Testing the model on 165 samples, we get the following result.
There are 4 important terms :
True Positives : The cases in which we predicted YES
and the actual output was also YES.
True Negatives : The cases in which we predicted NO
and the actual output was NO.
False Positives : The cases in which we predicted YES
and the actual output was NO.
False Negatives : The cases in which we predicted
NO and the actual output was YES.
Accuracy for the matrix can be calculated by taking the sum of the values on the "main diagonal" (true positives and true negatives) and dividing by the total number of samples.
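The four counts and the diagonal-based accuracy can be computed directly from predictions. The YES/NO samples below are illustrative, not the 165-sample result from the slides.

```python
# Derive TP/TN/FP/FN counts and accuracy from true and predicted labels.
def confusion(y_true, y_pred, positive="YES"):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn

y_true = ["YES", "YES", "NO", "NO", "NO", "YES"]
y_pred = ["YES", "NO", "NO", "YES", "NO", "YES"]
tp, tn, fp, fn = confusion(y_true, y_pred)
acc = (tp + tn) / (tp + tn + fp + fn)   # main diagonal over the total
```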
F1 Score
The F1 Score is used to measure a test's accuracy.
It is the harmonic mean of precision and recall, and its range is [0, 1].
It tells you how precise your classifier is (how many of its positive predictions are correct), as well as how robust it is (whether it misses a significant number of positive instances).
F1 Score : Precision and Recall
Precision: the number of correct positive results divided by the number of positive results predicted by the classifier.
Precision = True Positives / (True Positives + False Positives)
Recall: the number of correct positive results divided by the number of all relevant samples.
Recall = True Positives / (True Positives + False Negatives)
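The two formulas, and their harmonic mean, can be sketched as a small function; the counts passed in are illustrative.

```python
# Precision, recall, and F1 (the harmonic mean of the first two).
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp)               # TP / (TP + FP)
    recall = tp / (tp + fn)                  # TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
```

With these counts precision is 0.8 and recall is 0.5; the F1 of about 0.615 sits between them but closer to the weaker score, which is exactly why the harmonic mean is used.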
Mean Absolute Error
Mean Absolute Error is the average of the absolute differences between the original values and the predicted values.
It gives us a measure of how far the predictions were from the actual output.
However, it doesn't give us any idea of the direction of the error, i.e. whether we are under-predicting or over-predicting the data.
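A one-function sketch of MAE, with made-up values; note how the absolute value discards the sign, so the direction of each error is lost.

```python
# Mean Absolute Error: average of |true - predicted|.
def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

err = mae([3.0, 5.0, 2.0], [2.5, 5.5, 4.0])   # errors -0.5, +0.5, +2.0
```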
Mean Squared Error
Mean Squared Error (MSE) is quite similar to Mean Absolute Error; the only difference is that MSE takes the average of the squares of the differences between the original values and the predicted values.
An advantage of MSE is that it is easier to compute the gradient.
As we take the square of the error, the effect of larger errors becomes more pronounced than that of smaller ones, so the model can focus more on the larger errors.
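The same illustrative values as in the MAE sketch show how squaring makes the single large error dominate the total.

```python
# Mean Squared Error: average of (true - predicted) ** 2.
def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# The 2.0 error contributes 4.0 after squaring, dwarfing the 0.25 terms
err = mse([3.0, 5.0, 2.0], [2.5, 5.5, 4.0])
```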
Maximum likelihood Estimation(MLE)
Maximum likelihood estimation is a method that
determines values for the parameters of a model.
The parameter values are found such that they
maximize the likelihood that the process described by
the model produced the data that were actually
observed.
What are parameters?
For a linear model we can write this as y = mx + c. In this
example x could represent the advertising spend and y might be
the revenue generated. m and c are parameters for this model.
Different values for these parameters will give different lines.
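The idea that parameters pick out one model from a family can be sketched directly: the same form y = mx + c with different (m, c) pairs gives different lines, and hence different predictions for the same x. The numbers are illustrative.

```python
# Each (m, c) pair selects one line from the family y = m*x + c.
def line(m, c):
    def f(x):
        return m * x + c
    return f

f1 = line(2.0, 1.0)   # one choice of parameters
f2 = line(0.5, 3.0)   # another choice: same x, different prediction
```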
MLE : Example
Let’s suppose we have observed 10 data points from some process. For
example, each data point could represent the length of time in seconds
that it takes a student to answer a specific exam question. These 10 data
points are shown in the figure below
For these data we’ll assume that the data generation process can
be adequately described by a Gaussian (normal) distribution.

A Gaussian distribution has two parameters: the mean, μ, and the standard deviation, σ. Different values of these parameters result in different curves.
MLE
We want to know which curve was most likely responsible for creating the data points that we observed.
Maximum likelihood estimation is a method that will
find the values of μ and σ that result in the curve that
best fits the data.
The true distribution from which the data were
generated was f1 ~ N(10, 2.25), which is the blue curve
in the figure above.
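For a Gaussian, the maximum-likelihood estimates have a closed form: the sample mean for μ and the (population) standard deviation for σ. The answer times below are made-up stand-ins for the ten observations in the example; the sketch also includes a log-likelihood function so the "no other curve fits better" claim can be checked numerically.

```python
import math

# MLE for a Gaussian: sample mean and population standard deviation.
def gaussian_mle(xs):
    mu = sum(xs) / len(xs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))
    return mu, sigma

# Log-likelihood of the data under N(mu, sigma^2)
def log_likelihood(xs, mu, sigma):
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2) for x in xs)

times = [8.0, 9.5, 10.0, 10.5, 12.0]   # illustrative answer times (seconds)
mu, sigma = gaussian_mle(times)
# Any other (mu, sigma) pair gives a lower log-likelihood than the MLE pair
```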
Posterior probability
A posterior probability, in Bayesian statistics, is the
revised or updated probability of an event occurring
after taking into consideration new information.
The posterior probability is calculated by updating
the prior probability using Bayes' theorem.
In statistical terms, the posterior probability is the
probability of event A occurring given that event B has
occurred.
Bayes' Theorem Formula
The formula to calculate the posterior probability of A occurring given that B occurred:
P(A|B) = P(B|A) × P(A) / P(B)
Bayes' theorem can be used in many applications, such as medicine,
finance, and economics.
In finance, Bayes' theorem can be used to update a previous belief
once new information is obtained.
Prior probability represents what is originally believed before new
evidence is introduced, and posterior probability takes this new
information into account.
Posterior probability distributions should be a better reflection of the underlying truth of a data-generating process than the prior probability, since the posterior includes more information.
 A posterior probability can subsequently become a prior for a new
updated posterior probability as new information arises and is
incorporated into the analysis.
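The update-and-reuse cycle can be sketched with illustrative numbers: a diagnostic test with a 99% true-positive rate and a 5% false-positive rate for a condition with 1% prevalence. The first posterior then serves as the prior for a second positive test, exactly as described above.

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B),
# with P(B) expanded over the two hypotheses.
def posterior(prior, likelihood, false_positive_rate):
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

p1 = posterior(prior=0.01, likelihood=0.99, false_positive_rate=0.05)
# The first posterior becomes the new prior once a second positive arrives
p2 = posterior(prior=p1, likelihood=0.99, false_positive_rate=0.05)
```

A single positive test only raises the probability to about 1/6 because the condition is rare; the second positive test, building on the updated prior, pushes it near 0.8.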
