Supervised
Machine Learning
2 of 3 modules
Supervised Learning
Data includes both the input and the desired results.
Training and Test
Sets
Resampling
Imbalanced
Datasets
Resampling + Synthesis of artificial data
SMOTE – Synthetic Minority Oversampling Technique
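As a rough illustration of the SMOTE idea (not the full algorithm), new minority-class samples can be synthesized by interpolating between a minority point and its nearest minority neighbour; `smote_sketch` is a hypothetical helper name used here for illustration:

```python
import numpy as np

def smote_sketch(X_minority, n_new, rng=None):
    """Simplified SMOTE-style oversampling: interpolate between a
    minority sample and its nearest minority neighbour."""
    rng = np.random.default_rng(rng)
    X_minority = np.asarray(X_minority, dtype=float)
    new_points = []
    for _ in range(n_new):
        i = rng.integers(len(X_minority))
        x = X_minority[i]
        # nearest neighbour among the other minority points
        others = np.delete(X_minority, i, axis=0)
        j = np.argmin(np.linalg.norm(others - x, axis=1))
        neighbour = others[j]
        # place a synthetic point a random fraction of the way to the neighbour
        new_points.append(x + rng.random() * (neighbour - x))
    return np.array(new_points)
```

In practice one would use a library implementation (e.g. imbalanced-learn's `SMOTE`), which also considers k nearest neighbours rather than just one.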
Ensemble (combined)
Models
Linear Regression
Getting our line straight!
Introduction to Regression Analysis
Regression analysis is used to:
Predict the value of a dependent variable based on the value of at least one
independent variable
Explain the impact of changes in an independent variable on the dependent
variable
• Dependent variable:
The variable we wish to predict or explain
• Independent variable:
The variable used to explain the dependent variable
Simple Linear Regression Model
• Only one independent variable, X.
• The relationship between X and Y is described by a linear function.
• Changes in Y are assumed to be caused by changes in X.
Multiple Linear Regression Model
• More than one independent variable, X.
• The relationship between X and Y is described by a linear function.
• Changes in Y are assumed to be caused by changes in X.
Types of Relationships
Linear relationships vs. curvilinear relationships
[Scatter plots of Y against X illustrating linear and curvilinear patterns]
Types of Relationships
Strong relationships vs. weak relationships
[Scatter plots of Y against X illustrating strong and weak patterns]
Types of Relationships
No relationship
[Scatter plot of Y against X with no visible pattern]
Simple Linear Regression Model
Yi = b + M·Xi + εi
• Yi – dependent variable
• b – population Y-intercept
• M – population slope coefficient
• Xi – independent variable
• εi – random error term
b + M·Xi is the linear component; εi is the random error component.
Simple Linear Regression Model – Errors
[Plot of the line Yi = b + M·Xi + εi, with intercept b and slope M: for a given Xi, the random error εi is the gap between the observed value of Y and the predicted value of Y on the line]
Interpretation of the Slope and the Intercept
• b is the estimated average value of Y when the value of X is zero
• M is the estimated change in the average value of Y as a result of a
one-unit change in X
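A minimal sketch of fitting such a line with NumPy least squares; the data points here are made up purely for illustration:

```python
import numpy as np

# Fit Y = b + M*X by ordinary least squares on a small made-up dataset.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

M, b = np.polyfit(X, Y, deg=1)   # returns slope first, then intercept
print(f"intercept b = {b:.2f}, slope M = {M:.2f}")
# -> intercept b = 1.09, slope M = 1.97
# b: estimated average Y when X is zero
# M: estimated change in average Y per one-unit change in X
```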
How do we determine if our
Regression model is doing well or not?
Performance Metrics (Regression)
• Mean Absolute Error – the average of the absolute
differences between predictions and actual values.
• Mean Squared Error – the average of the squares of the
errors, that is, the average squared difference between the
estimated values and the actual values.
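Both metrics are one-liners in NumPy; the prediction and actual values below are hypothetical numbers for illustration:

```python
import numpy as np

# Hypothetical predictions vs. actual values.
actual = np.array([3.0, 5.0, 7.0, 9.0])
predicted = np.array([2.5, 5.0, 8.0, 8.5])

mae = np.mean(np.abs(predicted - actual))  # Mean Absolute Error
mse = np.mean((predicted - actual) ** 2)   # Mean Squared Error
print(mae, mse)  # 0.5 0.375
```

Note that MSE penalizes large errors more heavily than MAE because the errors are squared.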
Let’s dive straight to the Hands-on
using Jupyter notebooks
Feature
Engineering
Improving the model!
Feature engineering
• The first thing we need to do when creating a machine
learning model is to decide what to use as features.
• Features are key to a model: the pieces of information that we take
from the data and give to the algorithm so it can work its magic,
such as a person’s name or favorite color.
• E.g., if we do classification on health, some features could
be a person’s height, weight, gender, and so on.
• We would exclude things that may be known but aren’t useful.
Benefits of Feature Engineering
• Reduces Overfitting : Less redundant data means less
opportunity to make decisions based on noise.
• Improves Accuracy : Less misleading data means modeling
accuracy improves.
• Reduces Training Time : Fewer data points reduce
algorithm complexity and algorithms train faster.
Techniques of Feature
Engineering
• Introducing interaction terms
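An interaction term is simply a new feature built from the product of two existing features; the health-style numbers below are hypothetical:

```python
import numpy as np

# Two hypothetical features per row: height (m) and weight (kg).
X = np.array([[1.70, 65.0],
              [1.80, 80.0]])

# Interaction term: height * weight, added as a third column.
interaction = (X[:, 0] * X[:, 1]).reshape(-1, 1)
X_new = np.hstack([X, interaction])   # original features + interaction
print(X_new.shape)  # (2, 3)
```

Libraries such as scikit-learn can generate these automatically (e.g. `PolynomialFeatures` with `interaction_only=True`).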
Let’s dive straight to the Hands-on
using Jupyter notebooks
Logistic Regression
What is it and what is the algorithm?
What is the difference
between Linear Regression
& Logistic Regression?
Recap: What is linear
regression?
• Linear regression quantifies the relationship
between one or more predictor variables and
one outcome variable.
• For example, linear regression can be used to
quantify the relative impacts of age, gender, and
diet (the predictor variables) on height (the
outcome variable).
Recap: Example
Sales = 168 + 23 × Advertising
Example – Logistic Regression – Scoring Goals!
• Suppose we are kicking a soccer ball from a variety of distances.
• The result is going to be only Goal or No Goal.
• Our standard linear regression will not work in this scenario!
Nominal and Ordinal scales (good to know!)
Nominal
• Nominal scales are used for labeling variables, without
any quantitative value. “Nominal” scales could simply be called
“labels.”
• E.g., Male/Female, Red/Green/Yellow
Ordinal
• With ordinal scales, the order of the values is what’s important
and significant, but the differences between each one are not really
known.
• E.g., Good, Very good, Excellent, Fantastic – #1, #2, #3, #4
What is logistic regression?
• Logistic regression is the appropriate regression
analysis to conduct when the dependent
variable is binary.
• Like all regression analyses, the logistic
regression is a predictive analysis.
• Logistic regression is used to describe data and
to explain the relationship between one
dependent binary variable and one or more
nominal, ordinal, interval or ratio-level
independent variables.
The Sigmoid function
• We apply sigmoid function on the linear regression equation.
• By doing so, we push our straight line into an S shape, or Sigmoid
Curve.
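The sigmoid function squashes any real-valued output z of the linear part into the interval (0, 1), so it can be read as a probability:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # 0.5 (the decision midpoint)
print(sigmoid(-10.0))  # close to 0
print(sigmoid(10.0))   # close to 1
```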
Model Evaluation
Model evaluation is an integral part
of the model development process.
It helps to find the best model that
represents our data and how well the
chosen model will work in the future.
Performance Metrics (Classification)
Confusion Matrix Accuracy Precision and
Recall
How do you evaluate classifiers?
Accuracy!
Confusion Matrix
It is a performance measurement for machine learning classification
problems where the output can be two or more classes.
It is a table with 4 different combinations of predicted and actual
values (in the binary case).
So how can we use the metrics?
Say we have 2 confusion matrices from 2 models:

Logistic Regression                  SVM
                 Actual Class                         Actual Class
                  +     -                              +     -
Predicted  +      8     1            Predicted  +     10    10
Class      -      2    89            Class      -      0    80

We can compare them!

                                     Logistic Regression    SVM
Accuracy  (TP+TN)/(TP+TN+FP+FN)             97%             90%
Precision TP/(TP+FP)                        89%             50%
Recall    TP/(TP+FN)                        80%            100%
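The three formulas can be checked directly against the counts on this slide (Logistic Regression: TP=8, FP=1, FN=2, TN=89; SVM: TP=10, FP=10, FN=0, TN=80):

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall

print(metrics(tp=8, fp=1, fn=2, tn=89))    # Logistic Regression: ~97%, ~89%, 80%
print(metrics(tp=10, fp=10, fn=0, tn=80))  # SVM: 90%, 50%, 100%
```

This makes the trade-off concrete: the SVM catches every positive (recall 100%) but half of its positive predictions are wrong (precision 50%).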
Precision and Recall
Precision attempts to answer the following question:
What proportion of positive identifications was correct?
Recall attempts to answer the following question:
What proportion of actual positives was identified correctly?
Decision Trees
Decision tree learning
is one of the most
widely used techniques
for classification.
Introduction
The classification
model is a tree, called
a decision tree.
A decision tree can be converted to a set of rules
How do we do our tree split?
• Build the tree split by split.
• Find the best split you can at each step.
• This best-split strategy is also known as Greedy Search.
• We can put a number to our splitting step with the Gini Index.
Gini Index
Gini = 1 − Σi pi²
• Where pi is the probability of an object being classified to a
particular class.
• While building the decision tree, we would prefer choosing the
attribute/feature with the least Gini index as the root node.
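The Gini index of a node follows directly from the class counts in that node, as this small sketch shows:

```python
def gini(counts):
    """Gini index 1 - sum(p_i^2) from per-class counts in a node."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

print(gini([5, 5]))   # 0.5 - maximally mixed two-class node
print(gini([10, 0]))  # 0.0 - pure node, nothing left to split
```

A pure node scores 0, so a split that produces purer children lowers the Gini index, which is exactly what the greedy search prefers.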
Each inner node is a decision based on a feature
Each leaf node is a class label
Predicting Titanic Survivors
Is sex male?
├─ No → Survived (0.73, 36%)
└─ Yes → Is age > 9.5?
    ├─ Yes → Died (0.17, 61%)
    └─ No → Is sibsp > 2.5?
        ├─ Yes → Died (0.05, 2%)
        └─ No → Survived (0.89, 2%)
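The Titanic tree is small enough to hand-code, which makes the structure explicit: inner nodes test a feature, leaves return a class label (the thresholds come from the slide):

```python
def predict_titanic(sex, age, sibsp):
    """Hand-coded version of the Titanic decision tree above."""
    if sex != "male":
        return "survived"          # leaf: females mostly survived
    if age > 9.5:
        return "died"              # leaf: adult males mostly died
    if sibsp > 2.5:
        return "died"              # leaf: young boys with many siblings
    return "survived"              # leaf: young boys with few siblings

print(predict_titanic("female", 30, 0))  # survived
print(predict_titanic("male", 30, 0))    # died
print(predict_titanic("male", 5, 0))     # survived
```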
Strengths of decision tree methods
• Generate understandable rules.
• Perform classification without requiring much computation.
• Able to handle both continuous and categorical variables.
• Provide a clear indication of which fields are most important
for prediction or classification.
• Natural multiclass classifier.
Weaknesses of decision tree methods
• Less appropriate for estimation tasks where the goal is to predict
the value of a continuous attribute.
• Prone to errors in classification problems with many classes and a
relatively small number of training examples.
• Computationally expensive to train: at each node, each candidate
splitting field must be sorted before its best split can be found,
which makes growing a decision tree expensive.
• Small changes in input data can result in totally different trees.
• Can make mistakes with unbalanced classes.
Support Vector
Machines
What are SVMs?
• SVMs are linear or non-linear classifiers that
find a hyperplane to separate two classes of data,
positive and negative.
• SVM not only has a rigorous theoretical
foundation, but also performs classification
more accurately than most other methods in
applications, especially for high-dimensional
data.
Support Vector Machine (SVM)
[Plots: patient status after 5 yr. (0 or 1) against number of positive nodes, with candidate decision boundaries]
• Find the best boundary that separates the two classes.
• A bad boundary: 3 misclassifications, accuracy 67%.
• A better boundary: one misclassification, accuracy 89%.
• Another boundary: accuracy 78%.
• Several different boundaries all reach accuracy 100%, so which one is best?
• The margin: no man’s land between the two classes.
2 features: number of positive nodes, age
2 labels: Survived / Lost
[Scatter plots of age against number of positive nodes, with candidate separating lines]
Find the line that separates the classes best.
3 features: find the best boundary plane
(more features: a hyperplane)
What is a hyperplane?
• The hyperplane that separates positive and negative training
data is
〈w ⋅ x〉 + b = 0
• It is also called the decision boundary (surface).
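The decision rule that follows from 〈w ⋅ x〉 + b = 0 is simple: points with a positive value lie on one side of the hyperplane, points with a negative value on the other. The weights and bias below are hypothetical, chosen only to illustrate the rule:

```python
import numpy as np

w = np.array([1.0, -1.0])  # hypothetical learned weight vector
b = 0.5                    # hypothetical learned bias

def classify(x):
    """Sign of <w, x> + b decides the predicted class."""
    return 1 if np.dot(w, x) + b > 0 else -1

print(classify(np.array([2.0, 0.0])))  # 1  (2 - 0 + 0.5 > 0)
print(classify(np.array([0.0, 2.0])))  # -1 (0 - 2 + 0.5 < 0)
```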
How to choose the
best hyperplane?
• SVM looks for the
separating hyperplane with
the largest margin.
• Machine learning theory
says this hyperplane
minimizes the error bound.
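A minimal sketch with scikit-learn (assuming it is available in the hands-on environment, as in the Jupyter sessions); the toy data points are made up and clearly separable:

```python
import numpy as np
from sklearn.svm import SVC

# Toy, clearly separable two-class data (hypothetical numbers).
X = np.array([[0, 0], [0, 1], [1, 0],
              [3, 3], [3, 4], [4, 3]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear SVM finds the separating hyperplane with the largest margin.
clf = SVC(kernel="linear")
clf.fit(X, y)

print(clf.score(X, y))       # training accuracy on this separable data
print(clf.support_vectors_)  # the points that define the margin
```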
Pros
• Accuracy.
• Works well on smaller, cleaner datasets.
• Can be more efficient because it uses a subset of training points.
Cons
• Not suited to larger datasets, as the training time with SVMs can be high.
• Less effective on noisier datasets with overlapping classes.
What have you learned?
4/5/2023 78
Thank you!
I welcome your questions.