
Machine Learning
Subject code: AL3451
Regulations: 2021

Unit - 2 Supervised Learning

Linear Regression Models
• Linear regression is a type of supervised machine-learning algorithm that learns from a labelled dataset and fits an optimal linear function mapping the data points to outputs, which can then be used for prediction on new datasets.
• Supervised learning has two types of tasks:
• Classification: predicts a categorical (discrete) class of the dataset from the independent input variables, e.g., whether the image of an animal shows a cat or a dog.
• Regression: predicts a continuous output variable from the independent input variables, e.g., predicting house prices from parameters such as house age, distance from the main road, location, and area.
Linear Regression
• Linear regression is a supervised machine learning algorithm that computes the linear relationship between the dependent variable and one or more independent variables by fitting a linear equation to observed data.
• When there is only one independent variable, it is known as Simple Linear Regression; when there is more than one independent variable, it is known as Multiple Linear Regression.
• When there is only one dependent variable, it is considered Univariate Linear Regression; when there is more than one dependent variable, it is known as Multivariate Regression.
• Linear regression models are simple and require minimal memory to implement.
Simple Linear Regression using Least Square Method
• Simple linear regression is the simplest form of linear regression, and it involves only one independent variable and one dependent variable.
• The equation for simple linear regression is:
y = β₀ + β₁x
where:
y is the dependent variable
x is the independent variable
β₀ is the intercept
β₁ is the slope
Least Square Method
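The least square method is worked through as a slide example; as a minimal sketch, the coefficients can also be computed directly from the closed-form formulas b₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² and b₀ = ȳ − b₁x̄. The sample data below is made up for illustration.

```python
import numpy as np

def least_squares_fit(x, y):
    """Fit y = b0 + b1*x by minimizing the sum of squared residuals."""
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by variance of x
    b1 = ((x - x_mean) * (y - y_mean)).sum() / ((x - x_mean) ** 2).sum()
    # Intercept: forces the fitted line through (x_mean, y_mean)
    b0 = y_mean - b1 * x_mean
    return b0, b1

# Hypothetical sample data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b0, b1 = least_squares_fit(x, y)
print(f"y = {b0:.3f} + {b1:.3f} x")
```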
Multiple Linear Regression using Least Square Method
• Multiple linear regression involves more than one independent variable and one dependent variable.
• The equation for multiple linear regression is:
y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + ....... + βₙxₙ
where:
y is the dependent variable
x₁, x₂, x₃, .... are the independent variables
β₀ is the intercept
β₁, β₂, β₃, .... are the slopes
Linear Regression
Multiple Linear Regression
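For multiple linear regression, the least-squares coefficients can be obtained by solving the normal equations. A minimal sketch using NumPy's least-squares solver, with hypothetical data, is shown below.

```python
import numpy as np

# Hypothetical data: two features, one target
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = np.array([3.5, 3.0, 7.5, 7.0, 10.0])

# Prepend a column of ones so the first coefficient acts as the intercept
X_aug = np.column_stack([np.ones(len(X)), X])

# Solve the least-squares problem min ||X_aug @ beta - y||^2
beta, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
print("intercept:", beta[0], "slopes:", beta[1:])
```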
Bayesian Linear Regression
• Regression is a machine learning task for predicting continuous values (real numbers), whereas classification is used to predict categorical (discrete) values.
• Bayesian regression can be very useful when we have insufficient data in the dataset or the data is poorly distributed.
• The output of a Bayesian regression model is obtained from a probability distribution.
• The aim of Bayesian linear regression is to find the 'posterior' distribution of the model parameters.
Bayesian Linear Regression
• The expression for the posterior is:
Posterior = (Likelihood × Prior) / Normalization
• Posterior is the probability of an event occurring given the observed data.
• Prior is the probability of the event before the data is observed.
• Likelihood represents the probability of observing the data given the parameters of the model.
Bayesian Linear Regression
Bayes' Theorem:
P(A∣B) = P(B∣A) · P(A) / P(B)
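A minimal sketch of the posterior computation for the weights, assuming a zero-mean Gaussian prior and Gaussian observation noise (the standard conjugate setup; the hyperparameters alpha and beta below are assumed values, not taken from the slides):

```python
import numpy as np

def bayesian_linear_fit(X, y, alpha=1.0, beta=25.0):
    """Posterior over weights for y = X @ w + noise.

    Assumes a prior w ~ N(0, alpha^-1 I) and Gaussian noise with
    precision beta; both choices are assumptions for this sketch.
    """
    d = X.shape[1]
    # Posterior covariance and mean (conjugate-prior update)
    S_inv = alpha * np.eye(d) + beta * X.T @ X
    S = np.linalg.inv(S_inv)
    m = beta * S @ X.T @ y
    return m, S

X = np.column_stack([np.ones(5), np.linspace(0, 2, 5)])  # bias + 1 feature
y = np.array([0.9, 1.6, 2.1, 2.9, 3.4])
m, S = bayesian_linear_fit(X, y)
print("posterior mean:", m)        # point estimate of the weights
print("posterior covariance:", S)  # uncertainty about the weights
```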
Gradient Descent
• Gradient Descent is an optimization algorithm
used to find the values of the coefficients of a
function (hypothesis) that minimize the cost
function.
• It is mostly used in supervised machine learning
models to optimize model parameters.
• Gradient means the slope of a curve, and
descent means movement to a lower point.
• The algorithm makes use of the gradient (slope)
to reach the minimum (lowest point) of a Mean
Squared Error (MSE) function.
Gradient Descent
[Scatter plot of the example data points (0.5, 1.4), (2.3, 1.9), and (2.9, 3.2).]
Gradient Descent
Take the derivatives of the sum of squared residuals with
respect to the intercept and the slope:

∂SSR/∂intercept = Σᵢ −2 · (yᵢ − (intercept + slope · xᵢ))
∂SSR/∂slope = Σᵢ −2 · xᵢ · (yᵢ − (intercept + slope · xᵢ))
Gradient Descent
[Plot of Height (y-axis) versus Weight (x-axis) for the example data points.]
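A minimal sketch of gradient descent on the three example points above, fitting height ≈ intercept + slope · weight by repeatedly stepping against the derivatives of the sum of squared residuals. The initial guesses and the learning rate are assumptions for this sketch.

```python
import numpy as np

# The three (weight, height) points from the slide example
x = np.array([0.5, 2.3, 2.9])
y = np.array([1.4, 1.9, 3.2])

intercept, slope = 0.0, 1.0   # initial guesses (assumed)
lr = 0.01                     # learning rate (assumed)

for step in range(1000):
    pred = intercept + slope * x
    # Gradients of the sum of squared residuals
    d_intercept = (-2 * (y - pred)).sum()
    d_slope = (-2 * x * (y - pred)).sum()
    # Step downhill along the gradient
    intercept -= lr * d_intercept
    slope -= lr * d_slope

print(f"height ≈ {intercept:.2f} + {slope:.2f} * weight")
```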
Gradient Descent
• Types of Gradient Descent:
1. Batch Gradient Descent
2. Stochastic Gradient Descent
3. Minibatch Gradient Descent
• Batch Gradient Descent involves calculations
over the full training set.
• Stochastic Gradient Descent runs one training
example per iteration.
• Minibatch Gradient Descent divides the training
datasets into small batch sizes and performs the
updates on those batches separately.
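A compact sketch of how the three variants differ only in how much data each update sees (the data here is synthetic and batch_size is an assumed hyperparameter):

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 2)), rng.normal(size=100)
w, lr, batch_size = np.zeros(2), 0.01, 16

def grad(Xb, yb, w):
    # Gradient of the MSE for a linear model on the given (mini)batch
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

for epoch in range(10):
    idx = rng.permutation(len(y))
    # Batch GD would call grad(X, y, w) once per epoch;
    # stochastic GD is the special case batch_size = 1.
    for start in range(0, len(y), batch_size):
        b = idx[start:start + batch_size]
        w -= lr * grad(X[b], y[b], w)
print(w)
```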
Reminder .....
• Supervised learning has 2 types of tasks based on the nature of the target feature (dependent variable):
1. Regression
2. Classification
• When the target feature (dependent variable) is CONTINUOUS, it is a REGRESSION task.
• When the target feature (dependent variable) is DISCRETE, it is a CLASSIFICATION task.
• So far we have covered linear regression models: the least square method and Bayesian linear regression.
In this lecture....
• The different linear classification models to handle classification tasks are:
1. Probabilistic Discriminative Model
2. Probabilistic Generative Model
3. Maximum Margin Classifier
• Probabilistic Discriminative Model:
• Focuses on the boundary between the classes rather than on how the data of each class is distributed.
• Models the posterior probabilities P(Y∣X) using a parameterized function and typically does not model the distribution of the features.
• Example: Logistic Regression (models the posterior P(Y∣X) directly).
In this lecture....
• Probabilistic Generative Model:
• Unlike discriminative models, probabilistic generative models estimate how the data are generated for each class.
• Attempts to model the joint probability distribution P(X,Y), or separately P(X∣Y) and P(Y).
• Example: Naive Bayes (assumes independence among features given the class label and models P(X∣Y) for each class).
• Maximum Margin Classifier:
• Focuses on finding the decision boundary that maximizes the margin (distance) between the closest points of the different classes.
• Example: Support Vector Machine.
Linear Discriminant Function
• The primary use of the linear discriminant function is to classify data points into two or more classes.
• It achieves this by creating a decision boundary that separates different classes in a dataset.
• For example, in a binary classification problem, the function can predict whether an email is spam or not based on features derived from the email's content.
• The equation for a linear discriminant function in a 2D space is essentially a line equation that separates two different classes of data.
Linear Discriminant Function
• This line can be described by the general linear equation:
w₁x₁ + w₂x₂ + b = 0
Where:
• x₁ and x₂ are the input features (the coordinates in the 2D space),
• w₁ and w₂ are the weights assigned to these features, and
• b is the bias term.
• In the context of a linear discriminant function, w₁ and w₂ represent the coefficients that define the orientation of the line in the 2D space, and b shifts the line away from the origin.
Linear Discriminant Function
• This line forms the decision boundary that discriminates between the two classes:
w₁x₁ + w₂x₂ + b = 0
• Points on one side of the line are classified into one category, and points on the other side are classified into another.
• For easier interpretation, the line equation can also be converted into the slope-intercept form, which is more familiar in basic algebra:
y = mx + c
where y corresponds to x₂, x corresponds to x₁, m is the slope, and c is the y-intercept.
Linear Discriminant Function
• By rearranging the line equation for the 2D input feature space:
w₁x₁ + w₂x₂ + b = 0
x₂ = −(w₁/w₂)x₁ − (b/w₂)
• Here, the y-intercept is c = −b/w₂ and the slope is m = −w₁/w₂.
• For a d-dimensional space, the hyperplane equation will be:
w₁x₁ + w₂x₂ + ..... + w_d x_d + b = 0
Linear Discriminant Function
• In vector form, let w = [b w₁ w₂ ...... w_d]ᵀ and x = [1 x₁ x₂ ...... x_d]ᵀ.
• Therefore the linear discriminant function is wᵀx = 0.

Example: x₁ + x₂ − 1 = 0

wᵀx = [−1 1 1] [1 x₁ x₂]ᵀ = 0
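A minimal sketch of classifying points with the example discriminant x₁ + x₂ − 1 = 0, folding the bias into the weight vector w = [−1 1 1] as above (the sign convention for points exactly on the boundary is an assumption):

```python
import numpy as np

w = np.array([-1.0, 1.0, 1.0])   # [b, w1, w2] for x1 + x2 - 1 = 0

def classify(x1, x2):
    # Augment the point with a leading 1 so the bias folds into w
    g = w @ np.array([1.0, x1, x2])
    return 1 if g >= 0 else -1    # sign of g decides the class

print(classify(1.0, 1.0))   # above the line -> class 1
print(classify(0.0, 0.0))   # below the line -> class -1
```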
Perceptron Algorithm
• The perceptron is an example of a linear discriminant function.
• The core purpose of the perceptron is to classify input data into one of two categories (binary classification).
• The perceptron uses a generalized linear model with an activation function.
• This activation function is a step function from −1 to 1, of the form:
f(a) = +1 if a ≥ 0, and −1 otherwise
Example for Perceptron Algorithm
• Find the weights required to perform the following classification using a perceptron network.
• The vectors (1,1,1,1) and (−1,1,−1,−1) belong to class 1; the vectors (1,1,1,−1) and (1,−1,−1,1) belong to class −1.
• Assume the learning rate is 1 and the initial weights are 0.
Example for Perceptron Algorithm
[Worked example continues over EPOCH-1 through EPOCH-3.]
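A minimal sketch of the perceptron update rule on the example data above (learning rate 1, initial weights 0, as stated). Threshold and update conventions vary between textbooks, so the epoch-by-epoch trace may differ slightly from the slides' worked example; handling the bias as a separate term is an assumption here.

```python
import numpy as np

# Training data from the slide example (bipolar inputs and targets)
X = np.array([[ 1,  1,  1,  1],
              [-1,  1, -1, -1],
              [ 1,  1,  1, -1],
              [ 1, -1, -1,  1]])
t = np.array([1, 1, -1, -1])   # class labels

w = np.zeros(4)   # initial weights 0, as stated
b = 0.0           # bias (assumed to also start at 0)
lr = 1.0          # learning rate 1, as stated

def activation(a):
    # Bipolar step function: output is -1 or +1
    return 1 if a >= 0 else -1

for epoch in range(10):
    errors = 0
    for xi, ti in zip(X, t):
        if activation(w @ xi + b) != ti:
            # Perceptron update: move the weights toward the target class
            w += lr * ti * xi
            b += lr * ti
            errors += 1
    if errors == 0:
        print(f"converged after epoch {epoch + 1}: w = {w}, b = {b}")
        break
```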
Probabilistic Discriminative Model
• Discriminative models learn about the boundary between classes within a dataset.
• Discriminative models excel at classification tasks by effectively distinguishing between different classes.
• Discriminative models set out to answer the following question:
"What side of the decision boundary is this data found in?"
Logistic Regression
• Linear regression predicts a numerical response but is not suitable for predicting categorical values.
• When categorical variables are involved, it is called a classification problem, and logistic regression is suitable for binary classification problems.
Logistic Regression
• For example, what if the organization wants to know whether an employee would get a promotion or not based on their performance?
• A linear graph won't be suitable for this; the sigmoid curve is suitable for this.
• Based on a threshold value, the sigmoid output is mapped to one of the two classes.
Logistic Regression
• Odds of success: Odds(θ) = P(success) / P(failure) = p / (1 − p)
• Consider the equation of the straight line:
β₀ + β₁x
• Now, to predict the odds of success, we take the log of the odds formula:
log(p / (1 − p)) = β₀ + β₁x
• Exponentiating on both sides, we have:
p / (1 − p) = e^(β₀ + β₁x)
• Let Y = e^(β₀ + β₁x). Then:
p = Y(1 − p) = Y − Yp
p + Yp = Y
p(1 + Y) = Y
p = Y / (1 + Y) = e^(β₀ + β₁x) / (1 + e^(β₀ + β₁x)) = 1 / (1 + e^−(β₀ + β₁x))
• This is the sigmoid (logistic) function.
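A minimal sketch of turning the linear score β₀ + β₁x into a probability with the sigmoid derived above; the coefficient values and the 0.5 threshold below are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    # p = 1 / (1 + e^-z), the inverse of the logit (log-odds) function
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients for z = b0 + b1 * x
b0, b1 = -4.0, 1.5

x = 3.0                               # e.g., a performance score
p = sigmoid(b0 + b1 * x)              # predicted probability of promotion
print(p, "-> class", int(p >= 0.5))   # threshold at 0.5 (a common default)
```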
Types of Logistic Regression
• There are three types of logistic regression:
1. Binary logistic regression
2. Multinomial logistic regression
3. Ordinal logistic regression
• Binary logistic regression:
It's an either/or solution: there are just two possible outcomes, typically represented as a 0 or a 1 in coding. Examples include classifying an object as an animal or not an animal.
Types of Logistic Regression
• Multinomial logistic regression:
A model where there are multiple classes (a set of three or more) that an item can be classified as. Examples include classifying texts by the language they are written in.
• Ordinal logistic regression:
Also a model where there are multiple classes that an item can be classified as; however, in this case an ordering of the classes is required. Examples include ranking restaurants on a scale of 0 to 5.
Probabilistic Generative Model
• A generative model learns a probability distribution for the dataset; it can reference this probability distribution to generate new data instances.
• Generative models often rely on Bayes' theorem to find the joint probability p(x,y).
• Generative models model how the data was generated, and answer the following question:
"What's the likelihood that this class or another class generated this data instance?"
Naive Bayes Classifier
• The following equation is used to calculate the posterior probability of each hypothesis, and the hypothesis that gives the maximum value is the solution:
h_MAP = argmax_{h∈H} P(D∣h) · P(h)
Naive Bayes Classifier
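The worked naive Bayes example is shown as figures; a minimal counting-based sketch of the rule ŷ = argmax_y P(y) · ∏ᵢ P(xᵢ∣y) on a made-up categorical dataset (no smoothing is applied, for brevity):

```python
from collections import defaultdict

# Tiny hypothetical dataset: each row is (feature1, feature2), plus a label
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "hot")]
y = ["no", "yes", "yes", "no"]

# Estimate P(y) and P(x_i | y) by counting
prior = defaultdict(float)
cond = defaultdict(float)
for label in y:
    prior[label] += 1 / len(y)
for (f1, f2), label in zip(X, y):
    n_label = y.count(label)
    cond[(0, f1, label)] += 1 / n_label
    cond[(1, f2, label)] += 1 / n_label

def predict(f1, f2):
    # Pick the class maximizing P(y) * P(f1|y) * P(f2|y)
    return max(prior, key=lambda c: prior[c] * cond[(0, f1, c)] * cond[(1, f2, c)])

print(predict("sunny", "mild"))
```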
Support Vector Machine
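The SVM slides are presented as figures. As a hedged sketch of the maximum margin classifier described earlier, scikit-learn's SVC with a linear kernel can be used; the toy data below is made up for illustration.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 8]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)  # C trades margin width against violations
clf.fit(X, y)

print(clf.support_vectors_)        # the points closest to the boundary
print(clf.predict([[4, 4]]))
```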
Decision Tree
• A decision tree is a type of supervised learning algorithm that is commonly used in machine learning to model and predict outcomes based on input data.
• It is a tree-like structure where each internal node tests an attribute, each branch corresponds to an attribute value, and each leaf node represents the final decision or prediction.
• The decision tree algorithm falls under the category of supervised learning. It can be used to solve both regression and classification problems.
Decision Tree
• The goal of machine learning is to decrease uncertainty or disorder in the dataset, and decision trees are used for this.
• Concepts like entropy, information gain, and the Gini index are used in a decision tree to decide which attribute to split on at each node.
Decision Tree
• Entropy is the measure of the uncertainty of a random variable in our dataset, or the measure of disorder. The higher the entropy, the higher the information content.
• Information gain measures the reduction in uncertainty (entropy) achieved by splitting on an attribute; the standard formulas are given below.
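For reference, the standard definitions, with pᵢ the proportion of class i in set S and S_v the subset of S where attribute A takes value v:

Entropy: H(S) = − Σᵢ pᵢ log₂ pᵢ
Information gain: IG(S, A) = H(S) − Σ_v (|S_v| / |S|) · H(S_v)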
Decision Tree
• The Gini index is a measure of the inequality or impurity of a distribution, commonly used in decision trees and other machine learning algorithms. For k classes, Gini = 1 − Σᵢ pᵢ². For binary classification it ranges from 0 to 0.5, where 0 indicates a pure set (all instances belong to the same class) and 0.5 indicates a maximally impure set (instances are evenly distributed across the classes).
Decision Tree
ID3 Decision Tree Learning
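The ID3 walkthrough is shown as figures; a minimal sketch of its core computation follows: measuring entropy and choosing the attribute with the highest information gain as the split. The tiny weather-style dataset is made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum(p_i * log2(p_i)) over the class proportions."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Entropy reduction from splitting on the attribute at attr_index."""
    total = entropy(labels)
    n = len(labels)
    for value in set(r[attr_index] for r in rows):
        subset = [l for r, l in zip(rows, labels) if r[attr_index] == value]
        total -= (len(subset) / n) * entropy(subset)
    return total

# Hypothetical weather-style data: (outlook, windy) -> play
rows = [("sunny", "no"), ("sunny", "yes"), ("rain", "no"), ("rain", "yes")]
labels = ["no", "no", "yes", "no"]

# ID3 picks the attribute with the highest information gain as the root
best = max(range(2), key=lambda i: information_gain(rows, labels, i))
print("split on attribute", best)
```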
Random Forest
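The random forest slides are presented as figures. As a hedged sketch, scikit-learn's RandomForestClassifier illustrates the idea of an ensemble of decision trees that vote on the final class; the data below is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# n_estimators = number of trees; each tree is trained on a bootstrap
# sample and considers random feature subsets at each split.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)
print(forest.predict(X[:3]))
```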
