Supervised Learning Algorithms
By: Dr Sonali Vyas
UPES
Introduction
• Supervised learning models are trained using a labeled dataset. Once
training is done, the model is tested on sample test data to check
whether it predicts the correct output.
• Supervised learning is the process of providing input data as well as
the correct output data to a machine learning model. The aim of a
supervised learning algorithm is to find a mapping function from the
input variable (x) to the output variable (y).
• In the real world, supervised learning is used for risk assessment,
image classification, fraud detection, spam filtering, etc.
Regression
Regression algorithms are used when there is a relationship between the
input variable and the output variable. They are used for the prediction of
continuous variables, as in weather forecasting, market trend analysis,
etc.
Regression Analysis in Machine learning
• Regression analysis is a statistical method to model the relationship
between a dependent (target) variable and one or more independent
(predictor) variables.
• It predicts continuous/real values such as temperature, age, salary, price,
etc.
• Regression fits a line or curve through the datapoints on the
target-predictor graph in such a way that the vertical distance between the
datapoints and the regression line is minimized.
• Example:
• Prediction of rain using temperature and other factors
• Determining market trends
• Prediction of road accidents due to rash driving.
Terminologies Related to the Regression Analysis:
• Dependent Variable: The main factor in Regression analysis which we want to predict
or understand is called the dependent variable. It is also called target variable.
• Independent Variable: The factors which affect the dependent variable, or which are
used to predict its values, are called independent variables, also known as
predictors.
• Outliers: An outlier is an observation with either a very low or a very high value
in comparison to the other observed values. An outlier may hamper the results, so it
should be avoided.
• Multicollinearity: If the independent variables are highly correlated with each
other, the condition is called multicollinearity. It should not be present in the
dataset, because it creates problems when ranking the most influential
variables.
• Underfitting and Overfitting: If our algorithm works well on the training dataset but
not on the test dataset, the problem is called overfitting. And if our algorithm
does not perform well even on the training dataset, the problem is called
underfitting. (A sketch contrasting the two follows below.)
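As a rough illustration only, the following Python sketch (assuming scikit-learn is installed; the noisy quadratic data are synthetic) fits polynomials of three different degrees and compares train and test scores. The low-degree model underfits, while the very high-degree model overfits:

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic noisy quadratic data, invented for illustration
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 3, 40)).reshape(-1, 1)
y = 1 + 2 * X.ravel() ** 2 + rng.normal(0, 1, 40)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 2, 15):  # underfit, reasonable fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    print(f"degree {degree:2d}: train R2 = {model.score(X_tr, y_tr):.2f}, "
          f"test R2 = {model.score(X_te, y_te):.2f}")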
Linear Regression
• Linear regression is a statistical method that is used for predictive analysis.
• Linear regression makes predictions for continuous/real or numeric variables such
as sales, salary, age, product price, etc.
• It assumes a linear relationship between a dependent variable (Y-axis) and one or
more independent variables (X-axis). For a single predictor, the relationship is
represented by the equation: Y = β0 + β1X
where:
Y is the dependent variable
X is the independent variable
β0 is the intercept
β1 is the slope
USE CASE:
• In a case study evaluating student performance, analysts used simple
linear regression to examine the relationship between study hours and
exam scores. By collecting data on the number of hours students
studied and their corresponding exam results, the analysts developed a
model that revealed a clear correlation: for each additional hour spent
studying, students' exam scores increased by an average of 5 points. This
case highlights the utility of simple linear regression in understanding and
improving academic performance.
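A minimal sketch of this use case, assuming scikit-learn is available; the hours and scores below are invented for illustration and are not data from the case study:

import numpy as np
from sklearn.linear_model import LinearRegression

hours = np.array([[1], [2], [3], [4], [5], [6]])  # study hours (feature)
scores = np.array([52, 57, 61, 68, 72, 77])       # exam scores (target), made up

model = LinearRegression()
model.fit(hours, scores)

print(f"intercept (β0): {model.intercept_:.2f}")
print(f"slope (β1): {model.coef_[0]:.2f}")        # points gained per extra study hour
print(f"predicted score for 7 hours: {model.predict([[7]])[0]:.1f}")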
• Multiple Linear Regression:
If more than one independent variable is used to predict the value of a numerical
dependent variable, then such a Linear Regression algorithm is called Multiple
Linear Regression. The goal of the algorithm is to find the best-fit equation that
can predict the values based on the independent variables.
The equation for multiple linear regression is:
Y = β0 + β1X1 + β2X2 + … + βnXn
where:
• Y is the dependent variable
• X1, X2, …, Xn are the independent variables
• β0 is the intercept
• β1, β2, …, βn are the slopes
USE CASE
• Agricultural Yield Prediction: Farmers can use MLR to estimate
crop yields based on several variables like rainfall, temperature, soil
quality, and fertilizer usage. This information helps in planning
agricultural practices for optimal productivity.
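A minimal multiple-linear-regression sketch for the crop-yield example, again using scikit-learn; the feature values and yields are invented purely for illustration:

import numpy as np
from sklearn.linear_model import LinearRegression

# Columns: rainfall (mm), temperature (°C), fertilizer (kg/ha) — all made up
X = np.array([
    [450, 24, 80],
    [520, 26, 90],
    [390, 22, 70],
    [610, 27, 110],
    [480, 25, 85],
])
y = np.array([3.1, 3.6, 2.7, 4.2, 3.3])  # yield in tonnes/ha, made up

model = LinearRegression().fit(X, y)
print("intercept (β0):", model.intercept_)
print("slopes (β1..β3):", model.coef_)   # one coefficient per feature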
• When working with linear regression, our main goal is to find the best-fit
line, which means the error between the predicted values and the actual values
should be minimized. The best-fit line will have the least error.
• Different values for the weights or coefficients of the line (a0, a1)
give different regression lines, so we need to calculate the best
values for a0 and a1 to find the best-fit line. To do this, we use a
cost function.
• For Linear Regression, we use the Mean Squared Error (MSE) cost function, which is
the average of the squared errors between the predicted values and the actual values.
• For the above linear equation, MSE can be calculated as:

MSE = (1/N) Σ (Yi − (a1xi + a0))², summed over i = 1 to N

Where,
• N = Total number of observations
• Yi = Actual value
• (a1xi + a0) = Predicted value
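A small sketch computing this MSE cost for candidate coefficients a0 and a1; the x and y values are illustrative:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])  # actual values, made up

def mse(a0, a1, x, y):
    predicted = a1 * x + a0          # predicted value a1*xi + a0 for each xi
    return np.mean((y - predicted) ** 2)

print(mse(1.0, 2.0, x, y))  # cost of the candidate line y = 1 + 2x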
Residuals: The distance between an actual value and the corresponding predicted value is
called the residual. If the observed points are far from the regression line, the residuals
will be high, and so the cost function will be high. If the scatter points are close to the
regression line, the residuals will be small, and hence the cost function will be small.
• Gradient Descent: A linear regression model can be trained using the
optimization algorithm Gradient Descent by iteratively modifying the model's
parameters to reduce the mean squared error (MSE) of the model on a training
dataset. The idea is to start with random values for a0 and a1 and then iteratively
update them until the cost reaches its minimum.
Finding the coefficients of a linear equation that best fits the training data is the
objective of linear regression. The coefficients are updated by moving in the direction of
the negative gradient of the Mean Squared Error with respect to the coefficients. If α is
the learning rate, the updates for the intercept and the coefficient of x are:

a0 = a0 − α · ∂MSE/∂a0
a1 = a1 − α · ∂MSE/∂a1
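A minimal gradient-descent sketch implementing the update rules above for simple linear regression; the data are made up, and the learning rate and iteration count are arbitrary choices:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.1, 6.9, 9.2, 10.8])  # made-up training data

a0, a1 = 0.0, 0.0   # start from arbitrary (here zero) coefficients
alpha = 0.01        # learning rate α
for _ in range(5000):
    error = (a1 * x + a0) - y
    grad_a0 = 2 * np.mean(error)       # ∂MSE/∂a0
    grad_a1 = 2 * np.mean(error * x)   # ∂MSE/∂a1
    a0 -= alpha * grad_a0
    a1 -= alpha * grad_a1

print(f"intercept: {a0:.3f}, slope: {a1:.3f}")  # approaches the least-squares fit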
R-squared Method
• R-squared is a statistical method that determines the goodness of fit.
• It measures the strength of the relationship between the dependent and
independent variables on a scale of 0-100%.
• A high value of R-squared indicates a small difference between the
predicted values and the actual values, and hence represents a good model.
• It is also called the coefficient of determination, or the coefficient of multiple
determination for multiple regression.
• It can be calculated from the below formula:

R² = 1 − (Sum of squared residuals / Total sum of squares)
   = 1 − Σ(Yi − Ŷi)² / Σ(Yi − Ȳ)²

where Ŷi is the predicted value and Ȳ is the mean of the actual values.
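A short sketch computing R-squared directly from this formula; the actual and predicted values are illustrative:

import numpy as np

y_actual = np.array([3.0, 5.1, 6.9, 9.2, 10.8])
y_pred = np.array([3.1, 5.0, 7.0, 9.0, 10.9])  # made-up model outputs

ss_res = np.sum((y_actual - y_pred) ** 2)           # sum of squared residuals
ss_tot = np.sum((y_actual - y_actual.mean()) ** 2)  # total sum of squares
r_squared = 1 - ss_res / ss_tot
print(f"R-squared: {r_squared:.4f}")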