
L4a - Supervised Learning

Supervised Learning This document discusses supervised learning techniques for regression analysis, including linear regression, polynomial regression, and logistic regression. Linear regression models the relationship between a dependent variable and one or more independent variables to make continuous predictions. Polynomial regression extends linear regression to model nonlinear relationships. Logistic regression is used for classification problems to predict binary outcomes like true/false using probabilities calculated from predictor variables.


Supervised Learning

Regression Analysis in Machine Learning
Learning Objectives
• Linear Regression
• Polynomial Regression
• Logistic Regression
Introduction

• Regression analysis is a statistical method to model the relationship
between a dependent (target) variable and one or more independent
(predictor) variables.
• Regression analysis helps us to understand how the value of the
dependent variable is changing corresponding to an independent variable
when other independent variables are held fixed.
• It predicts continuous/real values such as temperature, age, salary,
price, etc.
• Regression is a supervised learning technique which helps in finding the
correlation between variables and enables us to predict a continuous
output variable based on one or more predictor variables. It is mainly
used for prediction, forecasting, time-series modeling, and determining
cause-and-effect relationships between variables.
Introduction

• In regression, we plot the line or curve that best fits the given
datapoints; using this fit, the machine learning model can make predictions
about the data.
• In simple words, "Regression shows a line or curve that passes through the
datapoints on the target-predictor graph in such a way that the vertical
distance between the datapoints and the regression line is minimum."
• The distance between the datapoints and the line tells whether the model has
captured a strong relationship or not.
• Some examples of regression can be as:
– Prediction of rain using temperature and other factors
– Determining Market trends
– Prediction of road accidents due to rash driving.
Example 1: Regression
• Suppose there is a marketing company A, which places various advertisements
every year and gets sales from them. The list below shows the advertising
spend by the company in the last 5 years and the corresponding sales:

• Now, the company wants to spend $200 on advertising in the year 2019 and
wants to predict the sales for this year. To solve such prediction problems
in machine learning, we need regression analysis.
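As an illustration, a simple least-squares fit can produce such a prediction. The advertisement/sales figures below are hypothetical placeholders, since the original table is not reproduced in these notes:

```python
import numpy as np

# Hypothetical advertisement/sales figures (the original table from the
# slides is not reproduced here); values are in arbitrary dollar units.
ad_spend = np.array([90.0, 120.0, 150.0, 100.0, 130.0])
sales = np.array([1000.0, 1300.0, 1800.0, 1200.0, 1380.0])

# Fit y = a0 + a1*x by ordinary least squares (degree-1 polynomial).
a1, a0 = np.polyfit(ad_spend, sales, deg=1)

# Predict sales for a $200 advertising budget.
predicted_sales = a0 + a1 * 200.0
print(round(predicted_sales, 2))
```

With a positive fitted slope, the model extrapolates higher sales for the larger $200 budget; the exact number depends entirely on the (here hypothetical) historical data.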
Terminologies Related to the Regression Analysis:

• Dependent Variable: The main factor in regression analysis which we want to predict
or understand is called the dependent variable. It is also called the target variable.
• Independent Variable: The factors which affect the dependent variable, or which are
used to predict its values, are called independent variables, also called predictors.
• Outliers: An outlier is an observation which contains either a very low or a very
high value in comparison to the other observed values. Outliers can distort the
fitted model, so they should be handled carefully.
• Multicollinearity: If the independent variables are highly correlated with each
other, this condition is called multicollinearity. It should not be present in the
dataset, because it creates problems when ranking the most influential variables.
• Underfitting and Overfitting: If our algorithm works well with the training dataset
but not with the test dataset, the problem is called overfitting. And if our
algorithm does not perform well even with the training dataset, the problem is
called underfitting.
Why do we use Regression Analysis?

• Below are some reasons for using Regression analysis:


1. Regression estimates the relationship between the target and the
independent variable.
2. It is used to find the trends in data.
3. It helps to predict real/continuous values.
4. By performing regression, we can determine the most important
factor, the least important factor, and how each factor affects
the others.
Types of Regression
• There are various types of regressions which are used in data science and
machine learning.
• Each type has its own importance on different scenarios, but at the core,
all the regression methods analyze the effect of the independent variable
on dependent variables.
• Here we discuss some important types of regression: linear regression,
logistic regression, and polynomial regression.
Linear Regression
Introduction to Linear Regression

• Linear regression is a statistical regression method which is used for
predictive analysis.
• It is one of the simplest and easiest algorithms; it performs regression
and shows the relationship between continuous variables.
• It is used for solving regression problems in machine learning.
• Linear regression shows the linear relationship between the
independent variable (X-axis) and the dependent variable (Y-axis),
hence called linear regression.
• If there is only one input variable (x), then such linear regression is
called simple linear regression. And if there is more than one input
variable, then such linear regression is called multiple linear regression.
A. Simple Linear Regression
• Simple Linear Regression is a type of Regression algorithms that models
the relationship between a dependent variable and a single independent
variable.
• The relationship shown by a Simple Linear Regression model is linear or a
sloped straight line, hence it is called Simple Linear Regression.
• The key point in Simple Linear Regression is that the dependent variable
must be a continuous/real value. However, the independent variable can
be continuous or categorical.
• Simple Linear regression algorithm has mainly two objectives:
1. Model the relationship between the two variables. Such as the relationship
between Income and expenditure, experience and Salary, etc.
2. Forecasting new observations. Such as Weather forecasting according to
temperature, Revenue of a company according to the investments in a year, etc.
A. Simple Linear Regression
Simple Linear Regression Model:
• The Simple Linear Regression model can be represented
using the equation:
y = a0 + a1x + ε
• Where,
– a0 = the intercept of the regression line (obtained by
putting x = 0)
– a1 = the slope of the regression line, which tells
whether the line is increasing or decreasing
– ε = the error term (for a good model it will be negligible)
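The least-squares estimates of a0 and a1 have a simple closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. A minimal sketch (toy data, not from the slides):

```python
import numpy as np

def fit_simple_linear(x, y):
    """Least-squares estimates of intercept a0 and slope a1 for y = a0 + a1*x."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    # Slope: covariance of x and y divided by the variance of x.
    a1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    # Intercept: the fitted line passes through the point of means.
    a0 = y.mean() - a1 * x.mean()
    return a0, a1

# A perfectly linear toy dataset: y = 2 + 3x, so the fit should recover 2 and 3.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x
a0, a1 = fit_simple_linear(x, y)
print(a0, a1)  # intercept ≈ 2.0, slope ≈ 3.0
```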
B. Multiple Linear Regression (MLR)
• In the previous topic, we have learned about Simple Linear Regression, where a
single Independent/Predictor(X) variable is used to model the response variable (Y).
But there may be various cases in which the response variable is affected by more
than one predictor variable; for such cases, the Multiple Linear Regression algorithm
is used.
• Moreover, Multiple Linear Regression is an extension of Simple Linear regression as it
takes more than one predictor variable to predict the response variable. We can
define it as:
• Multiple Linear Regression is one of the important regression algorithms which
models the linear relationship between a single dependent continuous variable and
more than one independent variable.
• Some key points about MLR:
1. For MLR, the dependent or target variable (Y) must be continuous/real, but the
predictor or independent variables may be continuous or categorical.
2. Each feature variable must model a linear relationship with the dependent
variable.
3. MLR tries to fit a regression line through a multidimensional space of datapoints.
MLR equation:
• In Multiple Linear Regression, the target variable (Y) is a linear combination of
multiple predictor variables x1, x2, x3, ..., xn. Since it is an extension of Simple
Linear Regression, the same form applies, and the equation becomes:
Y = b0 + b1x1 + b2x2 + b3x3 + ... + bnxn
Where,
– Y = output/response variable
– b0, b1, b2, b3, ..., bn = coefficients of the model
– x1, x2, x3, ..., xn = independent/feature variables
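The MLR equation can be fitted by solving for all the b coefficients at once with ordinary least squares. A small sketch using NumPy's least-squares solver, with toy data generated so the true coefficients are known:

```python
import numpy as np

# Toy data generated from Y = 1 + 2*x1 + 3*x2, so the fit should recover b0, b1, b2.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Prepend a column of ones so the first coefficient is the intercept b0.
X_design = np.column_stack([np.ones(len(X)), X])
coeffs, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(np.round(coeffs, 4))  # ≈ [1. 2. 3.]
```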
Assumptions for Multiple Linear Regression:
• A linear relationship should exist between the Target and predictor variables.
• The regression residuals must be normally distributed.
• MLR assumes little or no multicollinearity (correlation between the
independent variable) in data.
Applications of Multiple Linear Regression:
• There are mainly two applications of Multiple Linear Regression:
– Measuring the effectiveness of each independent variable on the prediction.
– Predicting the impact of changes in a predictor on the response.
Logistic Regression
Introduction
• Logistic regression is another supervised learning algorithm which is used to solve the classification
problems.
• In classification problems, we have dependent variables in a binary or discrete format such as 0 or 1.
• Logistic regression algorithm works with the categorical variable such as 0 or 1, Yes or No, True or
False, Spam or not spam, etc.
• It is a predictive analysis algorithm which works on the concept of probability.
• Logistic regression is a type of regression, but it differs from the linear regression
algorithm in terms of how it is used.
• Linear regression is used for solving regression problems, whereas logistic regression is
used for solving classification problems.
• Logistic regression uses the sigmoid function (also called the logistic function) to map
the model's output to a probability.
• The function can be represented as:
S(x) = 1 / (1 + e^(-x))
• S(x) = output between the 0 and 1 value
• x = input to the function
• e = base of the natural logarithm
• In Logistic regression, instead of fitting a regression line, we fit an "S"
shaped logistic function, which predicts two maximum values (0 or 1).
• A sigmoid function is a mathematical function having a characteristic
"S"-shaped curve or sigmoid curve.

• It uses the concept of threshold levels: values above the threshold level
are rounded up to 1, and values below the threshold level are rounded
down to 0.
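The sigmoid function and the threshold rule above can be sketched as:

```python
import math

def sigmoid(x):
    """Logistic function S(x) = 1 / (1 + e^(-x)); output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def classify(x, threshold=0.5):
    """Round probabilities at or above the threshold to class 1, below it to class 0."""
    return 1 if sigmoid(x) >= threshold else 0

print(round(sigmoid(0), 2))  # → 0.5
print(classify(3))           # → 1  (sigmoid(3) ≈ 0.95, above the threshold)
print(classify(-3))          # → 0  (sigmoid(-3) ≈ 0.05, below the threshold)
```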
Logistic Regression Equation:
• The Logistic regression equation can be obtained from the Linear
Regression equation.
• The mathematical steps to get Logistic Regression equations are given
below:
• We know the equation of the straight line can be written as:
y = b0 + b1x1 + b2x2 + b3x3 + ... + bnxn
• In Logistic Regression, y can be between 0 and 1 only, so for this let's
divide y by (1 - y), which is 0 for y = 0 and infinity for y = 1:
y / (1 - y)
• But we need a range between -infinity and +infinity; taking the
logarithm, the equation becomes:
log[y / (1 - y)] = b0 + b1x1 + b2x2 + b3x3 + ... + bnxn
• The above equation is the final equation for Logistic Regression.


• The curve from the logistic function indicates the likelihood
of something such as whether the cells are cancerous or
not, a mouse is obese or not based on its weight, etc.
• Logistic Regression is a significant machine learning
algorithm because it has the ability to provide probabilities
and classify new data using continuous and discrete
datasets.
• Logistic Regression can be used to classify the observations
using different types of data and can easily determine the
most effective variables used for the classification. 
There are three types of logistic regression:
• Binomial logistic regression: there can be only two possible types of
the dependent variable, such as 0 or 1, Pass or Fail, etc.
• Multinomial logistic regression: there can be 3 or more possible
unordered types of the dependent variable, such as "cat", "dog", or
"sheep".
• Ordinal logistic regression: there can be 3 or more possible ordered
types of the dependent variable, such as "low", "medium", or "high".
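Binomial logistic regression can be sketched end to end with NumPy by minimizing the log-loss with gradient descent. This is an illustrative toy, not a production implementation; the data, learning rate, and iteration count are assumptions chosen for the example:

```python
import numpy as np

def sigmoid(z):
    # Logistic function mapping any real value into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Gradient descent on the log-loss (toy hyperparameters, assumed values)."""
    X = np.column_stack([np.ones(len(X)), X])  # intercept column
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)                     # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)       # gradient of the log-loss
    return w

def predict(w, X, threshold=0.5):
    X = np.column_stack([np.ones(len(X)), X])
    return (sigmoid(X @ w) >= threshold).astype(int)

# Linearly separable toy data: class 1 whenever x > 2.
X = np.array([[0.0], [1.0], [1.5], [2.5], [3.0], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])
w = fit_logistic(X, y)
print(predict(w, X))  # → [0 0 0 1 1 1]
```

The same fitting loop handles multiple features unchanged, since `X` is stacked into a design matrix before training.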
Assumptions for Logistic Regression:
• The dependent variable must be categorical in nature.
• The independent variables should not have multicollinearity.
Polynomial Regression
Introduction

• Polynomial Regression is a type of regression which models
a non-linear dataset using a linear model.
• It is similar to multiple linear regression, but it fits a non-linear
curve between the value of x and corresponding conditional
values of y.
• Suppose there is a dataset consisting of datapoints which
are arranged in a non-linear fashion; in such a case, linear
regression will not best fit those datapoints. To cover such
datapoints, we need Polynomial regression.
• In Polynomial regression, the original features are transformed
into polynomial features of a given degree and then modeled
using a linear model, which means the datapoints are best fitted
using a polynomial curve.
• In statistics, polynomial regression is a form of regression analysis in
which the relationship between the independent variable x and
the dependent variable y is modelled as an nth degree polynomial in x.
• Polynomial regression fits a nonlinear relationship between the value
of x and the corresponding conditional mean of y, denoted E(y |x).
• Although polynomial regression fits a nonlinear model to the data, as
a statistical estimation problem it is linear, in the sense that the regression
function E(y | x) is linear in the unknown parameters that are estimated
from the data.
– For this reason, polynomial regression is considered to be a special
case of multiple linear regression.
• However, Polynomial regression is different from Multiple Linear
regression in that, in Polynomial regression, different powers of a single
variable are used as features, instead of multiple distinct variables.
• The equation for polynomial regression is also derived from the linear
regression equation: the linear regression equation Y = b0 + b1x is transformed
into the polynomial regression equation:
Y = b0 + b1x + b2x^2 + b3x^3 + ... + bnx^n
• Here Y is the predicted/target output, and b0, b1, ..., bn are the regression
coefficients. x is our independent/input variable.
• The model is still linear because it is linear in the coefficients,
even though the features are powers of x.
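Because the model is linear in the coefficients, polynomial regression can be solved with the same least-squares machinery after expanding the features. A sketch with a toy quadratic dataset:

```python
import numpy as np

# Polynomial regression as linear regression on polynomial features:
# expand x into [1, x, x^2] and solve ordinary least squares.
x = np.linspace(-3, 3, 30)
y = 1.0 + 2.0 * x + 0.5 * x**2          # quadratic ground truth, no noise

X_poly = np.column_stack([np.ones_like(x), x, x**2])
coeffs, *_ = np.linalg.lstsq(X_poly, y, rcond=None)
print(np.round(coeffs, 4))  # ≈ [1.  2.  0.5]
```

A plain straight-line fit to this data would miss the curvature; the expanded feature matrix lets the same linear solver capture it.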
• To be discussed in future chapters
i. Ridge Regression
ii. Lasso Regression
iii. Support Vector Regression
iv. Decision Tree Regression
