Regression Modelling

Introduction

 Regression analysis is a form of predictive modelling technique that
investigates the relationship between a dependent (target) variable and one or
more independent (predictor) variables. This technique is used for forecasting,
time-series modelling and finding the causal effect between variables. For
example, the relationship between rash driving and the number of road accidents
by a driver is best studied through regression.

 Regression analysis is an important tool for modelling and analyzing
data. Here, we fit a curve or line to the data points in such a manner that the
sum of the squared distances of the data points from the curve or line is
minimized. This is explained in more detail in the coming sections.
Need for Regression Analysis
 As mentioned above, regression analysis estimates the relationship between
two or more variables. Let's understand this with an easy example:
 Say you want to estimate the growth in sales of a company based on current
economic conditions. The recent company data indicates that the growth in
sales is around two and a half times the growth in the economy. Using this
insight, we can predict future sales of the company based on current and past
information.
 There are multiple benefits of using regression analysis. They are as follows:
 It indicates significant relationships between the dependent variable and the
independent variables.
 It indicates the strength of impact of multiple independent variables on a
dependent variable.
 It allows us to compare the effects of variables measured on different scales,
such as the effect of price changes and the number of promotional activities.
These benefits help market researchers / data analysts / data scientists to
evaluate and select the best set of variables for building predictive models.
Types of Regression

 Regression analysis is commonly divided into:
 Linear Regression
 Multiple Regression
 Logistic Regression
Types of Regression
 Simple linear regression
 One dependent variable (interval or ratio)
 One independent variable (interval or ratio or dichotomous)
 Multiple linear regression
 One dependent variable (interval or ratio)
 Two or more independent variables (interval or ratio or dichotomous)
 Logistic regression
 One dependent variable (binary)
 One or more independent variable(s) (interval or ratio or dichotomous)
 Ordinal regression
 One dependent variable (ordinal)
 One or more independent variable(s) (nominal or dichotomous)
 Multinomial regression
 One dependent variable (nominal)
 One or more independent variable(s) (interval or ratio or dichotomous)
 Discriminant analysis
 One dependent variable (nominal)
 One or more independent variable(s) (interval or ratio)
Linear Regression
 Linear regression is one of the simplest statistical regression methods used for
predictive analysis in machine learning. It models the linear relationship
between the independent (predictor) variable on the X-axis and the
dependent (output) variable on the Y-axis. If there is a single
input variable X (independent variable), the model is called simple
linear regression.

A graph of such data presents the linear relationship between the output (y) variable
and the predictor (X) variable. The fitted line is referred to as the best-fit straight
line: based on the given data points, we attempt to plot the line that fits the points
the best.
Linear Regression
 To calculate the best-fit line, linear regression uses the traditional slope-intercept
form given below:
 Yi = β0 + β1Xi
 Where Yi = dependent variable for the given value of the independent variable.
 β0 = constant/intercept (the predicted value of y when x is 0).
 β1 = slope or regression coefficient (how much we expect y to change as x increases).
 Xi = independent variable (the variable we expect to influence the dependent variable
y).
 This algorithm explains the linear relationship between the dependent (output) variable
y and the independent (predictor) variable X using a straight line.

 But how does linear regression find the best-fit line?

The goal of the linear regression algorithm is to find the values of β0 and β1 that give
the best-fit line. The best-fit line is the line with the least error, which means the
error between predicted values and actual values should be minimum.
 You can use simple linear regression when you want to know:
1. How strong the relationship between two variables is.
2. The value of the dependent variable at a certain value of the independent variable.
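As a sketch, the least-squares estimates of β0 and β1 can be computed directly from the data. The data values below are made up for illustration:

```python
import numpy as np

# Hypothetical sample data (made-up values for illustration)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # independent variable
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])   # dependent variable

# Least-squares estimates:
# b1 = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2)
# b0 = y_mean - b1 * x_mean
x_mean, y_mean = X.mean(), y.mean()
b1 = np.sum((X - x_mean) * (y - y_mean)) / np.sum((X - x_mean) ** 2)
b0 = y_mean - b1 * x_mean

# Predict y on the fitted line Yi = b0 + b1 * Xi
y_pred = b0 + b1 * X
print(round(b0, 4), round(b1, 4))   # intercept and slope
```

These closed-form estimates are exactly the values that minimize the sum of squared errors between `y` and `y_pred`.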
Assumptions of Linear Regression
1. Homogeneity of variance: the size of the error in our prediction does not change
significantly across the values of the independent variable.
2. Independence of observations: the observations in the dataset were collected using
statistically valid sampling methods, and there are no hidden relationships among
variables.
3. Normality: the data follow a normal distribution.

Simple linear regression makes one additional assumption:

4. The relationship between the independent and dependent variable is linear: the line
of best fit through the data points is a straight line rather than a curve.
Calculate R2
 To check the goodness of fit, R2 is calculated.
 The R-squared value is a statistical measure of how close the data
are to the fitted regression line.
 It determines whether x and y are correlated or not; a larger value
indicates a better model.
 For this, calculate:
 the distance of each predicted value from the mean
 the distance of each actual value from the mean
 This gives R2 = Σ(yp − ȳ)² / Σ(y − ȳ)²
where yp is the predicted value and ȳ is the mean of the actual values.
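The ratio above can be computed in a few lines. The actual and predicted values here are hypothetical, chosen only to illustrate the calculation:

```python
import numpy as np

# Hypothetical actual values and the corresponding model predictions
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])            # actual
y_pred = np.array([2.24, 4.18, 6.12, 8.06, 10.0])  # predicted by a fitted line

y_mean = y.mean()
ss_pred = np.sum((y_pred - y_mean) ** 2)   # distances of predicted values from the mean
ss_total = np.sum((y - y_mean) ** 2)       # distances of actual values from the mean

# R^2 = sum((y_pred - y_mean)^2) / sum((y - y_mean)^2)
r2 = ss_pred / ss_total
print(round(r2, 4))
```

A value close to 1 means the predictions track the actual data closely; a value near 0 means the model explains little of the variation.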
Linear Regression Solved Numerical
Multiple Linear Regression
 Multiple linear regression (MLR), also known simply as multiple
regression, is a statistical technique that uses several explanatory variables
to predict the outcome of a response variable. The goal of multiple linear
regression is to model the linear relationship between the explanatory
(independent) variables and the response (dependent) variable.

 yi = β0 + β1xi1 + β2xi2 + ... + βpxip + ϵ
 where, for i = 1, ..., n observations:
 yi = dependent variable
 xi1, ..., xip = explanatory (independent) variables
 β0 = y-intercept (constant term)
 β1, ..., βp = slope coefficients for each explanatory variable
 ϵ = the model's error term (also known as the residual)
 As the number of independent variables increases to 2, our graph becomes
3D. The added third dimension represents the second independent variable.
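The coefficients of the model above can be estimated by least squares. As a sketch, with hypothetical data generated from y = 1 + 2·x1 + 3·x2 (no noise, so the fit is exact):

```python
import numpy as np

# Hypothetical data: two explanatory variables and a response generated
# from y = 1 + 2*x1 + 3*x2 (made-up values for illustration)
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])
y = np.array([9.0, 8.0, 19.0, 18.0, 26.0])

# Prepend a column of ones so the first coefficient acts as the intercept b0
A = np.column_stack([np.ones(len(X)), X])

# Least-squares solution for [b0, b1, b2] in y = b0 + b1*x1 + b2*x2 + e
coeffs, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)
print(coeffs)   # approximately [1. 2. 3.]
```

The same call scales to any number of explanatory variables: each extra column of `X` adds one slope coefficient.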
Multiple Linear Regression Numerical
Logistic Regression
 Logistic regression is a supervised machine learning algorithm that can be
used to model the probability of a certain class or event. It is used when the data
are linearly separable and the outcome is binary in nature.
 That means logistic regression is usually used for binary classification problems.

 Logistic regression models P(y = 1 | x) = 1 / (1 + e^−(β0 + β1x)), the sigmoid of
the weighted sum of the inputs.
 Binary classification refers to predicting an output variable that is discrete,

with two classes. A few examples of binary classification are Yes/No, Pass/Fail,
Win/Lose, Cancerous/Non-cancerous, etc.
Logistic Regression
 Logistic regression is one of the most popular machine learning algorithms that
come under supervised learning techniques.
 It can be used for classification as well as for regression problems, but it is
mainly used for classification problems.
 Logistic regression is used to predict the categorical dependent variable with the
help of independent variables.
 The output of a logistic regression problem can only be between 0 and 1.
 Logistic regression can be used where probabilities between two classes are
required, such as whether it will rain today or not: either 0 or 1, true or false, etc.
 Logistic regression is based on the concept of maximum likelihood estimation.
According to this estimation, the observed data should be most probable.
 In logistic regression, we pass the weighted sum of inputs through an activation
function that maps values to between 0 and 1. This activation function is known
as the sigmoid function, and the curve obtained is called the sigmoid curve or S-curve.
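The sigmoid step above can be sketched in a few lines. The coefficients here are hypothetical (in practice they would be found by maximum likelihood estimation):

```python
import math

def sigmoid(z):
    """Map any real-valued input to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients (in practice estimated by maximum likelihood)
b0, b1 = -4.0, 2.0

def predict_proba(x):
    # P(y = 1 | x) = sigmoid(b0 + b1 * x), the weighted sum passed
    # through the sigmoid activation
    return sigmoid(b0 + b1 * x)

def predict_class(x, threshold=0.5):
    # Binary classification: predict class 1 when the probability
    # reaches the threshold
    return 1 if predict_proba(x) >= threshold else 0

print(predict_proba(2.0))   # sigmoid(0) = 0.5
print(predict_class(3.0), predict_class(1.0))
```

At x = 2.0 the weighted sum is exactly 0, so the predicted probability sits on the 0.5 decision boundary; larger x values push the prediction towards class 1, smaller ones towards class 0.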
Difference between Linear Regression and Logistic Regression
 Linear regression is used to predict a continuous dependent variable using a
given set of independent variables; logistic regression is used to predict a
categorical dependent variable.
 Linear regression is used for solving regression problems; logistic regression
is used for solving classification problems.
 In linear regression, we predict the values of continuous variables; in
logistic regression, we predict the values of categorical variables.
 In linear regression, we find the best-fit line, with which we can easily
predict the output; in logistic regression, we find the S-curve, with which we
can classify the samples.
 Linear regression uses the least-squares method to estimate its parameters;
logistic regression uses the maximum likelihood estimation method.
 The output of linear regression must be a continuous value, such as price or
age; the output of logistic regression must be a categorical value such as
0 or 1, Yes or No.
 Linear regression requires the relationship between the dependent and
independent variables to be linear; logistic regression does not require a
linear relationship between the dependent and independent variables.
 In linear regression, there may be collinearity between the independent
variables; in logistic regression, there should not be collinearity between
the independent variables.
