Lecture Note #8 - PEC-CS701E

What is Regression Analysis?

Regression analysis is a statistical method for modelling the
relationship between a dependent (target) variable and one or
more independent (predictor) variables.
It helps us understand how the value of the dependent variable
changes with respect to one independent variable while the other
independent variables are held fixed.

It predicts continuous/real values such as temperature, age,
salary, price, etc.
When and why do you use Regression?
Regression is performed when the dependent variable is of a
continuous data type, while the predictors (independent
variables) can be of any data type: continuous,
nominal/categorical, etc.
The regression method tries to find the best-fit line, which
shows the relationship between the dependent variable and the
predictors with the least error.
In regression, the output/dependent variable is a function of
the independent variables, the coefficients, and an error term.
Types of Regression
What is Linear Regression?

• In classification, we seek to identify the categorical class Ck
associated with a given input vector x.
• In regression, we seek to identify (or estimate) a continuous
variable y associated with a given input vector x.
• y is called the dependent variable.
• x is called the independent variable.
• If y is a vector, we call this multiple regression.
Linear Regression
Linear regression shows the linear relationship
between the independent variable (X-axis) and the
dependent variable (Y-axis), hence called linear
regression.

If there is only one input variable (x), then such linear


regression is called simple linear regression. And if
there is more than one input variable, then such
linear regression is called multiple linear regression.
Representing Linear Regression Model
The linear regression model provides a sloped straight line
representing the relationship between the variables.
Example
The relationship between variables in the linear regression
model can be illustrated with a simple example: predicting the
salary of an employee on the basis of years of experience.
Types of Linear Regression
Based on the number of independent variables, there are two
types of linear regression:
Simple Linear Regression
In simple linear regression, the dependent variable depends only
on a single independent variable.

For simple linear regression, the form of the model is:

Y = β0 + β1X

Here,
Y is the dependent variable.
X is the independent variable.
β0 and β1 are the regression coefficients.
β0 is the intercept or bias, which fixes the offset of the line.
β1 is the slope or weight, which specifies the factor by which X
has an impact on Y.
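As a concrete illustration, here is a minimal sketch of fitting a simple linear regression with NumPy using the closed-form least-squares estimates for β0 and β1. The salary/experience numbers are made up purely for illustration.

```python
import numpy as np

# Hypothetical data: years of experience vs. salary (in $1000s).
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([30, 35, 42, 48, 55], dtype=float)

# Closed-form least-squares estimates for Y = b0 + b1*X
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

print(f"intercept b0 = {b0:.2f}, slope b1 = {b1:.2f}")
print("predicted salary for 6 years:", b0 + b1 * 6)
```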
Where is the Linear in Linear Regression?
• In regression we assume that y is a function of x. The exact
nature of this function is governed by an unknown parameter
vector w:

y = y(x, w)

• The regression is linear if y is linear in w. In other words,
we can express y as

y = wᵀφ(x)

where φ(x) is some (potentially nonlinear) function of x.
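The point that "linear" refers to linearity in w can be made concrete with a short sketch: the model below is quadratic in x, yet it is still fitted by ordinary linear least squares because it is linear in the weights. The basis φ(x) = [1, x, x²] and the synthetic data are illustrative assumptions.

```python
import numpy as np

# Synthetic data (for illustration only)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + 0.1 * np.random.randn(20)

# Design matrix: each row is phi(x_i) = [1, x_i, x_i^2]
Phi = np.column_stack([np.ones_like(x), x, x ** 2])

# Least-squares solution for w in y ≈ Phi @ w
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print("fitted weights:", w)
```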
Linear Regression Line
A straight line showing the relationship between the dependent
and independent variables is called a regression line.

Case-01: β1 < 0
• It indicates that variable X has a negative impact on Y.
• If X increases, Y will decrease, and vice-versa.

Case-02: β1 = 0
• It indicates that variable X has no impact on Y.
• If X changes, there will be no change in Y.

Case-03: β1 > 0
• It indicates that variable X has a positive impact on Y.
• If X increases, Y will increase, and vice-versa.
Multiple Linear Regression
In multiple linear regression, the dependent variable depends on
more than one independent variable.

For multiple linear regression, the form of the model is:

Y = β0 + β1X1 + β2X2 + β3X3 + …… + βnXn

Here,
• Y is the dependent variable.
• X1, X2, …, Xn are independent variables.
• β0, β1, …, βn are the regression coefficients.
• βj (1 ≤ j ≤ n) is the slope or weight that specifies the factor
by which Xj has an impact on Y.
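A minimal sketch of fitting a multiple linear regression with two predictors, again via NumPy least squares; the data values are invented for illustration.

```python
import numpy as np

# Hypothetical data: two predictors X1, X2 and a target y.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])
y = np.array([10.0, 9.0, 20.0, 19.0, 26.0])

# Prepend a column of ones so the intercept b0 is learned too.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print("b0, b1, b2 =", coef)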
Cost function
• Different values of the weights or coefficients give different
regression lines, and the cost function is used to estimate the
coefficient values for the best-fit line.

• Optimizing (minimizing) the cost function yields the regression
coefficients or weights. It measures how well a linear
regression model is performing.

• We can use the cost function to assess the accuracy of the
mapping function, which maps the input variable to the output
variable. This mapping function is also known as the Hypothesis
function.
Mean Squared Error (MSE)
For linear regression, we use the Mean Squared Error (MSE) cost
function, which is the average of the squared errors between the
predicted values and the actual values. It can be written as:

MSE = (1/N) Σi (Yi − (a1xi + a0))²

Where,
N = Total number of observations
Yi = Actual value
(a1xi + a0) = Predicted value
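A short sketch of the MSE computation; the sample values are made up to show the arithmetic.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: average of the squared residuals."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

# Residuals are 0.5, -0.5, 0.5, so MSE = 0.25
print(mse([3.0, 5.0, 7.0], [2.5, 5.5, 6.5]))
```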
Gradient Descent:

• Gradient descent is used to minimize the MSE by computing the
gradient of the cost function.

• A regression model uses gradient descent to update the
coefficients of the line so as to reduce the cost function.

• This is done by initializing the coefficients with random
values and then iteratively updating them until the cost
function reaches its minimum.
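A minimal sketch of gradient descent for the simple model Y = a0 + a1·X under the MSE cost. The learning rate, iteration count, and data are illustrative choices, not prescribed values.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([30, 35, 42, 48, 55], dtype=float)

a0, a1 = 0.0, 0.0   # initial coefficient values
lr, n = 0.02, 5000  # learning rate, number of iterations

for _ in range(n):
    pred = a0 + a1 * x
    err = pred - y
    # Gradients of MSE with respect to a0 and a1
    a0 -= lr * 2 * err.mean()
    a1 -= lr * 2 * (err * x).mean()

# Converges toward the closed-form least-squares solution
print(f"a0 = {a0:.2f}, a1 = {a1:.2f}")
```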
Model Performance:
R-squared method:
• R-squared is a statistical measure that determines the
goodness of fit.
• It measures the strength of the relationship between the
dependent and independent variables on a scale of 0-100%.
• A high value of R-squared indicates a small difference between
the predicted values and the actual values, and hence a good
model.
• It is also called the coefficient of determination, or the
coefficient of multiple determination for multiple regression.
• It can be calculated from the below formula:

R² = 1 − (Σi (Yi − Ŷi)²) / (Σi (Yi − Ȳ)²)

where Ŷi is the predicted value and Ȳ is the mean of the actual
values.
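A minimal sketch of the R-squared computation following the formula above; the example values are illustrative.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

# e.g. r_squared([3, 5, 7], [2.8, 5.1, 6.9]) ≈ 0.9925
```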
Some popular applications of linear regression are:

• Analyzing trends and sales estimates
• Salary forecasting
• Real estate prediction
• Arriving at ETAs in traffic
