Lec 3

The document provides an overview of regression analysis, focusing on its purpose: to predict the value of a response variable from one or more attribute variables. It explains the concepts of simple and multiple linear regression, the Ordinary Least Squares (OLS) method for estimating parameters, and the gradient descent optimization technique. Additionally, it includes mathematical derivations and examples to illustrate the application of these concepts in predicting outcomes based on given data.

Dr. Supriyo Mandal
Ph.D. (IIT Patna)
Postdoc (ZBW, University of Kiel, Germany)

Course code: CS31002 (L-T-P-Cr: 3-1-0-4)
Course Name: Machine Learning
Credits: 4
What is Regression

Regression – predicting the value of a response variable from one or more attribute variables.

Variables – continuous numeric values.

Regression analysis – a set of statistical processes for estimating the relationships between a dependent variable and
one or more independent variables.
v The dependent variable is often called the 'predictand', 'outcome' or 'response' variable;
v Independent variables are often called 'predictors', 'covariates', 'explanatory variables' or 'features'.
v Regression analysis is a way of mathematically sorting out which of those variables actually has an
impact. It is also used for modeling the future relationship between the variables.

Statistical process – the science of collecting, exploring, organizing, analyzing and interpreting data, and of exploring patterns
and trends, to answer questions and make decisions (a broad area).

y = a + bx
; a = intercept
b = slope/gradient (steepness of the line)
y = dependent variable
x = independent variable
Basics of Regression Models

v Regression models predict a value of the Y variable given known values of the X variables.

v Prediction within the range of values in the dataset used for model-fitting is known as interpolation.

v Prediction outside this range of the data is known as extrapolation.

v First, a model to estimate the outcome needs to be chosen.

v Then the parameters of that model need to be estimated using a chosen method (e.g., least
squares).
Example..........

Hour (x)    Marks (y)
5           40
7           120
12          180
16          210
20          240

y = a + bx
; a = intercept
b = slope/gradient
y = dependent variable (marks)
x = independent variable (hours)

Calculate the gradient........
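As a sketch of this calculation in Python (using the least-squares formulas b = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² and a = ȳ − b·x̄ derived later in these notes; the function name `fit_line` is my own):

```python
# Least-squares gradient (slope) and intercept for the hour/marks data.
def fit_line(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # b = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

hours = [5, 7, 12, 16, 20]
marks = [40, 120, 180, 210, 240]
a, b = fit_line(hours, marks)
print(round(b, 4), round(a, 4))  # gradient ≈ 12.2078, intercept ≈ 11.5065
```

For this table, x̄ = 12 and ȳ = 158, so the gradient works out to 1880/154 ≈ 12.21.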


Linear Regression with Gradient Descent....
v Gradient descent is an iterative optimization algorithm for finding the minimum of a function. Here that function is our
loss function. We will use the Mean Squared Error (MSE) as the loss function in this topic, shown below:

E = (1/n) Σ_{i=1}^{n} [y_i − (a + b x_i)]² = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²

v Understanding Gradient Descent
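A minimal sketch of this loss function (the name `mse` is my own):

```python
# Mean Squared Error of the line y = a + b*x on data (xs, ys).
def mse(a, b, xs, ys):
    n = len(xs)
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / n

# A perfect fit has zero error:
print(mse(1.0, 2.0, [0, 1, 2], [1, 3, 5]))  # 0.0
```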


Linear Regression with Gradient Descent....
v Mathematical derivation of gradient descent in simple linear regression:
v 1. Initially let a = 0 and b = 0. Let L be our learning rate. It controls how much the values of a and b change with each step. L
could be a small value like 0.0001 for good accuracy.
v 2. Calculate the partial derivatives of the loss function with respect to a and b, and plug the current values of x, y, a and b into
them to obtain the derivative values Da and Db.

Db = (1/n) Σ_{i=1}^{n} 2[y_i − (a + b x_i)](−x_i)

Db = (−2/n) Σ_{i=1}^{n} x_i [y_i − (a + b x_i)]

Db = (−2/n) Σ_{i=1}^{n} x_i (y_i − ŷ_i)

v Db is the value of the partial derivative with respect to b.

v Similarly, the partial derivative with respect to a is Da:

Da = (1/n) Σ_{i=1}^{n} 2[y_i − (a + b x_i)](−1)

Da = (−2/n) Σ_{i=1}^{n} [y_i − (a + b x_i)]

Da = (−2/n) Σ_{i=1}^{n} (y_i − ŷ_i)
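The derivative values Da and Db can be sketched as (the function name `gradients` is my own):

```python
# Partial derivatives of the MSE loss E = (1/n) * sum((y_i - (a + b*x_i))^2)
# with respect to a and b.
def gradients(a, b, xs, ys):
    n = len(xs)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    da = (-2.0 / n) * sum(residuals)
    db = (-2.0 / n) * sum(x * r for x, r in zip(xs, residuals))
    return da, db

# Both derivatives vanish at a perfect fit, as expected at a minimum:
da, db = gradients(1.0, 2.0, [0, 1, 2], [1, 3, 5])
```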

Linear Regression with Gradient Descent....
v Mathematical derivation of gradient descent in simple linear regression:
v 3. Now we update the current values of b and a using the following equations:
v b = b − L×Db

v a = a − L×Da

v 4. We repeat this process until our loss function is a very small value or ideally 0 (which means 0 error, or 100%
accuracy). The values of b and a that we are left with are the optimum values.

v In the analogy of a person walking down a valley, b can be considered the person's current position. D is equivalent to the
steepness of the slope and L to the speed with which he moves. The new value of b calculated
using the above equation is his next position, and L×D is the size of the step he takes.
v When the slope is steeper (D is larger) he takes longer steps, and when it is less steep (D is smaller) he takes smaller
steps.
v Finally he arrives at the bottom of the valley, which corresponds to the minimum of our loss.
v With the optimum values of b and a, our model is ready to make predictions!
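Steps 1–4 above can be sketched end to end on the hour/marks data from the example. Two assumptions in this sketch: a learning rate of 0.001 (larger than the 0.0001 mentioned in the text, purely so it converges in fewer iterations) and a fixed iteration count as the stopping rule instead of a loss threshold:

```python
# Gradient descent for y = a + b*x with MSE loss.
def gradient_descent(xs, ys, lr=0.001, steps=100_000):
    a, b = 0.0, 0.0                                         # step 1: initialize
    n = len(xs)
    for _ in range(steps):                                  # step 4: repeat
        residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
        da = (-2.0 / n) * sum(residuals)                    # step 2: derivatives
        db = (-2.0 / n) * sum(x * r for x, r in zip(xs, residuals))
        a, b = a - lr * da, b - lr * db                     # step 3: update
    return a, b

hours = [5, 7, 12, 16, 20]
marks = [40, 120, 180, 210, 240]
a, b = gradient_descent(hours, marks)
print(round(a, 2), round(b, 2))  # ≈ 11.51 and 12.21, matching the OLS solution
```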
Linear Regression with Gradient Descent....
Example..........

Hour (x)    Marks (y)
5           40
7           120
12          180
16          210
20          240

Find the regression line.....
Simple linear regression: There is only one continuous independent variable x, and the assumed relation
between the independent variable and the dependent variable y is
y = a + bx.

Detailed study.................
v Let x be the independent predictor variable and y the dependent variable.
v Assume that we have a set of observed values of x and y. A simple linear regression model defines the relationship between x
and y using a line defined by an equation of the following form:
y = a + bx
v To determine the optimal estimates of a and b, an estimation method known as Ordinary Least Squares (OLS) is used.

v The OLS method

v In the OLS method, the values of the y-intercept and slope are chosen such that they minimize the sum of the squared errors; that
is, the sum of the squares of the vertical distances between the predicted y-values and the actual y-values (see Figure 7.1). Let ŷ_i
be the predicted value of y_i.

v Then the sum of squares of errors is given by

E = Σ_{i=1}^{n} (y_i − ŷ_i)² = Σ_{i=1}^{n} [y_i − (a + b x_i)]²

v So we are required to find the values of a and b such that E is minimum.


Detailed study.................

E = Σ_{i=1}^{n} (y_i − ŷ_i)² = Σ_{i=1}^{n} [y_i − (a + b x_i)]²

v To solve the above equation we take the two partial derivatives and set them to zero:

∂E/∂a = 0 ------(i)   and   ∂E/∂b = 0 ------(ii)

v By solving eq. (i):

=> Σ_{i=1}^{n} 2[y_i − a − b x_i](−1) = 0
=> −2 Σ_{i=1}^{n} y_i + 2a Σ_{i=1}^{n} 1 + 2b Σ_{i=1}^{n} x_i = 0
=> − Σ_{i=1}^{n} y_i + na + b Σ_{i=1}^{n} x_i = 0
=> na = Σ_{i=1}^{n} y_i − b Σ_{i=1}^{n} x_i
=> a = (1/n) Σ_{i=1}^{n} y_i − b (1/n) Σ_{i=1}^{n} x_i
=> a = ȳ − b x̄
where ȳ = (1/n) Σ_{i=1}^{n} y_i (mean of the values of y) and x̄ = (1/n) Σ_{i=1}^{n} x_i (mean of the values of x).
Detailed study.................

E = Σ_{i=1}^{n} (y_i − ŷ_i)² = Σ_{i=1}^{n} [y_i − (a + b x_i)]²

v To solve the above equation we take the two partial derivatives and set them to zero:

∂E/∂a = 0 ------(i)   and   ∂E/∂b = 0 ------(ii)

v By solving eq. (ii):

=> Σ_{i=1}^{n} 2[y_i − a − b x_i](−x_i) = 0
=> −2 Σ_{i=1}^{n} x_i y_i + 2a Σ_{i=1}^{n} x_i + 2b Σ_{i=1}^{n} x_i² = 0
=> − Σ_{i=1}^{n} x_i y_i + a Σ_{i=1}^{n} x_i + b Σ_{i=1}^{n} x_i² = 0
=> − Σ_{i=1}^{n} x_i y_i + (ȳ − b x̄) Σ_{i=1}^{n} x_i + b Σ_{i=1}^{n} x_i² = 0     [substituting a = ȳ − b x̄ from eq. (i)]
=> b ( Σ_{i=1}^{n} x_i² − x̄ Σ_{i=1}^{n} x_i ) = Σ_{i=1}^{n} x_i y_i − ȳ Σ_{i=1}^{n} x_i
=> b Σ_{i=1}^{n} x_i (x_i − x̄) = Σ_{i=1}^{n} y_i (x_i − x̄)

=> b = Σ_{i=1}^{n} y_i (x_i − x̄) / Σ_{i=1}^{n} x_i (x_i − x̄)
     = Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ) / Σ_{i=1}^{n} (x_i − x̄)²

(using Σ_{i=1}^{n} (x_i − x̄) = 0 to replace y_i by (y_i − ȳ) in the numerator and x_i by (x_i − x̄) in the denominator)

By multiplying numerator and denominator of the RHS by 1/(n−1):

b = [ (1/(n−1)) Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ) ] / [ (1/(n−1)) Σ_{i=1}^{n} (x_i − x̄)² ] = Cov(x, y) / Var(x)

For problems and solutions:

https://www.ncl.ac.uk/webtemplate/ask-assets/external/maths-resources/statistics/regression-and-correlation/simple-linear-regression.html
v We assume that there are N independent variables x1, x2, ⋯ , xN. Let the dependent variable be y.
v Let there also be n observed values of these variables.

v The multiple linear regression model defines the relationship between the N independent variables and the
dependent variable by an equation of the following form:
y = β0 + β1x1 + ⋯ + βNxN
v As in simple linear regression, here also we use the ordinary least squares (OLS) method to obtain the
optimal estimates of β0, β1, ⋯ , βN. The method yields the following procedure for the computation of these
optimal estimates. Let Y be the n×1 vector of observed y-values, X the n×(N+1) matrix whose rows are
(1, x1, ⋯ , xN) for each observation, and B the (N+1)×1 vector of coefficients (β0, β1, ⋯ , βN).

v Then it can be shown that the regression coefficients are given by

B = (XT X)−1 XT Y
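A sketch of this formula with NumPy. The data below is made up for illustration; in practice `np.linalg.lstsq` is the numerically safer way to solve the same least-squares problem:

```python
import numpy as np

# Made-up observations: each row is (x1, x2); y holds the responses.
X_raw = np.array([[1.0, 1.0],
                  [2.0, 0.0],
                  [0.0, 2.0],
                  [3.0, 1.0]])
y = np.array([3.0, 1.0, 5.0, 2.0])

# Prepend a column of ones so that beta_0 acts as the intercept.
X = np.column_stack([np.ones(len(y)), X_raw])

# Textbook formula: B = (X^T X)^{-1} X^T Y
B = np.linalg.inv(X.T @ X) @ X.T @ y
print(B)  # [beta_0, beta_1, beta_2]
```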
v Example:
v Fit a multiple linear regression model to the following data:

v Solution:
v In this problem, there are two independent variables and four sets of values of the variables. Thus, in the notations
used above, we have N = 2 and n = 4. The multiple linear regression model for this problem has the form
y = β0 + β1x1 + β2x2.
v The computations are shown below.
y = 2.0625 − 2.3750x1 + 3.2500x2
