Linear Regression Analysis: Module II
Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
LINEAR REGRESSION ANALYSIS
MODULE II
Lecture - 7
Simple Linear Regression Analysis
2
Orthogonal regression method (or major axis regression method)
The direct and reverse regression methods of estimation assume that the errors in the observations are either in the x-direction or in the y-direction, i.e., either in the dependent variable or in the independent variable. There can be situations when uncertainties are involved in both the dependent and the independent variables. In such situations, orthogonal regression is more appropriate. In order to take care of errors in both directions, the least squares principle in orthogonal regression minimizes the squared perpendicular distance between the observed data points and the line in the scatter diagram to obtain the estimates of the regression coefficients. This is also known as the major axis regression method. The estimates obtained are called orthogonal regression estimates or major axis regression estimates of the regression coefficients.

If we assume that the regression line to be fitted is $Y_i = \beta_0 + \beta_1 X_i$, then it is expected that all the observations $(x_i, y_i),\ i = 1, 2, \ldots, n$, lie on this line. But these points deviate from the line, and in such a case the squared perpendicular distance of the $i$th observed data point $(x_i, y_i)$ from the line is given by

$$d_i^2 = (X_i - x_i)^2 + (Y_i - y_i)^2$$

where $(X_i, Y_i)$ denotes the $i$th pair of observations without any error which lie on the line.

[Scatter diagram (orthogonal or major axis regression method): observed points $(x_i, y_i)$ and the corresponding error-free points $(X_i, Y_i)$ on the line, joined by perpendicular segments.]
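As a side note, for a candidate line $y = \beta_0 + \beta_1 x$ the perpendicular distance from an observed point has the closed form $|y_i - \beta_0 - \beta_1 x_i| / \sqrt{1 + \beta_1^2}$, whose square is exactly the minimized $d_i^2$ above. The short Python sketch below (with made-up numbers, purely for illustration) computes this quantity.

```python
import numpy as np

def perpendicular_distance(x0, y0, b0, b1):
    """Perpendicular distance from the point (x0, y0) to the line y = b0 + b1*x."""
    # the line in implicit form is  b1*x - y + b0 = 0
    return abs(b1 * x0 - y0 + b0) / np.sqrt(1.0 + b1 ** 2)

print(perpendicular_distance(x0=2.0, y0=3.5, b0=1.0, b1=1.0))  # example point and line
```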
3
The objective is to minimize the sum of squared perpendicular distances $\sum_{i=1}^{n} d_i^2$ to obtain the estimates of $\beta_0$ and $\beta_1$. The observations $(x_i, y_i)\ (i = 1, 2, \ldots, n)$ are expected to lie on the line $Y_i = \beta_0 + \beta_1 X_i$, so let

$$E_i \equiv Y_i - \beta_0 - \beta_1 X_i = 0.$$

The regression coefficients are obtained by minimizing $\sum_{i=1}^{n} d_i^2$ under the constraints $E_i$'s using the Lagrangian multiplier method. The Lagrangian function is

$$L_0 = \sum_{i=1}^{n} d_i^2 - 2\sum_{i=1}^{n} \lambda_i E_i$$

where $\lambda_1, \ldots, \lambda_n$ are the Lagrangian multipliers. The set of equations is obtained by setting

$$\frac{\partial L_0}{\partial X_i} = 0, \quad \frac{\partial L_0}{\partial Y_i} = 0, \quad \frac{\partial L_0}{\partial \beta_0} = 0 \quad \text{and} \quad \frac{\partial L_0}{\partial \beta_1} = 0 \quad (i = 1, 2, \ldots, n).$$

Thus we find

$$\frac{1}{2}\frac{\partial L_0}{\partial X_i} = (X_i - x_i) + \lambda_i \beta_1 = 0,$$

$$\frac{1}{2}\frac{\partial L_0}{\partial Y_i} = (Y_i - y_i) - \lambda_i = 0,$$

$$\frac{1}{2}\frac{\partial L_0}{\partial \beta_0} = \sum_{i=1}^{n} \lambda_i = 0,$$

$$\frac{1}{2}\frac{\partial L_0}{\partial \beta_1} = \sum_{i=1}^{n} \lambda_i X_i = 0.$$
4
Since

$$X_i = x_i - \lambda_i \beta_1, \qquad Y_i = y_i + \lambda_i,$$

substituting these values in $E_i$, we obtain

$$E_i \equiv (y_i + \lambda_i) - \beta_0 - \beta_1 (x_i - \lambda_i \beta_1) = 0 \;\Rightarrow\; \lambda_i = \frac{\beta_0 + \beta_1 x_i - y_i}{1 + \beta_1^2}.$$

Also, using this $\lambda_i$ in the equation $\sum_{i=1}^{n} \lambda_i = 0$, we get

$$\sum_{i=1}^{n} \frac{\beta_0 + \beta_1 x_i - y_i}{1 + \beta_1^2} = 0,$$

and using $X_i = x_i - \lambda_i \beta_1$ in $\sum_{i=1}^{n} \lambda_i X_i = 0$, we get

$$\sum_{i=1}^{n} \lambda_i (x_i - \lambda_i \beta_1) = 0.$$

Substituting $\lambda_i$ in this equation, we get

$$\sum_{i=1}^{n} \frac{(\beta_0 + \beta_1 x_i - y_i)\, x_i}{1 + \beta_1^2} - \beta_1 \sum_{i=1}^{n} \frac{(\beta_0 + \beta_1 x_i - y_i)^2}{(1 + \beta_1^2)^2} = 0. \qquad (1)$$

Using $\lambda_i$ in the equation $\sum_{i=1}^{n} \lambda_i = 0$, i.e., using the equation $\sum_{i=1}^{n} (\beta_0 + \beta_1 x_i - y_i)/(1 + \beta_1^2) = 0$, we solve for $\beta_0$.
5
The solution provides an orthogonal regression estimate of $\beta_0$ as

$$\hat{\beta}_{0OR} = \bar{y} - \hat{\beta}_{1OR}\,\bar{x}$$

where $\hat{\beta}_{1OR}$ is an orthogonal regression estimate of $\beta_1$.

Now, substituting $\hat{\beta}_{0OR}$ in equation (1), we get

$$(1 + \beta_1^2)\sum_{i=1}^{n}\left(x_i \bar{y} - \beta_1 x_i \bar{x} + \beta_1 x_i^2 - x_i y_i\right) - \beta_1 \sum_{i=1}^{n}\left[(y_i - \bar{y}) - \beta_1 (x_i - \bar{x})\right]^2 = 0$$

or

$$(1 + \beta_1^2)\sum_{i=1}^{n} x_i\left[(\bar{y} - y_i) - \beta_1 (\bar{x} - x_i)\right] - \beta_1 \sum_{i=1}^{n}\left[(y_i - \bar{y}) - \beta_1 (x_i - \bar{x})\right]^2 = 0.$$

Writing $u_i = x_i - \bar{x}$ and $v_i = y_i - \bar{y}$, and since $\sum_{i=1}^{n} u_i = \sum_{i=1}^{n} v_i = 0$, this reduces to

$$\sum_{i=1}^{n}\left[\beta_1^2\, u_i v_i + \beta_1\,(u_i^2 - v_i^2) - u_i v_i\right] = 0$$

or

$$\beta_1^2\, s_{xy} + \beta_1\,(s_{xx} - s_{yy}) - s_{xy} = 0$$

where $s_{xy} = \sum_{i=1}^{n} u_i v_i$, $s_{xx} = \sum_{i=1}^{n} u_i^2$ and $s_{yy} = \sum_{i=1}^{n} v_i^2$.
6
Solving this quadratic equation provides the orthogonal regression estimate of $\beta_1$ as

$$\hat{\beta}_{1OR} = \frac{(s_{yy} - s_{xx}) + \mathrm{sign}(s_{xy})\sqrt{(s_{xx} - s_{yy})^2 + 4\, s_{xy}^2}}{2\, s_{xy}}$$

where $\mathrm{sign}(s_{xy})$ denotes the sign of $s_{xy}$, which can be positive or negative, i.e.,

$$\mathrm{sign}(s_{xy}) = \begin{cases} +1 & \text{if } s_{xy} > 0, \\ -1 & \text{if } s_{xy} < 0. \end{cases}$$

Notice that the quadratic equation gives two solutions for $\hat{\beta}_{1OR}$. We choose the solution which minimizes $\sum_{i=1}^{n} d_i^2$; the other solution maximizes $\sum_{i=1}^{n} d_i^2$ and corresponds to the direction perpendicular to the optimal solution (the product of the two roots of the quadratic is $-s_{xy}/s_{xy} = -1$, so the two slopes are negative reciprocals of each other). The optimal solution can be chosen with the sign of $s_{xy}$.
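The closed-form estimates above translate directly into code. The following is a minimal Python sketch (the data-generating values are made up for illustration) that computes $\hat{\beta}_{1OR}$ from $s_{xx}$, $s_{yy}$, $s_{xy}$ and then $\hat{\beta}_{0OR} = \bar{y} - \hat{\beta}_{1OR}\bar{x}$:

```python
import numpy as np

def orthogonal_regression(x, y):
    """Orthogonal (major axis) regression estimates of the intercept and slope."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    u, v = x - x.mean(), y - y.mean()
    sxx, syy, sxy = (u * u).sum(), (v * v).sum(), (u * v).sum()
    sign = 1.0 if sxy > 0 else -1.0
    # root of  b^2*sxy + b*(sxx - syy) - sxy = 0  chosen with the sign of sxy
    b1 = ((syy - sxx) + sign * np.sqrt((sxx - syy) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

# illustrative data with errors in both x and y (made-up parameters)
rng = np.random.default_rng(0)
x_true = np.linspace(0.0, 10.0, 50)
x = x_true + rng.normal(scale=0.5, size=50)
y = 2.0 + 1.5 * x_true + rng.normal(scale=0.5, size=50)
print(orthogonal_regression(x, y))
```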
7
Reduced major axis regression method
The direct, reverse and orthogonal methods of estimation minimize the errors in a particular direction, which is usually the distance between the observed data points and the line in the scatter diagram. Alternatively, one can consider the area extended by the data points in a certain neighbourhood and, instead of distances, minimize the areas of the rectangles defined between each observed data point and the nearest point on the line in the scatter diagram. Such an approach is more appropriate when uncertainties are present in the study as well as the explanatory variables. This approach is termed reduced major axis regression.

Suppose the regression line is $Y_i = \beta_0 + \beta_1 X_i$, on which all the observed points are expected to lie. Suppose the points $(x_i, y_i),\ i = 1, 2, \ldots, n$, are observed which lie away from the line.
8
The area of the rectangle extended between the $i$th observed data point and the line is

$$A_i = (X_i \sim x_i)(Y_i \sim y_i) \quad (i = 1, 2, \ldots, n)$$

where $(X_i, Y_i)$ denotes the $i$th pair of observations without any error which lie on the line, and $a \sim b$ denotes the absolute difference between $a$ and $b$.

The total area extended by the $n$ data points is

$$\sum_{i=1}^{n} A_i = \sum_{i=1}^{n} (X_i \sim x_i)(Y_i \sim y_i).$$

All observed data points $(x_i, y_i)\ (i = 1, 2, \ldots, n)$ are expected to lie on the line $Y_i = \beta_0 + \beta_1 X_i$, so let

$$E_i^{*} \equiv Y_i - \beta_0 - \beta_1 X_i = 0.$$

The objective now is to minimize the sum of areas under the constraints $E_i^{*}$ to obtain the reduced major axis estimates of the regression coefficients. Using the Lagrangian multiplier method, the Lagrangian function is

$$L_R = \sum_{i=1}^{n} A_i - \sum_{i=1}^{n} \lambda_i E_i^{*} = \sum_{i=1}^{n} (X_i - x_i)(Y_i - y_i) - \sum_{i=1}^{n} \lambda_i E_i^{*}$$

where $\lambda_1, \ldots, \lambda_n$ are the Lagrangian multipliers. The set of equations is obtained by setting

$$\frac{\partial L_R}{\partial X_i} = 0, \quad \frac{\partial L_R}{\partial Y_i} = 0, \quad \frac{\partial L_R}{\partial \beta_0} = 0, \quad \frac{\partial L_R}{\partial \beta_1} = 0 \quad (i = 1, 2, \ldots, n).$$
9
Thus

$$\frac{\partial L_R}{\partial X_i} = (Y_i - y_i) + \lambda_i \beta_1 = 0,$$

$$\frac{\partial L_R}{\partial Y_i} = (X_i - x_i) - \lambda_i = 0,$$

$$\frac{\partial L_R}{\partial \beta_0} = \sum_{i=1}^{n} \lambda_i = 0,$$

$$\frac{\partial L_R}{\partial \beta_1} = \sum_{i=1}^{n} \lambda_i X_i = 0.$$

Now

$$X_i = x_i + \lambda_i, \qquad Y_i = y_i - \lambda_i \beta_1,$$

and substituting these in $Y_i - \beta_0 - \beta_1 X_i = 0$ gives

$$(y_i - \lambda_i \beta_1) - \beta_0 - \beta_1 (x_i + \lambda_i) = 0 \;\Rightarrow\; \lambda_i = \frac{y_i - \beta_0 - \beta_1 x_i}{2\beta_1}.$$

Substituting this $\lambda_i$ in $\sum_{i=1}^{n} \lambda_i = 0$, the reduced major axis regression estimate of $\beta_0$ is obtained as

$$\hat{\beta}_{0RM} = \bar{y} - \hat{\beta}_{1RM}\,\bar{x}$$

where $\hat{\beta}_{1RM}$ is the reduced major axis regression estimate of $\beta_1$. Using $X_i = x_i + \lambda_i$ and $\hat{\beta}_{0RM}$ in $\sum_{i=1}^{n} \lambda_i X_i = 0$, we get

$$\sum_{i=1}^{n} \left( \frac{(y_i - \bar{y}) - \beta_1 (x_i - \bar{x})}{2\beta_1} \right) \left( x_i + \frac{(y_i - \bar{y}) - \beta_1 (x_i - \bar{x})}{2\beta_1} \right) = 0.$$
10
Let $u_i = x_i - \bar{x}$ and $v_i = y_i - \bar{y}$; then this equation can be re-expressed as

$$\sum_{i=1}^{n} (v_i - \beta_1 u_i)(v_i - \beta_1 u_i + 2\beta_1 x_i) = 0.$$

Using $\sum_{i=1}^{n} u_i = \sum_{i=1}^{n} v_i = 0$, we get

$$\sum_{i=1}^{n} v_i^2 - \beta_1^2 \sum_{i=1}^{n} u_i^2 = 0.$$

Solving this equation, the reduced major axis regression estimate of $\beta_1$ is obtained as

$$\hat{\beta}_{1RM} = \mathrm{sign}(s_{xy}) \sqrt{\frac{s_{yy}}{s_{xx}}}$$

where

$$\mathrm{sign}(s_{xy}) = \begin{cases} +1 & \text{if } s_{xy} > 0, \\ -1 & \text{if } s_{xy} < 0. \end{cases}$$

We choose the estimate of the slope which has the same sign as that of $s_{xy}$.
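As with orthogonal regression, the closed form lends itself to a short sketch. The following Python function (illustrative only) computes the reduced major axis estimates from the same sums of squares and cross products:

```python
import numpy as np

def reduced_major_axis(x, y):
    """Reduced major axis regression estimates of the intercept and slope."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    u, v = x - x.mean(), y - y.mean()
    sxx, syy, sxy = (u * u).sum(), (v * v).sum(), (u * v).sum()
    b1 = np.sign(sxy) * np.sqrt(syy / sxx)   # slope: sign(s_xy) * sqrt(s_yy / s_xx)
    b0 = y.mean() - b1 * x.mean()            # intercept: ybar - b1 * xbar
    return b0, b1
```

Up to sign, the slope is the ratio of the standard deviations of $y$ and $x$; this estimator is sometimes called geometric mean regression, since $\sqrt{s_{yy}/s_{xx}}$ is the geometric mean of the direct slope $s_{xy}/s_{xx}$ and the reciprocal of the reverse slope $s_{xy}/s_{yy}$.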
11
Least absolute deviation regression method
The least squares principle advocates the minimization of the sum of squared errors. Squaring the errors is preferred over using the simple errors because the random errors can be positive as well as negative, so their sum can be close to zero even for a poorly fitting model, which would misleadingly indicate that there is no error. Instead of the sum of random errors, the sum of absolute random errors can also be considered, which likewise avoids the cancellation of positive and negative random errors.

In the method of least squares, the estimates of the parameters $\beta_0$ and $\beta_1$ in the model

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad (i = 1, 2, \ldots, n)$$

are chosen such that the sum of squared deviations $\sum_{i=1}^{n} \varepsilon_i^2$ is minimum. In the method of least absolute deviation (LAD) regression, the parameters $\beta_0$ and $\beta_1$ are estimated such that the sum of absolute deviations $\sum_{i=1}^{n} |\varepsilon_i|$ is minimum, i.e., the sum of the absolute vertical deviations of the observations from the line is minimized.
12
The LAD estimates $\hat{\beta}_{0LAD}$ and $\hat{\beta}_{1LAD}$ are the values of $\beta_0$ and $\beta_1$, respectively, which minimize

$$LAD(\beta_0, \beta_1) = \sum_{i=1}^{n} \left| y_i - \beta_0 - \beta_1 x_i \right|$$

for the given observations $(x_i, y_i)\ (i = 1, 2, \ldots, n)$.

Conceptually, the LAD procedure is simpler than the OLS procedure because the absolute residual $|e|$ is a more straightforward measure of the size of a residual than the squared residual $e^2$. The LAD regression estimates of $\beta_0$ and $\beta_1$ are, however, not available in closed form; they have to be obtained numerically using suitable algorithms. Moreover, the numerical computation has to deal with the problems of non-uniqueness and degeneracy of the estimates. Non-uniqueness refers to more than one best line passing through a data point, and degeneracy refers to the best line through a data point also passing through more than one other data point. The non-uniqueness and degeneracy concepts are used in the algorithms to judge the quality of the estimates. The algorithm for finding the estimators generally proceeds in steps. At each step, the best line is found that passes through a given data point. The best line always passes through another data point, and this data point is used in the next step. When there is non-uniqueness, there is more than one best line; when there is degeneracy, the best line passes through more than one other data point. When either problem is present, there is more than one choice for the data point to be used in the next step, and the algorithm may go around in circles or make a wrong choice of the LAD regression line. Exact tests of hypothesis and confidence intervals for the LAD regression estimates cannot be derived analytically; instead, they are derived analogously to the tests of hypothesis and confidence intervals related to the ordinary least squares estimates.
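Since no closed form exists, the LAD criterion has to be minimized numerically. A minimal sketch (assuming NumPy and SciPy are available; production LAD solvers typically use linear programming rather than a generic optimizer) is:

```python
import numpy as np
from scipy.optimize import minimize

def lad_objective(beta, x, y):
    """Sum of absolute deviations for a candidate (intercept, slope)."""
    b0, b1 = beta
    return np.abs(y - b0 - b1 * x).sum()

def lad_fit(x, y):
    """Numerical LAD estimates; the OLS fit is used as the starting value."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    b1_ols = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    b0_ols = y.mean() - b1_ols * x.mean()
    # Nelder-Mead is derivative-free, which suits the non-differentiable objective
    res = minimize(lad_objective, x0=[b0_ols, b1_ols], args=(x, y), method="Nelder-Mead")
    return res.x  # array([b0_lad, b1_lad])
```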
13
Estimation of parameters when X is stochastic
In the usual linear regression model, the study variable is supposed to be random and the explanatory variable is assumed to be fixed. In practice, there may be situations in which the explanatory variable is also random.

Suppose both the dependent and independent variables are stochastic in the simple linear regression model

$$y = \beta_0 + \beta_1 X + \varepsilon$$

where $\varepsilon$ is the associated random error component. The observations $(x_i, y_i),\ i = 1, 2, \ldots, n$, are assumed to be jointly distributed. The statistical inferences in such cases are then drawn conditionally on $X$.

Assume the joint distribution of $X$ and $y$ to be bivariate normal $N(\mu_x, \mu_y, \sigma_x^2, \sigma_y^2, \rho)$, where $\mu_x$ and $\mu_y$ are the means of $X$ and $y$, $\sigma_x^2$ and $\sigma_y^2$ are the variances of $X$ and $y$, and $\rho$ is the correlation coefficient between $X$ and $y$. Then the conditional distribution of $y$ given $X = x$ is univariate normal with conditional mean

$$E(y \mid X = x) = \mu_{y|x} = \beta_0 + \beta_1 x$$

and conditional variance

$$\mathrm{Var}(y \mid X = x) = \sigma^2_{y|x} = \sigma_y^2 (1 - \rho^2)$$

where

$$\beta_0 = \mu_y - \beta_1 \mu_x \qquad \text{and} \qquad \beta_1 = \rho\,\frac{\sigma_y}{\sigma_x}.$$

Moreover, the correlation coefficient

$$\rho = \frac{E\left[(X - \mu_x)(y - \mu_y)\right]}{\sigma_x \sigma_y}$$

can be estimated by the sample correlation coefficient

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}} = \frac{s_{xy}}{\sqrt{s_{xx}\, s_{yy}}}.$$
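As a quick numerical check of these conditional-distribution relations (a sketch with made-up parameter values, not part of the original derivation), one can simulate a large bivariate normal sample and compare the fitted slope of $y$ on $x$ with $\rho\sigma_y/\sigma_x$:

```python
import numpy as np

# made-up bivariate normal parameters
mu_x, mu_y = 2.0, 5.0
sigma_x, sigma_y, rho = 1.5, 2.0, 0.7
cov = [[sigma_x ** 2, rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y ** 2]]

rng = np.random.default_rng(7)
x, y = rng.multivariate_normal([mu_x, mu_y], cov, size=100_000).T

beta1 = rho * sigma_y / sigma_x               # theoretical slope of E(y | X = x)
beta0 = mu_y - beta1 * mu_x                   # theoretical intercept
b1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # sample slope of y on x
print(beta0, beta1)                           # theoretical values
print(y.mean() - b1 * x.mean(), b1)           # close to the above for a large sample
```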
14
When both $X$ and $y$ are stochastic, the problem of estimation of the parameters can be reformulated as follows. Consider the conditional random variable $y \mid X = x$ having a normal distribution with conditional mean $\mu_{y|x}$ and conditional variance $\mathrm{Var}(y \mid X = x) = \sigma^2_{y|x}$. Obtain $n$ independently distributed observations $y_i \mid x_i,\ i = 1, 2, \ldots, n$, from $N(\mu_{y|x}, \sigma^2_{y|x})$ with nonstochastic $X$. Now the method of maximum likelihood can be used to estimate the parameters, and it yields the same estimates of $\beta_0$ and $\beta_1$ as earlier in the case of nonstochastic $X$, namely

$$b_0 = \bar{y} - b_1 \bar{x} \qquad \text{and} \qquad b_1 = \frac{s_{xy}}{s_{xx}},$$

respectively, where $s_{xy} = \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$, $s_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2$ and $s_{yy} = \sum_{i=1}^{n}(y_i - \bar{y})^2$.
15
Thus the squared sample correlation coefficient satisfies

$$r^2 = \frac{s_{xy}^2}{s_{xx}\, s_{yy}} = \frac{b_1\, s_{xy}}{s_{yy}} = \frac{b_1^2\, s_{xx}}{s_{yy}} = R^2,$$

which is the same as the coefficient of determination. Thus $R^2$ has the same expression as in the case when $X$ is fixed, and $R^2$ again measures the goodness of fit of the model even when $X$ is stochastic.
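A short numerical check of this identity (illustrative data only):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=5.0, scale=2.0, size=200)          # stochastic explanatory variable
y = 1.0 + 0.8 * x + rng.normal(scale=1.0, size=200)

u, v = x - x.mean(), y - y.mean()
sxx, syy, sxy = (u * u).sum(), (v * v).sum(), (u * v).sum()

b1 = sxy / sxx                      # least squares / ML slope
r = sxy / np.sqrt(sxx * syy)        # sample correlation coefficient
R2 = b1 ** 2 * sxx / syy            # coefficient of determination

print(np.isclose(R2, r ** 2))       # True: R^2 equals the squared correlation
```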