Classical Multiple Regression
Y = (y_1, ..., y_n)' is the n×1 vector of responses and X is the n×k matrix whose i-th row is (1, x_i1, ..., x_ip), i = 1, ..., n, so k = p + 1.
X is called the design matrix as though the researcher chose the values of x exogenously.
The critical assumption is actually that x is uncorrelated with the error ε.
Classical Linear Regression Assumptions
1. Y = Xβ + ε (linearity)
2. E[Y|X] = Xβ, or E[ε|X] = 0 (explanatory variables are exogenous, independent of the errors)
3. Var(Y|X) = σ²I (errors are iid: independent and identically distributed)
4. X is fixed
5. X has full column rank: n ≥ k and the columns of X are not linearly dependent
6. ε is normally distributed
Problems that might arise with these assumptions
1. wrong regressors, nonlinearity in the parameters, changing parameters
2. biased intercept
3. autocorrelation and heteroskedasticity
4. errors in variables, lagged values, simultaneous equation bias
5. multicollinearity
6. inappropriate tests
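As a concrete illustration, here is a minimal simulation sketch of a data set that satisfies the assumptions above (NumPy assumed; the sample size, coefficients, and noise level are arbitrary choices for illustration, not values from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 100, 3                      # n observations, k = p + 1 columns (intercept plus p = 2 regressors)
beta = np.array([1.0, 2.0, -0.5])  # illustrative "latent" parameter vector
sigma = 1.5                        # illustrative error standard deviation

# Fixed design matrix with an intercept column and full column rank (assumptions 4 and 5).
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])

# iid normal errors, independent of X (assumptions 2, 3, and 6).
eps = rng.normal(scale=sigma, size=n)

# Linear data-generating process Y = X beta + eps (assumption 1).
Y = X @ beta + eps
```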
Least Squares Estimator
We want to know the latent values of the parameters β and σ², so we have to use
the data Y, X to create estimators. Start with β, and denote a guess of what β might be
by the letter b. If Y = Xb + e, then this is really a definition of the residual errors e
that result from the guess b. We want to make these small in a summed-squared sense:
min_b SSE = e'e = (Y - Xb)'(Y - Xb) = Y'Y - 2b'X'Y + b'X'Xb.
∂SSE/∂b = -2X'Y + 2X'Xb = 0, or
OLS estimator of β:
b = (X'X)⁻¹X'Y
In the special case of a single regressor, b = cov(X, Y)/var(X) = [cov(X, Y)/√(var(X)var(Y))]·√(var(Y)/var(X)) = r_XY (s_Y/s_X).
Hence, b is like a correlation between x and y when we do not standardize the scales of the variables.
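A short sketch of the estimator in code (NumPy assumed; the data-generating values are illustrative), computing b from the normal equations and confirming the single-regressor relation b = r_XY (s_Y/s_X):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
y = 3.0 + 0.8 * x + rng.normal(scale=0.5, size=n)   # illustrative one-regressor model

X = np.column_stack([np.ones(n), x])                # design matrix with an intercept
b = np.linalg.solve(X.T @ X, X.T @ y)               # OLS: solve (X'X) b = X'Y

# In the single-regressor case the slope is cov(x, y)/var(x) = r_xy * (s_y / s_x).
slope_from_corr = np.corrcoef(x, y)[0, 1] * np.std(y, ddof=1) / np.std(x, ddof=1)
print(b[1], slope_from_corr)                        # the two slope values agree
```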
The residual vector e is by definition e = Y - Xb, or
e = Y - X(X'X)⁻¹X'Y = (I - X(X'X)⁻¹X')Y = MY,
where M = I - X(X'X)⁻¹X'. This matrix M is the centering matrix around the regression
line and is very much like the mean centering matrix H = I - 11'/n = I - 1(1'1)⁻¹1'.
Theorem: M is symmetric and idempotent (MM=M), tr(M)=n-k, MX=0.
Given the regression centering matrix M, the sum of squared errors is SSE = e'e = (MY)'(MY) = Y'M'MY = Y'MY.
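A numerical check of the theorem about M (NumPy assumed; the explicit inverse mirrors the formula rather than being an efficient implementation):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 50, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ np.array([1.0, 0.5, -1.0, 2.0]) + rng.normal(size=n)

# M = I - X (X'X)^{-1} X', the regression centering (residual-maker) matrix.
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

print(np.allclose(M, M.T))             # symmetric
print(np.allclose(M @ M, M))           # idempotent: MM = M
print(np.isclose(np.trace(M), n - k))  # tr(M) = n - k
print(np.allclose(M @ X, 0))           # MX = 0

e = M @ Y                              # residual vector e = MY
print(np.isclose(e @ e, Y @ M @ Y))    # SSE = e'e = Y'MY
```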
Theorem: E[b] = β, i.e., OLS is unbiased.
proof: E[b] = E[(X'X)⁻¹X'Y] = E[(X'X)⁻¹X'(Xβ + ε)] = E[β + (X'X)⁻¹X'ε]
= β + (X'X)⁻¹X'E[ε] = β.
Theorem: var[b] = σ²(X'X)⁻¹.
proof: var[b] = E[(b - β)(b - β)'] = E[((X'X)⁻¹X'ε)((X'X)⁻¹X'ε)']
= (X'X)⁻¹X'E[εε']X(X'X)⁻¹
= (X'X)⁻¹X'(σ²I)X(X'X)⁻¹ = σ²(X'X)⁻¹X'X(X'X)⁻¹ = σ²(X'X)⁻¹.
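A Monte Carlo sketch of the last two theorems (NumPy assumed; the replication count and parameter values are arbitrary): with X held fixed across repeated samples, the average of b should approach β and the sampling covariance of b should approach σ²(X'X)⁻¹.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 100, 20000
beta = np.array([1.0, 2.0, -0.5])
sigma = 1.5
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # fixed design, reused in every replication

XtX_inv = np.linalg.inv(X.T @ X)
bs = np.empty((reps, beta.size))
for r in range(reps):
    Y = X @ beta + rng.normal(scale=sigma, size=n)
    bs[r] = XtX_inv @ X.T @ Y            # b = (X'X)^{-1} X'Y

print(bs.mean(axis=0), beta)             # average of b is close to beta (unbiasedness)
print(np.cov(bs, rowvar=False))          # close to sigma^2 (X'X)^{-1}
print(sigma**2 * XtX_inv)
```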
Theorem: X'e = 0, the estimated errors are orthogonal to the data generating them.
Proof: X'e = X'MY = (X' - X'X(X'X)⁻¹X')Y = (X' - X')Y = 0Y = 0.
Now consider estimating σ². SSE = e'e = Y'MY = (Xβ + ε)'M(Xβ + ε) =
β'X'MXβ + 2ε'MXβ + ε'Mε. The first two terms are zero because MX = 0. Hence
e'e = ε'Mε = tr(ε'Mε) (note: the trace of a scalar is trivial) = tr(Mεε'). Given this, the
expected value of e'e is just tr(M E[εε']) = tr(Mσ²I) = σ²tr(M) = σ²(n-k). Hence
s² = e'e/(n-k) is an unbiased estimator of σ².
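A quick simulation check of this result, i.e. that E[e'e] = σ²(n-k) and so s² = e'e/(n-k) is unbiased (NumPy assumed; values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k, reps = 60, 3, 20000
sigma = 2.0
beta = np.array([1.0, -1.0, 0.5])
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

sse = np.empty(reps)
for r in range(reps):
    Y = X @ beta + rng.normal(scale=sigma, size=n)
    e = M @ Y
    sse[r] = e @ e

print(sse.mean(), sigma**2 * (n - k))    # E[e'e] = sigma^2 (n - k)
print((sse / (n - k)).mean(), sigma**2)  # so s^2 = e'e/(n - k) is unbiased for sigma^2
```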
R² = SSR/SST = 1 - SSE/SST = 1 - e'e/(Y'HY) = 1 - Y'MY/(Y'HY).
Adjusted R²: R̄² = 1 - [(n-1)/(n-k)](1 - R²).
Adding a variable with a t-stat > 1.0 will increase adjusted R². Notice: not t-stat > 1.96.
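A sketch computing R² and adjusted R² directly from the centering matrices M and H defined above (NumPy assumed; the data are simulated for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 80, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ np.array([1.0, 0.7, -0.3]) + rng.normal(size=n)

M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T   # regression centering matrix
H = np.eye(n) - np.ones((n, n)) / n                # mean centering matrix

sse = Y @ M @ Y                                    # e'e
sst = Y @ H @ Y                                    # total sum of squares around the mean
r2 = 1 - sse / sst
adj_r2 = 1 - (n - 1) / (n - k) * (1 - r2)
print(r2, adj_r2)
```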
The log likelihood contains the term -(Y - Xβ)'(Y - Xβ)/(2σ²); setting
∂ln L/∂σ² = -n/(2σ²) + (Y - Xβ)'(Y - Xβ)/(2σ⁴) = 0 gives σ̂²_MLE = e'e/n.
Note: divide by n, not n-k. Hence the MLE of σ² is biased.
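A small numerical comparison of the two estimators of σ² (NumPy assumed; simulated data): the MLE divides e'e by n, the unbiased estimator by n-k.

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 80, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ np.array([1.0, 0.7, -0.3]) + rng.normal(scale=2.0, size=n)

e = Y - X @ np.linalg.solve(X.T @ X, X.T @ Y)  # OLS residuals
sigma2_mle = (e @ e) / n                       # MLE: divides by n, biased downward
s2 = (e @ e) / (n - k)                         # unbiased estimator: divides by n - k
print(sigma2_mle, s2)
```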
Theorem: If Y|X ~ N(Xβ, σ²I), then b ~ N(β, σ²(X'X)⁻¹), (n-k)s²/σ² ~ χ²_{n-k}, and b and s² are independent.
Confidence Intervals
Joint:
(b - β)'(X'X)(b - β) ≤ k s² F_{k,n-k}(α)
One at a time:
b_i ± SE(b_i) t_{n-k}(α/2)
Simultaneous:
b_i ± SE(b_i) √(k F_{k,n-k}(α))
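A sketch of the one-at-a-time and simultaneous intervals (NumPy and SciPy assumed; the 95% level and the simulated data are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, k = 80, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ np.array([1.0, 0.7, -0.3]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
e = Y - X @ b
s2 = (e @ e) / (n - k)
se = np.sqrt(s2 * np.diag(XtX_inv))              # SE(b_i)

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - k)    # one-at-a-time critical value
f_crit = stats.f.ppf(1 - alpha, dfn=k, dfd=n - k)

print(np.column_stack([b - t_crit * se, b + t_crit * se]))                            # one at a time
print(np.column_stack([b - np.sqrt(k * f_crit) * se, b + np.sqrt(k * f_crit) * se]))  # simultaneous
```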
Hypothesis Testing
H₀: Rβ = r, where R is q×k and r is q×1, imposing q linear restrictions on the k×1 vector β.
Let a be the OLS estimator of β subject to the above q restrictions: a minimizes e'e s.t. Ra = r. Let
b be the unconstrained OLS estimator of β. The likelihood ratio is λ = L_a/L_b. Define
LR = -2 ln(λ) = 2 ln(L_b) - 2 ln(L_a) = (1/σ²)(Y - Xa)'(Y - Xa) - (1/σ²)(Y - Xb)'(Y - Xb). If we
replace the unknown σ² with the unconstrained estimate e'e/(n-k) and divide the difference in sums
of squares by q, the resulting statistic is distributed F_{q,n-k}.
Hence we can test the restrictions Rβ = r by running the regression both with the constraints and
unconstrained and computing
F = [(SSE_constr - SSE_unconstr)/q] / [SSE_unconstr/(n-k)] ~ F_{q,n-k}.
The three classical tests approach the same restrictions from different directions:
Lagrange Multiplier Test: estimate only the constrained model; test whether the Lagrange multiplier on the constraint equals 0.
Likelihood Ratio Test: estimate both the constrained and unconstrained models; test whether ln(L_a) - ln(L_b) equals 0.
Wald Test: estimate only the unconstrained model; test whether Rb - r equals 0.
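A sketch of the constrained-versus-unconstrained F test (NumPy and SciPy assumed; the single restriction β₁ = β₂ and the simulated data are arbitrary examples, and the restricted estimator is computed with the standard restricted least squares formula, which the notes do not derive):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n, k = 120, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, 0.6, 0.6]) + rng.normal(size=n)   # DGP happens to satisfy the restriction

# H0: R beta = r with q = 1 restriction, here beta_1 - beta_2 = 0.
R = np.array([[0.0, 1.0, -1.0]])
r = np.array([0.0])
q = R.shape[0]

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y                                    # unconstrained OLS
# Restricted least squares: pull b back onto the constraint set Ra = r.
a = b + XtX_inv @ R.T @ np.linalg.solve(R @ XtX_inv @ R.T, r - R @ b)

sse_unconstr = (Y - X @ b) @ (Y - X @ b)
sse_constr = (Y - X @ a) @ (Y - X @ a)

F = ((sse_constr - sse_unconstr) / q) / (sse_unconstr / (n - k))
p_value = stats.f.sf(F, dfn=q, dfd=n - k)                # compare with F_{q, n-k}
print(F, p_value)
```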