
ECO2009 Econometrics I

Chapter 3 Multiple Regression Analysis: Estimation

Yaein Baek

Sogang University

Spring 2024

Multiple Regression Model

Suppose we have k independent (explanatory) variables X1 , X2 , . . . , Xk
How can we build a model that explains Y in terms of X1 , X2 , . . . , Xk ?
The multiple linear regression (MLR) model can be written in the
population as

Y = β0 + β1 X1 + β2 X2 + · · · + βk Xk + u

▶ β0 : intercept
▶ βj : slope parameter associated with Xj , j = 1, . . . , k
▶ u: error term (disturbance) that contains factors other than
X1 , X2 , . . . , Xk that affect Y

Interpretation of the Multiple Regression Model

The MLR model is linear in parameters β0 , β1 , . . . , βk

∆Y = β1 ∆X1 + β2 ∆X2 + · · · + βk ∆Xk + ∆u

Assuming ∆X1 = · · · = ∆Xj−1 = ∆Xj+1 = · · · = ∆Xk = 0 and ∆u = 0, we have

βj = ∆Y / ∆Xj

βj measures how much the dependent variable changes when the jth
independent variable increases by one unit, holding all other
independent variables constant
“Ceteris paribus” interpretation
▶ We still have to assume that the unobserved factors do not change when the
explanatory variables change, i.e., ∆u = 0

Motivation for Multiple Regression

The simple regression model
▶ The error term u represents all factors other than X that affect Y
▶ The key assumption for ceteris paribus conclusions (SLR.4 ZCM):

E (u|X ) = 0

▶ This assumption is unlikely to hold in many situations
▶ For this reason, simple regression is rarely used in empirical economics
The multiple regression model
▶ Incorporate more explanatory factors into the model
▶ Explicitly control for many other factors that affect the dependent
variable
▶ Allow for more flexible functional forms

Motivation for Multiple Regression
Example: Wage equation

The wage is determined by two explanatory variables, education and
experience, plus an error term:

wage = β0 + β1 educ + β2 exper + u

Compared with the simple regression model, exper is taken out of the
error term and put explicitly into the equation
Because the equation contains experience explicitly, we can
measure the effect of education on wage holding experience fixed

Motivation for Multiple Regression
Example: Average test scores and per student spending

Per student spending is likely to be correlated with average family
income at a given high school because of how schools are financed
Omitting average family income from the regression would lead to a biased
estimate of the effect of spending on average test scores
In a simple regression model, the estimated effect of per student spending would
partly reflect the effect of family income on test scores

Motivation for Multiple Regression
Example: Family income and family consumption

MLR is also useful for generalizing functional relationships between
variables: consumption is explained as a quadratic function of income, say

cons = β0 + β1 inc + β2 inc² + u

One has to be careful when interpreting the coefficients: the marginal effect
of income on consumption is β1 + 2β2 inc, so it depends on the level of
income and β1 alone is not the marginal propensity to consume
Motivation for Multiple Regression
Example: CEO salary, sales and CEO tenure

The model assumes a constant elasticity relationship between CEO salary
and the sales of his or her firm
The model assumes a quadratic relationship between CEO salary and his
or her tenure with the firm
Meaning of “linear” regression: the model has to be linear in the
parameters (not in the variables)

Multiple Regression Model

The (multivariate) ZCM assumption:

E (u|X1 , X2 , . . . , Xk ) = 0

Under the ZCM assumption

E (Y |X1 , X2 , . . . , Xk ) = β0 + β1 X1 + · · · + βk Xk

βj is the partial effect of Xj on E (Y |X1 , X2 , . . . , Xk )

βj = ∂E (Y |X1 = x1 , X2 = x2 , . . . , Xk = xk ) / ∂xj

Obtaining the OLS estimators
Suppose we have a random sample of n observations
{(Yi , Xi1 , Xi2 , . . . , Xik ) : i = 1, 2, . . . , n}
The parameters are estimated by minimizing the sum of squared residuals
min_b Σ_{i=1}^n (Yi − b0 − b1 Xi1 − · · · − bk Xik )²

where b ≡ (b0 , b1 , . . . , bk )
The OLS estimators (β̂0 , β̂1 , . . . , β̂k ) are obtained from the (k + 1) FOCs

Σ_{i=1}^n (Yi − β̂0 − β̂1 Xi1 − · · · − β̂k Xik ) = 0
Σ_{i=1}^n (Yi − β̂0 − β̂1 Xi1 − · · · − β̂k Xik )Xi1 = 0
...
Σ_{i=1}^n (Yi − β̂0 − β̂1 Xi1 − · · · − β̂k Xik )Xik = 0
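As a rough numerical sketch (assuming NumPy and simulated data; variable names and parameter values below are purely illustrative): the (k + 1) FOCs are the normal equations X′X b = X′y, so the OLS estimates can be obtained by solving that linear system.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 3                                  # sample size and number of regressors (illustrative)
X = rng.normal(size=(n, k))                    # X1, X2, X3
beta = np.array([1.0, 0.5, -2.0, 3.0])         # (beta0, beta1, beta2, beta3), chosen arbitrarily
u = rng.normal(size=n)
y = beta[0] + X @ beta[1:] + u

Xmat = np.column_stack([np.ones(n), X])        # design matrix [1  X1 ... Xk]
# The (k + 1) FOCs are the normal equations X'X b = X'y
beta_hat = np.linalg.solve(Xmat.T @ Xmat, Xmat.T @ y)
print("OLS estimates:", beta_hat)

# Check the FOCs: the sum of residuals and the sums of Xij * residuals are (numerically) zero
u_hat = y - Xmat @ beta_hat
print("FOCs:", Xmat.T @ u_hat)
```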

Interpreting the OLS Regression Equation

The sample regression function (SRF or the OLS regression line)

Ŷ = β̂0 + β̂1 X1 + β̂2 X2 + · · · + β̂k Xk

The estimators β̂j have partial effect (ceteris paribus) interpretations

∆Ŷ = β̂j ∆Xj

holding X1 , . . . , Xj−1 , Xj+1 , . . . , Xk fixed

→ we have controlled for the variables X1 , . . . , Xj−1 , Xj+1 , . . . , Xk

Interpreting the OLS Regression Equation
Example 3.1 Determinants of college GPA

Holding ACT fixed, one more point of high school GPA is associated with
0.453 more points of college GPA, on average
Or: if we compare two students with the same ACT , but the hsGPA
of student A is one point higher, we predict student A to have a
colGPA that is 0.453 points higher than that of student B
Holding high school GPA fixed, 10 more points on ACT are associated
with only 0.094 more points of college GPA, on average

OLS Fitted Values and Residuals

Fitted value for observation i

ŷi = β̂0 + β̂1 xi1 + β̂2 xi2 + · · · + β̂k xik

Residual for observation i

ûi = yi − ŷi

Algebraic properties of OLS regression

1  Σ_{i=1}^n ûi = 0 : the OLS residuals sum to zero
2  Σ_{i=1}^n xij ûi = 0, j = 1, . . . , k : the sample covariance between each
   regressor Xj and the residuals û is zero; it follows that the sample covariance
   between the OLS fitted values ŷ and the OLS residuals û is also zero
3  ȳ = β̂0 + β̂1 x̄1 + β̂2 x̄2 + · · · + β̂k x̄k : the point of sample means lies on the
   OLS regression line
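A quick check of these three properties on simulated data (a sketch assuming NumPy; the data-generating values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 2 + 0.5 * x1 - 1.0 * x2 + rng.normal(size=n)     # arbitrary population values

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ b
u_hat = y - y_hat

print(u_hat.sum())                                   # property 1: residuals sum to zero
print(x1 @ u_hat, x2 @ u_hat)                        # property 2: each regressor is orthogonal to the residuals
print(y_hat @ u_hat)                                 # implication: fitted values uncorrelated with residuals
print(y.mean(), b[0] + b[1] * x1.mean() + b[2] * x2.mean())   # property 3: means lie on the regression line
```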

“Partialling Out” Interpretation of Multiple Regression
Consider an MLR model with k = 2 independent variables

Yi = β0 + β1 Xi1 + β2 Xi2 + ui ,

the OLS estimate β̂1 can be obtained in two steps

1 Regress Xi1 on Xi2 and obtain the residuals r̂i1 :

  xi1 = α0 + α1 xi2 + ri1 , i = 1, . . . , n
  r̂i1 = xi1 − α̂0 − α̂1 xi2

2 Regress Yi on the residuals r̂i1 ; then

  β̂1 = Σ_{i=1}^n r̂i1 yi / Σ_{i=1}^n r̂i1²

(The sample covariance between the OLS residuals and the regressor is zero.)
r̂i1 is the part of xi1 that is uncorrelated with xi2
Or: r̂i1 is xi1 after the effects of xi2 have been partialled out
(netted out)
“Partialling Out” Interpretation of Multiple Regression
β̂1 = Σ_{i=1}^n r̂i1 yi / Σ_{i=1}^n r̂i1²

β̂1 measures the sample relationship between Y and X1 after X2 has
been partialled out
▶ It represents the isolated effect of the explanatory variable on the
dependent variable
In the general model with k explanatory variables, β̂j can be written as

β̂j = Σ_{i=1}^n r̂ij yi / Σ_{i=1}^n r̂ij²

where the residuals r̂ij come from the regression of Xj on
X1 , . . . , Xj−1 , Xj+1 , . . . , Xk ; this “takes out” the part of Xj that is
correlated with the other independent variables
Holding the other variables X1 , . . . , Xj−1 , Xj+1 , . . . , Xk fixed, β̂j is the
estimated partial effect of Xj on Y
“Partialling Out” Interpretation of Multiple Regression

Frisch-Waugh theorem
It can be shown that β̂j can be obtained from the following procedure:
1 Regress Y on X1 , . . . , Xj−1 , Xj+1 , . . . , Xk and obtain the residuals ûY
2 Regress Xj on X1 , . . . , Xj−1 , Xj+1 , . . . , Xk and obtain the residuals ûXj
3 Regress ûY on ûXj ; the slope coefficient from this regression is β̂j
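A short simulation sketch of the partialling-out result (assuming NumPy; the data-generating process is made up for illustration): the coefficient on X1 from the full regression coincides with the coefficient from regressing Y on the residuals of X1 after X2 has been partialled out.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)           # X1 correlated with X2
y = 1 + 2 * x1 - 3 * x2 + rng.normal(size=n)

# Full multiple regression of Y on (1, X1, X2)
X = np.column_stack([np.ones(n), x1, x2])
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Partialling out: regress X1 on (1, X2) and keep the residuals r1_hat
Z = np.column_stack([np.ones(n), x2])
r1_hat = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]

# Simple regression of Y on r1_hat (the residuals have mean zero)
beta1_partial = (r1_hat @ y) / (r1_hat @ r1_hat)

print(beta_full[1], beta1_partial)           # identical up to floating-point error
```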

Goodness-of-Fit
Decomposition of the total variation:

SST = SSE + SSR

where SST = Σ_{i=1}^n (yi − ȳ )², SSE = Σ_{i=1}^n (ŷi − ȳ )², SSR = Σ_{i=1}^n ûi²

R-squared

R² ≡ SSE / SST = 1 − SSR / SST

R² is non-decreasing in the number of independent variables:
R-squared never decreases, and usually increases, when another independent
variable is included in the regression
Even if we add an independent variable that is irrelevant to Y, R-squared
(weakly) increases, so we cannot use R-squared to decide whether a variable
should be included in the model
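A small illustration of this point (a sketch assuming NumPy; "junk" is a regressor that is irrelevant to Y by construction):

```python
import numpy as np

def r_squared(y, X):
    """R^2 from an OLS regression of y on X (X should include a constant column)."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    u_hat = y - X @ b
    return 1 - (u_hat @ u_hat) / ((y - y.mean()) @ (y - y.mean()))

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
y = 1 + 0.5 * x1 + rng.normal(size=n)
junk = rng.normal(size=n)                       # irrelevant to Y by construction

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([X_small, junk])
print(r_squared(y, X_small), r_squared(y, X_big))   # the second value is never smaller
```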

Goodness-of-Fit
Example 3.5 Explaining Arrest Records

An additional explanatory variable (avgsen) is added

It has limited additional explanatory power, as R-squared increases only slightly

The sign of the coefficient on avgsen is also unexpected: it implies that a longer
average sentence length increases criminal activity
Even if R-squared is small, the regression may still provide good estimates of
ceteris paribus effects
Assumptions for the Multiple Regression Model

MLR Assumptions
MLR.1 (Linear in Parameters) The population model is

Y = β0 + β1 X1 + β2 X2 + · · · + βk Xk + u

MLR.2 (Random Sampling) We have a random sample of n
observations, {(xi1 , xi2 , . . . , xik , yi ) : i = 1, 2, . . . , n}, following the
population model in MLR.1.

Assumptions for the Multiple Regression Model

MLR Assumptions
MLR.3 (No Perfect Collinearity) In the sample (and therefore in
the population), none of the independent variables is constant, and
there are no exact linear relationships among the independent
variables.

The assumption only rules out perfect collinearity between
explanatory variables; imperfect correlation is allowed.
If an explanatory variable is a perfect linear combination of the other
explanatory variables, it is superfluous and may be eliminated.
Constant variables are also ruled out (they are collinear with the intercept).

Assumptions for the Multiple Regression Model

Example for perfect collinearity: small sample

Example for perfect collinearity: relationships between regressors

Assumptions for the Multiple Regression Model

MLR Assumptions
MLR.4 (Zero Conditional Mean) The error u has an expected
value of zero given any values of the independent variables.

E (u|X1 , X2 , . . . , Xk ) = 0.

Notations:
▶ Vector of n observations of the jth explanatory variable:
Xj = (X1j , . . . , Xij , . . . , Xnj )′ , j = 1, . . . , k
▶ X ≡ [1 X1 X2 · · · Xk ], where 1 is an (n × 1) vector of ones
Assumption MLR.4 with MLR.2 implies E (ui |X) = 0 for i = 1, . . . , n

Assumptions for the Multiple Regression Model

In an MLR model, the ZCM assumption is much more likely to hold than in a
simple regression because fewer factors end up in the error term.
The ZCM assumption fails if
▶ The functional relationship between the dependent and independent
variables is misspecified
▶ Omitting an important factor that is correlated with any of
X1 , X2 , . . . , Xk
▶ Measurement error (Ch. 9, 15)
▶ Simultaneous equations models (Ch. 16)

Unbiasedness of OLS

Theorem 3.1 Unbiasedness of OLS


Under Assumptions MLR.1 through MLR.4,

E (β̂j ) = βj , j = 0, 1, . . . , k

for any values of the population parameters βj . In other words, the OLS
estimators are unbiased estimators of the population parameters.

Omitted Variable Bias

The true population model has two explanatory variables and an error
term
Y = β0 + β1 X1 + β2 X2 + u
and assume this model satisfies Assumptions MLR.1-MLR.4
Suppose we specify the model by excluding X2 :

Y = β̃0 + β̃1 X1 + ũ

What will happen to the OLS estimator of β̃1 ?

Omitted Variable Bias
Recall that
β̃̂1 = Σ_{i=1}^n (Xi1 − X̄1 )Yi / Σ_{i=1}^n (Xi1 − X̄1 )²
    = β1 + β2 [ Σ_{i=1}^n (Xi1 − X̄1 )Xi2 / Σ_{i=1}^n (Xi1 − X̄1 )² ]
         + [ Σ_{i=1}^n (Xi1 − X̄1 )ui / Σ_{i=1}^n (Xi1 − X̄1 )² ].    (1)

Consider the following regression, where E (v |X1 ) = 0:

X2 = γ0 + γ1 X1 + v

Note that the OLS estimator of γ1 is unbiased:

E [γ̂1 |X1 ] = E [ Σ_{i=1}^n (Xi1 − X̄1 )Xi2 / Σ_{i=1}^n (Xi1 − X̄1 )² | X1 ] = γ1

where X1 = (X11 , . . . , Xi1 , . . . , Xn1 )′


Omitted Variable Bias
Take the conditional expectation of (1) given X1 :

E [β̃̂1 |X1 ] = β1 + β2 E [ Σ_{i=1}^n (Xi1 − X̄1 )Xi2 / Σ_{i=1}^n (Xi1 − X̄1 )² | X1 ]
              + E [ Σ_{i=1}^n (Xi1 − X̄1 )ui / Σ_{i=1}^n (Xi1 − X̄1 )² | X1 ].

By the Law of Iterated Expectations,

E [ Σ_{i=1}^n (Xi1 − X̄1 )ui / Σ_{i=1}^n (Xi1 − X̄1 )² | X1 ]
  = E [ E [ Σ_{i=1}^n (Xi1 − X̄1 )ui / Σ_{i=1}^n (Xi1 − X̄1 )² | X ] | X1 ] = 0

under Assumption MLR.4 (ZCM)

We also have

E [ Σ_{i=1}^n (Xi1 − X̄1 )Xi2 / Σ_{i=1}^n (Xi1 − X̄1 )² | X1 ] = γ1

Therefore,

E [β̃̂1 |X1 ] = β1 + β2 · γ1

Omitted Variable Bias
The term β2 · γ1 is called the omitted variable bias:

E [β̃̂1 |X1 ] − β1 = β2 · γ1

which can be positive, negative, or zero.

We can infer the sign of the bias in β̃̂1 from the signs of β2 and of the
correlation between X1 and X2 (i.e., the sign of γ1 )

There is no omitted variable bias if the omitted variable is
▶ irrelevant: β2 = 0
▶ uncorrelated with X1 : γ1 = 0
Can be generalized to the case of more than 2 regressors
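A Monte Carlo sketch of the omitted variable bias formula (assuming NumPy; the population values β1, β2, γ1 are chosen arbitrarily): the simple-regression estimates center on β1 + β2 · γ1 rather than on β1.

```python
import numpy as np

rng = np.random.default_rng(4)
beta1, beta2, gamma1 = 1.0, 2.0, 0.5          # arbitrary population values
n, reps = 200, 2000
biased_estimates = []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = gamma1 * x1 + rng.normal(size=n)     # X2 = gamma0 + gamma1*X1 + v  (gamma0 = 0 here)
    y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    # Misspecified simple regression of Y on X1 only (X2 omitted)
    X = np.column_stack([np.ones(n), x1])
    biased_estimates.append(np.linalg.lstsq(X, y, rcond=None)[0][1])

print("mean of beta1_tilde:", np.mean(biased_estimates))   # roughly beta1 + beta2*gamma1 = 2.0
print("theoretical value  :", beta1 + beta2 * gamma1)
```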

Omitted Variable Bias
Example: Omitting ability in a wage equation

wage = β0 + β1 educ + β2 abil + u


abil = γ0 + γ1 educ + v

More ability leads to higher productivity and therefore higher wages:
β2 > 0
It is likely that educ and abil are positively correlated: γ1 > 0
The OLS estimator β̃̂1 from the simple regression

wage = β̃0 + β̃1 educ + ũ

is on average too large (overestimates β1 ) because β2 γ1 > 0

The Variance of the OLS Estimators

MLR Assumptions
MLR.5 (Homoskedasticity): The error u has the same variance
given any value of the explanatory variables.

Var(u|X1 , . . . , Xk ) = σ 2

Example: Wage equation

wage = β0 + β1 educ + β2 exper + β3 tenure + u


Var(ui |educi , experi , tenurei ) = σ 2

The Variance of the OLS Estimators

Theorem 3.2 Sampling variances of the OLS slope estimators


Under Assumptions MLR.1 through MLR.5,

Var(β̂j |X) = σ² / [ SSTj (1 − Rj² ) ],   j = 1, . . . , k

where SSTj = Σ_{i=1}^n (Xij − X̄j )² and Rj² is the R-squared from regressing
Xj on all other independent variables (including an intercept).

Components of the variance


1 The error variance: σ 2
2 The total variation in Xj : SSTj
3 The linear relationship among X1 , . . . , Xk : Rj2
⋆ Variance inflation factor
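A numerical sketch of these components (assuming NumPy; σ² is treated as known and the degree of collinearity is chosen arbitrarily), including the variance inflation factor 1/(1 − Rj²):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
x2 = rng.normal(size=n)
x1 = 0.9 * x2 + 0.3 * rng.normal(size=n)      # strong (but imperfect) collinearity with X2
sigma2 = 1.0                                  # error variance, assumed known for illustration

# R_1^2 from regressing X1 on the other regressors (here a constant and X2)
Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
R1_sq = 1 - (r1 @ r1) / ((x1 - x1.mean()) @ (x1 - x1.mean()))

SST1 = ((x1 - x1.mean()) ** 2).sum()
var_beta1 = sigma2 / (SST1 * (1 - R1_sq))     # Theorem 3.2 formula
vif1 = 1 / (1 - R1_sq)                        # variance inflation factor

print("R_1^2:", R1_sq, "VIF:", vif1, "Var(beta1_hat | X):", var_beta1)
```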

The Components of the OLS Variances: Multicollinearity

1 The error variance σ 2


▶ A high error variance increases the sampling variance because there is
more “noise” in the equation.
▶ The error variance does not decrease with sample size.
2 The total sample variation in the explanatory variable SSTj
▶ More sample variation leads to more precise estimates.
▶ Total sample variation automatically increases with the sample size,
so increasing the sample size is a way to get more precise
estimates.
▶ A small (nonzero) SSTj is not a violation of Assumption MLR.3.

The Components of the OLS Variances: Multicollinearity

3 Linear relationships among the independent variables


▶ Regress Xj on all other independent variables; the R-squared of this
regression will be higher when Xj can be better explained by the
other independent variables.
▶ The sampling variance of the slope estimator for Xj will be higher when
Xj can be better explained by the other independent variables.
▶ High (but not perfect) correlation between two or more independent
variables is called multicollinearity.
⋆ This is not a violation of Assumption MLR.3.
▶ A high degree of correlation between certain independent variables can
be irrelevant for how precisely we can estimate other parameters of
interest.

The Components of the OLS Variances: Multicollinearity
An example for multicollinearity

The different expenditure categories will be strongly correlated because,
if a school has a lot of resources, it will spend a lot on everything.
As a consequence, the sampling variances of the estimated effects will be
large.
Only the sampling variances of the variables involved in the
multicollinearity will be inflated; the estimates of other effects may be
very precise.

Including Irrelevant Variables in a Regression

Suppose that the true population model is

Y = β0 + β1 X1 + β2 X2 + u

and, assuming Assumptions MLR.1-MLR.4 hold, we specify the model as
follows, where X3 is irrelevant (it has no partial effect on Y ):

Y = β̃0 + β̃1 X1 + β̃2 X2 + β̃3 X3 + ũ

What will happen to the OLS estimators of β̃1 and β̃2 ?

Including Irrelevant Variables in a Regression

In terms of unbiasedness of β̃̂1 and β̃̂2 , there is no effect, which is
immediate from Theorem 3.1.
▶ Remember that unbiasedness means E (β̂j ) = βj for any value of βj
However, inclusion of irrelevant variables generally increases the variances
of the OLS estimators. Remember that

Var(β̂j |X) = σ² / [ SSTj (1 − Rj² ) ],   j = 1, . . . , k

where SSTj = Σ_{i=1}^n (Xij − X̄j )² and Rj² is the R-squared from
regressing Xj on all other independent variables (including an
intercept); adding an irrelevant variable that is correlated with Xj raises Rj²
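A small simulation sketch of this point (assuming NumPy; here the true model has a single relevant regressor X1, and X3 is irrelevant but correlated with X1): both specifications are unbiased, but the over-specified one has a larger sampling variance.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 100, 2000
est_correct, est_overspecified = [], []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x3 = 0.9 * x1 + 0.3 * rng.normal(size=n)   # irrelevant (beta3 = 0) but correlated with X1
    y = 1 + 0.5 * x1 + rng.normal(size=n)
    X_c = np.column_stack([np.ones(n), x1])
    X_o = np.column_stack([np.ones(n), x1, x3])
    est_correct.append(np.linalg.lstsq(X_c, y, rcond=None)[0][1])
    est_overspecified.append(np.linalg.lstsq(X_o, y, rcond=None)[0][1])

# Both are centered near 0.5 (unbiased), but the over-specified estimator is noisier
print(np.mean(est_correct), np.var(est_correct))
print(np.mean(est_overspecified), np.var(est_overspecified))
```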

Standard Errors of the OLS Estimators

The unbiased estimator of σ² in an MLR model is

σ̂² = (1 / (n − k − 1)) Σ_{i=1}^n ûi²

where ûi = yi − β̂0 − β̂1 xi1 − · · · − β̂k xik


The degrees of freedom (df)

df = n − (k + 1)
= (number of observations) − (number of estimated parameters)

Standard Errors of the OLS Estimators
Unbiased estimation of σ²
Under Assumptions MLR.1 through MLR.5, E (σ̂² ) = σ²

The standard deviation of β̂j

sd(β̂j ) = √Var(β̂j ) = σ / √( SSTj (1 − Rj² ) )

The standard error of β̂j

se(β̂j ) = √V̂ar(β̂j ) = σ̂ / √( SSTj (1 − Rj² ) )

Note that these formulas are only valid under Assumptions


MLR.1-MLR.5 (in particular, there has to be homoskedasticity)
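A short sketch of these computations (assuming NumPy; the data are simulated and the model has k = 2 regressors), checking the SSTj(1 − Rj²) formula against the usual matrix formula σ̂²(X′X)⁻¹:

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 150, 2
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1 + 0.5 * x1 - 1.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
u_hat = y - X @ beta_hat

sigma2_hat = (u_hat @ u_hat) / (n - k - 1)    # unbiased estimator of sigma^2 (df = n - k - 1)

# se(beta1_hat) via the SST_1 * (1 - R_1^2) formula
Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
R1_sq = 1 - (r1 @ r1) / ((x1 - x1.mean()) @ (x1 - x1.mean()))
SST1 = ((x1 - x1.mean()) ** 2).sum()
se_beta1 = np.sqrt(sigma2_hat / (SST1 * (1 - R1_sq)))

# The same number from the matrix formula  Var_hat(beta_hat) = sigma2_hat * (X'X)^{-1}
se_matrix = np.sqrt(sigma2_hat * np.linalg.inv(X.T @ X)[1, 1])
print(se_beta1, se_matrix)                    # the two agree
```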

The Gauss-Markov Theorem

Under assumptions MLR.1-MLR.4, OLS is unbiased


However, under these assumptions there may be many other
estimators that are unbiased.
Which one is the unbiased estimator with the smallest variance?
▶ We define the “best” estimator as the one with the smallest variance.
In order to answer this question, one usually restricts attention to linear
estimators, i.e., estimators that are linear in the dependent variable:

β̃j = Σ_{i=1}^n Wij Yi

▶ Wij may be an arbitrary function of the sample values of all the
explanatory variables; the OLS estimator can be shown to be of this
form

The Gauss-Markov Theorem

Theorem 3.4 Gauss-Markov Theorem

Under Assumptions MLR.1-MLR.5, the OLS estimator β̂j is the best
linear unbiased estimator (BLUE) of the regression coefficient βj , i.e.

Var(β̂j ) ≤ Var(β̃j ),   j = 0, 1, . . . , k

for all linear estimators β̃j = Σ_{i=1}^n Wij Yi for which E (β̃j ) = βj , j = 0, 1, . . . , k.

OLS is only the best estimator if MLR.1 – MLR.5 hold; if there is
heteroskedasticity, OLS no longer has the smallest variance among
linear unbiased estimators
