Lecture 3 - Econometria I
yᵢ = β₀ + β₁xᵢ + uᵢ,   i = 1, 2, …, n

E(u|x) = 0
Var(u|x) = σ²
Simple linear regression model
Now suppose you want to explore the effect of public spending per
student (expend) on their average standardized test score (avgscore).
However, in the USA, public school expenditure per student is funded
by property and local income taxes, and is therefore correlated with
family income; thus we could include average family income in our
regression.
y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + ⋯ + βₖxₖ + u

E(u|x₁, x₂, …, xₖ) = 0
ŷ = β̂₀ + β̂₁x₁ + β̂₂x₂

where β̂₀ is the estimate of β₀, β̂₁ is the estimate of β₁ and β̂₂ is the
estimate of β₂.
Multiple linear regression model
The OLS estimates β̂₀, β̂₁, β̂₂ are the values that minimize the sum of
squared residuals

∑ᵢ₌₁ⁿ (yᵢ − β̂₀ − β̂₁xᵢ₁ − β̂₂xᵢ₂)²
And the sample linear regression function (or OLS regression line) is
given by

ŷ = β̂₀ + β̂₁x₁ + β̂₂x₂ + ⋯ + β̂ₖxₖ

In the general case with k regressors, the OLS estimates minimize

∑ᵢ₌₁ⁿ (yᵢ − β̂₀ − β̂₁xᵢ₁ − β̂₂xᵢ₂ − ⋯ − β̂ₖxᵢₖ)²
Deriving the OLS estimator
ŷ = β̂₀ + β̂₁x₁ + β̂₂x₂

The intercept β̂₀ is the predicted value of y when both x₁ and x₂ equal
zero. The estimates β̂₁ and β̂₂ have partial-effect, or ceteris paribus,
interpretations.
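As a minimal sketch of the estimation above, the snippet below simulates data from a two-regressor model with made-up coefficients and recovers the OLS estimates by solving the least-squares problem directly (all variable names and parameter values are illustrative, not from the lecture):

```python
import numpy as np

# Simulate data from y = b0 + b1*x1 + b2*x2 + u with hypothetical
# coefficients, then recover them by minimizing the sum of squared
# residuals via np.linalg.lstsq.
rng = np.random.default_rng(0)
n = 10_000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n)                     # E(u|x1, x2) = 0 by construction
y = 1.0 + 2.0 * x1 - 0.5 * x2 + u          # true (b0, b1, b2) = (1, 2, -0.5)

X = np.column_stack([np.ones(n), x1, x2])  # design matrix with intercept
# lstsq minimizes sum((y - X @ b)**2), i.e. the sum of squared residuals
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # close to [1.0, 2.0, -0.5]
```

With n this large, the estimates land very close to the true coefficients; the intercept and slopes can be read off `beta_hat` in the order the columns were stacked.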
The first two multiple linear regression (MLR) assumptions are very similar
to the SLR assumptions:
E(u|x₁, x₂, …, xₖ) = 0
There are several ways that this assumption might not be met.
Multiple linear regression assumptions
For now, we only need to worry about the first two. When MLR.4 holds,
it is often said that we have exogenous explanatory variables.
Unbiasedness of OLS
E(β̂ⱼ) = βⱼ,   j = 0, 1, …, k
This theorem, stated here for the general model, also applies to SLR as
a special case. Essentially, it describes the unbiasedness of OLS.
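Unbiasedness is a statement about the average of the estimator across repeated samples, which lends itself to a quick Monte Carlo check. The sketch below re-draws many samples from the same (hypothetical) population and averages the OLS estimates:

```python
import numpy as np

# Monte Carlo check of E(beta_hat_j) = beta_j: draw many independent
# samples from one population and average the OLS estimates.
# The true coefficients are chosen arbitrarily for the illustration.
rng = np.random.default_rng(1)
b_true = np.array([1.0, 2.0, -0.5])
n, reps = 200, 5_000

estimates = np.empty((reps, 3))
for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    u = rng.normal(size=n)                 # satisfies MLR.4: E(u|x) = 0
    y = b_true[0] + b_true[1] * x1 + b_true[2] * x2 + u
    X = np.column_stack([np.ones(n), x1, x2])
    estimates[r], *_ = np.linalg.lstsq(X, y, rcond=None)

print(estimates.mean(axis=0))  # averages close to [1.0, 2.0, -0.5]
```

Any single sample's estimates miss the truth, but their average over the 5,000 replications tracks the true coefficient vector closely, which is exactly what the theorem asserts.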
Multiple linear regression assumptions
Var(u|x₁, x₂, …, xₖ) = σ²
Just as in the simple linear regression model, the variance of the
unobserved error is not affected by the values of the explanatory
variables. This holds for each explanatory variable in the model.
y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + u

ŷ = β̂₀ + β̂₁x₁ + β̂₂x₂ + β̂₃x₃
Including an irrelevant variable in a regression model
Thankfully, the inclusion of an irrelevant regressor does not affect the
unbiasedness of OLS. That is to say, if assumptions MLR.1 through
MLR.4 hold, the OLS estimators for each βⱼ are unbiased, i.e.
E(β̂ⱼ) = βⱼ for any value of βⱼ.
In a given sample, the estimate β̂ⱼ attached to the irrelevant variable
will most likely not be exactly zero, but it should be very close to it
and, on average across all samples, it will be zero.
While the OLS estimator remains unbiased, the inclusion of an
irrelevant variable does affect the variance of OLS estimators (to be
seen).
Exclusion of a relevant variable in a regression model
The second case, where we exclude a relevant variable in a
regression model, is more complex.
We saw previously that the exclusion of a relevant variable from our
model can be problematic, especially if that variable is correlated
with one of the other regressors, as this invalidates MLR.4 and the
unbiasedness of the OLS estimators. Now we take a closer look at the
process. Suppose we want to estimate the following:
𝑤𝑎𝑔𝑒 = 𝛽0 + 𝛽1 𝑒𝑑𝑢𝑐 + 𝑣
where 𝑣 = 𝛽2 𝑎𝑏𝑖𝑙 + 𝑢.
ỹ = β̃₀ + β̃₁ educ
Notice the difference between “~” and “^” in the model. We use
this to explicitly show that we are using an underspecified model.
In this case, the underspecified coefficient β̃₁ can be written as

β̃₁ = β̂₁ + β̂₂ δ̃₁

β̂₁ and β̂₂ are the estimates from the true population regression
function, while δ̃₁ is the slope of the simple linear regression of abil on
educ, i.e. ability as a function of education.
Since we can assume δ̃₁ is a non-random value and assumptions
MLR.1-MLR.4 are met, we can write:

E(β̃₁) = E(β̂₁ + β̂₂ δ̃₁) = E(β̂₁) + E(β̂₂) δ̃₁ = β₁ + β₂ δ̃₁

Bias(β̃₁) = E(β̃₁) − β₁ = β₂ δ̃₁
Bias(β̃₁) = β₂ δ̃₁
The term on the right-hand side of the equation is called the omitted
variable bias, and we can infer the direction of the bias from the signs
of both β₂ and δ̃₁. Of course, if either one of them is equal to zero then
the OLS estimate of β₁ will be unbiased.
Note that we can tell the direction of the bias in this way, but not its magnitude.
Bias(β̃₁) = β₂ δ̃₁
To summarise, if the excluded variable is not correlated with any other
regressor, then there is no bias in the OLS estimates. However, if there is
correlation, the direction of the bias will depend on the sign of both
the true parameter and the correlation between the regressors.
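The bias formula above can be verified numerically. The sketch below builds a made-up wage/educ/abil population in which ability is positively correlated with education, runs the short regression of wage on educ alone, and compares the resulting slope with β₁ + β₂δ̃₁ (all parameter values are invented for the illustration):

```python
import numpy as np

# Numerical check of Bias(beta_1_tilde) = beta_2 * delta_1_tilde.
rng = np.random.default_rng(3)
n = 100_000
b0, b1, b2 = 1.0, 0.5, 0.3                  # hypothetical true parameters

educ = rng.normal(12.0, 2.0, size=n)
abil = 0.4 * educ + rng.normal(size=n)      # abil correlated with educ
wage = b0 + b1 * educ + b2 * abil + rng.normal(size=n)

# delta_1_tilde: slope of the simple regression of abil on educ
delta1 = np.cov(educ, abil)[0, 1] / np.var(educ, ddof=1)

# Short (underspecified) regression: wage on educ only
X_short = np.column_stack([np.ones(n), educ])
b1_tilde = np.linalg.lstsq(X_short, wage, rcond=None)[0][1]

print(b1_tilde)            # roughly b1 + b2 * delta1
print(b1 + b2 * delta1)    # value predicted by the bias formula
```

Here δ̃₁ is positive (abil rises with educ) and β₂ is positive, so the short-regression slope overstates β₁, and by almost exactly β₂δ̃₁, as the formula predicts.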