Sampling Distributions of the OLS Estimators


Multiple linear regression:

hypothesis testing
BY SHARPER SIKOTA, LECTURER, KWAME NKRUMAH UNIVERSITY

WOOLDRIDGE CHAPTERS 4 – 9
Cheat Sheet Time

Make yourself a cheat sheet with the formulae for:

The population model, fitted values, residuals, the OVB formula,
SST, SSR, SSE, R-squared, the relationship between SST, SSE and
SSR, the variance of β̂, VIF, and whatever else seems appropriate.
Then keep it handy while studying. It will prove useful!
Chapter 4 Roadmap

1. Sampling Distributions of the OLS Estimators
2. Testing Hypotheses about a Single Population Parameter: The t Test
3. Confidence Intervals
4. Testing Hypotheses about a Single Linear Combination of the Parameters
5. Testing Multiple Linear Restrictions: The F Test
6. Reporting Regression Results

Logwage = β0 + β1female + β2educ + β3age + β4african + β5coloured +
β6indas + β7depressed + β8hhrural + u

Now we can ask questions
What is the distribution of β̂? (4.1)
What is the true coefficient on female? (4.2)
Is a coefficient equal to a particular value? (4.2, 4.3)
Are two or more coefficients related somehow? (4.4)
Should all the variables have been included? (4.5)
NB! What is the best way to report regressions? (4.6)


MLR Assumptions 1-5

MLR.1 Linear in parameters: y = β0 + β1x1 + β2x2 + ... + βkxk + u
MLR.2 Random sampling
MLR.3 No perfect collinearity
MLR.4 Zero conditional mean: E(u | x1, x2, …, xk) = 0
MLR.5 Homoskedasticity: Var(u | x1, x2, …, xk) = σ2

Distribution of the OLS Estimator

➢ What does it mean to talk about the distribution of β̂?

In one sample we estimate y = β̂0 + β̂1x1 + û.
➢ Surely I just get one value for β̂1?
➢ Yes. You do. In that sample. What about in another sample?
➢ Thought experiment: (1) draw many, many samples, (2)
estimate β̂1 in each sample, and (3) examine the distribution of
all of those values for β̂1 (sketched below).
➢ Is it normal? Uniform? Poisson?

If our estimator is good, this distribution (1) has low variance, and
(2) is centered around the true population value of β1.
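
A minimal Stata sketch of this thought experiment (a hypothetical simulation, not from the lecture: the true values β0 = 2, β1 = 0.5, the sample size n = 100, and the variable names are all made up for illustration):

* Simulate the sampling distribution of beta1-hat.
* True model: y = 2 + 0.5*x + u, with u ~ Normal(0, 1).
clear all
set seed 12345

program define onedraw, rclass
    drop _all
    set obs 100                  // sample size n = 100
    generate x = rnormal()
    generate u = rnormal(0, 1)
    generate y = 2 + 0.5*x + u
    regress y x
    return scalar b1 = _b[x]     // this sample's estimate of beta1
end

simulate b1 = r(b1), reps(1000) nodots: onedraw
summarize b1                     // mean should be close to the true 0.5
histogram b1, normal             // shape should be close to the normal overlay

Each replication draws a fresh sample and re-estimates the model; the histogram of the 1,000 estimates is the (simulated) sampling distribution of β̂1.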

Distribution of the OLS Estimator
▪ How are the sample coefficients distributed?

▪ We need to know this to perform hypothesis tests.

▪ We add a new normality assumption about u:

MLR.6 (Normality): u is independent of the x's and u ~ Normal(0, σ2)

▪ MLR.6 will help us figure out how β̂ is distributed.

MLR.6 implies MLR.4 and MLR.5

▪ u is independent of the x's, so E(u|x) = E(u).
We know E(u) = 0, so E(u|x) = 0, so MLR.6 → MLR.4.

▪ u is independent of the x's, so Var(u|x) = Var(u).
MLR.6 also says Var(u) = σ2, so Var(u|x) = σ2, which is MLR.5
(homoskedasticity), so MLR.6 → MLR.5.

Classical Linear Model (CLM)

MLR.1 Linear in parameters: y = β0 + β1x1 + β2x2 + ... + βkxk + u
MLR.2 Random sampling
MLR.3 No perfect collinearity
MLR.4 Zero conditional mean: E(u | x1, x2, …, xk) = 0
MLR.5 Homoskedasticity: Var(u | x1, x2, …, xk) = σ2
MLR.6 Normality: u ~ Normal(0, σ2)

What does the CLM imply?

MLR.1 – MLR.6 together are the classical linear model (CLM) assumptions.

Under the CLM, the OLS estimator has the lowest variance of ALL the
unbiased estimators of βi (i.e. not just the linear ones).

Given MLR.1 – MLR.6 we also know:

y|x ~ Normal(β0 + β1x1 + β2x2 + ... + βkxk, σ2)

Why? We need some proof (next slide).

Proof: Var(y|x) = Var(u|x) = σ2

y = β0 + β1x1 + β2x2 + ... + βkxk + u

Let A = β0 + β1x1 + β2x2 + ... + βkxk.
Therefore y = A + u,
and Var(y|x) = Var(A + u|x).

Var(y|x) = Var(A|x) + Var(u|x)
(by independence of A and u)

Var(y|x) = 0 + Var(u|x) = 0 + σ2 = σ2
by the laws of conditional variance. (Proved)

NB: I did this in Chapter 3; including it here for convenience.

Why is Var(y|x) = Var(u|x) = σ2?

Well, y = β0 + β1x1 + β2x2 + ... + βkxk + u. This is unwieldy, so let:
A = β0 + β1x1 + β2x2 + ... + βkxk
So y = A + u, and Var(y|x) = Var(A + u|x).

Var(y|x) = Var(A|x) + Var(u|x)

Why? We can only do this if A and u are independent. Are they? Yes, if
E(u|x) = E(u) = 0 holds. If A and u were related, we would have a
Cov(A, u) term in the mix somewhere too (remember the laws of how to
find Var(A + B) for any random variables A and B).

Why are they independent? Because E(u|x) = E(u) implies u and x are
unrelated, so u and any function of x (which is what A is) should also
be unrelated.

So now what?

Var(y|x) = 0 + Var(u|x) = 0 + σ2

Why is Var(A|x) = 0? Because once you are given the values of x (this
is what conditioning on x means), there is no variance in an expression
of x: it is a constant, and thus has zero variance.
Why is E(y|x) = β0 + β1x1 + β2x2 + ... + βkxk?

Given: y = β0 + β1x1 + β2x2 + ... + βkxk + u

E(y|x) = E(β0 + β1x1 + β2x2 + ... + βkxk + u | x)
E(y|x) = E(β0 + β1x1 + β2x2 + ... + βkxk | x) + E(u|x)
because the expected value of a sum = the sum of the expected values.

But x and u are independent, so E(u|x) = E(u).
So: E(y|x) = E(β0 + β1x1 + β2x2 + ... + βkxk | x) + E(u)
But E(u) = 0 (implied by the zero conditional mean assumption, MLR.4).
Therefore:
E(y|x) = E(β0 + β1x1 + β2x2 + ... + βkxk | x) = β0 + β1x1 + β2x2 + ... + βkxk

because E(f(x)|x) = f(x) (laws of conditional expectations).
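
A quick numerical check of both results (a hypothetical simulation; the values β0 = 1, β1 = 0.5, x = 2 and σ = 3 are made up for illustration):

* Condition on x = 2 and compare the simulated mean and variance
* of y with the theory: E(y|x=2) = 1 + 0.5*2 = 2, Var(y|x=2) = 9.
clear
set seed 12345
set obs 10000
generate u = rnormal(0, 3)     // u ~ Normal(0, 9)
generate y = 1 + 0.5*2 + u     // x is held fixed at 2
summarize y                    // mean near 2, Std. Dev. near 3 (Var near 9)
histogram y, normal            // and y|x is normal, as the next slide argues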

y|x ~ Normal(β0 + β1x1 + β2x2 + ... + βkxk, σ2)

▪ Why? You proved: 1. Var(y|x) = σ2
▪ And you know: 2. E(y|x) = β0 + β1x1 + β2x2 + ... + βkxk
(because of the independence of x and u)
▪ So why is 3. y|x normally distributed?
▪ Because y = f(the x's, u),
therefore y|x = f(a constant, u).
▪ Remember: conditioning on x (i.e. being given the values of x)
means any function of x is a constant with zero variance.
▪ The only variation in y is from u, which is normally distributed (by
MLR.6), → y|x ~ Normal

Are the errors really normal?

u = the sum of many things affecting y.
Maybe, according to the Central Limit Theorem:
the (standardized) sum of many independent random variables tends
towards the standard normal distribution.

This won't hold if:
▪ u is a complicated function of the left-out x's (i.e. not linear),
▪ OR the x's have weird distributions.

▪ It's an empirical question whether the u's are normal.
▪ What if they're not?
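
A quick way to see the CLT at work (a hypothetical demo, not from the lecture): a sum of 30 independent uniform draws already looks close to normal.

* CLT demo: sum 30 independent uniforms and inspect the shape.
clear
set seed 12345
set obs 10000
generate s = 0
forvalues i = 1/30 {
    quietly replace s = s + runiform()   // add one more independent term
}
histogram s, normal                      // close to the normal overlay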

Are Wages Normally Distributed? What about Log Wages?

[Figures: histograms of monthly wages and of log wages. The raw wages
are heavily skewed; taking logs brings the distribution much closer to
normal.]
Who is in the regression?
Stata can predict yhat for you if you have data for all the x's,
even if you don't have data for lmonthlywage. A minimal sketch:
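
This assumes a dataset in memory with the variables from the example model above; the regression and variable names come from that model, not from actual course data.

* Fitted values are computed wherever the x's are nonmissing,
* even for observations with a missing lmonthlywage.
regress lmonthlywage female educ age african coloured indas depressed hhrural
predict yhat, xb                                  // the linear prediction
count if missing(lmonthlywage) & !missing(yhat)   // obs predicted without a wage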

Graph the Residuals
* first recover the residuals from the regression above
predict uhat, residuals
histogram uhat, scheme(s1manual)

Which y's are Normally Distributed?

● E.g. is wage | educ, exper, tenure normally distributed?
No: wage is bounded at zero.
● A categorical y with few values is probably not normal,
e.g. the number of times arrested.
● Taking the log can make a variable more normal; if log(wage) is
normal, we say wage is log-normal. You just saw this (and see the
sketch below).
● Otherwise, in large samples it's OK to assume normality
(Chapter 5). Phew!
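
A minimal sketch of the log-normal point (hypothetical numbers: the "wages" are generated as the exponential of a normal, so they are log-normal by construction):

* Simulated log-normal wages: skewed in levels, normal in logs.
clear
set seed 12345
set obs 10000
generate wage = exp(rnormal(8, 0.6))   // log-normal by construction
generate lwage = ln(wage)
histogram wage, normal                 // right-skewed: poor normal fit
histogram lwage, normal                // close to the normal overlay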
Given MLR.1-6, β̂ is Normal. Conclusions?

● We care because our hypothesis testing depends on the
distribution of β̂.

We also get told:
● Any linear combination of β̂1, β̂2, …, β̂k is normal.
● Any subset of β̂1, β̂2, …, β̂k is jointly normal.

Cool!
What is Joint Normality?
▪ E.g. if x1, x2, x3 are jointly normal, any linear combination of x1,
x2, and x3 is normal. So the expression
a1x1 + a2x2 + a3x3,
where a1, a2 and a3 are any constants, is a new random variable
which is normally distributed (see the sketch below).

▪ NB: the reverse isn't true.
▪ If you take a collection of normally distributed x variables, a
linear combination of those variables may not be normal.
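
A minimal sketch of the first point (hypothetical constants a1 = 2, a2 = -3, a3 = 0.5; the simulated normals are independent, hence jointly normal, so the result applies):

* A linear combination of jointly normal variables is normal.
clear
set seed 12345
set obs 10000
generate x1 = rnormal()                // Normal(0, 1)
generate x2 = rnormal(1, 2)            // Normal(1, 4)
generate x3 = rnormal(-1, 0.5)         // Normal(-1, 0.25)
generate z = 2*x1 - 3*x2 + 0.5*x3      // the linear combination
histogram z, normal                    // matches the normal overlay
sktest z                               // skewness/kurtosis normality test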

Following Joint Normality
▪ Why isn't a linear combination of normal variables
always normal?
▪ Surely it should be? No. Not necessarily.
▪ It will definitely be normal if they are jointly normal (as
opposed to only individually normal).
▪ But what is true is:

Property Normal.4:
Any linear combination of independent, identically
distributed normal random variables has a normal
distribution.
You are told β̂ is normal. Proof?

▪ You could do this: you have the tools.
▪ The errors are independent, identically distributed (iid)
Normal(0, σ2) random variables.
▪ VERY brief summary: each β̂j is a linear combination of the
ui's, and thus β̂j is normal (sketched below).
▪ NB: the textbook has an error. It refers to equation 3.62, but
it is actually another equation; find it in the Chapter 3
appendices.
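
A sketch of the argument for the simple regression case (this decomposition is standard; the multiple regression version in the Chapter 3 appendices has the same structure):

β̂1 = β1 + Σi wi·ui,  where  wi = (xi − x̄) / Σj (xj − x̄)2

Conditional on the x's, the wi are constants, so β̂1 is a linear
combination of iid Normal(0, σ2) errors, and by Property Normal.4 it
is normally distributed.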

NOT EXAMINABLE

