
14/04/2015

Introductory Econometrics:
A modern approach (Wooldridge)
Chapter 4

Multiple Regression Analysis:


Inference

Lê Văn Chơn – IU-VNU, 2015

4.1 Sampling Distributions of the OLS Estimators

To perform statistical inference, we need to know not only the first
two moments of the β̂j, but the full sampling distribution of the β̂j.

It depends on the underlying distribution of the errors.

Assumption MLR.6 (Normality)
The population error u is independent of the explanatory variables
x1, x2, …, xk and is normally distributed: u ~ N(0, σ²).

When u is independent of the xj, then E(u|x1,…, xk) = E(u) = 0 and
Var(u|x1,…, xk) = Var(u) = σ². In other words, Assumption MLR.6 implies
Assumptions MLR.3 and MLR.5.


4.1 Sampling Distributions of the OLS Estimators

Assumptions MLR.1 through MLR.6 are called the classical linear


model (CLM) assumptions.

Under the CLM assumptions, OLS is not only BLUE, but is the
minimum variance unbiased estimator.

We can summarize the population assumptions of the CLM as


y|x ~ N(β0 + β1x1 + β2x2 + … + βkxk, σ²)

Conditional on x, y has a normal distribution with mean linear in


x1,…, xk and a constant variance.


4.1 Sampling Distributions of the OLS Estimators

Normal distribution of y with a single independent variable:


[Figure: at each value of x (e.g. x1 and x2), the conditional density f(y|x)
is a normal curve centered on the regression line E(y|x) = β0 + β1x,
all with the same variance.]

4.1 Sampling Distributions of the OLS Estimators

Why do the errors have a normal distribution?


Because u is the sum of many different unobserved factors affecting
y, we can invoke the central limit theorem to conclude that u has an
approximate normal distribution.

Central Limit Theorem


Let {Y1, Y2, …, Yn} be a random sample with mean µ and variance σ². Then

Zn = (Ȳn − µ) / (σ/√n)

has an asymptotic standard normal distribution.
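The CLT statement above can be illustrated with a quick Monte Carlo sketch (not from the slides; the exponential distribution, sample size, and replication count are illustrative choices): even though each observation comes from a skewed distribution, the standardized sample mean behaves approximately like N(0, 1).

```python
# Monte Carlo sketch of the CLT. Each Y_i is drawn from a skewed
# Exponential(1) distribution, yet the standardized sample mean Z_n
# is approximately standard normal.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 20_000
mu, sigma = 1.0, 1.0                       # mean and std of Exponential(1)

samples = rng.exponential(scale=1.0, size=(reps, n))
ybar = samples.mean(axis=1)                # one sample mean per replication
z = (ybar - mu) / (sigma / np.sqrt(n))     # Z_n = (Ybar_n - mu) / (sigma / sqrt(n))

print(round(z.mean(), 2), round(z.std(), 2))   # close to 0 and 1
```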


4.1 Sampling Distributions of the OLS Estimators

Normality of u translates into normal sampling distributions of the β̂j's:


Theorem 4.1 (Normal Sampling Distributions)
Under the CLM assumptions MLR.1 through MLR.6, conditional on
the sample values of the independent variables,
β̂j ~ N[βj, Var(β̂j)]   (4.1)
Therefore,
(β̂j − βj) / sd(β̂j) ~ N(0, 1)

In addition to (4.1), any linear combination of β̂0, β̂1, …, β̂k is also
normally distributed, and any subset of the β̂j has a joint normal
distribution.


4.2 Testing Hypotheses: the t Test

This section is on testing hypotheses about a single population parameter βj.

We assume the population model satisfies the CLM assumptions:


y = β0 + β1x1 + β2x2 + … + βkxk + u   (4.2)
We know OLS produces unbiased estimators of the βj.

To test hypotheses about a βj, we need the following result:


Theorem 4.2 (t distribution for the standardized estimators)
Under the CLM assumptions MLR.1 through MLR.6,
(β̂j − βj) / se(β̂j) ~ tn−k−1   (4.3)

where k + 1 is the number of unknown population parameters.


4.2 Testing Hypotheses: the t Test

This result differs from Theorem 4.1 in that the σ in sd(β̂j) has been
replaced with the random variable σ̂.

Theorem 4.2 allows us to test hypotheses involving the β j .

Our primary interest lies in testing the null hypothesis:


H0: β j = 0 (4.4)

The null hypothesis means that once x1, x2,…, xj−1, xj+1,…, xk have been
accounted for, xj has no effect on the expected value of y.


4.2 Testing Hypotheses: the t Test

E.g., consider the wage equation


log(wage) = β0 + β1 educ + β2 exper + β3 tenure + u

Null hypothesis H0: β 2 = 0 means that once education and tenure


have been accounted for, experience has no effect on hourly wage.

If β 2 > 0, then prior work experience contributes to productivity, and


hence to wage.

To test H0, we use the t statistic (or t ratio) of β̂j, defined as

t_β̂j ≡ β̂j / se(β̂j)   (4.5)


4.2 Testing Hypotheses: the t Test

To test H0: βj = 0, it is natural to look at β̂j, which is unbiased for βj.


Point estimate β̂ j is never exactly 0, whether or not H0 is true.

Question: How far is β̂ j from 0?


A sample value of β̂ j very far from 0 provides evidence against H0.

But the size of β̂ j must be weighted against its sampling error.


Values of t_β̂j sufficiently far from 0 will result in a rejection of H0.

Precise rejection rule depends on the alternative hypothesis H1 and


the chosen significance level.


4.2 Testing against One-sided Alternatives

Consider a one-sided alternative:


H1: β j > 0 (4.6)
This means we do not care about βj < 0, perhaps because it can be ruled out theoretically.

How should we choose a rejection rule?


First, decide on a significance level α, the probability of rejecting
H0 when it is in fact true. The most popular choice is the 5% significance level.

Second, look up the (1 – α )th percentile in a t distribution with n–k–1


degrees of freedom, and call this c, the critical value.

Rejection rule: if t_β̂j > c, reject H0 at the α% level;
if t_β̂j ≤ c, fail to reject H0 at the α% level.

4.2 Testing against One-sided Alternatives

Population model: y = β0 + β1x1 + β2x2 + … + βkxk + u
Null hypothesis H0: βj = 0
Alternative H1: βj > 0

[Figure: t distribution with the fail-to-reject region (area 1 − α) to the
left of the critical value c and the rejection region (area α) to its right.]

As the df in the t distribution gets large, the t distribution approaches
the standard normal distribution.


4.2 Testing against One-sided Alternatives

E.g., WAGE1. Estimated equation:

log(wage)^ = .284 + .092 educ + .0041 exper + .022 tenure
             (.104)  (.007)      (.0017)       (.003)
n = 526, R² = .316
where standard errors appear in parentheses.

We can test H0: βexper = 0 versus H1: βexper > 0.

t_exper = .0041/.0017 ≈ 2.41 > 2.326 (the 1% critical value),
so βexper is statistically greater than 0 at the 1% significance level.
That is, the partial effect of experience is positive in the population.
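This one-sided test can be reproduced numerically; a sketch using scipy, with the coefficient and standard error taken from the estimated equation above:

```python
# One-sided t test for beta_exper in the WAGE1 example.
from scipy import stats

beta_hat, se = 0.0041, 0.0017      # exper coefficient and its standard error
n, k = 526, 3                      # 3 slope parameters
df = n - k - 1                     # 522 degrees of freedom

t_stat = beta_hat / se                  # ~ 2.41
c_01 = stats.t.ppf(0.99, df)            # 1% one-sided critical value, ~ 2.33

print(round(t_stat, 2), t_stat > c_01)  # reject H0 at the 1% level
```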


4.2 Testing against Two-sided Alternatives

It is common to test H0: β j = 0


against a two-sided alternative H1: β j ≠ 0.

Under H1, xj has a ceteris paribus effect on y without specifying


whether the effect is positive or negative.

Rejection rule: |t_β̂j| > c   (4.11)

For a two-tailed test, the critical value c is chosen based on α/2 in each tail.
If H0 is rejected at the α % level, we say that “xj is statistically
significant (or statistically different from 0) at the α % level.”

If not, xj is statistically insignificant at the α % level.



4.2 Testing against Two-sided Alternatives

Population model: y = β0 + β1x1 + β2x2 + … + βkxk + u
Null hypothesis H0: βj = 0
Alternative H1: βj ≠ 0

[Figure: t distribution with rejection regions (area α/2 each) beyond −c
and c, and the fail-to-reject region (area 1 − α) between them.]

4.2 Testing against Two-sided Alternatives

E.g., GPA1. Estimate a model explaining college GPA, with the average
number of lectures missed per week (skipped) as an additional regressor.

colGPA^ = 1.39 + .412 hsGPA + .015 ACT − .083 skipped
          (.33)  (.094)       (.011)     (.026)
n = 141, R² = .234

With 137 df, the 5% critical value is about 1.96 and the 1% critical value about 2.58.

The t statistics on hsGPA and skipped exceed 2.58 in absolute value,
so hsGPA and skipped are statistically significant at the 1% level and
have the expected signs. There is no need for a one-tailed test.

Against the one-sided alternative H1: β2 > 0 (the coefficient on ACT),
ACT is significant at the 10% level but not at the 5% level.
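These claims can be checked with scipy, using the estimates and standard errors from the equation above (df = 141 − 3 − 1 = 137):

```python
# Checking the t statistics in the colGPA example.
from scipy import stats

df = 137
t_hsGPA   = 0.412 / 0.094    # ~ 4.38
t_ACT     = 0.015 / 0.011    # ~ 1.36
t_skipped = -0.083 / 0.026   # ~ -3.19

c_two_01 = stats.t.ppf(1 - 0.01 / 2, df)   # two-sided 1% critical value, ~ 2.61
c_one_10 = stats.t.ppf(1 - 0.10, df)       # one-sided 10% critical value, ~ 1.29
c_one_05 = stats.t.ppf(1 - 0.05, df)       # one-sided 5% critical value, ~ 1.66

print(abs(t_hsGPA) > c_two_01, abs(t_skipped) > c_two_01)  # significant at 1%
print(c_one_10 < t_ACT < c_one_05)   # ACT: significant at 10%, not at 5%
```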

4.2 Testing Other Hypotheses About βj

We sometimes want to test whether β j is equal to a constant.


Null H0: β j = aj
where aj is a hypothesized value of β j .

t statistic: t = (β̂j − aj) / se(β̂j) ~ tn−k−1


4.2 Computing p-Values for t Tests

After stating H1, we choose a significance level α, which is arbitrary.


There is no “correct” significance level.

Rather than testing at different significance levels, we can answer the
question: given the t statistic, what is the smallest significance level at
which H0 would be rejected?
This level is known as the p-value for the test.

P-value for two-sided tests: P(|T| > |t|) (4.15)


where T denotes a t distributed random variable with n – k – 1 df,
and t denotes the calculated test statistic.


4.2 Computing p-Values for t Tests

P-value is the probability of observing the calculated t if H0 is true.


Small p-values are evidence against H0;
Large p-values provide little evidence against H0.

If α denotes the significance level of the test, then:
if p-value < α, H0 is rejected at the 100·α% level;
if p-value ≥ α, H0 is not rejected at the 100·α% level.
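For instance, the two-sided p-value for the exper coefficient in the earlier wage example (t ≈ 2.41 with 522 df) can be computed with scipy:

```python
# Two-sided p-value, P(|T| > |t|), for t ~ 2.41 with 522 df.
from scipy import stats

t, df = 2.41, 522
p_two = 2 * stats.t.sf(abs(t), df)   # two tail areas
print(round(p_two, 3))               # about 0.016

# Decision at common levels: reject when p-value < alpha
print(p_two < 0.05, p_two < 0.01)    # rejected at 5%, not at 1%
```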


4.3 Confidence Intervals

Under CLM assumptions, we can construct a confidence interval


(CI) for the β j , using the same critical value as was used for a two-
sided test.

As (β̂j − βj)/se(β̂j) ~ tn−k−1, simple manipulation leads to a 100(1 − α)%
confidence interval:
β̂j ± c · se(β̂j)   (4.16)
where c is the (1 − α/2)th percentile of the tn−k−1 distribution,
the lower bound is β̲j ≡ β̂j − c · se(β̂j), and
the upper bound is β̄j ≡ β̂j + c · se(β̂j).
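Applying (4.16) to the exper coefficient from the earlier wage example (β̂ = .0041, se = .0017, 522 df) gives a 95% CI; a quick sketch:

```python
# 95% confidence interval for beta_exper, following (4.16).
from scipy import stats

beta_hat, se, df = 0.0041, 0.0017, 522
c = stats.t.ppf(0.975, df)                # ~ 1.96 for large df

lower = beta_hat - c * se
upper = beta_hat + c * se
print(round(lower, 4), round(upper, 4))   # roughly (.0008, .0074)
```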


4.3 Confidence Intervals

Meaning: if many random samples were obtained, with β̲j and β̄j
computed each time, then βj would lie in the interval (β̲j, β̄j) for
100(1 − α)% of the samples.
For any particular sample, we have no guarantee that βj is contained in the computed CI.

Test the null H0: β j = aj


against the alternative H1: β j ≠ aj
H0 is rejected at the α% significance level if and only if aj is not in
the 100(1 − α)% confidence interval.


4.4 Testing a Linear Combination of βj

E.g., TWOYEAR. We want to compare the returns to education at


junior colleges and four-year colleges, called “universities”.
log(wage) = β0 + β1 jc + β2 univ + β3 exper + u   (4.17)
where jc is number of years attending a two-year college and
univ is number of years at a four-year college.

Null hypothesis H0: β1 = β 2


meaning a year at a junior college is worth a year at a university.
Alternative H1: β1 < β2

We can use the t statistic t = (β̂1 − β̂2) / se(β̂1 − β̂2), but obtaining
the standard error in the denominator is difficult.


4.4 Testing a Linear Combination of βj

We can let θ1 = β1 − β2, so β1 = θ1 + β2. Substituting into (4.17), the model becomes:

log(wage) = β0 + θ1 jc + β2 (jc + univ) + β3 exper + u   (4.25)

We can then use the t statistic or p-value on θ̂1 to test H0: θ1 = 0,
which is equivalent to H0: β1 = β2.
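A numeric sketch of this reparameterization (simulated data with made-up coefficient values, not the TWOYEAR data): fitting the model in jc and (jc + univ) recovers θ1 = β1 − β2 as the coefficient on jc.

```python
# Verifying the theta reparameterization on simulated data.
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
jc = rng.poisson(1.0, n).astype(float)
univ = rng.poisson(2.0, n).astype(float)
exper = rng.uniform(0, 30, n)
u = rng.normal(0, 0.3, n)

b0, b1, b2, b3 = 1.5, 0.05, 0.08, 0.02        # hypothetical population values
y = b0 + b1 * jc + b2 * univ + b3 * exper + u

# Reparameterized regressors: intercept, jc, (jc + univ), exper
X = np.column_stack([np.ones(n), jc, jc + univ, exper])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

theta1_hat = coef[1]
print(round(theta1_hat, 3))   # close to b1 - b2 = -0.03
```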


4.5 Testing Multiple Linear Restrictions: The F Test

So far we have tested a single linear restriction (e.g., β3 = 0 or β1 = β2)


by using the t statistic.

Frequently, we wish to test multiple hypotheses about our


parameters.

A typical example is testing “exclusion restrictions” – whether a


group of parameters are all equal to 0.
Null H0: βk−q+1 = 0, …, βk = 0   (4.29)
Alternative H1: H0 is not true (4.30)

We can’t just check each t statistic separately because the q


parameters may be individually insignificant but jointly significant at
a given level.

4.5 Testing Multiple Linear Restrictions: The F Test

MLB1: A model explains major league baseball players’ salaries:


log(salary) = β0 + β1 years + β2 gamesyr + β3 bavg + β4 hrunsyr + β5 rbisyr + u
where salary is the 1993 total salary,
years is years in the league,
gamesyr is average games played per year,
bavg is career batting average,
hrunsyr is home runs per year,
rbisyr is runs batted in per year.

Suppose we want to test the null hypothesis that bavg, hrunsyr, and
rbisyr have no effect on salary.
H0: β3 = 0, β4 = 0, β5 = 0


4.5 Testing Multiple Linear Restrictions: The F Test

To do the test, we need to estimate the “restricted model” without


xk-q+1, …, xk, and the “unrestricted model” with all x’s included.

Intuitively, we want to know if the change in SSR is big enough to


warrant inclusion of xk-q+1, …, xk.
F statistic: F ≡ [(SSRr − SSRur)/q] / [SSRur/(n − k − 1)]   (4.37)
where r denotes the restricted model and ur the unrestricted model,
q = number of restrictions = dfr − dfur,
n − k − 1 = dfur.

The F statistic is always nonnegative since SSRr ≥ SSRur.


4.5 Testing Multiple Linear Restrictions: The F Test

The F statistic measures the relative increase in SSR when moving
from the unrestricted to the restricted model.

To decide if the increase in SSR is “big enough” to reject H0, we


need to know the sampling distribution of the F statistic.

Under H0 (and the CLM assumptions):


F ~ Fq,n – k – 1

We reject H0 at the α% significance level if F > c, where c is the
α% critical value of the Fq,n−k−1 distribution (the value with
upper-tail area α).


4.5 Testing Multiple Linear Restrictions: The F Test

Rejection rule: H0 is rejected at the α% significance level if F > c.

[Figure: F distribution f(F) with the fail-to-reject region (area 1 − α)
between 0 and the critical value c, and the rejection region (area α)
to the right of c.]

4.5 Testing Multiple Linear Restrictions: The F Test

E.g., MLB1.DTA. Unrestricted model:

log(salary)^ = 11.10 + .0689 years + .0126 gamesyr + .00098 bavg
               (0.29)  (.0121)       (.0026)         (.0011)
             + .0144 hrunsyr + .0108 rbisyr
               (.0161)         (.0072)
n = 353, SSR = 183.186, R² = .6278

Restricted model:
log(salary)^ = 11.22 + .0713 years + .0202 gamesyr
               (.11)   (.0125)       (.0013)
n = 353, SSR = 198.311, R² = .5971

F = [(198.311 − 183.186)/3] / [183.186/347] ≈ 9.55 > 3.78 (the 1% critical value)
Reject H0.
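The arithmetic of this F test can be checked directly from (4.37) with the SSRs reported above:

```python
# F test of the three exclusion restrictions in the MLB salary example.
from scipy import stats

ssr_r, ssr_ur = 198.311, 183.186
n, k, q = 353, 5, 3
df = n - k - 1                       # 347

F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df)
c_01 = stats.f.ppf(0.99, q, df)      # 1% critical value, ~ 3.8

print(round(F, 2), F > c_01)         # reject H0 at the 1% level
```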

4.5 The R² form of the F statistic

Because the SSRs can be very large and unwieldy, the R² form of
the formula is useful.

We use the fact that R² = 1 − SSR/SST, or SSR = SST(1 − R²), and
substitute for SSRr and SSRur in (4.37):
F ≡ [(R²ur − R²r)/q] / [(1 − R²ur)/(n − k − 1)]   (4.41)
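Plugging the R² values from the MLB example into (4.41) reproduces, up to rounding, the same F as the SSR form:

```python
# R-squared form of the F statistic, MLB salary example.
r2_ur, r2_r = 0.6278, 0.5971
n, k, q = 353, 5, 3

F = ((r2_ur - r2_r) / q) / ((1 - r2_ur) / (n - k - 1))
print(round(F, 2))   # ~ 9.54, matching the SSR-based value up to rounding
```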


4.5 Overall Significance of a Regression

A special case of exclusion restrictions is to test if none of the


explanatory variables has an effect on y:
H0: β1 = β2 = … = βk = 0

Since the R² from a model with only an intercept is zero, the F
statistic is simply

F ≡ (R²/k) / [(1 − R²)/(n − k − 1)]

since here the number of restrictions is q = k.
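For example, the overall significance F for the earlier colGPA regression (R² = .234, n = 141, k = 3):

```python
# Overall significance test: H0 is that all three slopes are zero.
from scipy import stats

r2, n, k = 0.234, 141, 3
df = n - k - 1                       # 137

F = (r2 / k) / ((1 - r2) / df)
p = stats.f.sf(F, k, df)

print(round(F, 2), p < 0.01)         # strongly reject H0
```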

