Econometrics I - Lecture 5 (Wooldridge) Color

The document provides an overview of multiple regression analysis, focusing on the assumptions and statistical methods used for hypothesis testing, including the use of OLS estimators and the normality of population errors. It details the process for testing single and multiple linear restrictions using t-tests and F-tests, along with the formulation of confidence intervals and p-values. Additionally, it emphasizes the importance of understanding the significance of regression models and the implications of statistical results in econometrics.


ESOM Program
ECONOMETRICS I
National Economics University
2019
MA Hai Duong

MULTIPLE REGRESSION:
INFERENCE
RECAP.
In the multiple regression model
y = β0 + β1 x1 + β2 x2 + … + βk xk + u,
the OLS estimator β̂j is an unbiased estimator of βj under fairly minimal assumptions, namely that:
1. the population model is linear in parameters;
2. the conditional expectation of the true errors given all explanatory variables is zero;
3. the sample is randomly selected; and
4. the explanatory variables are not perfectly collinear.
If we add the assumption of no conditional heteroskedasticity (homoskedasticity) to the above, then OLS is the B.L.U.E. (Best Linear Unbiased Estimator) for β.
RECAP.
Under these assumptions, the variance-covariance matrix of β̂ conditional on X is σ² (X′X)⁻¹, which we estimate using
σ̂² (X′X)⁻¹,
where:
σ̂² = SSR / (n - k - 1).
The standard errors of the coefficients reported by all statistical packages are the square roots of the diagonal elements of this estimated variance-covariance matrix.
Statistical packages also report the standard error of the regression, which is σ̂, the square root of σ̂².
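These formulas can be illustrated with a short NumPy sketch (my own, not from the slides; the simulated data and every variable name are made up):

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate a small dataset: y = 1 + 2*x1 - 0.5*x2 + u, u ~ N(0, 1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)

# Design matrix with an intercept column
X = np.column_stack([np.ones(n), x1, x2])
k = X.shape[1] - 1  # number of slope parameters

# OLS: beta_hat = (X'X)^{-1} X'y
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y

# sigma_hat^2 = SSR / (n - k - 1)
resid = y - X @ beta_hat
sigma2_hat = (resid @ resid) / (n - k - 1)

# Estimated variance-covariance matrix and its diagonal square roots
vcov = sigma2_hat * XtX_inv
se = np.sqrt(np.diag(vcov))

print(beta_hat)  # close to the true values (1, 2, -0.5)
print(se)
```

The printed standard errors are exactly what a statistical package would report next to each coefficient.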
LECTURE OUTLINE
Sampling Distributions of the OLS Estimators (Textbook reference 4-1)
Testing Hypotheses About a Single Population Parameter (Textbook reference 4-2(a-f))
Confidence Intervals (Textbook reference 4-3)
Testing Single Linear Restrictions (Textbook reference 4-4)
Testing a Joint Hypothesis of Several Restrictions - The F Test (Textbook reference 4-5a)
An Important Special Case: Testing the Overall Significance of a Regression Model (Textbook reference 4-5e)
Application of the t-Test to Testing a Single Linear Restriction Involving Several Parameters (Textbook reference 4-4)
SAMPLING DISTRIBUTIONS OF
THE OLS ESTIMATORS
We want to test hypotheses about the βj. This means we hypothesize that a population parameter is a certain value, then use the data to determine whether the hypothesis is likely to be false.
Example: Hypothesis: Missing lectures has no causal effect on final exam performance.
The null hypothesis is H0: βmissed = 0, where βmissed is the population coefficient on the number of lectures missed.
SAMPLING DISTRIBUTIONS OF
THE OLS ESTIMATORS
Assumptions we have made so far gave us
E(β̂ | X) = β
and
Var(β̂ | X) = σ² (X′X)⁻¹.
But hypothesis testing requires the full sampling distributions of the β̂j.
So, we need more.
Assumption MLR.6 or E.5 (Normality): Conditional on X, the population errors are normally distributed:
u | X ~ Normal(0, σ²).
ASSUMPTION MLR.6 OR E.5
(NORMALITY)
The random sampling assumption implies that the population errors are independent of each other.
The zero conditional expectation assumption implies that the population errors have mean zero.
The no-heteroskedasticity assumption implies that the population errors have constant variance σ².
Assumption MLR.6 can therefore be expressed as: conditional on the explanatory variables, the population errors are i.i.d. Normal(0, σ²).
Similarly, E.5 can be expressed as u | X ~ Normal(0, σ² I_n).
Assumptions MLR.1 to MLR.6 are called the Classical Linear Model (CLM) assumptions.
THE CLASSICAL LINEAR
MODEL

MLR.6: u_i | x_i ~ i.i.d. Normal(0, σ²)
E.5: u | X ~ Normal(0, σ² I_n)
SAMPLING DISTRIBUTIONS OF
THE OLS ESTIMATORS
The beauty of the Normal distribution is that any linear combination of Normal random variables is also Normally distributed.
Recall that:
β̂ = β + (X′X)⁻¹ X′u,
which shows that, conditional on X, β̂ is a linear combination of normally distributed errors, and therefore we obtain the following result:
Under the CLM assumptions,
β̂ | X ~ Normal(β, σ² (X′X)⁻¹).
SAMPLING DISTRIBUTIONS OF
THE OLS ESTIMATORS
That is, for j = 0, 1, …, k,
β̂j ~ Normal(βj, Var(β̂j)),
where:
Var(β̂j) = σ² [(X′X)⁻¹]jj
(all these elements are on the main diagonal of the variance-covariance matrix).
Therefore, the standardized estimator
(β̂j - βj) / sd(β̂j) ~ Normal(0, 1)
(a new variable that has the standard normal distribution).
TESTING HYPOTHESES ABOUT A
SINGLE POPULATION PARAMETER
We cannot directly use this result to test hypotheses about βj because sd(β̂j) depends on σ, which is unknown.
But we have σ̂ as an estimator of σ. Using σ̂ in place of σ gives us the standard error, se(β̂j).
But replacing σ (an unknown constant) with σ̂ (an estimator that varies across samples) takes us from the standard normal to the t distribution (Student's t distribution).
Under the CLM assumptions,
(β̂j - βj) / se(β̂j) ~ t(n - k - 1).
TESTING HYPOTHESES
(CONT.)
The t distribution also has a bell shape, but is more spread out (has fatter tails) than the standard normal. It gets more similar to the standard normal as its degrees of freedom increase.
TESTING HYPOTHESES
(CONT.)
We use the result on the t distribution to test a null hypothesis about a single βj.
Most routinely we use it to test whether, controlling for all other x's, xj has no partial effect on y:
H0: βj = 0,
for which we use the t statistic (or t ratio),
t = β̂j / se(β̂j),
which is computed automatically by most statistical packages for each estimated coefficient.
But to conduct the test we need an alternative hypothesis.
TESTING HYPOTHESES
(CONT.)
The alternative hypothesis can be one-sided, such as H1: βj < 0 or H1: βj > 0, or it can be two-sided: H1: βj ≠ 0.
The alternative determines what kind of evidence is considered legitimate against the null. For example, with H1: βj < 0 we only reject the null if we find evidence that βj is negative. We are not interested in any evidence that βj is positive, and we will not reject the null on the basis of such evidence.
Example: For the effect of missing lectures on final performance, the alternative hypothesis is that missing lectures has a negative effect on your final score, even after controlling for how smart you are. We are not interested in any evidence that missing lectures improves final performance.
With a two-sided alternative, we take any evidence that βj may not be zero, whether positive or negative, as legitimate.
TESTING HYPOTHESES
(CONT.)
We also need to specify the size or significance level of the test, α, which is the probability that we wrongly reject the null when it is true (Type I error).
Using the t distribution, the significance level, and the type of alternative, we determine the critical value that defines the rejection region.
TESTING HYPOTHESES
(CONT.)

[Figure: the t distribution, with the rejection region in the tail(s); values of t outside the rejection region mean we fail to reject the null.]
TESTING HYPOTHESES
(CONT.)
If t (the value of the t statistic in our sample) falls in the rejection region, we reject the null hypothesis.
When we reject the null, we say that β̂j is statistically significant at the chosen level (e.g. the 5% level).
When we fail to reject the null, we say that β̂j is not statistically significant at that level.
TESTING HYPOTHESES
(CONT.)
We can use a t test to test the null hypothesis:
H0: βj = aj,
where aj is a constant, not necessarily zero. Under the CLM assumptions, we know that:
t = (β̂j - aj) / se(β̂j) ~ t(n - k - 1) if H0 is true.
So we test the null using this statistic.
Percentiles of the t distribution with various degrees of freedom are given in Table G.2 on page 745 of the 6th edition.
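As a sketch of this test (my own illustration; every number below is made up), the t statistic and the two-sided critical value can be computed with SciPy instead of the statistical tables:

```python
from scipy import stats

# Hypothetical numbers for illustration only
beta_hat = 0.412   # estimated coefficient
se = 0.094         # its standard error
a = 0.5            # hypothesized value under H0: beta_j = a
df = 86            # degrees of freedom, n - k - 1

# t statistic for H0: beta_j = a
t_stat = (beta_hat - a) / se

# Two-sided 5% critical value: the 97.5th percentile of t(df)
c = stats.t.ppf(0.975, df)

reject = abs(t_stat) > c
print(t_stat, c, reject)
```

Here |t| is well below the critical value, so this (hypothetical) null would not be rejected at the 5% level.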
P-VALUE
Alternatively, the probability value (p-value) can be used to make the decision in a hypothesis test.
P-value formulas:
two-sided test: p-value = 2 P(T > |t|);
right-tailed test: p-value = P(T > t);
left-tailed test: p-value = P(T < t).
Decision rule: if p-value < α, reject the null; if p-value ≥ α, do not reject the null.
These values are automatically calculated by Eviews (only for the two-sided test). In one-sided cases, we can use half of the reported p-value (provided the estimate lies on the side the alternative predicts).
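A minimal sketch of these formulas (my own; the sample t statistic and degrees of freedom are made up) using SciPy:

```python
from scipy import stats

t_stat = -2.31   # hypothetical sample t statistic
df = 60          # degrees of freedom

# Two-sided p-value: P(|T| > |t|) = 2 * P(T > |t|)
p_two_sided = 2 * stats.t.sf(abs(t_stat), df)

# One-sided (left-tailed, H1: beta_j < 0) p-value: P(T < t)
p_left = stats.t.cdf(t_stat, df)

# Since the t distribution is symmetric and t_stat lies on the side
# the alternative predicts, p_left is half of p_two_sided
print(p_two_sided, p_left)
```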
P-VALUE CALCULATED BY
EVIEWS

[Figure: Eviews regression output showing the reported (two-sided) p-values.]
STEPS INVOLVED IN STATISTICAL
VERIFICATION OF A QUESTION
1. Formulating H0 and H1 for the question
2. Determining the appropriate test statistic and stating its distribution under H0
3. Determining the rejection region with reference to the null distribution and the desired level of significance of the test (α), using statistical tables or software
4. Calculating the test statistic from the regression results
5. Arriving at a conclusion: rejecting H0 if the value of the test statistic falls inside the rejection region, and not rejecting otherwise
6. Explaining the conclusion in the context of the question
CONFIDENCE INTERVALS
Another way to use classical statistical testing is to construct a confidence interval, using the same critical value as was used for a two-sided test.
A 100(1 - α)% confidence interval is defined as
β̂j ± c · se(β̂j),
where c is the 100(1 - α/2) percentile of a t(n - k - 1) distribution.
The interpretation of a confidence interval is that, across repeated samples, the interval will cover the true parameter with probability 1 - α.
If the confidence interval does not contain zero, we can deduce that β̂j is statistically significant at the α level.
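A minimal sketch of the interval (my own illustration; the estimate, standard error, and degrees of freedom are made up):

```python
from scipy import stats

# Hypothetical estimates for illustration
beta_hat = 0.412
se = 0.094
df = 86
alpha = 0.05

# c is the 100*(1 - alpha/2) percentile of t(df)
c = stats.t.ppf(1 - alpha / 2, df)
ci = (beta_hat - c * se, beta_hat + c * se)
print(ci)

# Zero lies outside the interval, so this (hypothetical) coefficient
# is statistically significant at the 5% level
significant = not (ci[0] <= 0 <= ci[1])
```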
TESTING MULTIPLE LINEAR
RESTRICTIONS: THE F TEST
Sometimes we want to test multiple restrictions. For example, in the regression model
y = β0 + β1 x1 + β2 x2 + β3 x3 + β4 x4 + u,
we are always interested in the overall significance of the model, obtained by testing
H0: β1 = β2 = β3 = β4 = 0;
or we may be interested in testing that some subset of the coefficients is jointly zero, for example
H0: β3 = 0, β4 = 0;
or even a more exotic hypothesis involving linear combinations of the parameters, such as
H0: β1 + β2 = 1, β3 = β4.
TESTING MULTIPLE LINEAR
RESTRICTIONS (CONT.)
Question: The first null involves four restrictions; how many restrictions do the second and third nulls involve?
The alternative can only be that at least one of these restrictions is not true.
The test statistic involves estimating two equations, one without the restrictions (the unrestricted model) and one with the restrictions imposed (the restricted model), and seeing how much their sums of squared residuals differ.
This is particularly easy for testing exclusion restrictions, like the first two nulls on the previous slide.
TESTING MULTIPLE LINEAR
RESTRICTIONS (CONT.)
For example, consider H0: β3 = 0, β4 = 0 (note: 2 restrictions).
The alternative is H1: at least one of β3 or β4 is not zero.
The unrestricted model is:
y = β0 + β1 x1 + β2 x2 + β3 x3 + β4 x4 + u, (UR)
and the restricted model is:
y = β0 + β1 x1 + β2 x2 + u. (R)
TESTING MULTIPLE LINEAR
RESTRICTIONS (CONT.)
The test statistic is
F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)] ~ F(q, n - k - 1) under H0,
where q is the number of restrictions, and F(q, n - k - 1) is an F distribution with (q, n - k - 1) degrees of freedom. The value q is called the numerator df, and n - k - 1 is called the denominator df.
Note: The F statistic is always positive (recall Statistics in E&B) - so if you happen to get a negative number, you should realize that you must have made a mistake.
The 10%, 5% and 1% critical values of the F distribution for various numerator and denominator dfs are given in Tables G.3a, G.3b and G.3c of the textbook.
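The F statistic and its critical value can be sketched as follows (my own illustration; the SSR values and degrees of freedom are made up):

```python
from scipy import stats

# Hypothetical illustration: testing q = 2 exclusion restrictions
ssr_r = 183.6    # SSR of the restricted model
ssr_ur = 165.4   # SSR of the unrestricted model
q = 2            # number of restrictions (numerator df)
df = 90          # denominator df, n - k - 1 of the unrestricted model

# F = [(SSR_r - SSR_ur)/q] / [SSR_ur/(n - k - 1)]
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df)

# 5% critical value of F(q, df) in place of Table G.3b
c = stats.f.ppf(0.95, q, df)
print(F, c, F > c)
```

Imposing the restrictions can only increase the SSR, which is why the statistic is always non-negative.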
TESTING MULTIPLE LINEAR
RESTRICTIONS (CONT.)
Suppose we know q and n - k - 1. Then from Table G.3b we read off the 5% critical value c.
We reject the null if F > c, and do not reject otherwise.
A USEFUL FORMULATION OF
THE F TEST
The SST of the restricted and the unrestricted models are the same (why?), therefore we have:
SSR = SST (1 - R²) for each model.
This allows us to write the F statistic as:
F = [(R²_ur - R²_r) / q] / [(1 - R²_ur) / (n - k - 1)],
which is more useful because R² is reported more often than SSR.
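The R-squared form can be sketched the same way (my own illustration; the R-squared values are made up):

```python
# Hypothetical R-squared values for illustration
r2_ur = 0.352   # unrestricted model
r2_r = 0.294    # restricted model
q = 2           # number of restrictions
df = 90         # n - k - 1 of the unrestricted model

# Equivalent R-squared form of the F statistic,
# using SSR = SST * (1 - R^2) with a common SST
F = ((r2_ur - r2_r) / q) / ((1 - r2_ur) / df)
print(F)
```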
TEST FOR OVERALL
SIGNIFICANCE OF A MODEL
This is the F statistic that is reported in the Eviews output any time we estimate a regression.
It is for the special null hypothesis that all slope parameters are zero:
H0: β1 = β2 = … = βk = 0,
meaning that none of the explanatory variables helps explain y.
If we cannot reject this null, our model is useless.
The test statistic is
F = [R² / k] / [(1 - R²) / (n - k - 1)] ~ F(k, n - k - 1) under H0.
Eviews also produces the p-value of this test ("Prob(F-statistic)" in the regression output).
EXAMPLE FOR THE F TEST
Using data from VHLSS 2010 to determine the effect of the number of highly educated workers on the number of workers in a locality, we use the estimation results:
(UR)
(R)
where the dependent variable is the number of workers in the locality in 2010 (thousands of people), and the regressors include the total population of the locality in 2010 (thousands of people), the population density of the locality in 2010 (people per square km), and the numbers of university and college students in the locality.
EXAMPLE FOR THE F TEST

In this sample, we have:
Level of significance of the test: 5%
The critical value from Table G.3b lies between 3.15 and 3.23, depending on the denominator df. In such cases we prefer to be conservative and choose the closest degrees of freedom less than what we are after.
EXAMPLE FOR THE F TEST
The F statistic is greater than both critical values ⇒ there is strong evidence that the effect of the number of highly educated workers on the number of workers in the locality is statistically significant, after controlling for the effects of the total population and the population density of the locality.
The F test for overall significance of both regressions shows that each model has at least one explanatory variable that is significant in explaining the number of workers in the locality.
You can do these tests by yourself, can't you?
TESTING A SINGLE
RESTRICTION THAT
INVOLVES MORE THAN ONE
PARAMETER
In the regression model
y = β0 + β1 x1 + β2 x2 + … + βk xk + u,
sometimes we want to test hypotheses such as:
H0: β1 = β2 (Null 1)
or, as another example, we may want to test
H0: β1 + β2 = 1 (Null 2).
Both hypotheses state that a single linear combination of the parameters is equal to a constant (what is that constant for Null 1?).
TESTING A SINGLE
RESTRICTION THAT
INVOLVES MORE THAN ONE
PARAMETER
We could use the F test as usual; if that were the whole story, there would be no good reason to study this case separately.
However, with a single hypothesis, we can have one-sided alternatives. For example, for Null 1 we can have
H1: β1 > β2 (Alt 1)
and for Null 2 we can have
H1: β1 + β2 < 1 (Alt 2).
The F test cannot be used for one-sided alternatives.
We can re-formulate the problem and use a t-test.
TESTING A SINGLE
RESTRICTION THAT
INVOLVES MORE THAN ONE
PARAMETER
Consider Null 1. Define θ = β1 - β2.
We can write Null 1 and Alt 1 as
H0: θ = 0, H1: θ > 0.
The estimator for θ is
θ̂ = β̂1 - β̂2,
and se(θ̂) = sqrt(Var(β̂1) + Var(β̂2) - 2 Cov(β̂1, β̂2)) (too messy to calculate by hand).
Under the CLM assumptions, conditional on X each β̂j is Normally distributed, so θ̂, a linear combination of them, is also Normally distributed.
We can use the above to come up with a t-test for H0, or use a cool trick.
REPARAMETERISATION: A
COOL USEFUL TRICK
Consider Null 1 and Alt 1,
and set θ = β1 - β2, so that β1 = θ + β2.
Replace β1 in the model with θ + β2 and rearrange:
y = β0 + (θ + β2) x1 + β2 x2 + … + u = β0 + θ x1 + β2 (x1 + x2) + … + u,
which shows that if we regress y on a constant, x1, and (x1 + x2), the OLS estimate of the coefficient of x1 will be θ̂, and its standard error is directly given to us without any messy calculations! Amazing!
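The trick can be checked by simulation (a sketch of my own; the coefficients and sample size are made up). With true β1 = β2, regressing y on x1 and (x1 + x2) should give a coefficient on x1 close to zero, with its standard error computed for free:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate y = 1 + 0.8*x1 + 0.8*x2 + u, so the true theta = b1 - b2 = 0
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.8 * x2 + rng.normal(size=n)

# Reparameterised design: constant, x1, and (x1 + x2)
X = np.column_stack([np.ones(n), x1, x1 + x2])

# OLS with standard errors, as in the recap slide
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
resid = y - X @ b
sigma2 = (resid @ resid) / (n - X.shape[1])
se = np.sqrt(sigma2 * np.diag(XtX_inv))

theta_hat, se_theta = b[1], se[1]
t_stat = theta_hat / se_theta  # t statistic for H0: theta = 0
print(theta_hat, se_theta, t_stat)
```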
REPARAMETERISATION:
EXAMPLE
Test that, controlling for IQ, one additional year of education has the same proportional effect on wage as one additional year of experience, against the alternative that it has a larger effect. In a model of the form
log(wage) = β0 + β1 EDUC + β2 EXPER + β3 IQ + u,
the null and alternative are H0: β1 = β2 and H1: β1 > β2.
REPARAMETERISATION:
EXAMPLE
Note that in the reparameterised model the coefficient of EDUC is θ = β1 - β2, while EXPER is replaced by (EDUC + EXPER).
θ̂ is the estimated parameter of EDUC in the reparameterised model.
t = 5.59, which is larger than 1.65, the 5% one-sided critical value of the t distribution.
Therefore, we reject the null.
Conclusion: After controlling for IQ, one additional year of education has a larger proportional effect on wage than one additional year of experience.
SUMMARY
In this lecture we learned that if we add the assumption that the population errors are normally distributed, then the OLS estimator, conditional on X, is normally distributed.
Using this, we learned how to test hypotheses on any of the slope parameters, or on a single linear combination of several of them, by using a t-test.
We learned that we can reparameterise the population model to enable us to test a single linear combination of parameters easily.
We also learned how to form confidence intervals for any of the slope parameters.
We learned that hypothesis tests could also be done with the help of confidence intervals, or alternatively using p-values.
SUMMARY
Further, we can test multiple linear restrictions using an F test.
The test statistic is based on the sums of squared residuals of the unrestricted and the restricted models (SSR_ur and SSR_r):
F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)] ~ F(q, n - k - 1) under H0.
Using the significance level of the test, we can find the critical value c of the test from the F table.
We reject the null if F > c, and do not reject otherwise.
The F test for overall significance of a regression model is an important special case whose statistic is provided by all statistical packages.
Finally, we learned how to test general linear restrictions.
TEXTBOOK EXERCISES
Problems 5, 6 (page 142)
Problems 8, 9 (page 143)
Problems 12, 13 (page 145)

COMPUTER EXCERCISES
C8 (page 147)
C11, C12 (page 148)

43
THANK YOU FOR YOUR
ATTENTION - Q & A
