
Hypothesis testing and confidence intervals

Andrés Camacho Murillo, PhD


Update 24 Feb 2022

1. Hypothesis testing

Hypothesis testing is fundamental in econometric studies: it allows us to identify whether the estimated results are significantly different from zero or from some other hypothesized value (that is, whether we can reject the null hypothesis). We can test a single hypothesis or joint hypotheses.

We have the following equation:


𝒚𝒊 = 𝜷𝟎 + 𝜷𝟏 𝒙𝟏𝒊 + 𝜷𝟐 𝒙𝟐𝒊 + 𝜷𝟑 𝒙𝟑𝒊 + 𝒖𝒊

For this equation, typical hypotheses (restrictions, parameters involved, and the appropriate test) are:

- H0: β1 = 0 (1 restriction with 1 parameter): t test or F test (single test)
- H0: β1 = 0, β2 = 0, β3 = 0, i.e. H0: β1 = β2 = β3 = 0 (3 restrictions with 3 parameters): F test (joint test)
- H0: β2 = 0, β3 = 0 (2 restrictions with 2 parameters): F test (joint test)
- H0: β1 + β2 = 1 (1 restriction with 2 parameters): t test, Wald test or F test (single test)
- H0: β1 = 0 and β1 + β2 = 0 (2 restrictions): F test (joint test)
- Matrix form: H0: 2β1 + 1β2 = 2 and 1β1 = 0 can be written as R · B = r, with

  R = [0 2 1 0; 0 1 0 0],  B = (β0, β1, β2, β3)′,  r = (2, 0)′,

  and is tested with an F test (joint test).
A. Testing for a single hypothesis (using the t test)
We can apply a left-tail, a two-tail, or a right-tail test. In each case the null hypothesis is H0: βk = c, and the alternative is:

- Left-tail test: H1: βk < c
- Two-tail test: H1: βk ≠ c
- Right-tail test: H1: βk > c

Our t test is as follows:

t = (β̂k − βk) / se(β̂k)

Where:
β̂k is the estimated beta (obtained from the estimated regression)
βk is the hypothesized value of beta (the value under H0)
se(β̂k) is the standard error of the estimated beta

Graphically we have:

A two-tail test (βk ≠ c)

H0: βk = c
H1: βk ≠ c

A right-tail test (βk > c)

H0: βk = c
H1: βk > c

A left-tail test (βk < c)

H0: βk = c
H1: βk < c
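
The critical values ("t table" values) used in the examples below can be reproduced numerically. This is a minimal sketch in Python using SciPy (not part of the original notes, which use Stata); the degrees of freedom are those of the first example (n = 526, k = 3).

from scipy import stats

df = 526 - 3 - 1     # degrees of freedom: n - k - 1 = 522
alpha = 0.01         # significance level

t_two = stats.t.ppf(1 - alpha / 2, df)    # two-tail critical value, ~2.59: reject if |t| > t_two
t_right = stats.t.ppf(1 - alpha, df)      # right-tail critical value, ~2.33: reject if t > t_right
t_left = -t_right                         # left-tail critical value, ~-2.33: reject if t < t_left

print(round(t_two, 2), round(t_right, 2), round(t_left, 2))
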
E.g. 1.
We have the following equation (it comes from Mincer's theory of human capital):

log(wage) = β0 + β1 Educ + β2 Exper + β3 Tenure + u

The estimated equation is:

log(wage) = 0.284 + 0.092 Educ + 0.004 Exper + 0.022 Tenure
            (0.11)   (0.0073)    (0.001)       (0.003)

n = 526; R² = 0.3160 (standard errors in parentheses)

Where:
wage: hourly wage of workers (e.g., workers from a given region or city)
Educ: years of education
Exper: years of work experience
Tenure: years with the current employer (job tenure)

a) Test whether the return to education is statistically different from zero at the 1% critical value
(significance level). NB: "different from zero" means that it can be greater than or less than zero.

This is a two-tail test (this is the test that Stata will perform)
𝐻0 : 𝛽1 = 0
𝐻1 : 𝛽1 ≠ 0 (two-tail test)

Recall: T table 𝑡𝛼,𝑛−𝑘−1 where 𝛼: significance level; n: # of observations; k: # of slope parameters (regressors, excluding the intercept)

T table 𝑡0.01,526−3−1 so T table 𝑡0.01,522 = 2.58


T calculated 𝑡 = (0.092 − 0)/0.0073 ≈ 12.56 (this is the value that appears in the Stata regression output; Stata computes it from the unrounded coefficient and standard error)

We reject the null hypothesis that 𝛽1 equals zero, since 12.56 > 2.58. There is enough evidence to
suggest that the return to education is different from zero at the 1% significance level.

b) Some researchers claim that the return to education is greater than 0.05 (that is, a one-year increase
in education leads to an average increase in workers' wages of 0.05, or 5%). We can test this
hypothesis using the t test at the 1% significance level. We write the hypotheses as follows:

𝐻0 : 𝛽1 = 0.05   NB: to support the researchers' claim, we need to reject this null hypothesis

𝐻1 : 𝛽1 > 0.05   (right-tail test)

T table 𝑡0.01,526−3−1 so T table 𝑡0.01,522 = 2.33


T calculated 𝑡 = (0.092 − 0.05)/0.0073 = 5.75

We reject the null hypothesis that 𝛽1 equals 0.05, since 5.75 > 2.33. There is enough evidence to
say that the return to education is greater than 0.05 at the 1% significance level.
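
As a quick check, parts (a) and (b) can be verified with a short Python/SciPy sketch that uses only the reported coefficient and standard error (a sketch, not part of the original notes; it assumes n = 526 and k = 3 as above, and the rounded inputs give a slightly different t statistic for part (a) than the Stata output).

from scipy import stats

b_educ, se_educ = 0.092, 0.0073
df = 526 - 3 - 1

# (a) H0: beta1 = 0 vs H1: beta1 != 0, two-tail test at the 1% level
t_a = (b_educ - 0) / se_educ                 # ~12.60 with the rounded inputs
crit_a = stats.t.ppf(1 - 0.01 / 2, df)       # ~2.59
print(t_a > crit_a)                          # True -> reject H0

# (b) H0: beta1 = 0.05 vs H1: beta1 > 0.05, right-tail test at the 1% level
t_b = (b_educ - 0.05) / se_educ              # ~5.75
crit_b = stats.t.ppf(1 - 0.01, df)           # ~2.33
print(t_b > crit_b)                          # True -> reject H0
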
E.g.2:
𝒚𝒊 = 𝜷𝟎 + 𝜷𝟏 𝒙𝒊 + 𝒖𝒊

Where:
𝑦: 𝑓𝑜𝑜𝑑 𝑒𝑥𝑝𝑒𝑛𝑑𝑖𝑡𝑢𝑟𝑒 (𝑤𝑒𝑒𝑘𝑙𝑦)
𝑥: 𝑖𝑛𝑐𝑜𝑚𝑒 (𝑤𝑒𝑒𝑘𝑙𝑦)
𝑛 = 40 (Number of observations)
𝛼 = 0.05 (significance level, used to obtain the critical value): the probability of a Type I error,
i.e., a 5% chance of rejecting the null hypothesis when it is actually true.

Results from the regression:

ŷi = 40.18 + 0.70 xi
se       (0.1434)    (se is the standard error)
t        (4.88)      (t is the t-test statistic, Student's t)

T table 𝑡𝛼,𝑛−𝑘−1 where 𝛼: significance level; n: # of observations; k: # of slope parameters (regressors, excluding the intercept)

a) Test whether the impact of income on food expenditure is different from zero at the 1% critical
value.
𝐻0 : 𝛽1 = 0
𝐻1 : 𝛽1 ≠ 0 (two-tail test)

T table 𝑡0.01,40−1−1 = 2.71


T calculated 𝑡 = (0.70 − 0)/0.1434 = 4.88

We reject the null hypothesis at the 1% critical value (the t calculated is greater than the t table).
There is enough statistical evidence to suggest that changes in income lead to changes in food
expenditure at the 1% critical value (or significance level).

Note that we can also reject the null hypothesis at the 5% significance level (and at the 10% level, too):
T table 𝑡0.05,40−1−1 so T table 𝑡0.05,38 = 2.02

b) Test whether the impact of income on food expenditure is greater than zero at the 1% critical
value (as we would expect from a positive income effect, similar in spirit to the income elasticity of
demand). Note that we always write our claim as the alternative hypothesis (H1).

This is a right-tail test.


𝐻0 : 𝛽1 = 0
𝐻1 : 𝛽1 > 0 (right-tail test)

T table 𝑡0.01,40−1−1 so T table 𝑡0.01,38 = 2.42


T calculated 𝑡 = (0.70 − 0)/0.1434 = 4.88
We reject the null hypothesis at the 1% critical value. There is enough evidence to suggest that
increases in income lead to increases in food expenditure at the 1% critical value.
c) Test whether an increase in income causes an increase in food expenditure greater than $0.41 at
the 1% significance level.

This is also a right-tail test.


𝐻0 : 𝛽1 = 0.41 (or 𝐻0 : 𝛽1 ≤ 0.41)
𝐻1 : 𝛽1 > 0.41 (right-tail test)

T table 𝑡0.01,40−1−1 so T table 𝑡0.01,38 = 2.42


T calculated 𝑡 = (0.70 − 0.41)/0.1434 = 2.02

We do not reject the null hypothesis at the 1% significance level, since 2.02 < 2.42. So, there is not
enough statistical evidence to suggest that a one-unit increase in income causes an increase in food
expenditure greater than $0.41.

Note that we can reject the null hypothesis at the 5% significance level:
T table 𝑡0.05,40−1−1 so T table 𝑡0.05,38 = 1.68
T calculated 𝑡 = (0.70 − 0.41)/0.1434 = 2.02

Since 2.02 > 1.68, there is statistical evidence to suggest that a one-unit increase in income causes an
increase in food expenditure greater than $0.41 at the 5% critical value.
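
The three food-expenditure tests can be checked the same way. This is a sketch in Python/SciPy (not part of the original notes), using only the figures reported above: b1 = 0.70, se = 0.1434, n = 40, one regressor.

from scipy import stats

b1, se = 0.70, 0.1434
df = 40 - 1 - 1

t_ab = (b1 - 0) / se                        # ~4.88, used in parts (a) and (b)
print(t_ab > stats.t.ppf(0.995, df))        # (a) two-tail, 1%: True -> reject H0
print(t_ab > stats.t.ppf(0.99, df))         # (b) right-tail, 1%: True -> reject H0

t_c = (b1 - 0.41) / se                      # ~2.02
print(t_c > stats.t.ppf(0.99, df))          # (c) right-tail, 1%: False -> do not reject H0
print(t_c > stats.t.ppf(0.95, df))          # (c) right-tail, 5%: True -> reject H0
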
B. Testing for joint hypotheses (using the F test to compare two variances)
We have: 𝑦𝑖 = 𝛽0 + 𝛽1 𝑥1𝑖 + 𝛽2 𝑥2𝑖 + 𝛽3 𝑥3𝑖 + 𝑢𝑖

𝐻0 : 𝛽1 = 𝛽2 = ⋯ = 𝛽𝑘 = 0 (all the slope coefficients are simultaneously equal to zero).

𝐻1 : at least one 𝛽𝑘 ≠ 0 (at least one of the regression coefficients is different from zero).

See application in Stata


𝐥𝐨𝐠(𝒘𝒂𝒈𝒆) = 𝜷𝟎 + 𝜷𝟏 𝑬𝒅𝒖𝒄 + 𝜷𝟐 𝑬𝒙𝒑𝒆𝒓 + 𝜷𝟑 𝑻𝒆𝒏𝒖𝒓𝒆 + 𝒖

B.1 From ANOVA:

F = (SSE/df) / (SSR/df) = MSE/MSR, this is: Explained variance / Unexplained variance

Stata ANOVA output:

      Source |       SS           df       MS          Number of obs =     526
-------------+----------------------------------       F(3, 522)     =   80.39
       Model |  46.8741805         3   15.6247268      Prob > F      =  0.0000
    Residual |  101.455581       522   .194359351      R-squared     =  0.3160
-------------+----------------------------------       Adj R-squared =  0.3121
       Total |  148.329762       525    .28253288      Root MSE      =  .44086

MSE: Mean Squares Explained (Model); MSR: Mean Squares Residual.
F calculated = MSE/MSR = 15.6247/0.1944 = 80.39, with numerator df = 3 and denominator df = 522.
This F calculated (80.39) is greater than the F table value (3.78) at the 1% critical value.

As F(3, 522) = 80.39 (and Prob > F = 0.0000), we can reject the null hypothesis that all betas
are simultaneously equal to zero. This suggests that at least one beta is different from zero
and, therefore, that the regressors help to predict the dependent variable.
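
A minimal sketch (Python/SciPy, not part of the original notes) reproducing the F statistic, the 1% critical value and the p-value from the ANOVA figures above:

from scipy import stats

ss_model, df_model = 46.8741805, 3       # explained (model) sum of squares and its df
ss_resid, df_resid = 101.455581, 522     # residual sum of squares and its df

F = (ss_model / df_model) / (ss_resid / df_resid)    # MSE/MSR, ~80.39
F_crit = stats.f.ppf(1 - 0.01, df_model, df_resid)   # ~3.8 at the 1% level
p_value = stats.f.sf(F, df_model, df_resid)          # essentially 0 (Prob > F = 0.0000)

print(round(F, 2), round(F_crit, 2), p_value)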

B.2 From the equation using Sum of Squares Residual (SSR)

F = [(SSR_r − SSR_ur)/q] / [SSR_ur/(n − k − 1)]
Where:
SSR_r: sum of squared residuals of the restricted model (intercept only)
SSR_ur: sum of squared residuals of the unrestricted model (intercept and all regressors)
q = df_r − df_ur (the number of restrictions), where df = n − k; F ~ F(q, n − k − 1)
df: degrees of freedom

- Unrestricted equation: regress lwage educ exper tenure; SSR = 101.4555

- Restricted equation (intercept only): regress lwage; SSR = 148.3297

Then, we apply the equation: F = [(SSR_r − SSR_ur)/q] / [SSR_ur/(n − k − 1)]

F = [(148.3297 − 101.4555)/3] / [101.4555/(526 − 3 − 1)] = 80.39, where q = (526 − 0) − (526 − 3) = 3
B.3 From the equation using the R-squared.

F = [(R²_ur − R²_r)/q] / [(1 − R²_ur)/(n − k − 1)]

Where:
R²_r: R-squared of the restricted model (intercept only)
R²_ur: R-squared of the unrestricted model (intercept and all regressors)
q = df_r − df_ur (the number of restrictions), where df = n − k; F ~ F(q, n − k − 1)
df: degrees of freedom

- Unrestricted equation: regress lwage educ exper tenure; R-squared = 0.3160

- Restricted equation (intercept only): regress lwage; R-squared = 0

Then, we apply the equation: F = [(R²_ur − R²_r)/q] / [(1 − R²_ur)/(n − k − 1)]

F = [(0.3160 − 0)/3] / [(1 − 0.3160)/(526 − 3 − 1)] = 80.39
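
Both the SSR form (B.2) and the R-squared form (B.3) can be verified with a short Python sketch (not part of the original notes; the values are taken from the output above, with SSR_r and R²_r coming from the intercept-only regression):

ssr_r, ssr_ur = 148.3297, 101.4555   # restricted vs unrestricted sum of squared residuals
r2_r, r2_ur = 0.0, 0.3160            # restricted vs unrestricted R-squared
n, k, q = 526, 3, 3                  # observations, regressors, restrictions

F_ssr = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))     # B.2 formula, ~80.39
F_r2 = ((r2_ur - r2_r) / q) / ((1 - r2_ur) / (n - k - 1))   # B.3 formula, ~80.39

print(round(F_ssr, 2), round(F_r2, 2))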

Example using the t test, F test and Lagrange Multiplier test (Chi-squared).
C. Testing other types of hypotheses
Test whether the returns to scale are greater than 1 (increasing returns to scale).

In economic growth:
𝐻0 : 𝛽1 + 𝛽2 = 1 (constant returns to scale)
𝐻1 : 𝛽1 + 𝛽2 > 1 (increasing returns to scale)

Test whether the returns to scale are different from 1.

In economic growth:
𝐻0 : 𝛽1 + 𝛽2 = 1 (constant returns to scale)
𝐻1 : 𝛽1 + 𝛽2 ≠ 1 (non-constant returns to scale, either increasing or decreasing)
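
A minimal sketch of how such a test can be computed as a t test on the linear combination β1 + β2 (Python/SciPy, not part of the original notes). All numbers below (the estimates, their variances, the covariance and the degrees of freedom) are hypothetical, purely for illustration; in practice they come from the estimated covariance matrix of the regression.

import math
from scipy import stats

b1, b2 = 0.65, 0.42                  # hypothetical estimated coefficients
var_b1, var_b2 = 0.03**2, 0.04**2    # hypothetical variances of the estimates
cov_b1_b2 = -0.0002                  # hypothetical covariance between the estimates
df = 100                             # hypothetical residual degrees of freedom

se_sum = math.sqrt(var_b1 + var_b2 + 2 * cov_b1_b2)   # standard error of (b1 + b2)
t = (b1 + b2 - 1) / se_sum                            # t statistic for H0: beta1 + beta2 = 1
print(t > stats.t.ppf(0.95, df))                      # right-tail test at the 5% level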

2. Confidence Interval (CI)

1) Following the same example of the return to education, the estimated equation is:

log(wage) = 0.284 + 0.092 Educ + 0.004 Exper + 0.022 Tenure
            (0.11)   (0.0073)    (0.001)       (0.003)

𝐻0 : 𝛽1 = 0
𝐻1 : 𝛽1 ≠ 0 (two tail)

T table 𝑡0.05,526−3−1 so T table 𝑡0.05,522 = 1.96


T calculated 𝑡 = (0.092 − 0)/0.0073 ≈ 12.56

The confidence interval at the 5% critical value (i.e., the 95% confidence interval) is:

𝑃[𝛽̂ − 𝑡𝑐 ∙ 𝑠𝑒(𝛽̂ ) ≤ 𝛽 ≤ 𝛽̂ + 𝑡𝑐 ∙ 𝑠𝑒(𝛽̂ )] = 1 − 𝛼

𝑡𝑐𝛼,𝑛−𝑘−1 (t table or t critical value)


𝑡𝑐0.05,522 = 1.96

𝑃(0.092 − 1.96 ∙ 0.0073 ≤ 𝛽 ≤ 0.092 + 1.96 ∙ 0.0073) = 1 − 0.05

𝑃(0.077 ≤ 𝛽 ≤ 0.106) = 0.95
CI: (0.077, 0.106)
We can be 95% confident that 𝛽 lies between 0.077 and 0.106.
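
A sketch (Python/SciPy, not part of the original notes) reproducing this interval from the reported coefficient and standard error:

from scipy import stats

b, se, df = 0.092, 0.0073, 522
t_c = stats.t.ppf(1 - 0.05 / 2, df)        # ~1.96
lower, upper = b - t_c * se, b + t_c * se
print(round(lower, 3), round(upper, 3))    # ~(0.078, 0.106), essentially the interval above
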
2) Following the same example of food expenditure:

ŷi = 40.18 + 0.70 xi
se       (0.1434)
t        (4.88)

𝐻0 : 𝛽1 = 0
𝐻1 : 𝛽1 ≠ 0 (two tail)

T table 𝑡0.05,40−1−1 so T table 𝑡0.05,38 = 2.02


T calculated 𝑡 = (0.70 − 0)/0.1434 = 4.88

The confidence interval at the 5% critical value (i.e., the 95% confidence interval) is:

𝑃[𝛽̂ − 𝑡𝑐 ∙ 𝑠𝑒(𝛽̂ ) ≤ 𝛽 ≤ 𝛽̂ + 𝑡𝑐 ∙ 𝑠𝑒(𝛽̂ )] = 1 − 𝛼

𝑡𝑐𝛼,𝑛−𝑘−1 (t table or t critical value)


𝑡𝑐0.05,40−1−1 = 2.02

𝑃(0.70 − 2.02 ∙ 0.1434 ≤ 𝛽 ≤ 0.70 + 2.02 ∙ 0.1434) = 1 − 0.05

𝑃(0.4103 ≤ 𝛽 ≤ 0.9896) = 0.95
CI: (0.4103, 0.9896)
We can be 95% confident that 𝛽 lies between 0.41 and 0.99.
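
The same construction for the food-expenditure slope (a Python/SciPy sketch with the figures reported above):

from scipy import stats

b, se, df = 0.70, 0.1434, 38
t_c = stats.t.ppf(1 - 0.05 / 2, df)        # ~2.02
lower, upper = b - t_c * se, b + t_c * se
print(round(lower, 3), round(upper, 3))    # ~(0.410, 0.990), essentially the interval above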
