
Lecture 4: Hypothesis Testing

Applied Econometrics
Dr. Le Anh Tuan

1
Hypothesis testing in empirical research

Research question (Introduction: motivations, contributions) → Hypothesis (theory background; hypotheses; direction of influence; specific statistical hypothesis) → Results (confirm hypotheses)

2
A Reminder
►Imagine you want to find out whether a new diet actually
helps people lose weight or whether it is completely
useless (most diets are)
►You’ve collected data about 100 people, who had been on
the diet for 8 weeks
►Let d_i denote the difference in the weight of the i-th person
after and before the diet:
d_i = weight_after − weight_before
►You’re testing the average effect of the diet, which means
you’re making a hypothesis about the population mean
of d, called μ
►First, you need to state:
►H0: the null hypothesis
►H1: the alternative hypothesis
3
A Reminder
► In hypothesis testing, the null hypothesis is the statement you’re trying
to disprove (reject) using the evidence in your data.
► In order to show that the diet works, you’ll actually be disproving that it
doesn’t work. Therefore, the null hypothesis will be:
H0: μ = 0 (on average, there’s no effect)
► The alternative hypothesis is a vague definition of what you’re trying
to show, e.g.:
H1: μ < 0 (on average, people lose weight)
► Next, you look at the average effect of the diet in your data and find that,
say, d̄ = −1.5 (in your sample, on average, people lost 1.5 kg)
► Is that a reason to reject H0?
► We don’t know yet
► Even if H0 is actually true, we would not expect the sample
average to be exactly 0
► The question is whether –1.5 is sufficiently far away from 0
so that we can reject H0.
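To make the setup concrete, here is a minimal sketch in Python of the one-sided test described above, using simulated weight differences (the data are synthetic, and the `alternative` argument of `scipy.stats.ttest_1samp` assumes SciPy 1.6 or newer):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical weight differences (after - before) for 100 dieters, in kg.
# In practice d would come from your own data.
d = rng.normal(loc=-1.5, scale=5.0, size=100)

# H0: mu = 0  vs  H1: mu < 0 (people lose weight on average)
t_stat, p_value = stats.ttest_1samp(d, popmean=0, alternative='less')

print(f"sample mean  = {d.mean():.2f}")
print(f"t-statistic  = {t_stat:.2f}")
print(f"one-sided p  = {p_value:.4f}")
```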

4
A Reminder
►In hypothesis testing, we can make a Type I error.

►A Type I error is rejecting H0 when it is true. The
probability of a Type I error is called the significance level,
usually denoted by α.

►The significance level α is the probability of rejecting the null
hypothesis when it is true.

►Commonly used significance levels are:
►10%
►5%
►1%
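The meaning of α as a Type I error rate can be checked by simulation: if H0 is true and we test at α = 5%, we should falsely reject in roughly 5% of repeated samples. A small sketch with synthetic data, for illustration only:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, n, n_sims = 0.05, 100, 10_000

rejections = 0
for _ in range(n_sims):
    # Simulate data for which H0 (mu = 0) is TRUE
    d = rng.normal(loc=0.0, scale=5.0, size=n)
    _, p = stats.ttest_1samp(d, popmean=0, alternative='less')
    rejections += (p < alpha)

# Rejecting a true null is a Type I error; the rate should be close to alpha
print(f"Type I error rate: {rejections / n_sims:.3f}")
```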

5
Hypothesis Testing
► Consider the following multiple regression model:
!" = $% + $' ('" + ⋯ + $* ('* + +"

► We wish to test the hypothesis that $, = - where b is some


known value (e.g., zero) against the alternative that $, is not
equal to -:
.% : $, = -
.' : $, ≠ -
► However, for hypothesis tests we need to know their
distribution.
► In order to derive their distribution we need additional
assumptions.
► Assumption about distribution of errors: normal distribution.

6
Hypothesis Testing
► Assumption MLR.6 (Normality of error terms). The population error
u is independent of the explanatory variables and is normally
distributed with zero mean and variance σ²:

u ~ Normal(0, σ²), independently of x_1, …, x_k

It is assumed that the unobserved factors are normally distributed
around the population regression function.

The form and the variance of the distribution do not depend on any
of the explanatory variables.

7
Hypothesis Testing

► Examples where normality cannot hold:


► Wages (nonnegative; also: minimum wage)
► Number of arrests (takes on a small number of integer
values)
► Unemployment (indicator variable, takes on only 1 or 0)
► In some cases, normality can be achieved through
transformations of the dependent variable (e.g. use log(wage)
instead of wage)
► Under normality, OLS is the best unbiased estimator, better even
than any nonlinear unbiased estimator
► Based on central limit theorem, the assumption of normality can
be replaced by a large sample size.

8
Hypothesis Testing Process

► To actually test a hypothesis, we need to follow five steps:


1. State the null (H0) and alternative (H1) hypotheses
► The null hypothesis is presumed true until we have evidence to
reject it
► Most common is testing that the population parameter is zero
► H0: β_j = 0 against the alternative H1: β_j > 0, β_j < 0, or β_j ≠ 0
► The form of the alternative will determine whether you
perform a one-sided or two-sided test.

2. Choose and calculate a test statistic (t) with a known distribution.


► The test statistic provides a measure of how far our sample
estimate (β̂_j) is from the hypothesized population value (β_j),
relative to the standard error of the estimator, se(β̂_j).

9
Hypothesis Testing Process

3. Choose significance level (α) and find critical value (c)


► Sig. level is typically 10%, 5%, or 1%. This value is the
probability that we reject the null when the null is true.
► Critical value is found on the appropriate table (t, z, or F) for the
chosen significance level and given degrees of freedom
► Be careful, the critical value will change depending on whether
the test is one-sided or two-sided!

4. Define the rejection rule. Reject H0 if and only if:

|t| > c for a two-sided test (t > c, or t < −c, for the corresponding
one-sided alternative)

5. Interpret. We either:
► Reject the null
► Fail to reject the null
► We NEVER accept the null
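The five steps can be walked through numerically. A minimal sketch using SciPy, with a hypothetical estimate and standard error (the numbers are made up for illustration, not taken from the lecture's data):

```python
from scipy import stats

# Hypothetical inputs: an OLS estimate, its standard error,
# sample size n and number of regressors k
beta_hat, se_beta = 0.50, 0.22
n, k = 200, 4
df = n - k - 1

# Step 1: H0: beta_j = 0  vs  H1: beta_j != 0 (two-sided)
# Step 2: test statistic
t = (beta_hat - 0) / se_beta

# Step 3: significance level and critical value (two-sided)
alpha = 0.05
c = stats.t.ppf(1 - alpha / 2, df)

# Step 4: rejection rule |t| > c
reject = abs(t) > c

# Step 5: interpretation
print(f"t = {t:.2f}, critical value = {c:.2f}")
print("reject H0" if reject else "fail to reject H0")
```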

10
Testing Hypotheses about a Single Population
Parameter:
The t Test

11
Hypothesis Testing
► Under MLR.1 through MLR.6, the standardized estimators follow a
Student’s t distribution with n − k − 1 degrees of freedom:

(β̂_j − β_j) / se(β̂_j) ~ t_(n−k−1)

If the standardization is done using the estimated standard deviation
(= standard error), the normal distribution is replaced by a t-distribution.
Note:
The t-distribution is close to the standard normal distribution if n − k − 1 is large.
► Degree of freedom: n-k-1
► n: number of observations
► k: number of independent variables
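Critical values of the t_(n−k−1) distribution can be looked up with SciPy rather than a printed table; a small sketch (the sample size matches the wage example used later in the lecture, but is otherwise just illustrative):

```python
from scipy import stats

n, k = 526, 3            # e.g. the wage example later in the lecture
df = n - k - 1           # degrees of freedom = n - k - 1

for alpha in (0.10, 0.05, 0.01):
    c_one = stats.t.ppf(1 - alpha, df)        # one-sided critical value
    c_two = stats.t.ppf(1 - alpha / 2, df)    # two-sided critical value
    print(f"alpha={alpha}: one-sided c={c_one:.3f}, two-sided c={c_two:.3f}")

# With large df the t-distribution is close to the standard normal:
print(stats.norm.ppf(0.95), stats.t.ppf(0.95, df))
```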

12
Hypothesis Testing
► Null hypothesis (more general hypotheses are discussed later):
H0: β_j = 0
The population parameter is equal to zero, i.e.
after controlling for the other independent
variables, there is no effect of x_j on y

► t-statistic (or t-ratio)

t = β̂_j / se(β̂_j) = coefficient / standard error

The t-statistic will be used to test the above null hypothesis. The farther the
estimated coefficient is away from zero, the less likely it is that the null
hypothesis holds true. But what does "far" away from zero mean?

This depends on the variability of the estimated coefficient, i.e. its standard
deviation. The t-statistic measures how many estimated standard deviations
the estimated coefficient is away from zero.

13
Decision rule
► Choose a significance level (1%, 5%, or 10%)
► If the deviation of β̂_j from the hypothesized value β_j = 0 is large enough, one
would reject H0
► Intuition: the t-statistic is very large (or very small) when
a) the estimate β̂_j is far from β_j (under H0), and/or
b) the standard error of the estimate is small relative to β̂_j − β_j

t-statistic = β̂_j / se(β̂_j)

14
Testing against one-sided alternatives
(greater than zero)
Test H0: β_j = 0 against H1: β_j > 0

Reject the null hypothesis in favour of the


alternative hypothesis if the estimated coefficient
is “too large” (i.e. larger than a critical value).

Construct the critical value so that, if the


null hypothesis is true, it is rejected in,
for example, 5% of the cases.

In this example, this is the point of the t-


distribution with 28 degrees of freedom that
is exceeded in 5% of the cases.

c_0.05 = 1.701

Reject if t-statistic is greater than critical value


(1.701)

15
Testing against one-sided alternatives
(greater than zero)
► Example WAGE: Test whether, after controlling for education and
tenure, higher work experience leads to higher hourly wages

► Test: H0: β_exper = 0 (no effect at all)
H1: β_exper > 0 (a positive effect of experience on hourly wage)
(In the regression output, standard errors are reported alongside the coefficients.)

16
Testing against one-sided alternatives
(greater than zero)
► t-statistic = β̂_exper / se(β̂_exper) = 2.41
► Degrees of freedom: df = n − k − 1 = 526 − 3 − 1 = 522
► At the 5% significance level:
► critical value: c_0.05 = 1.645
► t-statistic > c_0.05 → we reject the null

► At the 1% significance level:
► critical value: c_0.01 = 2.326
► t-statistic > c_0.01 → we reject the null

“The effect of experience on hourly wage is statistically


greater than zero at the 5% (and even at the 1%)
significance level.”
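A sketch of how this one-sided test could be run in Python with statsmodels. The data below are a synthetic stand-in (the lecture uses an actual wage data set), so the output will not reproduce the slide's t = 2.41; the point is the mechanics of comparing the t-statistic with the one-sided critical value:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Synthetic stand-in for the wage data used on this slide (n = 526)
rng = np.random.default_rng(1)
n = 526
df = pd.DataFrame({
    'educ': rng.integers(8, 19, n),
    'exper': rng.integers(0, 41, n),
    'tenure': rng.integers(0, 30, n),
})
df['lwage'] = (0.3 + 0.09*df['educ'] + 0.004*df['exper'] + 0.02*df['tenure']
               + rng.normal(0, 0.4, n))

res = smf.ols('lwage ~ educ + exper + tenure', data=df).fit()
t_exper = res.tvalues['exper']
dof = res.df_resid                       # n - k - 1 = 522

# One-sided test: H0: beta_exper = 0 vs H1: beta_exper > 0
c_05 = stats.t.ppf(0.95, dof)            # one-sided 5% critical value (~1.645)
print(f"t = {t_exper:.2f}, c_0.05 = {c_05:.3f}")
print("reject H0 at 5%" if t_exper > c_05 else "fail to reject H0 at 5%")
```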
17
Testing against one-sided alternatives
(greater than zero)
►At the 5% significance level:
►Test whether, after controlling for experience and tenure,
higher education leads to higher hourly wages?
►Test whether, after controlling for experience and
education, longer tenure leads to higher hourly wages?

18
Testing against one-sided alternatives (less
than zero)
Test H0: β_j = 0 against H1: β_j < 0

Reject the null hypothesis in favour of the


alternative hypothesis if the estimated
coefficient is “too small” (i.e. smaller than a
critical value).

Construct the critical value so that, if the null


hypothesis is true, it is rejected in, for
example, 5% of the cases.

In the given example, this is the point of the t-


distribution with 18 degrees of freedom so
that 5% of the cases are below the point.

Reject if t-statistic is less than critical value (-


1.734)

19
Testing against one-sided alternatives (less than
zero)
► Example: Student performance and school size-MEAP93
► Test whether smaller school size leads to better student
performance
Regression variables: percentage of students passing the maths test
(dependent variable), average annual teacher compensation, staff per one
thousand students, and student enrollment (= school size)

► Test: H0: β_enroll = 0 (no effect at all)
H1: β_enroll < 0 (a negative effect of school size on student
performance)

20
Testing against one-sided alternatives (less than
zero)
► t-statistic = −0.0002 / 0.00022 = −0.91
► Degrees of freedom: df = n − k − 1 = 408 − 3 − 1 = 404
► At the 5% significance level:
► critical value: c_0.05 = −1.65
► t-statistic > c_0.05 → we cannot reject the null

One cannot reject the hypothesis that there is no effect
of school size on student performance at the 5%
significance level,
or:
the impact of school size on student performance is
insignificant at the 5% level

21
Testing against one-sided alternatives
(less than zero)

►Test: H0: β_log(enroll) = 0, H1: β_log(enroll) < 0

► t-statistic = −1.29 / 0.69 = −1.87
► Degrees of freedom: df = n − k − 1 = 408 − 3 − 1 = 404
► At the 5% significance level:
► critical value: c_0.05 = −1.65
► t-statistic < c_0.05 → we reject the null
► The impact of school size on student performance is significant
at the 5% level.
► A 1% increase in enrollment is associated with a decrease of about
0.0129 percentage points in the share of students passing the test.
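Given the t-statistic and degrees of freedom reported above, the left-tailed decision can be reproduced directly; a minimal sketch:

```python
from scipy import stats

# Numbers from the log(enroll) specification on this slide
t_stat = -1.87
dof = 404                               # n - k - 1 = 408 - 3 - 1

alpha = 0.05
c = stats.t.ppf(alpha, dof)             # left-tail critical value (~ -1.65)
p_left = stats.t.cdf(t_stat, dof)       # one-sided p-value for H1: beta < 0

print(f"critical value = {c:.3f}, p-value = {p_left:.4f}")
print("reject H0" if t_stat < c else "fail to reject H0")
```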
22
Testing against two-sided alternatives

Test H0: β_j = 0 against H1: β_j ≠ 0
Reject the null hypothesis in favour of the
alternative hypothesis if the absolute value of
the estimated coefficient is too large.

Construct the critical value so that, if the


null hypothesis is true, it is rejected in,
for example, 5% of the cases.

In the given example, these are the points


of the t-distribution so that 5% of the cases lie
in the two tails.

Reject if absolute value of t-statistic is


greater than critical value

Reject if value of t-statistic is less than -2.06 or


greater than 2.06
23
Testing against two-sided alternatives

► Example: Determinants of college GPA

► Test whether high school GPA helps explain college GPA, after
controlling for ACT score and lectures missed per week

► Test: H0: β_hsGPA = 0 (no effect of high school GPA on college GPA)
H1: β_hsGPA ≠ 0 (a significant effect of hsGPA on colGPA)

24
Testing against two-sided alternatives
► t_hsGPA = 0.412 / 0.094 = 4.38
► Degrees of freedom: df = n − k − 1 = 141 − 3 − 1 = 137
► At the 1% significance level:
► critical value: c_0.01 = 2.576
► t_hsGPA > c_0.01 → we can reject the null

The effect of hsGPA is significantly different from zero
at the 1% significance level.

► t_ACT = 1.36 < c_0.10 = 1.645 → The effect of ACT is not significantly
different from zero, not even at the 10% significance level.
► |t_skipped| = |−3.19| = 3.19 > c_0.01 = 2.576 → The effect of skipped is
significantly different from zero at the 1% significance level.
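The three two-sided conclusions above can be checked in one pass by comparing each |t| with the critical values at the usual levels; a minimal sketch using the t-statistics quoted on this slide:

```python
from scipy import stats

dof = 137                      # n - k - 1 = 141 - 3 - 1
t_stats = {'hsGPA': 4.38, 'ACT': 1.36, 'skipped': -3.19}   # from the slide

for alpha in (0.10, 0.05, 0.01):
    c = stats.t.ppf(1 - alpha / 2, dof)     # two-sided critical value
    sig = [name for name, t in t_stats.items() if abs(t) > c]
    print(f"alpha = {alpha}: critical value = {c:.3f}, significant: {sig}")
```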

25
Summary
► If a regression coefficient is different from zero in a two-
sided test, the corresponding variable is said to be
“statistically significant”
► If the number of degrees of freedom is large enough so that
the normal approximation applies, the following rules of
thumb apply:

|t| > 1.645: “statistically significant at 10% level”

|t| > 1.96: “statistically significant at 5% level”

|t| > 2.576: “statistically significant at 1% level”
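As a quick reference, these rules of thumb can be wrapped in a small helper; a minimal sketch (the thresholds are the large-sample two-sided critical values quoted above, and the function name is just illustrative):

```python
def significance_stars(t_stat: float) -> str:
    """Rule-of-thumb stars for a two-sided test with large df
    (normal approximation): 1.645 -> 10%, 1.96 -> 5%, 2.576 -> 1%."""
    t = abs(t_stat)
    if t > 2.576:
        return '***'   # significant at 1%
    if t > 1.96:
        return '**'    # significant at 5%
    if t > 1.645:
        return '*'     # significant at 10%
    return ''          # not significant at conventional levels

print(significance_stars(2.41), significance_stars(-0.91), significance_stars(4.38))
```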

26
Testing more general hypotheses about a
regression coefficient
► Null hypothesis: H0: β_j = a_j, where a_j is the hypothesized value
of the coefficient

► t-statistic: t = (β̂_j − a_j) / se(β̂_j)

► The test works exactly as before, except that the hypothesized
value is subtracted from the estimate when forming the
statistic.

27
Testing more general hypotheses about
a regression coefficient
► Example: Campus crime and enrollment
► We would like to test whether crime increases by one percent
if enrollment is increased by one percent

► Test: H0: β_log(enroll) = 1
H1: β_log(enroll) ≠ 1

► t = (β̂_log(enroll) − 1) / se(β̂_log(enroll)) = 2.45 > c_0.05 = 1.96

► The hypothesis is rejected at the 5% level
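The same computation from a reported coefficient and standard error; the estimate, standard error, and degrees of freedom below are illustrative placeholders chosen so that the resulting t ≈ 2.45 matches the slide. (With a fitted statsmodels formula model, `res.t_test('np.log(enroll) = 1')` would perform this test directly, with the parameter name depending on the formula used.)

```python
from scipy import stats

# Testing H0: beta = 1 against H1: beta != 1 from a reported estimate and
# standard error; placeholder values, not necessarily the slide's exact ones.
beta_hat, se_beta, dof = 1.27, 0.11, 95

t = (beta_hat - 1) / se_beta            # subtract the hypothesized value
c = stats.t.ppf(0.975, dof)             # two-sided 5% critical value (~1.96)

print(f"t = {t:.2f}, critical value = {c:.2f}")
print("reject H0: beta = 1" if abs(t) > c else "fail to reject H0")
```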

28
Using p-Values for Hypothesis Testing

29
Using p-Values for Hypothesis Testing
What is the p-value?
► Classical approach to hypothesis testing: first choose the
significance level, then test the hypothesis at the given
level of significance (e.g. 5%)
► However, there is no ”correct” significance level.
► What is the smallest significance level at which the null
hypothesis would still be rejected?
► p-value is the smallest significance level at which
the null hypothesis would be rejected.
► Remember that the significance level describes the
probability of type I error.
→ the smaller the p-value, the more evidence there is in the
sample data against the null hypothesis and in favour of the
alternative hypothesis.
→ if the p-value is less than our chosen level of significance, we reject H0
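A sketch of computing p-values from a t-statistic with SciPy, using the t = 2.41 and df = 522 from the earlier wage example:

```python
from scipy import stats

t_stat, dof = 2.41, 522      # wage/exper example from earlier in the lecture

# Two-sided p-value: P(|T| > |t|) under H0
p_two = 2 * stats.t.sf(abs(t_stat), dof)
# One-sided p-value (H1: beta > 0): P(T > t)
p_one = stats.t.sf(t_stat, dof)

print(f"two-sided p = {p_two:.4f}, one-sided p = {p_one:.4f}")
# Reject H0 whenever the p-value is below the chosen significance level
print("reject at 5% (two-sided)" if p_two < 0.05 else "fail to reject at 5%")
```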
30
Using p-Values for Hypothesis Testing

In the two-sided case, the p-value is
the probability that the t-distributed variable takes on a larger
absolute value than the realized value of the test statistic, i.e.
p-value = P(|T| > |t|)
(The figure marks the critical values for a 5% significance level.)

From this, it is clear that a null hypothesis is rejected if and only if the
corresponding p-value is smaller than the significance level.
In the figure’s example, for a significance level of 5% the t-statistic would
not lie in the rejection region.
31
Using p-Values for Hypothesis Testing
► p-values are more informative than tests at fixed
significance levels because you can choose your own
significance level.

*   p-value < 0.10 → we can reject H0 at the 10% level
**  p-value < 0.05 → we can reject H0 at the 5% level
*** p-value < 0.01 → we can reject H0 at the 1% level

32
Discussing economic and statistical
significance
► If a variable is statistically significant, discuss the magnitude of
the coefficient to get an idea of its economic or practical
importance
► The fact that a coefficient is statistically significant does not
necessarily mean it is economically or practically significant!
► If a variable is statistically and economically important but has
the “wrong” sign, the regression model might be misspecified
► If a variable is statistically insignificant at the usual levels (10%,
5%, or 1%), one may think of dropping it from the regression
► If the sample size is small, effects might be imprecisely estimated
so that the case for dropping insignificant variables is less strong

33
Confidence Intervals

34
Confidence Intervals

β̂_j − c · se(β̂_j)  ≤  β_j  ≤  β̂_j + c · se(β̂_j)
where c is the critical value of a two-sided test at the chosen confidence
level; the left and right endpoints are the lower and upper bounds of the
confidence interval.

► Interpretation: generally speaking, it’s an interval
that covers the true population parameter β_j in 95%
of all samples.

35
Confidence Intervals
► A confidence interval (or interval estimate) for β_j is the interval
given by
β̂_j ± c · se(β̂_j)
► At the 90% confidence level:
β̂_j − c_0.10 · se(β̂_j) ≤ β_j ≤ β̂_j + c_0.10 · se(β̂_j)

► At the 95% confidence level:
β̂_j − c_0.05 · se(β̂_j) ≤ β_j ≤ β̂_j + c_0.05 · se(β̂_j)

► At the 99% confidence level:
β̂_j − c_0.01 · se(β̂_j) ≤ β_j ≤ β̂_j + c_0.01 · se(β̂_j)

with c the critical value of a two-sided test at the corresponding
significance level.
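A sketch of building these intervals and using them for the two-sided test described on the next slide; the estimate and standard error are hypothetical (with statsmodels, `res.conf_int(alpha=0.05)` returns the corresponding intervals for all coefficients):

```python
from scipy import stats

# Hypothetical estimate, standard error, and degrees of freedom
beta_hat, se_beta, dof = 0.50, 0.22, 137

for conf in (0.90, 0.95, 0.99):
    c = stats.t.ppf(1 - (1 - conf) / 2, dof)      # two-sided critical value
    lo, hi = beta_hat - c * se_beta, beta_hat + c * se_beta
    print(f"{int(conf*100)}% CI: [{lo:.3f}, {hi:.3f}]")

# CI-based test of H0: beta_j = a_j at the 5% level
a_j = 0.0
c = stats.t.ppf(0.975, dof)
in_ci = (beta_hat - c*se_beta) <= a_j <= (beta_hat + c*se_beta)
print("fail to reject H0" if in_ci else "reject H0 (a_j outside the 95% CI)")
```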

36
Using Confidence Intervals for Hypothesis
Testing
► Confidence intervals can be used to easily carry out the two-
tailed test
H0: β_j = a_j
H1: β_j ≠ a_j
► The rule is as follows:
H0 is rejected in favour of H1 at the 5% significance level if, and only if,
the value a_j is not in the 95% confidence interval for β_j

37
Confidence Intervals – Example

► Confidence intervals can be used to easily carry out the two-
tailed test
► Example: spending on R&D regressed on annual sales and profits
(as a percentage of sales)

The effect of sales on R&D is relatively precisely estimated, as its
confidence interval is narrow; moreover, the effect is significantly
different from zero because zero is outside the interval.
The effect of profits is imprecisely estimated, as its interval is very
wide; it is not even statistically significant, because zero lies in
the interval.

38
