Regression with One Regressor: Hypothesis Tests and Confidence Intervals
The estimator of the variance of $\hat\beta_1$ is

$$\hat\sigma^2_{\hat\beta_1} = \frac{1}{n} \times \frac{\frac{1}{n-2}\sum_{i=1}^n (X_i - \bar X)^2 \hat u_i^2}{\left[\frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2\right]^2},$$

where $\hat u_i = Y_i - \hat\beta_0 - \hat\beta_1 X_i$ is the OLS residual, and $SE(\hat\beta_1) = \sqrt{\hat\sigma^2_{\hat\beta_1}}$.
There is no reason to memorize this formula: it is computed automatically by regression software, and $SE(\hat\beta_1)$ is reported in the regression output.
It is less complicated than it seems. The numerator estimates $\mathrm{var}(v)$, where $v_i = (X_i - \bar X)u_i$; the denominator estimates $[\mathrm{var}(X)]^2$.
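To make the formula concrete, here is a minimal numpy sketch (not from the original slides) that computes $SE(\hat\beta_1)$ on simulated data; all data-generating numbers are made up for illustration:

```python
import numpy as np

# Sketch: SE(beta1_hat) from the variance formula above, on simulated data.
rng = np.random.default_rng(0)
n = 420
x = rng.normal(20, 2, n)        # hypothetical student-teacher ratios
u = rng.normal(0, 10, n)        # regression errors
y = 700 - 2.3 * x + u           # hypothetical test scores

xbar = x.mean()
beta1 = np.sum((x - xbar) * (y - y.mean())) / np.sum((x - xbar) ** 2)
beta0 = y.mean() - beta1 * xbar
resid = y - beta0 - beta1 * x   # OLS residuals u_i-hat

num = np.sum((x - xbar) ** 2 * resid ** 2) / (n - 2)  # estimates var(v)
den = (np.sum((x - xbar) ** 2) / n) ** 2              # estimates [var(X)]^2
se_beta1 = np.sqrt(num / den / n)
print(beta1, se_beta1)
```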
The calculation of the t-statistic:

$$t = \frac{\hat\beta_1 - \beta_{1,0}}{SE(\hat\beta_1)}$$

For the California data, the t-statistic testing $\beta_{1,0} = 0$ is

$$t = \frac{-2.28 - 0}{0.52} = -4.38.$$

The 1% two-sided critical value is 2.58, and $|-4.38| > 2.58$, so we reject the null at the 1% significance level.
Alternatively, we can compute the p-value.
The p-value based on the large-n standard normal approximation to
the t-statistic is 0.00001.
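As a quick check (a sketch, not the original computation), the two-sided p-value under the standard normal approximation:

```python
from scipy.stats import norm

t = (-2.28 - 0) / 0.52        # t-statistic for H0: beta1 = 0
p = 2 * norm.sf(abs(t))       # two-sided p-value, large-n normal approximation
print(round(t, 2), p)         # -4.38, approximately 0.00001
```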
Confidence intervals for $\beta_1$

In general, if the sampling distribution of an estimator is normal for large $n$, then a 95% confidence interval can be constructed as estimator $\pm 1.96$ standard errors, that is,

$$\text{95% CI for } \beta_1 = \{\hat\beta_1 \pm 1.96 \times SE(\hat\beta_1)\}.$$
Example: Test Scores and STR, California data

Estimated regression line: $\widehat{TestScore} = 698.9 - 2.28 \times STR$, with $SE(\hat\beta_1) = 0.52$.

The 95% confidence interval for $\beta_1$ is $\{-2.28 \pm 1.96 \times 0.52\} = (-3.30, -1.26)$.

Equivalent statements:
The 95% confidence interval does not include zero.
The hypothesis $\beta_1 = 0$ is rejected at the 5% level.
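The interval arithmetic, as a one-line sketch (values taken from the estimates above):

```python
beta1_hat, se = -2.28, 0.52
ci = (beta1_hat - 1.96 * se, beta1_hat + 1.96 * se)
print(ci)    # approximately (-3.30, -1.26)
```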
A concise (and conventional) way to report regressions:

Put standard errors in parentheses below the estimated coefficients to which they apply, that is:

$$\widehat{TestScore} = \underset{(10.4)}{698.9} - \underset{(0.52)}{2.28} \times STR$$

so:

Estimation: $\hat\beta_1 = -2.28$, $SE(\hat\beta_1) = 0.52$

Test of $\beta_1 = 0$: $t = -2.28/0.52 = -4.38$, p-value $\approx 0.00001$
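In practice this reported output comes straight from regression software. For instance, a hypothetical sketch with statsmodels (the names `scores` and `str_ratio` and the simulated data are placeholders, not the California dataset):

```python
import numpy as np
import statsmodels.api as sm

# Placeholder data; substitute the actual district-level data.
rng = np.random.default_rng(1)
str_ratio = rng.normal(20, 2, 420)
scores = 700 - 2.3 * str_ratio + rng.normal(0, 15, 420)

X = sm.add_constant(str_ratio)               # adds the intercept column
res = sm.OLS(scores, X).fit(cov_type="HC1")  # heteroskedasticity-robust SEs
print(res.summary())                         # coefficients, SEs, t, p, 95% CI
```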
The R²

Write $Y_i$ as the sum of the OLS prediction plus the OLS residual:

$$Y_i = \hat Y_i + \hat u_i$$

The $R^2$ is the fraction of the sample variance of $Y_i$ "explained" by the regression, that is, by $\hat Y_i$:

$$R^2 = \frac{ESS}{TSS}, \quad \text{where } ESS = \sum_{i=1}^n (\hat Y_i - \bar Y)^2 \text{ and } TSS = \sum_{i=1}^n (Y_i - \bar Y)^2.$$
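A minimal sketch of this decomposition (made-up numbers, numpy only):

```python
import numpy as np

# R^2 = ESS/TSS for a one-regressor OLS fit.
x = np.array([18.0, 19.0, 20.0, 21.0, 22.0])
y = np.array([660.0, 655.0, 650.0, 648.0, 640.0])

beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()
y_hat = beta0 + beta1 * x               # OLS predictions

ess = np.sum((y_hat - y.mean()) ** 2)   # explained sum of squares
tss = np.sum((y - y.mean()) ** 2)       # total sum of squares
print(ess / tss)                        # R^2
```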
Gauss-Markov Theorem:

Under the Gauss-Markov conditions, the OLS estimator is BLUE (Best Linear Unbiased Estimator). That is,

$$\mathrm{var}(\hat\beta_1 \mid X_1, \ldots, X_n) \le \mathrm{var}(\tilde\beta_1 \mid X_1, \ldots, X_n)$$

for all linear conditionally unbiased estimators $\tilde\beta_1$, where

$$\tilde\beta_1 = \sum_{i=1}^n a_i Y_i \quad \text{(the weights } a_i \text{ can depend on the } X\text{'s but not on the } Y\text{'s)}$$

is a linear unbiased estimator.

Proof of the Gauss-Markov Theorem

For $\tilde\beta_1$ to be conditionally unbiased, we need $\sum_{i=1}^n a_i = 0$ and $\sum_{i=1}^n a_i X_i = 1$. With these two conditions, homoskedasticity gives

$$\mathrm{var}(\tilde\beta_1 \mid X_1, \ldots, X_n) = \sigma_u^2 \sum_{i=1}^n a_i^2.$$

Write $a_i = \hat w_i + d_i$, where $\hat w_i = (X_i - \bar X)/\sum_{j=1}^n (X_j - \bar X)^2$ is the OLS weight. The cross term $\sum_i \hat w_i d_i$ equals zero, so

$$\mathrm{var}(\tilde\beta_1 \mid X_1, \ldots, X_n) = \mathrm{var}(\hat\beta_1 \mid X_1, \ldots, X_n) + \sigma_u^2 \sum_{i=1}^n d_i^2,$$

which is minimized when $d_i = 0$ for all $i$, that is, when $\tilde\beta_1$ is the OLS estimator.
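To see the theorem at work, a small simulation sketch (not from the slides): it compares the OLS slope with another linear, conditionally unbiased estimator, the slope through the two extreme-$X$ points, under homoskedastic errors. Both are unbiased, but OLS should show the smaller variance.

```python
import numpy as np

# Compare sampling variance of OLS vs. an "endpoint" slope estimator.
rng = np.random.default_rng(2)
beta0, beta1, n, reps = 1.0, 2.0, 50, 10_000
x = rng.uniform(0, 10, n)            # X's held fixed across replications
lo, hi = np.argmin(x), np.argmax(x)  # endpoint weights depend only on the X's

ols, endpoint = [], []
for _ in range(reps):
    u = rng.normal(0, 1, n)          # homoskedastic errors
    y = beta0 + beta1 * x + u
    ols.append(np.sum((x - x.mean()) * (y - y.mean()))
               / np.sum((x - x.mean()) ** 2))
    endpoint.append((y[hi] - y[lo]) / (x[hi] - x[lo]))

print(np.mean(ols), np.var(ols))            # approx 2.0, smaller variance
print(np.mean(endpoint), np.var(endpoint))  # approx 2.0, larger variance
```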
In virtually all applied regression analysis, OLS is used – and that is what we will do
in this course too.
Inference if u is Homoskedastic and Normal:
the Student t Distribution
Recall the five extended LS assumptions:
1. $E(u \mid X = x) = 0$.
2. $(X_i, Y_i)$, $i = 1, \ldots, n$, are i.i.d.
3. Large outliers are rare ($E(Y^4) < \infty$, $E(X^4) < \infty$).
4. $u$ is homoskedastic.
5. $u$ is distributed $N(0, \sigma_u^2)$.
Recall that

$$\hat\beta_1 - \beta_1 = \frac{\sum_{i=1}^n (X_i - \bar X) u_i}{\sum_{i=1}^n (X_i - \bar X)^2} = \frac{1}{n}\sum_{i=1}^n w_i u_i, \quad \text{where } w_i = \frac{X_i - \bar X}{\frac{1}{n}\sum_{j=1}^n (X_j - \bar X)^2}.$$

Since $\hat\beta_1 - \beta_1$ is a weighted average of the $u_i$'s, and weighted averages of normals are normal, assumption 5 implies that $\hat\beta_1$ is normally distributed conditional on $X_1, \ldots, X_n$.
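A quick numerical check of this identity (a sketch with simulated data; the true $\beta$'s are made up):

```python
import numpy as np

# Verify beta1_hat - beta1 == (1/n) * sum(w_i * u_i) on simulated data.
rng = np.random.default_rng(3)
n, beta0, beta1 = 200, 1.0, 2.0
x = rng.normal(0, 1, n)
u = rng.normal(0, 1, n)
y = beta0 + beta1 * x + u

beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
w = (x - x.mean()) / (np.sum((x - x.mean()) ** 2) / n)
print(beta1_hat - beta1, np.mean(w * u))   # the two numbers agree exactly
```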
In addition, under assumptions 1 – 5, the t-statistic has a Student t distribution with $n - 2$ degrees of freedom under the null hypothesis.
Why $n - 2$? Because we estimated two parameters, $\beta_0$ and $\beta_1$.
For n < 30, the t critical values can be a fair bit larger than the N (0,1)
critical values
For n > 50 or so, the difference in tn–2 and N (0,1) distributions is
negligible. Recall the Student t table:
degrees of freedom    5% t-distribution critical value
10                    2.23
20                    2.09
30                    2.04
60                    2.00
∞                     1.96
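These critical values can be reproduced directly (a sketch using scipy):

```python
from scipy.stats import norm, t

# Two-sided 5% critical values: t_{df, 0.975}, approaching z_{0.975} = 1.96.
for df in (10, 20, 30, 60):
    print(df, round(t.ppf(0.975, df), 2))
print("inf", round(norm.ppf(0.975), 2))
```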
Practical implication:
If $n < 50$ and you really believe that, for your application, $u$ is homoskedastic and normally distributed, then use the $t_{n-2}$ critical values instead of the $N(0,1)$ critical values for hypothesis tests and confidence intervals.
In most econometric applications, there is no reason to believe that u is
homoskedastic and normal – usually, there is good reason to believe
that neither assumption holds.
Fortunately, in modern applications, n > 50, so we can rely on the
large-n results presented earlier, based on the CLT, to perform
hypothesis tests and construct confidence intervals using the large-n
normal approximation.
Summary and Assessment
The initial policy question:
Suppose new teachers are hired so the student-teacher ratio
falls by one student per class. What is the effect of this policy
intervention (this “treatment”) on test scores?
Does our regression analysis give a convincing answer? Not really. Districts with low STR tend to be ones with lots of other resources and higher-income families, which provide kids with more learning opportunities outside school. This suggests that $\mathrm{corr}(u_i, STR_i) \neq 0$, so $E(u_i \mid X_i) \neq 0$, and the OLS estimator $\hat\beta_1$ is biased.
Digression on Causality
The original question (what is the quantitative effect of an intervention that reduces class size?) is a question about a causal effect: the effect on $Y$ of applying a unit of the treatment is $\beta_1$.
But what is, precisely, a causal effect?
The common-sense definition of causality isn’t precise enough
for our purposes.
In this course, we define a causal effect as the effect that is
measured in an ideal randomized controlled experiment.
Ideal Randomized Controlled Experiment
Ideal: subjects all follow the treatment protocol – perfect
compliance, no errors in reporting, etc.!
Randomized: subjects from the population of interest are
randomly assigned to a treatment or control group (so there are
no confounding factors)
Controlled: having a control group permits measuring the
differential effect of the treatment
Experiment: the treatment is assigned as part of the
experiment: the subjects have no choice, which means that
there is no “reverse causality” in which subjects choose the
treatment they think will work best.
Back to class size:
What is an ideal randomized controlled experiment for measuring
the effect on Test Score of reducing STR?
How does our regression analysis of observational data differ from
this ideal?
The treatment is not randomly assigned
In the US – in our observational data – districts with higher
family incomes are likely to have both smaller classes and higher
test scores.
As a result, it is plausible that $E(u_i \mid X_i = x) \neq 0$.
If so, Least Squares Assumption #1 does not hold.
If so, $\hat\beta_1$ is biased: does an omitted factor make class size seem more important than it really is?