Lect 2
Lecture outline
• scatter plot
• sample covariance
• sample correlation
• Suppose we would like to test whether the mean wages of men and
women with a master's degree differ by an amount $d_0$:
$$H_0: \mu_{W_M} - \mu_{W_F} = d_0 \qquad H_1: \mu_{W_M} - \mu_{W_F} \neq d_0$$
Step 2: Estimate $\sigma_{W_M}$ and $\sigma_{W_F}$ to obtain $SE(\bar{W}_M - \bar{W}_F)$:
$$SE(\bar{W}_M - \bar{W}_F) = \sqrt{\frac{s_{W_M}^2}{n_M} + \frac{s_{W_F}^2}{n_F}}$$
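The standard error formula above can be sketched in a few lines of Python. The function name `se_diff_means` and the two standard deviations are hypothetical, chosen only for illustration; the sample sizes of 500 mirror the lecture's wage example.

```python
import math

def se_diff_means(s_m, n_m, s_f, n_f):
    """Standard error of the difference between two independent sample means."""
    return math.sqrt(s_m ** 2 / n_m + s_f ** 2 / n_f)

# Hypothetical sample standard deviations (made up for illustration).
se = se_diff_means(s_m=20000.0, n_m=500, s_f=18000.0, n_f=500)
print(f"SE = {se:.2f}")
```

Each group contributes its own variance divided by its own sample size, so the two groups do not need equal variances or equal sizes.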
Suppose we have random samples of 500 men and 500 women with a
master's degree, and we would like to test whether the mean wages are equal:
$$H_0: \mu_{W_M} - \mu_{W_F} = 0 \qquad H_1: \mu_{W_M} - \mu_{W_F} \neq 0$$
Step 3: Compute the t-statistic:
$$t^{act} = \frac{(\bar{W}_M - \bar{W}_F) - 0}{SE(\bar{W}_M - \bar{W}_F)} = \frac{10996.04}{1240.709} = 8.86$$
• Thus the 95% confidence interval for $(\mu_{W_M} - \mu_{W_F})$ is the set of values of $d_0$
within ±1.96 standard errors of $\bar{W}_M - \bar{W}_F$:
$$[8561.34,\ 13430.73]$$
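As a rough check of the arithmetic, the t-statistic and a large-sample interval can be recomputed from the two figures reported in the lecture (10996.04 and 1240.709). The variable names are illustrative. With the critical value 1.96 the interval comes out slightly narrower than the one reported above, which would be consistent with the software using a t critical value just above 1.96; that explanation is an assumption, not something the slides state.

```python
# Figures reported in the lecture for the men-women wage comparison.
diff_in_means = 10996.04   # difference between the two sample mean wages
se_diff = 1240.709         # standard error of that difference
d0 = 0.0                   # hypothesized difference under H0

# t-statistic: (estimate - hypothesized value) / standard error
t_act = (diff_in_means - d0) / se_diff

# Large-sample 95% confidence interval: estimate +/- 1.96 standard errors
z = 1.96
ci_lower = diff_in_means - z * se_diff
ci_upper = diff_in_means + z * se_diff

print(f"t = {t_act:.2f}")
print(f"95% CI (using 1.96) = ({ci_lower:.2f}, {ci_upper:.2f})")
```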
• The mean causal effect is the difference between the mean outcome
when treated and the mean outcome when untreated
$$H_0: \mu_{X=1} - \mu_{X=0} = 0 \qquad H_1: \mu_{X=1} - \mu_{X=0} \neq 0$$
Estimate the difference with $\bar{Y}_{Treated} - \bar{Y}_{Control}$
Step 2: Compute $SE(\bar{Y}_{Treated} - \bar{Y}_{Control})$
• The test on the previous slide is based on the sample size n being large
$$t^{act} = \frac{\bar{Y} - \mu_{Y,0}}{SE(\bar{Y})}$$
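A minimal sketch of this t-statistic computed from raw data follows. The sample and the hypothesized mean `mu_0` are made up for illustration, and the sample is kept tiny only for readability; the large-sample approximation mentioned above would not be reliable with so few observations.

```python
import math
import statistics

# Hypothetical sample, made up for illustration only.
sample = [52.0, 48.5, 50.5, 47.0, 53.5, 49.0, 51.5, 50.0]
mu_0 = 49.0  # hypothesized population mean under H0 (also made up)

n = len(sample)
y_bar = statistics.mean(sample)
s_y = statistics.stdev(sample)    # sample standard deviation (n - 1 divisor)
se_y_bar = s_y / math.sqrt(n)     # standard error of the sample mean

# t-statistic: distance of the sample mean from mu_0 in standard-error units
t_act = (y_bar - mu_0) / se_y_bar
print(f"mean = {y_bar:.3f}, SE = {se_y_bar:.3f}, t = {t_act:.3f}")
```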
• What is the relation between the beer tax and traffic fatalities?
• What is the relation between class size and student test scores?
• In this and coming lectures we will focus on the last of these questions.
• We will use a data set that contains data on test performance, school
characteristics and student demographic backgrounds.
• The data are from 420 districts in California.
• Data were obtained from the California Department of Education
• Main variables of interest:
• TestScore is the district average of the reading and math scores of
5th grade students
• ClassSize is defined as the number of students divided by the
number of full-time equivalent teachers in the district.
[Figure: scatter plot of test score (vertical axis) against class size (horizontal axis)]
Sample covariance
• If $(X_i, Y_i)$ are i.i.d. and have finite fourth moments, $E(X^4) < \infty$ and $E(Y^4) < \infty$, then
$$s_{XY} \xrightarrow{p} \sigma_{XY}$$
• The sample covariance between class size and test scores is $s_{CT} = -8.16$
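For reference, the sample covariance is the sum of cross-products of deviations from the sample means divided by $n - 1$, the same expression that appears in the numerator of the OLS slope later in this lecture. A small sketch with made-up data (these are not the California figures):

```python
# Hypothetical (class size, test score) pairs, made up for illustration.
class_size = [15.0, 18.0, 20.0, 22.0, 25.0]
test_score = [680.0, 665.0, 660.0, 650.0, 640.0]

n = len(class_size)
x_bar = sum(class_size) / n
y_bar = sum(test_score) / n

# Sample covariance: cross-products of deviations, divided by n - 1
s_xy = sum((x - x_bar) * (y - y_bar)
           for x, y in zip(class_size, test_score)) / (n - 1)
print(f"s_XY = {s_xy:.3f}")
```

A negative value means that above-average class sizes tend to go together with below-average test scores in the sample.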
Sample correlation
• What does it mean for the sample covariance between test scores and
class size to equal -8.16?
• The units of the covariance are the units of test scores multiplied by the
units of class size
• The sample correlation between class size and test scores is $r_{CT} = -0.23$
Sample covariance matrix computed in Stata:

               test_s~e   class_~e
test_score       363.03
class_size     -8.15932    3.57895

Sample correlation matrix computed in Stata:

               test_s~e   class_~e
test_score       1.0000
class_size      -0.2264     1.0000
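The correlation can be recovered from the covariance matrix, because the sample correlation is the covariance divided by the product of the two sample standard deviations. The sketch below uses the three numbers from the Stata covariance output; the variable names are illustrative.

```python
import math

# Entries of the sample covariance matrix reported in the Stata output.
var_test_score = 363.03     # sample variance of test_score
var_class_size = 3.57895    # sample variance of class_size
cov_ct = -8.15932           # sample covariance of class_size and test_score

# Sample correlation: covariance / (std. dev. of X * std. dev. of Y)
r_ct = cov_ct / (math.sqrt(var_test_score) * math.sqrt(var_class_size))
print(f"r_CT = {r_ct:.4f}")
```

Unlike the covariance, the result is unit-free and lies between -1 and 1, which is what makes it easier to interpret.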
Linear regression with one regressor
What is the effect on district test scores if we increase district average
class size by 1 student?
$\beta_{ClassSize}$ is, by definition, the slope of a straight line relating test scores and
class size:
$$\text{Test score} = \beta_0 + \beta_{ClassSize} \times \text{Class size}$$
where $\beta_0$ is the intercept of the straight line.
• The average test score in district $i$ depends not only on the average
class size but also on other factors, such as:
• Student background
• .....
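• The equation describing the linear relation between Test score and
Class size is better written as
$$Y_i = \beta_0 + \beta_1 X_i + u_i$$
where $Y_i$ is the test score and $X_i$ the class size in district $i$, and $u_i$ is the error term that collects all other factors.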
[Figure: scatter plot of test score (vertical axis) against class size (horizontal axis)]
The OLS estimator chooses the regression coefficients so that the estimated
regression line is as close as possible to the observed data,
where closeness is measured by the sum of the squared
mistakes made in predicting $Y$ given $X$. For candidate coefficients $b_0$ and $b_1$, the mistake for observation $i$ is
$$Y_i - (b_0 + b_1 X_i) = Y_i - b_0 - b_1 X_i$$
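$$Y_i = \mu_Y + u_i$$
• Let $m$ be an estimator of $\mu_Y$
• The least squares estimator chooses $m$ to minimize $\sum_{i=1}^n (Y_i - m)^2$; setting the derivative with respect to $m$ to zero gives
$$-2\sum_{i=1}^n Y_i + 2 \cdot n \cdot m = 0$$
$$\frac{1}{n}\sum_{i=1}^n Y_i - m = 0$$
• Solving for $m$ gives
$$m = \frac{1}{n}\sum_{i=1}^n Y_i = \bar{Y}$$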
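The result that the sample mean is the least squares estimator of $\mu_Y$ can be checked numerically. The sketch below evaluates the sum of squared deviations on a grid of candidate values around the sample mean; the data and the helper name `sum_of_squares` are made up for illustration.

```python
# Hypothetical observations, made up for illustration.
y = [4.0, 7.0, 1.0, 8.0, 5.0]

def sum_of_squares(m, data):
    """Sum of squared deviations of the data from a candidate value m."""
    return sum((y_i - m) ** 2 for y_i in data)

y_bar = sum(y) / len(y)

# Candidate values from y_bar - 2 to y_bar + 2 in steps of 0.1.
candidates = [y_bar + step / 10 for step in range(-20, 21)]
best = min(candidates, key=lambda m: sum_of_squares(m, y))

print(f"sample mean = {y_bar}, grid minimizer = {best}")
```

A grid search is used only to make the minimization visible; the derivation above shows that no search is needed.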
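$$Y_i = \beta_0 + \beta_1 X_i + u_i$$
• Step 1:
$$\frac{\partial}{\partial \hat{\beta}_0} \sum_{i=1}^n \left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\right)^2 = 0$$
• Step 2:
$$\frac{\partial}{\partial \hat{\beta}_1} \sum_{i=1}^n \left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\right)^2 = 0$$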
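$$\frac{\partial}{\partial \hat{\beta}_0} \sum_{i=1}^n \hat{u}_i^2 = -2\sum_{i=1}^n \left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\right) = 0$$
Dividing by $-2n$:
$$\frac{1}{n}\sum_{i=1}^n Y_i - \frac{1}{n}\sum_{i=1}^n \hat{\beta}_0 - \frac{1}{n}\sum_{i=1}^n \hat{\beta}_1 X_i = 0$$
$$\frac{1}{n}\sum_{i=1}^n Y_i - \frac{1}{n} n \hat{\beta}_0 - \hat{\beta}_1 \frac{1}{n}\sum_{i=1}^n X_i = 0$$
$$\bar{Y} - \hat{\beta}_0 - \hat{\beta}_1 \bar{X} = 0$$
• This gives
$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$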
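$$\frac{\partial}{\partial \hat{\beta}_1} \sum_{i=1}^n \hat{u}_i^2 = -2\sum_{i=1}^n X_i\left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\right) = 0$$
Rewrite, substituting $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$ and dividing by $-2$:
$$\sum_{i=1}^n X_i\left[\left(Y_i - \bar{Y}\right) - \left(\hat{\beta}_1 X_i - \hat{\beta}_1 \bar{X}\right)\right] = 0$$
Rewrite:
$$\sum_{i=1}^n X_i\left(Y_i - \bar{Y}\right) - \hat{\beta}_1 \sum_{i=1}^n X_i\left(X_i - \bar{X}\right) = 0$$
Algebra trick:
$$\sum_{i=1}^n \left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right) - \hat{\beta}_1 \sum_{i=1}^n \left(X_i - \bar{X}\right)\left(X_i - \bar{X}\right) = 0$$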
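Algebra trick:
$$\sum_{i=1}^n \left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right) = \sum_{i=1}^n X_i Y_i - \sum_{i=1}^n X_i \bar{Y} - \sum_{i=1}^n \bar{X} Y_i + \sum_{i=1}^n \bar{X}\bar{Y}$$
$$= \sum_{i=1}^n X_i Y_i - \sum_{i=1}^n X_i \bar{Y} - n\bar{X}\,\frac{1}{n}\sum_{i=1}^n Y_i + n\bar{X}\bar{Y}$$
$$= \sum_{i=1}^n X_i Y_i - \sum_{i=1}^n X_i \bar{Y} - n\bar{X}\bar{Y} + n\bar{X}\bar{Y}$$
$$= \sum_{i=1}^n X_i Y_i - \sum_{i=1}^n X_i \bar{Y}$$
$$= \sum_{i=1}^n X_i\left(Y_i - \bar{Y}\right)$$
By similar reasoning:
$$\sum_{i=1}^n X_i\left(X_i - \bar{X}\right) = \sum_{i=1}^n \left(X_i - \bar{X}\right)\left(X_i - \bar{X}\right)$$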
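Both identities of the algebra trick are easy to confirm numerically. The data below are made up for illustration; any data set would do, since the identities hold exactly.

```python
# Hypothetical data, made up for illustration.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 6.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Identity 1: sum (X_i - X_bar)(Y_i - Y_bar) equals sum X_i (Y_i - Y_bar)
lhs_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
rhs_xy = sum(xi * (yi - y_bar) for xi, yi in zip(x, y))

# Identity 2: sum X_i (X_i - X_bar) equals sum (X_i - X_bar)^2
lhs_xx = sum(xi * (xi - x_bar) for xi in x)
rhs_xx = sum((xi - x_bar) ** 2 for xi in x)

print(lhs_xy, rhs_xy)
print(lhs_xx, rhs_xx)
```

The identities work because deviations from the mean sum to zero, so subtracting $\bar{X}$ times that sum changes nothing.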
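The first-order condition $\frac{\partial}{\partial \hat{\beta}_1} \sum_{i=1}^n \hat{u}_i^2 = 0$ therefore becomes
$$\sum_{i=1}^n \left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right) - \hat{\beta}_1 \sum_{i=1}^n \left(X_i - \bar{X}\right)\left(X_i - \bar{X}\right) = 0$$
$$\hat{\beta}_1 = \frac{\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^n (X_i - \bar{X})(X_i - \bar{X})} = \frac{\frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})}{\frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})(X_i - \bar{X})} = \frac{s_{XY}}{s_X^2}$$
The predicted values and residuals are
$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$$
$$\hat{u}_i = Y_i - \hat{Y}_i$$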
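The closed-form OLS estimators derived above can be sketched as follows. The data are made up for illustration (they are not the California district data), and the final two lines check the two first-order conditions from Step 1 and Step 2.

```python
# Hypothetical (class size, test score) pairs, made up for illustration.
x = [15.0, 18.0, 20.0, 22.0, 25.0]
y = [680.0, 665.0, 660.0, 650.0, 640.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance of X and Y, and sample variance of X
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)

beta1_hat = s_xy / s_xx                   # slope: s_XY / s_X^2
beta0_hat = y_bar - beta1_hat * x_bar     # intercept: Y_bar - beta1_hat * X_bar

y_hat = [beta0_hat + beta1_hat * xi for xi in x]    # predicted values
u_hat = [yi - yh for yi, yh in zip(y, y_hat)]       # residuals

# First-order conditions: both sums should be zero up to rounding error.
foc_intercept = sum(u_hat)
foc_slope = sum(xi * ui for xi, ui in zip(x, u_hat))
print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
```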
[Figure: scatter plot of test score (vertical axis) against class size (horizontal axis)]
• $\hat{\beta}_0 = 698.93$: the expected test score when class size is zero equals
698.93. (What does it mean for class size to be zero?)
. mean test_score

                    Mean   Std. Err.   [95% Conf. Interval]
test_score      654.1565    .9297082    652.3291    655.984

. regress test_score
Measures of fit
How well does the estimated regression line describe the data?
• Are the observations in the scatter plot clustered closely around the
regression line?
There are two measures of how well the OLS line fits the data: the $R^2$ and the standard error of the regression (SER).
The standard error of the regression (SER) measures how far $Y_i$ typically is
from its predicted value.
The $R^2$
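The total sum of squares (TSS) can be divided into the explained sum of
squares (ESS) and the residual sum of squares (SSR):
$$\sum_{i=1}^n \left(Y_i - \bar{Y}\right)^2 = \sum_{i=1}^n \left(\hat{Y}_i - \bar{Y}\right)^2 + \sum_{i=1}^n \hat{u}_i^2$$
$$R^2 = \frac{ESS}{TSS} = \frac{TSS - SSR}{TSS} = 1 - \frac{SSR}{TSS} = 1 - \frac{\sum_{i=1}^n \hat{u}_i^2}{\sum_{i=1}^n \left(Y_i - \bar{Y}\right)^2}$$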
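The decomposition TSS = ESS + SSR and the resulting $R^2$ can be illustrated on a small made-up data set. The numbers are hypothetical and fit a line far better than the California data do, so the $R^2$ here is not comparable to the lecture's value.

```python
# Hypothetical (class size, test score) pairs, made up for illustration.
x = [15.0, 18.0, 20.0, 22.0, 25.0]
y = [680.0, 665.0, 660.0, 650.0, 640.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS coefficients from the closed-form solution
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
beta0_hat = y_bar - beta1_hat * x_bar
y_hat = [beta0_hat + beta1_hat * xi for xi in x]

tss = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
ess = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained sum of squares
ssr = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # residual sum of squares

r_squared = 1 - ssr / tss
print(f"TSS = {tss:.3f}, ESS = {ess:.3f}, SSR = {ssr:.3f}, R^2 = {r_squared:.4f}")
```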
Example: Class size and test scores
$$R^2 = 0.0512$$
The SER measures the spread of the observations around the regression line in the
units of the dependent variable.
$$SER = 18.6$$
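The formula for the SER did not survive in these notes. The sketch below assumes the standard textbook definition, $SER = \sqrt{SSR/(n-2)}$, where $n - 2$ corrects for the two estimated coefficients. The function name `ser` and the data are hypothetical.

```python
import math

def ser(x, y):
    """Standard error of the regression of y on x, assuming SER = sqrt(SSR / (n - 2))."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
                 / sum((xi - x_bar) ** 2 for xi in x))
    beta0_hat = y_bar - beta1_hat * x_bar
    # Residual sum of squares around the fitted line
    ssr = sum((yi - beta0_hat - beta1_hat * xi) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(ssr / (n - 2))

# Hypothetical (class size, test score) pairs, made up for illustration.
class_size = [15.0, 18.0, 20.0, 22.0, 25.0]
test_score = [680.0, 665.0, 660.0, 650.0, 640.0]
print(f"SER = {ser(class_size, test_score):.3f}")
```

Because the SER is in the units of the dependent variable, a value such as 18.6 reads directly as test-score points.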