36-401 Modern Regression HW #9 Solutions: Problem 1 (44 Points)
(a) (7 pts.)
Let
$$\mathrm{SSE} = \sum_{i=1}^n (Y_i - \beta X_i)^2.$$
Then
$$\frac{\partial}{\partial \beta}\,\mathrm{SSE} = -2\sum_{i=1}^n (Y_i - \beta X_i)X_i.$$
Set
$$\frac{\partial}{\partial \beta}\,\mathrm{SSE} = 0.$$
Then,
$$-2\sum_{i=1}^n (Y_i - \beta X_i)X_i = 0$$
$$\sum_{i=1}^n (Y_i X_i - \beta X_i^2) = 0$$
$$\beta = \frac{\sum_{i=1}^n Y_i X_i}{\sum_{i=1}^n X_i^2}.$$
And
$$\frac{\partial^2}{\partial \beta^2}\,\mathrm{SSE} = 2\sum_{i=1}^n X_i^2 > 0,$$
so the SSE is minimized at
$$\hat{\beta} = \frac{\sum_{i=1}^n Y_i X_i}{\sum_{i=1}^n X_i^2}.$$
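As an unofficial numerical check (not part of the graded solution), the closed form agrees with lm() when the intercept is suppressed; the simulated data below are purely illustrative:
set.seed(1)
x <- runif(50)
y <- 3*x + rnorm(50)
sum(y*x) / sum(x^2)      # closed-form estimate of beta
coef(lm(y ~ x - 1))      # should match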
(b) (7 pts.)
Let
$$\mathrm{WSSE} = \sum_{i=1}^n \frac{(Y_i - \beta X_i)^2}{\sigma_i^2} = \sum_{i=1}^n \left( \frac{Y_i - \beta X_i}{\sigma_i} \right)^2.$$
Then
$$\frac{\partial}{\partial \beta}\,\mathrm{WSSE} = -2\sum_{i=1}^n \frac{Y_i X_i - \beta X_i^2}{\sigma_i^2}.$$
Set
$$\frac{\partial}{\partial \beta}\,\mathrm{WSSE} = 0.$$
Then,
$$\beta = \frac{\sum_{i=1}^n Y_i X_i/\sigma_i^2}{\sum_{i=1}^n X_i^2/\sigma_i^2}.$$
And
$$\frac{\partial^2}{\partial \beta^2}\,\mathrm{WSSE} = 2\sum_{i=1}^n \frac{X_i^2}{\sigma_i^2} > 0,$$
so the WSSE is minimized at
$$\tilde{\beta} = \frac{\sum_{i=1}^n Y_i X_i/\sigma_i^2}{\sum_{i=1}^n X_i^2/\sigma_i^2}.$$
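A similar unofficial check for the weighted estimator, using lm()'s weights argument with weights 1/σᵢ² (illustrative data):
set.seed(2)
x <- runif(50)
sigma <- 0.5 + x                            # illustrative noise SDs
y <- 3*x + rnorm(50, sd = sigma)
sum(y*x/sigma^2) / sum(x^2/sigma^2)         # closed-form WLS estimate
coef(lm(y ~ x - 1, weights = 1/sigma^2))    # should match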
(c) (7 pts.)
$$\mathrm{E}[\hat{\beta}] = \mathrm{E}\!\left[\frac{\sum_{i=1}^n Y_i X_i}{\sum_{i=1}^n X_i^2}\right] = \frac{\sum_{i=1}^n X_i \mathrm{E}[Y_i]}{\sum_{i=1}^n X_i^2} = \frac{\beta \sum_{i=1}^n X_i^2}{\sum_{i=1}^n X_i^2} = \beta$$

$$\mathrm{Var}(\hat{\beta}) = \mathrm{Var}\!\left(\frac{\sum_{i=1}^n Y_i X_i}{\sum_{i=1}^n X_i^2}\right) = \frac{\sum_{i=1}^n X_i^2 \sigma_i^2}{\left(\sum_{i=1}^n X_i^2\right)^2}$$

$$\mathrm{E}[\tilde{\beta}] = \mathrm{E}\!\left[\frac{\sum_{i=1}^n Y_i X_i/\sigma_i^2}{\sum_{i=1}^n X_i^2/\sigma_i^2}\right] = \frac{1}{\sum_{i=1}^n X_i^2/\sigma_i^2}\sum_{i=1}^n \frac{X_i \mathrm{E}[Y_i]}{\sigma_i^2} = \frac{\beta}{\sum_{i=1}^n X_i^2/\sigma_i^2}\sum_{i=1}^n \frac{X_i^2}{\sigma_i^2} = \beta$$

$$\mathrm{Var}(\tilde{\beta}) = \mathrm{Var}\!\left(\frac{\sum_{i=1}^n Y_i X_i/\sigma_i^2}{\sum_{i=1}^n X_i^2/\sigma_i^2}\right) = \frac{1}{\left(\sum_{i=1}^n X_i^2/\sigma_i^2\right)^2}\sum_{i=1}^n \frac{X_i^2 \mathrm{Var}(Y_i)}{\sigma_i^4} = \frac{\sum_{i=1}^n X_i^2/\sigma_i^2}{\left(\sum_{i=1}^n X_i^2/\sigma_i^2\right)^2} = \frac{1}{\sum_{i=1}^n X_i^2/\sigma_i^2}$$
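A small Monte Carlo experiment (again unofficial; the design and noise SDs are illustrative) confirms both estimators are unbiased and that the two variance formulas hold:
set.seed(3)
x <- runif(20); sigma <- 0.5 + x         # fixed design and noise SDs
est <- t(replicate(5000, {
  y <- 3*x + rnorm(20, sd = sigma)
  c(OLS = sum(y*x)/sum(x^2), WLS = sum(y*x/sigma^2)/sum(x^2/sigma^2))
}))
colMeans(est)                             # both approximately beta = 3
apply(est, 2, var)                        # compare with the formulas:
c(sum(x^2*sigma^2)/sum(x^2)^2, 1/sum(x^2/sigma^2))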
(d) (8 pts.)
$$\mathrm{Var}(\tilde{\beta}) = \frac{1}{\sum_{i=1}^n X_i^2/\sigma_i^2} = \frac{\sum_{i=1}^n X_i^2 \sigma_i^2}{\left(\sum_{i=1}^n X_i^2/\sigma_i^2\right)\left(\sum_{i=1}^n X_i^2 \sigma_i^2\right)} \le \frac{\sum_{i=1}^n X_i^2 \sigma_i^2}{\left(\sum_{i=1}^n X_i^2\right)^2} = \mathrm{Var}(\hat{\beta}),$$
where the inequality comes from Cauchy-Schwarz applied to the vectors $(X_i/\sigma_i)$ and $(X_i \sigma_i)$:
$$\left(\sum_{i=1}^n X_i^2\right)^2 = \left(\sum_{i=1}^n \frac{X_i}{\sigma_i}\cdot X_i\sigma_i\right)^2 \le \left(\sum_{i=1}^n \frac{X_i^2}{\sigma_i^2}\right)\left(\sum_{i=1}^n X_i^2\sigma_i^2\right).$$
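The inequality can also be spot-checked numerically for an arbitrary design (illustrative values):
set.seed(4)
x <- runif(30); sigma <- runif(30, 0.5, 2)
v.OLS <- sum(x^2*sigma^2) / sum(x^2)^2    # variance of the OLS estimator
v.WLS <- 1 / sum(x^2/sigma^2)             # variance of the WLS estimator
v.WLS <= v.OLS                            # TRUE, as Cauchy-Schwarz guarantees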
(e) (7 pts.)
Above we showed
$$\tilde{\beta} = \frac{1}{\sum_{i=1}^n X_i^2/\sigma_i^2}\sum_{i=1}^n \frac{X_i}{\sigma_i^2}\,Y_i.$$
So $\tilde{\beta}$ is a linear combination of the Normal random variables $Y_i$, and is thus also Normally distributed. We have already found the mean and variance in part (c). Therefore,
$$\tilde{\beta} \sim N\!\left(\beta,\ \frac{1}{\sum_{i=1}^n X_i^2/\sigma_i^2}\right).$$
(f) (8 pts.)
set.seed(100)
n = 100
b.OLS <- rep(NA, 1000)
b.WLS.known.var <- rep(NA, 1000)
b.WLS.unknown.var <- rep(NA, 1000)
for (itr in 1:1000){
  x = runif(n)
  s = x^2                                  # true noise SDs
  y = 3*x + rnorm(n, mean = 0, sd = s)
  out1 = lm(y ~ x - 1)                     # OLS
  b.OLS[itr] <- out1$coefficients[1]
  out2 = lm(y ~ x - 1, weights = 1/s^2)    # WLS, known variances
  b.WLS.known.var[itr] <- out2$coefficients[1]
  # variances unknown: estimate them by regressing the squared OLS
  # residuals on x^4 (one reasonable estimator; the original is not shown)
  s2 = fitted(lm(residuals(out1)^2 ~ I(x^4) - 1))
  w = 1/s2
  out3 = lm(y ~ x - 1, weights = w)        # WLS, estimated variances
  b.WLS.unknown.var[itr] <- out3$coefficients[1]
}
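The histograms below might be drawn along these lines (plot layout and labels are assumed, not taken from the original code):
par(mfrow = c(1, 3))
hist(b.OLS, breaks = 20, main = "OLS beta", xlab = "")
hist(b.WLS.known.var, breaks = 20, main = "WLS beta, known s", xlab = "")
hist(b.WLS.unknown.var, breaks = 20, main = "WLS beta, estimated s", xlab = "")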
[Figure: three histograms of the 1000 simulated estimates (x-axis 2.5 to 3.5) — OLS beta; WLS beta, known s; and WLS beta with estimated variances]
If we are dealing with a highly heteroskedastic data set such as this one, and we do not know the variance of the noise, weighted least squares based on estimated variances is the better strategy.
Problem 2 [28 points]
(a) (7 pts.)
[Figure: scatterplot matrices of Day, n, ybar, and SD for the Type 0 and Type 1 shoots]
[Figure: ybar vs. Day, with the Type 0 and Type 1 shoots plotted separately]
A common intercept looks feasible; however, ybar appears to increase at a faster rate for Type 1.
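A sketch of how these exploratory plots could be produced, assuming allshoots contains columns Day, n, ybar, SD, and Type (only Day, Type, ybar, and n are confirmed by the model output below):
pairs(subset(allshoots, Type == 0, select = c(Day, n, ybar, SD)))   # Type 0
pairs(subset(allshoots, Type == 1, select = c(Day, n, ybar, SD)))   # Type 1
with(allshoots, plot(Day, ybar, pch = Type + 1))                    # both types
legend("topleft", pch = 1:2, legend = c("Type 0", "Type 1"))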
(b) (7 pts.)
##
## Call:
## lm(formula = ybar ~ Day * Type, data = allshoots)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.74747 -0.21000 0.08631 0.35212 0.89507
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.475879 0.230981 41.025 < 2e-16 ***
## Day 0.187238 0.003696 50.655 < 2e-16 ***
## Type 0.339406 0.329997 1.029 0.309
## Day:Type 0.031217 0.005625 5.550 1.21e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5917 on 48 degrees of freedom
## Multiple R-squared: 0.9909, Adjusted R-squared: 0.9903
## F-statistic: 1741 on 3 and 48 DF, p-value: < 2.2e-16
Table 2: 90% confidence intervals for regression coefficients

                  5 %        95 %
(Intercept)   9.0884732   9.8632855
Day           0.1810386   0.1934377
Type         -0.2140739   0.8928853
Day:Type      0.0217825   0.0406507
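The fit and the intervals come from lm() and confint(); the object name fit.ols is illustrative:
fit.ols <- lm(ybar ~ Day * Type, data = allshoots)
summary(fit.ols)
confint(fit.ols, level = 0.90)    # 90% CIs reported in Table 2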
[Figure: studentized residuals vs. Day and vs. Type, by shoot type]
[Figure: regression diagnostics — Residuals vs Fitted, Normal Q-Q, standardized residuals, and Cook's distance; observations 3, 15, 35, and 52 stand out]
(c) (7 pts.)
##
## Call:
## lm(formula = ybar ~ Day * Type, data = allshoots, weights = n)
##
## Weighted Residuals:
## Min 1Q Median 3Q Max
## -4.2166 -0.8300 0.1597 0.9882 3.3196
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.488374 0.238615 39.764 < 2e-16 ***
## Day 0.187258 0.003486 53.722 < 2e-16 ***
## Type 0.485380 0.362496 1.339 0.187
## Day:Type 0.030072 0.005800 5.185 4.28e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.675 on 48 degrees of freedom
## Multiple R-squared: 0.9906, Adjusted R-squared: 0.9901
## F-statistic: 1695 on 3 and 48 DF, p-value: < 2.2e-16
                  5 %        95 %
(Intercept)   9.0881641   9.8885842
Day           0.1814118   0.1931043
Type         -0.1226072   1.0933663
Day:Type      0.0203446   0.0397999
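The weighted fit uses the group sizes n as weights, matching the Call above; the name fit.wls is illustrative:
fit.wls <- lm(ybar ~ Day * Type, data = allshoots, weights = n)
summary(fit.wls)
confint(fit.wls, level = 0.90)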
(d) (7 pts.)
[Figure: ybar vs. Day for the Type 0 and Type 1 shoots]
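Presumably the figure overlays the two fitted lines from the weighted model; a sketch of one way to draw it (appearance assumed):
with(allshoots, plot(Day, ybar, pch = Type + 1))
cf <- coef(fit.wls)
abline(a = cf["(Intercept)"], b = cf["Day"], lty = 1)          # Type 0 line
abline(a = cf["(Intercept)"] + cf["Type"],
       b = cf["Day"] + cf["Day:Type"], lty = 2)                # Type 1 line
legend("topleft", pch = 1:2, lty = 1:2, legend = c("Type 0", "Type 1"))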
Problem 3 [28 points]
(a) (7 pts.)
##
## Call:
## lm(formula = FoodIndex ~ ., data = BigMac2003)
##
## Residuals:
## Min 1Q Median 3Q Max
## -27.0642 -6.3965 -0.0262 5.6928 26.3002
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.09968 11.19872 -0.098 0.9221
## BigMac -0.20569 0.07798 -2.638 0.0107 *
## Bread 0.44383 0.10564 4.201 9.11e-05 ***
## Rice 0.26881 0.13597 1.977 0.0527 .
## Bus 3.59014 2.83317 1.267 0.2101
## Apt 0.01825 0.00434 4.204 9.02e-05 ***
## TeachGI -0.97768 0.86750 -1.127 0.2643
## TeachNI 2.22275 1.13819 1.953 0.0556 .
## TaxRate 0.26530 0.25724 1.031 0.3066
## TeachHours 0.48015 0.20478 2.345 0.0224 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.86 on 59 degrees of freedom
## Multiple R-squared: 0.7981, Adjusted R-squared: 0.7673
## F-statistic: 25.91 on 9 and 59 DF, p-value: < 2.2e-16
[Figure: studentized residuals plotted against each predictor — BigMac, Bread, Rice, Bus, Apt, TeachGI, TeachNI, TaxRate, TeachHours]
[Figure: regression diagnostics — Residuals vs Fitted, Normal Q-Q, and Cook's distance; Tokyo, London, Miami, Mumbai, Nairobi, and Shanghai stand out]
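One way to generate these panels (BigMac2003 is assumed to come from the alr3 package; the layout is assumed):
fit.full <- lm(FoodIndex ~ ., data = BigMac2003)
par(mfrow = c(3, 3))
for (p in setdiff(names(BigMac2003), "FoodIndex")) {
  plot(BigMac2003[[p]], rstudent(fit.full), xlab = p, ylab = "Stud. Resids")
}
par(mfrow = c(2, 2)); plot(fit.full)    # standard lm diagnostics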
(b) (7 pts.)
           0.5 %       99.5 %
BigMac   -0.4132679   0.0018835

The confidence interval includes 0, so, given all the other variables, we cannot conclude that the price of a Big Mac has a significant association with FoodIndex at level α = 0.01.
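The interval is the 99% confidence interval from confint() (fit.full as above):
confint(fit.full, "BigMac", level = 0.99)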
(c) (7 pts.)
We are testing the null hypothesis that, given the price of a Big Mac, all of the other variables are conditionally uncorrelated with FoodIndex. The ANOVA table shows very strong evidence in favor of the alternative, so we reject the null.
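Concretely, this is the partial F-test comparing the nested models, e.g.:
fit.bigmac <- lm(FoodIndex ~ BigMac, data = BigMac2003)
anova(fit.bigmac, fit.full)    # very small p-value: reject the reduced model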
(d) (7 pts.)
library(DAAG)
out1 <- cv.lm(df = BigMac2003, form.lm = formula(FoodIndex ~ .), m = 10, plotit = F)
out2 <- cv.lm(df = BigMac2003, form.lm = formula(FoodIndex ~ BigMac), m = 10, plotit = F)
The predictive MSE of each model is estimated by 10-fold cross-validation:
$$\widehat{\mathrm{Err}}_{\mathrm{full}} = 1764, \qquad \widehat{\mathrm{Err}}_{\mathrm{BigMac}} = 472.$$
We conclude that the model using only BigMac has better predictive accuracy.
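These estimates can be read from cv.lm's printed output; equivalently, assuming the returned data frame carries the cross-validated predictions in a cvpred column (as DAAG documents):
mean((out1$cvpred - out1$FoodIndex)^2)   # full model CV MSE
mean((out2$cvpred - out2$FoodIndex)^2)   # BigMac-only CV MSE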