3008 Assignment 1 - Due Oct 9th Revised
2020-21 Term 1
Assignment #1 (Problem 2b revised)
Problem 1 [30 points]: Suppose the following regression model is fitted to a data set with
observations {(xi, yi), i = 1, 2, …, n}:
$$y_i = \beta x_i + e_i, \qquad e_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$
(a) [9 points] Based on the least squares method and the fact that $RSS/\sigma^2 \sim \chi^2_{n-1}$ (df = n-1
since df = n from the data and df = 1 from estimating $\beta$), compute the least squares
estimates for $\beta$ and $\sigma^2$.
What are the values of the least squares estimates $\hat\beta$ and $\hat\sigma^2$? Does the sum of residuals
equal zero?
Problem 2 [18 points]: Suppose a simple linear regression is fitted to the data {(xi, yi), i = 1,
2, …, n} with $x_1 = x_2 = \dots = x_{n-1} = a$ and $x_n = a + n\delta$.
(a) [5 points] Show that $SXX = n(n-1)\delta^2$.
(b) [7 points] Show that the OLS estimate for β1 is $\hat\beta_1 = \dfrac{y_n - \bar y_{n-1}}{n\delta}$, where
$\bar y_{n-1} = \dfrac{1}{n-1} \sum_{i=1}^{n-1} y_i$.
(Revised: the divisor should be (n-1), i.e. $\bar y_{n-1}$ is the average of the first (n-1) $y_i$.)
(c) [6 points] Do you think the regression line obtained from the OLS estimates would pass
through Points A and B below? Verify.
Point A: $(x, y) = (a, \bar y_{n-1})$    Point B: $(x, y) = (x_n, y_n)$
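Problem 3 [10 points]: Consider the residuals $\{\hat e_i\}$ from the simple linear regression:
$$\hat e_i = y_i - \hat\beta_0 - \hat\beta_1 x_i,$$
where $\hat\beta_1 = SXY/SXX$ and $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$ are the OLS estimates for β1 and β0.
Show that $\{\hat e_i, i = 1, 2, \dots, n\}$ are uncorrelated with the explanatory variables {xi, i = 1, 2, …, n}.
That is,
$$\hat\sigma(x, \hat e) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar x)(\hat e_i - \bar{\hat e}) = 0.$$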
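The property in Problem 3 can be checked numerically before proving it. The following is a minimal R sketch on simulated data; the sample size, true coefficients and seed are arbitrary assumptions, and a numerical check is an illustration, not a substitute for the algebraic proof.

```r
# Simulated data: n, the coefficients and the seed are arbitrary assumptions.
set.seed(1)
n <- 30
x <- runif(n, 0, 10)
y <- 1.5 + 0.8 * x + rnorm(n)

# OLS estimates from the formulas quoted in the problem statement
SXX <- sum((x - mean(x))^2)
SXY <- sum((x - mean(x)) * (y - mean(y)))
b1 <- SXY / SXX
b0 <- mean(y) - b1 * mean(x)

# Residuals and their sample covariance with x
e_hat <- y - (b0 + b1 * x)
cov_x_e <- sum((x - mean(x)) * (e_hat - mean(e_hat))) / (n - 1)
print(cov_x_e)  # zero up to floating-point rounding
```

The hand-computed `b0` and `b1` should agree with `coef(lm(y ~ x))`, which gives a quick sanity check on the formulas.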
Problem 4 [22 points]: Suppose simple linear regression is fitted to the data {(x1, y1), … (x19, y19)},
with $E(Y \mid X = x) = \beta_0 + \beta_1 x$, $\mathrm{Var}(Y \mid X = x) = \sigma^2$.
The coefficient table and ANOVA table below show some of the estimated values:
(a) [11 points] Replicate the two tables above, and fill in ALL the missing values (to 5 significant
figures) in the two tables.
(The p-values can be obtained from R commands like “1-pf(F0, df1, df2)” for the
right-hand tail probability of $F_{df1, df2}$, or “pt(t0, d)” for the cdf of $t_d$.)
(b) [3 points] Based on the results in part (a), what is the sample correlation coefficient between
(c) [8 points] Based on the results in part (a), test the hypothesis that β1 = -0.2 at α = 0.05.
You should set up the 4 steps of hypothesis testing as on Ch2 page 65.
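The two R commands mentioned in part (a) can be tried out as follows. This is only a sketch: the statistic `t0` is a made-up value, NOT one of the missing table entries, and the residual degrees of freedom are taken as n - 2 = 17 on the assumption that the usual simple linear regression with an intercept is fitted to the 19 observations.

```r
# Hypothetical test statistic for illustration only (not from the assignment's tables).
t0  <- 2.5
df2 <- 19 - 2   # residual df, assuming an intercept and a slope are estimated

# cdf of the t distribution and the two-sided p-value built from it
p_lower <- pt(t0, df2)
p_two   <- 2 * (1 - pt(abs(t0), df2))

# Right-hand tail probability of the F distribution.
# For a single slope, F0 = t0^2 with df1 = 1, so both p-values coincide.
F0  <- t0^2
p_F <- 1 - pf(F0, 1, df2)

print(c(p_lower = p_lower, p_two = p_two, p_F = p_F))
```

`pf(F0, 1, df2, lower.tail = FALSE)` returns the same right-hand tail probability and is numerically safer when the p-value is very small.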
Problem 5 (R problem) [20 points]: The R library ‘alr3’ contains the “segreg” data, which
contains the electricity consumption (in KWH) and mean temperature (in F) for a building at
the University of Minnesota Twin Cities campus for 39 months in 1988-1992.
(https://www.rdocumentation.org/packages/alr3/versions/2.0.5/topics/segreg)
Suppose that we are interested in how the electricity consumption (y=segreg$C) is affected
by the monthly mean temperature (x=segreg$Temp), primarily driven by the use of air
conditioning.
(a) [10 points] Based on R code similar to that from Ch2 page 23, obtain the OLS
scatterplot of the data, and add the regression line obtained in part (a) to the plot.
(c) [4 points] Suppose an outlier is defined as an observation (xi, yi) with $|\hat e_i| > 2\hat\sigma$. Do you
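The workflow that Problem 5 describes (fit the regression, draw the scatterplot with the fitted line, flag large residuals) might be sketched in R as below. Several things here are assumptions: the outlier rule is read as $|\hat e_i| > 2\hat\sigma$, the fallback data frame is a made-up placeholder used only when the alr3 package is not installed, and this is not the code from Ch2 page 23.

```r
# Use the real data when alr3 is installed; otherwise fall back to placeholder
# data with the same column names (Temp, C). The placeholder values are invented.
if (requireNamespace("alr3", quietly = TRUE)) {
  data("segreg", package = "alr3")
} else {
  set.seed(2)
  Temp <- runif(39, 10, 80)
  segreg <- data.frame(Temp = Temp, C = 70 + 0.3 * Temp + rnorm(39, sd = 5))
}

# OLS fit of electricity consumption on monthly mean temperature
fit <- lm(C ~ Temp, data = segreg)
print(coef(fit))

# Scatterplot with the fitted regression line
plot(C ~ Temp, data = segreg,
     xlab = "Mean temperature (F)", ylab = "Electricity consumption (KWH)")
abline(fit)

# Flag observations whose absolute residual exceeds twice the residual standard error
sigma_hat <- summary(fit)$sigma
e_hat <- residuals(fit)
outliers <- which(abs(e_hat) > 2 * sigma_hat)
print(segreg[outliers, ])
```

`summary(fit)$sigma` is the residual standard error, i.e. the square root of RSS divided by the residual degrees of freedom.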