
R University

Statistics Exam 2023-02-28 Exam ID 00001

Name:

Student ID:

Signature:

1. (a) (b) X (c) (d)

2. (a) X (b) X (c) (d) X (e)

3. (a) (b) X (c) (d)

4. (a) (b) X (c) (d) X

5. (a) X (b) (c)

6. (a) (b) (c) X

7. (a) (b) (c) X

8. (a) X (b) (c)

9. (a) X (b) (c) (d) X

10. proof

1. Problem
Which of the following are TRUE?
(a) Let V (Y | X) = σ 2 . Then, the estimate σ̂ measures the proportion of the variation
explained by the regression model.
(b) As long as the mean function of a linear regression model includes the intercept, then ∑ᵢ₌₁ⁿ eᵢ = 0, where eᵢ are the regression residuals.

(c) Let vᵢ = (xᵢ − x̄) / (∑ᵢ₌₁ⁿ xᵢ² − nx̄²). Then, ∑ᵢ₌₁ⁿ vᵢxᵢ = 0.

(d) Given β̂₀ the ordinary least squares estimate of the intercept, its variance is: V(β̂₀ | X) = σ² x̄²/SSₓ, where x̄ is the mean of the predictor X, SSₓ is the sum of squares of X and σ² is the residual variance.
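Statement (b) can be checked numerically; a small numpy sketch with made-up data (illustrative only, not part of the exam):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: simple linear regression fitted by least squares.
n = 30
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])   # design matrix WITH intercept
y = 1.5 + 2.0 * x + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta_hat                   # regression residuals

# With an intercept in the mean function, the residuals sum to zero.
print(round(abs(e.sum()), 10))  # -> 0.0 (up to floating-point error)
```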
2. Problem
A multiple regression model of the following form is fitted to a data set:

y = β0 + β1 x1 + β2 x2 + β3 x3 + β4 x4 + ε, ε ∼ N (0, σ 2 ).

The model is fitted using the software R and the following summary output is obtained:

             Estimate Std. Error t value Pr(>|t|)
(Intercept)        NA     0.2242  3.3054   0.0015
x1            -0.2897         NA -1.2162   0.2278
x2             0.3502     0.2454      NA   0.1577
x3             0.0330     0.2278  0.1447   0.8853
x4             0.5185         NA  1.8751   0.0647

Residual standard error: 1.9381 on 74 degrees of freedom


Multiple R-squared: 0.104 , Adjusted R-squared: ???
F-statistic: ??? on 4 and 74 DF, p-value: 0.08423

Which of the following statements are TRUE?


(a) The value of the t-statistic of β2 is smaller than 1.437
(b) The value of the Residual Sum of Squares (RSS) is greater than 277.862
(c) The value of the F-statistic is 0.116
(d) The two-sided 95% confidence interval for β3 is about 0.033 ± 1.99 × 0.2278
(e) The estimated value of y for x1 = 1, x2 = 0, x3 = 0 and x4 = 1 is ≈ 1.05
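The missing table entries follow from the identity t = estimate / std. error, and the remaining quantities from the printed summary; a minimal Python sketch (not part of the original exam; the reconstructed values are approximations) working through statements (a)–(c) and (e):

```python
k = 4                 # predictors; residual df = 74, so n = 79
sigma_hat = 1.9381    # residual standard error from the output
r2 = 0.104            # multiple R-squared from the output

# Missing entries follow from t = estimate / std.error:
b0 = 3.3054 * 0.2242           # intercept estimate
se1 = 0.2897 / 1.2162          # std. error of x1 (not needed below)
t2 = 0.3502 / 0.2454           # t-statistic of x2       (statement a)

rss = sigma_hat**2 * 74        # RSS = sigma_hat^2 * df  (statement b)
f = (r2 / k) / ((1 - r2) / 74) # global F-statistic      (statement c)
yhat = b0 - 0.2897 + 0.5185    # prediction, x1 = x4 = 1 (statement e)

print(round(t2, 3), round(rss, 3), round(f, 3), round(yhat, 3))
# -> 1.427 277.961 2.147 0.97
```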

3. Problem
Which of the following are TRUE?
(a) If you have estimated coefficients from a fit of E(Y | X1 ) = β0 + β1 x1 , then you know
the sign of β1 in E(Y | X1 , X2 ) = β0 + β1 x1 + β2 x2 .
(b) We have a regression with mean function E(Y | X1 , X2 ) = β0 + β1 x1 + β2 x2 . Suppose
that the two terms X1 and X2 have sample correlation equal to 0. Then the value of
the slope of the regression for X2 on X1 is 0.
(c) If you fit a multiple regression with 15 data points and 3 predictors (including the intercept), then the “hat” matrix is 3 × 3.

(d) Given the model y = β0 + β1 x1 + β2 x2 + β3 x3 + ε, ε ∼ N(0, σ²), the standard error of β̂0 is σ̂ × √0.524 when X⊤X is:

          x0      x1      x2      x3
  x0   0.524  −0.507  −0.190  −0.059
  x1  −0.507   0.652   0.262   0.167
  x2  −0.190   0.262   0.283   0.079
  x3  −0.059   0.167   0.079   0.212
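Whether the claimed standard error is right hinges on reading the printed matrix correctly: since Var(β̂ | X) = σ²(X⊤X)⁻¹, the relevant quantity is the (0, 0) entry of the *inverse*. A small numpy sketch (assuming, as the statement says, that the printed matrix is X⊤X itself):

```python
import numpy as np

# The matrix printed in the exam, read as X'X (per the statement).
XtX = np.array([
    [ 0.524, -0.507, -0.190, -0.059],
    [-0.507,  0.652,  0.262,  0.167],
    [-0.190,  0.262,  0.283,  0.079],
    [-0.059,  0.167,  0.079,  0.212],
])

# Var(beta_hat | X) = sigma^2 (X'X)^{-1}, so se(beta_0) is
# sigma_hat * sqrt of the (0, 0) entry of the INVERSE, not of X'X.
inv = np.linalg.inv(XtX)
print(round(inv[0, 0], 3))
```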

4. Problem
Which of the following are TRUE?

(a) You did a regression analysis with model y = β0 + β1 x1 + β2 x2 + ε and you would like to add a new regressor to improve your model. Given the plots below (X3 plotted against X2, and X4 against X2), adding X3 to the model might increase multicollinearity, whereas adding X4 might decrease multicollinearity issues:

[Scatter plots: X3 vs. X2 and X4 vs. X2]
(b) Consider the linear regression model E[Y | X1 , X2 , X3 ] = β0 + β1 x1 + β2 x2 + β3 x3 . To
compute the VIF of β̂1 you need the R2 of the regression model E[X1 | X2 , X3 ].
(c) In a regression model with a numeric predictor, a dummy predictor and without any
interaction term, there can be more than one slope, but only one intercept.
(d) Let X be a categorical variable with 3 levels (a, b, c) and consider the representation
using the corresponding dummies x1 , x2 and x3 . Consider the fitted linear regression
y = 0.85 − 0.5x2 − 0.35x3 + ε , where level “a” is the baseline. Then, −0.5 represents
the difference in the average Y between level “b” and “a”.
5. Problem
Which of the following is TRUE?
(a) The leverage of the observation i is xᵢ⊤(X⊤X)⁻¹xᵢ, where X is the design matrix with rows xᵢ⊤.
(b) High leverage points are the observations that do not fit the model.
(c) Let V(Y | X) = Σ. Then, the Generalized Least Squares estimate is β̂ = (X⊤ΣX)⁻¹X⊤Σy, where X is the design matrix and y the observed response vector.
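For comparison, the textbook GLS estimator carries Σ⁻¹, not Σ: β̂ = (X⊤Σ⁻¹X)⁻¹X⊤Σ⁻¹y. A minimal numpy sketch with made-up data (illustrative only, not from the exam):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: n = 50 observations, intercept plus one predictor.
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Sigma = np.diag(rng.uniform(0.5, 2.0, size=n))  # known error covariance
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.multivariate_normal(np.zeros(n), Sigma)

# Correct GLS: beta_hat = (X' Sigma^{-1} X)^{-1} X' Sigma^{-1} y.
# Note the Sigma inverses -- the formula in statement (c) omits them.
Si = np.linalg.inv(Sigma)
beta_gls = np.linalg.solve(X.T @ Si @ X, X.T @ Si @ y)
print(beta_gls.round(2))
```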
6. Problem
Given the residual plots below, which of the following is FALSE?

[Residual plots (i) and (ii): residuals against fitted values]

(a) Plot (i) suggests that the assumption of constant variance is not consistent with observed data.
(b) Plot (ii) indicates some nonlinearity.
(c) Neither
7. Problem
Which of the following is TRUE?
(a) The backward stepwise selection searches through 2p possible models.
(b) Models with many parameters are always better for prediction than simple models with just a few parameters.
(c) The forward stepwise selection searches through 1 + p(p + 1)/2 models.
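The model counts in statements (a) and (c) can be made concrete; a quick sketch (p = 6 is an assumption, chosen to match the setup of Problem 8):

```python
# Number of candidate models examined, for p candidate predictors.
p = 6
best_subset = 2 ** p             # exhaustive search over all subsets
stepwise = 1 + p * (p + 1) // 2  # forward (or backward) stepwise path

print(best_subset, stepwise)  # -> 64 22
```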
8. Problem
Consider the output of the variable selection procedure carried out using the R function
regsubsets(). Which of the following is TRUE?

Subset selection object


Call: eval(expr, envir, enclos)
6 Variables (and intercept)
Forced in Forced out
x1 FALSE FALSE
x2 FALSE FALSE
x3 FALSE FALSE
x4 FALSE FALSE
x5 FALSE FALSE
x6 FALSE FALSE
1 subsets of each size up to 6
Selection Algorithm: exhaustive
x1 x2 x3 x4 x5 x6
1 ( 1 ) " " " " " " " " " " "*"
2 ( 1 ) " " "*" " " " " " " "*"
3 ( 1 ) "*" "*" " " " " " " "*"
4 ( 1 ) "*" "*" "*" " " " " "*"
5 ( 1 ) "*" "*" "*" "*" " " "*"
6 ( 1 ) "*" "*" "*" "*" "*" "*"

(a) Neither.
(b) The best 2-variable model contains only x1 and x2.

(c) The best model is the one including all the 6 predictors.

9. Problem
Which of the following is TRUE?
(a) We consider the logistic regression model with linear predictor 0.3 + 0.6x1 . For x1 = 0,
the estimated probability of success is smaller than 0.58.
(b) The standard error of the logistic regression coefficient βj is given by √(θ̂(1 − θ̂)), where θ̂ is the estimated probability of success.
(c) We consider the logistic regression model with linear predictor β0 +β1 x. Let [−0.39; 0.87]
be the 95%-confidence interval for β1 . In this case, a z-Test with significance level 1%
rejects the null hypothesis H0 : β1 = 0.
(d) We consider the logistic regression model with linear predictor 0.6 + 0.3x1 . For x1 = 2,
the estimated probability of success is higher than 0.64.
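Statements (a) and (d) reduce to evaluating the inverse-logit at the given linear predictors; a short Python check (not part of the original exam):

```python
import math

def logistic(eta):
    """Inverse-logit: estimated success probability for linear predictor eta."""
    return 1.0 / (1.0 + math.exp(-eta))

# Statement (a): linear predictor 0.3 + 0.6*x1 at x1 = 0.
p_a = logistic(0.3)            # ~0.574, smaller than 0.58
# Statement (d): linear predictor 0.6 + 0.3*x1 at x1 = 2.
p_d = logistic(0.6 + 0.3 * 2)  # ~0.769, higher than 0.64

print(round(p_a, 3), round(p_d, 3))  # -> 0.574 0.769
```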
10. Problem
Derive the global F-statistic for a multiple linear regression model and show that, when only one predictor is included, the F-statistic is equal to the square of the t-statistic of β1.
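A sketch of the expected derivation (in LaTeX; standard notation with k predictors and n observations, writing TSS for the total and RSS for the residual sum of squares):

```latex
F \;=\; \frac{(\mathrm{TSS}-\mathrm{RSS})/k}{\mathrm{RSS}/(n-k-1)}
\;\sim\; F_{k,\,n-k-1} \quad \text{under } H_0 : \beta_1=\dots=\beta_k=0.

\text{For } k=1:\qquad
\mathrm{TSS}-\mathrm{RSS} = \hat\beta_1^2\, SS_x, \qquad
\hat\sigma^2 = \frac{\mathrm{RSS}}{n-2}, \qquad
\widehat{V}(\hat\beta_1) = \frac{\hat\sigma^2}{SS_x},

\text{so}\qquad
F = \frac{\hat\beta_1^2\, SS_x}{\hat\sigma^2}
  = \left(\frac{\hat\beta_1}{\sqrt{\hat\sigma^2/SS_x}}\right)^{\!2}
  = t_{\hat\beta_1}^2 .
```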
