2.4 Confidence Intervals and Prediction Intervals in Linear Models
Confidence interval
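First, note that the estimator $x_*^\top \hat\beta$ of the conditional mean is Gaussian. Its mean is $x_*^\top \beta$ and its variance is $\sigma^2 x_*^\top (X^\top X)^{-1} x_*$.
So we have
$$x_*^\top \hat\beta \sim N\left(x_*^\top \beta,\ \sigma^2 x_*^\top (X^\top X)^{-1} x_*\right)$$
or
$$\frac{x_*^\top \hat\beta - x_*^\top \beta}{\sigma \sqrt{x_*^\top (X^\top X)^{-1} x_*}} \sim N(0, 1).$$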
It can be shown that $\hat\beta = (X^\top X)^{-1} X^\top y$ and $y - \hat y = (I - H)y$ are independent (by showing
that any row of $(X^\top X)^{-1} X^\top$ is orthogonal to any row of $I - H$).
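The orthogonality claim can be checked in one line. The sketch below assumes that $H$ denotes the usual hat matrix $H = X(X^\top X)^{-1}X^\top$ and that the errors are Gaussian with covariance $\sigma^2 I$; neither assumption is restated in this section.

$$(X^\top X)^{-1}X^\top (I - H)^\top = (X^\top X)^{-1}X^\top - (X^\top X)^{-1}X^\top X (X^\top X)^{-1} X^\top = 0.$$

Hence $\operatorname{Cov}(\hat\beta,\, y - \hat y) = \sigma^2 (X^\top X)^{-1}X^\top (I - H)^\top = 0$, and because $\hat\beta$ and $y - \hat y$ are jointly Gaussian, zero covariance implies independence.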
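It then follows that $x_*^\top \hat\beta$ and $(n - p)\hat\sigma^2/\sigma^2$ are independent. Since $(n - p)\hat\sigma^2/\sigma^2$ has a $\chi^2_{n-p}$
distribution, the quotient of the standardised estimator and the square root of this quantity divided by its degrees of freedom has a Student-t distribution with $(n - p)$ degrees of freedom:
$$\frac{x_*^\top \hat\beta - x_*^\top \beta}{\sigma \sqrt{x_*^\top (X^\top X)^{-1} x_*}} \Bigg/ \sqrt{\frac{(n-p)\hat\sigma^2/\sigma^2}{n-p}} \sim t_{n-p}$$
or
$$\frac{x_*^\top \hat\beta - x_*^\top \beta}{\hat\sigma \sqrt{x_*^\top (X^\top X)^{-1} x_*}} \sim t_{n-p}.$$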
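Inverting this pivot gives the interval itself. As a sketch, writing $t_{n-p,\,1-\alpha/2}$ for the $1-\alpha/2$ quantile of the $t_{n-p}$ distribution (notation assumed here, not taken from the text above), a $100(1-\alpha)\%$ confidence interval for the conditional mean $x_*^\top \beta$ is

$$x_*^\top \hat\beta \ \pm\ t_{n-p,\,1-\alpha/2}\ \hat\sigma \sqrt{x_*^\top (X^\top X)^{-1} x_*}.$$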
Prediction intervals
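Define $\hat Y^* = x_*^\top \hat\beta$ and note that $E(Y^* - \hat Y^*) = 0$. Since $x_*^\top \hat\beta$ and $\epsilon^*$ are independent, $\operatorname{Var}(Y^* - \hat Y^*) = \sigma^2\left(1 + x_*^\top (X^\top X)^{-1} x_*\right)$, so
$$\frac{Y^* - \hat Y^*}{\sigma \sqrt{1 + x_*^\top (X^\top X)^{-1} x_*}} \Bigg/ \sqrt{\frac{(n-p)\hat\sigma^2/\sigma^2}{n-p}} \sim t_{n-p}$$
or
$$\frac{Y^* - \hat Y^*}{\hat\sigma \sqrt{1 + x_*^\top (X^\top X)^{-1} x_*}} \sim t_{n-p}.$$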
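Solving this pivot for the new response gives the prediction interval. As a sketch, with $t_{n-p,\,1-\alpha/2}$ again denoting the $1-\alpha/2$ quantile of the $t_{n-p}$ distribution (assumed notation), a $100(1-\alpha)\%$ prediction interval for a new response at $x_*$ is

$$x_*^\top \hat\beta \ \pm\ t_{n-p,\,1-\alpha/2}\ \hat\sigma \sqrt{1 + x_*^\top (X^\top X)^{-1} x_*}.$$

The extra 1 under the square root is the contribution of the new error term, so at the same $x_*$ the prediction interval is always wider than the confidence interval for the conditional mean.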
You may wish to watch the example video below to reinforce your understanding of this topic.
The value returned by predict() for a linear model object is a list containing the predictions and their
standard errors if se.fit = TRUE; otherwise just the predictions are returned.
Here the response is Maxtemp and the predictors are Modst, Modsp and Modthik.
To use the predict.lm() function, we set up a data frame containing our new data (one row for each
of our two predictions).
Then, using the predict.lm() function we can obtain the confidence and prediction intervals as
follows:
Confidence intervals for the mean response at the two new observations:
$fit
fit lwr upr
1 15.50811 14.81804 16.19818
2 15.85861 15.09306 16.62416
$se.fit
1 2
0.3509169 0.3892985
$df
[1] 365
$residual.scale
[1] 3.009404
Prediction intervals for the same two new observations:
$fit
fit lwr upr
1 15.50811 9.550067 21.46615
2 15.85861 9.891352 21.82587
$se.fit
1 2
0.3509169 0.3892985
$df
[1] 365
$residual.scale
[1] 3.009404
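Both sets of limits can be reproduced by hand from the reported components, which makes the link to the t-based formulas concrete. The sketch below assumes the intervals were computed at the default 95% level; the variable names are illustrative only, and the numbers are copied from the output above.

```r
# Values copied from the predict.lm() output above
fit_hat <- c(15.50811, 15.85861)    # the "fit" column of $fit
se_fit  <- c(0.3509169, 0.3892985)  # $se.fit
df      <- 365                      # $df
s       <- 3.009404                 # $residual.scale

# Assumption: default 95% level, so use the 0.975 quantile of t with df degrees of freedom
t_crit <- qt(0.975, df)

# Confidence interval for the mean response: fit +/- t * se.fit
conf_int <- cbind(lwr = fit_hat - t_crit * se_fit,
                  upr = fit_hat + t_crit * se_fit)

# Prediction interval: the residual variance s^2 is added to se.fit^2
pred_se  <- sqrt(s^2 + se_fit^2)
pred_int <- cbind(lwr = fit_hat - t_crit * pred_se,
                  upr = fit_hat + t_crit * pred_se)

round(conf_int, 5)
round(pred_int, 5)
```

Up to rounding of the copied inputs, conf_int should agree with the lwr and upr columns of the narrower set of intervals, and pred_int with those of the wider set.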
Activity in R: Plotting confidence and prediction intervals
Consider the following generated x -predictor and y -response vectors:
x <- rnorm(15)      # 15 standard normal predictor values
x
y <- x + rnorm(15)  # response: x plus standard normal noise
y
Next, define the new values of x for which your predictions will be calculated:
In this activity:
1. Predict the values of y given your new values of x
2. Find confidence and prediction intervals
3. Visualise your intervals using the matplot() function in R
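One possible way through the three steps is sketched below. The grid of new x values, the seed, and all object names other than x and y are assumptions made for illustration, since the activity does not fix them.

```r
set.seed(1)                # assumption: fixed seed, for reproducibility only
x <- rnorm(15)
y <- x + rnorm(15)
fit <- lm(y ~ x)

# Assumption: an evenly spaced grid of new x values
new <- data.frame(x = seq(-3, 3, by = 0.5))

# 1. Predicted values of y at the new x values
pred <- predict(fit, newdata = new)

# 2. Confidence and prediction intervals (default 95% level)
conf_int <- predict(fit, newdata = new, interval = "confidence")
pred_int <- predict(fit, newdata = new, interval = "prediction")

# 3. Plot the fitted line, confidence limits and prediction limits against x
matplot(new$x, cbind(conf_int, pred_int[, c("lwr", "upr")]),
        type = "l", lty = c(1, 2, 2, 3, 3), col = c(1, 2, 2, 4, 4),
        xlab = "x", ylab = "predicted y")
```

The solid line is the fitted mean, the inner dashed pair is the confidence band and the outer dotted pair is the prediction band. Both bands are narrowest near the mean of the observed x values and widen as the new x moves away from it.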
Additional activity
Question
the confidence interval refers to the confidence interval for the conditional mean, in our
notation, $x_*^\top \beta$;