2.4 Confidence Intervals and Prediction Intervals in Linear Models

Uploaded by

yusra fayyaz


2.4 Confidence intervals and prediction intervals in Linear Models

Confidence and prediction intervals

In this section, we calculate confidence and prediction intervals for linear Gaussian models.

Confidence vs Prediction Interval

Given a vector of predictors $x_*$, we want a confidence interval for the conditional mean $x_*^\top \beta$ and a prediction interval for a future, as yet unobserved, response $Y_* = x_*^\top \beta + \epsilon_*$, where $\epsilon_*$ is an error drawn from $N(0, \sigma^2)$ independently of $\epsilon_i$, $i = 1, \ldots, n$. Below we give the distributions from which the confidence and prediction intervals are derived.

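Confidence interval
First, note that $x_*^\top \hat\beta$ is Gaussian, because it is a linear function of the Gaussian vector $\hat\beta$. Its mean is

$$E(x_*^\top \hat\beta) = x_*^\top E[\hat\beta] = x_*^\top \beta$$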

and its variance is

$$\mathrm{Var}(x_*^\top \hat\beta) = \mathrm{Cov}(x_*^\top \hat\beta) = x_*^\top \mathrm{Cov}(\hat\beta)\, x_* = \sigma^2 x_*^\top (X^\top X)^{-1} x_*$$

where $X$ is the design matrix for the fitted linear model.

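So we have

$$x_*^\top \hat\beta \sim N\left(x_*^\top \beta,\ \sigma^2 x_*^\top (X^\top X)^{-1} x_*\right)$$

and, after standardising,

$$\frac{x_*^\top \hat\beta - x_*^\top \beta}{\sigma \sqrt{x_*^\top (X^\top X)^{-1} x_*}} \sim N(0, 1).$$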

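It can be shown that $\hat\beta = (X^\top X)^{-1} X^\top y$ and $y - \hat y = (I - H)y$, where $H = X (X^\top X)^{-1} X^\top$ is the hat matrix, are independent. This is done by showing that any row of $(X^\top X)^{-1} X^\top$ is orthogonal to any row of $I - H$: the two vectors are jointly Gaussian, so zero covariance between them implies independence.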
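The orthogonality claim can be checked numerically. The following is a minimal sketch on a small simulated design matrix (the matrix and the names `A`, `H` and `M` are illustrative assumptions, not part of the notes), with $H = X (X^\top X)^{-1} X^\top$:

```r
# Hypothetical design matrix: an intercept and two simulated predictors
set.seed(42)
X <- cbind(1, rnorm(10), rnorm(10))

A <- solve(crossprod(X)) %*% t(X)   # (X'X)^{-1} X', maps y to beta-hat
H <- X %*% A                        # hat matrix
M <- diag(nrow(X)) - H              # I - H, maps y to the residuals

# Inner products between every row of A and every row of M;
# all entries should be zero up to floating-point error
max_abs_inner <- max(abs(A %*% t(M)))
print(max_abs_inner)
```

A value of the order of machine precision is what the argument predicts, since $(X^\top X)^{-1} X^\top (I - H) = 0$.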

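It then follows that $x_*^\top \hat\beta$ and $(n - p)\hat\sigma^2/\sigma^2$ are independent, because $\hat\sigma^2$ depends on the data only through the residuals $y - \hat y$. Since $(n - p)\hat\sigma^2/\sigma^2$ has a $\chi^2_{n-p}$ distribution, dividing the standard normal quantity above by the square root of this $\chi^2_{n-p}$ variable over its degrees of freedom gives a Student-t distribution with $n - p$ degrees of freedom:

$$\frac{x_*^\top \hat\beta - x_*^\top \beta}{\sigma \sqrt{x_*^\top (X^\top X)^{-1} x_*}} \Bigg/ \sqrt{\frac{(n - p)\hat\sigma^2/\sigma^2}{n - p}} \sim t_{n-p},$$

which simplifies to

$$\frac{x_*^\top \hat\beta - x_*^\top \beta}{\hat\sigma \sqrt{x_*^\top (X^\top X)^{-1} x_*}} \sim t_{n-p}.$$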
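Inverting this last pivotal quantity gives the interval explicitly. As a sketch, writing $t_{n-p,\,1-\alpha/2}$ for the $1 - \alpha/2$ quantile of the $t_{n-p}$ distribution (notation assumed here; the notes do not introduce it), a $100(1 - \alpha)\%$ confidence interval for the conditional mean $x_*^\top \beta$ is:

```latex
\[
x_*^\top \hat\beta \;\pm\; t_{n-p,\,1-\alpha/2}\;
\hat\sigma \sqrt{x_*^\top (X^\top X)^{-1} x_*}
\]
```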

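Prediction intervals
Define $\hat Y_* = x_*^\top \hat\beta$ and note that $E(Y_* - \hat Y_*) = 0$. Since $x_*^\top \hat\beta$ and $\epsilon_*$ are independent,

$$\begin{aligned}
\mathrm{Var}(Y_* - \hat Y_*) &= \mathrm{Var}(x_*^\top \hat\beta) + \mathrm{Var}(\epsilon_*) \\
&= \sigma^2 x_*^\top (X^\top X)^{-1} x_* + \sigma^2 \\
&= \sigma^2 \left(1 + x_*^\top (X^\top X)^{-1} x_*\right).
\end{aligned}$$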

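With a similar argument as above, it can be shown that $Y_* - \hat Y_*$ and $(n - p)\hat\sigma^2/\sigma^2$ are independent. Thus

$$\frac{Y_* - \hat Y_*}{\sigma \sqrt{1 + x_*^\top (X^\top X)^{-1} x_*}} \Bigg/ \sqrt{\frac{(n - p)\hat\sigma^2/\sigma^2}{n - p}} \sim t_{n-p}$$

and upon simplifying

$$\frac{Y_* - \hat Y_*}{\hat\sigma \sqrt{1 + x_*^\top (X^\top X)^{-1} x_*}} \sim t_{n-p}.$$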
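The two pivotal quantities lead to intervals of the form $\hat Y_* \pm t\,\hat\sigma\sqrt{q}$ (confidence) and $\hat Y_* \pm t\,\hat\sigma\sqrt{1 + q}$ (prediction), where $q = x_*^\top (X^\top X)^{-1} x_*$ and $t$ is the appropriate $t_{n-p}$ quantile. The sketch below computes both by matrix algebra and prints the corresponding predict() results for comparison. The data are simulated, and every variable name is an illustrative assumption rather than something taken from the notes.

```r
# Simulated data (hypothetical; not from the notes)
set.seed(1)
n <- 50
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = dat)

X         <- model.matrix(fit)                 # design matrix (n x p)
p         <- ncol(X)
beta_hat  <- coef(fit)
sigma_hat <- sqrt(sum(residuals(fit)^2) / (n - p))
XtX_inv   <- solve(crossprod(X))               # (X'X)^{-1}

x_star     <- c(1, 0.5, -1)                    # intercept, x1, x2
y_hat_star <- sum(x_star * beta_hat)           # x*' beta-hat
q          <- drop(t(x_star) %*% XtX_inv %*% x_star)
t_crit     <- qt(0.975, df = n - p)            # 95% two-sided

conf_int <- y_hat_star + c(-1, 1) * t_crit * sigma_hat * sqrt(q)
pred_int <- y_hat_star + c(-1, 1) * t_crit * sigma_hat * sqrt(1 + q)

# Same intervals from predict() for comparison
new <- data.frame(x1 = 0.5, x2 = -1)
print(rbind(by_hand = conf_int,
            predict = predict(fit, new, interval = "confidence")[1, 2:3]))
print(rbind(by_hand = pred_int,
            predict = predict(fit, new, interval = "prediction")[1, 2:3]))
```

The only difference between the two calculations is the extra 1 under the square root, which accounts for the variance of the new error $\epsilon_*$.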

You may wish to watch the following example video to reinforce your understanding of this topic:

How to calculate confidence and prediction intervals for simple linear regression predictions in R (video, available on www.youtube.com)

Confidence and prediction interval for Linear Gaussian Models

Example: Sydney maximum temperatures

In R, predict() is a generic function, while predict.lm() is the method used for prediction from a linear model object. The help file for predict.lm() (see ?predict.lm) gives the details of required and optional arguments:

first argument (required): a fitted linear model;

newdata argument: a data frame containing the values at which predictions are to be made; if newdata is omitted, predictions are made for the original predictors;
se.fit argument: if se.fit = TRUE, estimated standard errors of the predictions are returned;
interval argument: "none" (the default), "confidence" or "prediction", the type of interval to compute;
level argument: the confidence level of the interval (0.95 by default).

With se.fit = TRUE, the value returned by predict() for a linear model object is a list containing the predictions ($fit), their standard errors ($se.fit), the residual degrees of freedom ($df) and the residual standard deviation ($residual.scale); otherwise just the predictions are returned.

Consider now the Sydney maximum temperature dataset mos.df:

mos.df <- read.table("/course/data/mos.df.txt", header=TRUE, quote="\"")

head(mos.df)

  Maxtemp Modst  Modsp Modthik
1    14.3 288.6 1000.8  5591.2
2    15.2 284.6 1010.8  5438.6
3    18.7 285.7 1011.0  5505.2
4    18.4 285.8 1004.5  5560.3
5    20.9 286.5 1002.9  5530.8
6    23.4 287.6 1003.4  5558.3

Here the response is Maxtemp and the predictors are Modst, Modsp and Modthik.

Suppose we want predictions of maximum temperature for the following two sets of predictor values:

Modst = 285.1, Modsp = 1021.2 and Modthik = 5380.4

Modst = 288.0, Modsp = 1026.8 and Modthik = 5388.4

To use the predict.lm() function, we set up a data frame containing our new data (one row for each of our two predictions).

Then, using the predict.lm() function, we can obtain the confidence and prediction intervals as follows:

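# Read the data and fit the linear model with all three predictors
mos.df <- read.table("/course/data/mos.df.txt", header=TRUE, quote="\"")
mos.lm <- lm(Maxtemp ~ ., data=mos.df)

# New data: one row for each of the two predictions
mosnew.df <- data.frame(Modst=c(285.1,288.0), Modsp=c(1021.2,1026.8), Modthik=c(5380.4,5388.4))

# 95% confidence intervals for the conditional mean
mos.pred <- predict.lm(mos.lm, newdata=mosnew.df, se.fit=TRUE, interval="confidence", level=0.95)
mos.pred

# 95% prediction intervals for new observations (overwrites mos.pred)
mos.pred <- predict.lm(mos.lm, newdata=mosnew.df, se.fit=TRUE, interval="prediction", level=0.95)
mos.pred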

Output with interval="confidence":

$fit
       fit      lwr      upr
1 15.50811 14.81804 16.19818
2 15.85861 15.09306 16.62416

$se.fit
        1         2
0.3509169 0.3892985

$df
[1] 365

$residual.scale
[1] 3.009404

Output with interval="prediction":

$fit
       fit      lwr      upr
1 15.50811 9.550067 21.46615
2 15.85861 9.891352 21.82587

$se.fit
        1         2
0.3509169 0.3892985

$df
[1] 365

$residual.scale
[1] 3.009404
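The printed components are enough to rebuild both sets of limits by hand, which ties the output back to the two t-distribution results derived earlier. The sketch below uses only the numbers shown in the output; the variable names are illustrative assumptions. Here $se.fit plays the role of $\hat\sigma\sqrt{x_*^\top (X^\top X)^{-1} x_*}$ and $residual.scale is $\hat\sigma$.

```r
# Values copied from the predict.lm() output above
fit       <- c(15.50811, 15.85861)     # $fit[, "fit"]
se_fit    <- c(0.3509169, 0.3892985)   # $se.fit
df        <- 365                       # $df
res_scale <- 3.009404                  # $residual.scale (sigma-hat)

t_crit <- qt(0.975, df)                # quantile for level = 0.95

# Confidence limits: fit +/- t * se.fit
conf <- cbind(lwr = fit - t_crit * se_fit,
              upr = fit + t_crit * se_fit)

# Prediction limits: add sigma-hat^2 to the squared standard error
pred_se <- sqrt(res_scale^2 + se_fit^2)
pred <- cbind(lwr = fit - t_crit * pred_se,
              upr = fit + t_crit * pred_se)

print(conf)
print(pred)
```

Up to rounding of the copied inputs, these limits should agree with the lwr and upr columns printed by predict.lm().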

Activity in R: Plotting confidence and prediction intervals

Consider the following generated x-predictor and y-response vectors:

x<-rnorm(15)
x

## [1] -0.4723019 2.3502426 0.5491130 0.3013668 -1.7001447 -2.1002455
## [7] 1.0016725 1.1004800 0.5962411 -0.5250126 1.1277512 0.5301167
## [13] 0.1709285 -1.2492901 0.2313901

y<-x+rnorm(15)
y

## [1] -0.5248888 2.5948113 -1.5645586 0.9136845 -2.2149522 -2.8905325
## [7] -0.8397268 0.7740053 0.7110674 -0.8100348 0.6468322 -0.6134713
## [13] 0.5671229 -2.1764990 1.5912105

Next, define the new values of x for which your predictions will be calculated:

new <- data.frame(x = seq(-3, 3, 0.5))

In this activity:
1. Predict the values of y given your new values of x
2. Find confidence and prediction intervals
3. Visualise your intervals using the matplot() function in R
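
One possible way to work through these three steps is sketched below. It is not the only solution: it enters the x and y values printed above directly (so the result does not depend on the random seed), and the object names `fit`, `conf` and `pred` as well as the colours and line types are illustrative choices.

```r
# x and y as printed above
x <- c(-0.4723019, 2.3502426, 0.5491130, 0.3013668, -1.7001447, -2.1002455,
       1.0016725, 1.1004800, 0.5962411, -0.5250126, 1.1277512, 0.5301167,
       0.1709285, -1.2492901, 0.2313901)
y <- c(-0.5248888, 2.5948113, -1.5645586, 0.9136845, -2.2149522, -2.8905325,
       -0.8397268, 0.7740053, 0.7110674, -0.8100348, 0.6468322, -0.6134713,
       0.5671229, -2.1764990, 1.5912105)
new <- data.frame(x = seq(-3, 3, 0.5))

fit <- lm(y ~ x)

# Steps 1 and 2: predictions with confidence and prediction limits
conf <- predict(fit, new, interval = "confidence")
pred <- predict(fit, new, interval = "prediction")

# Step 3: fitted line, confidence band and prediction band in one plot
matplot(new$x, cbind(conf, pred[, -1]), type = "l",
        lty = c(1, 2, 2, 3, 3),
        col = c("black", "blue", "blue", "red", "red"),
        xlab = "x", ylab = "predicted y")
points(x, y)
legend("topleft", legend = c("fit", "confidence", "prediction"),
       lty = 1:3, col = c("black", "blue", "red"))
```

Dropping the first column of `pred` avoids drawing the fitted line twice, since both calls return the same `fit` column.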
Additional activity

Question

Which of the following is true of confidence and prediction intervals:

the prediction interval is narrower than the confidence interval;

the confidence interval refers to the confidence interval for the conditional mean, in our notation, $x_*^\top \beta$;

the confidence interval can be calculated in R using the predict() function.
