
Inference in Normal Regression Model
Dr. Frank Wood



Remember
• Last class we derived the sampling variance of the slope estimator:

$$\sigma^2\{b_1\} = \frac{\sigma^2}{\sum_i (X_i - \bar{X})^2}$$

• And we made the point that an estimate of σ{b1} can be arrived at by substituting the MSE for the unknown error variance:

$$s^2\{b_1\} = \frac{MSE}{\sum_i (X_i - \bar{X})^2} = \frac{SSE/(n-2)}{\sum_i (X_i - \bar{X})^2}$$
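A minimal Matlab sketch of this estimate, using synthetic data (an assumption; the data and variable names here are hypothetical):

```matlab
% Minimal sketch: estimate sigma^2{b1} by plugging MSE in for sigma^2.
x = (1:30)'; y = 2 + 0.5*x + randn(30, 1);     % hypothetical example data
n = length(x);
p = polyfit(x, y, 1); b1 = p(1); b0 = p(2);    % least-squares fit: slope, intercept
SSE = sum((y - (b0 + b1*x)).^2);               % error sum of squares
MSE = SSE / (n - 2);                           % estimate of the error variance
s2_b1 = MSE / sum((x - mean(x)).^2)            % estimated sampling variance of b1
```
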
Sampling Distribution of (b1 - β1)/s{b1}
• We determined that b1 is normally distributed, so (b1 - β1)/σ{b1} is a standard normal variable
• We don't know σ{b1}, so it must be estimated from the data. We have already denoted its estimate s{b1}
• Using this estimate, it can be shown that

$$\frac{b_1 - \beta_1}{s\{b_1\}} \sim t(n-2), \qquad s\{b_1\} = \sqrt{s^2\{b_1\}}$$



Where does this come from?
• We need to rely upon the following theorem
– For the normal error regression model,

$$\frac{SSE}{\sigma^2} = \frac{\sum_i (Y_i - \hat{Y}_i)^2}{\sigma^2} \sim \chi^2(n-2)$$

and SSE/σ² is independent of b0 and b1
• Intuitively this follows from the standard result for the sum of squared standard normal random variables
– Here the two linear constraints imposed by estimating the regression parameters each reduce the number of degrees of freedom by one.



Another useful fact : t distribution
• Let z and χ²(ν) be independent random variables (standard normal N(0,1) and chi-square with ν degrees of freedom, respectively). We then define a t random variable as follows:

$$t(\nu) = \frac{z}{\sqrt{\chi^2(\nu)/\nu}}$$

This version of the t distribution has one parameter, the degrees of freedom ν
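As a quick sanity check (not from the slides), this construction can be verified by simulation; the sketch below assumes Matlab's Statistics Toolbox for chi2rnd, tinv, and quantile.

```matlab
% Hypothetical sanity check: z ./ sqrt(chi2(nu)/nu) should behave like t(nu).
nu = 10;                      % degrees of freedom
N  = 1e5;                     % number of Monte Carlo draws
z  = randn(N, 1);             % standard normal draws
c  = chi2rnd(nu, N, 1);       % independent chi-square draws
tsamp = z ./ sqrt(c / nu);    % construct t-distributed samples
% The empirical and theoretical 97.5th percentiles should roughly agree:
[quantile(tsamp, 0.975), tinv(0.975, nu)]
```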



Distribution of the studentized statistic
• To derive the distribution of this statistic using the provided theorems, we first do the following rewrite (the numerator is an N(0,1) normal variable):

$$\frac{b_1 - \beta_1}{s\{b_1\}} = \frac{(b_1 - \beta_1)/\sigma\{b_1\}}{s\{b_1\}/\sigma\{b_1\}}, \qquad \frac{s\{b_1\}}{\sigma\{b_1\}} = \sqrt{\frac{s^2\{b_1\}}{\sigma^2\{b_1\}}}$$


Studentized statistic cont.
• And note the following

$$\frac{s^2\{b_1\}}{\sigma^2\{b_1\}} = \frac{MSE / \sum_i (X_i - \bar{X})^2}{\sigma^2 / \sum_i (X_i - \bar{X})^2} = \frac{MSE}{\sigma^2} = \frac{SSE}{\sigma^2(n-2)}$$

where we know (by the given theorem) that the distribution of the last term is a scaled χ², independent of b1 and b0

$$\frac{SSE}{\sigma^2(n-2)} \sim \frac{\chi^2(n-2)}{n-2}$$



Studentized statistic final
• But by the given definition of the t distribution we have our result

$$\frac{b_1 - \beta_1}{s\{b_1\}} \sim \frac{z}{\sqrt{\frac{\chi^2(n-2)}{n-2}}}$$

because putting everything together we can see that

$$\frac{b_1 - \beta_1}{s\{b_1\}} \sim t(n-2)$$
Confidence Intervals and Hypothesis Tests
• Now that we know the sampling distribution of the studentized b1 statistic (t with n-2 degrees of freedom) we can construct confidence intervals and hypothesis tests easily



Confidence Interval for β1
• Since the "studentized" statistic follows a t distribution we can make the following probability statement

$$P\left(t(\alpha/2; n-2) \le \frac{b_1 - \beta_1}{s\{b_1\}} \le t(1-\alpha/2; n-2)\right) = 1 - \alpha$$
[Figure: the t distribution with ν = 10: PDF, CDF, and inverse CDF (ICDF).]



Interval arriving from picking α
• Note that by symmetry

$$t(\alpha/2; n-2) = -t(1-\alpha/2; n-2)$$

• Rearranging terms and using this fact we have

$$P(b_1 - t(1-\alpha/2; n-2)\, s\{b_1\} \le \beta_1 \le b_1 + t(1-\alpha/2; n-2)\, s\{b_1\}) = 1 - \alpha$$

• And now we can use a table to look up critical values and produce confidence intervals



Using tables for Computing Intervals
• The tables in the book (Table B.2 in the appendix) give t(1-α/2; ν) where
– P{t(ν) ≤ t(1-α/2; ν)} = 1-α/2
• This provides the inverse CDF of the t distribution
• It can be arrived at computationally as well
– Matlab: tinv(1-α/2, ν)



1-α confidence limits for β1
• The 1-α confidence limits for β1 are

$$b_1 \pm t(1-\alpha/2; n-2)\, s\{b_1\}$$

• Note that this quantity can be used to calculate confidence intervals given n and α.
– Fixing α can guide the choice of sample size if a particular confidence interval width is desired
– Given a sample size, vice versa.
• Also useful for hypothesis testing
Show demo.m
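demo.m itself is not reproduced here; the following is a minimal Matlab sketch of the interval computation, using synthetic data (an assumption) and the Statistics Toolbox function tinv.

```matlab
% Minimal sketch: 1-alpha confidence interval for beta_1.
x = (1:30)'; y = 2 + 0.5*x + randn(30, 1);   % hypothetical example data
n = length(x);
p = polyfit(x, y, 1);                        % p(1) = b1 (slope), p(2) = b0
b1 = p(1); b0 = p(2);
SSE  = sum((y - (b0 + b1*x)).^2);
MSE  = SSE / (n - 2);
s_b1 = sqrt(MSE / sum((x - mean(x)).^2));    % estimated std. error of b1
alpha = 0.05;
tc = tinv(1 - alpha/2, n - 2);               % t critical value
ci = [b1 - tc*s_b1, b1 + tc*s_b1]            % 1-alpha confidence limits
```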



Tests Concerning β1
• Example 1
– Two-sided test
• H0 : β1 = 0
• Ha : β1 ≠ 0
• Test statistic

$$t^* = \frac{b_1 - 0}{s\{b_1\}}$$



Tests Concerning β1
• We have an estimate of the sampling distribution of b1 from the data.
• If the null hypothesis holds then the b1 estimate coming from the data should lie within the 95% confidence interval of the sampling distribution centered at 0 (in this case)

$$t^* = \frac{b_1 - 0}{s\{b_1\}}$$



Decision rules

$$\text{if } |t^*| \le t(1-\alpha/2; n-2), \text{ conclude } H_0$$
$$\text{if } |t^*| > t(1-\alpha/2; n-2), \text{ conclude } H_a$$

• Absolute values make the test two-sided
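A hedged Matlab sketch of this decision rule, with assumed example values for b1, s{b1}, n, and α:

```matlab
% Sketch of the two-sided decision rule with hypothetical numbers.
b1 = 0.48; s_b1 = 0.11; n = 30; alpha = 0.05;   % assumed example values
tstar = (b1 - 0) / s_b1;                        % test statistic under H0: beta_1 = 0
if abs(tstar) <= tinv(1 - alpha/2, n - 2)
    disp('conclude H0');                        % fail to reject the null
else
    disp('conclude Ha');                        % reject the null
end
```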



Intuition

[Figure: density of the studentized statistic (β est. - β)/σ est., with the 1-α confidence interval and the observed test statistic marked. The p-value is the value of α that moves the interval boundary (green line) onto the observed test statistic (blue line).]
Calculating the p-value
• The p-value, or attained significance level, is
the smallest level of significance α for which
the observed data indicate that the null
hypothesis should be rejected.
• This can be looked up using the CDF of the
test statistic.

• In Matlab
– Two-sided p-value
• 2*(1 - tcdf(abs(tstar), n-2))
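For instance, with a hypothetical observed statistic and sample size:

```matlab
% Hypothetical worked example of the two-sided p-value.
tstar = 2.31; n = 25;                     % assumed observed statistic, sample size
p = 2 * (1 - tcdf(abs(tstar), n - 2))     % two-sided p-value, here about 0.03
```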



Inferences Concerning β0
• Largely, inference procedures regarding β0 can be performed in the same way as those for β1
• Remember the point estimator b0 for β0

$$b_0 = \bar{Y} - b_1 \bar{X}$$



Sampling distribution of b0
• The sampling distribution of b0 refers to the
different values of b0 that would be obtained
with repeated sampling when the levels of the
predictor variable X are held constant from
sample to sample.
• For the normal regression model the
sampling distribution of b0 is normal



Sampling distribution of b0
• When the error variance is known

$$E(b_0) = \beta_0, \qquad \sigma^2\{b_0\} = \sigma^2\left(\frac{1}{n} + \frac{\bar{X}^2}{\sum_i (X_i - \bar{X})^2}\right)$$

• When the error variance is unknown

$$s^2\{b_0\} = MSE\left(\frac{1}{n} + \frac{\bar{X}^2}{\sum_i (X_i - \bar{X})^2}\right)$$



Confidence interval for β0
• The 1-α confidence limits for β0 are obtained in the same manner as those for β1

$$b_0 \pm t(1-\alpha/2; n-2)\, s\{b_0\}$$
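As before, a minimal Matlab sketch under the same synthetic-data assumptions as the earlier examples:

```matlab
% Minimal sketch: 1-alpha confidence interval for beta_0.
x = (1:30)'; y = 2 + 0.5*x + randn(30, 1);      % hypothetical example data
n = length(x);
p = polyfit(x, y, 1); b1 = p(1); b0 = p(2);
MSE  = sum((y - (b0 + b1*x)).^2) / (n - 2);
s_b0 = sqrt(MSE * (1/n + mean(x)^2 / sum((x - mean(x)).^2)));
alpha = 0.05;
tc = tinv(1 - alpha/2, n - 2);
ci_b0 = [b0 - tc*s_b0, b0 + tc*s_b0]            % confidence limits for beta_0
```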



Considerations on Inferences on β0 and β1
• Effects of departures from normality
– The estimators of β0 and β1 have the property of asymptotic normality – their distributions approach normality as the sample size increases (under general conditions)
• Spacing of the X levels
– The variances of b0 and b1 (for a given n and σ) depend strongly on the spacing of X



Sampling distribution of point estimator of mean response
• Let Xh be the level of X for which we would like an estimate of the mean response
– It can be, but need not be, one of the observed X's; it should lie within the scope of the model
• The mean response when X = Xh is denoted by E{Yh}
• The point estimator of E{Yh} is

$$\hat{Y}_h = b_0 + b_1 X_h$$

• We are interested in the sampling distribution of this quantity



Sampling Distribution of Ŷh
• We have

$$\hat{Y}_h = b_0 + b_1 X_h$$

• Since this quantity is itself a linear combination of the Yi's, its sampling distribution is itself normal.
• The mean of the sampling distribution is

$$E\{\hat{Y}_h\} = E\{b_0\} + E\{b_1\} X_h = \beta_0 + \beta_1 X_h$$

Biased or unbiased?



Sampling Distribution of Ŷh
• To derive the sampling distribution variance of the mean response, we first show that b1 and (1/n) Σ Yi are uncorrelated and hence, for the normal error regression model, independent
• We start with the definitions

$$\bar{Y} = \frac{1}{n}\sum_i Y_i, \qquad b_1 = \sum_i k_i Y_i, \quad k_i = \frac{X_i - \bar{X}}{\sum_j (X_j - \bar{X})^2}$$
Sampling Distribution of Ŷh
• We want to show that the mean response and the estimate b1 are uncorrelated

$$Cov(\bar{Y}, b_1) = \sigma^2\{\bar{Y}, b_1\} = 0$$

• To do this we need the following result (A.32)

$$\sigma^2\left\{\sum_{i=1}^n a_i Y_i, \sum_{i=1}^n c_i Y_i\right\} = \sum_{i=1}^n a_i c_i\, \sigma^2\{Y_i\}$$

when the Yi are independent



Sampling Distribution of Ŷh
• Using this fact we have

$$\sigma^2\left\{\sum_{i=1}^n \frac{1}{n} Y_i, \sum_{i=1}^n k_i Y_i\right\} = \sum_{i=1}^n \frac{1}{n} k_i\, \sigma^2\{Y_i\} \quad \text{(from the appendix result)}$$

$$= \sum_{i=1}^n \frac{1}{n} k_i\, \sigma^2 = \frac{\sigma^2}{n}\sum_{i=1}^n k_i = 0$$

since Σ ki = 0. So the mean of Y and b1 are uncorrelated.

Sampling Distribution of Ŷh
• This means that we can write down the variance

$$\sigma^2\{\hat{Y}_h\} = \sigma^2\{\bar{Y} + b_1(X_h - \bar{X})\}$$

(using the alternative and equivalent form of the regression function)
• But we know that the mean of Y and b1 are uncorrelated, so

$$\sigma^2\{\hat{Y}_h\} = \sigma^2\{\bar{Y}\} + \sigma^2\{b_1\}(X_h - \bar{X})^2$$



Sampling Distribution of Ŷh
• We know (from last lecture)

$$\sigma^2\{b_1\} = \frac{\sigma^2}{\sum_i (X_i - \bar{X})^2}, \qquad s^2\{b_1\} = \frac{MSE}{\sum_i (X_i - \bar{X})^2}$$

• And we can find

$$\sigma^2\{\bar{Y}\} = \frac{1}{n^2}\sum_i \sigma^2\{Y_i\} = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$



Sampling Distribution of Ŷh
• So, plugging in, we get

$$\sigma^2\{\hat{Y}_h\} = \frac{\sigma^2}{n} + \frac{\sigma^2}{\sum_i (X_i - \bar{X})^2}(X_h - \bar{X})^2$$

• Or

$$\sigma^2\{\hat{Y}_h\} = \sigma^2\left(\frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}\right)$$



Sampling Distribution of Ŷh
• Since we often won't know σ², we can, as usual, plug in our estimate s² = SSE/(n-2) to get the estimated sampling distribution variance

$$s^2\{\hat{Y}_h\} = s^2\left(\frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}\right)$$



No surprise…
• The studentized point estimator of the mean response follows a t distribution with n-2 degrees of freedom

$$\frac{\hat{Y}_h - E\{Y_h\}}{s\{\hat{Y}_h\}} \sim t(n-2)$$

• This means that we can construct confidence intervals in the same manner as before.



Confidence Intervals for E{Yh}
• The 1-α confidence limits for E{Yh} are

$$\hat{Y}_h \pm t(1-\alpha/2; n-2)\, s\{\hat{Y}_h\}$$

• From this, hypothesis tests can be constructed as usual.
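A minimal Matlab sketch of this interval, again under the same synthetic-data assumptions as the earlier examples:

```matlab
% Minimal sketch: 1-alpha confidence interval for the mean response E{Y_h}.
x = (1:30)'; y = 2 + 0.5*x + randn(30, 1);        % hypothetical example data
n = length(x);
p = polyfit(x, y, 1); b1 = p(1); b0 = p(2);
MSE = sum((y - (b0 + b1*x)).^2) / (n - 2);
Xh = 12;                                          % assumed level of interest
Yh_hat = b0 + b1*Xh;                              % point estimate of E{Y_h}
s_Yh = sqrt(MSE * (1/n + (Xh - mean(x))^2 / sum((x - mean(x)).^2)));
alpha = 0.05;
tc = tinv(1 - alpha/2, n - 2);
ci_mean = [Yh_hat - tc*s_Yh, Yh_hat + tc*s_Yh]    % CI for the mean response
```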



Comments
• The variance of the estimator for E{Yh} is smallest near the mean of X. Designing studies such that the mean of X is near Xh will improve inference precision
• When Xh is zero, the variance of the estimator for E{Yh} reduces to the variance of the estimator b0 for β0



Prediction interval for single new observation
• Essentially follows the sampling distribution arguments for E{Yh}
• If all regression parameters are known, then the 1-α prediction interval for a new observation Yh is

$$E\{Y_h\} \pm z(1-\alpha/2)\, \sigma$$



Prediction interval for single new observation
• If the regression parameters are unknown, the 1-α prediction interval for a new observation Yh is given by the following theorem

$$\hat{Y}_h \pm t(1-\alpha/2; n-2)\, s\{pred\}$$

• This is very nearly the same as prediction for a known value of X, but includes a correction for the additional variability arising because the new input location was not used in the original estimates of b1, b0, and s²



Prediction interval for single new observation
• The value of s²{pred} is given by

$$s^2\{pred\} = MSE\left(1 + \frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}\right)$$
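A final hedged Matlab sketch, extending the mean-response example above to a prediction interval:

```matlab
% Minimal sketch: 1-alpha prediction interval for a new observation at X_h.
x = (1:30)'; y = 2 + 0.5*x + randn(30, 1);        % hypothetical example data
n = length(x);
p = polyfit(x, y, 1); b1 = p(1); b0 = p(2);
MSE = sum((y - (b0 + b1*x)).^2) / (n - 2);
Xh = 12;                                          % assumed new input location
Yh_hat = b0 + b1*Xh;                              % point prediction
s_pred = sqrt(MSE * (1 + 1/n + (Xh - mean(x))^2 / sum((x - mean(x)).^2)));
alpha = 0.05;
tc = tinv(1 - alpha/2, n - 2);
pi_new = [Yh_hat - tc*s_pred, Yh_hat + tc*s_pred] % prediction interval
```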



