STA302 Week07 Full
1/48
Important
2/48
Lecture before Midterm
3/48
Week 07- Learning objectives & Outcomes
• Variable transformations.
• More on logarithmic transformation.
• Box-Cox transformation.
• Interpretation of slope after transformation.
• Chapter 4: Simultaneous Inferences
4/48
Variable Transformations
5/48
Transformations
• Why?
• Satisfy model assumptions.
• Improve predictive ability.
• Make it easier to interpret parameters.
• How?
• First fit a linear regression model to the original variables. Diagnostics
may indicate:
• Nonlinearity only: transformation on X.
• Nonlinearity, nonconstant variance, or non-normality: transformation
on Y (a transformation on X might also help).
• Box-Cox transformation.
• Fit a linear regression model after transforming one or both of the
original variables.
• The goal is to make the regression model appropriate for the transformed data.
6/48
Transformations (cont.)
7/48
Transformations on X
[Figure: prototype scatter-plot patterns suggesting transformations on X]
8/48
Transformations on Y
[Figure: prototype patterns suggesting transformations on Y. When the variance of Y is large, the variance of the transformed Y′ is small.]
9/48
Example: Transformation on X
# Xp = sqrt(X) is the transformed predictor (defined with the data below)
fit0 = lm(Y ~ X)
fit1 = lm(Y ~ Xp)
par(mfrow = c(2, 2))
plot(Y ~ X, type = "p", col = "red", main = "Before transformation of X")
plot(Y ~ Xp, type = "p", col = "red", xlab = expression(paste(sqrt(X))),
     main = "After transformation of X")
plot(fit1, 1, main = "After transformation of X")
plot(fit1, 2, main = "After transformation of X")
10/48
Example: Transformation on X (cont.)
• Diagnostics for Ŷ = −10.33 + 84.35 √X: no evidence of lack of fit or
strongly unequal error variances.
[Figure: Y vs X and Y vs √X scatter plots, with Residuals vs Fitted and Normal Q-Q plots for fit1]
11/48
Example: Transformation on X (cont.)
X = c(0.5, 0.5, 1, 1, 1.5, 1.5, 2, 2, 2.5, 2.5)
Y = c(42.5, 50.6, 68.5, 80.7, 89.0, 99.6, 105.3, 111.8, 112.3, 125.7)
Xp = sqrt(X)
13/48
Example: Transformation on Y
Annual US GNP data analysis
• US GNP data (1947-2007)
• Y: annual (adjusted) US GNP (Gross National Product) (in $Billions).
• X: years
[Figure: scatter plot of annual GNP ($Billions) versus year]
14/48
Annual US GNP data analysis (cont.)
• Fitted model M0: GNPt = −315741.23 + 162.43 Yeart + εt
• MSE = 606.4, R² = 0.9583
[Figure: fitted line over the scatter plot, residuals vs year, and Normal Q-Q plot for M0; the residuals show nonlinearity, nonconstant variance, and departures from normality]
15/48
Annual US GNP data analysis (cont.)
[Figure: log(GNP) vs t with fitted line, residuals (M1$res), and Normal Q-Q plot; the residual plot looks good and the Q-Q plot is improved]
16/48
Annual US GNP data analysis (cont.)
• Transform: √GNP, with ti = year − 1947
• Fitted model M2: √GNPi = 36.70997 + 1.1288 ti + εi
• MSE = 1.786, R² = 0.9923
[Figure: √GNP vs t with fitted line, residuals (M2$res), and Normal Q-Q plot; diagnostics look ok]
17/48
Annual US GNP data analysis (cont.)
Which transformation is best?

Y          X                MSE      R²
GNP        year             606.4    0.9583
log(GNP)   t = year − 1947  0.04279  0.9948
√GNP       t = year − 1947  1.786    0.9923

• Note the three MSEs are on different response scales, so they are not directly comparable.
• The log model also has a natural interpretation (constant growth rate): if
GNPi = GNP0 · e^(β1 ti) · e^(εi), then taking logarithms gives
log(GNPi) = log(GNP0) + β1 ti + εi
18/48
Annual US GNP data analysis (cont.)
Based on the logarithmic transformation model,
• Find confidence intervals for β0 and β1.
• Find a confidence interval for E(GNP) when time = 50, i.e. at year
1997.
• Find a prediction interval for GNP when time = 63, i.e. at year 2010.
• Prediction interval on the log scale:
log(Yh): [L, U] = (predicted log(Yh)) ± t(1 − α/2; n − 2) s{pred}
so that P(L < log(Yh) < U) = 0.95.
• Since exp(·) is monotone increasing, exponentiating the endpoints preserves coverage:
P(e^L < Yh < e^U) = 0.95
and [e^L, e^U] is a 95% prediction interval for Yh on the original scale.
19/48
More on logarithmic transformation
20/48
Why Might Logarithms Work?
y = a · e^(bX) ⇒ log(y) = log(a) + bX
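A minimal simulated sketch (not the course data; a and b are made-up values) showing that fitting lm() on the log scale recovers the parameters of an exponential relationship:

```r
# Simulated illustration: y = a * exp(b*X) with multiplicative noise,
# so log(y) = log(a) + b*X + error is a simple linear model.
set.seed(302)
a <- 2; b <- 0.5
X <- seq(0, 5, by = 0.1)
y <- a * exp(b * X) * exp(rnorm(length(X), sd = 0.05))

fit <- lm(log(y) ~ X)
coef(fit)             # intercept estimates log(a) = log(2), slope estimates b
exp(coef(fit)[[1]])   # back-transform the intercept to estimate a
```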
21/48
Example: logarithmic transformation
Data:
• A memory retention experiment in which 13 subjects were asked to
memorize a list of disconnected items. The subjects were then asked
to recall the items at various times up to a week later.
• The proportion of items (y = prop) correctly recalled at various times
(x = time, in minutes) since the list was memorized
22/48
Example: logarithmic transformation (cont.)
• Scatter plot
[Figure: Prop vs Time and Prop vs log(Time); the small Time values are bunched on the original scale and spread out after the log transformation]
23/48
Example: logarithmic transformation (cont.)
[Figure: Residuals vs Fitted and Normal Q-Q plots for Y ~ X (top row) and Y ~ log(X) (bottom row)]
24/48
A summary on variable transformation
Model use
25/48
Box-Cox transformation
26/48
Box-Cox transformation (cont.)
Y′ = Y^λ
λ = 2:    Y′ = Y²
λ = 0.5:  Y′ = √Y
λ = 0:    Y′ = loge(Y) (by definition)
λ = −0.5: Y′ = 1/√Y
λ = −1:   Y′ = 1/Y
27/48
Box-Cox transformation (cont.)
Yi^λ = β0 + β1 Xi + εi
28/48
Box-Cox transformation (cont.)
Wi = K1 (Yi^λ − 1),  λ ≠ 0
Wi = K2 loge(Yi),    λ = 0
where
K1 = 1 / (λ K2^(λ−1)),  K2 = (∏ i=1..n Yi)^(1/n)  (the geometric mean of the Yi)
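The Wi above can be used directly in a grid search: for each λ, regress W on X and record SSE; the estimate λ̂ minimizes SSE. A minimal sketch on simulated data (not the course example); MASS::boxcox(fit) performs the analogous search via the profile likelihood:

```r
# Box-Cox grid search using the standardized variable W_i (simulated data)
set.seed(1)
X <- runif(40, 1, 10)
Y <- (2 + 0.5 * X + rnorm(40, sd = 0.2))^2   # true lambda is about 0.5

K2 <- exp(mean(log(Y)))                       # geometric mean of the Y_i
sse <- function(lambda) {
  W <- if (abs(lambda) < 1e-8) K2 * log(Y)
       else (Y^lambda - 1) / (lambda * K2^(lambda - 1))
  sum(resid(lm(W ~ X))^2)
}
grid <- seq(-1, 2, by = 0.1)
lambda_hat <- grid[which.min(sapply(grid, sse))]
lambda_hat                                    # should land near 0.5
```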
29/48
Box-Cox transformation (cont.)
30/48
Example: Box-Cox transformation
GPA and ACT score
• R code to generate the plot and data are available on portal
[Figure: SSE plotted against λ for λ between 0 and 6]
31/48
Interpretation after Transformations
32/48
Interpretation of slope (β1)
33/48
Interpretation of slope (β1): Level-log model
E(Y | X) = β0 + β1 log(X)
• Interpretation: multiplying X by 2 adds β1 log(2) to
the mean of Y.
• For example
• Y = pH
• X = time after slaughter (hrs.)
• Estimated model: Ŷ = 6.98 − 0.73 log(X)
• Interpretation of b1: it is estimated that for each doubling of time
after slaughter (between 0 and 8 hours), the mean pH decreases by about 0.5
≈ 0.73 × log(2).
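A quick numeric check of the doubling interpretation (coefficients taken from the estimated model above):

```r
# Level-log model: each doubling of X changes the mean of Y by b1*log(2)
b0 <- 6.98; b1 <- -0.73
mean_pH <- function(x) b0 + b1 * log(x)
mean_pH(4) - mean_pH(2)   # change per doubling of time
b1 * log(2)               # same quantity: about -0.506
```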
34/48
Interpretation of slope (β1): Log-level model
37/48
Joint Estimation of β0 and β1
[L0, U0] = b0 ± t(1 − α/2; n − 2) s(b0),  s²(b0) = MSE (1/n + X̄²/SXX)
[L1, U1] = b1 ± t(1 − α/2; n − 2) s(b1),  s²(b1) = MSE / SXX
• What is the confidence coefficient of their joint intervals?
• P(L0 ≤ β0 ≤ U0, L1 ≤ β1 ≤ U1) = ?
• Let 1 − α = 0.95. The two intervals do not jointly provide 95% confidence
for β0 and β1: even if the inferences were independent, the joint coverage
would only be (0.95)² ≈ 0.9025.
38/48
Joint Estimation of β0 and β1 (cont.)
• Let A0 denote the event that the first confidence interval does not
cover β0. Then P(A0) = α.
• Let A1 denote the event that the second confidence interval does not
cover β1. Then P(A1) = α.
• A0ᶜ ∩ A1ᶜ is the event that both confidence intervals cover their
parameters, β0 and β1.
P(A0ᶜ ∩ A1ᶜ) = ?
39/48
Bonferroni inequality: P(A0ᶜ ∩ A1ᶜ) ≥ 1 − 2α
Proof (Venn diagram / inclusion-exclusion):
P(A0ᶜ ∩ A1ᶜ) = 1 − P(A0 ∪ A1)
= 1 − [P(A0) + P(A1) − P(A0 ∩ A1)]   (A0 ∩ A1 is counted twice in P(A0) + P(A1))
= 1 − α − α + P(A0 ∩ A1)
≥ 1 − 2α.   QED
40/48
Joint Estimation of β0 and β1 (cont.)
• With α = 5% per interval, P(A0 ∪ A1) ≤ 5% × 2 = 10%, so the joint
coverage is only guaranteed to be ≥ 90%.
• To ensure P(A0ᶜ ∩ A1ᶜ) ≥ 1 − α, run each interval at level 1 − α/2,
i.e. find a (1 − α/2) C.I. for β0 and β1 respectively:
bi ± B s{bi},  B = t(1 − α/4; n − 2)
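A small simulation (made-up regression model, not course data) checking that Bonferroni-adjusted intervals, obtained from confint() at level 1 − α/2, achieve at least 95% joint coverage:

```r
# Joint coverage of Bonferroni intervals for (beta0, beta1)
set.seed(42)
beta0 <- 1; beta1 <- 2; n <- 30; alpha <- 0.05
x <- seq(1, 10, length.out = n)
covered <- replicate(2000, {
  y <- beta0 + beta1 * x + rnorm(n)
  ci <- confint(lm(y ~ x), level = 1 - alpha / 2)  # alpha/2 per interval
  ci[1, 1] <= beta0 && beta0 <= ci[1, 2] &&
    ci[2, 1] <= beta1 && beta1 <= ci[2, 2]
})
mean(covered)   # joint coverage; should be at least 1 - alpha = 0.95
```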
41/48
Example: Joint Estimation of β0 and β1
toluca=read.table(
"/Users/Wei/TA/Teaching/0-STA302-2016F/Week07-Oct24/toluca.txt",
col.names = c("lotsize", "workhrs"))
# plot(toluca$lotsize,toluca$workhrs)
modt = lm(lotsize~workhrs,data=toluca)
confint(modt)
## 2.5 % 97.5 %
## (Intercept) -17.1880966 13.4715943
## workhrs 0.1838466 0.2763702
# Bonferroni adjustment: level = 1 - 0.05/2 = 0.975 to ensure
# P(L0 <= beta0 <= U0, L1 <= beta1 <= U1) >= 0.95
confint(modt, level = 0.975)
##                    1.25 %    98.75 %
## (Intercept) -19.6277718 15.9112696
## workhrs       0.1764842  0.2837326
42/48
Joint Estimation of β0 and β1 (cont.)
• The Bonferroni inequality only guarantees P(A0ᶜ ∩ A1ᶜ) ≥ 1 − 2α; it is
conservative, and when α is large the lower bound 1 − 2α becomes too
small to be useful.
43/48
Simultaneous Estimation of mean response
• Working-Hotelling procedure
• Bonferroni procedure
44/48
Simultaneous Estimation of mean response
Working-Hotelling procedure:
• Based on the confidence band for the regression line (Chap. 2.6).
• The confidence band contains the entire regression line, so it contains
the mean responses at all X levels.
• The simultaneous confidence limits for g mean responses E{Yh}:
Ŷh ± W s{Ŷh},  W² = 2 F(1 − α; 2, n − 2)
where
Ŷh = b0 + b1 Xh,  s²{Ŷh} = MSE [1/n + (Xh − X̄)²/SXX]
Bonferroni procedure:
• The Bonferroni confidence limits for E{Yh} at g levels Xh with 1 − α
family confidence coefficient:
Ŷh ± B s{Ŷh},  B = t(1 − α/(2g); n − 2)
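A sketch comparing the two procedures on simulated data (not the Toluca example); in practice one uses whichever multiplier, W or B, is smaller:

```r
# Working-Hotelling vs Bonferroni simultaneous CIs for g mean responses
set.seed(7)
n <- 25; alpha <- 0.05
x <- runif(n, 20, 120)
y <- 62 + 3.6 * x + rnorm(n, sd = 50)
fit <- lm(y ~ x)

xh <- c(30, 65, 100); g <- length(xh)
pr <- predict(fit, newdata = data.frame(x = xh), se.fit = TRUE)

W <- sqrt(2 * qf(1 - alpha, 2, n - 2))   # Working-Hotelling multiplier
B <- qt(1 - alpha / (2 * g), n - 2)      # Bonferroni multiplier
wh  <- cbind(pr$fit - W * pr$se.fit, pr$fit + W * pr$se.fit)
bon <- cbind(pr$fit - B * pr$se.fit, pr$fit + B * pr$se.fit)
c(W = W, B = B)                          # here B is slightly smaller than W
```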
45/48
Example: Simultaneous Estimation of mean response
• Toluca data example
46/48
Simultaneous Prediction Intervals for New observation
Ŷh ± S s{pred},  S² = g F(1 − α; g, n − 2)
• s²{pred} = MSE [1 + 1/n + (Xh − X̄)²/SXX]
• Reference: http://rstudio-pubs-static.s3.amazonaws.com/5218_61195adcdb7441f7b08af3dba795354f.html
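A sketch of the simultaneous prediction intervals above, on simulated data (not course data); s{pred} is rebuilt from predict()'s se.fit and residual.scale components:

```r
# Simultaneous prediction intervals for g new observations,
# with multiplier S^2 = g * F(1 - alpha; g, n - 2) from the slide
set.seed(11)
n <- 25; alpha <- 0.05
x <- runif(n, 20, 120)
y <- 62 + 3.6 * x + rnorm(n, sd = 50)
fit <- lm(y ~ x)

xh <- c(40, 90); g <- length(xh)
S <- sqrt(g * qf(1 - alpha, g, n - 2))
pr <- predict(fit, newdata = data.frame(x = xh), se.fit = TRUE)
s_pred <- sqrt(pr$se.fit^2 + pr$residual.scale^2)   # MSE*(1 + 1/n + (xh - xbar)^2/Sxx)
cbind(lwr = pr$fit - S * s_pred, upr = pr$fit + S * s_pred)
```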
47/48
Practice problems and upcoming topics
48/48