STAT 504 Assessment #12 Due Monday, 4/8, Before Midnight
1. An experiment analyzes imperfection rates for two processes used to fabricate silicon wafers
for computer chips. For Treatment A applied to 10 wafers, the numbers of imperfections are 8,
7, 6, 6, 3, 4, 7, 2, 3, 4. For Treatment B applied to 10 wafers, the numbers of imperfections are 9,
9, 8, 14, 8, 13, 11, 5, 7, 7. Treat the counts as independent Poisson variates having means µ𝐴
and µ𝐵.
a) Fit the Poisson regression model log(μ) = α + β x, where x = 1 for treatment A and x = 0 for
treatment B. Show that β = log µ𝐴 − log µ𝐵 . Interpret the estimate.
The model coefficients are estimated with the R code given below. The fitted model
is:
𝐿𝑜𝑔(𝜇) = 2.2083 − 0.5988 𝑥
Interpretation of Coefficient
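From the model we can write:
𝑥 = 1 → 𝐿𝑜𝑔(𝜇𝐴) = 𝛼 + 𝛽
𝑥 = 0 → 𝐿𝑜𝑔(𝜇𝐵) = 𝛼
Subtracting the second equation from the first gives
𝛽 = 𝐿𝑜𝑔(𝜇𝐴) − 𝐿𝑜𝑔(𝜇𝐵) = 𝐿𝑜𝑔(𝜇𝐴/𝜇𝐵)
The estimate 𝛽 = −0.5988 gives exp(−0.5988) ≈ 0.55, so the estimated mean number of
imperfections under treatment A is about 55% of that under treatment B.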
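As a numerical check of this identity, the following sketch fits the model with R's glm() and compares the estimates with the logs of the sample means. The object names (fit, alpha_hat, beta_hat, beta_by_hand, rate_ratio) are illustrative and are not taken from the original solution; coding x as a numeric 0/1 indicator is also an assumption made here for simplicity.

```r
# Counts from the problem statement
A <- c(8, 7, 6, 6, 3, 4, 7, 2, 3, 4)
B <- c(9, 9, 8, 14, 8, 13, 11, 5, 7, 7)
y <- c(A, B)
x <- rep(c(1, 0), each = 10)   # 1 = treatment A, 0 = treatment B

# Poisson regression with the default log link
fit <- glm(y ~ x, family = poisson)
alpha_hat <- unname(coef(fit)["(Intercept)"])
beta_hat  <- unname(coef(fit)["x"])

# With a single binary predictor the fitted means equal the sample means,
# so beta_hat should agree with log(mean(A)) - log(mean(B))
beta_by_hand <- log(mean(A)) - log(mean(B))
print(c(alpha_hat = alpha_hat, beta_hat = beta_hat, beta_by_hand = beta_by_hand))

# exp(beta_hat) estimates the ratio mu_A / mu_B
rate_ratio <- exp(beta_hat)
print(rate_ratio)
```

The two versions of the slope should agree up to the convergence tolerance of glm(), and alpha_hat should match log(mean(B)).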
b) Test H0 : µ𝐴 − µ𝐵 = 0 using either a Wald test or a likelihood ratio test from your SAS or R
output from the Poisson regression model you fitted. Interpret.
This null hypothesis is equivalent to a test of significance of the coefficient 𝛽, that is:
H0: 𝛽 = 0 (or equivalently µ𝐴 − µ𝐵 = 0)
Ha: 𝛽 ≠ 0 (or equivalently µ𝐴 − µ𝐵 ≠ 0)
Conclusion:
Since the p-value = 0.0005053 < 0.05, there is significant evidence against the null hypothesis.
Therefore, we reject the null and conclude that the means for treatments A and B are different.
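Both tests mentioned in the question can be read off a fitted glm object. The sketch below is one way to do it, with illustrative object names (coefs, wald_z, wald_p, lr_stat, lr_p) and x coded as a numeric 0/1 indicator (an assumption). The likelihood ratio p-value it produces should be close to the 0.0005053 quoted above, which suggests, though the solution does not say so, that the reported value comes from the likelihood ratio test rather than the Wald test.

```r
# Counts from the problem statement
A <- c(8, 7, 6, 6, 3, 4, 7, 2, 3, 4)
B <- c(9, 9, 8, 14, 8, 13, 11, 5, 7, 7)
y <- c(A, B)
x <- rep(c(1, 0), each = 10)   # 1 = treatment A, 0 = treatment B

fit <- glm(y ~ x, family = poisson)

# Wald test: z = estimate / standard error, two-sided normal p-value
coefs  <- summary(fit)$coefficients
wald_z <- coefs["x", "z value"]
wald_p <- coefs["x", "Pr(>|z|)"]

# Likelihood ratio test: drop in deviance from the intercept-only model,
# referred to a chi-squared distribution with 1 degree of freedom
lr_stat <- fit$null.deviance - fit$deviance
lr_p    <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

print(c(wald_z = wald_z, wald_p = wald_p, lr_stat = lr_stat, lr_p = lr_p))
```

Either p-value is far below 0.05, so the conclusion does not depend on which test is used.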
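# HW12-P1
A = c(8, 7, 6, 6, 3, 4, 7, 2, 3, 4)       # imperfection counts, treatment A
B = c(9, 9, 8, 14, 8, 13, 11, 5, 7, 7)    # imperfection counts, treatment B
y = c(A, B)                               # response: all 20 counts, A first
x = as.factor(rep(c(1, 0), each = 10))    # x = 1 for treatment A, x = 0 for treatment B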
2. The data below were reported by Laird and Olivier (1981) on the survival of patients after
heart-valve replacement surgery. Varying numbers of patients fell into the two categories of age
(Under 55, and 55+), two types of heart valve (Aortic and Mitral), and they were followed for
different lengths of time in terms of days (values under label exposure), and the last column is the
total number of deaths for the combination of the three predictors.
a) Under a saturated model we can estimate the mean death rates directly. Let λ1 be the
mean death rate for individuals (Under55, Aortic) combination. Just looking at the table
(even without running the SAS or R code), do you know the estimate of this value? When
you take the natural log of this estimate, which parameter estimate in your model do you
expect to get?
We can compute the mean death rate directly from the table by dividing the number of recorded
deaths by the number of exposure days:
𝐷𝑒𝑎𝑡ℎ 𝑟𝑎𝑡𝑒(𝑈𝑛𝑑𝑒𝑟 55, 𝐴𝑜𝑟𝑡𝑖𝑐) = 4/1259 = 0.003177 (deaths per exposure day)
For individuals (Under 55, Aortic), 𝑥1𝑖 = 𝑥2𝑖 = 𝑥3𝑖 = 0. Therefore, 𝐿𝑛(4/1259) = −5.75177
is an estimate of 𝛽0.
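The claim that the log of the observed rate equals the intercept of the saturated model can be checked directly. In the sketch below the data are those listed in the R code for this problem; the object names (valve, sat, rate_hat, log_rate_hat, beta0_hat) are illustrative, and the factor levels are set explicitly so that (Under 55, Aortic) is the baseline, which is an assumption about how the original model was coded.

```r
# Data for the four age-by-valve groups
age      <- factor(c("55-", "55-", "55+", "55+"), levels = c("55-", "55+"))
valve    <- factor(c("Aortic", "Mitral", "Aortic", "Mitral"),
                   levels = c("Aortic", "Mitral"))
exposure <- c(1259, 2082, 1417, 1647)
death    <- c(4, 1, 7, 9)

# Estimate read straight off the table for (Under 55, Aortic)
rate_hat     <- death[1] / exposure[1]
log_rate_hat <- log(rate_hat)

# Saturated Poisson model with log(exposure) as offset
sat <- glm(death ~ age * valve + offset(log(exposure)), family = poisson)
beta0_hat <- unname(coef(sat)["(Intercept)"])

print(c(rate_hat = rate_hat, log_rate_hat = log_rate_hat, beta0_hat = beta0_hat))
```

Because the saturated model reproduces every cell exactly, the intercept should agree with log_rate_hat up to the convergence tolerance of glm().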
b) Examine the Wald statistics of the saturated model output. Which predictors are
significant? Interpret the parameters of this model.
𝛽1 = 𝐿𝑜𝑔(𝜇1/𝑡1) − 𝐿𝑜𝑔(𝜇0/𝑡0) = 𝐿𝑜𝑔((𝜇1/𝑡1)/(𝜇0/𝑡0))
𝛽1: Log of the ratio of the death rate for (55+, Aortic) individuals to that for
(Under 55, Aortic) individuals.
For 𝑥1𝑖 = 0, 𝑥2𝑖 = 1, 𝑥3𝑖 = 0:
𝐿𝑜𝑔(𝜇2/𝑡2) = 𝛽0 + 𝛽2
𝛽2 = 𝐿𝑜𝑔(𝜇2/𝑡2) − 𝐿𝑜𝑔(𝜇0/𝑡0) = 𝐿𝑜𝑔((𝜇2/𝑡2)/(𝜇0/𝑡0))
𝛽2: Log of the ratio of the death rate for (Under 55, Mitral) individuals to that for
(Under 55, Aortic) individuals.
𝛽3 is the interaction parameter. With (Under 55, Aortic) individuals as the baseline, it is
the log of a ratio of two rate ratios: the 55+ versus Under 55 death-rate ratio among Mitral
patients divided by the same ratio among Aortic patients. Equivalently, it measures how much
the effect of age on the log death rate changes when the heart valve type changes from
Aortic to Mitral.
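Since the saturated model reproduces the observed rates, each parameter can be computed by hand from the table. The sketch below does this and compares the result with glm(). It assumes that x1 is the indicator for age 55+, x2 the indicator for a Mitral valve, and x3 = x1 * x2 their product, which is consistent with the interpretations above but is not spelled out in the solution; the group labels and object names are illustrative.

```r
exposure <- c(1259, 2082, 1417, 1647)
death    <- c(4, 1, 7, 9)

# Observed death rates per exposure day, labelled by group
rate <- death / exposure
names(rate) <- c("U55_Aortic", "U55_Mitral", "O55_Aortic", "O55_Mitral")

# Main-effect parameters: log rate ratios against the (Under 55, Aortic) baseline
beta1 <- log(rate["O55_Aortic"] / rate["U55_Aortic"])
beta2 <- log(rate["U55_Mitral"] / rate["U55_Aortic"])
# Interaction: log of the ratio of the two age rate ratios (Mitral over Aortic)
beta3 <- log((rate["O55_Mitral"] / rate["U55_Mitral"]) /
             (rate["O55_Aortic"] / rate["U55_Aortic"]))
by_hand <- unname(c(beta1, beta2, beta3))

# Same parameters from the saturated Poisson model with an offset
x1 <- c(0, 0, 1, 1)   # 1 = age 55+ (assumed coding)
x2 <- c(0, 1, 0, 1)   # 1 = Mitral valve (assumed coding)
x3 <- x1 * x2         # interaction indicator (assumed coding)
sat <- glm(death ~ x1 + x2 + x3 + offset(log(exposure)), family = poisson)
from_glm <- unname(coef(sat)[c("x1", "x2", "x3")])

print(rbind(by_hand, from_glm))
```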
Since deaths are recorded over different time intervals (exposure days), the death counts
need to be normalized so that they can be compared across groups.
The normalized death rate is obtained by dividing the total number of deaths reported
by the associated exposure days.
Fitting the Poisson regression to the death rate leads to the offset model:
𝐿𝑜𝑔(𝜇𝑖/𝑡𝑖) = 𝐿𝑜𝑔(𝜇𝑖) − 𝐿𝑜𝑔(𝑡𝑖) = 𝛽0 + 𝛽1 𝑥1𝑖 + 𝛽2 𝑥2𝑖 + 𝛽3 𝑥3𝑖
or equivalently,
𝐿𝑜𝑔(𝜇𝑖) = 𝐿𝑜𝑔(𝑡𝑖) + 𝛽0 + 𝛽1 𝑥1𝑖 + 𝛽2 𝑥2𝑖 + 𝛽3 𝑥3𝑖
where 𝐿𝑜𝑔(𝑡𝑖) is the offset of the model: a term whose coefficient is fixed at 1 rather than estimated.
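In R the offset can be supplied either inside the formula or through the offset argument of glm(). The sketch below fits the model both ways and then converts fitted counts back to rates. Object names (valve, m_formula, m_argument, fitted_rate) and the explicit factor levels are assumptions made for illustration.

```r
age      <- factor(c("55-", "55-", "55+", "55+"), levels = c("55-", "55+"))
valve    <- factor(c("Aortic", "Mitral", "Aortic", "Mitral"),
                   levels = c("Aortic", "Mitral"))
exposure <- c(1259, 2082, 1417, 1647)
death    <- c(4, 1, 7, 9)

# Offset written as a term in the formula
m_formula  <- glm(death ~ age * valve + offset(log(exposure)), family = poisson)
# Offset passed through the offset argument
m_argument <- glm(death ~ age * valve, family = poisson, offset = log(exposure))

# Both specifications should give the same coefficients
print(cbind(formula = coef(m_formula), argument = coef(m_argument)))

# Fitted values are expected counts mu_i; dividing by t_i gives fitted rates
fitted_rate <- unname(fitted(m_formula) / exposure)
print(fitted_rate)
```

Note that log(exposure) never appears in the coefficient table, because its coefficient is fixed at 1.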
d) What would be your criticism of this model, if any? Do you think that main effects model
would be better?
Criticism:
The interaction term 𝛽3 is hard to interpret and does not seem to carry a tangible meaning. For
ease of interpretation, it would be preferable to drop 𝛽3 if the data support doing so.
To decide whether the main effects model is adequate, we perform a likelihood ratio
(goodness-of-fit) test against the saturated model.
The results of fitting the saturated model are reported in part b, and the results of fitting
the main effects model are given in the following table:
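where −8.1747 and −6.5635 are the log-likelihood values for the main effects and saturated
models, with 3 and 4 degrees of freedom (parameters), respectively. The likelihood ratio
statistic is 2(−6.5635 − (−8.1747)) = 3.2224 on 4 − 3 = 1 degree of freedom. Since the
p-value = 0.0726 is greater than 0.05, we fail to reject the null hypothesis 𝛽3 = 0 and
conclude that the main effects model is adequate.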
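The test can be reproduced from the two reported log-likelihoods, or by refitting both models and comparing them with anova(). The sketch below does both; the object names (ll_main, ll_sat, lr_stat, p_value, main, sat) and the explicit factor levels are assumptions made for illustration.

```r
# Log-likelihoods as reported in the solution
ll_main <- -8.1747   # main effects model, 3 parameters
ll_sat  <- -6.5635   # saturated model, 4 parameters

# Likelihood ratio statistic and its chi-squared p-value
lr_stat <- 2 * (ll_sat - ll_main)
df_diff <- 4 - 3
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)
print(c(lr_stat = lr_stat, df = df_diff, p_value = p_value))

# Refit both models from the data and compare them directly
age      <- factor(c("55-", "55-", "55+", "55+"), levels = c("55-", "55+"))
valve    <- factor(c("Aortic", "Mitral", "Aortic", "Mitral"),
                   levels = c("Aortic", "Mitral"))
exposure <- c(1259, 2082, 1417, 1647)
death    <- c(4, 1, 7, 9)

main <- glm(death ~ age + valve + offset(log(exposure)), family = poisson)
sat  <- glm(death ~ age * valve + offset(log(exposure)), family = poisson)

print(c(logLik_main = as.numeric(logLik(main)), logLik_sat = as.numeric(logLik(sat))))
print(anova(main, sat, test = "Chisq"))
```

With only four observations the chi-squared approximation is rough, so the p-value is best read as a guide rather than an exact figure.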
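# HW12-P2
age = c("55-", "55-", "55+", "55+")                      # age group: under 55 / 55 and over
heartvalve = c("Aortic", "Mitral", "Aortic", "Mitral")   # type of heart valve
exposure = c(1259, 2082, 1417, 1647)                     # follow-up time in days
death = c(4, 1, 7, 9)                                    # number of deaths
# data.frame keeps exposure and death numeric (cbind would coerce them to character)
# and a different name avoids masking base::table()
valve_data = data.frame(age, heartvalve, exposure, death)
valve_data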