Solved Problems - Survival
Survival Analysis R
The following are some hypothetical data on two groups, smokers and non-smokers, in a study
that investigated survival (days) following a root canal.
Columns: group, days (X, time on study), status (C, status at last follow-up)
group days status
smoker 4 alive
smoker 7 dead
smoker 8 alive
nonsmoker 29 alive
smoker 29 dead
smoker 31 alive
nonsmoker 40 dead
smoker 65 dead
nonsmoker 69 dead
nonsmoker 78 alive
nonsmoker 79 alive
nonsmoker 106 dead
smoker 107 alive
nonsmoker 129 dead
smoker 130 alive
smoker 140 alive
smoker 142 alive
smoker 149 dead
smoker 158 alive
smoker 160 dead
nonsmoker 161 dead
smoker 162 alive
smoker 187 dead
smoker 188 alive
nonsmoker 197 dead
nonsmoker 204 alive
nonsmoker 208 alive
smoker 221 dead
nonsmoker 228 dead
nonsmoker 231 alive
sol_survival.docx Page 1 of 8
BIOSTATS 640 Spring 2022 Homework #13 Unit 8. Survival Analysis R
#1.
Create a data set of the observations in the table on page 1. Tips: (1) create a 0/1 variable called group
(1 = smoker, 0 = nonsmoker); and (2) create a 0/1 variable called status (1 = dead, 0 = alive).
Solution:
I created an Excel data set named hw_survival.xlsx and then imported it into R.
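As an alternative to the Excel import, the same data set could be typed directly into R. The sketch below is one possible way to do this, not the approach used for this solution. The object names raw and dat are chosen for illustration, and the column name days is an assumption (the exercise itself only names group and status).

```r
# Enter the table from page 1 as text; 'raw' is an illustrative name.
raw <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
group days status
smoker 4 alive
smoker 7 dead
smoker 8 alive
nonsmoker 29 alive
smoker 29 dead
smoker 31 alive
nonsmoker 40 dead
smoker 65 dead
nonsmoker 69 dead
nonsmoker 78 alive
nonsmoker 79 alive
nonsmoker 106 dead
smoker 107 alive
nonsmoker 129 dead
smoker 130 alive
smoker 140 alive
smoker 142 alive
smoker 149 dead
smoker 158 alive
smoker 160 dead
nonsmoker 161 dead
smoker 162 alive
smoker 187 dead
smoker 188 alive
nonsmoker 197 dead
nonsmoker 204 alive
nonsmoker 208 alive
smoker 221 dead
nonsmoker 228 dead
nonsmoker 231 alive
")

# Recode to the 0/1 variables suggested in the tips.
dat <- data.frame(
  group  = as.integer(raw$group == "smoker"),  # 1 = smoker, 0 = nonsmoker
  days   = raw$days,                           # time on study, in days
  status = as.integer(raw$status == "dead")    # 1 = dead, 0 = alive (censored)
)

# Cross-tabulate group by status as a quick check of the recoding.
table(group = dat$group, status = dat$status)
```

The cross-tabulation is a quick way to confirm that the recoding gives 17 smokers and 13 nonsmokers, with 7 deaths in each group.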
#2.
Obtain the Kaplan-Meier estimates of survival, separately for smokers and non-smokers.
library(survival)
By hand: Smokers
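ID  x    c  t     At risk at t-  Surviving t  Conditional surviving  Carry forward
-   -    -  0     17             17           17/17                  17
1   4    0  Drop  -              -            -                      16
2   7    1  7     16             15           15/16                  15
3   8    0  Drop  -              -            -                      14
4   29   1  29    14             13           13/14                  13
5   31   0  Drop  -              -            -                      12
6   65   1  65    12             11           11/12                  11
7   107  0  Drop  -              -            -                      10
8   130  0  Drop  -              -            -                      9
9   140  0  Drop  -              -            -                      8
10  142  0  Drop  -              -            -                      7
11  149  1  149   7              6            6/7                    6
12  158  0  Drop  -              -            -                      5
13  160  1  160   5              4            4/5                    4
14  162  0  Drop  -              -            -                      3
15  187  1  187   3              2            2/3                    2
16  188  0  Drop  -              -            -                      1
17  221  1  221   1              0            0/1                    0
Columns: t = an actual time of death (censored observations are marked Drop); At risk at t- = # at risk the instant before t; Surviving t = # surviving t; Conditional surviving = conditional proportion surviving t; Carry forward = # at risk to carry forward. The first row is the starting point: always start at time "0". A "-" marks a cell that does not apply.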
Key: ID - Subject Identifier, X – Time on Study, C – Censoring Indicator (C=1 if Event of Death, C=0 if Censored)
By hand: NON-Smokers:
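ID  x    c  t     At risk at t-  Surviving t  Conditional surviving  Carry forward
-   -    -  0     13             13           13/13                  13
1   29   0  Drop  -              -            -                      12
2   40   1  40    12             11           11/12                  11
3   69   1  69    11             10           10/11                  10
4   78   0  Drop  -              -            -                      9
5   79   0  Drop  -              -            -                      8
6   106  1  106   8              7            7/8                    7
7   129  1  129   7              6            6/7                    6
8   161  1  161   6              5            5/6                    5
9   197  1  197   5              4            4/5                    4
10  204  0  Drop  -              -            -                      3
11  208  0  Drop  -              -            -                      2
12  228  1  228   2              1            1/2                    1
13  231  0  Drop  -              -            -                      0
Columns: t = an actual time of death (censored observations are marked Drop); At risk at t- = # at risk the instant before t; Surviving t = # surviving t; Conditional surviving = conditional proportion surviving t; Carry forward = # at risk to carry forward. The first row is the starting point: always start at time "0". A "-" marks a cell that does not apply.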
Key: ID - Subject Identifier, X – Time on Study, C – Censoring Indicator (C=1 if Event of Death, C=0 if Censored)
R
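# survfit() in package {survival}
library(survival)
# Assumes dat is the imported data set with columns days, status (1 = dead) and group (1 = smoker)
surv.object <- survival::Surv(time = dat$days, event = dat$status)
q2fit <- survival::survfit(surv.object ~ group, data = dat)
summary(q2fit)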
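As a cross-check on the by-hand tables, the product-limit arithmetic can be sketched in base R. This is a minimal illustration, not part of the original solution: the helper km_by_hand() and the vector names are hypothetical, and the data are retyped from the table on page 1 (group: 1 = smoker; status: 1 = dead).

```r
# Data retyped from the table on page 1, in the order listed there.
days   <- c(4, 7, 8, 29, 29, 31, 40, 65, 69, 78, 79, 106, 107, 129, 130,
            140, 142, 149, 158, 160, 161, 162, 187, 188, 197, 204, 208,
            221, 228, 231)
group  <- c(1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1,
            1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0)
status <- c(0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0,
            0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0)

# Hypothetical helper: Kaplan-Meier estimate for a single group.
km_by_hand <- function(time, event) {
  death_times <- sort(unique(time[event == 1]))
  # Number still at risk the instant before each death time.
  at_risk <- sapply(death_times, function(tt) sum(time >= tt))
  # Number of deaths at each death time.
  deaths <- sapply(death_times, function(tt) sum(time == tt & event == 1))
  # Conditional proportion surviving each death time.
  cond_surv <- (at_risk - deaths) / at_risk
  data.frame(t = death_times, at_risk = at_risk, deaths = deaths,
             cond_surv = cond_surv,
             km = cumprod(cond_surv))  # running product = KM estimate
}

smokers    <- km_by_hand(days[group == 1], status[group == 1])
nonsmokers <- km_by_hand(days[group == 0], status[group == 0])
print(smokers)
print(nonsmokers)
```

The at_risk and cond_surv columns should reproduce the "at risk at t-" and conditional columns of the by-hand tables; the km column is their running product.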
#3.
By hand, perform a log rank test of the null hypothesis of equal survival curves for smokers and non-smokers.
Worksheet.
t O1t n1t dt (Nt - dt) n2t Nt
7 1 16 1 28 13 29
29 1 14 1 26 13 27
40 0 12 1 23 12 24
65 1 12 1 22 11 23
69 0 11 1 21 11 22
106 0 11 1 18 8 19
129 0 10 1 16 7 17
149 1 7 1 12 6 13
160 1 5 1 10 6 11
161 0 4 1 9 6 10
187 1 3 1 7 5 8
197 0 1 1 5 5 6
221 1 1 1 2 2 3
228 0 0 1 1 2 2
Key: O1t = 1 if death in smoker, 0 if death in nonsmoker; n1t = # at risk among smokers; dt = # deaths;
(Nt - dt) = # surviving; n2t = # at risk among nonsmokers; Nt = total # at risk
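Worksheet, continued.
At each death time t, the expected number of deaths among smokers and its variance under the null hypothesis are

$$E[O_{1t}] = n_{1t}\,\frac{d_t}{N_t}, \qquad V[O_{1t}] = E[O_{1t}]\left[\frac{n_{2t}\,(N_t - d_t)}{N_t\,(N_t - 1)}\right]$$

t O1t E[O1t] V[O1t]
7 1 0.5517 0.2473
29 1 0.5185 0.2497
40 0 0.5000 0.2500
65 1 0.5217 0.2495
69 0 0.5000 0.2500
106 0 0.5789 0.2438
129 0 0.5882 0.2422
149 1 0.5385 0.2485
160 1 0.4545 0.2479
161 0 0.4000 0.2400
187 1 0.3750 0.2344
197 0 0.1667 0.1389
221 1 0.3333 0.2222
228 0 0 0
Totals 7 6.0272 3.0644

Summing over the death times, the log rank statistic is

$$\chi^2 = \frac{\left(\sum_t O_{1t} - \sum_t E[O_{1t}]\right)^2}{\sum_t V[O_{1t}]} = \frac{(7 - 6.02717)^2}{3.0644} = 0.3088$$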
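The worksheet arithmetic can also be checked with a short base R sketch. It is an illustration under the same definitions as the worksheet key, not part of the original solution; the object names (ws, chisq, p_value) are hypothetical and the data are retyped from the table on page 1.

```r
# Data retyped from the table on page 1 (group: 1 = smoker; status: 1 = dead).
days   <- c(4, 7, 8, 29, 29, 31, 40, 65, 69, 78, 79, 106, 107, 129, 130,
            140, 142, 149, 158, 160, 161, 162, 187, 188, 197, 204, 208,
            221, 228, 231)
group  <- c(1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1,
            1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0)
status <- c(0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0,
            0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0)

death_times <- sort(unique(days[status == 1]))

# One worksheet row per death time.
ws <- do.call(rbind, lapply(death_times, function(tt) {
  n1 <- sum(days >= tt & group == 1)                 # smokers at risk
  n2 <- sum(days >= tt & group == 0)                 # nonsmokers at risk
  d  <- sum(days == tt & status == 1)                # deaths at tt
  o1 <- sum(days == tt & status == 1 & group == 1)   # smoker deaths at tt
  N  <- n1 + n2
  e1 <- n1 * d / N                                   # E[O1t]
  v1 <- if (N > 1) e1 * n2 * (N - d) / (N * (N - 1)) else 0  # V[O1t]
  data.frame(t = tt, O1 = o1, n1 = n1, d = d, n2 = n2, N = N, E1 = e1, V1 = v1)
}))

# Log rank statistic and its p-value on 1 degree of freedom.
chisq   <- (sum(ws$O1) - sum(ws$E1))^2 / sum(ws$V1)
p_value <- pchisq(chisq, df = 1, lower.tail = FALSE)

print(round(ws, 4))
print(c(chisq = chisq, p_value = p_value))
```

The column totals of ws can be compared with the Totals row of the worksheet.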
#4.
Using any software you like, reproduce the log rank test that you did by hand in exercise #3.
## Call:
## survdiff(formula = surv.object ~ group, data = dat)
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## group=0 13 7 7.97 0.119 0.309
## group=1 17 7 6.03 0.157 0.309
##
## Chisq= 0.3 on 1 degrees of freedom, p= 0.578
Interpretation. The answers match! And so the conclusion is the same: do not reject the null hypothesis of equality of
survival distributions (chi square = 0.31 on 1 df, p-value = .58).
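For reference, output of this form comes from survdiff() in the survival package. The sketch below is self-contained and therefore rebuilds the data rather than using the imported Excel file; the data frame dat and its column names are assumptions based on the variables described in exercise #1, and the formula uses Surv(days, status) directly instead of a pre-built surv.object.

```r
library(survival)

# Data retyped from the table on page 1 (group: 1 = smoker; status: 1 = dead).
dat <- data.frame(
  days   = c(4, 7, 8, 29, 29, 31, 40, 65, 69, 78, 79, 106, 107, 129, 130,
             140, 142, 149, 158, 160, 161, 162, 187, 188, 197, 204, 208,
             221, 228, 231),
  group  = c(1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1,
             1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0),
  status = c(0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0,
             0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0)
)

# Log rank test of equal survival curves for smokers and nonsmokers.
lr <- survdiff(Surv(days, status) ~ group, data = dat)
print(lr)

# The chi-square statistic is stored in the fit; the p-value is computed on 1 df.
p_value <- pchisq(lr$chisq, df = 1, lower.tail = FALSE)
print(c(chisq = lr$chisq, p_value = p_value))
```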
#5.
Write an expression for a Cox Proportional Hazards Model that could be explored to investigate the
association of survival time following root canal with smoking status. Define all terms.
Solution:
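A Cox PH model for the hazard of death following root canal and its association with smoking status is
$$h(t; Z) = h_0(t)\,\exp(\beta Z)$$
h(t; Z) = instantaneous hazard of death at time “t” given survival to “t-” for a person with
covariate Z
h_0(t) = baseline hazard at time “t”, that is, the hazard for a nonsmoker (Z = 0)
β = log hazard ratio comparing smokers with nonsmokers, so that exp(β) is the hazard ratio
Z = group = indicator of smoking status with Z=1 for smokers, 0 for nonsmokers.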
#6.
What assumptions must hold in order for this model to be valid?
Solution:
(1) Model: The hazard of death following root canal and its association with smoking status is correctly described by the Cox PH model
h(t; Z) = h_0(t) exp(βZ), where h(t; Z) = instantaneous hazard of death at time “t” given survival to “t-” for a person with
covariate Z and h_0(t) is the baseline hazard (Z = 0).
(2) Proportional Hazards: The hazard of death for smokers is a constant multiple (called the hazard
ratio) of the hazard of death for non-smokers at all times.
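One common way to examine the proportional hazards assumption, though not the only one, is the test based on scaled Schoenfeld residuals in cox.zph() from the survival package. This check is not part of the original solution. The sketch rebuilds the data from the table on page 1, and the data frame name dat and its column names are assumptions.

```r
library(survival)

# Data retyped from the table on page 1 (group: 1 = smoker; status: 1 = dead).
dat <- data.frame(
  days   = c(4, 7, 8, 29, 29, 31, 40, 65, 69, 78, 79, 106, 107, 129, 130,
             140, 142, 149, 158, 160, 161, 162, 187, 188, 197, 204, 208,
             221, 228, 231),
  group  = c(1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1,
             1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0),
  status = c(0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0,
             0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0)
)

fit <- coxph(Surv(days, status) ~ group, data = dat)

# Test of proportional hazards based on scaled Schoenfeld residuals.
zph <- cox.zph(fit)
print(zph)
```

With only 14 events this test has little power, so a large p-value is weak evidence in favor of proportional hazards rather than confirmation of it.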
#7.
Using any software you like, fit the model you stated in exercise #5. Report your output and provide
annotations that explain the output.
## Call:
## coxph(formula = surv.object ~ group, data = dat)
##
## coef exp(coef) se(coef) z p
## group 0.317 1.373 0.573 0.55 0.58
##
## Likelihood ratio test=0.31 on 1 df, p=0.579
## n= 30, number of events= 14
Interpretation. In this sample, the estimated hazard of death following root canal is 37% greater for smokers than for
nonsmokers, but the difference is not statistically significant (p = .58; HR = 1.37 with 95% CI limits 0.45 to 4.22).
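The hazard ratio and its confidence limits quoted in the interpretation can be obtained from a coxph() fit. The sketch below is self-contained and therefore rebuilds the data instead of using the imported file; the data frame name dat and its column names are assumptions, and the last lines show the Wald interval arithmetic, exp(coef ± 1.96 × se).

```r
library(survival)

# Data retyped from the table on page 1 (group: 1 = smoker; status: 1 = dead).
dat <- data.frame(
  days   = c(4, 7, 8, 29, 29, 31, 40, 65, 69, 78, 79, 106, 107, 129, 130,
             140, 142, 149, 158, 160, 161, 162, 187, 188, 197, 204, 208,
             221, 228, 231),
  group  = c(1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1,
             1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0),
  status = c(0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0,
             0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0)
)

fit <- coxph(Surv(days, status) ~ group, data = dat)
print(fit)

# Hazard ratio (smokers vs nonsmokers) with 95% confidence limits.
ci <- summary(fit)$conf.int
print(ci)

# The same limits from the Wald formula exp(coef +/- 1.96 * se).
b  <- unname(coef(fit))
se <- sqrt(diag(vcov(fit)))[[1]]
hr_wald <- exp(c(HR = b, lower = b - qnorm(0.975) * se, upper = b + qnorm(0.975) * se))
print(round(hr_wald, 2))
```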
#8.
Compare the fit of the model you obtained for exercise #7 to the results of the log-rank test that you got for
exercises #3 and #4.
Solution:
It’s a match!
A Cox PH model for the hazard of event with one 0/1 predictor is equivalent to the log rank test for the comparison of
two groups.
Log Rank Test Chi Square = 0.3088 on df=1 has p-value = .5784
Cox PH Model Score Test for significance of 0/1 GROUP = 0.3088 on df=1 has p-value = .5784
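The agreement between the two statistics can be checked numerically. The sketch below is a minimal illustration, assuming a data frame dat rebuilt from the table on page 1 (the name and column names are assumptions); it compares the score test stored in the coxph() summary with the chi-square from survdiff().

```r
library(survival)

# Data retyped from the table on page 1 (group: 1 = smoker; status: 1 = dead).
dat <- data.frame(
  days   = c(4, 7, 8, 29, 29, 31, 40, 65, 69, 78, 79, 106, 107, 129, 130,
             140, 142, 149, 158, 160, 161, 162, 187, 188, 197, 204, 208,
             221, 228, 231),
  group  = c(1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1,
             1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0),
  status = c(0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0,
             0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0)
)

# Score test from the Cox model (named vector: test, df, pvalue).
fit   <- coxph(Surv(days, status) ~ group, data = dat)
score <- summary(fit)$sctest

# Log rank test for the same two-group comparison.
lr <- survdiff(Surv(days, status) ~ group, data = dat)

print(round(c(cox_score = score[["test"]], log_rank = lr$chisq,
              p_value = score[["pvalue"]]), 4))
```

No two deaths in these data occur at the same time, so the choice of tie-handling method in coxph() does not affect the comparison.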