Reliability Theory and Survival Analysis Final
AYUSHI JAIN/19STM21/GL6382
27 May 2021
The Cox proportional hazards model is

$$h(t) = h_0(t)\,\exp(b_1 x_1 + b_2 x_2 + \cdots + b_p x_p)$$

where,
->t represents the survival time
->h(t) is the hazard function determined by a set of p covariates (x1,x2,...,xp)
->the coefficients (b1,b2,...,bp) measure the impact (i.e., the effect size) of the covariates.
->the term h0 is called the baseline hazard. It corresponds to the value of the hazard if all
the xi are equal to zero (the quantity exp(0) equals 1). The 't' in h(t) reminds us that the
hazard may vary over time.
library("survival")
args(coxph)
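As a minimal sketch of how `coxph` is typically called, the example below fits a Cox model and converts the coefficients to hazard ratios. It uses the `lung` data shipped with the survival package purely for illustration; that dataset and the chosen covariates are assumptions of this sketch and are not part of this report's analysis.

```r
library(survival)

# Illustrative fit on the `lung` data from the survival package
# (assumed example data, not the data analysed in this report)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

b  <- coef(fit)   # estimated coefficients b1, b2
hr <- exp(b)      # hazard ratios exp(b_i), one per covariate
print(cbind(coef = b, hazard_ratio = hr))
```

A hazard ratio above 1 means the covariate increases the hazard, and a ratio below 1 means it decreases it.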
## $par
## [1] 0.03699251
##
## $value
## [1] 85.94567
##
## $counts
## function gradient
## 12 12
##
## $convergence
## [1] 0
##
## $message
## [1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
##
## $hessian
## [,1]
## [1,] 14636.52
## rate
## MLE 0.036992509
## se 0.008265727
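Output of this shape can be produced by minimising a censored exponential negative log-likelihood with `optim`. The sketch below is one way to do that, under stated assumptions: the data are simulated (hypothetical), and the names `nlle`, `t_true` and `c_time` are illustrative rather than taken from the report. It also compares the numerical estimate with the closed-form MLE, the number of events divided by the total follow-up time.

```r
set.seed(1)

# Hypothetical data: exponential lifetimes with independent exponential censoring
n      <- 200
t_true <- rexp(n, rate = 0.04)
c_time <- rexp(n, rate = 0.01)
y      <- pmin(t_true, c_time)
censor <- as.numeric(t_true <= c_time)   # 1 = event observed, 0 = censored

# Negative log-likelihood for a right-censored exponential sample
nlle <- function(theta, y, censor) {
  ll <- censor * dexp(y, rate = theta, log = TRUE) +
    (1 - censor) * pexp(y, rate = theta, lower.tail = FALSE, log.p = TRUE)
  -sum(ll)
}

fit <- optim(par = 0.1, fn = nlle, y = y, censor = censor,
             method = "L-BFGS-B", lower = 1e-8, hessian = TRUE)

rate_optim  <- fit$par
rate_closed <- sum(censor) / sum(y)          # closed-form MLE: events / total time
se          <- sqrt(1 / fit$hessian[1, 1])   # from the inverse observed information
rbind(MLE = rate_optim, se = se, closed_form = rate_closed)
```

The standard error follows the same rule as in the output above: the inverse of the Hessian of the negative log-likelihood estimates the variance of the MLE.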
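$$H(t) = (\lambda t)^{\alpha}$$

$$\Rightarrow S(t) = e^{-H(t)} = e^{-(\lambda t)^{\alpha}}$$

Here $\alpha$ is termed the shape parameter and $\lambda$ the rate parameter of the Weibull
distribution. It may also be noted that in R, $scale = \frac{1}{rate}$.
Setting $S(t) = 0.5$ gives the median survival time:

$$t_{0.5} = \frac{(\log 2)^{1/\alpha}}{\lambda}$$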
• In R, the Weibull distribution is defined in terms of scale, not rate.
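The relationship between the rate and R's scale parameter can be checked directly. The sketch below uses hypothetical values for the shape and rate and compares the survival function written by hand with `pweibull`, and the median formula with `qweibull`.

```r
alpha  <- 1.5    # shape (hypothetical value)
lambda <- 0.05   # rate  (hypothetical value)
t      <- c(1, 5, 10, 20, 40)

# S(t) = exp(-(lambda * t)^alpha), written out by hand
S_formula <- exp(-(lambda * t)^alpha)

# The same survival function through R's scale parameterisation (scale = 1 / rate)
S_R <- pweibull(t, shape = alpha, scale = 1 / lambda, lower.tail = FALSE)
all.equal(S_formula, S_R)

# Median: (log 2)^(1/alpha) / lambda, compared with R's quantile function
t_med <- log(2)^(1 / alpha) / lambda
all.equal(t_med, qweibull(0.5, shape = alpha, scale = 1 / lambda))
```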
## $par
## [1] 1.290646 19.153511
##
## $value
## [1] 114.9194
##
## $counts
## function gradient
## 16 16
##
## $convergence
## [1] 0
##
## $message
## [1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
##
## $hessian
## [,1] [,2]
## [1,] 31.6382541 -0.6391224
## [2,] -0.6391224 0.1362192
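# MLEs and standard errors; the inverse Hessian of the negative
# log-likelihood estimates the covariance matrix of the MLEs
MLE<-M2$par
se<-sqrt(diag(solve(M2$hessian)))
out<-rbind(MLE,se)
colnames(out)<-c("shape","scale")
out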
## shape scale
## MLE 1.2906464 19.153511
## se 0.1868602 2.847762
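The estimates and standard errors above can be turned into approximate 95% confidence intervals. The sketch below assumes Wald-type intervals on the natural parameter scale, which is a choice made here for illustration and not something the report itself computes; the numbers are copied from the printed output.

```r
# Estimates and standard errors copied from the output above
est <- c(shape = 1.2906464, scale = 19.153511)
se  <- c(shape = 0.1868602, scale = 2.847762)

z  <- qnorm(0.975)                                 # approx. 1.96
ci <- cbind(lower = est - z * se, upper = est + z * se)
ci

# Median survival time implied by the fitted Weibull (scale parameterisation)
median_hat <- est["scale"] * log(2)^(1 / est["shape"])
median_hat
```

With these values the interval for the shape covers 1, so under this approximation the fit does not clearly rule out a constant hazard.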
Gamma distribution
The gamma distribution is another distribution with shape and rate parameters; it can
alternatively be parameterized by shape and scale. Its pdf is:

$$f(t) = \frac{\lambda^{k} t^{k-1} \exp(-\lambda t)}{\Gamma(k)}$$

where $k$ is the shape parameter and $\lambda$ is the rate parameter. Its mean and
variance have closed forms:

$$Mean = E(T) = \frac{k}{\lambda}$$

and the variance is

$$Variance = Var(T) = \frac{k}{\lambda^{2}}$$

In terms of the scale parameter $\theta = 1/\lambda$, these become

$$E(T) = k\theta$$

and

$$Var(T) = k\theta^{2}$$

For $shape = 1$, the gamma reduces to the exponential distribution. Consequently, the hazard
function is constant for $k = 1$, decreasing for $k < 1$, and increasing for $k > 1$.
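The three hazard shapes can be verified numerically from the definition h(t) = f(t)/S(t). The sketch below uses hypothetical shape values with the rate fixed at 1; the helper name `gamma_hazard` is illustrative.

```r
# Hazard of a gamma distribution: h(t) = f(t) / S(t)
gamma_hazard <- function(t, shape, rate) {
  dgamma(t, shape = shape, rate = rate) /
    pgamma(t, shape = shape, rate = rate, lower.tail = FALSE)
}

t <- seq(0.5, 10, by = 0.5)
h_dec   <- gamma_hazard(t, shape = 0.5, rate = 1)   # k < 1
h_const <- gamma_hazard(t, shape = 1,   rate = 1)   # k = 1 (exponential)
h_inc   <- gamma_hazard(t, shape = 2,   rate = 1)   # k > 1

c(decreasing = all(diff(h_dec) < 0),
  constant   = isTRUE(all.equal(h_const, rep(1, length(t)))),
  increasing = all(diff(h_inc) > 0))
```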
Fitting of gamma distribution
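# Negative log-likelihood for right-censored gamma data
# theta = c(shape, rate); censor = 1 for an observed event, 0 for a censored time
nllg<-function(theta,y,censor){
  # observed events contribute the log-density, censored times the log-survival
  ll<-censor*dgamma(y,shape=theta[1],rate=theta[2],log=TRUE)+
    (1-censor)*pgamma(y,shape=theta[1],rate=theta[2],log.p=TRUE,lower.tail=FALSE)
  ll<-sum(ll)
  return(-ll)
}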
dump("nllg",file="nllg.txt")
source("nllg.txt")
## shape rate
## MLE 0.9645338 0.02173167
## se 0.4355252 0.01492654
## $par
## [1] 3.301151 1.425359
##
## $value
## [1] 33.52954
##
## $counts
## function gradient
## 14 14
##
## $convergence
## [1] 0
##
## $message
## [1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
##
## $hessian
## [,1] [,2]
## [1,] 4.477081 -1.143580
## [2,] -1.143580 6.494733
# MLEs and standard errors from the inverse Hessian of the negative log-likelihood
MLE<-M3$par
se<-sqrt(diag(solve(M3$hessian)))
result<-rbind(MLE,se)
colnames(result)<-c("mu","sigma")
result
## mu sigma
## MLE 3.301151 1.4253594
## se 0.483610 0.4015246
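library(asaur) # assumed source of the pharmacoSmoking data
data(pharmacoSmoking)
dat1=pharmacoSmoking
dat1$ttr[dat1$ttr==0]<-0.5 # the Weibull fit needs strictly positive times
# Weibull fit
weib.fit=survreg(Surv(ttr,relapse)~grp+age+employment,dist="weibull",data=dat1)
model.step.weib<-step(weib.fit)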
## Start: AIC=920.3
## Surv(ttr, relapse) ~ grp + age + employment
##
## Df AIC
## <none> 920.30
## - employment 2 925.38
## - grp 1 926.86
## - age 1 930.14
weib.fit$loglik # first component: intercept-only model; second: model with covariates
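• Proportional hazards (PH) model given by
$$h_1(t) = e^{\beta} h_0(t)$$
• Accelerated failure time (AFT) model given by
$$h_1(t) = e^{-\gamma} h_0(e^{-\gamma} t)$$
• $\beta < 0 \Rightarrow e^{\beta} < 1$ for the proportional hazards model; this implies the treatment is
effective (increase in survival and decrease in hazard)
• $\gamma > 0 \Rightarrow e^{\gamma} > 1$ for the AFT model implies the treatment is effective.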
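For the Weibull distribution the two models coincide, which a short derivation makes explicit. This is a sketch assuming a Weibull baseline with cumulative hazard $H_0(t) = (\lambda t)^{\alpha}$, the parameterization used earlier in this report:

$$
\begin{aligned}
h_0(t) &= \frac{d}{dt}(\lambda t)^{\alpha} = \alpha \lambda^{\alpha} t^{\alpha-1} \\
e^{-\gamma} h_0(e^{-\gamma} t) &= e^{-\gamma}\,\alpha \lambda^{\alpha} (e^{-\gamma} t)^{\alpha-1} = e^{-\alpha\gamma}\,\alpha \lambda^{\alpha} t^{\alpha-1} = e^{-\alpha\gamma}\, h_0(t)
\end{aligned}
$$

So a Weibull AFT model with coefficient $\gamma$ is also a PH model with $\beta = -\alpha\gamma$. Since $\alpha > 0$, $\gamma > 0$ corresponds to $\beta < 0$, which is why the two effectiveness conditions above agree.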
##
## Call:
## survreg(formula = Surv(ttr, relapse) ~ grp, data = dat1, dist = "weibull")
## Value Std. Error z p
## (Intercept) 5.2859 0.3320 15.92 <2e-16
## grppatchOnly -1.2514 0.4348 -2.88 0.004
## Log(scale) 0.6888 0.0911 7.56 4e-14
##
## Scale= 1.99
##
## Weibull distribution
## Loglik(model)= -461.8 Loglik(intercept only)= -466.1
## Chisq= 8.63 on 1 degrees of freedom, p= 0.0033
## Number of Newton-Raphson Iterations: 5
## n= 125
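The `Chisq` line in this summary is a likelihood-ratio test comparing the fitted model with the intercept-only model. The sketch below recomputes it from the two log-likelihoods as printed; because those values are rounded to one decimal, the result only approximately matches the reported 8.63.

```r
# Log-likelihoods as printed (rounded) in the survreg summary
ll_null  <- -466.1   # intercept only
ll_model <- -461.8   # model with grp

lrt  <- 2 * (ll_model - ll_null)                   # likelihood-ratio statistic
pval <- pchisq(lrt, df = 1, lower.tail = FALSE)    # 1 df: one extra coefficient
c(Chisq = lrt, p = pval)
```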
## Call:
## coxph(formula = Surv(ttr, relapse) ~ grp, data = dat1)
##
## n= 125, number of events= 89
##
## coef exp(coef) se(coef) z Pr(>|z|)
## grppatchOnly 0.6050 1.8313 0.2161 2.8 0.00511 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## grppatchOnly 1.831 0.5461 1.199 2.797
##
## Concordance= 0.581 (se = 0.027 )
## Likelihood ratio test= 7.99 on 1 df, p=0.005
## Wald test = 7.84 on 1 df, p=0.005
## Score (logrank) test = 8.07 on 1 df, p=0.004
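From the above plot, it is evident that both models reach the same conclusion. Moreover,
combination (triple) therapy performs better than patchOnly, as it improves the survival
probability: the Cox model estimates a hazard ratio of 1.83 for patchOnly relative to
combination therapy, and the Weibull AFT coefficient for patchOnly is negative (-1.25).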
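The agreement between the two fits can also be checked numerically. For a Weibull model, an AFT coefficient γ corresponds to a log hazard ratio of -γ/σ, where σ is the `Scale` reported by `survreg` (σ = 1/shape). The sketch below plugs in the rounded values printed in the two summaries above; the variable names are illustrative.

```r
gamma_hat <- -1.2514   # AFT coefficient for grppatchOnly (survreg output)
sigma_hat <- 1.99      # survreg "Scale"
beta_cox  <- 0.6050    # Cox coefficient for grppatchOnly (coxph output)

beta_weib <- -gamma_hat / sigma_hat   # log hazard ratio implied by the Weibull fit
c(weibull_logHR = beta_weib, cox_logHR = beta_cox,
  weibull_HR = exp(beta_weib), cox_HR = exp(beta_cox))
```

Both log hazard ratios are positive and close to each other, so the two models agree that patchOnly carries a higher relapse hazard than combination therapy.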