Sac401-Lesson 2
We look at methods of making inference about $\phi$ based on the likelihood function, so it is important to know how to derive the likelihood function.
A unit observed to fail at time $t$ contributes a term $f(t; \phi)$ to the likelihood, the density of failure at time $t$. On the other hand, a unit whose survival time is censored at $c$ contributes $S(c; \phi)$, the probability of survival beyond $c$.
$$L = \prod_{U} f(t_i; \phi) \prod_{C} S(c_i; \phi)$$
where $\prod_{U}$ and $\prod_{C}$ indicate products over uncensored and censored units respectively.
Since $x_i = \min(t_i, c_i)$, the log-likelihood is
$$l = \sum_{U} \log f(x_i; \phi) + \sum_{C} \log S(x_i; \phi)$$
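Now
$$f(t_i; \phi) = h(t_i; \phi)\, S(t_i; \phi)$$
thus
$$l = \sum_{U} \log\left[ h(x_i; \phi)\, S(x_i; \phi) \right] + \sum_{C} \log S(x_i; \phi) = \sum_{U} \log h(x_i; \phi) + \sum_{A} \log S(x_i; \phi)$$
where $\sum_{A}$ indicates summation over all sample units. Since
$$S(t) = \exp(-H(t))$$
then
$$l = \sum_{U} \log h(x_i; \phi) - \sum_{A} H(x_i; \phi) \qquad (2.1)$$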
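Equation (2.1) can be evaluated directly once a hazard and a cumulative hazard are supplied. The following is a minimal Python sketch under stated assumptions: the function name `censored_loglik`, its arguments and the small data set are hypothetical and are not part of the lesson.

```python
import math

def censored_loglik(x, failed, hazard, cum_hazard):
    """Evaluate (2.1): sum over failures of log h(x_i) minus sum over all units of H(x_i).

    x          -- observed times x_i = min(t_i, c_i)
    failed     -- True where the unit was observed to fail, False where censored
    hazard     -- function returning h(t)
    cum_hazard -- function returning H(t)
    """
    log_h_term = sum(math.log(hazard(xi)) for xi, f in zip(x, failed) if f)
    cum_h_term = sum(cum_hazard(xi) for xi in x)
    return log_h_term - cum_h_term

# Hypothetical data: three failures and two censored times.
x = [2.0, 3.0, 5.0, 7.0, 8.0]
failed = [True, False, True, True, False]

# A constant hazard is used purely as an example: h(t) = beta, H(t) = beta * t.
beta = 0.1
loglik = censored_loglik(x, failed, lambda t: beta, lambda t: beta * t)
print(loglik)
```

Passing the hazard and cumulative hazard as functions keeps the same routine usable for any parametric model whose $h$ and $H$ are known.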
1.2.2 Fitting a parametric model to a single sample of survival data by assuming different
distributions.
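For the exponential distribution the survivor function is
$$S(t) = e^{-\beta t}$$
The hazard function $h(t) = f(t)/S(t)$ is therefore constant:
$$h(t; \phi) = h(t; \beta) = \beta$$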
Now
$$H(x_i; \phi) = \int_0^{x_i} h(u)\, du = \beta \int_0^{x_i} du = \beta x_i$$
Thus
$$l = d \log \beta - \beta \sum x_i$$
where $\sum x_i$ is often called the total time at risk, taken over both failures and survivors, and $d$ is the total number of failures.
Setting the derivative to zero,
$$\frac{dl}{d\beta} = \frac{d}{\beta} - \sum x_i = 0$$
gives
$$\hat{\beta} = \frac{d}{\sum x_i}$$
which is the M.L.E. of $\beta$.
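We get the variance of $\hat{\beta}$ using the method of the information matrix (2.2.2.2).
The information matrix is given by
$$I = -\frac{d^2 l}{d\beta^2} = \frac{d}{\beta^2}$$
Now, evaluating at $\beta = \hat{\beta}$,
$$\widehat{\mathrm{Var}}(\hat{\beta}) = I^{-1} = \frac{\hat{\beta}^2}{d} = \frac{d}{\left(\sum x_i\right)^2}$$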
Note that censored failure times contribute to the denominator but not to the numerator of the ratio $\hat{\beta} = d / \sum x_i$.
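The estimate and its variance for the constant-hazard (exponential) model need only the number of failures and the total time at risk. A minimal Python sketch follows; the function name `fit_exponential` and the sample values are hypothetical and chosen only to illustrate the formulas.

```python
def fit_exponential(x, failed):
    """Return (beta_hat, var_hat) for right-censored data under a constant hazard."""
    d = sum(1 for f in failed if f)   # number of observed failures
    total_time = sum(x)               # total time at risk: failures and survivors
    beta_hat = d / total_time         # maximum likelihood estimate of beta
    var_hat = beta_hat ** 2 / d       # inverse information, equal to d / (sum x_i)^2
    return beta_hat, var_hat

# Hypothetical sample: three failures and two censored times.
x = [2.0, 3.0, 5.0, 7.0, 8.0]
failed = [True, False, True, True, False]

beta_hat, var_hat = fit_exponential(x, failed)
print(beta_hat, var_hat)
```

The censored times 3.0 and 8.0 enter only through the total time at risk, which is the point made in the note above.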
If there is no censoring (all the observations are complete), then $d = n$ and the log-likelihood becomes
$$l = n \log \beta - \beta \sum x_i$$
1.2.2.2 When the failure time T is discrete
Next we consider the case when $T$ is discrete with probability $f_j(\phi)$ at point $U_j$, such that $U_0 \le U_1 \le U_2 \le \dots$
We take the convention that a unit censored at point $C$ could have been observed to fail at $C$. In terms of the hazards $h_j(\phi)$,
$$f_j(\phi) = h_j(\phi) \prod_{k=0}^{j-1} \left(1 - h_k(\phi)\right)$$
Then
$$S(C^{+}; \phi) = \left(1 - h_j(\phi)\right) \prod_{k=0}^{j-1} \left(1 - h_k(\phi)\right)$$
$$\Rightarrow \quad S(C^{+}; \phi) = \prod_{j:\, U_j \le C} \left(1 - h_j(\phi)\right)$$
for units which have not failed at $C$.
To obtain the full likelihood from a sample of $n$ observations, we first collect all the terms corresponding to $U_j$. If there are $d_j$ failures among the $r_j$ units in view at $U_j$ (i.e. $r_j$ are the units that ...), these terms contribute
$$\left(h_j(\phi)\right)^{d_j} \left(1 - h_j(\phi)\right)^{r_j - d_j}$$
so that
$$L = \prod_{j} \left[h_j(\phi)\right]^{d_j} \left[1 - h_j(\phi)\right]^{r_j - d_j}$$
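The counts $d_j$ and $r_j$ and the discrete likelihood can be computed mechanically from the observed times. The sketch below is illustrative only: the function names, the data and the hazard values are hypothetical, and a unit censored at $U_j$ is counted as in view at $U_j$, following the convention stated above.

```python
import math
from collections import Counter

def risk_table(x, failed):
    """Return rows (U_j, r_j, d_j), one per distinct observed failure time U_j."""
    deaths = Counter(xi for xi, f in zip(x, failed) if f)
    rows = []
    for u in sorted(deaths):
        # Units in view at u: every unit whose observed time is at least u.
        r = sum(1 for xi in x if xi >= u)
        rows.append((u, r, deaths[u]))
    return rows

def discrete_loglik(rows, h):
    """log L = sum over j of d_j * log(h_j) + (r_j - d_j) * log(1 - h_j)."""
    return sum(d * math.log(hj) + (r - d) * math.log(1 - hj)
               for (_, r, d), hj in zip(rows, h))

# Hypothetical data: four failures and two censored times.
x = [2, 3, 3, 5, 5, 7]
failed = [True, True, False, True, True, False]

rows = risk_table(x, failed)
h = [0.2, 0.2, 0.5]   # hypothetical hazard values, one per failure time
loglik = discrete_loglik(rows, h)
print(rows, loglik)
```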
Group A: 9, 13, 13*, 18, 23, 28*, 31, 34, 45*, 48, 116*
i) Compute $U_j$, $r_j$ and $d_j$.
ii) Find the 95% confidence interval for groups A and B assuming an exponential
distribution for failure times.
Solution:
i)
j    U_j    r_j    d_j
1    9      14     1
2    13     13     2
3    13*    11     0
3    18     10     1
4    23     9      1
5    31     5      3
6    34     4      1
7    45*    3      0
8    116    1      0
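ii)
For group A, $d = 7$ and $\sum x_i = 378$. Therefore
$$\hat{\beta} = \frac{d}{\sum x_i} = \frac{7}{378} = 0.0185$$
The standard error of $\hat{\beta}$ is
$$\mathrm{se}(\hat{\beta}) = \sqrt{\frac{\hat{\beta}^2}{d}} = \frac{0.0185}{\sqrt{7}} = 0.00699$$
The symmetric 95% confidence interval based on a normal approximation to the distribution of $\hat{\beta}$ is
$$\hat{\beta} \pm 1.96\, \mathrm{se}(\hat{\beta}) = 0.0185 \pm 1.96 \times 0.00699 = (0.0048, 0.0322)$$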
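The arithmetic for group A can be checked with a few lines of Python. This is only a sketch: the value d = 7 is an assumption inferred from the reported estimate 0.0185 (which equals 7/378), the multiplier 1.96 is the usual normal 97.5% point, and the variable names are illustrative.

```python
import math

# Group A observed times (failures and censored times all count towards time at risk).
times = [9, 13, 13, 18, 23, 28, 31, 34, 45, 48, 116]
d = 7                     # assumed number of failures, inferred from 0.0185 = 7 / 378

total_time = sum(times)   # total time at risk
beta_hat = d / total_time
se = beta_hat / math.sqrt(d)   # square root of beta_hat^2 / d

z = 1.96                  # normal multiplier for a two-sided 95% interval
lower = beta_hat - z * se
upper = beta_hat + z * se
print(total_time, round(beta_hat, 4), round(se, 5), lower, upper)
```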
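For the Weibull distribution the hazard function is
$$h(t) = k\rho(\rho t)^{k-1}$$
so that $H(t) = (\rho t)^k$ and
$$l = d \log k + k d \log \rho + (k - 1) \sum_{U} \log x_i - \rho^k \sum x_i^k$$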
We differentiate partially w.r.t. $\rho$ and partially w.r.t. $k$ and equate to zero.
W.r.t. $\rho$:
$$\frac{\partial l}{\partial \rho} = \frac{kd}{\rho} - k\rho^{k-1} \sum x_i^k = 0 \qquad (2.3)$$
W.r.t. $k$:
$$\frac{\partial l}{\partial k} = \frac{d}{k} + d \log \rho + \sum_{U} \log x_i - \rho^k \sum x_i^k \log(\rho x_i) = 0 \qquad (2.4)$$
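From (2.3),
$$\frac{kd}{\rho} - k\rho^{k-1} \sum x_i^k = 0 \quad \Rightarrow \quad \hat{\rho} = \left( \frac{d}{\sum x_i^k} \right)^{1/k}$$
Substituting this into (2.4),
$$\frac{d}{k} + d \log \rho + \sum_{U} \log x_i - \rho^k \sum x_i^k \log(\rho x_i) = 0$$
becomes
$$\frac{d}{k} + \sum_{U} \log x_i - \frac{d \sum x_i^k \log x_i}{\sum x_i^k} = 0$$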
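This is a non-linear equation in $k$ with no closed-form solution, so it must be solved with an iterative numerical procedure such as the Newton-Raphson algorithm. Substituting the solution $\hat{k}$ into the expression for $\hat{\rho}$ then yields both estimates, which together maximize the log-likelihood.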
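The iteration can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function name `fit_weibull`, the starting value k = 1, the tolerance and the step-halving safeguard are all assumptions, and the data are the IUD discontinuation times quoted in this lesson, with an asterisk read as a censored time.

```python
import math

def fit_weibull(x, failed, k=1.0, tol=1e-10, max_iter=100):
    """Solve the non-linear equation in k by Newton-Raphson, then recover rho."""
    d = sum(1 for f in failed if f)
    logs = [math.log(xi) for xi in x]
    sum_log_fail = sum(li for li, f in zip(logs, failed) if f)

    def score_and_slope(k):
        w = [xi ** k for xi in x]
        s0 = sum(w)
        s1 = sum(wi * li for wi, li in zip(w, logs))
        s2 = sum(wi * li * li for wi, li in zip(w, logs))
        g = d / k + sum_log_fail - d * s1 / s0                # left-hand side of the equation
        dg = -d / k ** 2 - d * (s2 / s0 - (s1 / s0) ** 2)     # its derivative w.r.t. k
        return g, dg

    for _ in range(max_iter):
        g, dg = score_and_slope(k)
        step = g / dg
        new_k = k - step
        while new_k <= 0:          # safeguard: keep k positive
            step /= 2
            new_k = k - step
        converged = abs(new_k - k) < tol
        k = new_k
        if converged:
            break

    rho = (d / sum(xi ** k for xi in x)) ** (1 / k)
    return k, rho

# IUD data: weeks to discontinuation; False marks a censored (starred) time.
x = [10, 13, 18, 19, 23, 30, 36, 36, 38, 54, 56, 59, 75, 93, 97, 104, 107, 107]
failed = [True, False, False, True, False, True, True, False, False,
          False, False, True, True, True, True, False, True, False]

k_hat, rho_hat = fit_weibull(x, failed)
print(k_hat, rho_hat)
```

Solving the one-dimensional equation in $k$ and then computing $\rho$ avoids a two-parameter search. A statistical package may use a different parameterization, so its output may need converting before comparison.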
EXAMPLE 2.2
The following data refer to the number of weeks from the commencement of the use of an intrauterine device (IUD) for family planning to discontinuance. The device is removed (discontinued) if the woman becomes pregnant or experiences prolonged or irregular bleeding.
Time in weeks to discontinuation of the use of IUD (* denotes a censored time):
10, 13*, 18*, 19, 23*, 30, 36, 36*, 38*, 54*, 56*, 59, 75, 93, 97, 104*, 107, 107*.
Fit a Weibull distribution to these data.
Solution
Using a computer package such as SAS, the Weibull distribution can be fitted. The results of the
The confidence intervals for the two estimates are =(-0.00143,0.00235) and
Brief summary of overall task: Watch the video on how to fit a given parametric distribution.
https://www.youtube.com/watch?v=ccMcg8BRnUg
Spark
Individual contribution: The following data refer to the number of weeks from the commencement of the use of an intrauterine device (IUD) for family planning to discontinuance. The device is removed (discontinued) if the woman becomes pregnant or experiences prolonged or irregular bleeding.
Time in weeks to discontinuation of the use of IUD:
10, 13*, 18*, 19, 23*, 30, 36, 36*, 38*, 54*, 56*, 59, 75, 93, 97, 104*, 107, 107*
Fit an exponential distribution to these data.
2.3 Assessment
1. The following data give remission times in weeks of leukemia patients:
6*, 6, 6, 6, 7, 9*, 10*, 10, 11*, 13, 16, 17*, 19*, 20*, 22, 23, 25*, 32*, 32*, 34*, 35*
2. The following data refer to the number of weeks from the commencement of the use of an intrauterine device (IUD) for family planning to discontinuance. The device is removed (discontinued) if the woman becomes pregnant or experiences prolonged or irregular bleeding.
Time in weeks to discontinuation of the use of IUD:
10, 13*, 18*, 19, 23*, 30, 36, 36*, 38*, 54*, 56*, 59, 75, 93, 97, 104*, 107, 107*