lect23

The document discusses survival analysis, focusing on failure rates, the Cox Proportional Hazards Model, and system reliability. It explains various failure rate behaviors, such as constant, increasing, and decreasing rates, and introduces the Cox model for analyzing the impact of explanatory variables on lifespan. Additionally, it covers system survival in both series and parallel configurations, emphasizing the importance of connectivity and redundancy in reliability analysis.

23.0 Survival Analysis

• Answer Questions

• Failure Rates

• Cox Proportional Hazards Model

• Reliability of Systems
23.1 Failure Rates

The survival function is S(t) = 1 − F(t), the probability that a person or
machine or business lasts longer than t time units. Here F(t) is the usual
distribution function; in this context, it gives the probability that a thing
lasts less than or equal to t time units.

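The hazard function is λ(t) = f(t)/S(t), where f(t) is the density of the
lifetime. It is the instantaneous failure rate: for a short interval dt,
λ(t) dt is approximately the probability that the person or machine or
business dies in the next instant, given that it survived to time t.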
The exponential distribution has

F(t) = 1 − exp(−λt) so S(t) = exp(−λt)

and its hazard function is

λ(t) = f(t)/S(t) = λ exp(−λt)/exp(−λt) = λ

so the hazard function is a constant.

This is another way of seeing the memoryless property of the exponential
distribution. Since the hazard rate is constant, the failure probability does
not change with age. A light bulb with exponential lifespan has constant
probability of failure in the next instant, no matter how old or young it is.
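
The constant hazard and the memoryless property can be checked numerically. The following is a minimal sketch; the rate 0.5 and the ages 0 and 10 are arbitrary illustrative choices, not values from the lecture.

```python
import math

def exp_survival(t, rate):
    """S(t) = exp(-rate * t) for an exponential lifetime."""
    return math.exp(-rate * t)

def exp_density(t, rate):
    """f(t) = rate * exp(-rate * t)."""
    return rate * math.exp(-rate * t)

def exp_hazard(t, rate):
    """Hazard f(t)/S(t); for the exponential this equals rate at every t."""
    return exp_density(t, rate) / exp_survival(t, rate)

def conditional_survival(age, extra, rate):
    """P[T > age + extra | T > age] = S(age + extra) / S(age)."""
    return exp_survival(age + extra, rate) / exp_survival(age, rate)

rate = 0.5  # illustrative value only
hazards = [exp_hazard(t, rate) for t in (0.0, 1.0, 10.0)]

# A brand-new bulb and a bulb of age 10 have the same chance of
# lasting 2 more time units.
new_bulb = conditional_survival(0.0, 2.0, rate)
old_bulb = conditional_survival(10.0, 2.0, rate)
print(hazards, new_bulb, old_bulb)
```

Every entry of `hazards` equals the rate, and `new_bulb` agrees with `old_bulb` up to rounding.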
In most applications, things do not have constant hazard (failure) rates.

• Increasing Failure Rates describe things which are more likely to fail
with age, such as machines whose parts wear out.

• Decreasing Failure Rates describe things that are less likely to fail
with age: a business that has lasted two centuries is less likely to go
bankrupt than one that has lasted two years.

• Constant Failure Rates describe things with exponential lifetimes.

• Bathtub-Shaped Failure Rates describe things that have relatively
high failure rates when very young or very old, but flat rates in middle age
(such as human beings and some machines).

Often researchers want to use models that have specific kinds of failure
rate behavior.

[Figure: graphs of some of the main kinds of failure rate behavior.]

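Consider the Rayleigh distribution with parameters θ0 and θ1:

$$F(t) = 1 - \exp\left(-\theta_0 t - \tfrac{1}{2}\theta_1 t^2\right)$$

so the hazard function or failure rate is

$$\lambda(t) = \frac{f(t)}{S(t)} = \frac{(\theta_0 + \theta_1 t)\exp\left(-\theta_0 t - \tfrac{1}{2}\theta_1 t^2\right)}{\exp\left(-\theta_0 t - \tfrac{1}{2}\theta_1 t^2\right)} = \theta_0 + \theta_1 t.$$
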

When θ1 is negative, this has decreasing failure rate (the model is then
valid only while θ0 + θ1 t ≥ 0, since a hazard cannot be negative); when it
is positive, it has increasing failure rate; when zero, it reduces to the
exponential distribution. This makes it a flexible model for many survival
analysis applications.
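
As a sanity check on the derivation, one can differentiate S(t) numerically and compare f(t)/S(t) with θ0 + θ1 t. This is only a sketch: the parameter values 0.2 and 0.3 and the step size `h` are illustrative assumptions.

```python
import math

def survival(t, theta0, theta1):
    """S(t) = exp(-theta0*t - 0.5*theta1*t^2) for the linear-hazard model."""
    return math.exp(-theta0 * t - 0.5 * theta1 * t * t)

def numeric_hazard(t, theta0, theta1, h=1e-6):
    """Approximate f(t)/S(t), using f(t) = -S'(t) by a central difference."""
    density = -(survival(t + h, theta0, theta1)
                - survival(t - h, theta0, theta1)) / (2.0 * h)
    return density / survival(t, theta0, theta1)

def linear_hazard(t, theta0, theta1):
    """Closed form from the derivation: theta0 + theta1 * t."""
    return theta0 + theta1 * t

theta0, theta1 = 0.2, 0.3  # illustrative values (increasing failure rate)
for t in (0.5, 1.0, 2.0):
    print(t, numeric_hazard(t, theta0, theta1), linear_hazard(t, theta0, theta1))
```

The two columns should agree to several decimal places, and the hazard grows linearly in t because θ1 is positive here.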
23.2 Cox Proportional Hazards Model

This model assumes that there are relevant explanatory variables X1, . . . , Xp
that affect the lifespan of the entity. For human beings, these might be
cholesterol level, age of grandparents, and so forth. For machines, they might
be lubrication frequency, peak load, and so forth. Then the Cox Proportional
Hazards Model fits the hazard rate

λ(t; x1, . . . , xp) = exp(β1 x1 + · · · + βp xp) λ0(t)

where λ0(t) is a baseline hazard rate.

The log of the overall hazard function is adjusted up or down according to the
additive linear effects of the covariates (i.e., the explanatory variables).
Note that the Cox model is closely related to nonlinear regression. One has
data on covariates and survival times, and uses those to find coefficient
estimates β̂j. If an estimated coefficient is not significantly different from
zero, then the data give little evidence that the corresponding covariate
affects the lifespan of the item or the person.

The model assumes that the covariates have the same effect at all ages, and
that there are no unspecified interactions among the variables (e.g., smoking
and working around diesel exhaust may be more unhealthy than just the sum
of the effect of smoking and the effect of diesel fumes).

A nice feature is that if one believes the model, one can assess the relative
importance of the different covariates without having to know or even to
estimate the baseline rate λ0 (t). This baseline need not be a simple increasing
or decreasing failure rate—it may be realistically complicated, and the
inference on the covariates is still sound.
Since this is like regression analysis, what happened to β0 , the intercept term?

The intercept term is absorbed in the baseline hazard function:

ln λ(t; x1 , . . . , xp ) = ln λ0 (t) + β1 x1 + · · · + βp xp

so any constant term can just be added to ln λ0 (t).



Note: Is this a good model for assessing the impact of smoking on human
lifespan? Why or why not?

Note: Is this a good model for assessing the impact of the prime interest rate
on the lifespan of a business? (Hint: The prime rate may change over time.)
The hazard ratio for an entity with a set of covariates (x1 , . . . , xp ) compared
to an entity with covariates (x∗1 , . . . , x∗p ) indicates how much their overall risks
differ. This is often used in medical and insurance settings, where the doctor
or actuary is addressing multiple lifestyle issues.

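Using boldface to denote vectors, the hazard ratio is

$$h(\mathbf{x} : \mathbf{x}^*) = \frac{\lambda(t; \mathbf{x})}{\lambda(t; \mathbf{x}^*)} = \frac{\exp(\mathbf{x}\boldsymbol{\beta})}{\exp(\mathbf{x}^*\boldsymbol{\beta})} = \exp[(\mathbf{x} - \mathbf{x}^*)\boldsymbol{\beta}].$$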

The point estimate for the hazard ratio is exp[(x − x∗ )β̂]. In a more advanced
class, one can set confidence intervals on the hazard ratio.
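
A small sketch of the point estimate: because the baseline λ0(t) appears in both numerator and denominator, it cancels, and the ratio is the same at every t. The coefficients, the covariate vectors (smoking indicator, age), and the baseline function below are all hypothetical, chosen only to illustrate the arithmetic.

```python
import math

def linear_predictor(x, beta):
    """x . beta = beta_1 x_1 + ... + beta_p x_p."""
    return sum(xi * bi for xi, bi in zip(x, beta))

def cox_hazard(t, x, beta, baseline):
    """lambda(t; x) = exp(x . beta) * lambda_0(t)."""
    return math.exp(linear_predictor(x, beta)) * baseline(t)

def hazard_ratio(x, x_star, beta):
    """h(x : x*) = exp[(x - x*) . beta]; no baseline needed."""
    diff = [a - b for a, b in zip(x, x_star)]
    return math.exp(linear_predictor(diff, beta))

# Hypothetical fitted coefficients for (smoker indicator, age in years).
beta_hat = [0.7, 0.02]
smoker = [1.0, 50.0]
nonsmoker = [0.0, 50.0]

ratio = hazard_ratio(smoker, nonsmoker, beta_hat)

def baseline(t):
    """Hypothetical baseline hazard; any positive function would do."""
    return 0.01 + 0.002 * t

# The ratio of full hazards matches `ratio` at every t.
for t in (1.0, 5.0, 20.0):
    full = cox_hazard(t, smoker, beta_hat, baseline) / cox_hazard(t, nonsmoker, beta_hat, baseline)
    print(t, full, ratio)
```

Here the two entities differ only in the first covariate, so the ratio reduces to exp(0.7).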
For technical reasons one cannot estimate β̂1 , . . . , β̂p by directly maximizing
the likelihood function (which is one of our usual strategies). Instead, we must
maximize the partial likelihood:
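
$$L[\boldsymbol{\beta}; (y_j, \mathbf{x}_j)] = \prod_{\text{uncensored } y_j} \frac{\exp(\mathbf{x}_j \boldsymbol{\beta})}{\sum_{y_k \ge y_j} \exp(\mathbf{x}_k \boldsymbol{\beta})}.$$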

As before, it is often helpful to maximize the log of the partial likelihood.



It turns out that the maximization of the partial likelihood gives you the
estimates that one would have gotten if one had been able to maximize the
full likelihood.
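
To make the partial likelihood concrete, here is a minimal sketch that evaluates its log for a given β. It assumes no tied failure times, and the three-observation data set is invented purely for illustration; it is not how production software organizes the computation.

```python
import math

def log_partial_likelihood(beta, times, events, covariates):
    """Log of the Cox partial likelihood, assuming no tied failure times.

    times[j]      : observed time y_j
    events[j]     : True if y_j is an uncensored failure, False if censored
    covariates[j] : covariate vector x_j
    """
    def eta(x):
        return sum(b * xi for b, xi in zip(beta, x))

    total = 0.0
    for j, y_j in enumerate(times):
        if not events[j]:
            continue  # censored observations contribute no factor of their own
        # Risk set: everything still under observation at time y_j.
        risk_sum = sum(math.exp(eta(covariates[k]))
                       for k, y_k in enumerate(times) if y_k >= y_j)
        total += eta(covariates[j]) - math.log(risk_sum)
    return total

# Invented toy data: one covariate, second observation censored.
times = [2.0, 3.0, 5.0]
events = [True, False, True]
covariates = [[1.0], [0.0], [0.0]]

at_zero = log_partial_likelihood([0.0], times, events, covariates)
at_one = log_partial_likelihood([1.0], times, events, covariates)
print(at_zero, at_one)
```

In this toy data the item with x = 1 fails first, so the log partial likelihood is larger at β = 1 than at β = 0.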

Note: You will not need to know how to maximize the partial likelihoods for
the exam. Here we are just connecting some intellectual dots that underlie
the Cox proportional hazards model. But you should know how to do regular
maximum likelihood estimation for the exam.
A useful alternative to the Cox proportional hazards model is the competing
risks model. In this model, an item or a person can fail in multiple ways. For
example, a person can be hit by a car, or struck by lightning, or have a heart
attack.

The competing risks model assumes that these risks operate independently, and
the survival time is a race to see which failure mode occurs first.

If each of the k failure modes independently has cdf Fi(t), then the overall
probability of survival beyond time t is

$$S(t) = \prod_{i=1}^{k} [1 - F_i(t)]$$

since the item survives beyond time t only if none of the k modes has caused
a failure by then.
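
A short sketch of the product formula, under the added assumption that each failure mode has an exponential lifetime. The three rates are made up for illustration. In that special case the product collapses to a single exponential whose rate is the sum of the individual rates.

```python
import math

def exponential_cdf(rate):
    """Return the cdf F(t) = 1 - exp(-rate * t) as a function of t."""
    return lambda t: 1.0 - math.exp(-rate * t)

def competing_risks_survival(t, cdfs):
    """S(t) = product over failure modes of [1 - F_i(t)]."""
    survival = 1.0
    for cdf in cdfs:
        survival *= 1.0 - cdf(t)
    return survival

# Illustrative rates for three independent failure modes.
rates = [0.1, 0.02, 0.3]
cdfs = [exponential_cdf(r) for r in rates]

s_two = competing_risks_survival(2.0, cdfs)
print(s_two, math.exp(-sum(rates) * 2.0))
```

Both printed numbers should match, since the minimum of independent exponentials is again exponential.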


23.3 System Survival

Often a system has multiple components. We want to understand the expected
lifetime of such a system, as a function of the connectivity of the system.

In a simple series system, the first component is connected to the
second, which is connected to the third, and so forth. Then the probability
that the system survives is the probability that none of the components
fail. If component i has lifespan with cdf Fi(t), and if each component fails
independently, then the probability that the system survives beyond time t is

$$S(t) = \prod_{i} [1 - F_i(t)].$$

So the series reliability problem is equivalent to the competing risks model.


Now consider a system with parallel connectivity.
For a parallel system, reliability is improved by redundancy. Each of the
k components must fail before the system can fail. This means that the
distribution for the lifetime of the system is:

$$F(t) = \mathbb{P}[\text{all components fail before time } t] = \prod_{i} F_i(t).$$

From this, one can calculate hazard rate functions and so forth, at least in
principle.

This analysis assumes that each component fails independently. A more
sophisticated model with load transfer allows the probability of failure for a
component to increase as the number of working components decreases.
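
The series and parallel formulas can be compared side by side. The sketch below assumes independent components and, for illustration only, three identical exponential components with rate 1; the function names are hypothetical.

```python
import math

def series_failure_cdf(component_cdfs):
    """Series system: fails as soon as any component fails.

    F(t) = 1 - product of [1 - F_i(t)].
    """
    def cdf(t):
        survive = 1.0
        for f in component_cdfs:
            survive *= 1.0 - f(t)
        return 1.0 - survive
    return cdf

def parallel_failure_cdf(component_cdfs):
    """Parallel system: fails only when every component has failed.

    F(t) = product of F_i(t).
    """
    def cdf(t):
        fail = 1.0
        for f in component_cdfs:
            fail *= f(t)
        return fail
    return cdf

def unit_exponential_cdf(t):
    """Illustrative component lifetime: exponential with rate 1."""
    return 1.0 - math.exp(-t)

components = [unit_exponential_cdf] * 3
series_fail = series_failure_cdf(components)(1.0)
parallel_fail = parallel_failure_cdf(components)(1.0)
print(series_fail, parallel_fail)
```

With the same components, the parallel arrangement is far less likely to have failed by time 1 than the series arrangement, which is the benefit of redundancy.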
A more complex system might combine serial and parallel features. For
example:

[Figure: diagram of the example system.]

To solve this more complex system we break it into subsystems. First, note
that it has two parallel subsystems: components 1, 2, 3 and components 4,
5. The first subsystem is a series system; the second subsystem is a parallel
system.

From previous work, we know that the probability that the first subsystem
fails before time t is

1 − [1 − F1(t)] ∗ [1 − F2(t)] ∗ [1 − F3(t)].

And we know the probability that the second subsystem fails is F4(t) ∗ F5(t).
Therefore the probability that the overall system fails before time t is

F(t) = (1 − [1 − F1(t)] ∗ [1 − F2(t)] ∗ [1 − F3(t)]) ∗ F4(t) ∗ F5(t).
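
The final formula can be checked by brute force. This sketch assumes independent components and takes each Fi(t) to be 0.1 at some fixed t (an arbitrary illustrative value). It compares the formula with a direct enumeration of all 32 failed/working patterns.

```python
from itertools import product

def system_failure_probability(f1, f2, f3, f4, f5):
    """Formula from the text, with f_i standing for F_i(t) at a fixed t."""
    series_123_fails = 1.0 - (1.0 - f1) * (1.0 - f2) * (1.0 - f3)
    parallel_45_fails = f4 * f5
    # The two subsystems are in parallel, so both must fail.
    return series_123_fails * parallel_45_fails

def brute_force_failure_probability(fail_probs):
    """Sum the probability of every component pattern in which the system fails."""
    total = 0.0
    for state in product([True, False], repeat=5):  # True = component failed
        prob = 1.0
        for failed, p in zip(state, fail_probs):
            prob *= p if failed else 1.0 - p
        series_fails = state[0] or state[1] or state[2]
        parallel_fails = state[3] and state[4]
        if series_fails and parallel_fails:
            total += prob
    return total

fail_probs = [0.1] * 5  # illustrative values of F_i(t)
by_formula = system_failure_probability(*fail_probs)
by_enumeration = brute_force_failure_probability(fail_probs)
print(by_formula, by_enumeration)
```

Both approaches give about 0.00271 for these inputs: 0.271 for the series subsystem times 0.01 for the parallel subsystem.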
