Lecture 4
INSE 6320
Risk Analysis for Information and Systems Engineering
• Survival Analysis
• Reliability
• Weibull Analysis
• Expert Opinion
Risk Assessment
• A class of statistical methods for studying the occurrence and timing of events.
(branch of statistics for analyzing the expected duration of time until one event occurs)
Survival Analysis
• Survival Function - A function describing the proportion of items (or
individuals) surviving to or beyond a given time.
• Notation:
▪ T ≡ survival time of a selected item
▪ t ≡ a specific point in time.
▪ S(t) = P (T > t) ≡ Survival Function=P (surviving longer than time t)
▪ h(t) ≡ instantaneous failure rate among survivors at time t ( hazard
function)
Two related quantities are used to describe survival data: the survival probability and
the hazard rate.
The survival probability, also known as the survivor function S(t), is the probability that
an individual survives from the time origin (e.g. diagnosis of cancer) to a specified
future time t.
The hazard, denoted by h(t), is the instantaneous rate at which an individual who is still under
observation at time t has an event at that time. It is a rate per unit time rather than a
probability, so it can exceed 1.
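• Cumulative distribution function: Probability that the item fails by time t
F(t) = P(T ≤ t)
• Failure density function: f(t) is the rate of change of F(t). For a small interval Δt,
f(t)Δt approximates the probability that the product fails between t and t + Δt.
$$f(t) = \frac{dF(t)}{dt}$$
• Survival function: Probability that the item does not fail before time t
S(t) = P(T > t) = 1 − F(t)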
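The hazard function h(t) from the notation list can be tied to these three functions. A short derivation, using only the definitions above and assuming F(t) is differentiable:

```latex
h(t) = \lim_{\Delta t \to 0} \frac{P(t < T \le t + \Delta t \mid T > t)}{\Delta t}
     = \frac{f(t)}{S(t)}
     = -\frac{d}{dt} \ln S(t)
```

The middle form shows why the hazard is described as the failure rate among survivors: the density f(t) is rescaled by the fraction S(t) still surviving at time t.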
Kaplan-Meier Estimate of S(t)
The Kaplan-Meier method estimates the survival probability from observed survival times.
It is used to analyze time-to-event data, such as the time until
a specific event occurs.
• Rank the failure times as t(1) ≤ t(2) ≤ … ≤ t(n).
• Number of items at risk before t(i) is ni
• Number of items failed at time t(i) is di
• Estimated hazard function at t(i):
$$\hat{h}_i = \frac{d_i}{n_i}$$
• Estimate of survival function:
$$\hat{S}(t) = \prod_{t_{(i)} \le t} \frac{n_i - d_i}{n_i}$$
• In medical research, the K-M estimate is often used to measure the fraction of patients living for a certain amount of
time after treatment.
• In economics, it can be used to measure the length of time people remain unemployed after a job loss.
• In engineering, it can be used to measure the time until failure of machine parts.
The Kaplan–Meier estimator is the nonparametric maximum likelihood estimate of S(t).
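The product-limit calculation described above can be sketched in a few lines. This is a minimal illustration in Python (the notes themselves use MATLAB), and the function name `kaplan_meier` and its argument layout are assumptions made for this example. The input is the seven failure times behind the fictitious life table used in these notes (three failures at time 4, one at 7, two at 11, one at 12).

```python
from collections import Counter

def kaplan_meier(times, censored=None):
    """Return (t, d, n, h, S) rows, one per distinct failure time.

    times    : observed times (failure or censoring)
    censored : optional flags, 1 = censored, 0 = exact failure time
    """
    if censored is None:
        censored = [0] * len(times)
    # Count exact failures at each distinct time; censored items are not failures.
    failures = Counter(t for t, c in zip(times, censored) if not c)
    rows, survival = [], 1.0
    for t in sorted(failures):
        n = sum(1 for u in times if u >= t)  # items at risk just before t
        d = failures[t]                      # failures observed at t
        h = d / n                            # estimated hazard at t
        survival *= (n - d) / n              # product-limit update
        rows.append((t, d, n, h, survival))
    return rows

rows = kaplan_meier([4, 4, 4, 7, 11, 11, 12])
for t, d, n, h, s in rows:
    print(f"t={t:>2}  d={d}  n={n}  h={h:.4f}  S={s:.4f}")
```

For these data the survival estimates step down through 4/7, 3/7, 1/7, and 0. Items censored at a time t are still counted as at risk at t, which is the convention the censored example in these notes also follows.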
Kaplan-Meier Estimate of S(t)
Failure Rate
$$\hat{h}_i = \frac{d_i}{n_i}$$
The hazard rate at each period is the number of failures in the given period divided by
the number of surviving individuals at the beginning of the period (number at risk).
Survival Probability
$$\hat{S}(t) = \prod_{t_{(i)} \le t} \frac{n_i - d_i}{n_i}$$
For each period, the survival probability is the product of the complements of the hazard rates
up to and including that period.
The initial survival probability at the beginning of the first time period is 1. If the hazard
rate for each period is h(ti), then the survivor probability at time t is the product of the
factors 1 − h(ti) over all t(i) ≤ t.
• The number at risk is the total number of survivors at the beginning of each period. The
number at risk at the beginning of the first period is all individuals in the lifetime study. At
the beginning of each remaining period, the number at risk is reduced by the number of
failures plus individuals censored at the end of the previous period.
• This life table shows fictitious survival data. At the beginning of the first failure time, there
are seven items at risk. At time 4, three fail. So at the beginning of time 7, there are four
items at risk. Only one fails at time 7, so the number at risk at the beginning of time 11 is
three. Two fail at time 11, so at the beginning of time 12, the number at risk is one. The
remaining item fails at time 12.
t(i)   Number Failed di   Number at Risk ni
4      3                  7
7      1                  4
11     2                  3
12     1                  1
A Data Example
How do we compute the hazard rate, survival probability, and cumulative distribution
function for the following data?
t(i)   Number Failed   Number at Risk   Hazard Rate    Survival Probability   Cumulative Distribution Function
       di              ni               h(t) = di/ni   S(t)                   F(t) = 1 − S(t)
4      3               7
7      1               4
11     2               3
12     1               1
A Data Example
We can compute the hazard rate, survival probability, and cumulative distribution
function for the following data as follows, starting with the first failure time:
t(i)   Number Failed   Number at Risk   Hazard Rate    Survival Probability      Cumulative Distribution Function
       di              ni               h(t) = di/ni   S(t)                      F(t) = 1 − S(t)
4      3               7                3/7            1 − 3/7 = 4/7 = 0.5714    0.4286
7      1               4
11     2               3
12     1               1
A Data Example
We can compute the hazard rate, survival probability, and cumulative distribution
function for the following data as follows:
t(i)   Number Failed   Number at Risk   Hazard Rate    Survival Probability               Cumulative Distribution Function
       di              ni               h(t) = di/ni   S(t)                               F(t) = 1 − S(t)
4      3               7                3/7            1 − 3/7 = 4/7 = 0.5714             0.4286
7      1               4                1/4            4/7 × (1 − 1/4) = 3/7 = 0.4286     0.5714
11     2               3                2/3            3/7 × (1 − 2/3) = 1/7 = 0.1429     0.8571
12     1               1                1/1            1/7 × (1 − 1) = 0                  1
A Censored Data Example
When you have censored data, the life table might look like the following:
Time   Number Failed   Censoring   Number at Risk   Hazard Rate   Survival Probability   Cumulative Distribution Function
t(i)   di                          ni
4      ?               1           7
7                      0           4
11                     1           3
12                     0           1
• At any given time, items censored at that time are still counted in the number at
risk, and the hazard rate is the number failed divided by the total number
at risk.
• When updating the number at risk at the beginning of each period, the total number
failed and censored in the previous period is subtracted from the number at risk at the
beginning of that previous period.
• Notation: 1 for censored data, and 0 for exact failure time.
A Censored Data Example
When you have censored data, the life table might look like the following:
Time   Number Failed   Censoring   Number at Risk   Hazard Rate   Survival Probability   Cumulative Distribution Function
t(i)   di                          ni
4      2               1           7
7      1               0           4
11     1               1           3
12     1               0           1
A Censored Data Example
When you have censored data, the life table might look like the following:
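Time   Number Failed   Censoring   Number at Risk   Hazard Rate   Survival Probability                 Cumulative Distribution Function
t(i)   di                          ni               di/ni         S(t)                                 F(t) = 1 − S(t)
4      2               1           7                2/7           1 − 2/7 = 5/7 = 0.7143               0.2857
7      1               0           4                1/4           5/7 × (1 − 1/4) = 15/28 = 0.5357     0.4643
11     1               1           3                1/3           15/28 × (1 − 1/3) = 5/14 = 0.3571    0.6429
12     1               0           1                1/1           5/14 × (1 − 1) = 0                   1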
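The bookkeeping for censored data can be checked with a short script. The sketch below is in Python rather than the MATLAB used in these notes, and the function name `censored_life_table` is an assumption for illustration. It takes the per-period counts from the censored example (2 failed and 1 censored at time 4, 1 failed at 7, 1 failed and 1 censored at 11, 1 failed at 12) and applies the two rules stated in the bullets: censored items count as at risk in their own period, and both failed and censored items are removed before the next period.

```python
def censored_life_table(periods, n_start):
    """Build life-table rows from (time, failed, censored) counts per period.

    Each row is (time, failed, censored, at_risk, hazard, survival, cdf).
    """
    table, at_risk, survival = [], n_start, 1.0
    for time, failed, censored in periods:
        hazard = failed / at_risk        # censored items still count as at risk
        survival *= 1.0 - hazard         # product of complements of hazard rates
        table.append((time, failed, censored, at_risk,
                      hazard, survival, 1.0 - survival))
        at_risk -= failed + censored     # drop failed and censored items
    return table

periods = [(4, 2, 1), (7, 1, 0), (11, 1, 1), (12, 1, 0)]
table = censored_life_table(periods, n_start=7)
for time, failed, censored, at_risk, hazard, survival, cdf in table:
    print(f"t={time:>2}  d={failed}  c={censored}  n={at_risk}  "
          f"h={hazard:.4f}  S={survival:.4f}  F={cdf:.4f}")
```

The number-at-risk column comes out as 7, 4, 3, 1, matching the life table, and the survival probabilities are 5/7, 15/28, 5/14, and 0.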
A Censored Data Example: MATLAB
In MATLAB, we can enter the data and calculate these measures using ecdf. Suppose the
failure times are stored in an array y.
• While using ecdf, you must also enter the censoring information using an array of
binary variables. Enter 1 for censored data, and enter 0 for exact failure time.
>> y = [4 4 4 7 11 11 12];
>> cens = [1 0 0 0 1 0 0];
>> [f,x] = ecdf(y,'censoring',cens)
• ecdf, by default, produces the cumulative distribution function values. To obtain the
survivor function or the cumulative hazard function instead, specify it using the optional
'function' name-value pair argument. You can also plot the results as follows.
>> ecdf(y,'censoring',cens,'function','survivor');
Reliability Definition
Reliability
• Reliability: The probability that an item will perform its intended function without
failure under stated conditions for a specified period of time.
• Failure: The termination of the ability of the product to perform its intended function
• In its simplest and most general form, reliability is the probability of success.
Reliability Theory
Let T be a random variable representing the failure time or lifetime of a
physical system. For this system, the probability that it will fail by time t is:
$$F(t) = P[T \le t] = \int_0^t f(u)\,du$$
• Failure rate: the probability that a failure will occur in the interval [t1, t2]
given that a failure has not occurred before time t1. This is written as:
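The expression itself does not appear in these notes. As a sketch of the standard forms, assuming R(t) = 1 − F(t) denotes the reliability function: the first line is the conditional probability described in the sentence above, and the second divides it by the interval length to give a rate per unit time, which is how many reliability texts define the failure rate over an interval. The symbol λ(t1, t2) is introduced here only for illustration.

```latex
P(t_1 < T \le t_2 \mid T > t_1) = \frac{F(t_2) - F(t_1)}{R(t_1)} = \frac{R(t_1) - R(t_2)}{R(t_1)}

\lambda(t_1, t_2) = \frac{R(t_1) - R(t_2)}{(t_2 - t_1)\, R(t_1)}
```

Letting t2 approach t1 in the second expression gives the instantaneous hazard rate h(t) = f(t)/R(t).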
Reliability Terms
Benefits of reliability block diagrams:
• The pictorial representation means that models are easily understood and therefore
readily checked.
• Block diagrams are used to identify the relationship between elements in the system.
The overall system reliability can then be calculated from the reliabilities of the blocks
using the laws of probability.
• Block diagrams can be used for the evaluation of system availability provided that both
the repair of blocks and failures are independent events, i.e. provided the time taken to
repair a block is dependent only on the block concerned and is independent of repair
to any other block
• Series System
The reliability of the system is given by
$$R(t) = R_A(t)\,R_B(t)\,R_C(t) \cdots R_Z(t)$$
[Figure: series block diagram from Input to Output]
The interpretation can be stated as ‘any unit failing causes the system as a whole to fail’.
• Parallel System
The reliability of the system is given by:
$$R(t) = 1 - (1 - R_X(t))(1 - R_Y(t))$$
[Figure: parallel block diagram from Input to Output]
Units X and Y operate in such a way that the system survives as long as at
least one of the units survives.
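• Series/Parallel System
When blocks such as X and Y themselves comprise sub-blocks in series, RX(t) and RY(t) are
each the product of the reliabilities of their sub-blocks, and the system reliability is again
$$R(t) = 1 - (1 - R_X(t))(1 - R_Y(t))$$
[Figure: series/parallel block diagram from Input to Output]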
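The series and parallel formulas compose directly, which a small sketch makes concrete. This is written in Python for illustration (the notes use MATLAB); the helper names `series` and `parallel` and all numeric reliabilities are hypothetical, and block failures are assumed to be independent.

```python
from math import prod

def series(reliabilities):
    """Series system: every block must survive."""
    return prod(reliabilities)

def parallel(reliabilities):
    """Parallel system: at least one block must survive."""
    return 1.0 - prod(1.0 - r for r in reliabilities)

# Hypothetical block reliabilities at one fixed time t (illustrative values only).
r_x = series([0.90, 0.80])         # branch X: two sub-blocks in series
r_y = series([0.95, 0.90])         # branch Y: two sub-blocks in series
r_system = parallel([r_x, r_y])    # X and Y in parallel

print(f"R_X = {r_x:.4f}, R_Y = {r_y:.4f}, R_system = {r_system:.4f}")
```

With these made-up values each branch is less reliable than its weakest sub-block, while the parallel arrangement is more reliable than either branch alone.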
Commonality:
Both reliability and survival functions essentially measure the same concept: the probability of no event
(failure or death) occurring by a certain time.
Both are crucial for assessing risk over time and making informed decisions about system maintenance,
safety, and performance.
Differences:
Context: Reliability is more frequently used in engineering and industrial contexts, while survival is more
common in medical and biological contexts.
Reliability:
An automotive manufacturer may use reliability analysis to determine the probability that a car’s
transmission will last at least 100,000 miles without failure. This involves collecting data on transmission
failures and using reliability functions to predict performance.
Survival:
In a medical study, researchers might use survival analysis to determine the probability that patients with a
certain disease will survive for five years after treatment. This involves analyzing patient data and using
survival functions to estimate survival probabilities.
Weibull Probability Distribution
A random variable T ~ Weibull(α, β) is said to have the Weibull probability
distribution with parameters α and β, where α > 0 and β > 0, if the probability
density function is
$$f(t) = \begin{cases} \dfrac{\beta}{\alpha^{\beta}}\, t^{\beta-1} e^{-(t/\alpha)^{\beta}}, & t \ge 0 \\ 0, & \text{elsewhere} \end{cases}$$
where β is the shape parameter, α is the scale parameter, and t is the mission length
(time, cycles, etc.). The scale parameter (also called the characteristic life) is the
time at which 63.2% of the product will have failed, since F(α) = 1 − e^(−1) ≈ 0.632 for any β.
• The scale parameter influences both the mean and the spread of the Weibull distribution.
• If β = 1, the Weibull reduces to the Exponential Distribution.
• Weibull distribution is frequently used to model fatigue failure, ball bearing failure etc.
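Two of the statements above are easy to verify numerically: the characteristic-life property F(α) ≈ 0.632 and the reduction to the exponential distribution when β = 1. The sketch below uses Python's standard library (the notes use MATLAB); the function names and the parameter values are assumptions chosen for illustration.

```python
import math

def weibull_pdf(t, alpha, beta):
    """Weibull density with scale alpha and shape beta, intended for t > 0."""
    if t < 0:
        return 0.0
    return (beta / alpha**beta) * t**(beta - 1) * math.exp(-(t / alpha)**beta)

def weibull_cdf(t, alpha, beta):
    """Weibull cumulative distribution function F(t)."""
    if t < 0:
        return 0.0
    return 1.0 - math.exp(-(t / alpha)**beta)

alpha = 2.0  # hypothetical scale parameter
for beta in (0.5, 1.0, 3.44):
    # F(alpha) = 1 - exp(-1) regardless of the shape parameter.
    print(f"beta={beta}: F(alpha) = {weibull_cdf(alpha, alpha, beta):.4f}")

# With beta = 1 the density equals the exponential density (1/alpha) * exp(-t/alpha).
t = 1.5
exponential_pdf = (1.0 / alpha) * math.exp(-t / alpha)
print(abs(weibull_pdf(t, alpha, 1.0) - exponential_pdf) < 1e-12)
```

The density is evaluated only for t > 0 here, because for β < 1 it grows without bound as t approaches 0.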
Weibull Probability Distribution
[Figure: Weibull probability density function f(t) for β = 0.5, 1.0, 2.5, 3.44, and 5.0; t is in multiples of α]
[Figure: Weibull F(t) versus t for β = 0.5, 1, 3, and 5; t from 0 to 200]
• The reliability of a product is the probability that it does not fail before time t.
It is therefore the complement of the CDF:
R (t ) = 1 − F (t )
Weibull Probability Plot in MATLAB
>> data = [1.03; 2.20; 1.55; 0.24; 1.83; 0.40; 0.87; 0.03; 2.24; 1.05; 2.05; 0.14; 3.68; 0.48; 0.41];
>> wblplot(data);
Effects of β on Weibull Reliability Function
• 0 < β < 1, R(t) decreases sharply and monotonically, and is convex.
• β = 1, R(t) decreases monotonically but less sharply than for 0 < β < 1, and is convex.
• β > 1, R(t) decreases as t increases. As wear-out sets in, the curve goes through an
inflection point and decreases sharply.
$$h(t) = \frac{f(t)}{R(t)} = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1}, \quad t \ge 0,\ \alpha > 0,\ \beta > 0$$
[Figure: example of failure rate versus time t, with regions labeled 0 < β < 1, constant failure rate, and β ≥ 1]
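The way β shapes the failure rate can be seen by evaluating the Weibull hazard formula at a few times. This is a minimal Python sketch (the notes use MATLAB); the function name `weibull_hazard`, the choice α = 1, and the three β values are assumptions made for illustration.

```python
def weibull_hazard(t, alpha, beta):
    """Weibull hazard h(t) = (beta/alpha) * (t/alpha)**(beta - 1), for t > 0."""
    return (beta / alpha) * (t / alpha) ** (beta - 1)

times = [0.5, 1.0, 2.0]
decreasing = [weibull_hazard(t, 1.0, 0.5) for t in times]  # 0 < beta < 1
constant = [weibull_hazard(t, 1.0, 1.0) for t in times]    # beta = 1
increasing = [weibull_hazard(t, 1.0, 3.0) for t in times]  # beta > 1

print("beta=0.5:", [round(h, 4) for h in decreasing])
print("beta=1.0:", [round(h, 4) for h in constant])
print("beta=3.0:", [round(h, 4) for h in increasing])
```

The three lists fall, stay flat, and rise respectively, corresponding to early-life failures, a constant failure rate, and wear-out.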
Properties of the Weibull Distribution
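• Mean of T
$$\mu = E(T) = \alpha\,\Gamma\!\left(\frac{1}{\beta} + 1\right)$$
• Standard Deviation of T
$$\sigma = \alpha\left[\Gamma\!\left(\frac{2}{\beta} + 1\right) - \Gamma^{2}\!\left(\frac{1}{\beta} + 1\right)\right]^{1/2}$$
where $\Gamma^{2}(a) = [\Gamma(a)]^{2}$ and $\Gamma(a + 1) = a\,\Gamma(a)$.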
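Both moments only need the gamma function, so they can be evaluated directly. The sketch below uses Python's `math.gamma` (the notes use MATLAB); the function names and parameter values are assumptions for illustration. The case β = 1 is a convenient check, since the Weibull then reduces to an exponential distribution whose mean and standard deviation both equal α.

```python
import math

def weibull_mean(alpha, beta):
    """Mean of a Weibull(alpha, beta) variable: alpha * Gamma(1/beta + 1)."""
    return alpha * math.gamma(1.0 / beta + 1.0)

def weibull_std(alpha, beta):
    """Standard deviation: alpha * sqrt(Gamma(2/beta + 1) - Gamma(1/beta + 1)**2)."""
    g1 = math.gamma(1.0 / beta + 1.0)
    g2 = math.gamma(2.0 / beta + 1.0)
    return alpha * math.sqrt(g2 - g1 ** 2)

# Exponential special case: mean and standard deviation both equal alpha.
print(weibull_mean(100.0, 1.0), weibull_std(100.0, 1.0))

# A larger shape parameter concentrates the distribution near alpha.
print(f"{weibull_mean(100.0, 20.0):.2f}  {weibull_std(100.0, 20.0):.2f}")
```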
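$$R(t) = \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right] = 1 - F(t)$$
$$h(t) = \frac{f(t)}{R(t)} = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1}$$
$$\text{MTTF} = \alpha\int_0^{\infty} t^{1/\beta} e^{-t}\,dt = \alpha\,\Gamma\!\left(1 + \frac{1}{\beta}\right)$$
• β is the shape parameter and
• α is the characteristic life (scale parameter)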
Weibull Distribution - Example
1. Let T = the ultimate tensile strength (ksi) at −200
degrees F of a type of steel that exhibits ‘cold
brittleness’ at low temperatures. Suppose T has a
Weibull distribution with parameters β = 20
and α = 100. Find:
(a) P(T ≤ 105)
$$P(T \le 105) = 1 - e^{-(105/100)^{20}} = 1 - 0.070 = 0.930$$
(b) P(98 ≤ T ≤ 102)
$$P(98 \le T \le 102) = F(102) - F(98) = e^{-(0.98)^{20}} - e^{-(1.02)^{20}} \approx 0.287$$
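The arithmetic in this example can be reproduced with a few lines of Python (used here for illustration; the notes use MATLAB). The helper name `weibull_cdf` is an assumption, and the second quantity is computed on the reading that the 0.98 and 1.02 terms stand for 98/100 and 102/100, that is, P(98 ≤ T ≤ 102).

```python
import math

def weibull_cdf(t, alpha, beta):
    """Weibull CDF: F(t) = 1 - exp(-(t/alpha)**beta) for t >= 0."""
    return 1.0 - math.exp(-(t / alpha) ** beta)

alpha, beta = 100.0, 20.0  # scale and shape from the example

# (a) probability that the tensile strength is at most 105 ksi
p_a = weibull_cdf(105.0, alpha, beta)

# (b) probability that the tensile strength lies between 98 and 102 ksi
p_b = weibull_cdf(102.0, alpha, beta) - weibull_cdf(98.0, alpha, beta)

print(f"(a) P(T <= 105) = {p_a:.3f}")
print(f"(b) P(98 <= T <= 102) = {p_b:.3f}")
```

Part (a) agrees with the 0.930 worked out above. Part (b) comes to roughly 0.287, which reflects how tightly a shape parameter of 20 concentrates the distribution around α = 100.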
Expert Opinion
• Expert Opinion techniques involve consultation with experts, who use
their experience and understanding of the system to arrive at an estimate
of its cost.
• Only used when more objective techniques are not applicable
• Used to corroborate or adjust objective data
▪ Cross-check historically based estimates
• Use for high level, low fidelity estimating
• Last resort
• Advantages
▪ An expert can factor in differences between past project experiences and
new techniques, architectures or applications involved in the future project
▪ Good cross-check of other estimates from a Subject Matter Expert (SME) point
of view
▪ Adds perspective to an estimate that might be overlooked without an SME
• Disadvantages
▪ Expert judgment is only as good as the estimator, who has their own biases
▪ Completely subjective without use of other techniques
▪ Low-to-nil credibility
Exam 1