
Lecture 4


1

INSE 6320
Risk Analysis for Information and Systems Engineering

• Survival Analysis
• Reliability
• Weibull Analysis
• Expert Opinion

Dr. M. AMAYRI Concordia University


2

Risk Assessment

• Hazard: the property of a substance or situation with the potential for
creating damage.
• A hazard analysis is a process used to assess risk.
• The result of a hazard analysis is the identification of unacceptable
risks and the selection of means of controlling or eliminating them.

• Probability is a way to predict stochastic events

• Risk: the likelihood of a specific effect within a specified period


3
What is Survival Analysis?

• A class of statistical methods for studying the occurrence and timing of events.
(branch of statistics for analyzing the expected duration of time until one event occurs)

• A class of methods for analyzing survival times (i.e., times to events).


• A class of methods for analyzing survival probabilities at different follow-up times.
• Not restricted to data with a certain distribution (non-parametric in nature).
• In many biomedical studies, the primary endpoint is time until an event occurs (e.g.
death, recurrence, new symptoms, etc.)
• Data are typically subject to censoring when a study ends before the event occurs.

Survival typically refers to the probability that an entity, such as a human,
animal, or system, will continue to live or function beyond a specified
period of time.


4
Survival Analysis
• Survival Function - A function describing the proportion of items (or
individuals) surviving to or beyond a given time.
• Notation:
▪ T ≡ survival time of a selected item
▪ t ≡ a specific point in time.
▪ S(t) = P(T > t) ≡ survival function = P(surviving longer than time t)
▪ h(t) ≡ instantaneous failure rate among survivors at time t (hazard function)
Two related probabilities are used to describe survival data: the survival probability and
the hazard probability.

The survival probability, also known as the survivor function S(t) : the probability that
an individual survives from the time origin (e.g. diagnosis of cancer) to a specified
future time t.

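The hazard, denoted by h(t), is the instantaneous rate at which an individual who is
still under observation at time t has the event at that time. In a discrete-time life
table it is the conditional probability of the event at t, given survival up to t.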
5

• Failure distribution function (or unreliability): the probability that the
product fails at or before time t.

$F(t) = P(T \le t)$

• Failure density function: the derivative of F(t). For a small interval dt,
f(t) dt is approximately the probability that the product fails between t and t + dt.

$f(t) = \frac{dF(t)}{dt}$

• Survival function: the probability that the item has not failed by time t.

$S(t) = P(T > t) = 1 - F(t)$
6
Kaplan-Meier Estimate of S(t)
Estimate the survival probability from observed survival times.
It is used to analyze time-to-event data, such as the time until
a specific event occurs.
• Rank the failure times as t(1) ≤ t(2) ≤ … ≤ t(n).
• The number of items at risk just before t(i) is $n_i$.
• The number of items that fail at time t(i) is $d_i$.
• Estimated hazard function at t(i): $\hat{h}_i = \frac{d_i}{n_i}$
• Estimate of the survival function: $\hat{S}(t) = \prod_{t_{(i)} \le t} \frac{n_i - d_i}{n_i}$
• In medical research, the K-M estimate is often used to measure the fraction of patients living for a certain amount of
time after treatment.
• In economics, it can be used to measure the length of time people remain unemployed after a job loss.
• In engineering, it can be used to measure the time until failure of machine parts.
The Kaplan–Meier estimator is the nonparametric maximum likelihood estimate of S(t).
7
Kaplan-Meier Estimate of S(t)
Failure Rate: $\hat{h}_i = \frac{d_i}{n_i}$
The hazard rate at each period is the number of failures in the given period divided by
the number of surviving individuals at the beginning of the period (number at risk).

Survival Probability: $\hat{S}(t) = \prod_{t_{(i)} \le t} \frac{n_i - d_i}{n_i}$
For each period, the survival probability is the product of the complements of the hazard
rates up to and including that period. The initial survival probability at the beginning
of the first time period is 1. If the hazard rate for each period is h(ti), then the
survivor probability is as shown. H(t) denotes the cumulative hazard rate.

Failure Time (t)   Hazard Rate h(t)   Cumulative Hazard Rate H(t)   Survival Probability S(t)
0                  0                  0                             1
t1                 d1/n1              d1/n1                         1*(1 – h(t1))
t2                 d2/n2              H(t1) + d2/n2                 S(t1)*(1 – h(t2))
...                ...                ...                           ...
tk                 dk/nk              H(tk–1) + dk/nk               S(tk–1)*(1 – h(tk))
8
A Data Example

• The number at risk is the total number of survivors at the beginning of each period. The
number at risk at the beginning of the first period is all individuals in the lifetime study. At
the beginning of each remaining period, the number at risk is reduced by the number of
failures plus individuals censored at the end of the previous period.

• This life table shows fictitious survival data. At the beginning of the first failure time, there
are seven items at risk. At time 4, three fail. So at the beginning of time 7, there are four
items at risk. Only one fails at time 7, so the number at risk at the beginning of time 11 is
three. Two fail at time 11, so at the beginning of time 12, the number at risk is one. The
remaining item fails at time 12.

Failure Time t(i)   Number Failed di   Number at Risk ni
4                   3                  7
7                   1                  4
11                  2                  3
12                  1                  1
9
A Data Example
How do we compute the hazard rate, survival probability, and cumulative distribution
function for the following data?

t(i)   Number Failed di   Number at Risk ni   Hazard Rate h(t) = di/ni   Survival Probability S(t)   F(t) = 1 – S(t)
4      3                  7
7      1                  4
11     2                  3
12     1                  1
11
A Data Example
We can compute the hazard rate, survival probability, and cumulative distribution
function for the data as follows:

t(i)   Number Failed di   Number at Risk ni   Hazard Rate h(t) = di/ni   Survival Probability S(t)        F(t) = 1 – S(t)
4      3                  7                   3/7                        1 – 3/7 = 4/7 = 0.5714           0.4286
7      1                  4                   1/4                        4/7*(1 – 1/4) = 3/7 = 0.4286     0.5714
11     2                  3                   2/3                        3/7*(1 – 2/3) = 1/7 = 0.1429     0.8571
12     1                  1                   1/1                        1/7*(1 – 1) = 0                  1
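
The completed table can be checked with a few lines of Python. This is a minimal sketch rather than part of the original lecture: the function name life_table and its argument names are illustrative assumptions, and exact fractions are used so that the values match the hand calculation.

```python
from fractions import Fraction

def life_table(failed, at_risk):
    """Return (hazard, survival, cdf) lists, one entry per failure time.

    failed[i]  : number of failures d_i at the i-th failure time
    at_risk[i] : number at risk n_i just before that time
    """
    hazard, survival, cdf = [], [], []
    s = Fraction(1)                 # S(0) = 1
    for d, n in zip(failed, at_risk):
        h = Fraction(d, n)          # h(t_i) = d_i / n_i
        s *= (1 - h)                # S(t_i) = S(t_{i-1}) * (1 - h(t_i))
        hazard.append(h)
        survival.append(s)
        cdf.append(1 - s)           # F(t_i) = 1 - S(t_i)
    return hazard, survival, cdf

times = [4, 7, 11, 12]
hazard, survival, cdf = life_table(failed=[3, 1, 2, 1], at_risk=[7, 4, 3, 1])
for t, h, s, f in zip(times, hazard, survival, cdf):
    print(f"t={t:>2}  h={h}  S={float(s):.4f}  F={float(f):.4f}")
```

The printed survival values are 0.5714, 0.4286, 0.1429, and 0, in agreement with the table.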
12
A Censored Data Example
When you have censored data, the life table might look like the following:
Time t(i)   Number Failed di   Censoring   Number at Risk ni   Hazard Rate   Survival Probability   Cumulative Distribution Function
4           ?                  1           7
7                              0           4
11                             1           3
12                             0           1

• At any given time, the censored items are also counted in the number at risk, and
the hazard rate is the number failed divided by the total number at risk.
• When updating the number at risk at the beginning of each period, the total number
failed and censored in the previous period is subtracted from the number at risk at
the beginning of that previous period.
• Notation: 1 for censored data, and 0 for exact failure time.
14
A Censored Data Example
When you have censored data, the life table might look like the following:
Time t(i)   Number Failed di   Censoring   Number at Risk ni   Hazard Rate   Survival Probability            Cumulative Distribution Function
4           2                  1           7                   2/7           1 – 2/7 = 0.7143                0.2857
7           1                  0           4                   1/4           0.7143*(1 – 1/4) = 0.5357       0.4643
11          1                  1           3                   1/3           0.5357*(1 – 1/3) = 0.3571       0.6429
12          1                  0           1                   1/1           0.3571*(1 – 1) = 0              1.0000
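
The censored life table can also be rebuilt from raw observations. The sketch below is an illustration under stated assumptions, not the lecture's own code: the function name kaplan_meier and the row layout are hypothetical, and the inputs y and cens are the seven observations used in this lecture's MATLAB example (1 = censored, 0 = exact failure time). An item censored at a given time is still counted as at risk at that time.

```python
from collections import Counter

def kaplan_meier(times, censored):
    """Kaplan-Meier life table from raw observations.

    times[i]    : observed time of item i (failure or censoring)
    censored[i] : 1 if item i was censored at that time, 0 if it failed
    Returns one row (t, d, c, n, h, S, F) per distinct observed time.
    """
    failures = Counter(t for t, c in zip(times, censored) if c == 0)
    losses = Counter(t for t, c in zip(times, censored) if c == 1)
    n = len(times)              # every item is at risk at the start
    s = 1.0
    rows = []
    for t in sorted(set(times)):
        d, c = failures[t], losses[t]
        h = d / n               # censored items still count in n at this time
        s *= 1.0 - h
        rows.append((t, d, c, n, h, s, 1.0 - s))
        n -= d + c              # failed and censored items leave the risk set
    return rows

y = [4, 4, 4, 7, 11, 11, 12]
cens = [1, 0, 0, 0, 1, 0, 0]
for t, d, c, n, h, s, f in kaplan_meier(y, cens):
    print(f"t={t:>2}  d={d}  c={c}  n={n}  h={h:.4f}  S={s:.4f}  F={f:.4f}")
```

The number at risk comes out as 7, 4, 3, 1 and the survival probability as 0.7143, 0.5357, 0.3571, 0, matching the table.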

16
A Censored Data Example: MATLAB
In MATLAB, we can enter the data and calculate these measures using ecdf. Suppose the
failure times are stored in an array y.

• When using ecdf, you must also enter the censoring information using an array of
binary variables. Enter 1 for censored data, and enter 0 for exact failure time.
>> y = [4 4 4 7 11 11 12];              % observed times (failure or censoring)
>> cens = [1 0 0 0 1 0 0];              % 1 = censored, 0 = exact failure time
>> [f,x] = ecdf(y,'censoring',cens)     % f: estimated CDF values at the points x

• ecdf, by default, produces the cumulative distribution function values. You have to
specify the survivor function or the hazard function using optional name-value pair
arguments. You can also plot the results as follows.
>> ecdf(y,'censoring',cens,'function','survivor');   % no outputs requested, so the survivor function is plotted
17

Reliability Definition
18

Reliability

• Reliability: The probability that an item will perform its intended function without
failure under stated conditions for a specified period of time.

• Failure: The termination of the ability of the product to perform its intended function.

• Reliability provides a quantitative statement of the chance that an item will
operate without failure for a given period of time in the environment for which
it was designed.

• In its simplest and most general form, reliability is the probability of success.

• To perform reliability calculations, reliability must first be defined explicitly. It
is not enough to say that reliability is a probability. A probability of what?

Reliability is performance over time: the probability that something will work
when you want it to.
19

Reliability Theory
Let T be a random variable representing the failure time or lifetime of a
physical system. For this system, the probability that it will fail by time t is:

$F(t) = P[T \le t] = \int_0^t f(u)\,du$

• The probability of the system surviving until time t is:

$R(t) = P[T > t] = 1 - F(t) = \int_t^{\infty} f(u)\,du$

• Failure rate: the probability per unit time that a failure occurs in the interval
[t1, t2], given that no failure has occurred before time t1. This is written as:

$\frac{P[t_1 \le T \le t_2 \mid T > t_1]}{t_2 - t_1} = \frac{P[t_1 \le T \le t_2]}{(t_2 - t_1)\,P[T > t_1]} = \frac{F(t_2) - F(t_1)}{(t_2 - t_1)\,R(t_1)}$
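
Shrinking the interval links this interval failure rate to the hazard function h(t). The short derivation below is a sketch that assumes F is differentiable with density f, and writes t1 = t and t2 = t + Δt:

```latex
h(t) = \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t \, R(t)}
     = \frac{1}{R(t)} \, \frac{dF(t)}{dt}
     = \frac{f(t)}{R(t)}
```

This is the same ratio f(t)/R(t) that appears in the Weibull hazard function later in the lecture.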
20

Reliability Terms

• Mean Time To Failure (MTTF) for non-repairable systems
• Mean Time Between Failures (MTBF) for repairable systems
• Reliability Probability (survival) R(t)
• Failure Probability (cumulative distribution function) F(t) = 1 − R(t)
• Failure Probability Density f(t)
• Failure Rate function (hazard rate) h(t)
21

MTTF and MTBF


• One of the measures of a system's reliability is the mean time to failure
(MTTF). It should not be confused with the mean time between failures (MTBF).
• When the system is non-repairable, the expected operating time until it fails is
the MTTF. When the system is repairable, the expected time between two successive
failures is the MTBF.
• For a repairable item, MTBF is the ratio of the cumulative operating time to the
number of failures for that item.

• Example (repairable system): A motor is repaired and returned to service
six times during its life and provides 45,000 hours of service. Calculate the MTBF.

$\text{MTBF} = \frac{\text{Total operating time}}{\text{Number of failures}} = \frac{45000}{6} = 7500 \text{ hours}$
22

Reliability Block Diagram (RBD) Technique


• The first step in evaluating a system's reliability is to construct a reliability block
diagram which is a graphical representation of the components of the system and how
they are connected.
• The purpose of RBD technique is to represent failure and success criteria pictorially
and to use the resulting diagram to evaluate System Reliability.

Benefits:
• The pictorial representation means that models are easily understood and therefore
readily checked.
• Block diagrams are used to identify the relationship between elements in the system.
The overall system reliability can then be calculated from the reliabilities of the blocks
using the laws of probability.
• Block diagrams can be used for the evaluation of system availability provided that both
the repair of blocks and failures are independent events, i.e. provided the time taken to
repair a block is dependent only on the block concerned and is independent of repair
to any other block
23

System Configuration Models


24

Typical RBD configurations and related formulae

• Series System
The reliability of the system is given by

$R(t) = R_A(t)\,R_B(t)\,R_C(t) \cdots R_Z(t)$

[Figure: series block diagram from Input to Output.]

The interpretation can be stated as ‘any unit failing causes the system as a whole to fail’.

• Parallel System
The reliability of the system is given by:

$R(t) = 1 - (1 - R_X(t))(1 - R_Y(t))$

[Figure: parallel block diagram from Input to Output.]

Units X and Y operate in such a way that the system survives as long as at
least one of the units survives. Both formulas assume that the blocks fail independently.
25

Typical RBD configurations and related formulae

• Series/Parallel System
When blocks such as X and Y themselves comprise sub-blocks in series, the block
diagram has two series branches connected in parallel.

[Figure: series/parallel block diagram from Input to Output.]

$R_X(t) = R_{A1}(t)\,R_{B1}(t)\,R_{C1}(t) \cdots R_{Z1}(t)$

$R_Y(t) = R_{A2}(t)\,R_{B2}(t)\,R_{C2}(t) \cdots R_{Z2}(t)$

Thus, the reliability of the system is given by

$R(t) = 1 - (1 - R_X(t))(1 - R_Y(t))$
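
The series and parallel formulas compose directly, which a short Python sketch can illustrate. The function names and the block reliabilities below are hypothetical values chosen for illustration only, and the blocks are assumed to fail independently.

```python
from math import prod

def series(reliabilities):
    """Series system: every block must survive, so reliabilities multiply."""
    return prod(reliabilities)

def parallel(reliabilities):
    """Parallel system: it fails only if every block fails."""
    return 1.0 - prod(1.0 - r for r in reliabilities)

# Hypothetical block reliabilities at one fixed time t (illustrative values).
branch_x = series([0.95, 0.90, 0.99])    # R_X(t): sub-blocks A1, B1, C1 in series
branch_y = series([0.90, 0.85, 0.98])    # R_Y(t): sub-blocks A2, B2, C2 in series
system = parallel([branch_x, branch_y])  # R(t) = 1 - (1 - R_X)(1 - R_Y)
print(f"R_X={branch_x:.4f}  R_Y={branch_y:.4f}  R_system={system:.4f}")
```

A series branch is never more reliable than its weakest block, while the parallel combination is at least as reliable as its best branch.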
26

Comparison and Use in Risk Assessment (Reliability, survival function)

Commonality:

Both reliability and survival functions essentially measure the same concept: the probability of no event
(failure or death) occurring by a certain time.
Both are crucial for assessing risk over time and making informed decisions about system maintenance,
safety, and performance.

Differences:

Context: Reliability is more frequently used in engineering and industrial contexts, while survival is more
common in medical and biological contexts.

Practical Example in Risk Assessment

Reliability:
An automotive manufacturer may use reliability analysis to determine the probability that a car’s
transmission will last at least 100,000 miles without failure. This involves collecting data on transmission
failures and using reliability functions to predict performance.
Survival:
In a medical study, researchers might use survival analysis to determine the probability that patients with a
certain disease will survive for five years after treatment. This involves analyzing patient data and using
survival functions to estimate survival probabilities.
27
Weibull Probability Distribution
A random variable T ~ Weibull(α, β) is said to have the Weibull Probability
Distribution with parameters β and α, where β > 0 and α > 0, if the probability
density function is

$f(t) = \begin{cases} \dfrac{\beta}{\alpha^{\beta}}\, t^{\beta-1}\, e^{-(t/\alpha)^{\beta}}, & t \ge 0 \\ 0, & \text{elsewhere} \end{cases}$

where β is the Shape Parameter, α is the Scale Parameter, t is the mission length
(time, cycles, etc.). The scale parameter (also called the characteristic life) is the
time at which 63.2% of the product will have failed .

• The scale parameter influences both the mean and the spread of the Weibull distribution.
• If β = 1, the Weibull reduces to the Exponential Distribution.
• Weibull distribution is frequently used to model fatigue failure, ball bearing failure etc.
28
Weibull Probability Distribution
[Figure: Weibull probability density function f(t) versus t for β = 0.5, 1.0, 2.5,
3.44, and 5.0; t is in multiples of α.]

For β = 1, the Weibull distribution becomes the exponential distribution.


29
Weibull Probability Distribution

$F(t) = P(T \le t) = 1 - e^{-(t/\alpha)^{\beta}}$ for t ≥ 0

[Figure: F(t) versus t (0 to 200) for β = 0.5, 1, 3, and 5, with α = 100.]
• The reliability of a product is the probability that it does not fail before time t.
It is therefore the complement of the CDF:

R (t ) = 1 − F (t )
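
The two functions can be evaluated with the Python standard library alone. The sketch below uses illustrative function names (weibull_cdf and weibull_reliability are assumptions, not names from the lecture) and the same α = 100 as the figure.

```python
import math

def weibull_cdf(t, alpha, beta):
    """F(t) = 1 - exp(-(t/alpha)^beta) for t >= 0, and 0 for t < 0."""
    if t < 0:
        return 0.0
    return 1.0 - math.exp(-((t / alpha) ** beta))

def weibull_reliability(t, alpha, beta):
    """R(t) = 1 - F(t)."""
    return 1.0 - weibull_cdf(t, alpha, beta)

alpha = 100.0
for beta in (0.5, 1.0, 3.0, 5.0):
    values = [weibull_cdf(t, alpha, beta) for t in (50, 100, 150)]
    print(f"beta={beta}: F(50), F(100), F(150) =", [round(v, 3) for v in values])
```

For every β, F(α) = 1 − e^(−1) ≈ 0.632, which is why α is the time by which 63.2% of the product has failed.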
30
Weibull Probability Plot in MATLAB
>> data = [1.03; 2.20; 1.55; 0.24; 1.83; 0.40; 0.87; 0.03; 2.24; 1.05; 2.05; 0.14; 3.68; 0.48; 0.41];
>> wblplot(data);
31
Effects of β on Weibull Reliability Function
• 0 < β < 1, R(t) decreases sharply and monotonically, and is convex.

• β = 1, R(t) decreases monotonically but less sharply than for 0 < β < 1, and is convex.

• β > 1, R(t) decreases as t increases. As wear-out sets in, the curve goes through an
inflection point and decreases sharply.

➢ α is called the characteristic life.
➢ α has the same units as T, such as hours, miles, cycles, actuations, etc.

$R(t) = \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right]$
32
Effects of β on Weibull Hazard Function
β < 1: Decreasing failure rate (DFR)
β = 1: Constant failure rate (CFR) (Exponential distribution)
β > 1: Increasing failure rate (IFR)

$h(t) = \frac{f(t)}{R(t)} = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1}$, t ≥ 0, α > 0, β > 0

Example

$h(t) = c$, where c is a constant
$h(t) = at$, increasing for a > 0
$h(t) = \frac{1}{t+1}$, decreasing for t > 0
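
The three failure-rate regimes can be seen numerically. In this sketch the function name weibull_hazard, the value α = 100, and the sample times are illustrative assumptions; the function is restricted to t > 0 because the hazard is unbounded at t = 0 when β < 1.

```python
def weibull_hazard(t, alpha, beta):
    """h(t) = (beta/alpha) * (t/alpha)^(beta - 1), evaluated here for t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    return (beta / alpha) * (t / alpha) ** (beta - 1)

alpha = 100.0
ts = [10, 50, 100, 200]
for beta, label in [(0.5, "DFR"), (1.0, "CFR"), (3.0, "IFR")]:
    rates = [weibull_hazard(t, alpha, beta) for t in ts]
    print(f"beta={beta} ({label}):", [f"{r:.5f}" for r in rates])
```

With β = 1 the rate stays at 1/α = 0.01 for every t, while it falls with t for β = 0.5 and rises for β = 3.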
33

Versatility of Weibull Model


Failure Rate: $h(t) = f(t)/R(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1}$

[Figure: failure rate versus time t, showing the Early Life Region (0 < β < 1), the
Constant Failure Rate Region (β = 1), and the Wear-Out Region (β > 1).]
34
Properties of the Weibull Distribution

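• Mean or Expected Value

$\mu = E(T) = \alpha\,\Gamma\left(\frac{1}{\beta} + 1\right)$

• Standard Deviation of T

$\sigma = \alpha\left[\Gamma\left(\frac{2}{\beta} + 1\right) - \Gamma^{2}\left(\frac{1}{\beta} + 1\right)\right]^{1/2}$

where $\Gamma^{2}(a) = [\Gamma(a)]^{2}$ and $\Gamma(a + 1) = a\,\Gamma(a)$.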
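
Both moments can be evaluated with the gamma function in Python's standard library. The function names weibull_mean and weibull_std and the parameter values below are illustrative assumptions.

```python
import math

def weibull_mean(alpha, beta):
    """mu = alpha * Gamma(1/beta + 1)."""
    return alpha * math.gamma(1.0 / beta + 1.0)

def weibull_std(alpha, beta):
    """sigma = alpha * sqrt(Gamma(2/beta + 1) - Gamma(1/beta + 1)^2)."""
    g1 = math.gamma(1.0 / beta + 1.0)
    g2 = math.gamma(2.0 / beta + 1.0)
    return alpha * math.sqrt(g2 - g1 ** 2)

for beta in (0.5, 1.0, 2.0):
    mu = weibull_mean(1000.0, beta)
    sigma = weibull_std(1000.0, beta)
    print(f"beta={beta}: mean={mu:.1f}, std={sigma:.1f}")
```

For β = 1 the mean and the standard deviation both equal α, as expected for the exponential distribution. For β = ½ the mean is α Γ(3) = 2α.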
35

Properties of Weibull Model


$f(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1} \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right]$, β > 0, α > 0, t ≥ 0

$R(t) = \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right] = 1 - F(t)$

$h(t) = f(t)/R(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1}$

$\text{MTTF} = \alpha \int_0^{\infty} t^{1/\beta} e^{-t}\,dt = \alpha\,\Gamma\left(1 + \frac{1}{\beta}\right)$

• β is the Shape Parameter and
• α is the Characteristic Lifetime
36
Weibull Distribution - Example
1- Let T = the ultimate tensile strength (ksi) at -200 degrees F of a type of steel
that exhibits ‘cold brittleness’ at low temperatures. Suppose T has a Weibull
distribution with parameters β = 20 and α = 100. Find:

(a) P(T ≤ 105)

(b) P(98 ≤ T ≤ 102)

2- The random variable T can be modeled by a Weibull distribution with β = ½
and α = 1000. The specification time limit is set at t = 4000.
What is the proportion of items not meeting specification?
37
Weibull Distribution - Example Solution

$F(t) = P(T \le t) = 1 - e^{-(t/\alpha)^{\beta}}$

(a) P(T ≤ 105) = F(105; 100, 20)

$= 1 - e^{-(105/100)^{20}} = 1 - 0.070 = 0.930$

(b) P(98 ≤ T ≤ 102) = F(102; 100, 20) − F(98; 100, 20)

$= e^{-(0.98)^{20}} - e^{-(1.02)^{20}} = 0.513 - 0.226 = 0.287$


38
Weibull Distribution - Example
The random variable T can be modeled by a Weibull distribution with β = ½
and α = 1000. The specification time limit is set at t = 4000.
What is the proportion of items not meeting specification? This is P(T > 4000).

$P(T > 4000) = 1 - P(T \le 4000) = 1 - F(4000) = e^{-(4000/1000)^{1/2}} = e^{-2} = 0.1353$

That is, about 13.53% of the items will not meet spec.
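
The three worked answers can be checked numerically. This is a minimal Python sketch using only the CDF formula from the example solution; the function and variable names are illustrative assumptions.

```python
import math

def weibull_cdf(t, alpha, beta):
    """F(t; alpha, beta) = 1 - exp(-(t/alpha)^beta) for t >= 0."""
    return 1.0 - math.exp(-((t / alpha) ** beta))

# Problem 1: tensile strength with beta = 20, alpha = 100
p_a = weibull_cdf(105, 100, 20)                            # P(T <= 105)
p_b = weibull_cdf(102, 100, 20) - weibull_cdf(98, 100, 20) # P(98 <= T <= 102)

# Problem 2: beta = 1/2, alpha = 1000, limit t = 4000
p_2 = 1.0 - weibull_cdf(4000, 1000, 0.5)                   # P(T > 4000)

print(f"(a) {p_a:.3f}   (b) {p_b:.3f}   problem 2: {p_2:.4f}")
```

The results round to 0.930, 0.287, and 0.1353, in agreement with the hand calculations.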
39

Expert Opinion
• Expert Opinion techniques involve consultation with experts, who use
their experience and understanding of the system to arrive at an estimate
of its cost.
• Only used when more objective techniques are not applicable
• Used to corroborate or adjust objective data
▪ Cross check historical based estimate
• Use for high level, low fidelity estimating
• Last resort

Tip: Expert opinion is the least regarded and most dangerous method, but it is
seductively easy. Most lexicons do not even admit it as a technique, but it is
included here for completeness.
40

Expert Opinion – Advantages/Disadvantages

• Advantages
▪ An expert can factor in differences between past project experiences and
new techniques, architectures or applications involved in the future project
▪ Good cross check of other estimate from Subject Matter Expert (SME) point
of view
▪ Allows perspective to an estimate that may be overlooked without SME

• Disadvantages
▪ Expert judgment is only as good as the estimator, who has his own biases
▪ Completely subjective without use of other techniques
▪ Low-to-nil credibility
41

What makes a good expert?


• Credibility!

• Someone who has the ear of the program manager.
▪ You should use the same person that the program manager relies upon for
the most critical information.

• Technical specialist or engineer who is knowledgeable about the program
under question.
42

Exam 1

▪ Exam 1 Coverage: Lectures 1 to 4


29/05/2024 in the class at 11:45
