
TPK 4120 / TPK 4161 / TPK 5115

Introduction to probability
Jørn Vatn/August-2020
Event

In order to define probability, we need to work with events. Let as an example


A be the event that there is an operator error in a control room. This is written:

A = {operator error}

An event may occur, or not. We do not know the outcome in advance prior to
the experiment or a situation in “real life”.

1
Probability

When events are defined, the probabilities that the events occur are of
interest. Probability is denoted by Pr(·), i.e.

Pr(A) = Probability that A occurs

The numeric value of Pr(A) may be found by:

• Studying the sample space (all possible outcomes)
• Analysing collected data
• Looking up values in data handbooks
• “Expert judgement”

2
Set theory

To work with probability, we need some set theory:


• Union
• Intersection
• Disjoint sets

3
Union

We write A ∪ B to denote the union of A and B, i.e. the occurrence of A or B or both. Let A be the event that tossing a die results in a “six”, and B be the
event that we get an odd number of eyes. We then have A ∪ B = {1, 3, 5, 6}.

4
Intersection

We write A ∩ B to denote the intersection of A and B, i.e. the occurrence of


both A and B. As an example, let A be the event that a project is not
completed in due time, and let B be the event that the budget limits are
exceeded. A ∩ B then represents the situation that the project is not
completed in due time and the budget limits are exceeded.

5
Disjoint events/sets

A and B are said to be disjoint if they cannot occur simultaneously, i.e. A ∩ B =
Ø = the empty set. Let A be the event that tossing a die results in a “six”, and B
be the event that we get an odd number of eyes. A and B are disjoint since
they cannot occur simultaneously, and we have A ∩ B = Ø.

6
Complementary event

The complement of an event A is all events in the sample space S except for A.
The complement of an event is denoted by A^C. Let A be the event that tossing
a die results in an odd number of eyes. A^C is then the event that we get an
even number of eyes.

7
Mapping of events on the interval [0,1]

8
Conditional probabilities

Pr(A|B) denotes the conditional probability that A will occur given that B has
occurred

9
Independent events

A and B are said to be (stochastic) independent if information about whether B


has occurred does not influence the probability that A will occur, i.e.
Pr(A|B) = Pr(A)

10
Some rules for probability calculus

Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B)

Pr(A ∩ B) = Pr(A) × Pr(B) if A and B are independent

Pr(A^C) = Pr(A does not occur) = 1 − Pr(A)

Pr(A|B) = Pr(A ∩ B) / Pr(B)
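These rules can be sanity-checked by exhaustive enumeration; a minimal sketch for the die events used earlier (A = “six”, B = odd number of eyes):

```python
from fractions import Fraction

# Sample space of a fair die; each outcome has probability 1/6
S = {1, 2, 3, 4, 5, 6}
A = {6}            # the event "six"
B = {1, 3, 5}      # the event "odd number of eyes"

def pr(event):
    return Fraction(len(event), len(S))

# Addition rule: Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B)
assert pr(A | B) == pr(A) + pr(B) - pr(A & B)

# Complement rule: Pr(A^C) = 1 − Pr(A)
assert pr(S - A) == 1 - pr(A)

# Conditional probability: Pr(A|B) = Pr(A ∩ B) / Pr(B)
print(pr(A & B) / pr(B))  # A and B are disjoint, so this is 0
```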

11
The law of total probability

• In many situations it is easier to assess the probability of an event B
conditionally on some other events, say A1, A2, . . ., Ar, than unconditionally
• Let A1, A2, . . ., Ar represent a division of the sample space S, i.e. A1 ∪ A2 ∪
. . . ∪ Ar = S and the Ai’s are pairwise disjoint, i.e., Ai ∩ Aj = Ø for i ≠ j
• Further, let B be an arbitrary event in S
• The law of total probability now states:

Pr(B) = Σ_{i=1}^{r} Pr(Ai) × Pr(B|Ai)

12
Example
Let
• D denote the event that a project is delayed
• WN denote that there is no work conflict
• WM denote that there is a minor work conflict
• WS denote that there is a severe work conflict
Further, assume
• Pr(D|WN) = 0.1, Pr(WN) = 0.8
• Pr(D|WM) = 0.5, Pr(WM) = 0.15
• Pr(D|WS) = 0.9, Pr(WS) = 0.05

Pr(D) = Pr(WN ) Pr(D|WN ) + Pr(WM ) Pr(D|WM ) + Pr(WS ) Pr(D|WS ) = 0.2
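The calculation can be reproduced in a few lines; a sketch with the slide’s numbers (the dictionary labels are my own):

```python
# Law of total probability for the project-delay example
p_conflict = {"none": 0.8, "minor": 0.15, "severe": 0.05}   # Pr(WN), Pr(WM), Pr(WS)
p_delay_given = {"none": 0.1, "minor": 0.5, "severe": 0.9}  # Pr(D|WN), Pr(D|WM), Pr(D|WS)

# Pr(D) = sum over the division of Pr(Wi) * Pr(D|Wi)
p_delay = sum(p_conflict[w] * p_delay_given[w] for w in p_conflict)
print(p_delay)  # ≈ 0.2
```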

13
Stochastic variables and their properties
Stochastic variables (= random quantities) are used to describe quantities
which cannot be predicted exactly:

X is stochastic ⇔ it is impossible to state exactly the value of X

Examples of stochastic variables are:

• X = Lifetime of a component (continuous)
• R = Repair time after a failure (continuous)
• T = Duration of a construction project (continuous)
• C = Total cost of a renewal project (continuous)
• N = Number of delayed trains next month (discrete)
• W = Maintenance and operational cost next year (continuous)

14
How to represent stochastic variables?

• Cumulative distribution function (CDF)
• Probability density function (PDF)
• Expectation
• Variance
• Mode
• Triple estimate

15
Cumulative distribution function
A stochastic variable X is characterized by its cumulative distribution function
(CDF)
FX (x) = Pr(X ≤ x)

16
Probability density function
For a continuous stochastic variable, the probability density function (PDF) is
given by
d
fX (x) = FX (x)
dx

17
Expectation

The expectation (mean) of X is given by


E(X) = ∫_{−∞}^{∞} x · fX(x) dx   if X is continuous
E(X) = Σ_j xj · p(xj)            if X is discrete

The expectation can be interpreted as the long-run average of X, if an infinite
number of observations were available.

18
Median and mode

• The median of a distribution is the value m0 of the stochastic variable X
such that Pr(X ≤ m0) ≥ 1/2 and Pr(X ≥ m0) ≥ 1/2. In other words, the
probability at or below m0 is at least 1/2, and the probability at or above
m0 is at least 1/2.
• The mode of a distribution is the value M of the stochastic variable X such
that the probability density function, or point probability, at M is at least as
high as for any other value of the stochastic variable. We sometimes use
the term “most likely value” rather than mode.

19
Variance and standard deviation

The variance of a random quantity expresses the variation in the values X will
take in the long run:

Var(X) = ∫_{−∞}^{∞} [x − E(X)]² · fX(x) dx   if X is continuous
Var(X) = Σ_j [xj − E(X)]² · p(xj)            if X is discrete

The standard deviation of X is given by:

SD(X) = +√Var(X)
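A small sketch applying these definitions to a fair die (my own example, not from the slides):

```python
import math

# Fair die: outcomes 1..6, each with point probability 1/6
outcomes = range(1, 7)
p = 1 / 6

mean = sum(x * p for x in outcomes)               # E(X) = 3.5
var = sum((x - mean) ** 2 * p for x in outcomes)  # Var(X) = 35/12 ≈ 2.92
sd = math.sqrt(var)                               # SD(X) ≈ 1.71
```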

20
Double expectation
If X and Y are stochastic variables then:

E(X) = E(E(X|Y))   (= E_Y(E_X(X|Y)))
Var(X) = E(Var(X|Y)) + Var(E(X|Y))

It follows easily (B and B^C represent the Y variable):

E(X) = E(X|B) Pr(B) + E(X|B^C) Pr(B^C)

Var(X) = Var(X|B) Pr(B) + Var(X|B^C) Pr(B^C)
       + [E(X|B) − E(X)]² Pr(B) + [E(X|B^C) − E(X)]² Pr(B^C)

21
Example
Let
• X denote the duration of a project
• WN denote that there is no work conflict
• WM denote that there is a minor work conflict
• WS denote that there is a severe work conflict
where WN, WM and WS represent the conditioning variable Y. Further assume:
• E(X|WN) = 10, Pr(WN) = 0.8
• E(X|WM) = 12, Pr(WM) = 0.15
• E(X|WS) = 20, Pr(WS) = 0.05

E(X) = E(X|WN) Pr(WN) + E(X|WM) Pr(WM) + E(X|WS) Pr(WS) = 10.8
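A sketch reproducing this weighted-sum calculation (the dictionary labels are my own):

```python
# Double expectation over the work-conflict states
pr_w = {"WN": 0.8, "WM": 0.15, "WS": 0.05}   # Pr(Wi)
e_given_w = {"WN": 10, "WM": 12, "WS": 20}   # E(X|Wi)

# E(X) = sum over states of E(X|Wi) * Pr(Wi)
e_x = sum(e_given_w[w] * pr_w[w] for w in pr_w)
print(e_x)  # ≈ 10.8
```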

22
Common probability distributions
Different distributions have different properties and usage. We review some
of them:
• The normal distribution is often used for aggregated situations, for
example if the total cost is the sum of various item costs
• The PERT and triangular distributions are often used when we shall assess
the cost, duration etc. of single elements, where we ask an expert for a
low, high and most likely value (triple estimate)
• The exponential distribution is often used for simplicity, but it might also
be realistic in situations where, for example, the time to the next failure
does not depend on the past
• The Weibull distribution is often used to model lifetimes where the
remaining time to failure tends to decrease with increasing age
In the textbook (TPK4120) and in the course compendia, more comprehensive
presentations are given
23
The normal distribution

X is said to be normally distributed if the probability density function of X is
given by:

fX(x) = 1/(√(2π) σ) · e^(−(x−µ)²/(2σ²))

where µ and σ are parameters that characterise the distribution. The mean
and variance are given by:

E(X) = µ
Var(X) = σ²

24
Normal distribution, cont

The CDF cannot be found in closed form for the normal distribution. In Excel
we may use the Normdist() function, and in pRisk.xlsm we may use the
CDFNormal() function.

For hand calculation it is convenient to introduce a standardised normal
distribution for this purpose. We say that U is standard normally distributed if
its probability density function is given by:

fU(u) = φ(u) = (1/√(2π)) · e^(−u²/2)

25
Standardized normal distribution
We then have

FU(u) = Φ(u) = ∫_{−∞}^{u} φ(t) dt = ∫_{−∞}^{u} (1/√(2π)) · e^(−t²/2) dt

Φ(u) is tabulated, enabling us to find the CDF for a general normal
distribution. We have that if X is normally distributed with parameters µ and
σ, then U = (X − µ)/σ is standard normally distributed, hence

FX(x) = Pr(X ≤ x) = Pr((X − µ)/σ ≤ (x − µ)/σ) = Pr(U ≤ (x − µ)/σ) = Φ((x − µ)/σ)

26
Example
Let X be normally distributed with parameters µ = 5 and σ = 3. We will find
Pr(X ≤ 6):

Pr(X ≤ 6) = Pr((X − µ)/σ ≤ (6 − 5)/3) = Φ(1/3) ≈ Φ(0.33) ≈ 0.629
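Φ has no closed form, but standard libraries expose the error function; a sketch of the same calculation assuming only Python’s standard math module (the Excel/pRisk functions mentioned above compute the same quantity):

```python
import math

def phi_cdf(u):
    # Standard normal CDF: Φ(u) = 0.5 * (1 + erf(u / sqrt(2)))
    return 0.5 * (1 + math.erf(u / math.sqrt(2)))

mu, sigma = 5, 3
p = phi_cdf((6 - mu) / sigma)  # Pr(X <= 6) = Φ(1/3) ≈ 0.63
```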

27
The triangular distribution
The triangular distribution has a probability density function that forms a
triangle:

fX(x) = 2(x − L) / [(M − L)(H − L)]   if L ≤ x ≤ M
fX(x) = 2(H − x) / [(H − M)(H − L)]   if M ≤ x ≤ H

The cumulative distribution function is given by:

FX(x) = (x − L)² / [(M − L)(H − L)]       if L ≤ x ≤ M
FX(x) = 1 − (H − x)² / [(H − M)(H − L)]   if M ≤ x ≤ H

The mean and variance are given by:

E(X) = (L + M + H)/3
Var(X) = (L² + M² + H² − LM − LH − MH)/18
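The closed-form mean and variance can be cross-checked by numerical integration of the density; a sketch with my own choice of L, M, H:

```python
def tri_pdf(x, L, M, H):
    # Triangular density: rises linearly on [L, M], falls linearly on [M, H]
    if L <= x <= M:
        return 2 * (x - L) / ((M - L) * (H - L))
    if M < x <= H:
        return 2 * (H - x) / ((H - M) * (H - L))
    return 0.0

L, M, H = 2.0, 3.0, 7.0
steps = 50_000
dx = (H - L) / steps
xs = [L + (i + 0.5) * dx for i in range(steps)]  # midpoint rule

mean = sum(x * tri_pdf(x, L, M, H) * dx for x in xs)
var = sum((x - mean) ** 2 * tri_pdf(x, L, M, H) * dx for x in xs)
# mean ≈ (L + M + H)/3, var ≈ (L² + M² + H² − LM − LH − MH)/18
```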
28
The PERT distribution
The PERT distribution is defined by the lowest value (L), the most likely value
(M), and the highest value (H):

fX(x) = (x − L)^(α1−1) (H − x)^(α2−1) / [B(α1, α2)(H − L)^(α1+α2−1)]

where α1 = (4M + H − 5L)/(H − L), α2 = (5H − 4M − L)/(H − L),
z = (x − L)/(H − L) and B(·, ·) is the beta function.

FX(x) = Bz(α1, α2) / B(α1, α2)

where Bz(·, ·) is the incomplete beta function.
The mean and variance are given by:

E(X) = (L + 4M + H)/6
Var(X) = (E(X) − L)(H − E(X)) / 7
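The mean formula can likewise be checked numerically; a sketch using math.gamma to build the beta function (the parameter values L, M, H are my own):

```python
import math

L, M, H = 0.0, 0.25, 1.0
a1 = (4 * M + H - 5 * L) / (H - L)  # shape parameters from the slide
a2 = (5 * H - 4 * M - L) / (H - L)
B = math.gamma(a1) * math.gamma(a2) / math.gamma(a1 + a2)  # beta function

def pert_pdf(x):
    return (x - L) ** (a1 - 1) * (H - x) ** (a2 - 1) / (B * (H - L) ** (a1 + a2 - 1))

steps = 20_000
dx = (H - L) / steps
xs = [L + (i + 0.5) * dx for i in range(steps)]  # midpoint rule

mean = sum(x * pert_pdf(x) * dx for x in xs)
# mean ≈ (L + 4M + H)/6
```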
29
The exponential distribution
X is said to be exponentially distributed if the probability density function of X
is given by:

fX(x) = λe^(−λx)

The cumulative distribution function is given by:

FX(x) = 1 − e^(−λx)

and the mean and variance are given by:

E(X) = 1/λ
Var(X) = 1/λ²

Note that for the exponential distribution, X will always be greater than 0. The
parameter λ is often called the intensity of the distribution

30
Example
We will obtain the probability that X is greater than its expected value:

Pr(X > E(X)) = 1 − Pr(X ≤ E(X)) = 1 − FX(E(X)) = e^(−λE(X)) = e^(−1) ≈ 0.37

We will obtain the probability that X is greater than 2E(X) given that X is
greater than E(X):

Pr(X > 2E(X) | X > E(X)) = Pr(X > 2E(X) ∩ X > E(X)) / Pr(X > E(X))
  = Pr(X > 2E(X)) / Pr(X > E(X)) = e^(−2λE(X)) / e^(−λE(X)) = e^(−λE(X)) = e^(−1) ≈ 0.37

This illustrates the memoryless property of the exponential distribution
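A sketch of the memoryless property in code, with λ = 2 as an arbitrary choice:

```python
import math

lam = 2.0
mean = 1 / lam  # E(X) = 1/λ

def survival(x):
    # Pr(X > x) = e^(−λx) for the exponential distribution
    return math.exp(-lam * x)

p1 = survival(mean)                       # Pr(X > E(X)) = e^(−1)
p2 = survival(2 * mean) / survival(mean)  # Pr(X > 2E(X) | X > E(X))
# Memorylessness: both probabilities equal e^(−1) ≈ 0.37
```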

31
The Weibull distribution
X is said to be Weibull distributed if the probability density function of X is
given by:

fX(x) = αλ(λx)^(α−1) e^(−(λx)^α)

The cumulative distribution function is given by:

FX(x) = 1 − e^(−(λx)^α)

and the mean and variance are given by:

E(X) = (1/λ) Γ(1/α + 1)
Var(X) = (1/λ²) [Γ(2/α + 1) − Γ²(1/α + 1)]

where Γ(·) is the gamma function

32
Example

We will obtain the probability that X is greater than its expected value in the
Weibull situation where α = 2:

Pr(X > E(X)) = 1 − Pr(X ≤ E(X)) = e^(−(λE(X))^α) = e^(−Γ^α(1 + 1/α)) ≈ 0.46 > 0.37

The probability that X is greater than 2E(X) given that X is greater than E(X)
is given by:

Pr(X > 2E(X) | X > E(X)) = Pr(X > 2E(X)) / Pr(X > E(X))
  = e^(−(2^α − 1)Γ^α(1 + 1/α)) ≈ 0.09 < 0.37
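The same comparison in code, for α = 2 and an arbitrary λ = 1:

```python
import math

alpha, lam = 2.0, 1.0
mean = (1 / lam) * math.gamma(1 / alpha + 1)  # E(X) = Γ(1/α + 1)/λ

def survival(x):
    # Pr(X > x) = e^(−(λx)^α) for the Weibull distribution
    return math.exp(-((lam * x) ** alpha))

p1 = survival(mean)                       # ≈ 0.46, larger than e^(−1)
p2 = survival(2 * mean) / survival(mean)  # ≈ 0.09: an aging component, not memoryless
```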

33
Distribution of sums, products and maximum values
If X1, X2, . . ., Xn are random variables, we have for the sum of the Xi’s:

E(X1 + X2 + . . . + Xn) = E(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} E(Xi)

Var(X1 + X2 + . . . + Xn) = Var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} Var(Xi)

SD(Σ_{i=1}^{n} Xi) = √( Σ_{i=1}^{n} [SD(Xi)]² )

Note that the equations for the variance and standard deviation are only valid
if the Xi’s are stochastically independent
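An exact check of the sum rules by enumerating two independent fair dice:

```python
from itertools import product

# Two independent fair dice; enumerate the joint sample space exactly
outcomes = range(1, 7)

def moments(values_probs):
    # Exact mean and variance of a discrete distribution given as (value, prob) pairs
    mean = sum(v * q for v, q in values_probs)
    var = sum((v - mean) ** 2 * q for v, q in values_probs)
    return mean, var

m1, v1 = moments([(x, 1 / 6) for x in outcomes])
m_sum, v_sum = moments([(x + y, 1 / 36) for x, y in product(outcomes, outcomes)])

# E and Var of the sum equal the sums of the individual E's and Var's
```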

34
Sum of normally distributed stochastic variables

• Let X1, X2, . . ., Xn be independent normally distributed
• Let Y be the sum of the Xi’s, i.e., Y = Σ_{i=1}^{n} Xi
• Y is then normally distributed with E(Y) = Σ_{i=1}^{n} E(Xi) and
Var(Y) = Σ_{i=1}^{n} Var(Xi)

The result does not generally apply for other distributions!

35
Central limit theorem

• Let X1, X2, . . ., Xn be a sequence of independent, identically distributed
stochastic variables with expected value µ and standard deviation σ
• As n approaches infinity, the average of the Xi’s will asymptotically have a
normal distribution with expected value µ and standard deviation σ/√n
• Similarly, the sum of the Xi’s will asymptotically have a normal
distribution with expected value nµ and standard deviation σ√n
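A quick Monte Carlo illustration (my own setup: averages of n = 100 uniform variables, whose mean is 0.5 and standard deviation √(1/12)):

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the illustration is reproducible
n, reps = 100, 10_000
mu, sigma = 0.5, math.sqrt(1 / 12)  # mean and std of Uniform(0, 1)

# reps independent averages, each of n i.i.d. uniforms
averages = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]

# The averages concentrate around µ with standard deviation σ/√n
print(statistics.mean(averages), statistics.stdev(averages))
```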

36
Generalized version of the central limit theorem

• Several generalizations for finite variance exist which do not require
identical distributions but incorporate some conditions which guarantee
that none of the variables exerts a much larger influence than the others
• Two such conditions are the Lindeberg condition and the Lyapunov
condition
• Now, as n approaches infinity, the sum of the Xi’s will asymptotically have
a normal distribution with expected value Σ_{i=1}^{n} E(Xi) and variance
Σ_{i=1}^{n} Var(Xi)
• We often apply this result without considering the required conditions!
• We also often ignore that n should be large!

37
Weighted sum

If X1, X2, . . ., Xn are independent stochastic variables, we have for constants
a0, a1, . . ., an:

E(a0 + a1X1 + a2X2 + . . . + anXn) = a0 + a1E(X1) + a2E(X2) + . . . + anE(Xn)

Var(a0 + a1X1 + a2X2 + . . . + anXn) = a1²Var(X1) + a2²Var(X2) + . . . + an²Var(Xn)

38
Distribution of a random number of stochastic variables
Consider Y = Σ_{i=1}^{N} Xi, where the Xi’s are independent and identically
distributed stochastic variables. If N is fixed, we can easily find the expected
value and variance of Y. If N is a stochastic variable it is not obvious, but we
have:
• Wald’s formula:

E(Σ_{i=1}^{N} Xi) = E(N)E(X)

• Blackwell–Girshick equation:

Var(Σ_{i=1}^{N} Xi) = E(N)Var(X) + E²(X)Var(N)
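Both formulas can be verified exactly on a tiny example (the numbers are my own: N is 1 or 2 with probability 1/2 each, and each Xi is 0 or 1 with probability 1/2, independent of N):

```python
from itertools import product

# Build the exact distribution of S = X1 + ... + XN by enumeration
dist = {}
for n, p_n in [(1, 0.5), (2, 0.5)]:
    for xs in product([0, 1], repeat=n):
        s = sum(xs)
        dist[s] = dist.get(s, 0) + p_n * 0.5 ** n

e_s = sum(s * q for s, q in dist.items())
var_s = sum((s - e_s) ** 2 * q for s, q in dist.items())

e_n, var_n = 1.5, 0.25  # moments of N
e_x, var_x = 0.5, 0.25  # moments of each Xi
# Wald: E(S) = E(N)E(X); Blackwell–Girshick: Var(S) = E(N)Var(X) + E²(X)Var(N)
```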

39
Distribution of a product
If X1, X2, . . ., Xn are independent stochastic variables, we might obtain the
expected value, the variance and the standard deviation of the product of the
Xi’s:

E(X1 · X2 · . . . · Xn) = E(∏_{i=1}^{n} Xi) = ∏_{i=1}^{n} E(Xi)

The results for the variance and standard deviation are more complicated,
and we only present the results for n = 2:

Var(X1X2) = Var(X1)Var(X2) + Var(X1)[E(X2)]² + Var(X2)[E(X1)]²
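An exact check of the n = 2 product formula with two-point variables (my own numbers):

```python
from itertools import product

# Two independent two-point variables given as (value, prob) pairs
X1 = [(1, 0.5), (2, 0.5)]  # E(X1) = 1.5, Var(X1) = 0.25
X2 = [(3, 0.5), (5, 0.5)]  # E(X2) = 4.0, Var(X2) = 1.00

pairs = [(x * y, p * q) for (x, p), (y, q) in product(X1, X2)]
e_p = sum(v * w for v, w in pairs)
var_p = sum((v - e_p) ** 2 * w for v, w in pairs)

# var_p matches Var(X1)Var(X2) + Var(X1)E(X2)² + Var(X2)E(X1)²
```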

40
Distribution of maximum values

• Let X1 and X2 be independent stochastic variables, and let Y = max(X1, X2)
• The cumulative distribution function of Y is given by:

FY(x) = Pr(Y ≤ x) = Pr(X1 ≤ x ∩ X2 ≤ x) = Pr(X1 ≤ x) Pr(X2 ≤ x) = FX1(x)FX2(x)

In this situation we could easily obtain the distribution of the maximum of two
stochastic variables, but it is not so easy to obtain the expectation and variance

41
Distribution of maximum values, cont
Solution: The probability density function fY(x) is the derivative of FY(x), i.e.
fY(x) = fX1(x)FX2(x) + fX2(x)FX1(x):

E(Y) = ∫_{−∞}^{∞} x · [fX1(x)FX2(x) + fX2(x)FX1(x)] dx

Var(Y) = ∫_{−∞}^{∞} [x − E(Y)]² · [fX1(x)FX2(x) + fX2(x)FX1(x)] dx
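For two independent standard normal variables these integrals can be evaluated numerically; the closed-form values E(Y) = 1/√π and Var(Y) = 1 − 1/π for this special case are stated here as a check (they are not given in the slides). This is essentially what the EMax()/VarMax() routines on the next slide compute:

```python
import math

def pdf(x):
    # Standard normal density φ(x)
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def cdf(x):
    # Standard normal CDF Φ(x)
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# For two i.i.d. standard normals: fY(x) = f1(x)F2(x) + f2(x)F1(x) = 2 φ(x) Φ(x)
steps = 16_000
dx = 16 / steps
xs = [-8 + (i + 0.5) * dx for i in range(steps)]  # midpoint rule over [−8, 8]

e_y = sum(x * 2 * pdf(x) * cdf(x) * dx for x in xs)
var_y = sum((x - e_y) ** 2 * 2 * pdf(x) * cdf(x) * dx for x in xs)
# e_y ≈ 1/√π ≈ 0.564, var_y ≈ 1 − 1/π ≈ 0.682
```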

42
The EMax() and VarMax() functions

• The EMax() and VarMax() functions in pRisk.xlsm are used to find the
expectation and variance of the maximum of two independent normally
distributed stochastic variables
• The syntax is:
  • EMax(µ1, σ1², µ2, σ2²)
  • VarMax(µ1, σ1², µ2, σ2²)
• where µi and σi² are the expected value and variance of the two variables,
respectively
• The routines are implemented by use of numerical integration

43
Thank you for your attention
