
Module 1


CL 202: Introduction to Data Analysis

Fundamentals of Probability
and Random Variables
Mani Bhushan and Sachin Patwardhan
Department of Chemical Engineering
I.I.T. Bombay

1/5/2022 Fundamentals 1
Automation Lab

Outline
• Sample Space
• Borel Field and Probability Measure
• Probability Space
• Computing Probabilities
• Concept of a Random Variable
• Discrete and Continuous Random Variables
• Properties of Random Variables
• Appendix: Conditional Probability and Independence

Note
• The material in this presentation is composed from multiple sources. The references are listed at the end. If you are looking for one reference text that contains almost every concept covered here, refer to the following standard textbook:
• Papoulis, A. and Pillai, S. U., Probability, Random Variables and Stochastic Processes (4th Ed.), McGraw-Hill International, 2002.

Probability (Maybeck, 1979)
• Intuitive approach: define the probabilities of events of interest in terms of their relative frequencies of occurrence. If the event $A$ is observed to occur $N(A)$ times in a total of $N$ trials, then $P(A)$ is defined by
$$P(A) = \lim_{N \to \infty} \frac{N(A)}{N}$$
provided that this limit in fact exists. Although this is a conceptually appealing basis for probability theory, it does not allow precise treatment of many problems and issues of direct importance.
• Modern probability theory is more rigorously based on an axiomatic definition of probability.
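
The relative-frequency definition above can be illustrated with a small simulation. This is only a sketch: the fair six-sided die, the event "outcome is even", the seed, and the helper names `relative_frequency`, `roll_die` and `is_even` are assumptions made for illustration and do not come from the slides.

```python
import random


def relative_frequency(event, sample_outcome, n_trials, seed=0):
    """Estimate P(A) as N(A)/N over n_trials simulated experiments."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if event(sample_outcome(rng)))
    return hits / n_trials


def roll_die(rng):
    """One elementary outcome of the (assumed fair) die experiment."""
    return rng.randint(1, 6)


def is_even(outcome):
    """Event A = {2, 4, 6}."""
    return outcome % 2 == 0


if __name__ == "__main__":
    # N(A)/N for an increasing number of trials N
    for n in (100, 10_000, 200_000):
        print(n, relative_frequency(is_even, roll_die, n))
```

For a fair die the printed ratios should settle near 0.5 as N grows, but no finite run proves that the limit exists, which is the weakness the slide points out.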

Sample Space (Maybeck, 1979)
• $S$: fundamental sample space containing all possible outcomes of the experiment conducted.
• $\omega$: single elementary outcome of the experiment (i.e. $\omega \in S$).
• $A$: a specific event of interest, i.e. a specific set of outcomes of the experiment. Each such event $A$ is a subset of $S$, i.e. $A \subset S$.
• An event $A$ is said to occur if the observed outcome $\omega$ is an element of $A$, i.e. if $\omega \in A$.

Sample Space
• Discrete sample space: consists of a finite or countably infinite number of elements/outcomes.
  Examples: (1) coin toss or roll of a die, (2) set of manufacturing defects in a device.
• Continuous sample space: consists of an uncountable number of elements.
  Examples: (1) values that measurement noise can take in a sensor, (2) yield of a reaction, (3) monthly profit of a company.

Field (Papoulis and Pillai, 2002)
• $\sigma$-field or $\sigma$-algebra ($F$)
A field $F$ is a non-empty class of sets such that
(a) if $A \in F$ then $\bar{A} \in F$, and
(b) if $A, B \in F$ then $A \cup B \in F$.
Using these properties, it can be shown that
(a) If $A, B \in F$ then $A \cap B \in F$:
$A, B \in F \Rightarrow \bar{A} \cup \bar{B} \in F \Rightarrow \overline{\bar{A} \cup \bar{B}} = A \cap B \in F$
(b) The field contains the "certain event" ($S = A \cup \bar{A}$) and the "impossible event" ($\emptyset = A \cap \bar{A}$).
Example: Consider the experiment of rolling a die once. We can define the sample space as $S = \{1, 2, 3, 4, 5, 6\}$.
Sigma Field (Papoulis and Pillai, 2002)
Example (contd.)
Now consider a class of sets, $F$, defined as
$F = \{ \emptyset, \{2\}, \{1,3,4,5,6\}, \{1,3,5\}, \{1,2,3,5\}, \{2,4,6\}, \{4,6\}, S \}$
The class $F$ qualifies to be a $\sigma$-field.
• A probability measure is defined only on a sigma field.
Example 1: Rolling of a Die (Jazwinski, 1970)
Consider the experiment of rolling a die (with faces numbered from 1 to 6) once.
We can define the sample space as
$S_1 = \{1, 2, 3, 4, 5, 6\}$
Now, a $\sigma$-field can be defined in multiple ways.
Case 1: the $\sigma$-field $F_1$ can be taken as the set of all subsets of $S_1$ (i.e. the power set of $S_1$).
Case 2: If we are interested in betting on only the events (odd) and (even), then the $\sigma$-field $F_2$ can be defined as
$F_2 = \{ \emptyset, \{1,3,5\}, \{2,4,6\}, S_1 \}$
Example 1: Rolling of a Die
Now consider a set $F$ defined as
$F = \{ \emptyset, \{2\}, \{1,3,5\}, \{2,4,6\}, S_1 \}$
Note: $\{2\} \cup \{1,3,5\} = \{1,2,3,5\} \notin F$, even though $\{1,2,3,5\} \subset S_1$ and $S_1 \in F$.
Thus the set $F$ does not qualify as a $\sigma$-field.
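
The closure checks used in the die examples above can be automated for a finite sample space. The following is a minimal sketch; the function name `is_field` is hypothetical, and the check relies on the fact that for a finite class of sets, closure under complement and pairwise union is sufficient.

```python
from itertools import combinations


def is_field(sample_space, family):
    """Check closure under complement and pairwise union (finite case)."""
    universe = frozenset(sample_space)
    sets = {frozenset(a) for a in family}
    if not sets:
        return False  # a field must be a non-empty class of sets
    closed_under_complement = all(universe - a in sets for a in sets)
    closed_under_union = all(a | b in sets for a, b in combinations(sets, 2))
    return closed_under_complement and closed_under_union


S1 = {1, 2, 3, 4, 5, 6}
# Class of sets from the "Sigma Field" slide
F_valid = [set(), {2}, {1, 3, 4, 5, 6}, {1, 3, 5}, {1, 2, 3, 5}, {2, 4, 6}, {4, 6}, S1]
# Odd/even field F2
F2 = [set(), {1, 3, 5}, {2, 4, 6}, S1]
# Class of sets that fails: {2} U {1,3,5} = {1,2,3,5} is missing
F_invalid = [set(), {2}, {1, 3, 5}, {2, 4, 6}, S1]

if __name__ == "__main__":
    print(is_field(S1, F_valid), is_field(S1, F2), is_field(S1, F_invalid))
```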

Borel Field
• Borel field ($B$) [Papoulis and Pillai, 2002]: Suppose $A_1, A_2, \ldots, A_n, \ldots$ is an infinite sequence of sets in a field $B$. If the unions and intersections of these sets also belong to $B$, then $B$ is called a Borel field.
• While the notion of Borel fields can be defined on any topology, in probability theory it comes in most handy when it is defined over the real space $\mathbb{R}^n$. These are called Borel fields defined on $\mathbb{R}^n$. We will refer to such fields simply as Borel fields.
• Here we take $S = \mathbb{R}^n$.

Borel Field
Consider the very special semi-open sets defined on the real line:
$A_1 = \{ w \in \mathbb{R} : -\infty < w \le a_1 \}, \quad a_1 \in \mathbb{R}$
$A_2 = \{ w \in \mathbb{R} : -\infty < w \le a_2 \}, \quad a_2 \in \mathbb{R}$
• The sigma algebra generated from such sets is so commonly used that it has its own name: the Borel field.
• What other sets exist in this sigma algebra? For $a_1 < a_2$:
$\bar{A}_1 = (a_1, \infty)$
$A_2 = (-\infty, a_2]$
$\bar{A}_1 \cap A_2 = (a_1, a_2]$
Let $B_k = \{ w : b - \tfrac{1}{k} < w \le b \}$, $k \in I^+$; then $\bigcap_{k=1}^{\infty} B_k = \{b\}$.
• It can be shown that the Borel field consists of all semi-open, open and closed subsets of the real line, including singleton sets.
Example 2: Error in temperature measurement
• This is an example of a continuous RV.
• Theoretically, the measurement error can take any real value.
• Sample space $S = \mathbb{R}$, i.e. the real line.
Example 2: Measurement Errors
• The Borel field $B$ can be defined to consist of events of the following form:
Error is less than or equal to $e_i$:
$A_i = \{ \omega : \omega \le e_i \} = (-\infty, e_i], \quad e_i \in \mathbb{R}$
and all possible complements, intersections and unions of the $A_i$ sets.
• The Borel field then contains all possible half-open and closed intervals, and point sets as well.
Example 2: Meas. Errors (Cont.)
• Let $e_1$ and $e_2$ be points on $\mathbb{R}$ such that $e_1 < e_2$. Then the following sets belong to the Borel field $B$.
• Event 1: error is less than or equal to $e_1$
$A_1 = \{ \omega : \omega \le e_1, \ \omega \in \mathbb{R} \} = (-\infty, e_1]$
Event 2: error is less than or equal to $e_2$
$A_2 = \{ \omega : \omega \le e_2, \ \omega \in \mathbb{R} \} = (-\infty, e_2]$
Event 3: error lies between $e_1$ and $e_2$
$\bar{A}_1 = \{ \omega : \omega > e_1, \ \omega \in \mathbb{R} \} = (e_1, \infty)$
$\bar{A}_1 \cap A_2 = (e_1, e_2]$
Event 4: error is equal to $e_1$

Axioms of Probability
• The probability function (or probability measure) $P(\cdot)$ is defined to be a real scalar-valued function defined on a sigma field (or algebra) that assigns a value $P(A_i)$ to each $A_i$ which is a member of $F$ (that is, $A_i \in F$) such that
1. $P(A_i) \ge 0$ for all $A_i \in F$
2. $P(S) = 1$
3. If $A_1, A_2, \ldots, A_N$ are disjoint or mutually exclusive elements of $F$, then
$$P\left( \bigcup_{i=1}^{N} A_i \right) = \sum_{i=1}^{N} P(A_i)$$
for all finite and countably infinite $N$.
Axioms of Probability: Implications
1. What can we say about $P(\bar{A})$ in terms of $P(A)$?
$S = A \cup \bar{A} \Rightarrow P(S) = P(A) + P(\bar{A}) \Rightarrow P(\bar{A}) = 1 - P(A)$
Q. What can we say about $P(\emptyset)$?
2. If $A_1 \subset A_2$, what can we say about $P(A_1)$ and $P(A_2)$?
$A_2 = A_1 \cup (\bar{A}_1 \cap A_2) \Rightarrow P(A_2) = P(A_1) + P(\bar{A}_1 \cap A_2) \Rightarrow P(A_2) \ge P(A_1)$

Note (Papoulis and Pillai, 2002)
• The axioms of probability are chosen so that the resulting theory gives a satisfactory representation of the physical world.
• Probabilities as used in real problems must be compatible with the axioms.
• Using the frequency interpretation of probability:
1. $P(A) \ge 0$ because $N(A) \ge 0$ and $N > 0$.
2. $P(S) = 1$ because $S$ occurs at every trial; hence $N(S) = N$.
3. If $A \cap B = \emptyset$, then $N(A \cup B) = N(A) + N(B)$, because if $A \cup B$ occurs, then $A$ or $B$ occurs but not both. Hence
$$P(A \cup B) \approx \frac{N(A \cup B)}{N} = \frac{N(A) + N(B)}{N} \approx P(A) + P(B)$$
Probability Space
Probability space: defined by the triplet $(S, F, P)$ of the sample space, the underlying $\sigma$-field, and the probability function, all defined axiomatically.
Example 1: Rolling of a Die (contd.)
• Consider the sample space $S_1 = \{1, 2, 3, 4, 5, 6\}$.
Case 1: $\sigma$-field $F_1$ = power set of $S_1$.
Consider the disjoint sets $A_i = \{i\}$ for $i = 1, 2, \ldots, 6$.
If we set $P_1(A_i) = 1/6$ for $i = 1, 2, \ldots, 6$, then we can find the probability of any event in $F_1$.
Example 1: Rolling of a Die (contd.)
Thus, the triplet $(S_1, F_1, P_1)$ forms a probability space.
Case 2: If we are interested in betting on only the events (odd) and (even), and the $\sigma$-field $F_2$ is defined as
$F_2 = \{ \emptyset, \{1,3,5\}, \{2,4,6\}, S_1 \}$
Define
$P_2(\{1,3,5\}) = p$, $P_2(\{2,4,6\}) = 1 - p$, $P_2(\emptyset) = 0$ and $P_2(S_1) = 1$, with $1 \ge p \ge 0$.
The triplet $(S_1, F_2, P_2)$ forms a probability space.
Example 1: Rolling of a Die (contd.)
Case 3: We can define $S_2 = \mathbb{R}$, i.e. the set of all real numbers.
$\sigma$-field $B_3$ = set of all subsets of $\mathbb{R}$. This sigma algebra is a Borel field.
$P_3((-\infty, a]) = (1/6) \times \{\text{number of points among } 1, 2, \ldots, 6 \text{ which } (-\infty, a] \text{ contains}\}$
$P_3(\{\omega : \omega \le 2.5\}) = 2 \times (1/6)$; $\quad P_3(\{\omega : \omega \le 5.3\}) = 5 \times (1/6)$
$P_3(\{\omega : 1.9 < \omega \le 5.3\}) = 4 \times (1/6)$
$P_3(\{\omega : \omega < 1\}) = 0$; $\quad P_3(\{\omega : \omega \le 10\}) = 1$
The triplet $(S_2, B_3, P_3)$ forms a probability space.
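
The counting rule that defines $P_3$ in Case 3 is easy to evaluate mechanically. The sketch below reproduces the numbers quoted on the slide; the helper names `p3_half_line` and `p3_interval` are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

DIE_FACES = (1, 2, 3, 4, 5, 6)


def p3_half_line(a):
    """P3((-inf, a]) = (1/6) x number of faces contained in (-inf, a]."""
    return Fraction(sum(1 for face in DIE_FACES if face <= a), 6)


def p3_interval(lo, hi):
    """P3((lo, hi]) obtained as P3((-inf, hi]) - P3((-inf, lo])."""
    return p3_half_line(hi) - p3_half_line(lo)


if __name__ == "__main__":
    print(p3_half_line(2.5))      # two faces (1, 2) lie in (-inf, 2.5]
    print(p3_half_line(5.3))      # five faces lie in (-inf, 5.3]
    print(p3_interval(1.9, 5.3))  # faces 2, 3, 4, 5 lie in (1.9, 5.3]
```

The interval rule used in `p3_interval` is the same disjoint-union argument that the later slides apply to distribution functions.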
Points to Note
• $\Pr\{A\} = 0$ does NOT imply $A = \emptyset$.
• $\Pr\{A\} = 1$ does NOT imply $A = S$.
• The definition of a probability space for a given experiment is NOT unique.
• The triplet $(S, B, P)$ must be 'specified'; it is NOT determined by the physical experiment.
Computing Probabilities
Consider a probability space $(S, B, P)$.
Given an event $A \in B$, in some cases it is easier to find $P(\bar{A})$ than $P(A)$.
Then, using the fact that $A \cup \bar{A} = S$ and that $A$ and $\bar{A}$ are disjoint, we can write
$P(S) = P(A) + P(\bar{A}) = 1 \Rightarrow P(A) = 1 - P(\bar{A})$
Odds of event $A$ $= P(A) / P(\bar{A}) = P(A) / [1 - P(A)]$

Non-Disjoint Events
Let $A$ and $B$ be two events in $B$ that are NOT disjoint.
We can write
$A = (A \cap B) \cup (A \cap \bar{B})$
where the two sets on the right are disjoint.
$\Rightarrow P(A) = P(A \cap B) + P(A \cap \bar{B}) \quad \ldots \text{(I)}$
Computing Probabilities
Non-Disjoint Events (contd.)
• Similarly, we can write
$A \cup B = [(A \cup B) \cap B] \cup [(A \cup B) \cap \bar{B}] = B \cup (A \cap \bar{B})$
where the two sets on the right are disjoint.
$\Rightarrow P(A \cup B) = P(B) + P(A \cap \bar{B}) \quad \ldots \text{(II)}$
• Eliminating $P(A \cap \bar{B})$ from (II) using (I), we have
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Example (Ross, 2009)
• A total of 28 percent of American males smoke cigarettes, 7 percent smoke cigars, and 5 percent smoke both cigars and cigarettes. What percentage of males smoke neither cigars nor cigarettes?
Event A: a randomly chosen male is a cigarette smoker.
Event B: a randomly chosen male is a cigar smoker.
$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.28 + 0.07 - 0.05 = 0.3$
Thus the probability that the person is not a smoker is $1 - 0.3 = 0.7$, implying that 70 percent of American males smoke neither cigarettes nor cigars.
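
The arithmetic of the smoking example can be checked in a few lines. This is a small sketch of the inclusion-exclusion rule and the complement rule from the preceding slides; the function name `prob_union` is hypothetical.

```python
def prob_union(p_a, p_b, p_a_and_b):
    """P(A U B) = P(A) + P(B) - P(A n B) for events that may overlap."""
    return p_a + p_b - p_a_and_b


# Numbers quoted in the example (Ross, 2009)
p_cigarettes = 0.28
p_cigars = 0.07
p_both = 0.05

p_smoker = prob_union(p_cigarettes, p_cigars, p_both)
p_neither = 1.0 - p_smoker  # complement rule: P(not A) = 1 - P(A)

if __name__ == "__main__":
    print(f"P(smokes something) = {p_smoker:.2f}")
    print(f"P(smokes neither)   = {p_neither:.2f}")
```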

Product of Sample Spaces
• In statistics one usually does not consider a single experiment; rather, the same experiment is performed several times.
When we perform an experiment $n$ times, the corresponding sample space is
$S = S_1 \times S_2 \times S_3 \times \cdots \times S_n$
where $S_i$, for $i = 1, \ldots, n$, is a copy of the sample space of the original experiment.
Probability of the combined outcome $(\omega_1, \omega_2, \ldots, \omega_n)$:
$P((\omega_1, \omega_2, \ldots, \omega_n)) = p_1 \cdot p_2 \cdots p_n$
where each $\omega_i$ has probability $p_i$.
Example: Repeated Coin Toss (Dekking et al., 2005)
• Sample space associated with the coin toss experiment:
$S = \{H, T\}$
Let $P(H) = p$ and $P(T) = q = 1 - p$, with $0 \le p, q \le 1$.
• When we perform the coin toss experiment $n$ times:
$S = \{H, T\} \times \{H, T\} \times \cdots \times \{H, T\}$
• Consider the situation when $n = 3$. Then
$P((H, H, H)) = p^3$
$P((H, T, H)) = p \cdot q \cdot p = p^2 (1 - p)$
and so on.
Example: Infinite Outcomes
Experiment with infinitely many outcomes: toss a coin repeatedly until the first head turns up.
Outcome of the experiment: number of tosses it takes to have the first success (i.e. occurrence of $H$).
$\Rightarrow S = \{1, 2, 3, \ldots\}$
For $i = 3$, we have
$P(3) = P((T, T, H)) = p (1 - p)^2$
...
For $i = n$, we have
$P(n) = P((T, T, \ldots, T, H)) = p (1 - p)^{n-1}$

Example: Infinite Outcomes
• It may be noted that the events $\{1\}, \{2\}, \ldots, \{n\}, \ldots$, i.e.
$(H), (T, H), (T, T, H), \ldots, (T, T, \ldots, T, H), \ldots$
are disjoint or mutually exclusive events.
• Thus, from the axioms of probability, it follows that
$$P(S) = P(1) + P(2) + \cdots + P(n) + \cdots = p + (1-p)p + \cdots + (1-p)^{n-1} p + \cdots$$
$$= p \left[ 1 + (1-p) + (1-p)^2 + \cdots \right] = p \left[ \frac{1}{1 - (1-p)} \right] = 1$$
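
The geometric-series argument above can be checked numerically. The sketch below sums the first few terms $p(1-p)^{n-1}$; the value $p = 0.3$ and the function names are assumptions chosen only for illustration.

```python
def first_head_pmf(n, p):
    """P(n) = p (1 - p)^(n - 1): first head appears on toss n."""
    return p * (1 - p) ** (n - 1)


def partial_sum(p, n_terms):
    """P(1) + P(2) + ... + P(n_terms)."""
    return sum(first_head_pmf(n, p) for n in range(1, n_terms + 1))


if __name__ == "__main__":
    p = 0.3  # assumed probability of heads, for illustration only
    for n_terms in (1, 5, 20, 200):
        print(n_terms, partial_sum(p, n_terms))
```

The partial sum over the first $N$ terms equals $1 - (1-p)^N$, so it approaches 1 from below for any $0 < p \le 1$.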
Random Variable
• Problem 1: A sample space associated with a random phenomenon need not consist of elements that are numbers.
• Example 1: A die with 6 faces painted with 6 different colors.
• Example 2: Candidates appearing in an election held in a constituency.
• How do we perform numerical calculations involving such sample spaces?
Random Variable
Problem 2: We often have to consider multiple random phenomena simultaneously. Even if the sample spaces associated with these random phenomena consist of numbers, their ranges can be widely different.
Example: Temperature, pressure and feed concentration fluctuations in a chemical reactor. To understand the reactor behavior, these multiple random phenomena have to be considered simultaneously.
• How can we treat such problems through a unified mathematical framework?
Random Variable
• Is it possible to define a transformation such that we can perform all the calculations using a generic sample space and a generic Borel field defined using the generic sample space?
• The concept of a random variable is introduced because we need a mapping from the sample space to the set of real numbers for carrying out quantitative analysis through a unified mathematical framework.

Random Variable (Maybeck, 1979)
• Given a sample space, $S$, and an associated Borel field $B$.
A scalar random variable $\mathbf{x}(\omega)$ is a real-valued point 'function or mapping' which assigns a real scalar value to each point $\omega$ in $S$, denoted as $\mathbf{x}(\omega) = x$, such that
(1) every set $A$ of the form $A = \{ \omega : \mathbf{x}(\omega) \le x \}$, for any $x$ on the real line ($x \in \mathbb{R}$), is an element of the Borel field $B$;
(2) $P(\mathbf{x}(\omega) = \infty) = P(\mathbf{x}(\omega) = -\infty) = 0$.
The second condition states that although we allow $\mathbf{x}(\omega)$ to be $+\infty$ or $-\infty$, we demand that these outcomes form a set with zero probability.

Random Variable
• $\mathbf{x}(\omega)$: random variable (mapping)
• $x$: a realization of the random variable (i.e. the value that this function assumes for a particular $\omega$)
A scalar random variable is a mapping from the sample space, $S$, into $\mathbb{R}$ such that inverse images of half-open intervals of the form $(-\infty, x]$ in $\mathbb{R}$ are events in $B$ for which the probabilities can be defined through a probability function $P(\cdot)$.

Advantage
• Once we define a random variable $\mathbf{x}(\omega)$ on an original sample space, say $S$, we can start working with the 'generic' sample space $S_R \equiv \mathbb{R}$.
• Original Borel field $\equiv B$
  Generic Borel field $\equiv B_R$ (consisting of all sub-intervals of $\mathbb{R}$)
• A generic elementary event $A$ in $S_R$:
$A = (-\infty, x]$ with $x \in \mathbb{R}$
The Generic Borel Field
Let $x_1$ and $x_2$ be points on $\mathbb{R}$ such that $x_1 < x_2$.
Then the following sets belong to the Borel field $B_R$:
$A_1 = (-\infty, x_1]$
$A_2 = (-\infty, x_2]$
$\bar{A}_1 = (x_1, \infty)$
$\bar{A}_1 \cap A_2 = (x_1, x_2]$
Taking complements, unions, and intersections of the sets $A_i$ leads to finite intervals (open, closed, half open) and point values.
The Borel field $B_R$ is composed of virtually all sub-intervals of $\mathbb{R}$ of interest in describing a probability problem on $\mathbb{R}$.

Advantage
• A generic probability function, $P(\cdot)$, is defined for all events of the form $A = (-\infty, x]$:
$F_{\mathbf{x}}(x) = P((-\infty, x])$, where $0 \le F_{\mathbf{x}}(x) \le 1$
$F_{\mathbf{x}}(x)$: Probability Distribution Function
• Thus, from now on, we work with the probability space $(\mathbb{R}, B_R, F_{\mathbf{x}}(x))$ for all problems.

Pictorial Representation
[Figure]
Properties of Distribution Function
Notation
$$F_{\mathbf{x}}(x^+) = \lim_{\varepsilon \to 0} F_{\mathbf{x}}(x + \varepsilon) \quad \text{and} \quad F_{\mathbf{x}}(x^-) = \lim_{\varepsilon \to 0} F_{\mathbf{x}}(x - \varepsilon), \quad \text{for } \varepsilon > 0$$
Property 1
$F_{\mathbf{x}}(+\infty) = 1$ and $F_{\mathbf{x}}(-\infty) = 0$
Property 2
$F_{\mathbf{x}}(x)$ is a non-decreasing function of $x$, i.e., if $x_1 < x_2$ then $F_{\mathbf{x}}(x_1) \le F_{\mathbf{x}}(x_2)$.
Properties of Distribution Function
Property 3
If $F_{\mathbf{x}}(x_0) = 0$ then $F_{\mathbf{x}}(x) = 0$ for all $x \le x_0$.
Property 4
The function $F_{\mathbf{x}}(x)$ is continuous from the right, i.e., $F_{\mathbf{x}}(x^+) = F_{\mathbf{x}}(x)$.
Property 5
$P(\mathbf{x} = x) = F_{\mathbf{x}}(x) - F_{\mathbf{x}}(x^-)$.
Property 6
$P(x_1 < \mathbf{x} \le x_2) = F_{\mathbf{x}}(x_2) - F_{\mathbf{x}}(x_1)$

Points to Note
• Continuous Random Variable
A random variable is called continuous type if the distribution function $F_{\mathbf{x}}(x)$ is continuous, i.e., $F_{\mathbf{x}}(x^-) = F_{\mathbf{x}}(x^+) = F_{\mathbf{x}}(x)$ for all $x$.
• Discrete Random Variable
If $F_{\mathbf{x}}(x)$ is constant except for a finite number of jump discontinuities (piecewise constant, step type), then $\mathbf{x}$ is said to be a discrete-type RV.
• It is possible to encounter a situation where a random variable is of mixed type.

Example: Coin Toss
• Consider the coin toss experiment with the triplet
$S = \{H, T\}$; $\quad B = \{ \emptyset, \{H\}, \{T\}, S \}$;
$P(H) = p$, $P(T) = q$, $p + q = 1$, $0 \le p, q \le 1$
• We can define a discrete RV as
$\mathbf{x}(H) = 1$ and $\mathbf{x}(T) = 0$
The new sample space is $\mathbb{R}$ and the associated Borel field is $B_R$.
• New distribution function
$$F_{\mathbf{x}}(x) = \begin{cases} 0 & \text{for } -\infty < x < 0 \\ q & \text{for } 0 \le x < 1 \\ 1 & \text{for } 1 \le x < \infty \end{cases}$$
Probability of Other Events
• Probability of the event $\{\mathbf{x} > x\}$, i.e. the set $(x, \infty)$:
$P((x, \infty)) = 1 - P((-\infty, x]) = 1 - F_{\mathbf{x}}(x)$
(This follows from the axiom of probability $P(S) = 1$.)
• Probability of the event $\{x_1 < \mathbf{x} \le x_2\}$, i.e. the set $(x_1, x_2]$:
$(-\infty, x_2] = (-\infty, x_1] \cup (x_1, x_2]$
Note: the sets $(-\infty, x_1]$ and $(x_1, x_2]$ are disjoint.
$\Rightarrow P((-\infty, x_2]) = P((-\infty, x_1]) + P((x_1, x_2])$
(follows from the axioms of probability)
$P((x_1, x_2]) = F_{\mathbf{x}}(x_2) - F_{\mathbf{x}}(x_1)$
Example 1: Rolling a Die (contd.)
• Consider the experiment of rolling a die (with 6 faces painted with 6 different colors) once.
• Original sample space $S_4 = \{C_1, C_2, C_3, C_4, C_5, C_6\}$
• A Borel field, say $B_4$, defined on the original sample space, $S_4$, contains sets of the form
$\{C_1, C_2, C_3\}, \{C_4, C_5, C_6\}$
$\{C_1, C_3, C_6\}, \{C_2, C_4, C_5\}$
...
and so on, in addition to $\emptyset$ and $S_4$.
Example 1: Rolling a Die (contd.)
• We can define a discrete RV
$\mathbf{x}(C_i) = 10 i$
New sample space $\equiv S_R \equiv \mathbb{R}$
• An event, say $A$, in the Borel field $B_4$ can now be described using an event $\tilde{A}$ in $B_R$.
Events in $B_R$ are intervals of the form
$\tilde{A} = \{\mathbf{x} \le x\} = (-\infty, x]$ where $x \in \mathbb{R}$
or
$\tilde{A} = \{x_1 \le \mathbf{x} \le x_2\} = [x_1, x_2]$ where $x_1, x_2 \in \mathbb{R}$, $x_1 < x_2$
and so on.
Example 1: Rolling a Die (contd.)
• Event $\{\mathbf{x} \le 35\}$ in $B_R$ $\equiv$ Event $\{C_1, C_2, C_3\}$ in $B_4$, because $\mathbf{x}(C_i) \le 35$ only if $i = 1, 2, 3$.
• Event $\{\mathbf{x} \le 6\}$ in $B_R$ $\equiv$ Event $\emptyset$ in $B_4$, because there is no outcome with $\mathbf{x}(C_i) \le 6$.
• Event $\{17.5 \le \mathbf{x} \le 46.8\}$ in $B_R$ $\equiv$ Event $\{C_2, C_3, C_4\}$ in $B_4$, because $17.5 \le \mathbf{x}(C_i) \le 46.8$ only for $i = 2, 3, 4$.
• Event $\{20 \le \mathbf{x} \le 45\}$ in $B_R$ $\equiv$ Event $\{C_2, C_3, C_4\}$ in $B_4$, because $20 \le \mathbf{x}(C_i) \le 45$ only for $i = 2, 3, 4$.
Example 1: Rolling a Die (contd.)
• Event $\{\mathbf{x} = 50\}$ in $B_R$ $\equiv$ Event $\{C_5\}$ in $B_4$, because $\mathbf{x}(C_i) = 50$ only if $i = 5$.
• Event $\{\mathbf{x} = 28\}$ in $B_R$ $\equiv$ Event $\emptyset$ in $B_4$, because there is no outcome such that $\mathbf{x}(C_i) = 28$.
• Event $\{\mathbf{x} \le 80\}$ in $B_R$ $\equiv$ Event $S_4$ in $B_4$.
Note
(1) An event in $B_4$ can correspond to multiple events in $B_R$.
(2) The event $\emptyset$ in $B_R$ is NOT the same as the event $\emptyset$ in $B_4$.

Probability Measure
A probability measure on $B_4$:
$P(C_i) = 1/6$ for $i = 1, 2, \ldots, 6$
An equivalent probability measure on $B_R$:
$F_{\mathbf{x}}(x) = P((-\infty, x]) = (1/6) \times \{\text{number of points among } 10, 20, \ldots, 60 \text{ which } (-\infty, x] \text{ contains}\}$
Note: Unlike $P(C_i)$, $F_{\mathbf{x}}(x)$ is a function defined over the entire real line $\mathbb{R}$. It is piecewise constant, with jump discontinuities only at 6 points.
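Example 1: Rolling a Die (contd.)
• The distribution function of the discrete RV $\mathbf{x}(C_i) = 10 i$ is a staircase function.
• In particular
$F_{\mathbf{x}}(38) = P(\{-\infty < \mathbf{x} \le 38\} \text{ in } B_R) = P(\{C_1, C_2, C_3\} \text{ in } B_4) = 3 \times \tfrac{1}{6}$
$F_{\mathbf{x}}(31.9) = P(\{-\infty < \mathbf{x} \le 31.9\} \text{ in } B_R) = P(\{C_1, C_2, C_3\} \text{ in } B_4) = 3 \times \tfrac{1}{6}$
$F_{\mathbf{x}}(29.5) = P(\{-\infty < \mathbf{x} \le 29.5\} \text{ in } B_R) = P(\{C_1, C_2\} \text{ in } B_4) = 2 \times \tfrac{1}{6}$
$F_{\mathbf{x}}(6) = P(\{-\infty < \mathbf{x} \le 6\} \text{ in } B_R) = P(\emptyset \text{ in } B_4) = 0$
$F_{\mathbf{x}}(82) = P(\{-\infty < \mathbf{x} \le 82\} \text{ in } B_R) = P(S_4 \text{ in } B_4) = 1$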
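
The staircase distribution function of the colored-die example can be written out directly. The sketch below assumes the mapping $\mathbf{x}(C_i) = 10i$ and equal probabilities $1/6$ from the slides; the names `VALUES`, `cdf` and `jump_at` are hypothetical.

```python
from fractions import Fraction

# Values taken by the random variable x(C_i) = 10 i, i = 1..6
VALUES = tuple(10 * i for i in range(1, 7))


def cdf(x):
    """F_x(x) = (1/6) x number of points 10, 20, ..., 60 in (-inf, x]."""
    return Fraction(sum(1 for v in VALUES if v <= x), len(VALUES))


def jump_at(x, eps=Fraction(1, 1000)):
    """Size of the jump F_x(x) - F_x(x - eps), i.e. P(x = x) for small eps."""
    return cdf(x) - cdf(x - eps)


if __name__ == "__main__":
    for x in (6, 29.5, 31.9, 38, 82):
        print(x, cdf(x))
    print("jump at 50:", jump_at(50), " jump at 28:", jump_at(28))
```

The jump sizes match the events $\{\mathbf{x} = 50\}$ and $\{\mathbf{x} = 28\}$ discussed earlier: the first carries probability $1/6$, the second corresponds to the impossible event.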

Probability Mass Function
• The probability mass function of a discrete RV, $\mathbf{x}$, is a function $f_{\mathbf{x}}(\cdot) : \mathbb{R} \to [0, 1]$ defined by
$f_{\mathbf{x}}(a) = P(\mathbf{x} = a)$ for $-\infty < a < \infty$.

Probability Mass Function
• If $\mathbf{x}$ takes values $a_1, a_2, \ldots$ then we have
$f_{\mathbf{x}}(a_i) = P(\mathbf{x} = a_i) = p_i \ge 0$
and $p_1 + p_2 + \cdots = 1$ with $0 \le p_1, p_2, \ldots \le 1$
and $f_{\mathbf{x}}(x) = P(\mathbf{x} = x) = 0$ for all other $x \in \mathbb{R}$.
Thus, the probability (cumulative) distribution function of a discrete RV, $\mathbf{x}$, is related to the probability mass function of $\mathbf{x}$ as follows:
$$F_{\mathbf{x}}(x) = P(\mathbf{x} \le x) = \sum_{i : a_i \le x} f_{\mathbf{x}}(a_i)$$

Probability Mass Function
• The definition of the probability mass function on the previous slide is intuitively appealing. However, it does not facilitate treatment through calculus.
• To facilitate treatment through calculus, the probability mass function for a discrete RV can be represented as
$$f_{\mathbf{x}}(x) = \sum_i p_i \, \delta(x - a_i) \quad \text{for } -\infty < x < \infty$$
where $\delta(x - a_i)$ represents the Dirac delta function.

Probability Mass Function
• The probability mass function is related to the probability distribution function through the following integral equation:
$$F_{\mathbf{x}}(x) = \int_{-\infty}^{x} f_{\mathbf{x}}(\tau) \, d\tau = \int_{-\infty}^{x} \sum_i p_i \, \delta(\tau - a_i) \, d\tau = \sum_i p_i \int_{-\infty}^{x} \delta(\tau - a_i) \, d\tau = \sum_{i : a_i \le x} f_{\mathbf{x}}(a_i)$$
(This follows from the properties of $\delta(x - a_i)$.)
Thus, $f_{\mathbf{x}}(x) = \dfrac{dF_{\mathbf{x}}(x)}{dx}$

Example: Data Transmission
• There is a chance that a bit transmitted through a digital transmission channel is received in error.
• Define a discrete RV, $\mathbf{x}$, equal to the number of bits in error in the next four bits transmitted.
• Original sample space $S = \{0, 1, 2, 3, 4\}$
  Original Borel field $\equiv$ power set of $S$
• Probabilities defined with reference to $S$:
$P(\omega_1 = 0) = 0.6561$, $P(\omega_2 = 1) = 0.2916$, $P(\omega_3 = 2) = 0.0486$,
$P(\omega_4 = 3) = 0.0036$, $P(\omega_5 = 4) = 0.0001$

Example: Data Transmission
• Let us define a discrete RV
$\mathbf{x}(\omega_i) = i - 1$ for $i = 1, 2, 3, 4, 5$
New sample space $\equiv \mathbb{R}$ and new Borel field $\equiv B_R$
• Probability Mass Function
$f_{\mathbf{x}}(0) = 0.6561$, $f_{\mathbf{x}}(1) = 0.2916$, $f_{\mathbf{x}}(2) = 0.0486$, $f_{\mathbf{x}}(3) = 0.0036$, $f_{\mathbf{x}}(4) = 0.0001$
• Probability/Cumulative Distribution Function
$$F_{\mathbf{x}}(x) = \begin{cases} 0 & \text{for } -\infty < x < 0 \\ 0.6561 & \text{for } 0 \le x < 1 \\ 0.9477 & \text{for } 1 \le x < 2 \\ 0.9963 & \text{for } 2 \le x < 3 \\ 0.9999 & \text{for } 3 \le x < 4 \\ 1 & \text{for } 4 \le x < \infty \end{cases}$$
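
The slide states the probabilities without saying where they come from. One model that reproduces them exactly, offered here only as an assumption, is four independent bits each received in error with probability 0.1 (a binomial model). The sketch below computes the mass function under that assumed model and accumulates it into the distribution function; the names `binomial_pmf`, `N_BITS` and `P_ERROR` are hypothetical.

```python
from itertools import accumulate
from math import comb


def binomial_pmf(k, n, p):
    """P(k errors in n independent bits), each in error with probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


N_BITS = 4     # "next four bits transmitted" (from the slide)
P_ERROR = 0.1  # ASSUMPTION: per-bit error probability, not given in the slide

# f_x(k) for k = 0..4 and the running sums F_x(k)
pmf = [binomial_pmf(k, N_BITS, P_ERROR) for k in range(N_BITS + 1)]
cdf = list(accumulate(pmf))

if __name__ == "__main__":
    for k, (f_k, F_k) in enumerate(zip(pmf, cdf)):
        print(f"k={k}  f_x(k)={f_k:.4f}  F_x(k)={F_k:.4f}")
```

Under this assumption the rounded values agree with the slide's mass function and with the steps of the distribution function.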

Example 2: Telephone Call
• A telephone call occurs at random during the time interval $[0, T]$.
Original sample space $(S) \equiv [0, T]$
• Original Borel field
$B \equiv \{ \text{set of all sub-intervals } [t_1, t_2] \text{ defined over the interval } [0, T] \}$
• Probability function on $B$
$$P(t_1 \le \omega \le t_2) = \frac{t_2 - t_1}{T} \quad \text{for any } 0 \le t_1 \le t_2 \le T$$

Example 2: Telephone Call
• Define a continuous RV as
$\mathbf{x}(\omega) = \omega$ when $\omega \in [0, T]$
New sample space $(S_R) \equiv \mathbb{R}$
Event $\{\omega \in [0, t]\} \in B$ $\equiv$ Event $\{\mathbf{x} \in (-\infty, t]\} \in B_R$ (for $0 \le t \le T$)
• An event in $B_R$, $\mathbf{x} \in (-\infty, a]$, where $a \ge T$, is associated with the certain event, $S$, in $B$.
• An event in $B_R$, $\mathbf{x} \in (-\infty, t]$, where $t < 0$, is associated with the impossible event $\emptyset$ in $B$.

Example 2: Telephone Call
Define
$$F_{\mathbf{x}}(t) = P(\mathbf{x} \le t) = \begin{cases} 0 & \text{if } -\infty < t < 0 \\ t/T & \text{if } 0 \le t \le T \\ 1 & \text{if } T < t < \infty \end{cases}$$
Now consider $t_1, t_2 \in [0, T]$ such that $t_1 < t_2$. Suppose we want to find the probability of occurrence of a call in the interval $(t_1 < \mathbf{x} \le t_2)$.
Since $(\mathbf{x} \le t_2) = (\mathbf{x} \le t_1) \cup (t_1 < \mathbf{x} \le t_2)$ and the events $(\mathbf{x} \le t_1)$ and $(t_1 < \mathbf{x} \le t_2)$ are disjoint, from the axioms of probability it follows that
$P(\mathbf{x} \le t_2) = P(\mathbf{x} \le t_1) + P(t_1 < \mathbf{x} \le t_2)$

Example 2: Telephone Call
$\Rightarrow P(t_1 < \mathbf{x} \le t_2) = P(\mathbf{x} \le t_2) - P(\mathbf{x} \le t_1)$
Thus, $P(t_1 < \mathbf{x} \le t_2) = F_{\mathbf{x}}(t_2) - F_{\mathbf{x}}(t_1) = \dfrac{t_2 - t_1}{T}$
Note
Now, suppose $t_1 = t - \varepsilon$ and $t_2 = t + \varepsilon$, where $\varepsilon > 0$ is a small number. Then
$$P(t - \varepsilon < \mathbf{x} \le t + \varepsilon) = \frac{2 \varepsilon}{T} \to 0 \quad \text{as } \varepsilon \to 0$$
• This is an example of a continuous random variable.

Continuous Random Variable
• A random variable, $\mathbf{x}$, is called continuous if there exists a density function $f_X(x) : \mathbb{R} \to \mathbb{R}$ such that for any $a, b \in \mathbb{R}$ with $a \le b$
$$P(a \le \mathbf{x} \le b) = \int_a^b f_X(x) \, dx \quad \ldots (1)$$
$f_X(x)$ must satisfy $f_X(x) \ge 0 \ \forall x$ and $\int_{-\infty}^{\infty} f_X(x) \, dx = 1$.
• The distribution function of a continuous RV, $\mathbf{x}$, is a function $F(\cdot) : \mathbb{R} \to [0, 1]$ defined by
$$F(a) = P(\mathbf{x} \le a) = \int_{-\infty}^{a} f_X(x) \, dx$$

Probability Density Function
[Figure: a typical $f_{\mathbf{x}}(x)$]
• Area under the probability density function $f_{\mathbf{x}}(x)$ in $[a, b]$ = probability that the RV $\mathbf{x}$ will lie in $[a, b]$.

Note (Dekking et al., 2005)
• If the interval gets progressively smaller, then the probability will tend to zero.
• For any arbitrarily small $\varepsilon > 0$, we have
$$P(a - \varepsilon \le \mathbf{x} \le a + \varepsilon) = \int_{a - \varepsilon}^{a + \varepsilon} f_X(x) \, dx$$
As $\varepsilon \to 0$, it follows that $P(\mathbf{x} = a) = 0$.
• This implies that for a continuous random variable, we need not be precise about the end points of the intervals:
$P(a \le \mathbf{x} \le b) = P(a < \mathbf{x} \le b) = P(a < \mathbf{x} < b) = P(a \le \mathbf{x} < b)$

Note
• For any arbitrarily small $\varepsilon > 0$, we can write
$$\int_{a - \varepsilon}^{a + \varepsilon} f_X(x) \, dx \approx 2 \varepsilon \, f_X(a)$$
• Thus, $f_X(a)$ can be interpreted as a (relative) measure of how likely it is that the RV, $\mathbf{x}$, will be near $a$.
• Important to note: In general, $f_X(a)$ can have a very large value and is NOT the value of the probability of $\mathbf{x}$ at $a$.
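
The approximation $\int_{a-\varepsilon}^{a+\varepsilon} f_X(x)\,dx \approx 2\varepsilon f_X(a)$ can be checked numerically for a density whose distribution function is known in closed form. The sketch below uses an exponential density with an assumed rate of 50, chosen purely for illustration so that the density exceeds 1 near the origin; the function names and parameter values are assumptions.

```python
import math

RATE = 50.0  # ASSUMPTION: illustrative rate parameter of an exponential density


def density(x):
    """f(x) = RATE * exp(-RATE * x) for x >= 0, zero otherwise."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def distribution(x):
    """F(x) = 1 - exp(-RATE * x) for x >= 0, zero otherwise."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


def prob_near(a, eps):
    """Exact P(a - eps <= x <= a + eps) = F(a + eps) - F(a - eps)."""
    return distribution(a + eps) - distribution(a - eps)


if __name__ == "__main__":
    a, eps = 0.01, 1e-4
    print("density at a      :", density(a))  # larger than 1, not a probability
    print("exact probability :", prob_near(a, eps))
    print("2 * eps * f(a)    :", 2 * eps * density(a))
```

The density value is well above 1 while the probability of the small interval stays tiny, which is the distinction the slide draws.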

Uniform Distribution
Generalization of the distribution appearing in the Telephone Call example:
$$F_{\mathbf{x}}(x) = \begin{cases} 0 & \text{if } -\infty < x < a \\ (x - a)/(b - a) & \text{if } a \le x \le b \\ 1 & \text{if } b < x < \infty \end{cases}$$
The distribution function is not differentiable at $a$ and at $b$.
Differentiating $F_{\mathbf{x}}(x)$, we get the uniform density function
$$f_{\mathbf{x}}(x) = \begin{cases} 0 & x < a \\ 1/(b - a) & a \le x \le b \\ 0 & x > b \end{cases}$$
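
A short sketch of the uniform distribution function and density follows, together with a finite-difference check that the density is the derivative of the distribution function away from the end points. The function names and the numerical values of $a$, $b$ and $T$ are assumptions for illustration.

```python
def uniform_cdf(x, a, b):
    """F(x) = 0 for x < a, (x - a)/(b - a) on [a, b], 1 for x > b."""
    if x < a:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return 1.0


def uniform_pdf(x, a, b):
    """f(x) = 1/(b - a) on [a, b], zero elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def numerical_derivative(func, x, h=1e-6):
    """Central difference approximation of d func / dx at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


if __name__ == "__main__":
    a, b = 2.0, 5.0  # illustrative end points
    x = 3.2
    print(uniform_cdf(x, a, b), uniform_pdf(x, a, b))
    print(numerical_derivative(lambda s: uniform_cdf(s, a, b), x))
    # Telephone Call example as the special case a = 0, b = T
    T = 60.0
    print(uniform_cdf(15.0, 0.0, T))
```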

Example: Chemical Reactor
• Consider steady state operation of a Continuously Stirred Tank Reactor (CSTR).
$V$ = reactor volume
$q$ = effluent volumetric flow rate = inlet feed flow rate
• Random variable of interest:
$\mathbf{x}$ = residence time of a particle in the vessel
• Assumption: perfect mixing, i.e. the particle's position is uniformly distributed over the volume.

Example: Chemical Reactor
• Suppose an elemental volume, $\Delta v$, entered the CSTR at $t = 0$.
Question: How long does $\Delta v$ stay in the reactor?
• Consider the interval $[0, t]$ when $t$ is not too large. Let the interval $[0, t]$ be divided into $n$ small intervals, each of equal length $\Delta t = t/n$.
• Consider a volume element $\Delta v = q \Delta t$ which enters the reactor during an interval $\Delta t$. (Since the system is at steady state, an equal amount leaves the reactor during $\Delta t$ and the total volume is still $V$.)

Example: Chemical Reactor
• Let us assume that what happens in each interval $\Delta t$ is qualitatively similar to the repeated coin toss experiments.
• Thus, during each $\Delta t$, the sample space $S$ consists of
$\omega_1 \equiv$ success ($\Delta v$ leaves the reactor)
$\omega_2 \equiv$ failure ($\Delta v$ stays in the reactor)
Since the reactor is assumed to be well mixed, the probability that $\Delta v$ leaves the vessel during any of the $n$ intervals of length $\Delta t$ is
$P(\omega_1) = p = (\Delta v / V) = q \Delta t / V = q t / (n V)$

Example: Chemical Reactor
• Thus, the probability that $\Delta v$ stays in the vessel during any of the $n$ intervals of length $\Delta t$ is
$P(\omega_2) = 1 - p$
Sample space: number of intervals (or "tosses") it takes to have the first occurrence of $\omega_1$
$\Rightarrow S = \{1, 2, 3, \ldots\}$
Thus, the probability that an elemental volume, $\Delta v$, which entered the CSTR at $t = 0$ is still in the vessel at least up to time $t$ is, for large $n$, well approximated by
$$P(\mathbf{x} > t) = P((\omega_2, \omega_2, \ldots, \omega_2)) = (1 - p)^n = \left( 1 - \frac{q t}{n V} \right)^n$$

Example: Chemical Reactor
By letting $n \to \infty$,
$$P(\mathbf{x} > t) = \lim_{n \to \infty} \left( 1 - \frac{q t}{V} \cdot \frac{1}{n} \right)^n = \exp\left( -\frac{q t}{V} \right)$$
It follows that the distribution function of $\mathbf{x}$ equals
$$F_{\mathbf{x}}(t) = P(\mathbf{x} \le t) = \begin{cases} 0 & \text{for } t < 0 \\ 1 - \exp\left( -\dfrac{q t}{V} \right) & \text{for } t \ge 0 \end{cases}$$
and the associated probability density function is
$$f_{\mathbf{x}}(t) = \begin{cases} 0 & \text{for } t < 0 \\ \dfrac{q}{V} \exp\left( -\dfrac{q t}{V} \right) & \text{for } t \ge 0 \end{cases}$$
This is an example of the exponential distribution.
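
The limiting step in the reactor derivation can be examined numerically by comparing $(1 - qt/(nV))^n$ with $\exp(-qt/V)$ for increasing $n$. The values of $q$, $V$ and $t$ below are hypothetical and chosen only so that $qt/V = 1$; the function names are likewise illustrative.

```python
import math


def survival_discrete(t, q, volume, n):
    """(1 - q t / (n V))^n: element still in the vessel after n sub-intervals."""
    return (1.0 - q * t / (n * volume)) ** n


def survival_limit(t, q, volume):
    """exp(-q t / V): the n -> infinity limit of the expression above."""
    return math.exp(-q * t / volume)


if __name__ == "__main__":
    q, volume, t = 2.0, 10.0, 5.0  # ASSUMED flow rate, volume and time
    limit = survival_limit(t, q, volume)
    for n in (10, 100, 10_000, 1_000_000):
        approx = survival_discrete(t, q, volume, n)
        print(f"n={n:>9}  (1-p)^n={approx:.8f}  error={abs(approx - limit):.2e}")
    print("exp(-qt/V) =", limit)
```

The error shrinks roughly in proportion to $1/n$, which is why the finite-$n$ expression is described as a good approximation for large $n$.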

Exponential Distribution
[Figure: probability density function $f_{\mathbf{x}}(t)$ and probability distribution function $F_{\mathbf{x}}(t)$, each plotted against $t$]

Exponential Distribution
• Probability distribution function
$$F_{\mathbf{x}}(t) = P(\mathbf{x} \le t) = \begin{cases} 0 & \text{for } t < 0 \\ 1 - \exp(-\lambda t) & \text{for } t \ge 0 \end{cases}$$
Probability density function
$$f_{\mathbf{x}}(t) = \begin{cases} 0 & \text{for } t < 0 \\ \lambda \exp(-\lambda t) & \text{for } t \ge 0 \end{cases}$$
• Useful in describing probabilities associated with many engineering problems.
• For example, $\mathbf{x}$ = lifetime of an equipment/component (i.e. the time elapsed before the equipment/component fails).

Points to Note
• A discrete RV does not have a probability density function and a continuous RV does not have a probability mass function.
• However, both have a distribution function and
$$F(a) = P(\mathbf{x} \le a) = P(\mathbf{x} \in (-\infty, a]) = \int_{-\infty}^{a} f_{\mathbf{x}}(x) \, dx$$
• For a continuous or discrete RV, it follows from integral calculus that
$$f_{\mathbf{x}}(x) = \frac{dF_{\mathbf{x}}(x)}{dx}$$
when the derivative exists.

Points to Note
• A continuous RV is defined using the integral equation (1) given in the definition of a continuous random variable. The definition does not require the density function to be continuous and differentiable at all points on $\mathbb{R}$. Thus, a valid probability density function may have discontinuities at isolated points on the real line.
• The histogram of a continuous RV can be viewed as an approximation of the probability density function. The relative frequency is an estimate of the probability that a measurement falls in the interval.

Histogram

A typical histogram of a continuous Random Variable [1]


Summary

• The modern axiomatic definition of probability facilitates rigorous mathematical treatment of random phenomena.
• A probability space consists of the triplet: (i) a sample space, (ii) a Borel field defined on the sample space, and (iii) a probability measure defined on each event in the Borel field.
• The concept of a random variable is introduced because we need a mapping from the sample space to the set of real numbers for carrying out quantitative analysis through a unified mathematical framework.


Summary

After defining a random variable mapping, we work with the probability space $(R, \mathcal{B}_R, F_x(x))$ for all problems.

Moreover, a generic probability function, $P(\cdot)$, is defined for all events of the form $A \equiv (-\infty, x]$:
$$F_x(x) = P((-\infty, x]) = \alpha \quad \text{where } 0 \le \alpha \le 1$$
for all problems.


Appendix
Conditional Probabilities
and Independence
(Dekking et al., 2005)


Conditional Probability

Note
The conditional probability function satisfies all the axioms of probability, and, thus, is a valid probability function in itself.

Multiplication rule for any events A and B
$$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$

Consider event A, which can be expressed as
$$A = (A \cap B) \cup (A \cap \bar{B})$$
where $\bar{B}$ is the complement of B.
$\Rightarrow$ the probability of event A
$$P(A) = P(A \cap B) + P(A \cap \bar{B}) = P(A \mid B)\,P(B) + P(A \mid \bar{B})\,P(\bar{B})$$

Conditional Probability

Consider a probability space $(S, \mathcal{B}, P)$.

Let A and B represent two events in $\mathcal{B}$.

Conditional Probability
Knowing that an event B has occurred sometimes
forces us to reassess the probability of event A. The
new probability is the conditional probability.

Independence
If the conditional probability of A equals what the
probability of A was before, then events A and B
are called independent.


Conditional Probability

The conditional probability of event A given that event B has occurred is defined as
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
provided $P(B) > 0$.

Note
$$P(\bar{A} \mid B) = \frac{P(\bar{A} \cap B)}{P(B)}$$
$$\Rightarrow P(A \mid B) + P(\bar{A} \mid B) = \frac{P(A \cap B)}{P(B)} + \frac{P(\bar{A} \cap B)}{P(B)} = \frac{P\big((A \cap B) \cup (\bar{A} \cap B)\big)}{P(B)} = \frac{P(B)}{P(B)} = 1$$
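The definition and the note above can be checked by direct enumeration on a small sample space. The sketch below is an illustration that assumes two fair dice with equally likely outcomes; the events A and B are hypothetical choices made for this example.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered outcomes of two fair dice (equally likely).
S = frozenset(product(range(1, 7), repeat=2))


def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(S))


def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b), defined only when P(b) > 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event has zero probability")
    return prob(a & b) / prob(b)


A = {s for s in S if sum(s) == 8}    # the two dice sum to 8
B = {s for s in S if s[0] % 2 == 0}  # the first die is even
A_c = S - A                          # complement of A

print("P(A)          =", prob(A))
print("P(A | B)      =", cond_prob(A, B))
print("P(A|B)+P(A'|B) =", cond_prob(A, B) + cond_prob(A_c, B))
```

Exact fractions are used instead of floating-point numbers so that identities such as the sum to one hold without rounding error.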

Example: Mad Cow Disease



Consider a test in which a cow is tested to determine infection with the “mad cow disease.” It is known that, using the specified test, an infected cow has a 70% chance of testing positive, and a healthy cow just 10%. It is also known that 2% of cows are infected. Find the probability that an arbitrary cow tests positive.
Event B : A randomly picked cow is infected
Event T : Test comes positive
$$P(T \mid B) = 0.7 \quad \text{and} \quad P(T \mid \bar{B}) = 0.1$$
Note: As no test is 100% accurate, most tests have the
problem of false positives and false negatives. A false
positive means that according to the test the cow is
infected, but in actuality it is not. A false negative means an
infected cow is not detected by the test.

Example: Mad Cow Disease



Note: $S = B \cup \bar{B}$
$$P(B) = 0.02 \Rightarrow P(\bar{B}) = 0.98$$
Problem is to find $P(T)$.

Since $T = (T \cap B) \cup (T \cap \bar{B})$,
$$P(T) = P(T \cap B) + P(T \cap \bar{B})$$
$$P(T) = P(T \mid B)\,P(B) + P(T \mid \bar{B})\,P(\bar{B}) = 0.7 \times 0.02 + 0.1 \times 0.98 = 0.112$$

This is an application of the law of total probability.
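The value of P(T) can also be estimated by simulation, which ties the result back to the relative-frequency view of probability. The following is a minimal Monte Carlo sketch: the herd size, the seed and the function name are hypothetical, and only the three probabilities come from the example.

```python
import random


def simulate_positive_rate(n_cows, p_infected=0.02, p_pos_infected=0.7,
                           p_pos_healthy=0.1, seed=0):
    """Estimate P(T) as the fraction of simulated cows that test positive."""
    rng = random.Random(seed)
    positives = 0
    for _ in range(n_cows):
        infected = rng.random() < p_infected                   # event B
        p_pos = p_pos_infected if infected else p_pos_healthy  # P(T|B) or P(T|not B)
        if rng.random() < p_pos:                               # event T
            positives += 1
    return positives / n_cows


estimate = simulate_positive_rate(200_000)
exact = 0.7 * 0.02 + 0.1 * 0.98
print(f"simulated P(T) = {estimate:.4f}, exact P(T) = {exact:.4f}")
```

The simulated fraction should lie close to the exact value, with the discrepancy shrinking as the number of simulated cows grows.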


Law of Total Probability



Computing a probability through conditioning on several disjoint events that make up the whole sample space

Suppose $C_1, C_2, \ldots, C_m$ are disjoint events such that $C_1 \cup C_2 \cup \cdots \cup C_m = S$. Then, the probability of an arbitrary event A can be expressed as:
$$\begin{aligned} P(A) &= P(A \cap C_1) + P(A \cap C_2) + \cdots + P(A \cap C_m) \\ &= P(A \mid C_1)\,P(C_1) + P(A \mid C_2)\,P(C_2) + \cdots + P(A \mid C_m)\,P(C_m) \end{aligned}$$
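The general formula translates directly into a short function. This is a sketch under stated assumptions: `total_probability` is an illustrative name, and the caller is assumed to supply P(A | C_i) and P(C_i) for events C_i that really are disjoint and cover S; the code can only check that the P(C_i) sum to one.

```python
from fractions import Fraction


def total_probability(cond_probs, partition_probs):
    """Return P(A) = sum of P(A | C_i) * P(C_i) over a partition C_1..C_m."""
    if len(cond_probs) != len(partition_probs):
        raise ValueError("need one conditional probability per partition event")
    if sum(partition_probs) != 1:
        raise ValueError("partition probabilities must sum to 1")
    return sum(pa * pc for pa, pc in zip(cond_probs, partition_probs))


# Mad cow example: partition {B, complement of B}, event A = T.
p_T = total_probability(
    [Fraction(7, 10), Fraction(1, 10)],     # P(T | B), P(T | not B)
    [Fraction(2, 100), Fraction(98, 100)],  # P(B), P(not B)
)
print(p_T, float(p_T))  # 14/125 0.112
```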

Law of Total Probability



The law of total probability (illustration for m = 5).


Mad Cow Example (contd.)



A more pertinent question about the mad cow disease test is the following:
Suppose a cow tests positive; what is the probability it really has the mad cow disease?

In mathematical terms, what is $P(B \mid T)$?
$$\begin{aligned} P(B \mid T) &= \frac{P(T \cap B)}{P(T)} = \frac{P(T \cap B)}{P(T \cap B) + P(T \cap \bar{B})} \\ &= \frac{P(T \mid B)\,P(B)}{P(T \mid B)\,P(B) + P(T \mid \bar{B})\,P(\bar{B})} \\ &= \frac{0.7 \times 0.02}{0.7 \times 0.02 + 0.1 \times 0.98} = 0.125 \end{aligned}$$
(Dekking et al., 2005)
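The same number can be reached by counting cows in an imagined herd, which some readers find easier to follow than the formula. The herd size below is a hypothetical choice that makes every count a whole number; the percentages are those of the example.

```python
herd = 10_000  # hypothetical herd size

infected = herd * 2 // 100                # 2% of the herd is infected
healthy = herd - infected                 # the rest are healthy
true_positives = infected * 70 // 100     # infected cows that test positive
false_positives = healthy * 10 // 100     # healthy cows that test positive
all_positives = true_positives + false_positives

# Among the cows that test positive, the fraction that is really infected.
p_infected_given_positive = true_positives / all_positives

print(f"{true_positives} of {all_positives} positive cows are infected")
print("P(B | T) =", p_infected_given_positive)
```

Most positive results come from the large healthy group, which is why the probability of infection given a positive test stays far below 70%.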

Example: Mad Cow Disease



Interpretation
If we know nothing about a cow, we would say
that there is a 2% chance it is infected.
However, if we know it tested positive, then we
can say there is a 12.5% chance the cow is
infected.

Finding $P(B \mid T)$ using $P(T \mid B)$ is an application of Bayes' Rule, derived by the English clergyman Thomas Bayes in the 18th century.


Bayes’ Rule

Suppose $C_1, C_2, \ldots, C_m$ are disjoint events such that $C_1 \cup C_2 \cup \cdots \cup C_m = S$. Then, the conditional probability of $C_i$, given an arbitrary event A, can be expressed as:
$$P(C_i \mid A) = \frac{P(A \mid C_i)\,P(C_i)}{P(A \mid C_1)\,P(C_1) + \cdots + P(A \mid C_m)\,P(C_m)} = \frac{P(A \mid C_i)\,P(C_i)}{P(A)}$$
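Bayes' rule for a general partition can be sketched as a function that returns all m conditional probabilities at once. The name `bayes_posteriors` and its argument layout are illustrative assumptions; the inputs are P(A | C_i) and P(C_i), and the denominator P(A) is obtained from the law of total probability.

```python
from fractions import Fraction


def bayes_posteriors(likelihoods, priors):
    """Return [P(C_1 | A), ..., P(C_m | A)] from P(A | C_i) and P(C_i)."""
    if len(likelihoods) != len(priors):
        raise ValueError("need one likelihood per partition event")
    joint = [l * p for l, p in zip(likelihoods, priors)]  # P(A and C_i)
    p_a = sum(joint)                                      # P(A), total probability
    if p_a == 0:
        raise ValueError("P(A) is zero, so conditioning on A is undefined")
    return [j / p_a for j in joint]


# Mad cow example: C_1 = B (infected), C_2 = complement of B, A = T.
posteriors = bayes_posteriors(
    [Fraction(7, 10), Fraction(1, 10)],  # P(T | B), P(T | not B)
    [Fraction(1, 50), Fraction(49, 50)], # P(B) = 0.02, P(not B) = 0.98
)
print(posteriors[0], float(posteriors[0]))  # 1/8 0.125
```

The returned values always sum to one, since they are the conditional probabilities of a partition given A.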


Independence

An event A is called independent of B if
$$P(A \mid B) = P(A)$$

Result 1: A independent of B $\Rightarrow$ $\bar{A}$ independent of B
$$P(\bar{A} \mid B) = 1 - P(A \mid B) = 1 - P(A) = P(\bar{A})$$

Result 2: A independent of B $\Rightarrow$ $P(A \cap B) = P(A)\,P(B)$
By application of the multiplication rule, if A is independent of B,
$$P(A \cap B) = P(A \mid B)\,P(B) = P(A)\,P(B)$$


Independence

Result 3: A independent of B $\Rightarrow$ B independent of A
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A)\,P(B)}{P(A)} = P(B)$$
To show that A and B are independent it suffices to prove just one of the following:
$$P(A \mid B) = P(A)$$
$$P(B \mid A) = P(B)$$
$$P(A \cap B) = P(A)\,P(B)$$
($A$ may be replaced by $\bar{A}$ and $B$ may be replaced by $\bar{B}$.)
If one of these statements holds, all of them are true.
If two events are not independent, they are called dependent.
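The equivalent conditions for independence can be checked by enumeration. The sketch below assumes two fair dice with equally likely outcomes; the events A, B and C are hypothetical choices, picked so that one pair is independent and another is dependent.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered outcomes of two fair dice (equally likely).
S = frozenset(product(range(1, 7), repeat=2))


def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(S))


def independent(a, b):
    """Check the product condition P(a and b) = P(a) P(b)."""
    return prob(a & b) == prob(a) * prob(b)


A = {s for s in S if s[0] % 2 == 0}  # the first die is even
B = {s for s in S if sum(s) == 7}    # the two dice sum to 7
C = {s for s in S if sum(s) == 8}    # the two dice sum to 8

print("A, B independent:            ", independent(A, B))
print("complement of A, B independent:", independent(S - A, B))
print("P(A | B) == P(A):            ", prob(A & B) / prob(B) == prob(A))
print("P(B | A) == P(B):            ", prob(A & B) / prob(A) == prob(B))
print("A, C independent:            ", independent(A, C))
```

Knowing that the sum is 7 says nothing about whether the first die is even, whereas knowing that the sum is 8 does change that probability.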
Independence: Multiple Events

Events $A_1, A_2, \ldots, A_n$ are called independent if
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2) \cdots P(A_n)$$
and this statement also holds when any number of the events $A_1, \ldots, A_n$ are replaced by their complements throughout the formula.

Note:
If A and B are independent and B and C are independent, it does NOT follow that A and C are independent.
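A concrete counterexample makes the note easier to accept. The sketch below again assumes two fair dice; the events A, B and C are hypothetical choices, and `mutually_independent` is an illustrative helper that tests the product formula for every pattern of complements, as in the definition above.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered outcomes of two fair dice (equally likely).
S = frozenset(product(range(1, 7), repeat=2))


def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(S))


def independent(a, b):
    """Check the product condition for two events."""
    return prob(a & b) == prob(a) * prob(b)


def mutually_independent(events):
    """Check the product formula for every choice of complements."""
    for pattern in product((False, True), repeat=len(events)):
        intersection = set(S)
        product_of_probs = Fraction(1)
        for event, use_complement in zip(events, pattern):
            chosen = S - event if use_complement else event
            intersection &= chosen
            product_of_probs *= prob(chosen)
        if prob(intersection) != product_of_probs:
            return False
    return True


A = {s for s in S if s[0] % 2 == 0}  # the first die is even
B = {s for s in S if s[1] % 2 == 0}  # the second die is even
C = {s for s in S if s[0] == 2}      # the first die shows 2

print("A, B independent:", independent(A, B))
print("B, C independent:", independent(B, C))
print("A, C independent:", independent(A, C))
print("A, B, C mutually independent:", mutually_independent([A, B, C]))
```

Here A and C both concern the first die, so they are dependent even though each is independent of B.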

References

1. Papoulis, A., Probability, Random Variables and Stochastic Processes, McGraw-Hill International, 1991.
2. Dekking, F. M., Kraaikamp, C., Lopuhaä, H. P., Meester, L. E., A Modern Introduction to Probability and Statistics: Understanding Why and How, Springer, 2005.
3. Montgomery, D. C. and Runger, G. C., Applied Statistics and Probability for Engineers, John Wiley and Sons, 2004.
4. Ross, S. M., Introduction to Probability and Statistics for Engineers and Scientists, 4th Edition, Elsevier, 2009.
5. Maybeck, P. S., Stochastic Models, Estimation, and Control: Volume 1, Academic Press, 1979.
6. Jazwinski, A. H., Stochastic Processes and Filtering Theory, Academic Press, 1970.
