Probability
Dhanya N.M.
$X = \{1, 4, 7, 9\}$
$Y = \{2, 3, 4, 5, 6\}$
$X \cap Y = \{4\}$
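As a quick check of the set example above, the same sets can be written with Python's built-in `set` type. This is only a sketch; the variable names `intersection` and `union` are illustrative and not part of the slides.

```python
# Sets copied from the slide.
X = {1, 4, 7, 9}
Y = {2, 3, 4, 5, 6}

intersection = X & Y   # elements that are in both X and Y
union = X | Y          # elements that are in X, in Y, or in both

print(sorted(intersection))  # [4]
print(sorted(union))         # [1, 2, 3, 4, 5, 6, 7, 9]
```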
[Figure: events $E_1$, $E_2$, $E_3$]
$P(\text{Sample Space}) = 1$
$P(\overline{A}) = 1 - P(A)$
$\binom{N}{n} = \frac{N!}{n!\,(N - n)!} = \frac{1000!}{3!\,(1000 - 3)!} = 166{,}167{,}000$
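The combination count above can be reproduced in a few lines. This sketch assumes Python 3.8 or later, where `math.comb` is available. It evaluates the factorial formula directly and compares it with the library call.

```python
import math

N, n = 1000, 3  # population size and sample size from the slide

# Direct use of N! / (n! (N - n)!), with integer division to stay exact.
by_formula = math.factorial(N) // (math.factorial(n) * math.factorial(N - n))

# The same count from the standard library.
by_comb = math.comb(N, n)

print(by_formula, by_comb)  # 166167000 166167000
```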
• $P(X)$: the probability of X occurring
• $P(X \cup Y)$: the probability of X or Y occurring
• $P(X \cap Y)$: the probability of X and Y occurring
• $P(X \mid Y)$: the probability of X occurring given that Y has occurred
$P(X \cup Y) = P(X) + P(Y) - P(X \cap Y)$
$P(N \cup S) = P(N) + P(S) - P(N \cap S)$
$P(N) = .70$, $P(S) = .67$, $P(N \cap S) = .56$
$P(N \cup S) = .70 + .67 - .56 = .81$
                         Increase Storage Space
                         Yes     No      Total
Noise Reduction   Yes    .56     .14     .70
                  No     .11     .19     .30
                  Total  .67     .33     1.00
$P(N \cup S) = P(N) + P(S) - P(N \cap S) = .70 + .67 - .56 = .81$
$P(N \cup S) = .56 + .14 + .11 = .81$
$P(\overline{X} \cap \overline{Y}) = 1 - P(X \cup Y)$
$P(\overline{N} \cap \overline{S}) = 1 - P(N \cup S) = 1 - .81 = .19$
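The addition-law and "neither" calculations above can be checked against the joint probability table. The sketch below assumes the table is stored as a dictionary keyed by (noise reduction, increase storage space). The dictionary layout and variable names are illustrative choices, not something the slides prescribe.

```python
# Joint probabilities from the slide's table,
# keyed by (noise_reduction, increase_storage_space).
joint = {
    ("yes", "yes"): 0.56,
    ("yes", "no"): 0.14,
    ("no", "yes"): 0.11,
    ("no", "no"): 0.19,
}

# Marginal probabilities: sum the row or column where the answer is "yes".
p_n = sum(p for (n, _), p in joint.items() if n == "yes")
p_s = sum(p for (_, s), p in joint.items() if s == "yes")
p_n_and_s = joint[("yes", "yes")]

# General law of addition.
p_union_law = p_n + p_s - p_n_and_s

# Same union, obtained by adding every cell where N or S is "yes".
p_union_cells = sum(p for (n, s), p in joint.items() if n == "yes" or s == "yes")

# Probability that neither N nor S occurs.
p_neither = 1 - p_union_law

print(round(p_union_law, 2), round(p_union_cells, 2), round(p_neither, 2))
```

Both routes to the union agree, and the "neither" value matches the No/No cell of the table.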
Foundations of Data Science
Special Law of Addition
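If X and Y are mutually exclusive, $P(X \cap Y) = 0$, so
$P(X \cup Y) = P(X) + P(Y)$
$P(P \cup C) = P(P) + P(C) = \frac{44}{155} + \frac{31}{155} = .484$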
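The arithmetic in this example can be reproduced exactly with rational numbers. This is a minimal sketch that assumes, as the special law requires, that events P and C cannot occur together. The variable names are illustrative.

```python
from fractions import Fraction

# Probabilities from the slide: 44 and 31 out of 155.
p_p = Fraction(44, 155)
p_c = Fraction(31, 155)

# Special law of addition: valid only when P and C are mutually exclusive.
p_p_or_c = p_p + p_c

print(p_p_or_c, round(float(p_p_or_c), 3))  # 15/31 0.484
```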
Law of Multiplication
Demonstration Problem 4.5
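$P(M \cap \overline{S}) = P(M) - P(M \cap S) = 0.5714 - 0.1143 = 0.4571$
$P(\overline{S}) = 1 - P(S) = 1 - 0.2143 = 0.7857$
$P(\overline{M} \cap S) = P(S) - P(M \cap S) = 0.2143 - 0.1143 = 0.1000$
$P(\overline{M} \cap \overline{S}) = P(\overline{S}) - P(M \cap \overline{S}) = 0.7857 - 0.4571 = 0.3286$
$P(\overline{M}) = 1 - P(M) = 1 - 0.5714 = 0.4286$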
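The remaining cells of the joint probability table follow from three inputs by subtraction. The sketch below takes P(M), P(S), and P(M and S) from the worked values above. The `not_` naming for complements is an illustrative convention.

```python
# Inputs taken from the slide's worked values.
p_m = 0.5714         # P(M)
p_s = 0.2143         # P(S)
p_m_and_s = 0.1143   # P(M and S)

# Each remaining cell is a marginal minus the joint cell already known.
p_m_not_s = p_m - p_m_and_s          # P(M and not S)
p_not_s = 1 - p_s                    # P(not S)
p_not_m_s = p_s - p_m_and_s          # P(not M and S)
p_not_m_not_s = p_not_s - p_m_not_s  # P(not M and not S)
p_not_m = 1 - p_m                    # P(not M)

# The four joint cells should account for the whole sample space.
cells = [p_m_and_s, p_m_not_s, p_not_m_s, p_not_m_not_s]
print([round(c, 4) for c in cells], round(sum(cells), 4))
```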
Special Law of Multiplication for Independent Events
• General Law
$P(X \cap Y) = P(X) \cdot P(Y \mid X) = P(Y) \cdot P(X \mid Y)$
• Special Law
If events X and Y are independent,
$P(X) = P(X \mid Y)$ and $P(Y) = P(Y \mid X)$.
Consequently,
$P(X \cap Y) = P(X) \cdot P(Y)$
Law of Conditional Probability
• The conditional probability of X given Y is
the joint probability of X and Y divided by
the marginal probability of Y.
$P(X \mid Y) = \frac{P(X \cap Y)}{P(Y)} = \frac{P(Y \mid X) \cdot P(X)}{P(Y)}$
$P(N) = .70$, $P(N \cap S) = .56$
$P(S \mid N) = \frac{P(N \cap S)}{P(N)} = \frac{.56}{.70} = .80$
Reduced sample space for “Increase Storage Space” = “Yes”:
$P(\overline{N} \cap S) = .11$, $P(S) = .67$
$P(\overline{N} \mid S) = \frac{P(\overline{N} \cap S)}{P(S)} = \frac{.11}{.67} = .164$
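Both conditional probabilities above come from the same division of a joint probability by a marginal probability. A minimal sketch follows. The helper name `conditional` is hypothetical, introduced only for illustration.

```python
def conditional(p_joint: float, p_given: float) -> float:
    """Return P(A | B) = P(A and B) / P(B)."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given

# P(S | N): joint .56 divided by marginal P(N) = .70.
p_s_given_n = conditional(0.56, 0.70)

# P(not N | S): joint .11 divided by marginal P(S) = .67.
p_not_n_given_s = conditional(0.11, 0.67)

print(round(p_s_given_n, 2), round(p_not_n_given_s, 3))  # 0.8 0.164
```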
Independent Events
• If X and Y are independent events, the
occurrence of Y does not affect the
probability of X occurring.
• If X and Y are independent events, the
occurrence of X does not affect the
probability of Y occurring.
If X and Y are independent events,
$P(X \mid Y) = P(X)$, and
$P(Y \mid X) = P(Y)$.
Independent Events
Demonstration Problem 4.10
                 Geographic Location
                 Northeast  Southeast  Midwest  West
                 D          E          F        G      Total
Finance A        .12        .05        .04      .07    .28

$P(A \mid G) = \frac{P(A \cap G)}{P(G)} = \frac{0.07}{0.21} = 0.33$, $P(A) = 0.28$
$P(A \mid G) = 0.33 \neq P(A) = 0.28$, so A and G are not independent.
Independent Events
Demonstration Problem 4.11
         D     E     Total
A        8     12    20
B        20    30    50
C        6     9     15
Total    34    51    85

$P(A \mid D) = \frac{8}{34} = .2353$
$P(A) = \frac{20}{85} = .2353$
$P(A \mid D) = P(A) = 0.2353$
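The independence check in this problem can be run on the raw counts with exact fractions, which avoids rounding questions. The sketch below tests both forms of the condition, P(A | D) = P(A) and P(A and D) = P(A) · P(D). The helper names are hypothetical.

```python
from fractions import Fraction

# Counts from the slide's table: rows A, B, C and columns D, E.
counts = {
    "A": {"D": 8, "E": 12},
    "B": {"D": 20, "E": 30},
    "C": {"D": 6, "E": 9},
}
total = sum(sum(row.values()) for row in counts.values())  # 85

def p_row(r):
    """Marginal probability of row event r."""
    return Fraction(sum(counts[r].values()), total)

def p_col(c):
    """Marginal probability of column event c."""
    return Fraction(sum(row[c] for row in counts.values()), total)

def p_joint(r, c):
    """Joint probability of row event r and column event c."""
    return Fraction(counts[r][c], total)

def is_independent(r, c):
    """True when P(r and c) equals P(r) * P(c) exactly."""
    return p_joint(r, c) == p_row(r) * p_col(c)

p_a_given_d = p_joint("A", "D") / p_col("D")
print(p_a_given_d, p_row("A"), is_independent("A", "D"))  # 4/17 4/17 True
```

With these counts every row and column pair passes the same test, because each row splits between D and E in the same 2 : 3 ratio.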
$P(X_i \mid Y) = \frac{P(Y \mid X_i) \cdot P(X_i)}{P(Y \mid X_1) \cdot P(X_1) + P(Y \mid X_2) \cdot P(X_2) + \cdots + P(Y \mid X_n) \cdot P(X_n)}$
Event    $P(E_i)$   $P(d \mid E_i)$   $P(E_i \cap d)$   $P(E_i \mid d)$
Alamo    0.65       0.08              0.052             0.052 / 0.094 = 0.553
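The Bayes' rule table can be sketched as a small function that multiplies priors by conditionals, sums the joints, and divides. Only the Alamo row and the 0.094 total appear above. The second event, labelled `"other"` here with prior 0.35 and conditional 0.12, is an assumption chosen so that the joints sum to the 0.094 shown, and should not be read as data from the slides. The function name `bayes_posteriors` is likewise hypothetical.

```python
def bayes_posteriors(priors, likelihoods):
    """Return (posteriors, evidence) for mutually exclusive, exhaustive events."""
    joints = {e: priors[e] * likelihoods[e] for e in priors}  # P(E_i and d)
    evidence = sum(joints.values())                           # P(d)
    posteriors = {e: j / evidence for e, j in joints.items()} # P(E_i | d)
    return posteriors, evidence

# Alamo values are from the slide; the "other" row is ASSUMED (see lead-in).
priors = {"Alamo": 0.65, "other": 0.35}
likelihoods = {"Alamo": 0.08, "other": 0.12}

posteriors, evidence = bayes_posteriors(priors, likelihoods)
print(round(evidence, 3), round(posteriors["Alamo"], 3))  # 0.094 0.553
```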
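$P[(C_1 \cap R_2 \cap R_3) \cup (R_1 \cap C_2 \cap R_3) \cup (R_1 \cap R_2 \cap C_3)]$
$= P(C_1 \cap R_2 \cap R_3) + P(R_1 \cap C_2 \cap R_3) + P(R_1 \cap R_2 \cap C_3)$
$= \frac{9}{64} + \frac{9}{64} + \frac{9}{64} = \frac{27}{64}$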
${}_{n}C_{r} = \binom{n}{r} = \frac{n!}{r!\,(n - r)!} = \frac{4!}{2!\,(4 - 2)!} = 6$
• Probability of a sequence containing exactly 2 erroneous tax returns
$6 \times \frac{55{,}552}{5{,}527{,}200} \approx 0.06$
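The final step above multiplies the number of orderings by the probability of one ordering. The sketch below reproduces that arithmetic only. The single-sequence fraction is copied from the slide rather than derived, and the variable names are illustrative.

```python
import math
from fractions import Fraction

n_returns, n_erroneous = 4, 2  # values used in the slide's 4C2 calculation

# Number of distinct positions the 2 erroneous returns can occupy.
n_orderings = math.comb(n_returns, n_erroneous)

# Probability of one specific ordering, copied from the slide as-is.
p_one_sequence = Fraction(55_552, 5_527_200)

# Orderings are mutually exclusive, so their probabilities add.
p_exactly_two = n_orderings * p_one_sequence

print(n_orderings, round(float(p_exactly_two), 2))  # 6 0.06
```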