If Pr(A) > 0, then Pr(AB) = Pr(A)Pr(B|A) (the multiplication rule). Note that if Pr(AB) > 0, then Pr(A) > 0 and Pr(B) > 0, and we have Pr(AB) = Pr(A)Pr(B|A) = Pr(B)Pr(A|B).
Corollary 4.2. Given events A1, . . . , An and B with Pr(A1A2 · · · An B) > 0,
Pr(A1A2 · · · An|B) = Pr(A1|B)Pr(A2|A1B)Pr(A3|A1A2B) · · · Pr(An|A1A2 · · · An−1B).
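As a concrete numerical check of the multiplication rule and its chain-rule extension, the following Python sketch compares the product Pr(A1)Pr(A2|A1)Pr(A3|A1A2) with the probability obtained by direct enumeration; the small deck of four aces and four kings is a hypothetical example chosen only for illustration.

    from itertools import permutations
    from fractions import Fraction

    # A hypothetical small deck: four aces and four kings.
    deck = ["A"] * 4 + ["K"] * 4

    # Exact probability that three ordered draws (without replacement) are all aces,
    # obtained by enumerating every ordered draw of three cards.
    draws = list(permutations(range(len(deck)), 3))
    favorable = sum(all(deck[i] == "A" for i in draw) for draw in draws)
    direct = Fraction(favorable, len(draws))

    # The same probability via the chain rule:
    # Pr(A1 A2 A3) = Pr(A1) Pr(A2|A1) Pr(A3|A1 A2).
    chain = Fraction(4, 8) * Fraction(3, 7) * Fraction(2, 6)

    print(direct, chain, direct == chain)   # 1/14 1/14 True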
Intuitively we say that the events A and B are independent (stochastically indepen-
dent) when knowing that B has occurred has no effect on the probability of occurrence
of A, i.e. when Pr(A) = Pr(A|B). For mathematical convenience the formal definition of independence is in terms of a product, Pr(AB) = Pr(A)Pr(B), so that it does not depend on the existence of conditional probabilities.
Note that if two events are disjoint (mutually exclusive), then they cannot occur at
the same time; thus if A and B are mutually exclusive, then they cannot be independent unless at least one of them has probability zero.
Theorem 4.4. If the events A and B are independent, then: the events A and B^c are independent; the events A^c and B are independent; and the events A^c and B^c are independent.
Proof. Let the independent events A and B be given. We will show that A and B^c are independent; the other results are proved analogously. Theorem 2.5 implies that Pr(AB^c) = Pr(A) − Pr(AB). Thus the independence of A and B implies that Pr(AB^c) = Pr(A) − Pr(A)Pr(B) = Pr(A)(1 − Pr(B)) = Pr(A)Pr(B^c), which establishes the result. □
Example. Let Pr(ω) = 1/8 for ω ∈ Ω = {1, 2, 3, 4, 5, 6, 7, 8}, let A = {1, 2, 3, 4}, B = {1, 2, 5, 6}, and C = {1, 2, 7, 8}. Then Pr(A) = Pr(B) = Pr(C) = 1/2 and Pr(AB) = Pr(AC) = Pr(BC) = 1/4 = (1/2)^2. But Pr(ABC) = 1/4 ≠ (1/2)^3. Thus, in this example, the events A, B, and C are pairwise independent but not mutually independent.
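The probabilities in this example are easy to verify numerically. The following Python sketch, with illustrative helper names, computes them directly from the uniform distribution on Ω.

    from itertools import combinations

    # Uniform distribution on Omega = {1, ..., 8}.
    omega = set(range(1, 9))

    def prob(event):
        """Probability of an event under the uniform distribution on omega."""
        return len(event & omega) / len(omega)

    A = {1, 2, 3, 4}
    B = {1, 2, 5, 6}
    C = {1, 2, 7, 8}
    events = {"A": A, "B": B, "C": C}

    # Pairwise independence: Pr(XY) = Pr(X)Pr(Y) for every pair.
    for (nx, X), (ny, Y) in combinations(events.items(), 2):
        print(nx, ny, prob(X & Y), prob(X) * prob(Y))   # 0.25 and 0.25 each time

    # Mutual independence fails: Pr(ABC) differs from Pr(A)Pr(B)Pr(C).
    print(prob(A & B & C), prob(A) * prob(B) * prob(C))  # 0.25 versus 0.125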
Recall that if Pr(AB) > 0, then A and B are independent if, and only if, Pr(A|B) =
Pr(A) and Pr(B|A) = Pr(B). A similar result holds for a collection of events.
Specifically, if the events A1, . . . , An satisfy Pr(A1A2 · · · An) > 0, then A1, . . . , An are mutually independent if, and only if,
Pr(Ai1 · · · Aia |Aj1 · · · Ajb ) = Pr(Ai1 · · · Aia )
for all nonempty disjoint subsets {i1, . . . , ia} and {j1, . . . , jb} of {1, . . . , n}.
Definition. Given events A, B, and C with Pr(ABC) > 0, the events A and B are said
to be conditionally independent given the event C when Pr(AB|C) = Pr(A|C)Pr(B|C).
Theorem 4.6 (The law of total probability). If the events B1 , . . . , Bn form a partition
of Ω, i.e. if BiBj = ∅ for all i ≠ j and Ω = B1 ∪ · · · ∪ Bn, then, for any event A,
Pr(A) = ∑_{i=1}^{n} Pr(ABi).
Corollary 4.7. If the events B1 , . . . , Bn form a partition of Ω, and Pr(Bi ) > 0 for i =
1, . . . , n, then
Pr(A) = ∑_{i=1}^{n} Pr(A|Bi)Pr(Bi).
If, in addition, Pr(A) > 0, then Bayes’ theorem gives
Pr(B1|A) = Pr(A|B1)Pr(B1) / ∑_{i=1}^{n} Pr(A|Bi)Pr(Bi),
and the analogous formula holds with B1 replaced by any of the Bi.
Bayes’ theorem is particularly useful for a situation where the occurrence of event
A follows the occurrence of one of the events Bi in time and we are interested in the
conditional probability that a particular Bi , say B1 , has occurred given that event A has
occurred.
Note that if A and B are events and 0 < Pr(B) < 1, then the events B and B^c form a partition of Ω. Thus Pr(A) = Pr(AB) + Pr(AB^c) and Bayes’ theorem reduces to
Pr(B|A) = Pr(AB)/(Pr(AB) + Pr(AB^c)) = Pr(A|B)Pr(B)/(Pr(A|B)Pr(B) + Pr(A|B^c)Pr(B^c)).
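The two-event form of Bayes’ theorem translates directly into code. The following Python sketch is a minimal version; the function name and the diagnostic-test style numbers in the final lines are assumptions made purely for illustration.

    from fractions import Fraction

    def bayes_two_event(p_b, p_a_given_b, p_a_given_bc):
        """Posterior Pr(B|A) from the two-event form of Bayes' theorem.

        p_b          -- prior Pr(B), with 0 < Pr(B) < 1
        p_a_given_b  -- Pr(A|B)
        p_a_given_bc -- Pr(A|B^c)
        """
        numerator = p_a_given_b * p_b
        return numerator / (numerator + p_a_given_bc * (1 - p_b))

    # Hypothetical numbers: Pr(B) = 1/100, Pr(A|B) = 95/100, Pr(A|B^c) = 5/100.
    print(bayes_two_event(Fraction(1, 100), Fraction(95, 100), Fraction(5, 100)))
    # 19/118, roughly 0.16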
Example. Consider a box containing 100 balls of which 20 are labeled A, 30 are labeled B,
and 50 are labeled C, and three other boxes labeled A, B, and C such that: box A contains
8 red and 2 green balls; box B contains 7 red and 3 green balls; and, box C contains 6 red
and 4 green balls. Now suppose that a ball is chosen at random from the box containing
100 balls, the letter (A, B, C) on the ball is noted, and then a ball is chosen at random
from the 10 balls in the box with the appropriate letter label. Obviously, the conditional
probabilities of choosing a red ball given the letter label are: Pr(R|A) = .8, Pr(R|B) = .7,
and Pr(R|C) = .6. It is also obvious that the probabilities of selecting the label (A, B, C)
are: Pr(A) = .2, Pr(B) = .3, and Pr(C) = .5. The values of conditional probabilities of the
form Pr(A|R), the conditional probability that the ball was selected from box A given that
it was red, are less obvious. However, these conditional probabilities are readily computed
using Bayes’ Theorem. Thus
Pr(R) = Pr(R|A)Pr(A) + Pr(R|B)Pr(B) + Pr(R|C)Pr(C) = .16 + .21 + .30 = .67, so that
Pr(A|R) = Pr(R|A)Pr(A)/Pr(R) = 16/67 ≈ .24,
Pr(B|R) = Pr(R|B)Pr(B)/Pr(R) = 21/67 ≈ .31,
Pr(C|R) = Pr(R|C)Pr(C)/Pr(R) = 30/67 ≈ .45.
It is interesting to compare the unconditional probabilities of drawing from boxes A, B, and
C, Pr(A) = .2, Pr(B) = .3, and Pr(C) = .5, to the corresponding conditional probabilities
given that the ball drawn is known to be red, Pr(A|R) = 16/67 ≈ .24, Pr(B|R) = 21/67 ≈ .31, and Pr(C|R) = 30/67 ≈ .45. The initial probabilities (before we obtain the additional information
that the ball drawn was red) are known as prior probabilities and the updated probabilities
(conditional on the added information) are known as posterior probabilities.
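The prior-to-posterior updating in this example can be reproduced with a few lines of Python. The sketch below uses exact fractions; the dictionary names are illustrative only.

    from fractions import Fraction

    # Prior probabilities of the letter labels and the conditional probabilities of red.
    prior = {"A": Fraction(20, 100), "B": Fraction(30, 100), "C": Fraction(50, 100)}
    p_red_given = {"A": Fraction(8, 10), "B": Fraction(7, 10), "C": Fraction(6, 10)}

    # Law of total probability: Pr(R) is the sum of Pr(R|label)Pr(label).
    p_red = sum(p_red_given[k] * prior[k] for k in prior)
    print(p_red)   # 67/100

    # Bayes' theorem: posterior probability of each label given that the ball is red.
    for label in prior:
        print(label, p_red_given[label] * prior[label] / p_red)
    # A 16/67, B 21/67, C 30/67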
Example. Suppose balls (objects) are selected at random with replacement from a popu-
lation of N balls, of which N1 are red (R), N2 are green (G), and N3 = N − N1 − N2 are
black (B), sequentially until either a red ball is selected or a green ball is selected. In this
context the elementary outcomes can be represented by finite sequences of the form R, G,
BR, BG, BBR, BBG, . . ., i.e. sequences of the form B . . . BR or B . . . BG. What is the
probability that a red ball will be selected before a green ball is selected? Reasoning as in
the geometric distribution example of Section 3.3, it is clear that
Pr(red before green) = ∑_{x=0}^{∞} (N3/N)^x (N1/N) = N1/(N − N3) = N1/(N1 + N2),
and analogously
Pr(green before red) = N2/(N1 + N2).
There is an interesting connection between these probabilities and certain conditional
probabilities defined in terms of the selection of a single ball from this population. Note that
if one ball is selected at random, then Pr(R) = N1/N, Pr(G) = N2/N, and Pr(R or G) = (N1 + N2)/N.
Thus, the conditional probability of selecting a red ball given that the ball selected is red
or green is Pr(R|R or G) = N1/(N1 + N2). Similarly, the conditional probability of selecting a green ball given that the ball selected is red or green is Pr(G|R or G) = N2/(N1 + N2).
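In other words, the probability that a red ball is selected before a green ball coincides with the conditional probability Pr(R|R or G) for a single draw. The following Python sketch checks the geometric-series computation against the closed form N1/(N1 + N2); the particular counts N1 = 3, N2 = 5, and N3 = 12 are hypothetical.

    from fractions import Fraction

    # Hypothetical population: N1 red, N2 green, N3 black balls.
    N1, N2, N3 = 3, 5, 12
    N = N1 + N2 + N3

    # Truncated geometric series  sum_{x >= 0} (N3/N)^x (N1/N); the tail is negligible.
    series = sum(Fraction(N3, N) ** x * Fraction(N1, N) for x in range(200))

    # Closed form Pr(red before green) = N1/(N1 + N2) = Pr(R|R or G).
    closed_form = Fraction(N1, N1 + N2)

    print(float(series), float(closed_form))   # both print 0.375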
Example. Craps. Craps is a popular dice game. In this game a pair of fair dice is thrown
(tossed) and the sum of the numbers on the dice is computed; this action is repeated until
the player either wins or loses. The outcome of the game is determined on the first throw when the player throws: seven or eleven (“a natural”), in which case the player wins; or two, three, or twelve (“craps”), in which case the player loses (craps out). If any other sum is thrown (four, five, six, eight, nine, or ten), then the number thrown becomes the player’s “point” and play continues until the player either makes his point or throws a seven. If the player makes his point he wins, and if he throws a seven he loses (craps out).
The probabilities of the various sums on a single throw, which were computed in an
example in Section 3.1, are: Pr(2) = Pr(12) = 1/36, Pr(3) = Pr(11) = 2/36, Pr(4) = Pr(10) = 3/36, Pr(5) = Pr(9) = 4/36, Pr(6) = Pr(8) = 5/36, and Pr(7) = 6/36. The probability of winning on the first throw is
p0 = Pr(7 or 11) = 8/36.
The other ways to win correspond to throwing a 4, 5, 6, 8, 9, or 10 on the first throw and
then making this point in a sequence of throws. Consider first the case when the player
throws a 4 on the first throw. It is easy to see that the event that the player makes his
point by throwing a 4 before a 7 is independent of the outcome of the first toss so that the
probability of winning with a 4 is
p4 = Pr(4 on the first throw)Pr(making the point 4).
Appealing to the preceding example, the probability of making the point 4 is equal to the
conditional probability of throwing a 4 on a single throw given that the single throw results
in a 4 or a 7. Hence p4 is given by the product
p4 = Pr(4)Pr(4|4 or 7) = (3/36)(3/(3+6)) = 9/(36·9) = 1/36.
Applying this argument to the other possible point values gives:
p5 = Pr(5)Pr(5|5 or 7) = (4/36)(4/(4+6)) = 16/(36·10) = 2/45
p6 = Pr(6)Pr(6|6 or 7) = (5/36)(5/(5+6)) = 25/(36·11) = 25/396
p8 = Pr(8)Pr(8|8 or 7) = (5/36)(5/(5+6)) = 25/(36·11) = 25/396
p9 = Pr(9)Pr(9|9 or 7) = (4/36)(4/(4+6)) = 16/(36·10) = 2/45
p10 = Pr(10)Pr(10|10 or 7) = (3/36)(3/(3+6)) = 9/(36·9) = 1/36
Thus the probability of winning is p0 + p4 + p5 + p6 + p8 + p9 + p10 = 244/495 ≈ .4930.
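The craps calculation is easy to verify with exact rational arithmetic. The following Python sketch rebuilds the single-throw probabilities from the counting formula for two fair dice and reproduces the total 244/495.

    from fractions import Fraction

    # Pr(sum = s) for one throw of two fair dice: (6 - |s - 7|) ways out of 36.
    p = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}

    # Win immediately with a natural (7 or 11).
    win = p[7] + p[11]

    # For each point, add Pr(point on first throw) * Pr(point before 7),
    # where Pr(point before 7) = Pr(point)/(Pr(point) + Pr(7)).
    for point in (4, 5, 6, 8, 9, 10):
        win += p[point] * p[point] / (p[point] + p[7])

    print(win, float(win))   # 244/495 0.4929...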