Recitation 2 Solution
2.1.1 Review
Definition 2.1 (General definition of probability). A probability space consists of a sample space $S$ and a probability function $P$, which takes an event $A \subseteq S$ as input and returns a real number $P(A)$ as output. The function $P$ must satisfy the following axioms:
1. $P(A) \geq 0$ for every event $A \subseteq S$;
2. $P(S) = 1$;
3. if $A_1, A_2, \ldots$ is an infinite collection of disjoint events, then $P\left(\bigcup_{j=1}^{\infty} A_j\right) = \sum_{j=1}^{\infty} P(A_j)$.
Corollary 2.2. Any probability function $P(\cdot)$ that satisfies the three axioms has the following properties:
1. $P(\emptyset) = 0$.
2. $P(A^c) = 1 - P(A)$.
3. $0 \leq P(A) \leq 1$.
4. If $B \subseteq A$, then $P(B) \leq P(A)$.
5. If $A_1, A_2, \ldots, A_n$ are mutually exclusive, then
$$P\left(\bigcup_{j=1}^{n} A_j\right) = \sum_{j=1}^{n} P(A_j).$$
6. (Inclusion-exclusion) For any events $A_1, \ldots, A_n$,
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} P(A_1 \cap \cdots \cap A_n).$$
2.1.2 Examples
Exercise 1. A consulting firm presently has bids out on three projects. Let $A_i = \{\text{awarded project } i\}$ for $i = 1, 2, 3$, and suppose that $P(A_1) = .22$, $P(A_2) = .25$, $P(A_3) = .28$, $P(A_1 \cap A_2) = .11$, $P(A_1 \cap A_3) = .05$, $P(A_2 \cap A_3) = .07$, and $P(A_1 \cap A_2 \cap A_3) = .01$. Explain the meaning of each of the following events in words, and compute the probability of each event:
1. $A_1 \cup A_2$;
2. $A_1^c \cap A_2^c$;
3. $A_1 \cup A_2 \cup A_3$;
4. $A_1^c \cap A_2^c \cap A_3^c$;
5. $A_1^c \cap A_2^c \cap A_3$;
6. $(A_1^c \cap A_2^c) \cup A_3$.
Answer:
1. $A_1$ or $A_2$ occurs.
$$P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2) = .22 + .25 - .11 = .36.$$
2. Neither $A_1$ nor $A_2$ occurs.
$$P(A_1^c \cap A_2^c) = P((A_1 \cup A_2)^c) = 1 - P(A_1 \cup A_2) = .64.$$
3. At least one of $A_1, A_2, A_3$ occurs.
$$P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_2 \cap A_3) - P(A_1 \cap A_3) + P(A_1 \cap A_2 \cap A_3) = .22 + .25 + .28 - .11 - .05 - .07 + .01 = .53.$$
4. None of $A_1, A_2, A_3$ occurs.
$$P(A_1^c \cap A_2^c \cap A_3^c) = 1 - P(A_1 \cup A_2 \cup A_3) = .47.$$
5. Only $A_3$ occurs among $A_1, A_2, A_3$.
First note that $A_1^c \cap A_2^c \cap A_3$ and $A_1^c \cap A_2^c \cap A_3^c$ form a partition of $A_1^c \cap A_2^c$, so $P(A_1^c \cap A_2^c) = P(A_1^c \cap A_2^c \cap A_3) + P(A_1^c \cap A_2^c \cap A_3^c)$. Therefore,
$$P(A_1^c \cap A_2^c \cap A_3) = P(A_1^c \cap A_2^c) - P(A_1^c \cap A_2^c \cap A_3^c) = .64 - .47 = .17.$$
6. $A_3$ occurs, or neither $A_1$ nor $A_2$ occurs.
$$P((A_1^c \cap A_2^c) \cup A_3) = P(A_1^c \cap A_2^c) + P(A_3) - P(A_1^c \cap A_2^c \cap A_3) = .64 + .28 - .17 = .75.$$
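As a quick sanity check, the six probabilities can be reproduced numerically. The following Python sketch (the variable names are ours, not from the exercise) simply encodes the inclusion-exclusion computations above:

```python
# Numerical check of Exercise 1, parts 1-6, via inclusion-exclusion.
p1, p2, p3 = .22, .25, .28          # P(A1), P(A2), P(A3)
p12, p13, p23 = .11, .05, .07       # pairwise intersections
p123 = .01                          # P(A1 ∩ A2 ∩ A3)

union12 = p1 + p2 - p12                            # part 1: P(A1 ∪ A2)
neither12 = 1 - union12                            # part 2: P(A1^c ∩ A2^c)
union123 = p1 + p2 + p3 - p12 - p13 - p23 + p123   # part 3
none = 1 - union123                                # part 4
only3 = neither12 - none                           # part 5
part6 = neither12 + p3 - only3                     # part 6

print(union12, neither12, union123, none, only3, part6)
# 0.36 0.64 0.53 0.47 0.17 0.75 (up to floating-point rounding)
```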
Exercise 2. Let $S = \{x_1, x_2, \ldots\}$ be a countable sample space, and let $p_1, p_2, \ldots$ be nonnegative numbers with $\sum_{i=1}^{\infty} p_i = 1$. For an event $A \subseteq S$, define $P(A) = \sum_{i: x_i \in A} p_i$. Verify that this is a valid probability function according to the axiomatic definition of probability. Compare and contrast this probability function with the naive definition of probability.
Answer:
Axiom 1: Because each $p_i$ is a nonnegative number, we have $P(A) = \sum_{i: x_i \in A} p_i \geq 0$ for any event $A$.
Axiom 2: Because $S = \{x_1, x_2, \ldots, x_n, \ldots\}$ and $\sum_{i=1}^{\infty} p_i = 1$, we have $P(S) = \sum_{i: x_i \in S} p_i = 1$.
Axiom 3: If $A_1, A_2, \ldots$ is an infinite collection of disjoint events, then
$$P\left(\bigcup_{j=1}^{\infty} A_j\right) = \sum_{i: x_i \in \bigcup_{j=1}^{\infty} A_j} p_i = \sum_{j=1}^{\infty} \sum_{i: x_i \in A_j} p_i = \sum_{j=1}^{\infty} P(A_j),$$
where the second equality uses the disjointness.
Compared to the naive definition of probability, the probability function defined in this exercise allows for infinitely many outcomes, and the outcomes need not be equally likely. It is therefore strictly more flexible than the naive definition of probability.
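To make the verification concrete, here is a minimal Python sketch (our own construction, not part of the recitation) instantiating the exercise with the geometric probabilities $p_i = (1/2)^i$ on $S = \{1, 2, \ldots\}$ and checking the axioms numerically, up to truncation:

```python
# A concrete instance of the exercise: p_i = (1/2)^i on S = {1, 2, 3, ...}.
# Sums over the infinite sample space are truncated at N for the check.
N = 200
p = lambda i: 0.5 ** i

def prob(event):
    """P(A) = sum of p_i over the outcomes i in A."""
    return sum(p(i) for i in event)

# Axiom 1: P(A) >= 0 for any event A.
assert prob({2, 5, 7}) >= 0
# Axiom 2: the total mass is 1 (up to truncation error).
assert abs(prob(range(1, N)) - 1.0) < 1e-12
# Axiom 3 (additivity) for the disjoint events "even" and "odd":
evens = [i for i in range(1, N) if i % 2 == 0]
odds = [i for i in range(1, N) if i % 2 == 1]
assert abs(prob(evens) + prob(odds) - prob(range(1, N))) < 1e-12
```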
Exercise 3. Consider two events $A$ and $B$ such that $P(A) = \frac{1}{3}$ and $P(B) = \frac{1}{2}$. Determine the value of $P(B \setminus A)$ for each of the following conditions:
Answer: Before answering the questions, first note that $B \setminus A$ and $A \cap B$ form a partition of $B$; namely, $B \setminus A$ and $A \cap B$ are disjoint and $(B \setminus A) \cup (A \cap B) = B$. Thus
$$P(B) = P(B \setminus A) + P(A \cap B), \quad \text{i.e.,} \quad P(B \setminus A) = P(B) - P(A \cap B).$$
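As a minimal numerical illustration of this identity (the value of $P(A \cap B)$ below is a hypothetical example, not one of the exercise's conditions):

```python
from fractions import Fraction

# P(B \ A) = P(B) - P(A ∩ B), with P(B) = 1/2 as given in the exercise.
pB = Fraction(1, 2)
pAB = Fraction(1, 8)      # hypothetical example value of P(A ∩ B)
print(pB - pAB)           # P(B \ A) = 3/8
```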
Exercise 4. Prove that for any events $A_1, \ldots, A_n$,
$$\max\{P(A_1), \ldots, P(A_n)\} \leq P\left(\bigcup_{i=1}^n A_i\right) \leq \sum_{i=1}^n P(A_i), \tag{2.1}$$
$$\sum_{i=1}^n P(A_i) - (n-1) \leq P\left(\bigcap_{i=1}^n A_i\right) \leq \min\{P(A_1), P(A_2), \ldots, P(A_n)\}. \tag{2.2}$$
Answer: We first prove the upper bound in Equation (2.1) by induction. First note that for two events $A_1, A_2$, $P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2) \leq P(A_1) + P(A_2)$. We assume that the upper bound in Equation (2.1) holds when $n = m$. Now we prove that it holds when $n = m + 1$:
$$P\left(\bigcup_{i=1}^{m+1} A_i\right) = P\left(\left(\bigcup_{i=1}^{m} A_i\right) \cup A_{m+1}\right) \leq P\left(\bigcup_{i=1}^{m} A_i\right) + P(A_{m+1}) \leq \sum_{i=1}^{m} P(A_i) + P(A_{m+1}) = \sum_{i=1}^{m+1} P(A_i).$$
By induction the upper bound in Equation (2.1) holds for any positive integer $n$. To prove the lower bound in Equation (2.1), we note that $A_i \subseteq \bigcup_{i=1}^n A_i$ for any $i = 1, \ldots, n$. Thus
$$P(A_i) \leq P\left(\bigcup_{i=1}^n A_i\right), \quad \forall i = 1, \ldots, n.$$
This means that $\max_i P(A_i) \leq P(\cup_{i=1}^n A_i)$, i.e., the lower bound in Equation (2.1) holds.
Similarly, $\bigcap_{i=1}^n A_i \subseteq A_i$ for any $i = 1, \ldots, n$, so
$$P\left(\bigcap_{i=1}^n A_i\right) \leq P(A_i), \quad \forall i = 1, \ldots, n.$$
This means that $P(\cap_{i=1}^n A_i) \leq \min_i P(A_i)$, i.e., the upper bound in Equation (2.2) holds. Finally, for the lower bound in Equation (2.2), apply the upper bound in Equation (2.1) to the complements $A_1^c, \ldots, A_n^c$:
$$P\left(\bigcap_{i=1}^n A_i\right) = 1 - P\left(\bigcup_{i=1}^n A_i^c\right) \geq 1 - \sum_{i=1}^n P(A_i^c) = 1 - \sum_{i=1}^n (1 - P(A_i)) = \sum_{i=1}^n P(A_i) - (n-1).$$
In summary, for any events $A_1, \ldots, A_n$,
$$\sum_{i=1}^n P(A_i) - (n-1) \leq P\left(\bigcap_{i=1}^n A_i\right) \leq \min\{P(A_1), P(A_2), \ldots, P(A_n)\} \leq \max\{P(A_1), \ldots, P(A_n)\} \leq P\left(\bigcup_{i=1}^n A_i\right) \leq \sum_{i=1}^n P(A_i).$$
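This chain of inequalities can also be spot-checked numerically. A small Python sketch (our own, with randomly generated events over a weighted eight-point sample space):

```python
import random

# Sanity check of the bounds (2.1)-(2.2) on a random discrete example.
random.seed(0)
w = [random.random() for _ in range(8)]
pmf = [x / sum(w) for x in w]          # random probability masses on {0,...,7}

def prob(event):
    return sum(pmf[s] for s in event)

events = [set(random.sample(range(8), 4)) for _ in range(3)]   # n = 3 events
p = [prob(A) for A in events]
union = prob(set.union(*events))
inter = prob(set.intersection(*events))

# Equation (2.1): max P(A_i) <= P(union) <= sum P(A_i)
assert max(p) - 1e-12 <= union <= sum(p) + 1e-12
# Equation (2.2): sum P(A_i) - (n-1) <= P(intersection) <= min P(A_i)
assert sum(p) - 2 - 1e-12 <= inter <= min(p) + 1e-12
```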
Definition 2.5 (Conditional probability). If $A$ and $B$ are events with $P(B) > 0$, then the conditional probability of $A$ given $B$, denoted by $P(A \mid B)$, is defined as
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$
Proposition 2.6 (Conditional probability is a valid probability). For any given event $E$ with $P(E) > 0$, the conditional probability function $P(\cdot \mid E)$ is itself a valid probability function, i.e., it satisfies the axioms in Definition 2.1.
Theorem 2.8 (Probability of the intersection of two events). Let $A, B$ be two events with $P(B) > 0$. Then $P(A \cap B) = P(B) P(A \mid B)$.
Corollary 2.9 (Probability of the intersection of $n$ events). For any events $A_1, \ldots, A_n$ with $P(A_1 \cap \cdots \cap A_{n-1}) > 0$,
$$P(A_1 \cap \cdots \cap A_n) = P(A_1) P(A_2 \mid A_1) P(A_3 \mid A_1, A_2) \cdots P(A_n \mid A_1, \ldots, A_{n-1}).$$
Theorem 2.10 (Law of total probability). Let $A_1, \ldots, A_n$ be a partition of the sample space $S$ (i.e., the $A_i$ are disjoint events and their union is $S$), with $P(A_i) > 0$ for all $i$. Then
$$P(B) = \sum_{i=1}^n P(B \cap A_i) = \sum_{i=1}^n P(B \mid A_i) P(A_i).$$
• Bayes' rule with extra conditioning:
$$P(A \mid B, E) = \frac{P(B \mid A, E) P(A \mid E)}{P(B \mid E)}.$$
Definition 2.13 (Independence). Events $A$ and $B$ are independent if
$$P(A \cap B) = P(A)P(B).$$
If $P(B) > 0$, this is equivalent to
$$P(A \mid B) = P(A),$$
and if $P(A) > 0$, it is also equivalent to $P(B \mid A) = P(B)$.
Definition 2.15 (Conditional independence). Events $A$ and $B$ are said to be conditionally independent given $E$ if
$$P(A \cap B \mid E) = P(A \mid E) P(B \mid E).$$
Provided the conditioning events have positive probability, this is equivalent to
$$P(A \mid B, E) = P(A \mid E), \quad \text{or} \quad P(B \mid A, E) = P(B \mid E).$$
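Before the examples, here is a small self-contained Python check (our own toy construction) showing that independence and conditional independence are genuinely different notions: two fair coin flips are independent, but not conditionally independent given the event that the flips agree.

```python
from itertools import product
from fractions import Fraction

# Independence vs. conditional independence (Definitions 2.13 and 2.15)
# on two fair coin flips; the event names below are ours.
outcomes = list(product("HT", repeat=2))     # four equally likely outcomes
P = lambda event: Fraction(sum(1 for s in outcomes if event(s)), len(outcomes))

A = lambda s: s[0] == "H"                    # first flip is heads
B = lambda s: s[1] == "H"                    # second flip is heads
E = lambda s: s[0] == s[1]                   # the two flips agree
AB = lambda s: A(s) and B(s)

# Unconditionally, A and B are independent:
assert P(AB) == P(A) * P(B)

# Given E, they are NOT conditionally independent:
cond = lambda ev: P(lambda s: ev(s) and E(s)) / P(E)
assert cond(AB) != cond(A) * cond(B)         # 1/2 vs. 1/4
```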
2.2.3 Examples
Exercise 5. Consider randomly selecting a student at a certain university, and let A denote the event
that the selected individual has a Visa credit card and B be the analogous event for a MasterCard.
Suppose that P (A) = 0.5, P (B) = 0.4, and P (A ∩ B) = 0.25.
1. Compute the probability that the selected individual has at least one of the two types of cards
(i.e., the probability of the event A ∪ B).
2. What is the probability that the selected individual has neither type of card?
3. Describe, in terms of A and B, the event that the selected student has a Visa card but not a
MasterCard, and then calculate the probability of this event.
4. Calculate and interpret each of the following probabilities: P (B | A), P (A | B), P (Ac | B).
5. Given that the selected individual has at least one card, what is the probability that he or she has a Visa card?
6. Are events A and B independent?
Answer:
1. Having at least one type of card: $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.5 + 0.4 - 0.25 = 0.65$.
2. Having neither a Visa nor a MasterCard: $P(A^c \cap B^c) = P((A \cup B)^c) = 1 - P(A \cup B) = 1 - 0.65 = 0.35$.
3. Having a Visa without a MasterCard: $P(A \cap B^c) = P(A) - P(A \cap B) = 0.5 - 0.25 = 0.25$.
4. The conditional probabilities are as follows:
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{0.25}{0.5} = 0.5,$$
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{0.25}{0.4} = 0.625,$$
$$P(A^c \mid B) = \frac{P(A^c \cap B)}{P(B)} = \frac{P(B) - P(A \cap B)}{P(B)} = 1 - P(A \mid B) = 0.375.$$
5. Since $A \subseteq A \cup B$, we have $P(A \cap (A \cup B)) = P(A)$, so
$$P(A \mid A \cup B) = \frac{P(A)}{P(A \cup B)} = \frac{0.5}{0.65} \approx 0.769.$$
6. From part 4, $P(A \mid B) = 0.625 \neq 0.5 = P(A)$, so we can conclude that events $A$ and $B$ are NOT independent. We can also check whether $P(A \cap B) = P(A)P(B)$ according to Definition 2.13. We have
$$P(A)P(B) = 0.5 \times 0.4 = 0.2 \neq 0.25 = P(A \cap B),$$
which again shows that $A$ and $B$ are not independent.
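A short numerical recap of the six parts (a sketch; the variable names are ours):

```python
# Quick numerical check of Exercise 5.
pA, pB, pAB = 0.5, 0.4, 0.25

p_union = pA + pB - pAB                    # part 1
p_neither = 1 - p_union                    # part 2
p_visa_only = pA - pAB                     # part 3
pB_given_A = pAB / pA                      # part 4
pA_given_B = pAB / pB
pAc_given_B = 1 - pA_given_B
pA_given_union = pA / p_union              # part 5: A ⊆ A ∪ B
independent = abs(pAB - pA * pB) < 1e-12   # part 6

print(p_union, p_neither, p_visa_only)      # 0.65 0.35 0.25
print(pB_given_A, pA_given_B, pAc_given_B)  # 0.5 0.625 0.375
print(pA_given_union, independent)          # ~0.769 False
```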
Exercise 6. Prove each of the following statements. (Assume that any conditioning event has positive probability.)
1. If $P(B) = 1$, then $P(B \mid A) = 1$ for any event $A$.
2. If $A \subset B$, then $P(B \mid A) = 1$ and $P(A \mid B) = P(A)/P(B)$.
3. If $A$ and $B$ are mutually exclusive, then $P(A \mid A \cup B) = P(A)/(P(A) + P(B))$.
4. If $P(A \mid B) \leq P(A)$, then $P(A \mid B^c) \geq P(A)$.
5. If $P(B \mid A) = 1$, then $P(A^c \mid B^c) = 1$.
Answer:
1. Since $P(B) = 1$, we have $P(B^c) = 0$, and thus
$$P(A \cap B^c) \leq P(B^c) = 0.$$
Hence $P(A \cap B^c) = 0$ and
$$P(A) = P(A \cap B) + P(A \cap B^c) = P(A \cap B).$$
It follows that
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = 1.$$
This shows that if an event is sure to occur ($P(B) = 1$) even without any evidence, then it will still occur given new evidence ($P(B \mid A) = 1$). Moreover, since $P(A \cap B) = P(B \mid A)P(A) = 1 \times P(A) = P(B)P(A)$ for any event $A$, an event with probability 1 is independent of any other event. In class, we also learned that an event with probability 0 is independent of any other event. Thus, independence is automatic for events with probability 0 or 1.
2. The fact that $A \subset B$ implies $P(A \cap B) = P(A)$. This in turn means that
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = 1.$$
Moreover,
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)}{P(B)}.$$
The first conclusion is intuitive: $A \subset B$ means that if $A$ occurs then $B$ occurs; so given the knowledge that $A$ occurs, it is reasonable that the conditional probability of $B$ equals 1 ($P(B \mid A) = 1$).
3. Since $A, B$ are mutually exclusive, $A \cap (A \cup B) = (A \cap A) \cup (A \cap B) = A \cup \emptyset = A$ and $P(A \cup B) = P(A) + P(B)$. Thus
$$P(A \mid A \cup B) = \frac{P(A \cap (A \cup B))}{P(A \cup B)} = \frac{P(A)}{P(A) + P(B)}.$$
This means that for two mutually exclusive events, if we know that one of them occurs, then the conditional probability of each event is proportional to its unconditional probability.
4. We prove the conclusion by contradiction. Suppose that $P(A \mid B) \leq P(A)$ but $P(A \mid B^c) < P(A)$. According to the Law of Total Probability,
$$P(A) = P(A \mid B)P(B) + P(A \mid B^c)P(B^c) < P(A)P(B) + P(A)P(B^c) = P(A),$$
which is a contradiction. Hence $P(A \mid B^c) \geq P(A)$.
5. Suppose that $P(B \mid A) = 1$. Then $P(A \cap B^c) = P(A) - P(A \cap B) = P(A) - P(B \mid A)P(A) = 0$, so
$$P(B^c) = P(A \cap B^c) + P(A^c \cap B^c) = P(A^c \cap B^c).$$
Therefore,
$$P(A^c \mid B^c) = \frac{P(B^c \cap A^c)}{P(B^c)} = 1.$$
This is a probabilistic version of the law of contraposition. The contrapositive of the conditional statement "If P, then Q" is "If not Q, then not P". For example, the contrapositive of "If it is raining, then I wear my coat" is "If I don't wear my coat, then it isn't raining". The law of contraposition says that a conditional statement is true if, and only if, its contrapositive is true.
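Statement 4 can also be spot-checked by brute force. The following Python sketch (our own) draws random weighted sample spaces and random events and confirms the implication numerically:

```python
import random

# Spot check of statement 4: if P(A|B) <= P(A), then P(A|B^c) >= P(A).
random.seed(1)
n = 6
for _ in range(1000):
    w = [random.random() for _ in range(n)]
    Z = sum(w)
    pmf = [x / Z for x in w]
    A = {i for i in range(n) if random.random() < 0.5}
    B = {i for i in range(n) if random.random() < 0.5}
    pA = sum(pmf[i] for i in A)
    pB = sum(pmf[i] for i in B)
    if 0 < pB < 1:                       # both B and B^c can be conditioned on
        pA_given_B = sum(pmf[i] for i in A & B) / pB
        pA_given_Bc = sum(pmf[i] for i in A - B) / (1 - pB)
        if pA_given_B <= pA:
            assert pA_given_Bc >= pA - 1e-12
```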
Exercise 7. An oil exploration company currently has two active projects, one in Asia and the other in
Europe. Let A be the event that the Asian project is successful and B be the event that the European
project is successful. Suppose that A and B are independent events with P (A) = .4 and P (B) = .7.
1. If the Asian project is not successful, what is the probability that the European project is also
not successful? Explain your reasoning.
2. What is the probability that at least one of the two projects will be successful?
3. Given that at least one of the two projects is successful, what is the probability that only the
Asian project is successful?
Answer:
1. Since $A$ and $B$ are independent, $A^c$ and $B^c$ are independent, too. Thus $P(B^c \mid A^c) = P(B^c) = 1 - 0.7 = 0.3$.
2. Using the inclusion-exclusion rule, $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.4 + 0.7 - (0.4)(0.7) = 0.82$. Since $A$ and $B$ are independent, we may write $P(A \cap B) = P(A)P(B) = (0.4)(0.7)$.
3. The probability is
$$P(A \cap B^c \mid A \cup B) = \frac{P((A \cap B^c) \cap (A \cup B))}{P(A \cup B)} = \frac{P([(A \cap B^c) \cap A] \cup [(A \cap B^c) \cap B])}{P(A \cup B)} = \frac{P(A \cap B^c)}{P(A \cup B)} = \frac{P(A)P(B^c)}{P(A \cup B)} = \frac{0.4 \cdot 0.3}{0.82} \approx 0.146.$$
Here the third equality follows from the fact that $(A \cap B^c) \cap A = A \cap B^c$ and $(A \cap B^c) \cap B = \emptyset$, and the fourth equality follows from the fact that $A$ and $B^c$ are independent (because $A$ and $B$ are independent; see Proposition 1.2 in the lecture notes).
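The two computed values can be confirmed by simulation. A minimal Monte Carlo sketch (our own; the sample size and seed are arbitrary):

```python
import random

# Monte Carlo check of Exercise 7 (parameters from the problem statement).
random.seed(2)
pA, pB = 0.4, 0.7
N = 10**6
hits_union = hits_asia_only = 0
for _ in range(N):
    a = random.random() < pA        # Asian project succeeds
    b = random.random() < pB        # European project succeeds, independently
    if a or b:
        hits_union += 1
        if a and not b:
            hits_asia_only += 1

print(hits_union / N)               # ~0.82  (part 2)
print(hits_asia_only / hits_union)  # ~0.146 (part 3)
```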
Exercise 8. Consider the Monty Hall problem in Example 1.11 in the lecture notes. Now suppose that Monty enjoys opening Door 2 more than he enjoys opening Door 3, and if he has a choice between opening these two doors, he opens Door 2 with probability $p$, where $\frac{1}{2} \leq p \leq 1$.
To recap: there are three doors, behind one of which there is a car (which you want), and behind the
other two of which there are goats (which you don’t want). Initially, all possibilities are equally likely
for where the car is. You choose a door, which for concreteness we assume is Door 1. Monty Hall then
opens a door to reveal a goat, and offers you the option of switching. Assume that Monty Hall knows
which door has the car, will always open a goat door and offer the option of switching, and as above
assume that if Monty Hall has a choice between opening Door 2 and Door 3, he chooses Door 2 with
probability $p$ (where $\frac{1}{2} \leq p \leq 1$).
1. Find the unconditional probability that the strategy of always switching succeeds (unconditional
in the sense that we do not condition on which of Doors 2, 3 Monty opens).
2. Find the probability that the strategy of always switching succeeds, given that Monty opens Door
2.
3. Find the probability that the strategy of always switching succeeds, given that Monty opens Door
3.
Answer:
1. Let $C_j$ be the event that the car is hidden behind door $j$ and let $W$ be the event that we win using the switching strategy. Using the law of total probability, we can find the unconditional probability of winning in the same way as in class:
$$P(W) = P(W \mid C_1)P(C_1) + P(W \mid C_2)P(C_2) + P(W \mid C_3)P(C_3) = 0 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} = \frac{2}{3}.$$
Note that this does not depend on $p$.
2. Let $D_i$ be the event that Monty opens Door $i$. Note that we are looking for $P(W \mid D_2)$, which is the same as $P(C_3 \mid D_2)$, since we first chose Door 1 and then switch to Door 3. By Bayes' rule and the law of total probability,
$$P(C_3 \mid D_2) = \frac{P(D_2 \mid C_3) P(C_3)}{P(D_2)} = \frac{P(D_2 \mid C_3) P(C_3)}{P(D_2 \mid C_1) P(C_1) + P(D_2 \mid C_2) P(C_2) + P(D_2 \mid C_3) P(C_3)} = \frac{1 \cdot 1/3}{p \cdot 1/3 + 0 \cdot 1/3 + 1 \cdot 1/3} = \frac{1}{1 + p}.$$
3. The structure of the problem is the same as in part 2 (except for the condition that $p \geq 1/2$, which was not needed above). Imagine repainting Doors 2 and 3, reversing which is called which. We get
$$P(C_2 \mid D_3) = \frac{1}{1 + (1 - p)} = \frac{1}{2 - p}.$$
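The three answers can be checked by simulation. Below is a Python sketch of the biased Monty (our own; the function name and parameters are illustrative):

```python
import random

# Simulation of the biased Monty Hall problem: Monty opens Door 2 with
# probability p when he can choose between Doors 2 and 3. We always pick
# Door 1 and then switch. Estimates P(W), P(W | D2), P(W | D3).
def simulate(p: float, trials: int = 10**6, seed: int = 3):
    rng = random.Random(seed)
    wins = opens2 = wins2 = opens3 = wins3 = 0
    for _ in range(trials):
        car = rng.randint(1, 3)
        if car == 1:
            monty = 2 if rng.random() < p else 3   # Monty has a choice
        else:
            monty = 2 if car == 3 else 3           # forced to open a goat door
        switch_to = 5 - monty                      # the unopened, unchosen door
        win = (switch_to == car)
        wins += win
        if monty == 2:
            opens2 += 1; wins2 += win
        else:
            opens3 += 1; wins3 += win
    return wins / trials, wins2 / opens2, wins3 / opens3

print(simulate(0.7))   # expect about (2/3, 1/1.7 ≈ 0.588, 1/1.3 ≈ 0.769)
```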
Exercise 9. Joe decides to take a series of $n$ tests, to diagnose whether he has a certain disease (any individual test is not perfectly reliable, so he hopes to reduce his uncertainty by taking multiple tests). Let $D$ be the event that he has the disease, $p = P(D)$ be the prior probability that he has the disease, and $q = 1 - p$. Let $T_j$ be the event that he tests positive on the $j$th test.
1. Assume for this part that the test results are conditionally independent given Joe's disease status. Let $a = P(T_j \mid D)$ and $b = P(T_j \mid D^c)$, where $a$ and $b$ don't depend on $j$. Find the posterior probability that Joe has the disease, given that he tests positive on all $n$ of the tests.
2. Suppose that Joe tests positive on all $n$ tests. However, some people have a certain gene that makes them always test positive. Let $G$ be the event that Joe has the gene. Assume that $P(G) = 1/2$ and that $D$ and $G$ are independent. If Joe does not have the gene, then the test results are conditionally independent given his disease status. Let $a_0 = P(T_j \mid D, G^c)$ and $b_0 = P(T_j \mid D^c, G^c)$, where $a_0$ and $b_0$ don't depend on $j$. Find the posterior probability that Joe has the disease, given that he tests positive on all $n$ of the tests.
Answer:
1. Denote the event that all $n$ tests are positive by $T^*$; then the desired posterior probability is $P(D \mid T^*)$. According to Bayes' rule, we have
$$P(D \mid T^*) = \frac{P(T^* \mid D)P(D)}{P(T^* \mid D)P(D) + P(T^* \mid D^c)P(D^c)} = \frac{P(D) \prod_{j=1}^n P(T_j \mid D)}{P(D) \prod_{j=1}^n P(T_j \mid D) + P(D^c) \prod_{j=1}^n P(T_j \mid D^c)} = \frac{p a^n}{p a^n + q b^n},$$
where the second equality uses the conditional independence of the test results given Joe's disease status.
2. We still denote the event that all tests are positive by $T^*$, and the desired posterior probability is still $P(D \mid T^*)$. The subtlety lies in the calculation of $P(T^* \mid D)$:
$$P(T^* \mid D) = P(T^* \mid D, G)P(G \mid D) + P(T^* \mid D, G^c)P(G^c \mid D) = P(T^* \mid D, G)P(G) + P(T^* \mid D, G^c)P(G^c) = \frac{1}{2}\left(1 + \prod_{j=1}^n P(T_j \mid D, G^c)\right) = \frac{1}{2}(1 + a_0^n).$$
Here the first equality follows from the Law of Total Probability with extra conditioning on the event $D$ (see Corollary 1.3 in the lecture notes), the second equality uses the independence of $D$ and $G$, and the third uses $P(T^* \mid D, G) = 1$ together with the conditional independence of the tests given $(D, G^c)$.
Similarly, we can get $P(T^* \mid D^c) = \frac{1}{2}(1 + b_0^n)$. Hence,
$$P(D \mid T^*) = \frac{P(T^* \mid D)P(D)}{P(T^* \mid D)P(D) + P(T^* \mid D^c)P(D^c)} = \frac{p(1 + a_0^n)}{p(1 + a_0^n) + q(1 + b_0^n)}.$$
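To see how the gene changes the picture, here is a small numerical sketch; all parameter values ($p$, $a$, $b$, $a_0$, $b_0$) below are hypothetical, chosen only for illustration:

```python
# Numerical sketch of Exercise 9 with hypothetical parameter values.
p = 0.01                 # prior probability of disease, P(D)
q = 1 - p
a, b = 0.95, 0.10        # part 1: a = P(T_j | D), b = P(T_j | D^c)
a0, b0 = 0.95, 0.10      # part 2: same accuracies for non-gene carriers

def posterior_part1(n: int) -> float:
    """P(D | all n tests positive), conditionally independent tests."""
    return p * a**n / (p * a**n + q * b**n)

def posterior_part2(n: int) -> float:
    """P(D | all n tests positive), with the always-positive gene, P(G) = 1/2."""
    return p * (1 + a0**n) / (p * (1 + a0**n) + q * (1 + b0**n))

for n in (1, 3, 10):
    print(n, round(posterior_part1(n), 4), round(posterior_part2(n), 4))
# In part 1 the posterior tends to 1 as n grows; in part 2 it falls back
# toward the prior p, since a long run of positives is eventually best
# explained by the gene, which is independent of the disease.
```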