Independence

We have already considered several cases where we have multiple


outcomes (e.g., two rolls of a die, several tosses of a coin, a pair of
variables (x, y) drawn from a unit-square). In each of these cases we
have been inherently assuming that one outcome gives us no informa-
tion about the other (the first roll of the die or toss of the coin has
no impact on the outcome for the next roll or toss, and the value of
x tells us nothing about the value of y).

We can formalize this notion as follows: two events A and B are independent if
$$P(A \cap B) = P(A) \cdot P(B).$$

That is, if two events are independent, we can compute the probabil-
ity that they both occur simultaneously by computing both of their
individual probabilities and combining them (with a product).

Here are other examples of events that are independent:


The outcomes of consecutive rolls of a die.
Whether you are over 6 feet tall, and whether the person sitting
next to you is over 6 feet tall.
Whether two randomly chosen people in the classroom had a
dog when they were younger.
etc.

ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 21:27, August 27, 2023
Here are examples of events that are not independent:
The sum of two die rolls, and the value of the first roll.
The first card that you draw out of a well-shuffled deck is a
spade, and the second card you draw is a spade.
Whether you are over 6 feet tall, and whether your father is
over 6 feet tall.
Whether Apple stock goes up tomorrow, and whether Google
stock goes up tomorrow.

Important note: Disjointness does not imply independence!


In fact, the opposite is true; if A and B are disjoint (mutually exclusive)
events (i.e., $A \cap B = \emptyset$) with positive probabilities, then
$$P(A \cap B) = P(\emptyset) = 0 \ne P(A)\,P(B).$$
In this case, knowing that A occurred tells you something very tan-
gible about B, namely that it did not occur.
[Figure: Venn diagrams of events A and B in $\Omega$. Left panel, "Independent events": $P(A \cap B) = P(A) \cdot P(B)$. Right panel, "Disjoint events": $P(A \cap B) = P(\emptyset) = 0$.]

Examples. Consider an experiment involving two successive rolls of
a fair (6-sided) die. Which pairs of events below are independent?

1. A = {the first roll is a 3},
   B = {the second roll is a 6}
   Ans. $P(A) = \frac{6}{36} = \frac{1}{6}$ and $P(B) = \frac{6}{36} = \frac{1}{6}$, thus
   $P(A \cap B) = \frac{1}{36} = P(A)\,P(B) \implies$ independent.
2. A = {the first roll is a 1},
   B = {the sum of the two rolls is 2}
   $P(A) = \frac{6}{36} = \frac{1}{6}$ and $P(B) = \frac{1}{36}$, thus
   $P(A \cap B) = \frac{1}{36} \ne P(A)\,P(B) \implies$ not independent.
3. A = {the first roll is even},
   B = {the sum of the two rolls is even}
   $P(A) = \frac{18}{36} = \frac{1}{2}$ and $P(B) = \frac{18}{36} = \frac{1}{2}$, thus
   $P(A \cap B) = \frac{9}{36} = P(A)\,P(B) \implies$ independent.
4. A = {the first roll is even},
   B = {the product of the two rolls is even}
   $P(A) = \frac{18}{36} = \frac{1}{2}$ and $P(B) = \frac{27}{36}$, thus
   $P(A \cap B) = \frac{18}{36} = \frac{1}{2} \ne P(A)\,P(B) \implies$ not independent.
5. A = {at least one roll was a 2},
   B = {the sum of the rolls is 5}
   $P(A) = \frac{11}{36}$ and $P(B) = \frac{4}{36} = \frac{1}{9}$, thus
   $P(A \cap B) = \frac{2}{36} = \frac{1}{18} \ne P(A)\,P(B) \implies$ not independent.
6. A = {the first roll is a 2},
   B = {the sum of the two rolls is 7}
   (tread carefully)
   $P(A) = \frac{6}{36} = \frac{1}{6}$ and $P(B) = \frac{6}{36} = \frac{1}{6}$, thus
   $P(A \cap B) = \frac{1}{36} = P(A)\,P(B) \implies$ independent.
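
The six answers above can be double-checked by brute-force enumeration of the 36 equally likely outcomes. The following is a minimal Python sketch and is not part of the original notes; the helper names `prob` and `independent` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered pairs (first roll, second roll).
OMEGA = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on (r1, r2)."""
    return Fraction(sum(1 for w in OMEGA if event(w)), len(OMEGA))

def independent(a, b):
    """Check the definition P(A and B) == P(A) * P(B) with exact fractions."""
    return prob(lambda w: a(w) and b(w)) == prob(a) * prob(b)

# The six (A, B) pairs from the examples, in order.
pairs = [
    (lambda w: w[0] == 3,     lambda w: w[1] == 6),
    (lambda w: w[0] == 1,     lambda w: w[0] + w[1] == 2),
    (lambda w: w[0] % 2 == 0, lambda w: (w[0] + w[1]) % 2 == 0),
    (lambda w: w[0] % 2 == 0, lambda w: (w[0] * w[1]) % 2 == 0),
    (lambda w: 2 in w,        lambda w: w[0] + w[1] == 5),
    (lambda w: w[0] == 2,     lambda w: w[0] + w[1] == 7),
]

results = [independent(a, b) for a, b in pairs]
print(results)
```

Exact `Fraction` arithmetic is used so that the equality test in `independent` is not affected by floating-point rounding.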

Exercise: We generate two bits $B_1$ and $B_2$ independently at random
with
$$P(B_1 = 0) = P(B_1 = 1) = \frac{1}{2},$$
and similarly for $B_2$. Set
$$Z = B_1 \oplus B_2, \quad \text{where } \oplus \text{ is the `xor' operator.}$$
Are the events $\{B_1 = 1\}$ and $\{Z = 1\}$ independent?
Ans.
$$P(B_1 = 1) = \frac{1}{2}, \qquad P(Z = 1) = \frac{1}{2},$$
$$P(\{B_1 = 1\} \cap \{Z = 1\}) = \frac{1}{4} = P(B_1 = 1)\,P(Z = 1) \implies \text{independent.}$$
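
As a sanity check on the xor exercise, the four equally likely bit pairs can be enumerated directly. This is an illustrative Python sketch, not part of the original notes; the names `outcomes` and `prob` are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

# Four equally likely outcomes (b1, b2) for two fair, independent bits.
outcomes = list(product([0, 1], repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on (b1, b2)."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

p_b1 = prob(lambda w: w[0] == 1)                             # P(B1 = 1)
p_z = prob(lambda w: (w[0] ^ w[1]) == 1)                     # P(Z = 1)
p_joint = prob(lambda w: w[0] == 1 and (w[0] ^ w[1]) == 1)   # P(B1 = 1 and Z = 1)

print(p_b1, p_z, p_joint, p_joint == p_b1 * p_z)
```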

Exercise: Suppose that a point (x, y) is chosen from the unit square
$[0, 1] \times [0, 1]$ in the plane according to the uniform law. Consider the
events
$$A = \{x + y \le 1\}, \qquad B = \{\max(x, y) \le 0.5\}.$$
Are A and B independent?
Ans.
$$P(A) = \frac{1}{2}, \qquad P(B) = \frac{1}{4},$$
$$P(A \cap B) = \frac{1}{4} \ne P(A)\,P(B) \implies \text{not independent.}$$

Exercise: Suppose that a point (x, y) is chosen from the unit square
$[0, 1] \times [0, 1]$ in the plane according to the uniform law. Consider the
events
$$A = \{x + y \le 1\}, \qquad B = \{\max(x, y) \le 0.5\} \cup \{\min(x, y) \ge 0.5\}.$$
Are A and B independent?

Ans.
$$P(A) = \frac{1}{2}, \qquad P(B) = \frac{1}{2},$$
$$P(A \cap B) = \frac{1}{4} = P(A)\,P(B) \implies \text{independent.}$$

Exercise: Suppose that a point (x, y) is chosen from the unit square
$[0, 1] \times [0, 1]$ in the plane according to the uniform law. Consider the
events
$$A = \{x + y \le 1\}, \qquad B = \{\max(x, y) \le 0.25\} \cup \{\min(x, y) \ge 0.5\}.$$
Are A and B independent?

Ans.
$$P(A) = \frac{1}{2}, \qquad P(B) = \frac{5}{16},$$
$$P(A \cap B) = \frac{1}{16} \ne P(A)\,P(B) \implies \text{not independent.}$$
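
For the unit-square exercises, probabilities are areas, so a quick Monte Carlo estimate gives a rough numerical check. The sketch below is not part of the original notes: it treats the last exercise only, and the function name `estimate`, the sample size, and the seed are arbitrary choices. The estimates should land near 1/2, 5/16, and 1/16, up to sampling error.

```python
import random

def estimate(n_samples=200_000, seed=0):
    """Estimate P(A), P(B), P(A and B) for a uniform point on the unit square."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    count_a = count_b = count_ab = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        in_a = x + y <= 1
        in_b = max(x, y) <= 0.25 or min(x, y) >= 0.5
        count_a += in_a
        count_b += in_b
        count_ab += in_a and in_b
    return count_a / n_samples, count_b / n_samples, count_ab / n_samples

p_a, p_b, p_ab = estimate()
print(f"P(A) ~ {p_a:.3f}, P(B) ~ {p_b:.3f}, P(A and B) ~ {p_ab:.3f}")
print(f"P(A) P(B) ~ {p_a * p_b:.3f}")
```

Swapping the threshold 0.25 for 0.5 in `in_b` reproduces the previous exercise, where the estimated joint probability should match the product.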

Independence of multiple events
Now, suppose we are interested in more than two events.
A collection of n events $A_1, A_2, \ldots, A_n$ is independent if and only if
the probability of any intersection of distinct $A_i$'s is the product of the
individual probabilities:
$$P(A_i \cap A_j) = P(A_i)\,P(A_j) \quad \text{for all distinct } i, j.$$
(This is called pairwise independence.)
$$P(A_i \cap A_j \cap A_k) = P(A_i)\,P(A_j)\,P(A_k) \quad \text{for all distinct } i, j, k,$$
$$\vdots$$
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2) \cdots P(A_n).$$

Note: Pairwise independence does not imply independence. To see
this, let $s_1 = \pm 1$ and $s_2 = \pm 1$ be independent signs with
$$P(s_1 = 1) = P(s_1 = -1) = \frac{1}{2},$$
and similarly for $s_2$. Set $x = s_1 s_2$, and consider the events
A = {$s_1 = 1$}
B = {$s_2 = 1$}
C = {$x = 1$}.
Then it is easy to see that A and B are independent, A and C are
independent, and B and C are independent, but A, B, C are clearly
not independent:
$$P(A \cap B \cap C) = \frac{1}{4} \ne \frac{1}{8} = P(A)\,P(B)\,P(C),$$
since knowing two of $s_1$, $s_2$, x uniquely determines the third.
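
The pairwise-but-not-joint counterexample is small enough to verify by listing the four equally likely sign pairs. The following Python sketch is illustrative and not part of the original notes; the helper names (`prob`, `both`) are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Four equally likely outcomes (s1, s2), each sign being +1 or -1.
outcomes = list(product([-1, 1], repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on (s1, s2)."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

def both(e, f):
    """Intersection of two events."""
    return lambda w: e(w) and f(w)

A = lambda w: w[0] == 1          # s1 = 1
B = lambda w: w[1] == 1          # s2 = 1
C = lambda w: w[0] * w[1] == 1   # x = s1 * s2 = 1

# Every pair satisfies the product rule...
pairwise = all(
    prob(both(e, f)) == prob(e) * prob(f)
    for e, f in [(A, B), (A, C), (B, C)]
)

# ...but the triple intersection does not.
p_triple = prob(lambda w: A(w) and B(w) and C(w))
p_product = prob(A) * prob(B) * prob(C)

print(pairwise, p_triple, p_product)
```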

Because the joint probabilities of independent events factor, it is easy
to calculate the probability of the combination of a bunch of indepen-
dent events by simple multiplication.

Exercise: M&Ms come in six colors: red, blue, green, yellow, brown,
and orange. Suppose that our bag contains an equal number of all
colors and imagine repeatedly drawing M&Ms from the bag, returning
the M&M to the bag after each draw. What is the probability that
the next six M&Ms I pull out of this bag are all green?

Ans.
$$P(M_1 = \text{`g'}, \ldots, M_6 = \text{`g'}) = P(M_1 = \text{`g'}) \cdots P(M_6 = \text{`g'}) = \left(\frac{1}{6}\right)^6 = \frac{1}{46{,}656}.$$

Exercise: If I am an 85% free throw shooter, what is the probability


that I make my next three free throws?

Ans. Assuming independence, this is simply $0.85^3 \approx 0.614$.
Conditional Probability
In general, events are not independent and knowing about one event
can provide significant information about another event. Conditional
probability gives us a systematic way to reason about the outcome of
an experiment based on this type of partial information.
Examples:
A die is rolled two times. You are told that the sum of the rolls
is 9. What is the probability that the first roll was a 6?
The alarm is going off in my house. What is the probability
that there is a burglar present?
You enjoyed watching “The Mandalorian”. How likely is it that
you will enjoy watching “Ahsoka”?
Given knowledge of an event B, we can construct a new (updated)
probability law for outcomes in $\Omega$.
Definition: The conditional probability of an event A given the
occurrence of an event B with $P(B) > 0$ is
$$P(A|B) = \frac{P(A \cap B)}{P(B)}.$$
One way to think about this graphically is that you are redefining
the sample space to be B and then calculating the relative size of A
within that new sample space. It is not hard to check that if $P(\cdot)$ is
a valid probability law on $\Omega$ (i.e., satisfies the Kolmogorov axioms),
then $P(\cdot|B)$ is also a valid probability law.
Important Note: In general, $P(A|B) \ne P(B|A)$. Confusing the two is one of
the most common mistakes made by people with no background in probability.
Much more on this later.
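
The definition translates directly into a counting computation when all outcomes are equally likely. The sketch below (not part of the original notes; `prob` and `cond_prob` are hypothetical helper names) works out the first example in the list above, the two-roll experiment with a known sum of 9, and also evaluates the reversed conditional so the two can be compared.

```python
from fractions import Fraction
from itertools import product

# Sample space: 36 equally likely ordered pairs (first roll, second roll).
OMEGA = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on (r1, r2)."""
    return Fraction(sum(1 for w in OMEGA if event(w)), len(OMEGA))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B); only defined when P(B) > 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(lambda w: a(w) and b(w)) / p_b

first_is_6 = lambda w: w[0] == 6
sum_is_9 = lambda w: w[0] + w[1] == 9

p_first6_given_sum9 = cond_prob(first_is_6, sum_is_9)
p_sum9_given_first6 = cond_prob(sum_is_9, first_is_6)

print(p_first6_given_sum9, p_sum9_given_first6)
```

The two results differ (1/4 versus 1/6), which is a concrete instance of P(A|B) and P(B|A) not being interchangeable.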

Example. Suppose I open a book written in English, put my finger
down at a random location, and then note which letter is closest.
Here are the probabilities for each of the 26 letters:
[Figure: bar chart of the probability of each letter, a through z; vertical axis from 0 to 0.14.]

(Another way to interpret this: A’s make up just over 8% of letters


in English, B’s make up about 1.8%, E’s about 12.5%, etc.)

Now suppose that I see my finger falls on an L. Here are the conditional
probabilities for the letter that immediately follows the L in
the text:
[Figure: bar chart of the conditional probability of each letter, a through z, given that the previous letter is L; vertical axis from 0 to 0.18.]

Example. Recall the example where we toss a fair coin three con-
secutive times. The sample space consists of eight sequences:

$$\Omega = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}.$$

Now suppose that we wish to find the conditional probability P(A|B),
where A and B are the events

A = {we toss two or more heads},
B = {the first toss is a head}.

First compute
$$P(B) = \frac{4}{8} = \frac{1}{2}.$$

Then find
$$P(A \cap B) = \frac{3}{8}.$$

And then combine these to find
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{3}{4}.$$

Compare this to P(A): $P(A) = \frac{4}{8} = \frac{1}{2} \ne P(A|B)$. One way to
think about this particular example is that knowing B has occurred
increases the probability that A has occurred.

Exercise:
Suppose that we now toss a fair coin ten consecutive times. Let A
and B denote the events

A = {we toss two or more heads},


B = {the first toss is a head}.

What is the conditional probability P (A|B)?

Ans. To calculate P(A|B) we would like $P(A \cap B)$. While in theory
we could enumerate all $2^{10} = 1024$ possibilities and count them up, that
is tedious by hand. Sometimes it’s easier to think about when an event
fails instead of when it succeeds. For example, consider the event
$A^c \cap B$, which can only happen if we get heads on the first toss and
tails on all the rest. So, writing $C_k$ for the outcome of the k-th toss,
by independence we can calculate that:
$$P(A^c \cap B) = P(C_1 = \text{`H'}, C_2 = \text{`T'}, \ldots, C_{10} = \text{`T'})$$
$$= P(C_1 = \text{`H'})\,P(C_2 = \text{`T'}) \cdots P(C_{10} = \text{`T'}) = (0.5)^{10}.$$
Then note that we can write $B = (A \cap B) \cup (A^c \cap B)$, and because
these sets are disjoint, the probabilities add:
$$P(B) = P((A \cap B) \cup (A^c \cap B)) = P(A \cap B) + P(A^c \cap B)$$
$$\implies P(A \cap B) = P(B) - P(A^c \cap B) = 0.5 - (0.5)^{10} \approx 0.4990.$$
Finally we can calculate the conditional probability we desire:
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.4990}{0.5} \approx 0.998.$$
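
Both coin-toss results can be confirmed by direct enumeration, which is easy for a computer even with ten tosses. This is a minimal Python sketch, not part of the original notes; the function name is a hypothetical choice. Since all sequences are equally likely, the conditional probability is just a ratio of counts.

```python
from fractions import Fraction
from itertools import product

def p_two_heads_given_first_head(n_tosses):
    """P(two or more heads | first toss is a head) for n fair coin tosses."""
    outcomes = product("HT", repeat=n_tosses)           # all 2**n sequences
    in_b = [w for w in outcomes if w[0] == "H"]          # event B
    in_a_and_b = [w for w in in_b if w.count("H") >= 2]  # event A and B
    return Fraction(len(in_a_and_b), len(in_b))

p3 = p_two_heads_given_first_head(3)
p10 = p_two_heads_given_first_head(10)

print(p3, p10, float(p10))
```

For ten tosses the exact value is 511/512, which rounds to the 0.998 obtained with the complement argument.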

Exercise:
Suppose that a point (x, y) is chosen from the unit square $[0, 1] \times [0, 1]$
in the plane according to the uniform law. Consider the events

$$A = \{x + y \le 1\}, \qquad B = \{y \le 0.5\}.$$

(a) Sketch the sample space $\Omega$ and the events A and B as subsets of
$\Omega$.

[Figure: blank axes for the sketch, with x and y each running from 0 to 1.]

(b) Calculate P(A|B).
Ans.
$$P(B) = \frac{1}{2}, \qquad P(A \cap B) = \frac{3}{8},$$
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{3}{4}.$$
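
One way to obtain the value 3/8, written out as a short derivation that is not in the original notes: under the uniform law, probabilities are areas, and for each fixed height $y$ between 0 and 0.5 the event $A$ allows $x$ to range over $[0, 1 - y]$. Integrating that width over $y$ gives

$$P(A \cap B) = \int_0^{0.5} (1 - y)\, dy = \left[ y - \frac{y^2}{2} \right]_0^{0.5} = \frac{1}{2} - \frac{1}{8} = \frac{3}{8}.$$

Dividing by $P(B) = \frac{1}{2}$ then gives $P(A|B) = \frac{3}{4}$, in agreement with the answer above.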

Independence and Conditional Probability
Conditional probability gives us a very nice way to interpret (and
check for) the independence of two events. Events A and B (each with
positive probability, so that both conditional probabilities are defined)
are independent if and only if

$$P(A|B) = P(A) \quad \text{and/or} \quad P(B|A) = P(B).$$

(We say “and/or” above since if one of those relations is true, the
other must be, so we only need to check for one of them.)

This is a mathematical formalization of the idea that if A and B are


independent, then learning about A tells us nothing about B (and
vice versa) — the probability that it occurs does not change.

Actually, many of the earlier exercises are easier to work out using
this definition rather than checking $P(A \cap B) = P(A)\,P(B)$ directly.

The Multiplication Rule


One of the main benefits of independence is that you can calculate
(potentially complicated) joint probabilities with simple multiplication.
Specifically, if a sequence of events $A_1, A_2, \ldots, A_n$ are independent,
then we can compute the probability that they all occur simply
via
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1) \cdot P(A_2) \cdots P(A_n),$$
or, alternatively written as
$$P\left(\bigcap_{i=1}^{n} A_i\right) = \prod_{i=1}^{n} P(A_i).$$

Conditional probability lets us generalize this approach to events that
are not necessarily independent, as follows:
$$P\left(\bigcap_{i=1}^{n} A_i\right) = P(A_1) \cdot \frac{P(A_1 \cap A_2)}{P(A_1)} \cdot \frac{P(A_1 \cap A_2 \cap A_3)}{P(A_1 \cap A_2)} \cdots \frac{P\left(\bigcap_{i=1}^{n} A_i\right)}{P\left(\bigcap_{j=1}^{n-1} A_j\right)}$$
$$= P(A_1)\,P(A_2|A_1)\,P(A_3|A_1 \cap A_2) \cdots P(A_n|A_1 \cap A_2 \cap \cdots \cap A_{n-1}).$$

An example will make it clear why this expansion is useful.

Example. A researcher conducting an opinion poll has a list of
phone numbers consisting of 500 Republicans and 500 Democrats.
Unfortunately, the names aren’t labeled with their political affiliation.
The researcher calls 3 people randomly chosen from the list. What is
the probability that none of the people called is a Republican?
To solve this, we let
A1 = {the 1st person called is not a Republican}
A2 = {the 2nd person called is not a Republican}
A3 = {the 3rd person called is not a Republican}
We want to compute $P(A_1 \cap A_2 \cap A_3)$. Applying the multiplication
rule,
$$P(A_1 \cap A_2 \cap A_3) = P(A_1)\,P(A_2|A_1)\,P(A_3|A_1 \cap A_2).$$
Computing each of the three terms on the right is straightforward.
Of the 1000 people on the list, 500 are Democrats and so
$$P(A_1) = \frac{500}{1000} = \frac{1}{2}.$$
Given that the first person was not a Republican, we know that 499
of the remaining 999 people are not Republicans, so
$$P(A_2|A_1) = \frac{499}{999}.$$
Given that the first two people were not Republicans, we know that
498 of the remaining 998 are not Republicans, and so
$$P(A_3|A_1 \cap A_2) = \frac{498}{998}.$$
Thus
$$P(A_1 \cap A_2 \cap A_3) = \frac{500 \cdot 499 \cdot 498}{1000 \cdot 999 \cdot 998} \approx 0.125.$$
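
The chain of conditional probabilities in this example can be cross-checked against a direct count of the ways to choose three people. The Python sketch below is illustrative only and not part of the original notes; it assumes, as the example does, that the three calls go to distinct people (sampling without replacement), and the variable names are hypothetical.

```python
from fractions import Fraction
from math import comb

n_total, n_dem, n_calls = 1000, 500, 3

# Multiplication rule: P(A1) * P(A2 | A1) * P(A3 | A1 and A2).
p_chain = Fraction(1)
for k in range(n_calls):
    # After k non-Republicans have been called, n_dem - k remain
    # among the n_total - k people not yet called.
    p_chain *= Fraction(n_dem - k, n_total - k)

# Counting argument: choose all three people from the 500 non-Republicans.
p_count = Fraction(comb(n_dem, n_calls), comb(n_total, n_calls))

print(p_chain, float(p_chain), p_chain == p_count)
```

The two routes agree exactly, and the result is slightly below the value $0.5^3 = 0.125$ that sampling with replacement would give.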

Total Probability Theorem
If we have at our disposal a natural way to partition the sample space
⌦, then we have yet another way we can use conditional probability
to break apart calculations.
Let $A_1, A_2, \ldots, A_n$ be disjoint events that partition $\Omega$, meaning:
$$A_i \cap A_j = \emptyset \ \text{ for all } i \ne j,$$
$$A_1 \cup A_2 \cup \cdots \cup A_n = \Omega.$$
That is, every outcome is in one and only one of the events $A_i$. We
will also assume that $P(A_i) > 0$ for $i = 1, \ldots, n$. Then for any
event B,
$$P(B) = P(A_1 \cap B) + P(A_2 \cap B) + \cdots + P(A_n \cap B)$$
$$= P(B|A_1)\,P(A_1) + P(B|A_2)\,P(A_2) + \cdots + P(B|A_n)\,P(A_n)$$
$$= \sum_{i=1}^{n} P(B|A_i)\,P(A_i).$$
Here is a picture to illustrate the idea:
[Figure: the sample space partitioned into $A_1$, $A_2$, and $A_3$, together with an event B.]

Example. No discussion of conditional probability would be com-
plete without mention of the famous “Monty Hall problem”. Suppose
you are on a game show where there are three doors. Behind one of
the doors is a prize (e.g., a car), and behind the other two doors some-
thing of little value (usually a goat). You get to pick one of the doors.
(Clearly, at this stage in the game, your chance of having picked the
correct door is $\frac{1}{3}$.) Where it gets interesting is that next, the host of
the show reveals a goat behind one of the two doors you did not pick.
You now have the chance to either stick with your original guess, or
switch to the other door. Which should you choose?

Let’s first define A to be the event that the strategy of sticking with
the original guess results in winning the prize. For convenience, let’s
number the doors, and we will start with the original guess as “Door
1”. We can then compute
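$$P(A) = \sum_{i=1}^{3} P(A \mid \text{Car behind ``Door } i\text{''})\, P(\text{Car behind ``Door } i\text{''})$$
$$= 1 \cdot \tfrac{1}{3} + 0 \cdot \tfrac{1}{3} + 0 \cdot \tfrac{1}{3} = \tfrac{1}{3}.$$

Next define B to be the event that the strategy of switching doors
results in winning the prize. Then we have
$$P(B) = \sum_{i=1}^{3} P(B \mid \text{Car behind ``Door } i\text{''})\, P(\text{Car behind ``Door } i\text{''})$$
$$= 0 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3} = \tfrac{2}{3}.$$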

Thus, you should always switch! Some people have a hard time believ-
ing this result. As a thought experiment, imagine the same game but
now with 100 doors. After you pick one, the host opens 98 of them
leaving one unopened (in addition to “your” door). In this case, most
people would decide to switch if given the chance.
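
For readers who remain unconvinced, a simulation is a useful complement to the calculation. The following Python sketch is not part of the original notes. It assumes the standard rules: the host knows where the car is, always opens a goat door other than the one you picked, and chooses at random when two such doors are available. The function names, number of games, and seed are arbitrary choices.

```python
import random

def play(switch, rng):
    """Play one game; return True if the final pick wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining unopened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, n_games=100_000, seed=0):
    """Fraction of games won under the given strategy."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(play(switch, rng) for _ in range(n_games)) / n_games

stick_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stick: {stick_rate:.3f}, switch: {switch_rate:.3f}")
```

Up to sampling error, the two rates should come out near 1/3 and 2/3 respectively.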

Exercise:
Robert is the star quarterback for your favorite football team. His
knee is bothering him, and so there is only a 40% chance he plays in
the next game. If he plays, the probability that your team wins is
0.75. If he does not, it is only 0.35. What is the probability that your
team wins the game?
Ans. Define A to be the event that Robert plays, and B to be the
event that the team wins. We know that
$$P(A) = 0.4,$$
and thus
$$P(A^c) = 1 - P(A) = 0.6.$$
We are also given that
$$P(B|A) = 0.75 \quad \text{and} \quad P(B|A^c) = 0.35.$$

Since A and $A^c$ are disjoint sets that together make up $\Omega$, we can
use the total probability theorem to write:
$$P(B) = P(B|A)\,P(A) + P(B|A^c)\,P(A^c)$$
$$= 0.75 \cdot 0.4 + 0.35 \cdot 0.6$$
$$= 0.51.$$

Exercise:
Anders and Blake both have coolers; Anders’ is filled with 8 sodas
and 3 beers, while Blake’s is filled with 2 sodas and 11 beers. The
coolers look identical, so you just choose one at random and start
pulling out drinks.
(a) Suppose you pull out one drink. What is the probability that you
pull out a soda?
Ans. Let A be the event that you choose Anders’ cooler, and
$B = A^c$ be the event that you choose Blake’s cooler. Let S be
the event that you draw a soda. We know that $P(S|A) = \frac{8}{11}$ and
$P(S|B) = \frac{2}{13}$. We assume that $P(A) = P(B) = 0.5$. By total
probability we have that:

$$P(S) = P(S|A)\,P(A) + P(S|B)\,P(B) = \frac{1}{2}\left(\frac{8}{11} + \frac{2}{13}\right) \approx 0.441.$$

(b) Now suppose you pull out two drinks. What is the probability
you pull out two sodas?
Ans. Let $S_1$ be the event that you draw a soda in the first pick,
and $S_2$ be the event that you draw sodas in both your first and
second pick. As above, we can use total probability to account
for the unknown choice of the cooler:
$$P(S_2) = P(S_2|A)\,P(A) + P(S_2|B)\,P(B).$$

Next note that using the multiplication rule we have
$P(S_2|A) = P(S_1|A)\,P(S_2|S_1 \cap A) = \frac{8}{11} \cdot \frac{7}{10}$ and similarly
$P(S_2|B) = \frac{2}{13} \cdot \frac{1}{12}$.
Putting these pieces together we have

$$P(S_2) = P(S_2|A)\,P(A) + P(S_2|B)\,P(B) = \frac{1}{2}\left(\frac{8 \cdot 7}{11 \cdot 10} + \frac{2 \cdot 1}{13 \cdot 12}\right) \approx 0.261.$$
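
The total probability theorem is simple to express as a small function, which makes it easy to re-check the numbers in these exercises. This is a minimal Python sketch, not part of the original notes; the function name `total_probability` and its argument layout are assumptions made for illustration.

```python
from fractions import Fraction

def total_probability(conditionals, priors):
    """Return sum_i P(B | A_i) * P(A_i) for a partition A_1, ..., A_n.

    conditionals[i] is P(B | A_i) and priors[i] is P(A_i).
    """
    if sum(priors) != 1:
        raise ValueError("partition probabilities must sum to 1")
    return sum(c * p for c, p in zip(conditionals, priors))

half = Fraction(1, 2)

# (a) One drink: P(S | A) = 8/11, P(S | B) = 2/13.
p_one_soda = total_probability([Fraction(8, 11), Fraction(2, 13)], [half, half])

# (b) Two drinks, drawn without replacement from the chosen cooler.
p_two_sodas = total_probability(
    [Fraction(8, 11) * Fraction(7, 10), Fraction(2, 13) * Fraction(1, 12)],
    [half, half],
)

# Quarterback exercise: P(win | plays) = 0.75, P(win | does not play) = 0.35.
p_win = total_probability(
    [Fraction(3, 4), Fraction(7, 20)], [Fraction(2, 5), Fraction(3, 5)]
)

print(float(p_one_soda), float(p_two_sodas), float(p_win))
```

Using `Fraction` inputs keeps the check on the priors exact; with floating-point inputs that equality test would need a tolerance.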

38
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 21:27, August 27, 2023
Independence
We have already considered several cases where we have multiple
outcomes (e.g., two rolls of a die, several tosses of a coin, a pair of
variables (x, y) drawn from a unit-square). In each of these cases we
have been inherently assuming that one outcome gives us no informa-
tion about the other (the first roll of the die or toss of the coin has
no impact on the outcome for the next roll or toss, and the value of
x tells us nothing about the value of y).

We can formalize this notion as follows: two events A and B are


independent if
P (A \ B) = P (A) · P (B) .

That is, if two events are independent, we can compute the probabil-
ity that they both occur simultaneously by computing both of their
individual probabilities and combining them (with a product).

Here are other examples of events that are independent:


The outcomes of consecutive rolls of a die.
Whether you are over 6 feet tall, and whether the person sitting
next to you is over 6 feet tall.
Whether two randomly chosen people in the classroom had a
dog when they were younger.
etc.

20
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 10:02, August 28, 2023
Here are examples of events that are not independent:
The sum of two die rolls, and the value of the first roll.
The first card that you draw out of a well shu✏ed deck is a
spade, and the second card you draw is a spade.
Whether you are over 6 feet tall, and whether your father is
over 6 feet tall.
Whether Apple stock goes up tomorrow, and whether Google
stock goes up tomorrow.

Important note: Disjointness does not imply independence!


In fact, the opposite is true; if A and B are disjoint (mutually exclu-
sive) events (i.e., A \ B = ;) with positive probabilities, then
P (A \ B) = P (;) = 0 6= P (A) P (B) .
In this case, knowing that A occurred tells you something very tan-
gible about B, namely that it did not occur.
Independent events Disjoint events

⌦ ⌦
A

B A B

P (A \ B) = P (A) · P (B) P (A \ B) = P (;) = 0

21
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 10:02, August 28, 2023
Examples. Consider an experiment involving two successive rolls of
a fair (6-sided) die. Which pairs of events below are independent?

1. A = {the first roll is a 3},


B = {the second roll is a 6}
6
Ans. P (A) = 36 = 16 and P (B) = 36
6
= 16 , thus
1
P (A \ B) = 36 = P (A) P (B) =) independent.
2. A = {the first roll is a 1},
B = {the sum of the two rolls is 2}
6
P (A) = 36 = 16 and P (B) = 361
, thus
1
P (A \ B) = 36 6= P (A) P (B) =) not independent.
3. A = {the first roll is even},
B = {the sum of the two rolls is even}
P (A) = 18 1 18 1
36 = 2 and P (B) = 36 = 2 , thus
9
P (A \ B) = 36 = P (A) P (B) =) independent.
4. A = {the first roll is even},
B = {the product of the two rolls is even}
P (A) = 18 1 27
36 = 2 and P (B) = 36 , thus
P (A \ B) = 18 1
6 P (A) P (B) =) not independent.
36 = 2 =
5. A = {at least one roll was a 2},
B = {the sum of the rolls is 5}
P (A) = 11 4 1
36 and P (B) = 36 = 9 , thus
2 1
P (A \ B) = 36 = 18 6= P (A) P (B) =) not independent.
6. A = {the first roll is a 2},
B = {the sum of the two rolls is 7}
(tread carefully)
6
P (A) = 36 = 16 and P (B) = 36
6
= 16 , thus
1
P (A \ B) = 36 = P (A) P (B) =) independent.

22
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 10:02, August 28, 2023
Exercise: We generate two bits B1 and B2 independently at random
with
1
P (B1 = 0) = P (B1 = 1) = ,
2
and similarly for B2 . Set
Z = B1 B2 , where is the ‘xor’ operator.
Are the events {B1 = 1} and {Z = 1} independent?
Ans.
1
P (B1 = 1) =
2
1
P (Z = 1) =
2
1
P (B1 = 1 \ Z = 1) = = P (B1 = 1) P (Z = 1) =) independent.
4

Exercise: Suppose that a point (x, y) is chosen from the unit square
[0, 1] ⇥ [0, 1] in the plane according to the uniform law. Consider the
events
A = {x + y  1}
B = {max(x, y)  0.5}
Are A and B independent?
Ans.
1
P (A) =
2
1
P (B) =
4
1
P (A \ B) = 6= P (A) P (B) =) not independent.
4

23
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 10:02, August 28, 2023
Exercise: Suppose that a point (x, y) is chosen from the unit square
[0, 1] ⇥ [0, 1] in the plane according to the uniform law. Consider the
events

A = {x + y  1}
B = {{max(x, y)  0.5} [ {min(x, y) 0.5}}

Are A and B independent?


Ans.
1
P (A) =
2
1
P (B) =
2
1
P (A \ B) = = P (A) P (B) =) independent.
4

Exercise: Suppose that a point (x, y) is chosen from the unit square
[0, 1] ⇥ [0, 1] in the plane according to the uniform law. Consider the
events

A = {x + y  1}
B = {{max(x, y)  0.25} [ {min(x, y) 0.5}}

Are A and B independent?


Ans.
1
P (A) =
2
5
P (B) =
16
1
P (A \ B) = 6= P (A) P (B) =) not independent.
16

24
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 10:02, August 28, 2023
Independence of multiple events
Now, suppose we are interested in more than two events.
A collection of n events A1 , A2 , . . . , An are independent if and only if
the probability of any intersection of the Ai ’s is the product of the
individual probabilities.
P (Ai \ Aj ) = P (Ai ) P (Aj ) for all i, j.
(This is called pairwise independence.)
P (Ai \ Aj \ Ak ) = P (Ai ) P (Aj ) P (Ak ) for all i, j, k.
...

P (A1 \ A2 \ · · · \ An ) = P (A1 ) P (A2 ) · · · P (An )

Note: Pairwise independence does not imply independence. To see


this, let s1 = ±1 and s2 = ±1 be independent signs with
1
P (s1 = 1) = P (s1 =
1) = ,
2
and similarly for s2 . Set x = s1 s2 , and consider the events
A = {s1 = 1}
B = {s2 = 1}
C = {x = 1}.
Then it easy to see that while A and B are independent, A and C are
independent, and B and C are independent, but A, B, C are clearly
not independent:
P (A \ B \ C) 6= P (A) P (B) P (C) ,
since knowing two of s1 , s2 , x uniquely determines the third.

25
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 10:02, August 28, 2023
Because the joint probabilities of independent events factor, it is easy
to calculate the probability of the combination of a bunch of indepen-
dent events by simple multiplication.

Exercise: M&Ms come in six colors: red, blue, green, yellow, brown,
and orange. Suppose that our bag contains an equal number of all
colors and imagine repeatedly drawing M&Ms from the bag, return-
ing the M&M to bag after each draw. What is the probability that
the next six M&Ms I pull out of this bag are all green?

Ans.

P (M1 = ‘g’, . . . , M6 = ‘g’) = P (M1 = ‘g’) · · · P (M6 = ‘g’)


✓ ◆6
1
=
6
1
= .
46, 656

Exercise: If I am an 85% free throw shooter, what is the probability


that I make my next three free throws?

Ans. Assuming independence, this is simply 0.853 ⇡ 0.614.

26
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 10:02, August 28, 2023
Conditional Probability
In general, events are not independent and knowing about one event
can provide significant information about another event. Conditional
probability gives us a systematic way to reason about the outcome of
an experiment based on this type of partial information.
Examples:
A die is rolled two times. You are told that the sum of the rolls
is 9. What is the probability that the first roll was a 6?
The alarm is going o↵ in my house. What is the probability
that there is a burglar present?
You enjoyed watching “The Mandalorian”. How likely is it that
you will enjoy watching “Ahsoka”?
Given knowledge of an event B, we can construct a new (updated)
probability law for outcomes in ⌦.
Definition: The conditional probability of an event A given the
occurrence of an event B with P (B) > 0 is

P (A \ B)
P (A|B) = .
P (B)

One way to think about this graphically is that you are redefining
the sample space to be B and then calculating the relative size of A
within that new sample space. It is not hard to check that if P (·) is
a valid probability law on ⌦ (i.e., satisfies the Kolmogorov axioms),
then P (·|B) is also a valid probability law.
Important Note: P (A|B) 6= P (B|A). This is one of the most
common mistakes made by people with no background in probability.
Much more on this later.

27
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 10:02, August 28, 2023
Example. Suppose I open a book written in English, put my finger
down at a random location, and then note which letter is closest.
Here are the probabilities for each of the 26 letters:
0.14

0.12

0.1

0.08

0.06

0.04

0.02

0
a b c d e f g h i j k l m n o p q r s t u v w x y z

(Another way to interpret this: A’s make up just over 8% of letters


in English, B’s make up about 1.8%, E’s about 12.5%, etc.)

Now suppose that I see my finger falls on an L. Here are the condi-
tional probabilities for the letter that immediately following the L in
the text:
0.18

0.16

0.14

0.12

0.1

0.08

0.06

0.04

0.02

0
a b c d e f g h i j k l m n o p q r s t u v w x y z

28
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 10:02, August 28, 2023
Example. Recall the example where we toss a fair coin three con-
secutive times. The sample space consists of eight sequences:

⌦ = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

Now suppose that we wish to find the conditional probability P (A|B)


where A and B are the events

A = {we toss two or more heads},


B = {the first toss is a head}.

First compute
4 1
P (B) = = .
8 2

Then find
3
P (A \ B) = .
8

And then combine these to find


P (A \ B) 3
P (A|B) = = .
P (B) 4

Compare this to P (A). P (A) = 48 = 12 6= P (A|B) . One way to


think about this particular example is that knowing B has occurred
increases the probability that A must have occurred.

29
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 10:02, August 28, 2023
Exercise:
Suppose that we now toss a fair coin ten consecutive times. Let A
and B denote the events

A = {we toss two or more heads},


B = {the first toss is a head}.

What is the conditional probability P (A|B)?

Ans. To calculate P (A|B) we would like P (A \ B). While in theory


we could enumerate all of the possibilities and count them up, this is
a huge number. Sometimes it’s easier to think about when an event
fails instead of when it succeeds. For example, consider the event
(Ac \ B), which can only happen if we get heads on the first toss and
tails on all the rest. So, by independence we can calculate that:

P ((Ac \ B)) = P (C1 = ‘H’, C2 = ‘T’, . . . , C10 = ‘T’)


= P (C1 = ‘H’) P (C2 = ‘T’) . . . P (C10 = ‘T’) = (0.5)10

Then note that we can write B = (A \ B) [ (Ac \ B), and because


these sets are disjoint, the probabilities add:

P (B) = P ((A \ B) [ (Ac \ B)) = P ((A \ B)) + P ((Ac \ B))

=) P ((A \ B)) = P (B) P ((Ac \ B)) = 0.5 (0.5)10 = 0.4990


Finally we can calculate the conditional probability we desire:

P (A \ B) 0.4990
P (A|B) = = = 0.998
P (B) 0.5

30
ECE 3077 Notes by M. Davenport, J. Romberg, C. Rozell, and M. Wakin. Last updated 10:02, August 28, 2023
Exercise:
Suppose that a point (x, y) is chosen from the unit square [0, 1]⇥[0, 1]
in the plane according to the uniform law. Consider the events

A = {x + y  1}, B = {y  0.5}.

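(a) Sketch the sample space $\Omega$ and the events $A$ and $B$ as subsets of $\Omega$.

[Figure: blank axes running from 0 to 1 in each direction, for sketching the unit square.]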

(b) Calculate P (A|B).


Ans.
$$P(B) = \frac{1}{2}$$
$$P(A \cap B) = \frac{3}{8}$$
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{3}{4}$$
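Because the law is uniform, these probabilities are just areas, so a Monte Carlo estimate offers a rough check. The sketch below assumes both events are defined with "less than or equal to" (reading them with "greater than or equal to" gives the same value by symmetry); the function name, sample count, and seed are arbitrary illustrative choices.

```python
import random

def estimate_p_A_given_B(num_samples=200_000, seed=0):
    """Estimate P(x + y <= 1 | y <= 0.5) for (x, y) uniform on the unit square."""
    rng = random.Random(seed)
    in_B = 0
    in_A_and_B = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if y <= 0.5:               # event B occurred
            in_B += 1
            if x + y <= 1:         # event A occurred as well
                in_A_and_B += 1
    # Ratio of counts approximates P(A and B) / P(B).
    return in_A_and_B / in_B

estimate = estimate_p_A_given_B()
print(round(estimate, 3))
```

The estimate should land close to $3/4$; it is only approximate, and the exact value comes from the area calculation in the answer.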

Independence and Conditional Probability
Conditional probability gives us a very nice way to interpret (and check for) the independence of two events. Events $A$ and $B$ with $P(A) > 0$ and $P(B) > 0$ are independent if and only if

$$P(A|B) = P(A) \quad \text{and/or} \quad P(B|A) = P(B).$$

(We say “and/or” above since if one of those relations is true, the other must be, so we only need to check for one of them.)

This is a mathematical formalization of the idea that if $A$ and $B$ are independent, then learning that $A$ has occurred tells us nothing about $B$ (and vice versa): the probability that $B$ occurs does not change.

Actually, many of the earlier exercises are easier to work out using this characterization rather than checking $P(A \cap B) = P(A) P(B)$ directly.
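To see this check in action, here is a small sketch using two rolls of a fair die. The events chosen (first roll is 3, sum is 7, sum is 8) and the helper names are illustrative assumptions, not taken from the notes.

```python
from fractions import Fraction
from itertools import product

# 36 equally likely outcomes of two rolls of a fair die.
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for w in rolls if event(w)), len(rolls))

def cond_prob(a, b):
    """P(a | b), assuming P(b) > 0."""
    return prob(lambda w: a(w) and b(w)) / prob(b)

def first_is_3(w):
    return w[0] == 3

def sum_is_7(w):
    return w[0] + w[1] == 7

def sum_is_8(w):
    return w[0] + w[1] == 8

# Independence test: does conditioning on the first roll change the probability?
independent_7 = cond_prob(sum_is_7, first_is_3) == prob(sum_is_7)
independent_8 = cond_prob(sum_is_8, first_is_3) == prob(sum_is_8)
print(independent_7, independent_8)
```

Under these assumptions, "sum is 7" turns out to be independent of "first roll is 3" (both probabilities are $1/6$), while "sum is 8" is not ($1/6$ versus $5/36$). A single comparison of $P(A|B)$ with $P(A)$ settles each case.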

The Multiplication Rule


One of the main benefits of independence is that you can calculate (potentially complicated) joint probabilities with simple multiplication. Specifically, if the events $A_1, A_2, \ldots, A_n$ are independent, then we can compute the probability that they all occur simply via
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1) \cdot P(A_2) \cdots P(A_n),$$
or, written more compactly,
$$P\left(\bigcap_{i=1}^{n} A_i\right) = \prod_{i=1}^{n} P(A_i).$$

Conditional probability lets us generalize this approach to events that are not necessarily independent, as follows:

$$P\left(\bigcap_{i=1}^{n} A_i\right) = P(A_1) \cdot \frac{P(A_1 \cap A_2)}{P(A_1)} \cdot \frac{P(A_1 \cap A_2 \cap A_3)}{P(A_1 \cap A_2)} \cdots \frac{P\left(\bigcap_{i=1}^{n} A_i\right)}{P\left(\bigcap_{j=1}^{n-1} A_j\right)}$$

$$= P(A_1)\, P(A_2|A_1)\, P(A_3|A_1 \cap A_2) \cdots P(A_n|A_1 \cap A_2 \cap \cdots \cap A_{n-1}).$$

An example will make it clear why this expansion is useful.

Example. A researcher conducting an opinion poll has a list of
phone numbers consisting of 500 Republicans and 500 Democrats.
Unfortunately, the names aren’t labeled with their political affiliation.
The researcher calls 3 people randomly chosen from the list. What is
the probability that none of the people called is a Republican?
To solve this, we let
$$A_1 = \{\text{the 1st person called is not a Republican}\}$$
$$A_2 = \{\text{the 2nd person called is not a Republican}\}$$
$$A_3 = \{\text{the 3rd person called is not a Republican}\}$$
We want to compute $P(A_1 \cap A_2 \cap A_3)$. Applying the multiplication rule,
$$P(A_1 \cap A_2 \cap A_3) = P(A_1)\, P(A_2|A_1)\, P(A_3|A_1 \cap A_2).$$
Computing each of the three terms on the right is straightforward. Of the 1000 people on the list, 500 are Democrats and so
$$P(A_1) = \frac{500}{1000} = \frac{1}{2}.$$
Given that the first person was not a Republican, we know that 499 of the remaining 999 people are not Republicans, so
$$P(A_2|A_1) = \frac{499}{999}.$$
Given that the first two people were not Republicans, we know that 498 of the remaining 998 are not Republicans, and so
$$P(A_3|A_1 \cap A_2) = \frac{498}{998}.$$
Thus
$$P(A_1 \cap A_2 \cap A_3) = \frac{500 \cdot 499 \cdot 498}{1000 \cdot 999 \cdot 998} \approx 0.125.$$
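The chain of conditional probabilities above is easy to mechanize. The sketch below builds the product one call at a time and cross-checks it against a direct count of 3-person subsets; the function name `prob_no_republicans` and its parameters are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction
from math import comb

def prob_no_republicans(democrats, republicans, calls):
    """Multiplication rule P(A1) P(A2|A1) ... for calls made without replacement."""
    p = Fraction(1)
    for k in range(calls):
        # Given that the first k calls reached Democrats,
        # democrats - k of the remaining (total - k) people are Democrats.
        p *= Fraction(democrats - k, democrats + republicans - k)
    return p

p_chain = prob_no_republicans(500, 500, 3)

# Cross-check by counting: all 3 people called come from the 500 Democrats.
p_count = Fraction(comb(500, 3), comb(1000, 3))

print(float(p_chain), p_chain == p_count)
```

Both routes give the same exact fraction, roughly 0.1246, which is slightly below the $(1/2)^3 = 0.125$ that would hold if the calls were independent.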

Total Probability Theorem
If we have at our disposal a natural way to partition the sample space $\Omega$, then we have yet another way we can use conditional probability to break apart calculations.
Let $A_1, A_2, \ldots, A_n$ be disjoint events that partition $\Omega$, meaning:
$$A_i \cap A_j = \emptyset \quad \text{for all } i \neq j,$$
$$A_1 \cup A_2 \cup \cdots \cup A_n = \Omega.$$
That is, every outcome is in one and only one of the events $A_i$. We will also assume that $P(A_i) > 0$ for $i = 1, \ldots, n$. Then for any event $B$,
$$P(B) = P(A_1 \cap B) + P(A_2 \cap B) + \cdots + P(A_n \cap B)$$
$$= P(B|A_1) P(A_1) + P(B|A_2) P(A_2) + \cdots + P(B|A_n) P(A_n)$$
$$= \sum_{i=1}^{n} P(B|A_i) P(A_i).$$

Here is a picture to illustrate the idea:

[Figure: a sample space partitioned into $A_1$, $A_2$, and $A_3$, with an event $B$.]

Example. No discussion of conditional probability would be com-
plete without mention of the famous “Monty Hall problem”. Suppose
you are on a game show where there are three doors. Behind one of
the doors is a prize (e.g., a car), and behind the other two doors some-
thing of little value (usually a goat). You get to pick one of the doors.
(Clearly, at this stage in the game, your chance of having picked the correct door is $\frac{1}{3}$.) Where it gets interesting is that next, the host of the show reveals a goat behind one of the two doors you did not pick.
You now have the chance to either stick with your original guess, or
switch to the other door. Which should you choose?

Let’s first define $A$ to be the event that the strategy of sticking with the original guess results in winning the prize. For convenience, let’s number the doors, and we will take the original guess to be “Door 1”. We can then compute
$$P(A) = \sum_{i=1}^{3} P(A|\text{Car behind Door } i)\, P(\text{Car behind Door } i)$$
$$= 1 \cdot \tfrac{1}{3} + 0 \cdot \tfrac{1}{3} + 0 \cdot \tfrac{1}{3} = \tfrac{1}{3}.$$

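Next define $B$ to be the event that the strategy of switching doors results in winning the prize. If the car is behind Door 1, switching loses. If it is behind Door 2 or Door 3, the host must open the other goat door, so switching wins. Then we have
$$P(B) = \sum_{i=1}^{3} P(B|\text{Car behind Door } i)\, P(\text{Car behind Door } i)$$
$$= 0 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3} = \tfrac{2}{3}.$$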

Thus, you should always switch! Some people have a hard time believ-
ing this result. As a thought experiment, imagine the same game but
now with 100 doors. After you pick one, the host opens 98 of them
leaving one unopened (in addition to “your” door). In this case, most
people would decide to switch if given the chance.
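For readers who remain skeptical, a simulation is a useful sanity check. The sketch below is one possible setup: it assumes the host always opens a goat door other than the contestant's pick, and (an assumption not stated in the notes) chooses uniformly at random when two such doors are available. The function names, trial count, and seed are illustrative.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Host opens a goat door that is not the contestant's pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the single door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(round(stay_rate, 3), round(switch_rate, 3))
```

The empirical win rates should come out near $1/3$ for sticking and $2/3$ for switching, in line with the total probability calculation.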

Exercise:
Robert is the star quarterback for your favorite football team. His
knee is bothering him, and so there is only a 40% chance he plays in
the next game. If he plays, the probability that your team wins is
0.75. If he does not, it is only 0.35. What is the probability that your
team wins the game?
Ans. Define A to be the event that Robert plays, and B to be the
event that the team wins. We know that

$$P(A) = 0.4,$$

and thus
$$P(A^c) = 1 - P(A) = 0.6.$$
We are also given that

$$P(B|A) = 0.75 \quad \text{and} \quad P(B|A^c) = 0.35.$$

Since $A$ and $A^c$ are disjoint and together cover the whole sample space, we can use the total probability theorem to write:

$$P(B) = P(B|A) P(A) + P(B|A^c) P(A^c)$$
$$= 0.75 \cdot 0.4 + 0.35 \cdot 0.6$$
$$= 0.51.$$
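The total probability theorem is just a weighted sum, which the following minimal sketch makes explicit. The helper `total_probability` and its argument names are hypothetical; it takes the partition probabilities $P(A_i)$ and the conditional probabilities $P(B|A_i)$ in matching order.

```python
def total_probability(priors, likelihoods):
    """Return sum_i P(B|A_i) P(A_i) for a partition A_1, ..., A_n."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-12:
        raise ValueError("priors must sum to 1 for a partition")
    return sum(p * l for p, l in zip(priors, likelihoods))

# Partition: Robert plays (0.4) or does not play (0.6).
p_win = total_probability([0.4, 0.6], [0.75, 0.35])
print(round(p_win, 2))
```

The check that the priors sum to 1 guards against accidentally passing events that do not form a partition, which is the one hypothesis the theorem cannot do without.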

Exercise:
Anders and Blake both have coolers; Anders’ is filled with 8 sodas
and 3 beers, while Blake’s is filled with 2 sodas and 11 beers. The
coolers look identical, so you just choose one at random and start
pulling out drinks.
(a) Suppose you pull out one drink. What is the probability that you
pull out a soda?
Ans. Let $A$ be the event that you choose Anders’ cooler, and $B = A^c$ be the event that you choose Blake’s cooler. Let $S$ be the event that you draw a soda. We know that $P(S|A) = \frac{8}{11}$ and $P(S|B) = \frac{2}{13}$. We assume that $P(A) = P(B) = 0.5$. By total probability we have that:

$$P(S) = P(S|A) P(A) + P(S|B) P(B) = \tfrac{1}{2}\left(\tfrac{8}{11} + \tfrac{2}{13}\right) \approx 0.441.$$

(b) Now suppose you pull out two drinks. What is the probability
you pull out two sodas?
Ans. Let $S_1$ be the event that you draw a soda in the first pick, and $S_2$ be the event that you draw sodas in both your first and second pick. As above, we can use total probability to account for the unknown choice of the cooler:

$$P(S_2) = P(S_2|A) P(A) + P(S_2|B) P(B).$$

Next note that using the multiplication rule we have $P(S_2|A) = P(S_1|A) P(S_2|S_1 \cap A) = \frac{8}{11} \cdot \frac{7}{10}$ and similarly $P(S_2|B) = \frac{2}{13} \cdot \frac{1}{12}$. Putting these pieces together we have

$$P(S_2) = P(S_2|A) P(A) + P(S_2|B) P(B) = \tfrac{1}{2}\left(\tfrac{8 \cdot 7}{11 \cdot 10} + \tfrac{2}{13 \cdot 12}\right) \approx 0.261.$$
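Both parts of this exercise combine the multiplication rule (within a cooler) with total probability (across coolers), and the arithmetic can be verified exactly. The sketch below is illustrative only; the names `p_all_sodas`, `p_sodas`, and the `coolers` dictionary are assumptions introduced here.

```python
from fractions import Fraction

def p_all_sodas(sodas, beers, draws):
    """P(the first `draws` drinks from one cooler are all sodas), no replacement."""
    p = Fraction(1)
    for k in range(draws):
        # After k sodas are removed, sodas - k remain among sodas + beers - k drinks.
        p *= Fraction(sodas - k, sodas + beers - k)
    return p

coolers = {"Anders": (8, 3), "Blake": (2, 11)}   # (sodas, beers)
p_cooler = Fraction(1, 2)                        # each cooler equally likely

def p_sodas(draws):
    """Total probability over the unknown choice of cooler."""
    return sum(p_cooler * p_all_sodas(s, b, draws) for s, b in coolers.values())

p_one = p_sodas(1)
p_two = p_sodas(2)
print(float(p_one), float(p_two))
```

The exact values are $63/143$ for one soda and $2239/8580$ for two, which round to the 0.441 and 0.261 quoted in the answers.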

