18 - Expected Value

The document discusses expected value and how to calculate it for random variables. It provides examples of rolling dice and calculates the expected value in different cases. It also compares calculating expected value to calculating the average value and discusses when they may be different.


EXPECTED VALUE

DISCRETE STRUCTURES II
DARRYL HILL
BASED ON THE TEXTBOOK:
DISCRETE STRUCTURES FOR COMPUTER SCIENCE: COUNTING,
RECURSION, AND PROBABILITY
BY MICHIEL SMID
S: Sample space
Outcome: element of S
Event: subset of S
Pr: S → [0, 1], with Σ_{w∈S} Pr(w) = 1

Random Variable: a function X: S → ℝ ("neither random nor variable")

Example: S = {a, b, c} with Pr(a) = 7/10, Pr(b) = 2/10, Pr(c) = 1/10, and X(a) = 1, X(b) = 2, X(c) = 3.

E(X) = expected value of X. What would the "average" value of X be? First instinct:

E(X) = (1 + 2 + 3)/3 = 2

But we choose a ∈ S 70% of the time, b 20% of the time, and c 10% of the time! What is the average, then, if we consider the probabilities?
E(X) = expected value of X, a "weighted average": each value of the random variable is given a weight proportional to its probability.

E(X) = 1·(7/10) + 2·(2/10) + 3·(1/10) = 14/10

Note that:

Σ weights = Σ_{w∈S} Pr(w) = 1
If I run the experiment 10 times, I would expect to get a 7 times, b twice, and c once. If I average those values (i.e., divide by 10), I get 14/10.
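This weighted average is easy to check numerically. Here is a small Python sketch of our own (the slides themselves contain no code; the names pr and X are ours) comparing the exact weighted average with the empirical average over many simulated runs:

```python
import random

# Distribution from the slides: Pr(a) = 7/10, Pr(b) = 2/10, Pr(c) = 1/10,
# with X(a) = 1, X(b) = 2, X(c) = 3.
pr = {"a": 0.7, "b": 0.2, "c": 0.1}
X = {"a": 1, "b": 2, "c": 3}

# Exact expected value: the weighted average of X.
exact = sum(X[w] * pr[w] for w in pr)  # 14/10 = 1.4 (up to floating point)

# Empirical check: run the experiment many times and average the results.
rng = random.Random(0)
outcomes = rng.choices(list(pr), weights=list(pr.values()), k=100_000)
empirical = sum(X[w] for w in outcomes) / len(outcomes)

print(exact)      # 1.4 up to floating-point rounding
print(empirical)  # close to 1.4
```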

The definition of expected value is:

E(X) = Σ_{w∈S} X(w)·Pr(w)

For the random variable X above:

E(X) = X(a)·Pr(a) + X(b)·Pr(b) + X(c)·Pr(c)
     = 1·(7/10) + 2·(2/10) + 3·(1/10) = 14/10
Roll a fair die: S = {1, 2, 3, 4, 5, 6}, uniform probability. Thus every element of S has probability 1/6.

X = result of roll; X(i) = i.

E(X) = Σ_{i=1}^{6} X(i)·Pr(i)
     = 1·Pr(1) + 2·Pr(2) + 3·Pr(3) + 4·Pr(4) + 5·Pr(5) + 6·Pr(6)
     = (1/6)·(1 + 2 + 3 + 4 + 5 + 6)
     = 3.5

In this particular case it is simply the average die roll, not a value you would expect to see (since you cannot roll 3.5). You can think of it as rolling a die many times (say millions of times) and taking the average of all rolls. The average would be close to 3.5.
Roll a fair die: S = {1, 2, 3, 4, 5, 6}, uniform probability. X = result of roll; X(i) = i, so E(X) = 3.5.

Now let Y = 1/result, i.e., Y(i) = 1/i.

E(Y) = Σ_{i=1}^{6} (1/i)·(1/6)
     = (1/6)·Σ_{i=1}^{6} 1/i
     = (1/6)·(1/1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6)
     = (1/6)·(120/120 + 60/120 + 40/120 + 30/120 + 24/120 + 20/120)
     = 49/120
So for the fair die, with Y = 1/X:

E(Y) = E(1/X) = 49/120 ≈ 0.408

whereas

E(X) = 7/2, so 1/E(X) = 2/7 ≈ 0.286

Thus in general E(1/X) ≠ 1/E(X).
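Exact arithmetic with Python's fractions module confirms the gap between E(1/X) and 1/E(X). This is a sketch of our own, not from the slides:

```python
from fractions import Fraction

# Fair die: X(i) = i, each face with probability 1/6.
faces = range(1, 7)
p = Fraction(1, 6)

E_X = sum(i * p for i in faces)                   # 7/2
E_inv_X = sum(Fraction(1, i) * p for i in faces)  # 49/120

print(E_X, E_inv_X, 1 / E_X)  # 7/2 49/120 2/7
```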
Roll a fair red die and a fair blue die.

S = {(i, j) : 1 ≤ i ≤ 6, 1 ≤ j ≤ 6}
Uniform probability: Pr((i, j)) = 1/36
X: S → ℝ, X = red + blue: X((i, j)) = i + j

The table of values of X((i, j)) = i + j (red result i down the side, blue result j across the top):

       j=1  j=2  j=3  j=4  j=5  j=6
  i=1   2    3    4    5    6    7
  i=2   3    4    5    6    7    8
  i=3   4    5    6    7    8    9
  i=4   5    6    7    8    9   10
  i=5   6    7    8    9   10   11
  i=6   7    8    9   10   11   12

We will look at 3 ways to compute the expected value. We will go in order of difficulty.
First way: sum over all outcomes, using E(X) = Σ_{w∈S} X(w)·Pr(w).

Σ_{(i,j)∈S} X((i, j)) = sum of all entries in the table = 252

E(X) = Σ_{(i,j)∈S} X((i, j))·Pr((i, j)) = 252/36 = 7
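This first method is mechanical enough to check by brute force. A short sketch of ours:

```python
from fractions import Fraction

# All 36 equally likely outcomes (i, j) of the red and blue dice.
S = [(i, j) for i in range(1, 7) for j in range(1, 7)]
p = Fraction(1, 36)

total = sum(i + j for i, j in S)      # sum of all table entries: 252
E_X = sum((i + j) * p for i, j in S)  # 252/36 = 7

print(total, E_X)  # 252 7
```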
The goal is to get a different formula that is shorter and easier.

If we look at the table (which is really just the function X((i, j))), there are entries that occur multiple times. For instance, 4 occurs 3 times:

X = 4 on {(3,1), (2,2), (1,3)}

We only look at the elements of the summation where X = 4. There are 3 of them, so their probability sums to 3/36.
The event "X = 4" is {(3,1), (2,2), (1,3)}, so

Pr(X = 4) = |{(3,1), (2,2), (1,3)}| / |S| = 3/36

Of course this is the definition of an event, and this is exactly how we compute the probability of an event.
Second way: we can rewrite our summation to sum over all possible values of X and assign each of these events the appropriate weight. X can take on all values from 2 up to 12.

E(X) = 2·(1/36) + 3·(2/36) + 4·(3/36) + 5·(4/36) + 6·(5/36)
     + 7·(6/36) + 8·(5/36) + 9·(4/36) + 10·(3/36) + 11·(2/36)
     + 12·(1/36)
     = 7
Still pretty painful. But using the former method there are 36 entries to add up; using this method there are only 11 entries to add up.
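This second method (grouping outcomes by the value of X) can be sketched with a Counter. The code below is our own illustration, not from the slides:

```python
from collections import Counter
from fractions import Fraction

# Count how often each value k = i + j occurs among the 36 outcomes.
counts = Counter(i + j for i in range(1, 7) for j in range(1, 7))

# Only 11 terms now: E(X) = sum over values k of k * Pr(X = k).
E_X = sum(k * Fraction(n, 36) for k, n in counts.items())

print(len(counts))  # 11 possible values, k = 2 .. 12
print(E_X)          # 7
```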
Random Variable: X: S → ℝ
Event "X = k": {w ∈ S : X(w) = k}, for k ∈ range of X.

Instead of looking at every element of S, we look at every event defined by the range of X, that is, all values X can take: we sum over events instead of outcomes. Gather all elements w ∈ S for which X(w) = k. Then the summation E(X) = Σ_{w∈S} X(w)·Pr(w) can be rewritten as:

E(X) = Σ_{k} Σ_{w : X(w)=k} X(w)·Pr(w)
     = Σ_{k} Σ_{w : X(w)=k} k·Pr(w)

We are still summing over all elements of S, but we are dividing S into subsets based on the events "X = k".


Since k is constant inside the inner summation, we can pull it out:

E(X) = Σ_{k} k · Σ_{w : X(w)=k} Pr(w)

The inner sum is simply our definition of the event "X = k". Thus we can rewrite it as:

E(X) = Σ_{k} k·Pr(X = k)
These events are defined by "X = k", so every element in the event is mapped to the same value k. Likewise, being events, we have tools to compute the probability Pr(X = k).

Given our expected value definition, we gather all outcomes w with the same image under X into the same event:

E(X) = Σ_{w ∈ S (domain of X)} X(w)·Pr(w)
E(X) = Σ_{k ∈ range of X} k·Pr(X = k)

These expressions are the same (they sum in a different order).
All we have done to go from one expression to the other is change the order in which we sum over the elements w ∈ S.

Next we will look at the 3rd technique: Linearity of Expectation.
Linearity of Expectation: given two random variables X and Y,

E(X + Y) = E(X) + E(Y)

"The expected value of the sum is equal to the sum of the expected values." We will show this follows directly from the definition E(X) = Σ_{w∈S} X(w)·Pr(w).

We introduce a third random variable Z = X + Y. For any w ∈ S, the function Z(w) gives us a value, and that value is exactly the sum of the other two functions:

Z(w) = X(w) + Y(w)

If we want E(Z), we go with the definition of expected value:

E(Z) = Σ_{w∈S} Z(w)·Pr(w)
E(Z) = Σ_{w∈S} Z(w)·Pr(w)
     = Σ_{w∈S} [X(w) + Y(w)]·Pr(w)
     = Σ_{w∈S} [X(w)·Pr(w) + Y(w)·Pr(w)]
     = Σ_{w∈S} X(w)·Pr(w) + Σ_{w∈S} Y(w)·Pr(w)
     = E(X) + E(Y)
This will work for any number of random variables. If

Z(w) = z_1(w) + z_2(w) + ⋯ + z_n(w) = Σ_{i=1}^{n} z_i(w)

then

E(Z) = Σ_{i=1}^{n} Σ_{w∈S} z_i(w)·Pr(w) = Σ_{i=1}^{n} E(z_i)
Third way: linearity of expectation. Roll a fair red die and a fair blue die, with S = {(i, j) : 1 ≤ i ≤ 6, 1 ≤ j ≤ 6} and X = red + blue: X((i, j)) = i + j.

We know the expected value of one die is 3.5, and of 2 dice it is 7, which is twice the value. This is not a coincidence, and can be verified using linearity of expectation. Define random variables:

red((i, j)) = i
blue((i, j)) = j

Thus X = red + blue, and

E(red) = (1/6)·(1 + 2 + 3 + 4 + 5 + 6) = 7/2
E(blue) = (1/6)·(1 + 2 + 3 + 4 + 5 + 6) = 7/2

Using linearity of expectation:

E(X) = E(red + blue) = E(red) + E(blue) = 7/2 + 7/2 = 7
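A quick sketch of ours verifying linearity on the two-dice example:

```python
from fractions import Fraction

# Red and blue dice as random variables on S = {(i, j) : 1 <= i, j <= 6}.
S = [(i, j) for i in range(1, 7) for j in range(1, 7)]
p = Fraction(1, 36)

E_red = sum(i * p for i, j in S)        # 7/2
E_blue = sum(j * p for i, j in S)       # 7/2
E_sum = sum((i + j) * p for i, j in S)  # 7

# Linearity of expectation: E(red + blue) = E(red) + E(blue).
print(E_red, E_blue, E_sum)  # 7/2 7/2 7
```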


A couple have a child. The child is a boy, and the couple keeps trying since they wanted a girl. They have another child: another boy. They keep having children until they have a girl, at which point they stop having children.

[boy probability ½, girl probability ½, independent of the gender of the other children]

If everyone in the world does this, will there be more girls than boys in the world, or more boys than girls?

First we will develop a framework to help us solve this problem. This is an infinite probability space that we will apply expected value to.
0 < p < 1: an experiment succeeds with probability p and fails with probability 1 − p.

Instead of children, we will flip a (possibly unfair) coin. The coin comes up H with probability p and T with probability 1 − p. We flip coins until H; each coin flip is independent.

What is the sample space? We have seen this before:

S = {H, TH, TTH, TTTH, …}

We will define it slightly differently:

S = {T^(k−1)H : k ≥ 1}

where k is the number of coin flips. If k = 1 then there are 0 tails and 1 heads.

Define a random variable X = number of flips. What is E(X)?
For any individual outcome of S we can determine the probability. Each coin flip is independent, and thus:

Pr(T^(k−1)H) = Pr(f_1 = T ∧ f_2 = T ∧ ⋯ ∧ f_(k−1) = T ∧ f_k = H)
             = Pr(T)·Pr(T)·…·Pr(T)·Pr(H)
             = Pr(T)^(k−1)·Pr(H)
             = (1 − p)^(k−1)·p
As a sanity check, we can verify that the probabilities of all outcomes in S sum to 1, using the geometric series Σ_{k=0}^{∞} x^k = 1/(1 − x):

Σ_{k=1}^{∞} Pr(T^(k−1)H) = Σ_{k=1}^{∞} (1 − p)^(k−1)·p
  = p·Σ_{k=1}^{∞} (1 − p)^(k−1)     (let i = k − 1)
  = p·Σ_{i=0}^{∞} (1 − p)^i         (substitute x = 1 − p)
  = p·1/(1 − (1 − p)) = p·(1/p) = 1
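The same sanity check can be run numerically for a particular p; a sketch of ours, truncating the infinite sum at a point where the tail is negligible:

```python
# Probabilities Pr(T^(k-1) H) = (1 - p)^(k-1) * p should sum to 1.
p = 0.3
total = sum((1 - p) ** (k - 1) * p for k in range(1, 10_000))

print(total)  # ~1.0; the tail beyond k = 10000 is vanishingly small
```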
We will use the expression for E(X) where we iterate over the range of X. What is the range of X? k is the number of flips in the sequence; we have k ≥ 1 and k → ∞. Thus:

E(X) = Σ_{k=1}^{∞} k·Pr(X = k)

What is the event "X = k"? It is when there are exactly k coin flips (ending in heads). Thus

"X = k" = {T^(k−1)H},  so  Pr(X = k) = Pr(T^(k−1)H) = (1 − p)^(k−1)·p


E(X) = Σ_{k=1}^{∞} k·Pr(X = k)
     = Σ_{k=1}^{∞} k·(1 − p)^(k−1)·p
     = p·Σ_{k=1}^{∞} k·(1 − p)^(k−1)

This is an infinite series we have not seen yet. Without the p we have:

1 + 2·(1 − p) + 3·(1 − p)² + 4·(1 − p)³ + ⋯

If the k were gone, we would understand how to solve this. We will look at the general form of this expression and use our favourite infinite sum, Σ_{k=0}^{∞} x^k = 1/(1 − x).

1
෍ 𝑥𝑘 =
1−𝑥 ∞ ∞ ∞
𝑘=0
𝑑
The general form of ⋅ ෍ 𝑥 𝑘 = ෍ 𝑘 ⋅ 𝑥 𝑘−1 = ෍ 𝑘 ⋅ 𝑥 𝑘−1
σ∞ 𝑘 ⋅ 1 − 𝑝 𝑘−1 is: 𝑑𝑥
𝑘=1 𝑘=0 𝑘=0 𝑘=1

3
෍ 𝑘 ⋅ 𝑥 𝑘−1 And the RHS: 𝑃1
𝑘=1
𝑑 1 0+1 1
We want to find a closed form. We know ⋅ = 2
= 2
𝑑𝑥 1 − 𝑥 1−𝑥 1−𝑥
this:

𝑘
1 Thus we have shown that:
෍𝑥 =
1−𝑥
𝑘=0 ∞
1
෍𝑘⋅ 𝑥 𝑘−1 = 2
If we differentiate both sides of this 1−𝑥
𝑘=1
expression with respect to 𝑥, they will still
be equal. The derivative of the LHS:
E(X) = Σ_{k=1}^{∞} k·Pr(X = k)
     = p·Σ_{k=1}^{∞} k·(1 − p)^(k−1)
     = p·1/(1 − (1 − p))²        (substitute x = 1 − p)
     = p·(1/p²)
     = 1/p

Thus the expected number of trials of an experiment with probability p of success is 1/p.
E(X) = 1/p

So assume that each coin flip lands on heads with probability 1/2 and tails with probability 1/2. The expected number of times we flip the coin until we see heads is 1/(1/2) = 2. This is a nice clean result that can be applied in many different places.
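The E(X) = 1/p result is easy to sanity-check by simulation. This is our own sketch, not part of the slides:

```python
import random

# Flip a coin with heads-probability p until heads; X = number of flips.
def flips_until_heads(p: float, rng: random.Random) -> int:
    k = 1
    while rng.random() >= p:  # this flip was tails; flip again
        k += 1
    return k

rng = random.Random(42)
p = 0.5
trials = 100_000
avg = sum(flips_until_heads(p, rng) for _ in range(trials)) / trials

print(avg)  # close to 1/p = 2
```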
A couple keeps having children until they have a girl. [boy probability ½, girl probability ½, independent of the gender of the other children]

Success: have a girl. Failure: have a boy. Let X be the number of children the couple has. How many children do we expect they will have?

E(X) = 1/p = 1/(1/2) = 2

We would expect the couple to have 2 children on average.
Stop: have a girl. Continue: have a boy.

Let B be the number of boys born, G the number of girls born, and X the total number of children the couple has, so X = G + B. By linearity of expectation:

E(X) = E(G) + E(B)

G is a constant, G = 1, so E(G) = 1. With E(X) = 2:

2 = 1 + E(B)
E(B) = 2 − 1 = 1
So the expected number of boys equals the expected number of girls: E(B) = 1 = E(G). If all couples in the world did this, then on average there would be the same number of boys as girls.
What if a couple keeps having children until they have two girls? Does this change the expected number of girls and boys?

As before, with the coin flip game, we could write it out explicitly with a double summation, or break it down into two rounds. We break it into 2 rounds by defining our random variables correctly.
A couple keeps having children until they have 2 girls.

Let B1 be the number of boys born before girl 1, and B2 the number of boys born after girl 1 but before girl 2. Let G be the number of girls born, G = 2.

Let X1 be the number of children the couple has when girl 1 is born, X2 the number of children that are born after girl 1, and X the total number of children that are born.

E(X1) = 1/(1/2) = 2
E(X2) = 1/(1/2) = 2
E(X) = E(X1 + X2) = E(X1) + E(X2) = 4
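The two-rounds argument predicts 4 children and 2 boys per family on average. A simulation sketch of ours:

```python
import random

# Have children until 2 girls are born (girl probability 1/2, independent).
# Returns (total children, boys) for one simulated family.
def family_until_two_girls(rng: random.Random) -> tuple:
    children = boys = girls = 0
    while girls < 2:
        children += 1
        if rng.random() < 0.5:
            girls += 1
        else:
            boys += 1
    return children, boys

rng = random.Random(7)
trials = 100_000
results = [family_until_two_girls(rng) for _ in range(trials)]
avg_children = sum(c for c, b in results) / trials
avg_boys = sum(b for c, b in results) / trials

print(avg_children)  # close to E(X) = 4
print(avg_boys)      # close to 2
```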
