Introduction To Probability
An EXPERIMENT is a repeatable process that typically has more than one possible outcome.
The process should be repeatable since we will adopt a “frequentist” approach to probability.
A process having only one possible outcome is uninteresting from the perspective of probability.
Examples:
Measure the lifetime of a randomly chosen lightbulb of a particular type;
Find whether a voter chosen at random from a large electorate supports a particular
candidate or not in an upcoming election;
Determine the amount of relief experienced by a randomly chosen patient who is
administered either a novel treatment or a placebo according to a coin flip;
Record the sex, discipline, level, and income of a randomly chosen faculty member
from a large university.
The SAMPLE SPACE is the set of all possible outcomes to the experiment.
An EVENT is a subset of the sample space.
Example: Suppose that the experiment is to roll a six-sided die and record the number
of dots on the upturned face.
The sample space is $S = \{1, 2, 3, 4, 5, 6\}$.
Consider events $E$, $F$, $G$ given as follows:
$E$ is the event that the upturned face is even, so $E = \{2, 4, 6\}$;
$F$ is the event that the upturned face is at most 3, so $F = \{1, 2, 3\}$;
$G$ is the event that the upturned face is at least 3, so $G = \{3, 4, 5, 6\}$.
Note that the sample space $S$ is an event that always occurs. Whenever the experiment is
performed, the resulting outcome belongs to $S$.
The null or empty event (usually denoted by ∅) is the event that contains no outcomes. The
null event never occurs, for whenever the experiment is performed, the result is an outcome
that does not belong to ∅.
In the example, the event $E$ = “the upturned face is even” = $\{2, 4, 6\}$
has complement $\bar{E}$ = “the upturned face is odd” = $\{1, 3, 5\}$.
In the example, the intersection of $E$ = “the upturned face is even” = $\{2, 4, 6\}$
and $F$ = “the upturned face is at most 3” = $\{1, 2, 3\}$ is $E \cap F = \{2\}$.
In the example, the intersection of $E$ = “the upturned face is even” = $\{2, 4, 6\}$
and $G$ = “the upturned face is at least 3” = $\{3, 4, 5, 6\}$ is $E \cap G = \{4, 6\}$.
For instance, in the example, $E$ = “the upturned face is even” = $\{2, 4, 6\}$ and
$\bar{E}$ = “the upturned face is odd” = $\{1, 3, 5\}$ have no outcomes in common, and hence
they are disjoint.
The UNION of two events $E$ and $F$ (denoted by $E \cup F$ or $E + F$) is the event consisting
of all outcomes that belong to $E$ or to $F$ or to both.
In the example, the union of $E$ = “the upturned face is even” = $\{2, 4, 6\}$ and
$F$ = “the upturned face is at most 3” = $\{1, 2, 3\}$ is $E \cup F = \{1, 2, 3, 4, 6\}$.
In the example, the union of $E$ = “the upturned face is even” = $\{2, 4, 6\}$ and
$G$ = “the upturned face is at least 3” = $\{3, 4, 5, 6\}$ is $E \cup G = \{2, 3, 4, 5, 6\}$.
For instance, in the example, $E$ = “the upturned face is even” = $\{2, 4, 6\}$ and
$\bar{E}$ = “the upturned face is odd” = $\{1, 3, 5\}$ contain all possible outcomes between them,
and hence they are exhaustive.
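The event operations in the die-rolling example map directly onto set operations. The following is a minimal sketch in Python (the variable names are illustrative choices, not part of the notes):

```python
# Die-rolling example: events represented as Python sets.
S = {1, 2, 3, 4, 5, 6}   # sample space
E = {2, 4, 6}            # upturned face is even
F = {1, 2, 3}            # upturned face is at most 3
G = {3, 4, 5, 6}         # upturned face is at least 3

E_bar = S - E            # complement of E (outcomes in S that are not in E)

print("complement of E:", sorted(E_bar))
print("E and F:", sorted(E & F), " E and G:", sorted(E & G))   # intersections
print("E or F:", sorted(E | F), " E or G:", sorted(E | G))     # unions

# E and its complement share no outcomes (disjoint) and together cover S (exhaustive).
is_disjoint = (E & E_bar) == set()
is_exhaustive = (E | E_bar) == S
print(is_disjoint, is_exhaustive)
```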
In the example, concoct events $E$ and $F$ that are i) mutually exclusive but not exhaustive;
and ii) exhaustive but not mutually exclusive.
cornell econ/ilrst/stsci 3110; notes by Tom DiCiccio
PART 1: PROBABILITY
intersection, union
The notions of intersection and union are readily extended to more than two events.
For $k$ events $E_1, E_2, \ldots, E_k$, we have that
i) $E_1 \cap E_2 \cap \cdots \cap E_k = \bigcap_{i=1}^{k} E_i$ is the event that consists of all outcomes belonging to
each of $E_1, E_2, \ldots, E_k$;
ii) $E_1 \cup E_2 \cup \cdots \cup E_k = \bigcup_{i=1}^{k} E_i$ is the event that consists of all outcomes belonging to
at least one of $E_1, E_2, \ldots, E_k$.
Note the following rules (which can be shown via Venn diagrams):
$E \cap (F \cup G) = (E \cap F) \cup (E \cap G)$ and $E \cup (F \cap G) = (E \cup F) \cap (E \cup G)$.
[Venn diagrams within the sample space $S$. Complement: the region $\bar{E}$. Intersection: the region $E \cap F$. Union: the region $E \cup F$.]
meaning of probability
The goal is to “define” a notion for $P(E)$, the probability of the event $E$.
Suppose the experiment is repeated $n$ times, and let $f_n(E)$ be the number of times that
the event $E$ occurs in the $n$ repetitions. (Here, $f$ stands for “frequency.”)
The ratio $\frac{f_n(E)}{n}$ is the proportion of times that the event $E$ occurs in the $n$ repetitions.
For small $n$, this ratio can be quite variable; however, as $n$ increases, it is an empirical fact
that the ratio settles down and tends to a limit.
We think of this limit as the probability of the event $E$, i.e., $\frac{f_n(E)}{n} \to P(E)$ as $n \to \infty$.
Thus, $P(E)$ is the proportion of times that $E$ occurs in the “long run,” i.e., over a very large
number of repetitions of the experiment.
example of probability
For example, in rolls of a fair die, take $E$ = “the upturned face is even” = $\{2, 4, 6\}$.
The following table gives $f_n(E)$ and $\frac{f_n(E)}{n}$ for various values of $n$:

         n       f_n(E)   f_n(E)/n
         5            2   0.400000
        10            5   0.500000
        25           16   0.640000
        50           28   0.560000
       100           49   0.490000
       250          140   0.560000
     5,000        2,434   0.486800
    10,000        5,045   0.504500
    25,000       12,497   0.499880
    50,000       25,028   0.500560
   100,000       49,879   0.498790
   250,000      125,081   0.500324
   500,000      249,799   0.499598
 1,000,000      500,257   0.500257
 2,500,000    1,250,287   0.500115
 5,000,000    2,498,951   0.499790
10,000,000    4,999,453   0.499945
25,000,000   12,498,273   0.499931

[Figure: plot of f_n(E)/n versus log n]
In this case, it is evident that $\frac{f_n(E)}{n} \to \frac{1}{2}$ as $n \to \infty$, and hence we have $P(E) = \frac{1}{2}$.
How could we have anticipated that $P(E) = \frac{1}{2}$?
The event $E$ contains the three outcomes 2, 4, 6. For a fair die, each face turns up in one-sixth
of the rolls in the long run, so the proportion of
time that the upturned face belongs to $E$ is $\frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{3}{6} = \frac{1}{2}$.
This argument demonstrates that $P(E) = \frac{1}{2}$.
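The settling-down of the relative frequency can be reproduced with a short simulation. The sketch below assumes Python's standard pseudo-random generator as a stand-in for a fair die; the function name and the fixed seed are illustrative choices, and the counts it prints will differ from those in the table above.

```python
import random

def relative_frequency(n, seed=0):
    """Simulate n rolls of a fair die; return (f_n(E), f_n(E)/n) for E = 'face is even'."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so the run is repeatable
    count = sum(1 for _ in range(n) if rng.randint(1, 6) % 2 == 0)
    return count, count / n

# Print a small table in the same layout as the notes: n, f_n(E), f_n(E)/n.
for n in (5, 50, 500, 5_000, 50_000):
    f, ratio = relative_frequency(n)
    print(f"{n:>7,} {f:>7,} {ratio:.6f}")
```

For small `n` the ratio typically wanders; for large `n` it should sit close to 1/2.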
We now deduce the major properties of probability based on the “frequentist” interpretation.
Whenever the experiment is performed, if the outcome belongs to $E \cup F$, then it belongs
either to $E$ or to $F$, but it cannot belong to both $E$ and $F$ because $E$ and $F$ are disjoint.
Thus, in $n$ repetitions of the experiment, $f_n(E \cup F) = f_n(E) + f_n(F)$.
It follows that $\frac{f_n(E \cup F)}{n} = \frac{f_n(E)}{n} + \frac{f_n(F)}{n}$ for all $n$.
Since $\frac{f_n(E)}{n} \to P(E)$, $\frac{f_n(F)}{n} \to P(F)$, and $\frac{f_n(E \cup F)}{n} \to P(E \cup F)$ as $n \to \infty$,
we have $P(E \cup F) = P(E) + P(F)$.
Whenever the experiment is performed, if the outcome belongs to $E \cup F$, then it belongs to $E$ but
not $F$, or to $F$ but not $E$, or to both $E$ and $F$, i.e., to $E \cap F$.
Consider $n$ repetitions of the experiment.
Any repetition yielding an outcome in $E \cap F$ is counted in each of $f_n(E \cup F)$, $f_n(E)$, and $f_n(F)$.
Thus, any repetition yielding an outcome in $E \cap F$ is counted once in $f_n(E \cup F)$, but it is counted
twice in $f_n(E) + f_n(F)$.
In contrast, any repetition yielding an outcome that belongs to $E$ but not $F$, or to $F$ but not $E$, is
counted once in $f_n(E \cup F)$ and once in $f_n(E) + f_n(F)$.
Consequently, $f_n(E \cup F) = f_n(E) + f_n(F) - f_n(E \cap F)$.
It follows that $\frac{f_n(E \cup F)}{n} = \frac{f_n(E)}{n} + \frac{f_n(F)}{n} - \frac{f_n(E \cap F)}{n}$ for all $n$.
Since $\frac{f_n(E)}{n} \to P(E)$, $\frac{f_n(F)}{n} \to P(F)$, $\frac{f_n(E \cup F)}{n} \to P(E \cup F)$, and $\frac{f_n(E \cap F)}{n} \to P(E \cap F)$ as $n \to \infty$,
we have $P(E \cup F) = P(E) + P(F) - P(E \cap F)$.
properties of probability
To demonstrate Property 5, $P(E \cup F) = P(E) + P(F) - P(E \cap F)$:
LHS $= P(E \cup F) = \frac{5}{6}$;
RHS $= P(E) + P(F) - P(E \cap F) = \frac{1}{2} + \frac{1}{2} - \frac{1}{6} = \frac{5}{6}$.
In a sample space having equally likely outcomes, the probability of any event $E$
can be calculated according to the formula:
$P(E) = \dfrac{\text{number of outcomes in } E}{\text{number of outcomes in } S}$.
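The counting formula and the check of Property 5 can be sketched in a few lines. This assumes the fair-die sample space with equally likely outcomes; `prob` is a hypothetical helper name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # equally likely outcomes of a fair die
E = {2, 4, 6}            # upturned face is even
F = {1, 2, 3}            # upturned face is at most 3

def prob(event, sample_space=S):
    """P(event) = (number of outcomes in event) / (number of outcomes in S)."""
    return Fraction(len(event & sample_space), len(sample_space))

lhs = prob(E | F)                        # P(E or F), counted directly
rhs = prob(E) + prob(F) - prob(E & F)    # P(E) + P(F) - P(E and F)
print(lhs, rhs, lhs == rhs)
```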
combinatorial probability
In sample spaces having equally likely outcomes, the problem of calculating the probability
of an event $E$ reduces to the problem of counting the number of outcomes in $E$ and the
number of outcomes in $S$.
“Combinatorics” is the subject of counting arguments, so the calculation of probabilities in
sample spaces having equally likely outcomes is called “combinatorial probability.”
Combinatorial arguments are crucial because listing all possible outcomes in the sample
space as we did in the previous example is typically not feasible.
For example, consider calculating the probability of obtaining exactly 20 Heads in 30 flips
of a fair coin. It is clear that writing down all possible outcomes for 30 flips of a coin would
be infeasible.
⋮
given the first $K - 1$ parts, the $K$th part can be done in any of $n_K$ ways.
Then the total number of ways in which the process can be done is $n_1 \times n_2 \times \cdots \times n_K$.
Example: License plates consist of three letters followed by four digits. How many plates are
possible?
Creating a plate is a process with $K = 7$ parts: $n_1 = n_2 = n_3 = 26$; $n_4 = n_5 = n_6 = n_7 = 10$.
The total number of possible plates is $26^3 \times 10^4 = 175{,}760{,}000$.
Example: In the previous example, how many plates are possible if neither letters nor digits can
be repeated?
As before, creating a plate is a process with $K = 7$ parts.
Now, $n_1 = 26$, $n_2 = 25$, $n_3 = 24$, $n_4 = 10$, $n_5 = 9$, $n_6 = 8$, $n_7 = 7$.
The total number of possible plates is $26 \times 25 \times 24 \times 10 \times 9 \times 8 \times 7 = 78{,}624{,}000$.
Example: License plates consist of three letters followed by four digits. What is the probability
that a randomly chosen plate has repeats in either the letters or the digits or both?
Let $E$ be the event that all the letters and digits on the randomly chosen plate are distinct.
$P(E) = \dfrac{78{,}624{,}000}{175{,}760{,}000} = 0.4473$.
Then $\bar{E}$ is the event that the randomly chosen plate has repeats in either the letters or the
digits or both.
$P(\bar{E}) = 1 - P(E) = 1 - 0.4473 = 0.5527$.
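The license-plate counts follow mechanically from the multiplication principle. A minimal sketch, assuming a 26-letter alphabet and 10 digits as in the example (`count_ways` is an illustrative helper name):

```python
from math import prod

def count_ways(parts):
    """Multiplication principle: a process with parts n_1, ..., n_K can be done in n_1 * ... * n_K ways."""
    return prod(parts)

all_plates = count_ways([26, 26, 26, 10, 10, 10, 10])    # repeats allowed
distinct_plates = count_ways([26, 25, 24, 10, 9, 8, 7])  # no repeated letter or digit

p_distinct = distinct_plates / all_plates   # P(E): all letters and digits distinct
p_repeat = 1 - p_distinct                   # P(complement of E): some repeat occurs

print(f"{all_plates:,} {distinct_plates:,} {p_distinct:.4f} {p_repeat:.4f}")
```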
The following table gives $P(E)$ and $P(\bar{E})$ for various values of $r$:

 r    P(E)        P(Ē)
 5    0.972864    0.027136
10    0.883052    0.116948
15    0.747099    0.252901
20    0.588562    0.411438
25    0.431300    0.568700
30    0.293684    0.706316

A main point of the Birthday Problem is that everyone tends to underestimate $P(\bar{E})$, the
probability that two or more of the $r$ people have the same birthday.
The assumption that birthdays are evenly distributed throughout the year is definitely incorrect.
However, if birthdays are not evenly distributed, then the true probability that two or more of
the $r$ people have the same birthday should be even larger than the value computed under the
incorrect assumption.
So, if the values for $P(\bar{E})$ given in the table are surprisingly large, the true probabilities,
which are larger than those in the table, would be even more astonishing!
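The table can be regenerated with a few lines of code. The sketch below assumes 365 equally likely birthdays (the evenly-distributed assumption discussed above, ignoring leap days); the function name is an illustrative choice.

```python
def prob_all_distinct(r, days=365):
    """P(E): all r people have different birthdays, assuming equally likely days."""
    p = 1.0
    for i in range(r):
        # The (i+1)th person must avoid the i birthdays already taken.
        p *= (days - i) / days
    return p

for r in (5, 10, 15, 20, 25, 30):
    p_e = prob_all_distinct(r)
    print(f"{r:>2}  {p_e:.6f}  {1 - p_e:.6f}")   # r, P(E), P(complement of E)
```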
The general formula for the number of permutations can be expressed in terms of factorials:
$n \times (n-1) \times \cdots \times (n-r+1) = n \times (n-1) \times \cdots \times (n-r+1) \times \dfrac{(n-r) \times (n-r-1) \times \cdots \times 1}{(n-r) \times (n-r-1) \times \cdots \times 1} = \dfrac{n!}{(n-r)!}$.
combinations
A COMBINATION of size $r$ taken out of $n$ objects is a collection having the properties:
i) no object can appear more than once in the collection; and
ii) the order of the objects in the collection is ignored.
Example: Consider the $n = 4$ objects $a, b, c, d$.
The possible combinations of size $r = 2$ are: $ab, ac, ad, bc, bd, cd$.
Note that $aa$ is not a combination because the object $a$ is repeated in the collection.
Note that $ab$ and $ba$ are not distinct combinations because order is ignored.
In the example there are 6 combinations. How many combinations are possible in general?
Each combination of size $r$ can be ordered in $r!$ ways to create permutations.
Thus, the number of permutations is the number of combinations multiplied by $r!$.
So, in general, the number of combinations is $\dfrac{n \times (n-1) \times \cdots \times (n-r+1)}{r!} = \dfrac{n!}{r!\,(n-r)!}$.
In the example, the number of combinations is $\dfrac{4 \times 3}{2 \times 1} = \dfrac{4!}{2!\,2!} = 6$.
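The relationship between permutations and combinations can be checked by enumeration. This is a small sketch using Python's standard `itertools` and `math` modules on the four objects from the example:

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

objects = "abcd"   # the n = 4 objects from the example
r = 2

combos = ["".join(c) for c in combinations(objects, r)]   # order ignored, no repeats
perms = ["".join(p) for p in permutations(objects, r)]    # order matters, no repeats

print(combos)
print(len(perms), len(combos) * factorial(r))   # each combination yields r! permutations

# The closed-form counts agree with the enumeration.
print(perm(len(objects), r), comb(len(objects), r))
```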
The number of combinations is $\dfrac{n \times (n-1) \times \cdots \times (n-r+1)}{r!} = \dfrac{n!}{r!\,(n-r)!}$.
This number is usually denoted by $\binom{n}{r}$ and is read “$n$ choose $r$”.
$\binom{n}{r}$ is called a Binomial Coefficient since it appears as a coefficient in
the Binomial Theorem $(x + y)^n = \sum_{r=0}^{n} \binom{n}{r} x^r y^{n-r}$.
Example: A pool of mediators consists of 12 women and 6 men. Five mediators are chosen at
random from the pool to form a judicial panel. What is the probability that the panel consists of 3
women and 2 men?
The panel of 5 can be chosen from the 18 mediators in $\binom{18}{5} = 8{,}568$ equally likely ways.
The panel can be chosen so that there are 3 women and 2 men in $\binom{12}{3}\binom{6}{2} = 3{,}300$ ways.
The probability the panel has 3 women and 2 men is $\dfrac{\binom{12}{3}\binom{6}{2}}{\binom{18}{5}} = \dfrac{3{,}300}{8{,}568} = 0.385154$.
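The mediator calculation can be reproduced with `math.comb`. A minimal sketch (variable names are illustrative):

```python
from math import comb

women, men, panel_size = 12, 6, 5

total_panels = comb(women + men, panel_size)   # all equally likely panels of 5 from 18
favorable = comb(women, 3) * comb(men, 2)      # choose 3 of the women and 2 of the men

p_three_women_two_men = favorable / total_panels
print(total_panels, favorable, round(p_three_women_two_men, 6))
```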
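There are two important properties of Binomial Coefficients:
i) $\binom{n}{0} = \binom{n}{n} = 1$.
Note that $\binom{n}{0} = \dfrac{n!}{0!\,(n-0)!} = \dfrac{n!}{n!} = 1$, and $\binom{n}{n} = \dfrac{n!}{n!\,(n-n)!} = \dfrac{n!}{n!\,0!} = 1$, since $0! = 1$.
ii) $\binom{n}{r} = \binom{n}{n-r}$.
Note that $\binom{n}{r} = \dfrac{n!}{r!\,(n-r)!} = \dfrac{n!}{(n-r)!\,(n-(n-r))!} = \binom{n}{n-r}$.
Example: $\binom{10}{3} = \binom{10}{7}$.
$\binom{10}{3} = \dfrac{10 \times 9 \times 8}{3 \times 2 \times 1} = 120$, and $\binom{10}{7} = \dfrac{10 \times 9 \times 8 \times 7 \times 6 \times 5 \times 4}{7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1} = 120$.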
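With binomial coefficients in hand, the earlier question about exactly 20 Heads in 30 flips of a fair coin becomes a one-line count. The sketch below assumes that all $2^{30}$ sequences of flips are equally likely and that the favorable sequences are counted by choosing which 20 of the 30 flips are Heads; it also checks the symmetry property numerically.

```python
from math import comb

n_flips, n_heads = 30, 20

favorable = comb(n_flips, n_heads)   # sequences with exactly 20 Heads
total = 2 ** n_flips                 # all equally likely sequences of 30 flips
p_exactly_20 = favorable / total

print(favorable, total, p_exactly_20)

# Symmetry property ii): choosing the 20 Heads is the same as choosing the 10 Tails.
print(comb(30, 20) == comb(30, 10), comb(10, 3) == comb(10, 7))
```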
We can use the die-rolling example to motivate the formula for conditional probability.
If $E$ has occurred, the original sample space $S = \{1, 2, 3, 4, 5, 6\}$ is no longer relevant.
The event $E$ = “the upturned face is even” = $\{2, 4, 6\}$ becomes the relevant sample space.
The original probabilities ($\frac{1}{6}$ on each outcome) are inappropriate for the new sample space
$\{2, 4, 6\}$, because the sum of these probabilities, $\frac{1}{6} + \frac{1}{6} + \frac{1}{6} = P(E) = \frac{1}{2}$, is different from 1.
We must re-scale these probabilities so that their relative sizes are the same yet they sum to 1.
This re-scaling is achieved by dividing each of the original probabilities by $P(E) = \frac{1}{2}$.
Thus, the appropriate probabilities in the relevant sample space $E = \{2, 4, 6\}$ are each $\dfrac{1/6}{1/2} = \dfrac{1}{3}$.
In the relevant sample space $E = \{2, 4, 6\}$, the event $F$ = “the upturned face is at most 3” can
occur only if the outcome is in $E \cap F = \{2\}$.
Thus, the probability that $F$ occurs given that $E$ has occurred is $\frac{1}{3}$, the re-scaled probability of
the outcome 2.
conditional probability
The conditional probability formula $P(F \mid E) = \dfrac{P(E \cap F)}{P(E)}$ accomplishes both operations:
i) by having $P(E)$ in the denominator, it does the re-scaling of the original probabilities to
achieve appropriate probabilities for the relevant sample space $E$; and
ii) by having $P(E \cap F)$ in the numerator, it accounts for the fact that $F$ can occur in the relevant
sample space $E$ only if the outcome is in $E \cap F$.
Example: Rolling a fair die.
Recall that in the die-rolling example, we had $G$ = “the upturned face is at least 3” = $\{3, 4, 5, 6\}$,
so $E \cap G = \{4, 6\}$ and $P(E \cap G) = \frac{1}{3}$.
Then, $P(G \mid E) = \dfrac{P(E \cap G)}{P(E)} = \dfrac{1/3}{1/2} = \dfrac{2}{3}$.
Note that the “unconditional” probability of $G$ is $P(G) = \frac{2}{3}$, and the conditional probability that
$G$ occurs given that $E$ has occurred is $P(G \mid E) = \frac{2}{3}$.
Knowing that $E$ has occurred does not change the probability that $G$ occurs.
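The re-scaling argument can be written as a short function. This is a sketch for the fair-die example, assuming equally likely outcomes; `prob` and `cond_prob` are hypothetical helper names.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
E = {2, 4, 6}        # upturned face is even
F = {1, 2, 3}        # upturned face is at most 3
G = {3, 4, 5, 6}     # upturned face is at least 3

def prob(event):
    """P(event) under equally likely outcomes in S."""
    return Fraction(len(event), len(S))

def cond_prob(event, given):
    """P(event | given) = P(given and event) / P(given); requires P(given) > 0."""
    return prob(event & given) / prob(given)

print(cond_prob(F, E), prob(F))   # conditioning on E changes the probability of F
print(cond_prob(G, E), prob(G))   # conditioning on E leaves the probability of G unchanged
```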
independent events
Events $E$ and $F$ are INDEPENDENT if $P(E \cap F) = P(E) \times P(F)$.
We can understand the meaning of independence through conditional probability: if events $E$
and $F$ are independent then $P(F \mid E) = \dfrac{P(E \cap F)}{P(E)} = \dfrac{P(E) \times P(F)}{P(E)} = P(F)$, and similarly, $P(E \mid F) = P(E)$.
Thus, for two independent events, knowing that one occurs does not affect the probability that
the other occurs.
Example: Rolling a fair die. $E = \{2, 4, 6\}$, $F = \{1, 2, 3\}$, $G = \{3, 4, 5, 6\}$.
In the die-rolling example, $P(F) = \frac{1}{2}$ and $P(F \mid E) = \frac{1}{3}$, so events $E$ and $F$ are not independent.
We know that the mathematical criterion for independence must fail, and we can check:
$P(E \cap F) = \frac{1}{6}$, $P(E) \times P(F) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$, i.e., $P(E \cap F) \neq P(E) \times P(F)$.
On the other hand, $P(G) = \frac{2}{3}$ and $P(G \mid E) = \frac{2}{3}$, so events $E$ and $G$ are independent.
We know that the mathematical criterion for independence must hold, and we can check:
$P(E \cap G) = \frac{1}{3}$, $P(E) \times P(G) = \frac{1}{2} \times \frac{2}{3} = \frac{1}{3}$, i.e., $P(E \cap G) = P(E) \times P(G)$.
The notion of independence can be extended to more than 2 events.
Events $E_1, E_2, \ldots, E_n$ are said to be independent if the probability that any $k$ of them
occur simultaneously is the product of their $k$ individual probabilities:
$P(E_{i_1} \cap \cdots \cap E_{i_k}) = P(E_{i_1}) \times \cdots \times P(E_{i_k})$, where $i_1, \ldots, i_k$ are distinct indices in $\{1, \ldots, n\}$.
Example: In three flips of a fair coin, define events $E_1, E_2, E_3$ as follows:
$E_1$ = “same face on the first and second flips” = $\{HHH, HHT, TTH, TTT\}$;
$E_2$ = “same face on the first and third flips” = $\{HHH, HTH, THT, TTT\}$; and
$E_3$ = “same face on the second and third flips” = $\{HHH, THH, HTT, TTT\}$.
Then, $P(E_1) = \frac{1}{2}$, $P(E_2) = \frac{1}{2}$, $P(E_3) = \frac{1}{2}$. Furthermore, $P(E_1 \cap E_2) = \frac{1}{4} = P(E_1) \times P(E_2)$,
$P(E_1 \cap E_3) = \frac{1}{4} = P(E_1) \times P(E_3)$, and $P(E_2 \cap E_3) = \frac{1}{4} = P(E_2) \times P(E_3)$.
But, $P(E_1 \cap E_2 \cap E_3) = \frac{1}{4} \neq P(E_1) \times P(E_2) \times P(E_3)$, so $E_1, E_2, E_3$ are not independent.
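The coin-flip example, in which every pair of events is independent but the three together are not, can be verified by enumerating the eight equally likely outcomes. A minimal sketch (names are illustrative):

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of three flips, e.g. "HTH".
S = ["".join(flips) for flips in product("HT", repeat=3)]

def prob(event):
    """P(event) under equally likely outcomes in S."""
    return Fraction(len(event), len(S))

E1 = {s for s in S if s[0] == s[1]}   # same face on the first and second flips
E2 = {s for s in S if s[0] == s[2]}   # same face on the first and third flips
E3 = {s for s in S if s[1] == s[2]}   # same face on the second and third flips

# Product rule for every pair of events.
pairwise_ok = all(
    prob(A & B) == prob(A) * prob(B)
    for A, B in [(E1, E2), (E1, E3), (E2, E3)]
)
# Product rule for all three events together.
triple_ok = prob(E1 & E2 & E3) == prob(E1) * prob(E2) * prob(E3)

print(pairwise_ok, triple_ok, prob(E1 & E2 & E3))
```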
For example, suppose that among the students registered in a college of humanities and
sciences, 60% major in humanities and 40% major in sciences. Of the students majoring in
humanities, 56% are female, while among the students majoring in sciences, 47% are female.
A student in the college is chosen at random. What is the probability that the student is female?
We want to find the probability of $F$ = “the student is female”.
Let $E$ = “the student is in the humanities”. Then, $\bar{E}$ = “the student is in the sciences”.
We know that $P(E) = 0.6$, $P(F \mid E) = 0.56$, and $P(F \mid \bar{E}) = 0.47$.
How can we combine this given information to compute $P(F)$?
The LAW of TOTAL PROBABILITY states $P(F) = P(E) \times P(F \mid E) + P(\bar{E}) \times P(F \mid \bar{E})$.
The justification is as follows: since $F = (E \cap F) \cup (\bar{E} \cap F)$, where $E \cap F$ and $\bar{E} \cap F$ are
disjoint, we have $P(F) = P(E \cap F) + P(\bar{E} \cap F) = P(E) \times P(F \mid E) + P(\bar{E}) \times P(F \mid \bar{E})$.
In the example, $P(F) = 0.6 \times 0.56 + 0.4 \times 0.47 = 0.524$; 52.4% of the students in the college are female.
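The Law of Total Probability for an event $E$ and its complement is a two-term weighted average. A small sketch, with a hypothetical function name and argument names chosen for illustration:

```python
def total_probability(p_e, p_f_given_e, p_f_given_not_e):
    """P(F) = P(E) * P(F|E) + P(not E) * P(F|not E), with P(not E) = 1 - P(E)."""
    return p_e * p_f_given_e + (1 - p_e) * p_f_given_not_e

# College example: 60% humanities, 56% female in humanities, 47% female in sciences.
p_female = total_probability(p_e=0.6, p_f_given_e=0.56, p_f_given_not_e=0.47)
print(round(p_female, 3))
```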
law of total probability
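[Tree diagram: the first branching is on $E$ versus $\bar{E}$ and the second on $F$ versus $\bar{F}$. The four paths have probabilities
$P(E \cap F) = P(E) \times P(F \mid E)$,
$P(E \cap \bar{F}) = P(E) \times P(\bar{F} \mid E)$,
$P(\bar{E} \cap F) = P(\bar{E}) \times P(F \mid \bar{E})$,
$P(\bar{E} \cap \bar{F}) = P(\bar{E}) \times P(\bar{F} \mid \bar{E})$.]
In the tree diagram, note that $P(\bar{E}) = 1 - P(E)$,
$P(\bar{F} \mid E) = 1 - P(F \mid E)$, $P(\bar{F} \mid \bar{E}) = 1 - P(F \mid \bar{E})$.
$P(F) = P(E) \times P(F \mid E) + P(\bar{E}) \times P(F \mid \bar{E})$.
[Tree diagram for the college example: first branch humanities/sciences, second branch female/male.]
[Tree diagram: first branch yes/no, second branch correct/incorrect. The “no” branch has probability $1 - p$, and the “no”, then “correct” path has probability $(1 - p) \times \frac{1}{K} = \frac{1 - p}{K}$.]
$P(\text{correct answer}) = p + \dfrac{1 - p}{K}$.
With $p = 60\%$ and $K = 2$, $P(\text{correct answer}) = 80\%$.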
Example. Disease testing.
Six percent of people in a large population have a certain disease. There is a test for the
disease: the test has a false negative rate of 1% and a false positive rate of 4%. A randomly
chosen person is given the test. What is the probability that the test gives a positive result?
[Tree diagram: first branch “person has disease” (yes/no), second branch “test result” (positive/negative). The “yes”, then “positive” path has probability $0.06 \times 0.99 = 0.0594$.]
$P(\text{positive test}) = 0.0594 + 0.94 \times 0.04 = 0.097$.
Bayes’ theorem
Recall that for the Law of Total Probability we have the ingredients $P(E)$, $P(F \mid E)$, and $P(F \mid \bar{E})$,
which we combine to calculate $P(F)$.
In BAYES’ THEOREM, we use the same ingredients to calculate $P(E \mid F)$.
Since $P(F \mid E)$ is part of the ingredients, and we use Bayes’ Theorem to calculate $P(E \mid F)$,
Bayes’ Theorem is also called “reverse probability”.
For example, in the disease testing example, we might want to calculate the probability that a
randomly chosen person has the disease given that the randomly chosen person yields a
positive test result.
Bayes’ Theorem states: $P(E \mid F) = \dfrac{P(E) \times P(F \mid E)}{P(E) \times P(F \mid E) + P(\bar{E}) \times P(F \mid \bar{E})}$.
The justification is as follows: $P(E \mid F) = \dfrac{P(E \cap F)}{P(F)} = \dfrac{P(E) \times P(F \mid E)}{P(E) \times P(F \mid E) + P(\bar{E}) \times P(F \mid \bar{E})}$, where the numerator is
from the multiplication rule and the denominator is from the Law of Total Probability.
The implementation of Bayes’ Theorem is facilitated by means of a tree diagram, as shown below.
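[Tree diagram: the first branching is on $E$ versus $\bar{E}$ and the second on $F$ versus $\bar{F}$. The path through $E$ and $F$, labeled (1), has probability $P(E \cap F) = P(E) \times P(F \mid E)$; the path through $\bar{E}$ and $F$, labeled (2), has probability $P(\bar{E} \cap F) = P(\bar{E}) \times P(F \mid \bar{E})$. The remaining paths have probabilities $P(E \cap \bar{F}) = P(E) \times P(\bar{F} \mid E)$ and $P(\bar{E} \cap \bar{F}) = P(\bar{E}) \times P(\bar{F} \mid \bar{E})$.]
$P(F) = P(E) \times P(F \mid E) + P(\bar{E}) \times P(F \mid \bar{E})$, the sum of paths (1) and (2).
$P(E \mid F) = \dfrac{P(E) \times P(F \mid E)}{P(E) \times P(F \mid E) + P(\bar{E}) \times P(F \mid \bar{E})} = \dfrac{(1)}{(1) + (2)}$.
[Tree diagram: first branch yes/no, second branch correct/incorrect. The “no”, then “correct” path has probability $(1 - p) \times \frac{1}{K} = \frac{1 - p}{K}$.]
$P(\text{student knows} \mid \text{correct answer}) = \dfrac{p}{p + \frac{1 - p}{K}}$.
With $p = 60\%$ and $K = 2$, $P(\text{student knows} \mid \text{correct answer}) = \dfrac{3}{4}$.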
Example. Disease testing.
Given that the test gives a positive result, what is the probability that the randomly chosen
person actually has the disease?
[Tree diagram: first branch “person has disease” (yes/no), second branch “test result” (positive/negative). The “yes”, then “positive” path has probability $0.06 \times 0.99 = 0.0594$.]
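The tree-diagram recipe, path (1) divided by the sum of paths (1) and (2), can be sketched as a small function. The function and argument names below are illustrative. The inputs for the disease example come from the stated rates (6% prevalence, 1% false negatives, 4% false positives). For the multiple-choice example, $P(\text{correct} \mid \text{knows}) = 1$ is an assumption implied by the formula $p + \frac{1-p}{K}$.

```python
def bayes(p_e, p_f_given_e, p_f_given_not_e):
    """P(E|F) = P(E)P(F|E) / [P(E)P(F|E) + P(not E)P(F|not E)]."""
    path_1 = p_e * p_f_given_e              # path through E and F
    path_2 = (1 - p_e) * p_f_given_not_e    # path through not-E and F
    return path_1 / (path_1 + path_2)

# Disease testing: E = has disease, F = positive test.
# P(positive | disease) = 1 - false negative rate = 0.99; P(positive | no disease) = 0.04.
p_disease_given_positive = bayes(p_e=0.06, p_f_given_e=0.99, p_f_given_not_e=0.04)
print(round(p_disease_given_positive, 4))

# Multiple-choice example: E = student knows the answer (p = 0.6), F = correct answer,
# with a guess among K = 2 choices being correct with probability 1/K.
p_knows_given_correct = bayes(p_e=0.6, p_f_given_e=1.0, p_f_given_not_e=1 / 2)
print(round(p_knows_given_correct, 4))
```

Under these inputs, a positive result raises the probability of disease from 6% to roughly 61%, because false positives among the many healthy people make up a sizeable share of all positive tests.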
[Tree diagram: first branch on Machine 1, 2, or 3; second branch on defective (yes/no). The machine branch probabilities shown include 0.3 and 0.2.]
$P(\text{defective})$ is the sum of the three “yes” path probabilities.
$P(\text{Machine 3} \mid \text{defective})$ is the Machine 3 “yes” path probability divided by that sum.