Chapter 1 - Probability (With Solutions)
● Introduction
● Sample spaces
● Probability measures
● Conditional probability
● Independence
What is probability?
Probability is a number between 0 and 1 that measures how likely an event is to occur.
● The higher the probability, the more certain we are that the event will occur
Sample space
Definition
An experiment is any action or process whose outcome is subject to uncertainty or
randomness
● In an experiment, only 1 possible outcome can occur. It is uncertain which outcome will occur.
Definition
The sample space of an experiment, denoted by Ω, is the set/collection of all possible
outcomes of that experiment. Each element of Ω, denoted by ω, is an outcome.
Examples
Rolling a six-sided die once: Ω = {1, 2, 3, 4, 5, 6}
Amount of time between successive customer arrivals: Ω = {t | t ≥ 0}
Events
Definition
An event A is any collection of outcomes contained in the sample space, i.e., any subset of Ω, written A ⊂ Ω. An event is simple if it consists of exactly 1 outcome and compound if it consists of more than 1 outcome.
Events are concerned with sets of outcomes, A ⊂ Ω, not just single outcomes.
● Exactly 1 simple event ({ω}) occurs, but many compound events can occur simultaneously
Algebra of events
Definition
Given any events A, B ⊂ Ω,
● The complement of A, denoted by A^c, Ā, or A′, is the set of all outcomes in Ω that are not contained in A
● The union of A and B, denoted by A ∪ B (read “A or B”), is the event consisting of all outcomes that are in A, in B, or in both
● The intersection of A and B, denoted by A ∩ B or AB (read “A and B”), is the event consisting of all outcomes that are in both A and B
Definition
The null event, denoted by ∅, is the event consisting of no outcomes
Definition
We say A and B are disjoint or mutually exclusive events when A ∩ B = ∅. It follows that A and A^c must be disjoint for any event A ⊂ Ω.
Definition
We say A and B are exhaustive events when A ∪ B = Ω. It follows that A and A^c must be exhaustive for any event A ⊂ Ω.
Venn diagrams
[Figure: Venn diagrams of A^c, A ∪ B, and A ∩ B]
[Figure: Venn diagrams in which A and B are mutually exclusive (disjoint) events, and in which A and A^c are mutually exclusive and exhaustive events]
Venn diagrams - Quiz
In a 2 coin toss experiment, if Ω = {HH, HT, TH, TT}; A = {HH} is the right circle and B = {HT, TH} is the left
circle, where is C = {TT}?
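Venn diagrams - Quiz
Let Ω = {0, 1, 2, 3, 4, 5, 6}, A = {0, 1, 2, 3, 4}, B = {3, 4, 5, 6} and C = {1, 2}. Find:
● A^c =
● A ∪ B =
● A ∪ C =
● A ∩ B =
● A ∩ C =
● (A ∩ C)^c =
● B ∩ C =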
Solution:
● A^c = {5, 6}
● A ∪ B = {0, 1, 2, 3, 4, 5, 6} = Ω (i.e. A and B are exhaustive)
● A ∪ C = {0, 1, 2, 3, 4} = A (C is a subset of A)
● A ∩ B = {3, 4}
● A ∩ C = {1, 2}
● (A ∩ C)^c = {0, 3, 4, 5, 6}
● B ∩ C = ∅ (i.e. B and C are disjoint)
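The quiz answers above can be checked with Python's built-in sets. This is only a sketch: the sets Ω, A, B and C below are inferred from the listed answers (the original figure is not available), and `complement` is a helper name introduced here for illustration.

```python
# Sets inferred from the quiz answers (an assumption about the original figure).
omega = set(range(7))          # Ω = {0, 1, ..., 6}
A = {0, 1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {1, 2}


def complement(event, sample_space=omega):
    """Return the outcomes of the sample space that are not in the event."""
    return sample_space - event


answers = {
    "A^c": complement(A),
    "A ∪ B": A | B,
    "A ∪ C": A | C,
    "A ∩ B": A & B,
    "A ∩ C": A & C,
    "(A ∩ C)^c": complement(A & C),
    "B ∩ C": B & C,
}
for name, value in answers.items():
    print(name, "=", sorted(value))
```

An empty list for B ∩ C corresponds to ∅, i.e. B and C are disjoint.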
Algebra for multiple events
The operations of union & intersection can be extended to ≥ 3 events, and the idea of disjointness &
exhaustiveness can also be generalized
● A, B & C are said to be mutually exclusive or pairwise disjoint if no 2 events have any outcomes in
common
● A, B & C are said to be exhaustive if the event A ∪ B ∪ C consists of all outcomes in Ω
Laws for multiple events
● Distributive Laws
○ (A∪B)∩C = (A∩C)∪(B∩C)
○ (A∩B)∪C = (A∪C)∩(B∪C)
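As a sanity check rather than a proof, the two distributive laws can be verified by brute force over every triple of events of a small sample space. The sample space {1, 2, 3} and the helper `all_events` are choices made here for illustration.

```python
from itertools import combinations


def all_events(sample_space):
    """List every subset (event) of a finite sample space."""
    items = sorted(sample_space)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


omega = {1, 2, 3}              # small illustrative sample space
events = all_events(omega)     # 2^3 = 8 events

distributive_holds = all(
    ((a | b) & c == (a & c) | (b & c)) and ((a & b) | c == (a | c) & (b | c))
    for a in events
    for b in events
    for c in events
)
print(len(events), distributive_holds)  # 8 True
```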
Probability measure
Definition
A probability measure on Ω is a function P from subsets of Ω to [0, 1] that satisfies the following rules:
● P(Ω) = 1
● If A ⊂ Ω, then P(A) ≥ 0
● If A1, A2, · · · , An, · · · are disjoint, then P(A1 ∪ A2 ∪ · · ·) = P(A1) + P(A2) + · · ·
Note: The above rules do not completely determine an assignment of probabilities to events. They serve only to rule out assignments inconsistent with our intuitive notions of probability.
The following properties follow from these rules:
● P(A^c) = 1 − P(A)
● P(∅) = 0 (i.e. probability that there is no outcome is 0)
● P(A) ≤ 1
● If A ⊂ B, then P(A) ≤ P(B)
● Addition Law:
○ P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Note: The addition law is the general formula for computing the probability of an event that can be written as a union of events, by decomposing it into “smaller” events
Examples
In an undergraduate module, 60% of students have statistics background, 80% have calculus background,
and 50% have both. If a student is selected at random, what is the probability that s/he has background in
(1) at least 1 subject and (2) exactly 1 subject?
Solution:
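Let A = {selected student has statistics background}, B = {selected student has calculus background}, hence P(A) = 0.6, P(B) = 0.8, P(A ∩ B) = 0.5.
(1) P(at least 1 subject) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.6 + 0.8 − 0.5 = 0.9
(2) P(exactly 1 subject) = P(A ∪ B) − P(A ∩ B) = 0.9 − 0.5 = 0.4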
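The addition law for this example can be evaluated with exact fractions. A minimal sketch, with variable names chosen here for illustration:

```python
from fractions import Fraction

p_a = Fraction(60, 100)      # P(A): statistics background
p_b = Fraction(80, 100)      # P(B): calculus background
p_both = Fraction(50, 100)   # P(A ∩ B): both backgrounds

# (1) Addition law: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
p_at_least_one = p_a + p_b - p_both

# (2) Exactly one subject: in the union but not in the intersection
p_exactly_one = p_at_least_one - p_both

print(p_at_least_one, p_exactly_one)  # 9/10 2/5
```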
Calculating probability
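● Let Ω = {ω1, ω2, · · · , ωn} be a finite sample space and let
○ P({ωi}) = pi, i = 1, 2, · · · , n
where n ≥ 1, called the cardinality of Ω, is a finite positive integer denoting the total number of outcomes in Ω
● Then, for any event A, P(A) is the sum of pi over all outcomes ωi in A
Many experiments have outcomes equally likely to occur, e.g., coin toss, dice throw, birthday date of a selected student, ...
Counting method: if all N outcomes in Ω are equally likely, then P(A) = (number of outcomes in A) / N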
Examples
Question:
Coin flip (2 tosses): Ω = {HH, HT, TH, TT}, with cardinality N = 4
Let A denote the event that at least 1 head is observed
Solution:
A = {HH, HT, TH} = {HH} ∪ {HT} ∪ {TH}
P(A) = P({HH}) + P({HT}) + P({TH})
For a fair coin, each outcome has probability ¼. Then,
P(A) = ¼ + ¼ + ¼ = ¾
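For equally likely outcomes, the same answer comes from counting: enumerate Ω, keep the outcomes that belong to A, and divide. A small sketch assuming a fair coin tossed twice:

```python
from fractions import Fraction
from itertools import product

# Enumerate the sample space of 2 coin tosses: HH, HT, TH, TT.
omega = ["".join(flips) for flips in product("HT", repeat=2)]

# Event A: at least 1 head is observed.
event_a = [outcome for outcome in omega if "H" in outcome]

# Counting method: P(A) = (number of outcomes in A) / N.
p_a = Fraction(len(event_a), len(omega))
print(omega, event_a, p_a)  # ['HH', 'HT', 'TH', 'TT'] ['HH', 'HT', 'TH'] 3/4
```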
Counting methods
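Counting all the outcomes (i.e., finding the cardinality of Ω) by listing them is often not an easy task, and obtaining the cardinality of an event A this way can be equally prohibitive. Counting methods give these cardinalities without listing every outcome.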
The multiplication principle
● If one experiment has m > 0 outcomes and another experiment has n > 0 outcomes, then there are
m × n possible outcomes for the two experiments
● If there are p > 2 experiments, where the first experiment has n1 possible outcomes, the second n2, ·
· · , the pth np possible outcomes, then there are a total of n1 × n2 × · · · × np possible outcomes for
the p experiments
Example: A coin toss has 2 outcomes: {H, T}. 50 coin tosses have 2 × 2 × … × 2 = 2^50 outcomes.
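The multiplication principle can be illustrated by pairing up the outcomes of two experiments. The coin-and-die pairing below is an illustrative choice, not taken from the slides; the 50-toss count is computed directly rather than enumerated.

```python
from itertools import product

coin = ["H", "T"]              # m = 2 outcomes
die = [1, 2, 3, 4, 5, 6]       # n = 6 outcomes (illustrative second experiment)

# Every (coin, die) pair is one outcome of the combined experiment.
joint_outcomes = list(product(coin, die))
print(len(joint_outcomes))     # 12 = 2 × 6

# 50 coin tosses: far too many outcomes to list, but easy to count.
n_tosses = 50
print(len(coin) ** n_tosses)   # 1125899906842624
```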
Permutations and combinations
Interested in the number of ways one can select a subset of size r from a group of n distinct/distinguishable objects {c1, …, cn}.
The number of outcomes (i.e. the sample space cardinality) depends on
● Whether we are allowed to get duplicated objects (i.e. sampling with replacement vs without)
● Whether the sequence of the selected r objects matters
Permutations
When ordering matters:
Definition
A permutation is an ordered arrangement of objects. Selecting a sample of size r (= 0, 1, …, n) from a set of n objects, there are
● n^r permutations under sampling with replacement
● n!/(n − r)! = n(n − 1) · · · (n − r + 1) permutations under sampling without replacement
Recall:
● n! is called n factorial
● When n is a positive integer, n! = n(n − 1)(n − 2) · · · 1. We have the convention 0! = 1.
● The total number of permutations of n distinct objects is n!/(n − n)! = n!
Examples
Referring to the earlier picture of M&M’s, how many different ordered arrangements of 4 M&M’s selected from 6 M&M’s of different colors are there?
Solution: Here, n = 6 distinct M&M’s colors; r = 4 selected M&M’s colors (subset size). Note that the order of the 4 colors matters and the M&M’s cannot be duplicated (i.e., sampling without replacement).
The number of permutations is 6!/(6 − 4)! = 6 × 5 × 4 × 3 = 360.
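The count can be cross-checked by enumerating the ordered arrangements. A sketch only: the colour names are hypothetical placeholders, and the with-replacement count n^r is included for comparison.

```python
from itertools import permutations, product
from math import perm

# Hypothetical colour names standing in for the 6 distinct M&M's.
colours = ["red", "orange", "yellow", "green", "blue", "brown"]
r = 4

# Ordered selections without replacement: n!/(n − r)!
without_replacement = sum(1 for _ in permutations(colours, r))

# Ordered selections with replacement: n^r
with_replacement = sum(1 for _ in product(colours, repeat=r))

print(without_replacement, perm(len(colours), r))  # 360 360
print(with_replacement, len(colours) ** r)         # 1296 1296
```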
Combinations
Sometimes we are no longer interested in how the objects are arranged, but only in the constituents of the subset. For instance, we may not care about the ordering of the M&M’s colors.
Definition
A combination is an unordered arrangement/collection of objects. For a set of n distinct objects and a subset of size r, there are
C(n, r) = n!/(r!(n − r)!)
combinations under sampling without replacement. C(n, r) is read “n choose r”.
Examples
Question:
In the same M&M example as earlier, how many different combinations of 4 M&M’s selected from 6 M&M’s of different colors are there?
Solution:
Here, n = 6 distinct M&M’s (group size), r = 4 selected M&M’s (subset size). The order of the 4 colors does not matter and we cannot duplicate the M&M’s (i.e., sampling without replacement).
The number of combinations is 6!/(4! 2!) = 15.
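A quick check of this count, again with hypothetical colour names: enumerate the unordered subsets, and compare with the closed form and with the permutation count divided by r! (each subset of size r can be ordered in r! ways).

```python
from itertools import combinations
from math import comb, factorial, perm

# Hypothetical colour names standing in for the 6 distinct M&M's.
colours = ["red", "orange", "yellow", "green", "blue", "brown"]
n, r = len(colours), 4

subsets = list(combinations(colours, r))     # unordered, without replacement
from_permutations = perm(n, r) // factorial(r)

print(len(subsets), comb(n, r), from_permutations)  # 15 15 15
```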
Binomial coefficients
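For any numbers a and b and any nonnegative integer n, the binomial theorem states
(a + b)^n = Σ_{r=0}^{n} C(n, r) a^r b^(n−r), where C(n, r) = n!/(r!(n − r)!)
When a = 1, b = 1,
Σ_{r=0}^{n} C(n, r) = 2^n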
(a+b)^n = (a+b)(a+b)...(a+b)
● Each of the n brackets gives either a or b in the expansion ⇔ n objects are split into 2 classes: selected (a), unselected (b)
● a^r b^(n−r) ⇔ exactly r objects are selected
● Number of terms a^r b^(n−r) ⇔ number of ways to have exactly r objects selected ⇔ n!/(r!(n − r)!)
For example, when n = 6 and r = 4, the coefficient for a^4 b^2 is 6!/(4! 2!) = 15.
One possibility is a·a·a·a·b·b, i.e., a is taken from the first 4 brackets and b from the last 2.
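The bracket-picking argument can be mimicked directly: choose a or b from each of the n brackets and count the choices that contain exactly r copies of a. This is an illustrative sketch, not a symbolic expansion.

```python
from itertools import product
from math import comb

n, r = 6, 4

# One string per way of picking a or b from each of the n brackets.
terms = ["".join(choice) for choice in product("ab", repeat=n)]

# Terms equal to a^r b^(n-r): exactly r of the picks are a.
matching = [term for term in terms if term.count("a") == r]

print(len(terms), len(matching), comb(n, r))   # 64 15 15
print(sum(comb(n, k) for k in range(n + 1)))   # 64, i.e. 2^n when a = b = 1
```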
Multinomial coefficients
What about a higher number of classes, k ≥ 3? Consider classes 1, 2, · · · , k with n1, n2, · · · , nk objects respectively.
Multinomial Coefficients & The Multinomial Theorem
The number of ways to assign n distinct objects into k distinct classes with ni objects in the i-th class, i = 1, 2, · · · , k, where n1 + n2 + · · · + nk = n, is
n! / (n1! n2! · · · nk!)
Moreover,
(x1 + x2 + · · · + xk)^n = Σ [n! / (n1! n2! · · · nk!)] x1^n1 x2^n2 · · · xk^nk
where the sum is over all nonnegative integers n1, n2, …, nk such that n1 + n2 + ... + nk = n
Remark: The assignment/sampling is without replacement as each object is classified to exactly 1 class
Examples
Question: How many ways are there to give 2 M&M’s each to 2 kids with 6 M&M colours?
Solution: Bucketing 6 M&M’s into 3 groups of 2 each (kid 1, kid 2, not given). Ordering of colours does not matter. Hence, the number of ways is 6!/(2! 2! 2!) = 90.
Question: What is the coefficient of x^2 y^2 z^3 in the expansion of (w+x+y+z)^7?
Solution: Multinomial expansion with k = 4 classes and nw = 0, nx = ny = 2 and nz = 3. Hence, the coefficient is 7!/(0! 2! 2! 3!) = 210.
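Both answers are multinomial coefficients n!/(n1! n2! · · · nk!). A minimal sketch with a helper named `multinomial` (a name introduced here, not a standard-library function):

```python
from math import factorial


def multinomial(*class_sizes):
    """Number of ways to split sum(class_sizes) distinct objects into
    classes of the given sizes: n! / (n1! n2! ... nk!)."""
    result = factorial(sum(class_sizes))
    for size in class_sizes:
        result //= factorial(size)   # each division is exact
    return result


print(multinomial(2, 2, 2))      # 90: kid 1, kid 2, not given
print(multinomial(0, 2, 2, 3))   # 210: coefficient of x^2 y^2 z^3 in (w+x+y+z)^7
```

With two classes the helper reduces to the binomial coefficient, e.g. `multinomial(4, 2)` gives 15.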
Conditional probability
Definition
Let A & B be two events with P(B) > 0. The conditional probability of A given B is defined to be
P(A|B) = P(A ∩ B) / P(B)
Examples
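Suppose individuals who bought a digital camera on Amazon were recommended a memory card and spare batteries during checkout. 60% bought a memory card, 40% bought a spare battery, and 30% bought both.
For a random buyer, let A be the event that a memory card is purchased and B be the event that a spare battery is bought. Then P(A) = 0.6, P(B) = 0.4 and P(A ∩ B) = 0.3, so
● P(A|B) = P(A ∩ B) / P(B) = 0.3 / 0.4 = 0.75
● P(B|A) = P(A ∩ B) / P(A) = 0.3 / 0.6 = 0.5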
Quiz
A bin contains 25 light bulbs, of which 5 are in good condition and function at least 30 days. 10 are
partially defective and will fail in the second day of use, while the rest are totally defective and will not
light up at all. Given that a light bulb lights up, what is the probability that it will still continue to light up
after 1 week?
Let G be the event that the light bulb is in good condition (will work ≥ 30 days), and T be the event that the randomly chosen bulb is totally defective. So T^c is the event that the light bulb is in good condition or partially defective, i.e., that it lights up. We want to find
P(G|T^c) = P(G ∩ T^c) / P(T^c) = P(G) / P(T^c) = (5/25) / (15/25) = 1/3
Quiz
Given that a fair coin is flipped twice, what is the probability of obtaining 2 heads given that the first flip
is a head?
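Solution: Let A = {HH} be the event of 2 heads and B = {HH, HT} the event that the first flip is a head. Then P(A|B) = P(A ∩ B) / P(B) = (1/4) / (1/2) = ½.
Or: since B has already occurred, the sample space reduces to B, and A is 1 of its 2 equally likely outcomes, hence ½.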
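Both routes to ½ can be reproduced by enumeration. A sketch assuming a fair coin, with A = {HH} and B = {first flip is a head}; `prob` and `cond_prob` are helper names introduced here.

```python
from fractions import Fraction
from itertools import product

# Equally likely outcomes of 2 fair coin flips.
omega = {"".join(flips) for flips in product("HT", repeat=2)}


def prob(event):
    """P(event) by counting equally likely outcomes."""
    return Fraction(len(event), len(omega))


def cond_prob(a, b):
    """P(a | b) = P(a ∩ b) / P(b), assuming P(b) > 0."""
    return prob(a & b) / prob(b)


A = {"HH"}                                   # 2 heads
B = {o for o in omega if o[0] == "H"}        # first flip is a head

print(cond_prob(A, B))                       # 1/2
print(Fraction(len(A & B), len(B)))          # 1/2, treating B as the reduced sample space
```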
Multiplication law
Let A and B be two events with P(B) > 0. Then,
P(A ∩ B) = P(B)P(A|B)
● When P(B) and P(A|B) are available or can be easily computed, P(A ∩ B) can be obtained as a product
● An alternative formula: When P(A) > 0 and P(B|A) is available, P(A ∩ B) = P(A)P(B|A)
● In practice, the probability of any complex event representable as an intersection of 2 events can be computed in 2 ways
Examples
Four individuals responded to a request to donate blood. Only type O+ is required. Suppose we do not know their blood types, only that exactly one of them has the correct type. What is the probability that we have to test the blood type of at least 3 of the individuals before we find O+?
Solution: Let B = {1st type test is not O+}, A = {2nd type test is not O+}. At least 3 tests are needed exactly when the first 2 tests are both not O+, so
P(A ∩ B) = P(B)P(A|B) = 3/4 × 2/3 = 1/2
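The multiplication-law answer can be compared with a direct count over all testing orders. This sketch assumes every order of the 4 individuals is equally likely; the donor labels are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labels: one O+ donor and three donors of other types.
donors = ["O+", "other1", "other2", "other3"]

# Count the testing orders in which neither of the first 2 tests is O+.
orders = list(permutations(donors))
needs_at_least_3 = [order for order in orders if "O+" not in order[:2]]
p_by_counting = Fraction(len(needs_at_least_3), len(orders))

# Multiplication law: P(A ∩ B) = P(B) P(A|B)
p_b = Fraction(3, 4)            # 1st test is not O+
p_a_given_b = Fraction(2, 3)    # 2nd test is not O+, given the 1st was not
p_by_multiplication = p_b * p_a_given_b

print(p_by_counting, p_by_multiplication)  # 1/2 1/2
```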
Law of total probability
Definition
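A collection of events B1, B2, · · · , Bn is called a partition of size n if
● Bi ∩ Bj = ∅ for all i ≠ j (the events are mutually exclusive)
● B1 ∪ B2 ∪ · · · ∪ Bn = Ω (the events are exhaustive)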
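Idea: split A into the disjoint pieces A ∩ B1, A ∩ B2, · · · , A ∩ Bn. If B1, B2, · · · , Bn is a partition with P(Bi) > 0 for all i, then for any event A,
P(A) = P(A ∩ B1) + · · · + P(A ∩ Bn) = P(B1)P(A|B1) + · · · + P(Bn)P(A|Bn)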
Examples
In a factory, 40% of goods come from line 1 and 60% from line 2. Line 1 has a defect rate of 8% and line 2
10%. If an item from the factory is chosen at random, what is the probability that it will not be defective?
P(not defective) = P(line 1)P(not defective | line 1) + P(line 2)P(not defective | line 2) = 0.4 × 0.92 + 0.6 × 0.90 = 0.908
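The same weighted sum can be written as a small helper. A sketch with exact fractions; `total_probability` is a name introduced here for illustration.

```python
from fractions import Fraction


def total_probability(priors, conditionals):
    """Law of total probability: sum of P(Bi) * P(A | Bi) over a partition."""
    return sum(p * c for p, c in zip(priors, conditionals))


p_line = [Fraction(40, 100), Fraction(60, 100)]       # P(line 1), P(line 2)
p_defect_given_line = [Fraction(8, 100), Fraction(10, 100)]
p_ok_given_line = [1 - d for d in p_defect_given_line]

p_ok = total_probability(p_line, p_ok_given_line)
print(p_ok, float(p_ok))  # 227/250 0.908
```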
Quiz
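A video store chain sells 3 brands of DVD players. Of its sales, 50% are brand 1 and 30% are brand 2. Each brand comes with a 1-year warranty, and it is known that 25% of brand 1 players require warranty work, while the figures for brands 2 and 3 are 20% and 10% respectively. What is the probability that a randomly chosen purchaser will require repair under warranty?
Solution: Let Bi = {brand i is purchased} and W = {warranty repair is required}. Since the 3 brands make up all sales, P(B3) = 1 − 0.5 − 0.3 = 0.2. By the law of total probability,
P(W) = 0.5 × 0.25 + 0.3 × 0.20 + 0.2 × 0.10 = 0.205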
Examples
● A tree diagram is a handy tool for computing probabilities in experiments composed of several stages / generations
● Components / ingredients in a tree diagram:
○ Nodes and branches: one generation per stage of the experiment, with one branch per possible outcome
○ Probabilities attached to each branch
Bayes rule
Let B1, B2, · · · , Bn be a partition with P(Bi) > 0 for all i. Then, for any event A with P(A) > 0,
P(Bj|A) = P(A|Bj)P(Bj) / [P(A|B1)P(B1) + · · · + P(A|Bn)P(Bn)], j = 1, 2, · · · , n
Examples
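Refer to the DVD example again: suppose a customer returns to the store asking for warranty work. What is the probability that the player is brand 1, brand 2, or brand 3?
Solution: With Bi = {brand i} and W = {warranty work required}, P(W) = 0.5 × 0.25 + 0.3 × 0.20 + 0.2 × 0.10 = 0.205, so
● P(B1|W) = 0.125 / 0.205 ≈ 0.61
● P(B2|W) = 0.060 / 0.205 ≈ 0.29
● P(B3|W) = 0.020 / 0.205 ≈ 0.10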
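Bayes rule for the DVD example can be sketched as follows. The brand 3 share of 20% is inferred as the remainder of sales, and `bayes_posterior` is a helper name introduced here.

```python
from fractions import Fraction


def bayes_posterior(priors, likelihoods):
    """Return P(Bj | A) for each j, given P(Bj) and P(A | Bj) over a partition."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(A ∩ Bj)
    total = sum(joint)                                     # P(A), by total probability
    return [j / total for j in joint]


# Brand shares (brand 3 inferred as 1 − 0.5 − 0.3) and warranty-work rates.
priors = [Fraction(50, 100), Fraction(30, 100), Fraction(20, 100)]
likelihoods = [Fraction(25, 100), Fraction(20, 100), Fraction(10, 100)]

posterior = bayes_posterior(priors, likelihoods)
print(posterior)  # [Fraction(25, 41), Fraction(12, 41), Fraction(4, 41)]
```

The three posteriors sum to 1, as they must for a partition.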
Quiz
People are trying to get home at peak hour. As seasoned consumers, everyone opens their preferred ride-hailing app to book a ride. App A is used 60% of the time as it is more popular, being cheaper and more heavily advertised. App B is used 30% of the time, and app C the remaining 10%. Suppose you know that app A fails to find a car/cab 20% of the time, app B 10% of the time and app C 30% of the time. You observe that your friend failed to get a ride. What is the probability that your friend used app A?
Solution: Let F = {no ride found}. By the law of total probability, P(F) = 0.6 × 0.2 + 0.3 × 0.1 + 0.1 × 0.3 = 0.18. By Bayes rule,
P(app A | F) = (0.6 × 0.2) / 0.18 = 0.12 / 0.18 = 2/3
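A Monte Carlo simulation offers an independent, approximate check of this kind of Bayes calculation: simulate many bookings, keep only the failed ones, and look at the share that came from app A. This is a rough sketch; the function name, trial count and seed are arbitrary choices made here.

```python
import random


def simulate_share_of_app_a(n_trials, seed=0):
    """Estimate P(app A | no ride found) by simulation."""
    rng = random.Random(seed)
    apps = ["A", "B", "C"]
    usage = [0.6, 0.3, 0.1]                      # how often each app is used
    fail_rate = {"A": 0.2, "B": 0.1, "C": 0.3}   # P(no ride | app)

    failures = 0
    failures_with_a = 0
    for _ in range(n_trials):
        app = rng.choices(apps, weights=usage)[0]
        if rng.random() < fail_rate[app]:        # this booking failed
            failures += 1
            failures_with_a += app == "A"
    return failures_with_a / failures


estimate = simulate_share_of_app_a(200_000)
print(round(estimate, 3))   # should be close to 0.12 / 0.18
```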
Independence
● Intuitively, P(A|B) would be different from P(A) unless knowing B does not tell us anything about A (i.e., “occurrence of B has nothing to do with occurrence of A”).
Definition
Events A and B are said to be independent if
● P(A ∩ B) = P(A)P(B), or equivalently (when P(B) > 0)
● P(A) = P(A|B), or equivalently (when P(A) > 0)
● P(B) = P(B|A).
Otherwise they are said to be dependent.
● In general, the multiplication law P(A ∩ B) = P(A)P(B|A) = P(B)P(A|B) is true for any intersection (whenever the conditional probabilities are defined); the first formula above, P(A ∩ B) = P(A)P(B), is a special case following from independence of A & B.
● Independence & disjointness are 2 different concepts
○ disjointness can be concluded from a Venn diagram (no probabilities involved)
○ independence is defined in terms of probabilities
○ disjointness means that P(A ∩ B) = 0, hence P(A|B) = 0 ⇒ dependence as long as P(A) ≠ 0 and P(B) ≠ 0.
Independence of 2 events
If A & B are independent, then so are A & B^c, A^c & B, and A^c & B^c.
Definition
Three events A, B, and C are said to be mutually independent if all 4 of the following conditions hold:
● P(A ∩ B ∩ C) = P(A)P(B)P(C)
● P(A ∩ B) = P(A)P(B)
● P(A ∩ C) = P(A)P(C)
● P(B ∩ C) = P(B)P(C)
Examples
Cards: Let A = {is an ace}, D = {is a diamond} for a card drawn from a 52-card deck. P(A) = 4/52 = 1/13, P(D) = 13/52 = 1/4
P(A ∩ D) = 1/52 = 1/13 × 1/4 = P(A)P(D)
⇒ A and D are independent
Coins: Let A = {1st coin H}, B = {2nd coin H}, C = {only 1 H}
P(B|A) = P(B) since the first coin toss does not influence the second coin toss, i.e. A and B are independent. Suppose the coins are fair (i.e. P(A) = P(B) = 0.5). Then P(C) = 0.5, P(A ∩ C) = P({HT}) = 0.25 = P(A)P(C) and P(B ∩ C) = P({TH}) = 0.25 = P(B)P(C), so the events are pairwise independent. However, P(A ∩ B ∩ C) = 0 ≠ 0.125 = P(A)P(B)P(C), so A, B and C are not mutually independent.
Dice (sum 6): Let A6 = {sum of 2 dice gives 6} and B = {first die gives 4}
A6 = {1:5, 2:4, 3:3, 4:2, 5:1}, B = {4:1, 4:2, 4:3, 4:4, 4:5, 4:6}, and A6 ∩ B = {4:2}
1/36 = P(A6 ∩ B) ≠ P(A6)P(B) = 5/36 × 1/6
⇒ A6 and B are dependent
Dice (sum 7): Let A7 denote the event that the sum of 2 dice is 7
P(A7) = P({1:6, 2:5, 3:4, 4:3, 5:2, 6:1}) = 6/36 = ⅙, and P(A7 ∩ B) = P({4:3}) = 1/36
1/36 = P(A7 ∩ B) = P(A7)P(B) = ⅙ × ⅙
⇒ A7 and B are independent
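The two dice examples can be checked by testing the product rule P(A ∩ B) = P(A)P(B) over all 36 equally likely outcomes. A sketch assuming two fair dice; the helper names are introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
omega = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(event), where event is a predicate on an outcome."""
    return Fraction(sum(1 for o in omega if event(o)), len(omega))


def independent(e1, e2):
    """Check the product rule P(e1 ∩ e2) = P(e1) P(e2)."""
    return prob(lambda o: e1(o) and e2(o)) == prob(e1) * prob(e2)


def first_is_4(o):
    return o[0] == 4


def sum_is_6(o):
    return sum(o) == 6


def sum_is_7(o):
    return sum(o) == 7


print(independent(sum_is_6, first_is_4))  # False: A6 and B are dependent
print(independent(sum_is_7, first_is_4))  # True: A7 and B are independent
```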