
Match Chapter 1 - Slides

This document introduces basic concepts of probability including experiments, sample spaces, events, and event operations such as union, intersection, and complement. It also discusses counting methods like the multiplication and addition principles and how they can be applied to permutation and combination.

Uploaded by

Pawandeep Singh

ST2334

Probability and Statistics


Academic Year 2023/2024
Semester I
One
Basic Concepts of
Probability
1 PROBABILITY CONCEPTS AND DEFINITIONS

In this section we introduce the basic terminology of probability theory: experiment, outcomes, sample space, events.
DEFINITION 1 (EXPERIMENT, SAMPLE SPACE, EVENT)
A statistical experiment is any procedure that produces data or observations.
The sample space, denoted by S, is the set of all possible outcomes of a
statistical experiment. The sample space depends on the problem of
interest!
A sample point is an outcome (element) in the sample space.
An event is a subset of the sample space.
EXAMPLE 1.1
Consider the experiment of rolling a die.

(i) If the problem of interest is “the number that shows on the top
face", then

• Sample space: S = {1, 2, 3, 4, 5, 6}.


• Sample point: 1 or 2 or 3 or 4 or 5 or 6.
• Some possible events are:
– an event where an odd number occurs = {1, 3, 5};
– an event where a number greater than 4 occurs = {5, 6}.
(ii) If the problem of interest is “whether the number is even or odd",
then

• Sample space: S = {even, odd}.


• Sample point: “even" or “odd".
• A possible event is:
– an event where an odd number occurs = {odd}.
REMARK
The sample space is itself an event and is called a sure event.
An event that contains no element is the empty set, denoted by ∅, and is called a null event.
L-EXAMPLE 1.1
Consider an experiment of rolling two dice. Suppose that the problem
of interest is “the numbers that show on the top faces".
(i) If the dice are labelled, S contains 36 elements.

• Sample space:

S = {(1, 1), (1, 2), . . . , (1, 5), (1, 6), (2, 1), (2, 2), . . . , (6, 5), (6, 6)}.

• Sample point: (1, 1) or (1, 2) or · · · or (6, 6).


• A possible event is:
– event A = {the sum of the dice equals 7}.

A = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}.
(ii) If the dice are not labelled, S contains 21 elements:

• Sample space:

S = {{1, 1}, {1, 2}, . . . , {1, 5}, {1, 6}, {2, 2}, {2, 3}, . . . , {5, 6}, {6, 6}}.

• Sample point: {1, 1} or {1, 2} or · · · or {6, 6}.


• A possible event is:
– event A = {the sum of the dice equals 7}.

A = {{1, 6}, {2, 5}, {3, 4}}.


WHAT’S THE DIFFERENCE BETWEEN THE TWO CASES?
(i) The dice are labelled:
This means that we can tell die 1 apart from die 2. Therefore
• (3, 4) means that die 1 shows 3, die 2 shows 4; and
• (4, 3) means that die 1 shows 4, die 2 shows 3.
(ii) The dice are not labelled:
This means that we cannot tell die 1 apart from die 2. Therefore
• {3, 4} means that between die 1 and die 2, we have the numbers 3 and 4; and
• {4, 3} will mean the same.
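
The two sample spaces can be enumerated directly. The following is a minimal sketch, not part of the original slides; the variable names are illustrative. It represents labelled outcomes as ordered pairs and unlabelled outcomes as sorted pairs, so that {3, 4} and {4, 3} collapse to one sample point.

```python
from itertools import combinations_with_replacement, product

faces = range(1, 7)

# Labelled dice: ordered pairs (die 1, die 2).
labelled = set(product(faces, repeat=2))

# Unlabelled dice: unordered pairs, stored as sorted tuples (smaller, larger).
unlabelled = set(combinations_with_replacement(faces, 2))

# Event A = {the sum of the dice equals 7} in each sample space.
sum7_labelled = {pt for pt in labelled if sum(pt) == 7}
sum7_unlabelled = {pt for pt in unlabelled if sum(pt) == 7}

print(len(labelled), len(unlabelled))
print(sorted(sum7_labelled))
print(sorted(sum7_unlabelled))
```

The counts agree with the 36 and 21 elements stated in L-Example 1.1, and the event A has 6 points in the labelled case but only 3 in the unlabelled case.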
L-EXAMPLE 1.2
Consider a two-step experiment:

Step 1. Flip a coin and observe whether the head (H) or the tail (T) faces up.

Step 2. If H is obtained in Step 1, flip the coin again; otherwise, roll a die once.
• Sample space:

S = {(H, H), (H, T ), (T, 1), (T, 2), (T, 3), (T, 4), (T, 5), (T, 6)}.

• Sample point: (H, H) or (H, T ) or · · · or (T, 6).

• A possible event is:

– event A = {no die is thrown}.

A = {(H, H), (H, T )}.


L-EXAMPLE 1.3
Consider an experiment of drawing two balls, one at a time, from a jar
with a blue, a white, and a red ball.
The problem of interest is the colours of the two drawn balls.
• Sample space:

S = {(B,W ), (B, R), (W, B), (W, R), (R, B), (R,W )}.

• Sample point: (B,W ) or (B, R) or · · · or (R,W )

• A possible event is:

– event A = {a white ball is chosen}.

A = {(W, B), (W, R), (B,W ), (R,W )}.


2 EVENT OPERATIONS & RELATIONSHIPS

Let A and B be two events in the sample space S. We shall go through some event operations and relationships involving A and B.

• Event operations:
(i) Union; (ii) Intersection; (iii) Complement.

• Event relationships:
(i) Contained; (ii) Equivalent; (iii) Mutually exclusive.
Union
The union of events A and B, denoted by A ∪ B, is the event containing
all elements that belong to A or B or both. That is

A ∪ B = {x : x ∈ A or x ∈ B}.

[Figure: Venn diagram of events A and B]
Intersection
The intersection of events A and B, denoted by A ∩ B or simply AB, is
the event containing elements that belong to both A and B. That is

A ∩ B = {x : x ∈ A and x ∈ B}.

[Figure: Venn diagram of events A and B, with the overlap labelled A ∩ B]
We can also consider the union and intersection of n events $A_1, A_2, \dots, A_n$.
• Union:

$$\bigcup_{i=1}^{n} A_i = A_1 \cup A_2 \cup \dots \cup A_n = \{x : x \in A_1 \text{ or } x \in A_2 \text{ or } \dots \text{ or } x \in A_n\},$$

which consists of the elements that belong to one or more of $A_1, \dots, A_n$.

• Intersection:

$$\bigcap_{i=1}^{n} A_i = A_1 \cap A_2 \cap \dots \cap A_n = \{x : x \in A_1 \text{ and } x \in A_2 \text{ and } \dots \text{ and } x \in A_n\},$$

which consists of the elements that belong to every one of $A_1, \dots, A_n$.


Complement
The complement of the event A with respect to S, denoted by A′, is the
event with elements in S, which are not in A. That is

A′ = {x : x ∈ S but x ∉ A}.

[Figure: Venn diagram of A and its complement A′]
Mutually Exclusive
Events A and B are said to be mutually exclusive or disjoint if A ∩ B = ∅. That is, A and B have no element in common.

[Figure: Venn diagram of two non-overlapping events A and B]
Contained and Equivalent
If all elements in A are also elements in B, then we say A is contained
in B, denoted by A ⊂ B, or equivalently B ⊃ A.

[Figure: Venn diagram of A contained in B]

If A ⊂ B and B ⊂ A, then A = B. That is, the sets A and B are equivalent.


EXAMPLE 1.2
Consider the sample space and events:

S = {1, 2, 3, 4, 5, 6}, A = {1, 2, 3}, B = {1, 3, 5}, C = {2, 4, 6}.

Then

(i) A ∪ B = {1, 2, 3, 5}; A ∪ C = {1, 2, 3, 4, 6}; B ∪ C = S.

(ii) A ∩ B = {1, 3}; A ∩ C = {2}; B ∩ C = ∅.

(iii) A ∪ B ∪ C = S; A ∩ B ∩ C = ∅.

(iv) A′ = {4, 5, 6}; B′ = {2, 4, 6} = C.

Note that B and C are mutually exclusive, since B ∩ C = ∅. On the other hand, A and B are not mutually exclusive, as A ∩ B = {1, 3} ≠ ∅.
MORE EVENT OPERATIONS
(a) A ∩ A′ = ∅ (b) A ∩ ∅ = ∅

(c) A ∪ A′ = S (d) (A′)′ = A

(e) A ∪ (B ∩C) = (A ∪ B) ∩ (A ∪C)

(f) A ∩ (B ∪C) = (A ∩ B) ∪ (A ∩C)

(g) A ∪ B = A ∪ (B ∩ A′)

(h) A = (A ∩ B) ∪ (A ∩ B′)
DE MORGAN’S LAW
For any n events A1, A2, . . . , An,

(i) (A1 ∪ A2 ∪ . . . ∪ An)′ = A1′ ∩ A2′ ∩ . . . ∩ An′.

A special case: (A ∪ B)′ = A′ ∩ B′.

(j) (A1 ∩ A2 ∩ . . . ∩ An)′ = A1′ ∪ A2′ ∪ . . . ∪ An′.

A special case: (A ∩ B)′ = A′ ∪ B′.


EXAMPLE 1.3
We return to Example 1.2 where

S = {1, 2, 3, 4, 5, 6}, A = {1, 2, 3}, B = {1, 3, 5}, C = {2, 4, 6}.

We have
A′ = {4, 5, 6}, B′ = {2, 4, 6}, C′ = {1, 3, 5}.

We check that

(A ∪ B)′ = {1, 2, 3, 5}′ = {4, 6}; A′ ∩ B′ = {4, 5, 6} ∩ {2, 4, 6} = {4, 6}.

This agrees with (A ∪ B)′ = A′ ∩ B′.


Also,

(A ∩ B)′ = {1, 3}′ = {2, 4, 5, 6}; A′ ∪ B′ = {4, 5, 6} ∪ {2, 4, 6} = {2, 4, 5, 6}.

This agrees with (A ∩ B)′ = A′ ∪ B′.


Similarly, we can check that

(A ∪ B ∪ C)′ = ∅ = A′ ∩ B′ ∩ C′ and (A ∩ B ∩ C)′ = S = A′ ∪ B′ ∪ C′.
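
The checks in this example can be reproduced with Python's built-in `set` type. This is a small sketch rather than part of the original slides; the helper `complement` is a name introduced here for illustration.

```python
S = {1, 2, 3, 4, 5, 6}
A, B, C = {1, 2, 3}, {1, 3, 5}, {2, 4, 6}

def complement(event, sample_space=S):
    """Elements of the sample space that are not in the event."""
    return sample_space - event

# De Morgan's law for two events: union (|) and intersection (&).
lhs_union, rhs_union = complement(A | B), complement(A) & complement(B)
lhs_inter, rhs_inter = complement(A & B), complement(A) | complement(B)

# The same law for three events.
three_union = complement(A | B | C)
three_inter = complement(A & B & C)

print(lhs_union, rhs_union)
print(lhs_inter, rhs_inter)
print(three_union, three_inter)
```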


L-EXAMPLE 1.4
We revisit L-Example 1.2. Consider a two-step experiment:

Step 1. Flip a coin and observe whether the head (H) or the tail (T) faces up.

Step 2. If H is obtained in Step 1, flip the coin again; otherwise, roll a die once.
Then the sample space is
S = {(H, H), (H, T ), (T, 1), (T, 2), (T, 3), (T, 4), (T, 5), (T, 6)}.
Consider the events
A = {die is rolled, number is no more than 3} = {(T, 1), (T, 2), (T, 3)};
B = {die is rolled, number is even} = {(T, 2), (T, 4), (T, 6)};
C = {die is not rolled} = {(H, H), (H, T )}.
Then their complements are
A′ = {(H, H), (H, T ), (T, 4), (T, 5), (T, 6)};
B′ = {(H, H), (H, T ), (T, 1), (T, 3), (T, 5)};
C′ = {(T, 1), (T, 2), (T, 3), (T, 4), (T, 5), (T, 6)}.
Here are some possible event operations:

A ∪ B = {(T, 1), (T, 2), (T, 3), (T, 4), (T, 6)}
B ∪C = {(T, 2), (T, 4), (T, 6), (H, H), (H, T )}
A ∩ B = {(T, 2)}
B ∩ C = ∅
A ∪ B ∪ C = {(H, H), (H, T), (T, 1), (T, 2), (T, 3), (T, 4), (T, 6)}
A ∩ B ∩ C = ∅
(A ∪ B) ∩ C = ∅
A′ ∩ B′ = {(H, H), (H, T ), (T, 5)} = (A ∪ B)′
A′ ∪ B′ = {(H, H), (H, T ), (T, 1), (T, 3), (T, 4), (T, 5), (T, 6)} = (A ∩ B)′
3 COUNTING METHODS

In many instances, we need to count the number of ways that some operations can be carried out or that certain situations can happen.
There are two fundamental principles in counting:

Multiplication principle

Addition principle

They can be applied to derive some important counting methods: permutation and combination.
MULTIPLICATION PRINCIPLE
Suppose that r different experiments are to be performed sequentially. Suppose
experiment 1 results in $n_1$ possible outcomes;
for each outcome above, experiment 2 results in $n_2$ possible outcomes;
...
for each outcome above, experiment r results in $n_r$ possible outcomes.
Then there are $n_1 n_2 \cdots n_r$ possible outcomes for the r experiments.
EXAMPLE 1.4
How many possible outcomes are there when a die and a coin are thrown together?
Solution:
Note that for

• experiment 1: throwing a die, there are 6 possible outcomes: {1, 2, 3, 4, 5, 6}.

• experiment 2: throwing a coin, with each outcome of experiment 1, there are 2 possible outcomes: {H, T}.

So altogether there are 6 × 2 = 12 possible outcomes.


In fact, the sample space is given by

S = {(x, y) : x = 1, . . . , 6; y = H or T }.
L-EXAMPLE 1.5
A small community consists of 10 men, each of whom has 3 sons. If
one man and one of his sons are to be chosen as “father and son" of
the year, how many different choices are possible?
Solution:
We can think of the problem as

• experiment 1: choose the father; there are 10 possible choices.

• experiment 2: choose the son; for each father, there are 3 sons to choose from.

Altogether, there are 10 × 3 = 30 possible choices.


L-EXAMPLE 1.6
How many even three-digit numbers can be formed from the digits 1, 2, 5, 6 and 9 if each digit can be used at most once?
Solution:
We can think of the problem as

• experiment 1: choose the number for the ones place; digits 2 and 6 can be used: 2 possibilities.

• experiment 2: choose the number for the tens place from the digits left after experiment 1: 4 possibilities.

• experiment 3: choose the number for the hundreds place from the digits left after experiments 1 and 2: 3 possibilities.

Altogether, we have 2 × 4 × 3 = 24 possibilities.


L-EXAMPLE 1.7
How many even three-digit numbers can be formed from the digits 1, 2, 5, 6 and 9 if there is no restriction on how many times a digit can be used?
Solution:
Similar to the previous example:

• experiment 1: choose the number for the ones place; digits 2 and 6 can be used: 2 possibilities.

• experiment 2: choose the number for the tens place from all digits provided: 5 possibilities.

• experiment 3: choose the number for the hundreds place from all digits provided: 5 possibilities.

Altogether, we have 2 × 5 × 5 = 50 possibilities.
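
Both counts (L-Examples 1.6 and 1.7) are small enough to confirm by brute force. The sketch below is an illustration added here, not the slides' method: it lists every ordered triple of digits and keeps those ending in an even digit, using `permutations` when digits cannot repeat and `product` when they can.

```python
from itertools import permutations, product

digits = [1, 2, 5, 6, 9]

def is_even(triple):
    """A number is even exactly when its ones digit (last entry) is even."""
    return triple[-1] % 2 == 0

# (hundreds, tens, ones) with each digit used at most once.
no_repeat = [t for t in permutations(digits, 3) if is_even(t)]

# (hundreds, tens, ones) with repetition allowed.
with_repeat = [t for t in product(digits, repeat=3) if is_even(t)]

print(len(no_repeat), len(with_repeat))
```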


L-EXAMPLE 1.8
In how many ways can 4 boys and 5 girls sit in a row if the boys and
girls must alternate?
Solution:
We must have the following arrangement:

G B G B G B G B G

The number of ways will be given as

(5)(4)(4)(3)(3)(2)(2)(1)(1) = 5!4! = 2880,

where n! = n(n − 1)(n − 2) · · · (2)(1).

Question: What happens if we have 5 boys and 5 girls?


ADDITION PRINCIPLE
Suppose that an experiment can be performed by k different procedures.
Procedure 1 can be carried out in $n_1$ ways;

Procedure 2 can be carried out in $n_2$ ways;

...

Procedure k can be carried out in $n_k$ ways.

Suppose that the “ways” under different procedures do not overlap. Then the total number of ways we can perform the experiment is

$$n_1 + n_2 + \dots + n_k.$$
EXAMPLE 1.5 (ORCHARD ROAD)
We can take the MRT or a bus from home to Orchard Road. Suppose there are three bus routes and two MRT routes. How many ways can we go from home to Orchard Road?
Solution:
Consider the trip from home to Orchard Road as an experiment. Two procedures can be used to complete the experiment:

Procedure 1: take the MRT: 2 ways.

Procedure 2: take a bus: 3 ways.

These ways do not overlap. So the total number of ways we can go from home to Orchard Road is 2 + 3 = 5.
L-EXAMPLE 1.9
How many even three-digit numbers can be formed from the digits
0, 1, 2, 5, 6 and 9 if each digit can be used at most once?
Solution:
Consider the task as two procedures based on the ones place:

Procedure 1. 0 is used for the ones place.


5 × 4 = 20 ways to select the hundreds and tens place.

Procedure 2. 0 is not used for the ones place.

(i) There are two ways (2 or 6) for the ones place;


(ii) 0 cannot be placed at the hundreds place, so we
have 4 digits available for the hundreds place;
(iii) We have 4 possible choices for the tens place.

In summary, we have 2 × 4 × 4 = 32 ways.


Using the Addition rule, we combine Procedures 1 and 2 to conclude
that there are 20 + 32 = 52 ways.
L-EXAMPLE 1.10
Consider the digits 0, 1, 2, 3, 4, 5 and 6. If each digit can be used at most
once, how many 3-digit numbers greater than 420 can be formed?
Solution:
Let us consider three procedures:

Procedure 1. Hundreds place is 4, tens place is 2: (1)(1)(4) = 4 ways.

Procedure 2. Hundreds place is 4, tens place is 3, 5, 6: (1)(3)(5) = 15 ways.

Procedure 3. Hundreds place is 5 or 6: (2)(6)(5) = 60 ways.

Altogether, there are 4 + 15 + 60 = 79 ways.
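
The case splits in L-Examples 1.9 and 1.10 can be cross-checked by enumeration. The following sketch is an addition for illustration; the helper name `three_digit_numbers` is introduced here. It builds every three-digit number with distinct digits, excluding a leading zero, and then filters.

```python
from itertools import permutations

def three_digit_numbers(digits):
    """All three-digit numbers with distinct digits drawn from `digits`."""
    return [
        100 * h + 10 * t + o
        for h, t, o in permutations(digits, 3)
        if h != 0  # a leading zero would not give a three-digit number
    ]

# L-Example 1.9: even numbers from the digits 0, 1, 2, 5, 6, 9.
even_count = sum(1 for n in three_digit_numbers([0, 1, 2, 5, 6, 9]) if n % 2 == 0)

# L-Example 1.10: numbers greater than 420 from the digits 0 to 6.
greater_count = sum(1 for n in three_digit_numbers([0, 1, 2, 3, 4, 5, 6]) if n > 420)

print(even_count, greater_count)
```

Enumeration does not replace the addition principle, but it is a quick way to catch an overlooked case such as the zero in the hundreds place.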


PERMUTATION
A permutation is a selection and arrangement of r objects out of n. In this case, order is taken into consideration.
The number of ways to choose and arrange r objects out of n, where r ≤ n, is denoted by $P^n_r$, where

$$P^n_r = \frac{n!}{(n-r)!} = n(n-1)(n-2)\cdots(n-(r-1)).$$

Filling the r positions one at a time: obj 1 has n ways; obj 2 has (n − 1) ways; obj 3 has (n − 2) ways; and so on, until obj r has (n − (r − 1)) ways.
REMARK
When r = n, $P^n_n = n!$.
Essentially, it is the number of ways to arrange n objects in order.
EXAMPLE 1.6
Find the number of possible four-letter code words in which all letters are different.
Solution:
Note that there are n = 26 letters in the alphabet, and r = 4 in our case.
So the number of possible four-letter code words is

$$P^{26}_4 = (26)(25)(24)(23) = 358800.$$


L-EXAMPLE 1.11
(i) How many ways can 6 persons line up to get on a bus?

(ii) If 3 persons insist on following one another, how many ways can these 6 persons line up?

(iii) If 2 persons refuse to follow each other, how many ways of lining
up are possible?
Solution:

(i) We have n = r = 6. So there are $P^6_6 = 6! = 720$ ways.

(ii) Let a, b, c, d, e, f be the names of the 6 persons. Without loss of generality, suppose a, b, c insist on following one another.

We group them into one group, denoted by G = {a, b, c}. G can now be viewed as a single person. We need to line up four persons, G, d, e, f, in a row. So there are $P^4_4 = 4! = 24$ ways.

On the other hand, for each permutation above, such as (G, d, f, e), we can arrange a, b, c within G differently. The number of ways of ordering them within G is $P^3_3 = 3! = 6$.

Therefore, applying the multiplication rule, the number of ways to line them up is 24 × 6 = 144.
(iii) Using the same principle as Part (ii), we first count the number
of ways of lining up if the two persons are following each other:

$$P^5_5 \times P^2_2 = 5! \times 2! = 240.$$

From Part (i), the total number of ways for lining up 6 persons is
720. Therefore, we have

720 − 240 = 480

ways of lining up 6 persons such that the two given persons are
not following each other.
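
With only 720 line-ups in total, the three answers can be verified by listing them all. This is a sketch added for illustration; the names a to f follow the solution above, and the helper `adjacent_block` is introduced here.

```python
from itertools import permutations

people = "abcdef"
lineups = list(permutations(people))

def adjacent_block(lineup, group):
    """True if the members of `group` occupy consecutive positions, in any order."""
    positions = sorted(lineup.index(p) for p in group)
    return positions[-1] - positions[0] == len(group) - 1

total = len(lineups)                                            # part (i)
together_abc = sum(adjacent_block(l, "abc") for l in lineups)   # part (ii)
apart_ab = sum(not adjacent_block(l, "ab") for l in lineups)    # part (iii)

print(total, together_abc, apart_ab)
```

Part (iii) is counted directly here, whereas the solution above counts the complement and subtracts from 720; both routes give the same number.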
COMBINATION
A combination is a selection of r objects out of n, without regard to the order.

The number of combinations of choosing r objects out of n, denoted by $C^n_r$ or $\binom{n}{r}$, is given by

$$\binom{n}{r} = \frac{n!}{r!(n-r)!}.$$

Note that this formula immediately implies $\binom{n}{r} = \binom{n}{n-r}$.
The derivation is as follows.

(A) Thinking in terms of permutation, the number of ways to choose and arrange r objects out of n is $P^n_r$.

(B) On the other hand, the same permutation task can be achieved by conducting the following two experiments sequentially:

(1) Select r objects out of n without regard to the order: $\binom{n}{r}$ ways.
(2) For each such combination, permute its r objects: $P^r_r$ ways.

(C) Therefore, by the multiplication rule, the number of ways to choose and arrange r objects out of n is $\binom{n}{r} \times P^r_r$.

(D) As a consequence, $\binom{n}{r} \times P^r_r = P^n_r$, and so we obtain

$$\binom{n}{r} = \frac{P^n_r}{P^r_r} = \frac{n!/(n-r)!}{r!} = \frac{n!}{r!(n-r)!}.$$
EXAMPLE 1.7
From 4 women and 3 men, find the number of committees of size 3 that can be formed with 2 women and 1 man.
Solution:
The number of ways to select 2 women from 4 is $\binom{4}{2} = 6$.
The number of ways to select 1 man from 3 is $\binom{3}{1} = 3$.
By the multiplication rule, the number of committees formed with 2 women and 1 man is

$$\binom{4}{2} \times \binom{3}{1} = 6 \times 3 = 18.$$
L-EXAMPLE 1.12
From a group of 4 men and 5 women, how many committees of size 3
are possible

(i) when there is no restriction?

(ii) when we need 2 men and 1 woman, and a certain man must be
on the committee?

(iii) when we need 2 men and 1 woman, and 2 of the men are feuding
and refuse to serve on the committee together?
Solution:

(i) The number of possible committees is

$$\binom{9}{3} = \frac{9!}{3!\,6!} = 84.$$

(ii) Since a certain man must be on the committee, we need to choose one man from the remaining 3 men and 1 woman from 5 women:

$$\binom{1}{1}\binom{3}{1}\binom{5}{1} = 15 \text{ ways}.$$

(iii) The number of committees where the two feuding men serve is

$$\binom{2}{2}\binom{5}{1} = 5.$$

As this includes all the “undesirable” cases, the number of “desirable” cases is

$$\binom{4}{2}\binom{5}{1} - 5 = 30 - 5 = 25.$$
L-EXAMPLE 1.13
Shortly after being put into service, some buses manufactured by a certain company developed cracks on the underside of the main frame.
Suppose a particular bus company has 20 of these buses, and the cracks appeared in 8 of them.

(i) How many ways are there to select a sample of 5 buses from the
20 for a thorough inspection?

(ii) In how many ways can we obtain a sample of 5 buses where exactly 4 buses have visible cracks?
Solution:

(i) The number of ways to select 5 buses out of 20 is $\binom{20}{5} = 15504$.

(ii) Now,

• 4 buses need to be selected from the 8 with visible cracks;

• 1 bus needs to be selected from the remaining 12.

Using the multiplication rule, the number of ways required is

$$\binom{8}{4} \times \binom{12}{1} = 70 \times 12 = 840.$$
4 PROBABILITY

Intuitively, the term probability is understood as the chance of, or how likely it is that, a certain event occurs.
More specifically, let A be an event in an experiment. We typically associate a number, called probability, to quantify how likely the event A occurs. This is denoted as P(A).
Let us now investigate how we can obtain such a number.
You will discover that the fundamental concept of probability is extended from an idea based on intuition to a rigorous, abstract, and advanced mathematical theory. It also has wide applications in various scientific disciplines.
INTERPRETATION OF PROBABILITY: RELATIVE FREQUENCY
Suppose that we repeat an experiment E a total of n times.
Let $n_A$ be the number of times that the event A occurs.
Then $f_A = n_A/n$ is called the relative frequency of the event A in the n repetitions of E.

Clearly, $f_A$ may not equal P(A) exactly. However, when n grows large, we expect $f_A$ to be close to P(A), in the sense that $f_A \approx P(A)$. Or mathematically,

$$f_A \to P(A) \quad \text{as } n \to \infty.$$

Thus $f_A$ “mimics” P(A), and has the following properties:

(a) $0 \le f_A \le 1$.

(b) $f_A = 1$ if A occurs in every repetition.

(c) If A and B are mutually exclusive events, $f_{A \cup B} = f_A + f_B$.


Extending this idea, we can define probability on a sample space
mathematically.
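
The relative-frequency interpretation can be illustrated with a small simulation. This sketch is an addition with assumed details: the experiment is taken to be one roll of a fair die, the event is A = {an odd number occurs}, and the seed is an arbitrary choice that only makes the run repeatable.

```python
import random

def relative_frequency(event, n, rng):
    """Fraction of n simulated fair-die rolls whose outcome lies in `event`."""
    hits = sum(rng.randint(1, 6) in event for _ in range(n))
    return hits / n

rng = random.Random(2334)  # arbitrary fixed seed for reproducibility
A = {1, 3, 5}              # event: an odd number occurs, so P(A) = 1/2

for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(A, n, rng))
```

For small n the printed values can be far from 1/2; for large n they settle close to it, which is the behaviour that the statement about $f_A$ and P(A) describes.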
AXIOMS OF PROBABILITY
Probability, denoted by P(·), is a function on the collection of events of the sample
space S, satisfying:

Axiom 1. For any event A,


0 ≤ P(A) ≤ 1.

Axiom 2. For the sample space,


P(S) = 1.

Axiom 3. For any two mutually exclusive events A and B, that is, A ∩ B = ∅,

P(A ∪ B) = P(A) + P(B).


EXAMPLE 1.8
Let H denote the event of getting a head when a coin is tossed. Find
P(H), if

(i) the coin is fair;

(ii) the coin is biased and a head is twice as likely to appear as a tail.
Solution:
The sample space is S = {H, T }.

(i) “The coin is fair" means that P(H) = P(T ).

The events {H} and {T} are mutually exclusive. Thus based on Axioms 2 and 3, we have

1 = P(S) = P({H} ∪ {T }) = P(H) + P(T ) = 2P(H).

This gives P(H) = 1/2.


(ii) “A head is twice as likely to appear as a tail” means P(H) = 2P(T); therefore

1 = P(S) = P({H} ∪ {T }) = P(H) + P(T ) = 3P(T ).

This gives P(T ) = 1/3 and P(H) = 2/3.


L-EXAMPLE 1.14
Roll a fair die. Consider the following events:

A = {an even number turns up};

B = {1 or 3 turns up};

C = {a number divisible by 3 is obtained}.

Find P(A), P(B), P(C), P(A ∪ B), and P(A ∪C).


Solution:
The events are
A = {2, 4, 6}, B = {1, 3}, C = {3, 6}.
We have P(A) = 3/6 = 1/2; P(B) = P(C) = 1/3.
Since A ∩ B = ∅, based on Axiom 3, we have
P(A ∪ B) = P(A) + P(B) = 5/6.
However, A ∩ C = {6} ≠ ∅, so Axiom 3 is not applicable. Instead, as A ∪ C = {2, 3, 4, 6}, this leads to
P(A ∪ C) = 4/6 = 2/3.
Basic Properties of Probability
Using the axioms, we can derive the following propositions.
PROPOSITION 2
The probability of the empty set ∅ is P(∅) = 0.
Proof Since ∅ ∩ ∅ = ∅ and ∅ = ∅ ∪ ∅, applying Axiom 3 leads to

P(∅) = P(∅ ∪ ∅) = P(∅) + P(∅) = 2P(∅).

This implies that P(∅) = 0. ✠
PROPOSITION 3
If A1, A2, . . . , An are mutually exclusive events, that is, Ai ∩ Aj = ∅ for any i ≠ j, then

P(A1 ∪ A2 ∪ · · · ∪ An) = P(A1) + P(A2) + . . . + P(An).

Proof This proposition can be established easily using induction and Axiom 3. ✠
PROPOSITION 4
For any event A, we have

P(A′) = 1 − P(A).
Proof Since S = A ∪ A′ and A ∩ A′ = ∅, based on Axioms 2 and 3, we have
1 = P(S) = P(A ∪ A′) = P(A) + P(A′).
The result follows. ✠
PROPOSITION 5
For any two events A and B,

P(A) = P(A ∩ B) + P(A ∩ B′).

Proof Based on the properties

A = (A ∩ B) ∪ (A ∩ B′) and (A ∩ B) ∩ (A ∩ B′) = ∅,

and Axiom 3, we have
P(A) = P(A ∩ B) + P(A ∩ B′).

PROPOSITION 6
For any two events A and B,

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Proof Based on the event operations

A ∪ B = B ∪ (A ∩ B′) and B ∩ (A ∩ B′) = ∅,

and Proposition 5, which states

P(A ∩ B′) = P(A) − P(A ∩ B),

we have

P(A ∪ B) = P(B) + P(A ∩ B′) = P(B) + P(A) − P(A ∩ B).


PROPOSITION 7
If A ⊂ B, then P(A) ≤ P(B).
Proof Since A ⊂ B, we have A ∪ B = B. Also, we have

A ∪ B = A ∪ (B ∩ A′) and A ∩ (B ∩ A′) = ∅.

Thus we obtain

P(B) = P(A ∪ B) = P(A ∪ (B ∩ A′)) = P(A) + P(B ∩ A′) ≥ P(A).


EXAMPLE 1.9
A retail establishment accepts either the American Express or the VISA
credit card.
A total of 24% of its customers carry an American Express card, 61%
carry a VISA card, and 11% carry both.
What is the probability that a customer carries a credit card that the
establishment will accept?
Solution:
Let
A = {the customer carries an American Express Card}
and
V = {the customer carries a VISA Card}.
Then
P(A) = 0.24, P(V ) = 0.61, P(A ∩V ) = 0.11.

The question asked for P(A ∪V ), which is given as

P(A ∪V ) = P(A) + P(V ) − P(A ∩V ) = 0.24 + 0.61 − 0.11 = 0.74.


L-EXAMPLE 1.15 (HALL PAGEANT)
Audrey is taking part in her hall’s pageant. The probability that she will win the crown is 0.14; the probability that she will win Miss Photogenic is 0.3; the probability that she will win both is 0.11.

(i) What is the probability that she wins at least one of two?

(ii) What is the probability that she wins only one of two?
Solution:
Let A = {win the crown} and B = {win Miss Photogenic}.

(i) The probability that she wins at least one of the two titles is

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.14 + 0.3 − 0.11 = 0.33.

(ii) The event that she wins the crown but not Miss Photogenic is
A ∩ B′. Proposition 5 gives

P(A ∩ B′) = P(A) − P(A ∩ B) = 0.14 − 0.11 = 0.03.


[Figure: Venn diagram of A and B with regions A ∩ B′, A ∩ B, and A′ ∩ B]

Similarly, the probability that she wins Miss Photogenic but not
the crown is
P(B ∩ A′) = P(B) − P(A ∩ B) = 0.3 − 0.11 = 0.19.

As (A ∩ B′) ∩ (A′ ∩ B) = ∅, the required probability is
P((A ∩ B′) ∪ (A′ ∩ B)) = 0.03 + 0.19 = 0.22.
FINITE SAMPLE SPACE WITH EQUALLY LIKELY OUTCOMES
Consider a sample space $S = \{a_1, a_2, \dots, a_k\}$.
Assume that all outcomes in the sample space are equally likely to occur, i.e.,

$$P(a_1) = P(a_2) = \dots = P(a_k).$$

Then for any event A ⊂ S,

$$P(A) = \frac{\text{number of sample points in } A}{\text{number of sample points in } S}.$$
EXAMPLE 1.10
A box contains 50 bolts and 150 nuts. Half of the bolts and half of the
nuts are rusted.
If one item is chosen at random, what is the probability that it is rusted
or is a bolt?
Solution:
We define the following events

A = {the item is rusted}, B = {the item is a bolt}, S = {all the items}.

Since the item is selected at random, each of the 200 elements in S is equally likely to be chosen.

• A consists of 25 + 75 = 100 elements;

• B consists of 50 elements; and

• A ∩ B consists of 25 elements.
These give

P(A) = 100/200 , P(B) = 50/200 , P(A ∩ B) = 25/200.

Therefore the required probability is

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 5/8.


L-EXAMPLE 1.16 (BIRTHDAY PROBLEM)
Here’s a useful party trick: walk into a room or bar with at least 50
people.
Boldly claim that you sense two people sharing the same birthday. Act
awesome afterwards.
How often are you right?
Solution:
We can cast this as a probability question:

There are n people in a room, what is the probability that there are at
least two persons with the same birthday?

We need to make some assumptions:

• Each of the 365 days is equally likely to be any given person's birthday.

• There are no leap years, so every year has 365 days.


We can then have the following:

• The sample space is

S = {all possible combinations of birthdays of n people}.

It is formed of equally likely sample points.

• Let

A = {at least two people share the same birthday},

then
A′ = {all people have different birthdays}.
• We count the number of points in S and A′:

– Call the n persons $p_1, p_2, \dots, p_n$.

– If all of them have different birthdays,
there are 365 possible days for $p_1$'s birthday;
there are 365 − 1 possible days for $p_2$'s birthday;
...
there are 365 − (n − 1) possible days for $p_n$'s birthday.
As a consequence, #(A′) = 365(364) · · · [365 − (n − 1)].
– Similarly, #S = $365^n$.
• Therefore

$$P(A') = \frac{\#(A')}{\#S} = \frac{365(364)\cdots[365-(n-1)]}{365^n},$$

and hence

$$P(A) = 1 - P(A') = 1 - \left(1-\frac{1}{365}\right)\left(1-\frac{2}{365}\right)\cdots\left(1-\frac{n-1}{365}\right).$$
Let $q_n = P(A')$ when there are n people, and $p_n = P(A) = 1 - q_n$. The values of $p_n$ and $q_n$ for selected values of n are tabulated:

n      q_n              p_n
1      1                0
2      0.99726          0.00274
3      0.99180          0.00820
10     0.88305          0.11695
15     0.74710          0.25290
20     0.58856          0.41144
21     0.55631          0.44369
22     0.52430          0.47570
23     0.49270          0.50730
30     0.29368          0.70632
40     0.10877          0.89123
50     0.02963          0.97037
100    3.0725×10^−7     ≈ 1
253    6.9854×10^−53    ≈ 1
• For 50 people, about 97% of the time you can find at least two people with the same birthday.

• The probability of having at least two people sharing the same birthday exceeds 1/2 once you have 23 people.

• When there are 100 people, it is almost sure that you can find two
people sharing the same birthday!
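
The tabulated values can be recomputed from the product formula for P(A′). The function below is a sketch added for illustration (its name and the `days` parameter are introduced here), under the same assumptions as the solution: 365 equally likely birthdays and no leap years.

```python
def prob_shared_birthday(n, days=365):
    """P(at least two of n people share a birthday), via 1 - P(all different)."""
    q = 1.0
    for k in range(n):
        q *= (days - k) / days  # person k+1 must avoid the k birthdays already taken
    return 1.0 - q

for n in (10, 23, 50):
    print(n, round(prob_shared_birthday(n), 5))

# Smallest group size for which a shared birthday is more likely than not.
smallest = next(n for n in range(1, 366) if prob_shared_birthday(n) > 0.5)
print(smallest)
```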
L-EXAMPLE 1.17 (INVERSE BIRTHDAY PROBLEM)
How large does a group of (randomly selected) people have to be such that the probability that someone shares his or her birthday with you is larger than 0.5?
Solution:
The probability that n persons all have different birthdays from you is $(364/365)^n$.
So we need an n such that

$$1 - \left(\frac{364}{365}\right)^n \ge 0.5.$$

Solving it, we obtain

$$n \ge \frac{\log(0.5)}{\log(364/365)} \approx 252.7.$$

So we need at least 253 people (excluding yourself).
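
The bound can be evaluated in a couple of lines. This is an illustrative sketch, not part of the original slides; it uses the natural logarithm, which is fine because the ratio of two logarithms does not depend on the base.

```python
from math import ceil, log

# Solve 1 - (364/365)**n >= 0.5 for n.
bound = log(0.5) / log(364 / 365)
n_min = ceil(bound)

print(round(bound, 1), n_min)

# Sanity check on either side of the threshold.
print(1 - (364 / 365) ** (n_min - 1), 1 - (364 / 365) ** n_min)
```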
REMARK (BIRTHDAY PROBLEMS)
Why is there a big difference in the answers between the two birthday problems?

• The inverse birthday problem requires the sharing of a particular day as the
common birthday;

• The birthday problem allows that any day is the shared birthday.
5 CONDITIONAL PROBABILITY

Sometimes we need to compute the probability of some events when some partial information is available.
Specifically, we might need to compute the probability of an event B,
given that we have the information “an event A has occurred".
Mathematically, we denote
P(B|A)
as the conditional probability of the event B, given that event A has
occurred.
DEFINITION 8 (CONDITIONAL PROBABILITY)
For any two events A and B with P(A) > 0, the conditional probability of B given that A has occurred is defined by

$$P(B|A) = \frac{P(A \cap B)}{P(A)}.$$
EXAMPLE 1.11
A fair die is rolled twice.

(i) What is the probability that the sum of the 2 rolls is even?

(ii) Given that the first roll is a 5, what is the (conditional) probability
that the sum of the 2 rolls is even?
Solution:

We define the following events:

B = {the sum of the 2 rolls is even},


A = {the first roll is a 5}.
(i) The sample space is given by

                         2nd roll
              1       2       3       4       5       6
1st roll  1   (1, 1)  (1, 2)  (1, 3)  (1, 4)  (1, 5)  (1, 6)
          2   (2, 1)  (2, 2)  (2, 3)  (2, 4)  (2, 5)  (2, 6)
          3   (3, 1)  (3, 2)  (3, 3)  (3, 4)  (3, 5)  (3, 6)
          4   (4, 1)  (4, 2)  (4, 3)  (4, 4)  (4, 5)  (4, 6)
          5   (5, 1)  (5, 2)  (5, 3)  (5, 4)  (5, 5)  (5, 6)
          6   (6, 1)  (6, 2)  (6, 3)  (6, 4)  (6, 5)  (6, 6)

It is easy to see that P(B) = 18/36.


(ii) Since we know that A has already happened, we can just look at
the fifth row:

2nd roll
1 2 3 4 5 6
1st roll 5 (5, 1) (5, 2) (5, 3) (5, 4) (5, 5) (5, 6)

We are interested in the instances along this row that give an even sum. So P(B|A) = 3/6.
Alternatively, we can use the formula:

$$P(B|A) = \frac{P(AB)}{P(A)} = \frac{3/36}{6/36} = \frac{1}{2}.$$
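
Both routes to P(B|A) can be checked by enumerating the 36 equally likely outcomes. The sketch below is an addition for illustration; `prob` is a helper name introduced here, and `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs (1st roll, 2nd roll), all equally likely.
S = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) for equally likely outcomes: #event / #S."""
    return Fraction(len(event), len(S))

A = {pt for pt in S if pt[0] == 5}          # the first roll is a 5
B = {pt for pt in S if sum(pt) % 2 == 0}    # the sum of the 2 rolls is even

p_B = prob(B)
p_B_given_A = prob(A & B) / prob(A)         # definition of conditional probability

# Reduced sample space: count within A only.
p_reduced = Fraction(len(A & B), len(A))

print(p_B, p_B_given_A, p_reduced)
```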
REMARK (REDUCED SAMPLE SPACE)
P(B|A) is read as:

“the conditional probability that B occurs given that A has occurred.”

Since we know that A has occurred, regard A as our new, or reduced, sample space.

The conditional probability of B given A equals the probability of A ∩ B relative to the probability of A.
L-EXAMPLE 1.18
Suppose we roll two fair dice.
Given that the first die is less than 3, what is the probability that the
sum of the 2 dice is more than 7?
Solution:
Define the events:

B = {the sum of the 2 dice is more than 7}


A = {the first die is less than 3}

Consider the reduced sample space, i.e., event A, with the following
12 equally likely sample points:

{(1, 1), (1, 2), . . . , (1, 6), (2, 1), (2, 2), . . . , (2, 6)}

The required probability is P(B|A) = 1/12 since there is only one point
(2, 6) in the reduced sample space that gives a sum more than 7.
MULTIPLICATION RULE
Starting from the definition of conditional probability, and rearranging the terms, we have

P(A ∩ B) = P(A)P(B|A), if P(A) ≠ 0

or P(A ∩ B) = P(B)P(A|B), if P(B) ≠ 0.

This is known as the Multiplication Rule.


INVERSE PROBABILITY FORMULA
The multiplication rule together with the definition of the conditional probability gives us:

$$P(A|B) = \frac{P(A)P(B|A)}{P(B)}.$$

This is known as the Inverse Probability Formula.
EXAMPLE 1.12
Deal 2 cards from a regular playing deck without replacement. What is the probability that both cards are aces?
Solution:

$$\begin{aligned}
P(\text{both aces}) &= P(\text{1st card is ace and 2nd card is ace}) \\
&= P(\text{1st card ace}) \cdot P(\text{2nd card ace} \mid \text{1st card ace}) \\
&= \frac{4}{52} \cdot \frac{3}{51} = \frac{1}{221}.
\end{aligned}$$
L-EXAMPLE 1.19
Suppose that among 12 shirts, 3 are white. Two shirts are chosen randomly one by one without replacement.

(i) What is the probability that both shirts picked are white?

(ii) What is the probability that only one white shirt is picked?
Solution:
Define the events:
A1 = {the first shirt is white};
A2 = {the second shirt is white}.

(i) We have P(A1) = 3/12. Given that the first shirt is white, there
are 2 white shirts among the remaining 11 shirts, so
P(A2|A1) = 2/11.

So the probability that both shirts picked are white is


P(A1 ∩ A2) = P(A1)P(A2|A1) = (3/12)(2/11) = 1/22.
(ii) We need to compute the probability
P((A1 ∩ A′2) ∪ (A′1 ∩ A2)) = P(A1 ∩ A′2) + P(A′1 ∩ A2).
The equality above holds since (A1 ∩ A′2) ∩ (A′1 ∩ A2) = ∅.

On the other hand, using an argument similar to that in Part (i), we have


P(A1 ∩ A′2) = P(A1)P(A′2|A1) = (3/12) · (9/11);
P(A′1 ∩ A2) = P(A′1)P(A2|A′1) = (9/12) · (3/11).

Consequently,
P((A1 ∩ A′2) ∪ (A′1 ∩ A2)) = (3/12) · (9/11) + (9/12) · (3/11) = 9/22.
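Both answers can be verified by listing every ordered pair of distinct shirts. This is a sketch under the assumption that all 12 · 11 = 132 ordered draws are equally likely; the labels "W" and "N" are illustrative.

```python
from fractions import Fraction
from itertools import permutations

shirts = ["W"] * 3 + ["N"] * 9  # 3 white shirts, 9 non-white shirts

# Ordered draws of two distinct shirts, without replacement.
draws = list(permutations(range(12), 2))

both_white = sum(1 for i, j in draws if shirts[i] == "W" and shirts[j] == "W")
exactly_one = sum(1 for i, j in draws if (shirts[i] == "W") != (shirts[j] == "W"))

p_both = Fraction(both_white, len(draws))
p_one = Fraction(exactly_one, len(draws))
print(p_both, p_one)  # 1/22 9/22
```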
6 I NDEPENDENCE

We saw several examples where conditioning on one event changes
our beliefs about the probability of another event.
In this section, we discuss the important concept of independence,
where learning that the event B occurred gives us no information that
would change our probabilities for another event A occurring.
D EFINITION 9 (I NDEPENDENCE )
Two events A and B are independent if and only if

P(A ∩ B) = P(A)P(B).

We denote this by A ⊥ B.
If A and B are not independent, they are said to be dependent, denoted by
A ̸⊥ B.
R EMARK
If P(A) ≠ 0, then A ⊥ B if and only if P(B|A) = P(B).
This follows from the definition of conditional probability:

A ⊥ B ⇔ P(B|A) = P(A ∩ B)/P(A) = P(A)P(B)/P(A) = P(B).

Intuitively, this is the same as saying that A and B are independent if the knowledge
of A does not change the probability of B.

Likewise, if P(B) ≠ 0, then A ⊥ B if and only if P(A|B) = P(A).


E XAMPLE 1.13
Suppose we roll 2 fair dice.

(i) Let

A6 = {the sum of two dice is 6}, B = {the first die equals 4}.

Thus

P(A6) = 5/36, P(B) = 6/36 = 1/6 and P(A6 ∩ B) = 1/36.

As P(A6 ∩ B) ≠ P(A6)P(B), we say that A6 and B are dependent.


(ii) Let
A7 = {the sum of two dice is 7}.

Then

P(A7 ∩ B) = 1/36, P(A7) = 1/6 and P(B) = 1/6.

As P(A7 ∩ B) = P(A7)P(B), we say that A7 and B are independent.
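The two conclusions can be confirmed by testing the product rule P(A ∩ B) = P(A)P(B) over all 36 outcomes. A minimal sketch follows; the helpers prob and independent are illustrative names.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # (first die, second die)

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

def independent(e1, e2):
    """Check the defining identity P(E1 and E2) == P(E1) * P(E2) exactly."""
    return prob(lambda w: e1(w) and e2(w)) == prob(e1) * prob(e2)

first_is_4 = lambda w: w[0] == 4   # event B
sum_is_6 = lambda w: sum(w) == 6   # event A6
sum_is_7 = lambda w: sum(w) == 7   # event A7

print(independent(sum_is_6, first_is_4))  # False
print(independent(sum_is_7, first_is_4))  # True
```

Using exact fractions avoids the rounding issues that a floating-point equality test would introduce.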


I NDEPENDENT VS M UTUALLY E XCLUSIVE
Independence and mutual exclusivity are totally different concepts:

A, B independent ⇔ P(A ∩ B) = P(A)P(B)


A, B mutually exclusive ⇔ A ∩ B = ∅

“Mutual exclusivity” can be illustrated by a Venn diagram (as in the figure below),
but we cannot do that for “independence”.

[Figure: Venn diagram of events A and B in the sample space S]
L-E XAMPLE 1.20 (S OME PROPERTIES OF INDEPENDENCE )
Determine if the following statements are TRUE or FALSE.

(a) Suppose P(A) > 0, P(B) > 0. If A ⊥ B, then A and B are not mutually
exclusive.

(b) Suppose P(A) > 0, P(B) > 0. If A and B are mutually exclusive, then
A ̸⊥ B.

(c) S and ∅ are independent of any other event.

(d) If A ⊥ B, then A ⊥ B′, A′ ⊥ B, and A′ ⊥ B′.


Solution:

(a) TRUE
Using independence, P(A ∩ B) = P(A)P(B) > 0, so A ∩ B ≠ ∅.

(b) TRUE
Using mutual exclusivity, P(A ∩ B) = 0 ≠ P(A)P(B), since P(A)P(B) > 0.
(c) TRUE
For any event A,
P(A ∩ S) = P(A) = P(A)P(S),
and
P(A ∩ ∅) = P(∅) = 0 = P(A)P(∅).

(d) TRUE
We shall derive only one. Note that A = (A ∩ B) ∪ (A ∩ B′), where the
two sets on the right are mutually exclusive. So we have
P(A ∩ B′) = P(A) − P(A ∩ B) = P(A) − P(A)P(B)
= P(A)(1 − P(B)) = P(A)P(B′).
L-E XAMPLE 1.21
The probability that Tom will be alive in 20 years is 0.7. The probability
that Jack will be alive in 20 years is 0.9.
What is the probability that neither will be alive in 20 years?
Solution:
Define

A = {Tom will be alive in 20 years};

B = {Jack will be alive in 20 years}.

We make the assumption that A and B are independent, i.e., whether one
is alive does not affect the other. Then A′ and B′ are also independent.
The desired probability is then given by

P(A′ ∩ B′) = P(A′)P(B′) = (1 − 0.7)(1 − 0.9) = 0.03.


7 T HE L AW OF T OTAL P ROBABILITY

The definition of conditional probability has far-reaching consequences.


In this section we look at the Law of Total Probability (LOTP), which
relates conditional probability to unconditional probability.
D EFINITION 10 (PARTITION )
If A1, A2, . . . , An are mutually exclusive events and A1 ∪ A2 ∪ · · · ∪ An = S, we call
A1, A2, . . . , An a partition of S.
T HEOREM 11 (L AW OF T OTAL P ROBABILITY )
Suppose A1, A2, . . . , An is a partition of S. Then for any event B, we have
P(B) = ∑_{i=1}^n P(B ∩ Ai) = ∑_{i=1}^n P(Ai)P(B|Ai).
S PECIAL CASE : L AW OF T OTAL P ROBABILITY
For any events A and B, we have

P(B) = P(A)P(B|A) + P(A′)P(B|A′).


E XAMPLE 1.14 (F RYING F ISH )
At a nasi lemak stall, the chef and his assistant take turns to fry fish.
The chef burns his fish with probability 0.1, while his assistant burns his
fish with probability 0.23.
If the chef is frying fish 80% of the time, what is the probability that
the fish you order is burnt?
Solution:
Let

B = {the fish is burnt},


C = {the fish is fried by the chef}.

We then need to compute P(B). Using the Law of Total Probability,

P(B) = P(C)P(B|C) + P(C′)P(B|C′) = 0.8 × 0.1 + 0.2 × 0.23 = 0.126.
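The Law of Total Probability is a weighted sum, which is easy to express as a small function. The sketch below is illustrative only; the name total_probability and the list-based interface are assumptions, not part of the notes.

```python
import math

def total_probability(priors, likelihoods):
    """P(B) = sum over i of P(A_i) * P(B|A_i), for a partition A_1, ..., A_n."""
    if not math.isclose(sum(priors), 1.0):
        raise ValueError("the probabilities of a partition must sum to 1")
    return sum(p * l for p, l in zip(priors, likelihoods))

# Partition {C, C'}: chef fries 80% of the time, assistant 20%.
p_burnt = total_probability([0.8, 0.2], [0.1, 0.23])
print(f"{p_burnt:.3f}")  # 0.126
```

The same function handles any finite partition, not only the two-event special case used here.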


L-E XAMPLE 1.22 (T HE M ONTY H ALL P ROBLEM )
Suppose you are on a game show, and you are given the choice of three
doors: behind one door is a car; behind the others, goats.
You pick a door, say No. 1, and the host Monty, who knows what is
behind the doors, opens another door, say No. 3, which has a goat. He
then says to you, “Do you want to pick door No. 2?"
Is it to your advantage to switch your choice?
Solution:
Let’s formulate this as a probability question. Denote the events:

W = {win the car};


A = {car is behind the door you picked initially}.

Then our interest is P(W ), the probability of winning the car.

Using the Law of Total Probability, we have

P(W) = P(A)P(W|A) + P(A′)P(W|A′)

= (1/3) P(W|A) + (2/3) P(W|A′).
We can choose either one of the following strategies:
(i) A “stick" strategy: stick to your initial choice.

This will give us


• P(W |A) = 1; that is, you are sure to win the car if the initial
choice is the car door;
• P(W |A′) = 0; that is, you are sure to lose the car if the initial
choice is not the car door.
As a consequence,
P(W ) = (1/3) · 1 + (2/3) · 0 = 1/3.
(ii) A “switch" strategy: switch to another door when asked.

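In this case, P(W|A) = 0, since switching away from the car door loses. Also,
P(W|A′) = 1: if your initial door hides a goat, Monty must open the other goat
door, so the remaining door hides the car. We then have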

P(W ) = (1/3) · 0 + (2/3) · 1 = 2/3.

Conclusion: “Switching" doubles the chance of winning the car!
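A simulation gives an empirical check of the two strategies. This is a sketch under stated assumptions: the car and the initial pick are uniformly random, and Monty chooses at random when two goat doors are available. The function names play and win_rate are illustrative.

```python
import random

def play(switch, rng):
    """Play one game; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Monty opens a goat door that is not the contestant's pick.
    monty = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither the pick nor the opened door.
        pick = next(d for d in doors if d != pick and d != monty)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of games won over repeated independent plays."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

stick_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stick: {stick_rate:.3f}, switch: {switch_rate:.3f}")
```

The observed frequencies should be close to 1/3 for sticking and 2/3 for switching, in line with the calculation above.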


R EMARK (M ONTY H ALL )
Still confused? Watch the following videos:
https://youtu.be/mhlc7peGlGg
https://youtu.be/ggDQXlinbME
8 B AYES ’ T HEOREM

We now discuss Bayes’ Theorem (or Bayes’ Rule), which will allow us
to relate P(A|B) to P(B|A) and compute conditional probabilities in a
wide range of problems.
T HEOREM 12 (B AYES ’ T HEOREM )
Let A1, A2, . . . , An be a partition of S. Then for any event B and k = 1, 2, . . . , n,

P(Ak|B) = P(Ak)P(B|Ak) / ∑_{i=1}^n P(Ai)P(B|Ai).
Proof Bayes’ Theorem can be derived based on the definition of con-
ditional probability, the Multiplication Rule, and the Law of Total Prob-
ability.
In particular,

P(Ak|B) = P(Ak ∩ B) / P(B) = P(Ak)P(B|Ak) / ∑_{i=1}^n P(B ∩ Ai)

= P(Ak)P(B|Ak) / ∑_{i=1}^n P(Ai)P(B|Ai).

S PECIAL CASE : B AYES ’ T HEOREM
Let us consider a special case of Bayes’ Theorem when n = 2.
{A, A′} becomes a partition of S, and we have

P(A|B) = P(A)P(B|A) / [P(A)P(B|A) + P(A′)P(B|A′)].
E XAMPLE 1.15
The previous formula is practically meaningful.
For example, consider the events

A = {the person has the disease}, B = {the symptom is observed}.

Then

• P(A): probability that a person has the disease in general;

• P(B|A): probability of observing the symptom, given that the person is diseased;

• P(A|B): probability that the person is diseased, given that the symptom is observed.


E XAMPLE 1.16
Historically, we observe the collapse of some newly constructed houses.
The chance that the design of a house is faulty is 1%. If the design
is faulty, the chance that the house collapses is 75%; otherwise, the
chance is 0.01%.
Given that a newly constructed house has collapsed, what is the
probability that its design is faulty?
Solution:

Let

B = {The design is faulty}, A = {The house collapses}.

We then have

P(B) = 0.01, P(A|B) = 0.75, and P(A|B′) = 0.0001.

The question asked for P(B|A). We will compute it using Bayes’ Theo-
rem.
The denominator can be computed using the Law of Total Probability:

P(A) = P(B)P(A|B) + P(B′)P(A|B′)


= (0.01)(0.75) + (0.99)(0.0001) = 0.007599.

The numerator is

P(A|B)P(B) = 0.75(0.01) = 0.0075.

As a consequence,

P(B|A) = P(A|B)P(B) / P(A) = 0.0075/0.007599 ≈ 0.9870.
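The two-event form of Bayes' Theorem can be wrapped in a short helper. The sketch below is illustrative; the function name bayes_two_events and its parameter names are assumptions, with H standing for the hypothesis (faulty design) and E for the evidence (collapse).

```python
def bayes_two_events(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) from P(H), P(E|H) and P(E|H')."""
    # Denominator: Law of Total Probability over the partition {H, H'}.
    p_evidence = prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    return prior * p_e_given_h / p_evidence

p_faulty_given_collapse = bayes_two_events(0.01, 0.75, 0.0001)
print(f"{p_faulty_given_collapse:.4f}")  # 0.9870
```

Although a faulty design is rare (1%), a collapse is so much more likely under a faulty design that the posterior probability rises to about 98.7%.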
L-E XAMPLE 1.23
An insurance company believes that people can be divided into two
classes: accident-prone and not accident-prone.
Historically, they observe that the probability that an accident-prone
person will have an accident within a fixed 1-year period is 0.04. For a
not accident-prone person, that probability is 0.02.
Assume that 30% of the population is accident-prone.

(i) What is the probability that a new policyholder will have an ac-
cident within a year of purchasing a policy?

(ii) If a new policyholder has an accident within a year of purchasing
a policy, what is the probability that he or she is accident-prone?
Solution:
Define the events:

B = {a new policyholder has an accident within a year};

A = {a new policyholder is accident-prone}.
We are told that

P(A) = 0.3, P(B|A) = 0.04, P(B|A′) = 0.02.

(i) The probability asked for is

P(B) = P(A)P(B|A) + P(A′)P(B|A′) = 0.3(0.04) + 0.7(0.02) = 0.026.

(ii) The probability asked for is

P(A|B) = P(A)P(B|A)/P(B) = 0.3(0.04)/0.026 = 6/13.
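Parts (i) and (ii) can be computed together for a general partition, using exact fractions so that the answers match the hand calculation. This is a sketch only; the function name posteriors and the list-based interface are assumptions.

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """Return (P(B), [P(A_k|B) for each k]) for a partition A_1, ..., A_n."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A_k and B)
    p_b = sum(joint)                                      # Law of Total Probability
    return p_b, [j / p_b for j in joint]                  # Bayes' Theorem

# Partition {A, A'}: accident-prone and not accident-prone.
priors = [Fraction("0.3"), Fraction("0.7")]
likelihoods = [Fraction("0.04"), Fraction("0.02")]

p_accident, post = posteriors(priors, likelihoods)
print(p_accident, post[0])  # 13/500 6/13
```

Here 13/500 = 0.026 is the answer to part (i) and 6/13 is the answer to part (ii); the posterior probabilities over the partition always sum to 1.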
