
Probability and Statistics

Chapter 3
Elements of chance: Probability Methods

Prof. Simón Ortiz

School of Economics and Business


Universidad de Navarra

2021-22

Contents

1 Introduction

2 Random Experiment, Outcomes, and Events
   Set theory
   Probability
   Probability rules

3 Bivariate Probabilities
   Bayes Theorem

Introduction

We have seen several tools used in descriptive statistics and have
described methods used to collect, organize, and present data, as
well as measures of central location, dispersion, and skewness used
to summarize data.
A second facet of statistics deals with computing the chance that
something will occur in the future. This facet of statistics is called
inferential statistics. An inference is a generalization about a
population based on information obtained from a sample.
Probability plays a key role in inferential statistics. It is used to
measure how reasonable it is that a particular sample could have
come from a particular population.

Definitions
• Probability: A value between zero and one, inclusive, describing
the relative possibility (chance or likelihood) an event will occur.
• Experiment: A process that leads to the occurrence of one and
only one of several possible observations.
• Outcome: A particular result of an experiment.
• Event: A collection of one or more outcomes of an experiment.
• Sample space: The set of all possible outcomes of an experiment.
The sample space is usually denoted by S.
Example: If the experiment consists of flipping two coins and noting
whether they land heads or tails, then

S = {(H, H), (H, T), (T, H), (T, T)}

Notice that an event is a subset of the sample space. Events will be
denoted by the capital letters A, B, C, and so on.

Random experiment

A random experiment is a process leading to two or more possible
outcomes, without knowing exactly which outcome will occur.
Examples of random experiments include the following:
• A coin is tossed and the outcome is either a head or a tail.
• Daily change in an index of the stock market.
• Number of people admitted to a hospital emergency room.

Sample Space

The possible outcomes from a random experiment are called the
basic outcomes, and the set of all basic outcomes is called the
sample space. We use the symbol S to denote the sample space.
We must define the basic outcomes in such a way that no two
outcomes can occur simultaneously. In addition, the random
experiment must necessarily lead to the occurrence of one of the
basic outcomes.
For tossing a single six-sided die, the typical sample space is:

S = {1, 2, 3, 4, 5, 6}

Event

An event, E, is any subset of basic outcomes from the sample space.
An event occurs if the random experiment results in one of its
constituent basic outcomes. The null event represents the absence
of a basic outcome and is denoted by ∅.
We could be interested in the simultaneous occurrence of two or
more events. Example: an odd number when tossing a die.

Event A = {1, 3, 5}

Set theory

A set is a collection of elements/objects. Operations on sets:

• Intersection ⇒ A ∩ B

A ∩ B = {x | x ∈ A and x ∈ B}

• Union ⇒ A ∪ B

A ∪ B = {x | x ∈ A or x ∈ B}

• Complement ⇒ A → Aᶜ
• Difference ⇒ A \ B

A \ B = A ∩ Bᶜ

Set theory

• Subset: A ⊂ B: for all a ∈ A, we have a ∈ B.
• C = {2, 4, 6, 8} and D = {2, 4}. D is a subset of C because every
element of D is an element of C.
• In general, D ⊂ C iff x ∈ D implies x ∈ C.
• J = {1, 2, 3, 4, 5}. How many subsets of J are there?

2⁵ = 32, namely ∅, {1}, {1, 2}, ..., {1, 2, 3, 4, 5}

• A = B iff A ⊂ B and B ⊂ A.
A short code illustration of these set operations follows.
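
These operations map directly onto Python's built-in set type. A
minimal sketch, using the die events from these slides:

S = {1, 2, 3, 4, 5, 6}         # sample space for one die
A = {1, 3, 5}                  # event: odd number
B = {4, 5, 6}                  # event: number greater than 3

print(A & B)                   # intersection A ∩ B -> {5}
print(A | B)                   # union A ∪ B -> {1, 3, 4, 5, 6}
print(S - A)                   # complement of A relative to S -> {2, 4, 6}
print(A - B)                   # difference A \ B -> {1, 3}
print({2, 4} <= {2, 4, 6, 8})  # subset test D ⊂ C -> True
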
Intersection

Let A and B be two events in the sample space S. Their intersection,
denoted by A ∩ B, is the set of all basic outcomes in S that belong
to both A and B. Hence, the intersection A ∩ B occurs if and only
if both A and B occur. We use the term joint probability of A and
B to denote the probability of the intersection of A and B.
If the events A and B have no common basic outcomes, they are
called mutually exclusive, and their intersection, A ∩ B, is said to
be the empty set (∅), indicating that A ∩ B has no members.

Intersection: Venn Diagram

[Venn diagrams illustrating the intersection A ∩ B and two mutually
exclusive events]

Union

Let A and B be two events in the sample space S. Their union,
denoted by A ∪ B, is the set of all basic outcomes in S that belong
to at least one of these two events. Hence, the union A ∪ B occurs
if and only if (iff) either A or B or both occur.
If the union of several events covers the entire sample space, S, we
say that these events are collectively exhaustive. Since every basic
outcome is in S, it follows that every outcome of the random
experiment will be in at least one of these events.

Union: Venn Diagram

[Venn diagram illustrating the union A ∪ B]

Complement

Let A be an event in the sample space, S. The set of basic outcomes
of a random experiment belonging to S but not to A is called the
complement of A and is denoted by Ā.
Clearly, events A and Ā are mutually exclusive (no basic outcome
can belong to both) and collectively exhaustive (every basic outcome
must belong to one or the other).

Complement: Venn Diagram

[Venn diagrams illustrating an event A and its complement Ā within
the sample space S]

Postulates of probability
• Objective probability
  • Classical probability: the proportion of times that an event
  will occur, assuming that all outcomes in the sample space are
  equally likely to occur. The probability of an event is determined
  by dividing the number of outcomes that satisfy the event by the
  total number of outcomes in the sample space:

  P(A) = N_A / N

  • Empirical probability: the probability of an event happening is
  the fraction of the time similar events happened in the past, i.e.,
  the proportion of times that an event A occurs in a large number
  of trials, n:

  P(A) = n_A / n

• Subjective probability: the likelihood (probability) of a
particular event happening that is assigned by an individual
based on whatever information is available.
A small numeric illustration of the classical approach follows.
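
As a sketch of the classical count for the fair-die example (the
fairness assumption is what makes all outcomes equally likely):

from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # equally likely outcomes of a fair die
A = {1, 3, 5}            # event: odd number

# Classical probability: outcomes satisfying the event over all outcomes.
print(Fraction(len(A), len(S)))   # 1/2
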
Properties of probability
Consider an experiment whose sample space is S. We suppose that
for each event A there is a number, denoted P (A) and called the
probability of event A, that is in accord with the following three
properties:
• For any event A, the probability of A is a number between 0
and 1:
0 ≤ P (A) ≤ 1
• The probability of sample space S is 1:
P (S) = 1
• The probability of the union of disjoint events is equal to the
sum of the probabilities of these events. For instance, if A and
B are disjoint, then:
P (A ∪ B) = P (A) + P (B)
Law of large numbers

Law of Large Numbers: over a large number of trials, the empirical
probability of an event will approach its true probability.

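
A minimal simulation sketch of this behaviour for a fair coin (the
trial sizes and the seed are arbitrary choices, not from the slides):

import random

random.seed(42)  # fixed seed so the sketch is reproducible
for n in (10, 100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"n = {n:>9}: empirical P(heads) = {heads / n:.4f}")

As n grows, the printed frequencies settle toward the true value 0.5.
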
Combinations and permutations

The classical statement of probability requires that we count
outcomes in the sample space. Then we use the counts to determine
the required probability.
However, counting all the outcomes would be very time consuming
if we first had to identify every possible outcome.
We will use combinations and permutations to count all possible
outcomes.
When the order doesn't matter, it is a combination. When the order
does matter, it is a permutation.

Permutations

There are two types of permutations:
• Repetition is allowed. Example: a lock code such as 333.
• No repetition: for example, the first three people in a running
race. You can't be first and second.

Permutations with repetition

We have to choose among n different objects, so we have n choices
each time. More generally, we are choosing r of something that has
n different types. If r = 1, then we have n possibilities.
If r > 1, then:

n × n × ... × n (r times) = nʳ

Example: In a lock, there are 10 numbers to choose from
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9) and we choose 3 of them:

10 × 10 × 10 (3 times) = 10³ = 1,000 permutations

In sum, nʳ, where n is the number of things to choose from and we
choose r of them; repetition is allowed, and order matters.

Permutations without repetition
In this case, we have to reduce the number of available choices each
time. We will use the factorial function (3! = 3 × 2 × 1). Remember
that 0! = 1.
Example: In what order could 16 pool balls be?
After choosing, say, number 14, we can't choose it again. So our
first choice has 16 possibilities, our next choice has 15
possibilities, then 14, 13, 12, 11, ... etc. And the total number of
permutations is:

16 × 15 × 14 × 13 × ... × 1 = 16! = 20,922,789,888,000

But maybe we don't want to choose them all, just 3 of them, and
that is then:

16 × 15 × 14 = 3,360

There are 3,360 different ways that 3 pool balls could be arranged
out of 16 balls.

Permutations without repetition

Now we want to know the order of 3 out of 16 pool balls. But when
we select just 3, we don't want to multiply beyond 14. How do we
do that? There is a neat trick: we divide by 13!

16! / 13! = (16 × 15 × 14 × 13 × ...) / (13 × 12 × ...) = 16 × 15 × 14

Everything from 13 downward cancels out, leaving only 16 × 15 × 14.
Therefore, the formula is:

n! / (n − r)!

Permutations without repetition

Example: the order of 3 out of 16 pool balls:

16! / (16 − 3)! = 16! / 13! = 3,360

Example: How many ways can first and second place be awarded to
10 people?

10! / (10 − 2)! = 10! / 8! = 90

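
These counts can be checked with Python's standard library; a brief
sketch (math.perm requires Python 3.8 or later):

import math

# Permutations without repetition: ordered selections.
print(math.perm(16, 3))   # 3360: 3 balls in order out of 16
print(math.perm(10, 2))   # 90: first and second place among 10 people
# Permutations with repetition: n ** r.
print(10 ** 3)            # 1000: three-digit lock codes
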
Combinations without Repetition

Remember, the order does not matter for combinations. The same as
with permutations, we can have them with and without repetition.
Going back to our pool ball example, let's say we just want to know
which 3 pool balls are chosen, not the order.
We already know that 3 out of 16 gave us 3,360 permutations. But
many of those are now the same to us, because we don't care about
the order!

Combinations without Repetition

For example, let us say balls 1, 2 and 3 are chosen. These are the
possibilities:

Order does matter   Order doesn't matter
1 2 3
1 3 2
2 1 3
2 3 1               1 2 3
3 1 2
3 2 1

Table 1: Combination without repetition

The permutations have 6 times as many possibilities (3! = 6).

Combinations without Repetition

So we adjust our permutations formula to reduce it by how many
ways the objects could be in order (because we aren't interested in
their order any more):

n!/(n − r)! × 1/r! = n!/(r!(n − r)!)

Different notation:

C(n, r) ("n choose r") = n!/(r!(n − r)!)

where n is the number of things to choose from, and we choose r of
them, with no repetition; order doesn't matter.

Combinations without Repetition

Example: Pool Balls (without order)

n!/(r!(n − r)!) = 16!/(3!(16 − 3)!) = 560

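
Again this can be checked in Python (math.comb, Python 3.8+); a
quick sketch:

import math

print(math.comb(16, 3))   # 560: unordered choices of 3 balls out of 16
# Consistency check against the permutation count:
print(math.perm(16, 3) // math.factorial(3))   # 3360 / 3! = 560
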
Combinations with Repetition

Let us say there are five flavors of ice cream: banana, chocolate,
lemon, strawberry and vanilla.
We can have three scoops. How many variations will there be?
Let’s use letters for the flavors: {b, c, l, s, v}. Example selections
include:
• {c, c, c} (3 scoops of chocolate)
• {b, l, v} (one each of banana, lemon and vanilla)
• {b, v, v} (one of banana, two of vanilla)
There are n = 5 things to choose from, we choose r = 3 of them,
order does not matter, and we can repeat!

Combinations with Repetition
Think about the ice cream being in boxes or containers: we could
say "move past the first box, then take 3 scoops, then move along
3 more boxes to the end", and we will have 3 scoops of chocolate.

b c l s v

3 scoops of chocolate:

→ ◦ ◦ ◦ →→→

One each of banana, lemon and vanilla:

◦ →→ ◦ →→ ◦

One of banana, two of vanilla:

◦ →→→→ ◦◦

Combinations with Repetition
Notice that there are always 3 circles (3 scoops of ice cream) and
4 arrows (we need to move 4 times to go from the 1st to the 5th
container). So (being general here) there are r + (n − 1) positions,
and we want to choose r of them to have circles:

(r + n − 1)!/(r!(n − 1)!) = C(r + n − 1, r) = C(r + n − 1, n − 1)

(3 + 5 − 1)!/(3!(5 − 1)!) = 7!/(3!4!) = 35

There are 35 ways of having 3 scoops from five flavors of ice cream.
Probability that, from a random choice among these variations, I
get C-C-C?

1/35

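
A tiny helper makes this reusable; the name multichoose below is my
own, not standard library vocabulary:

import math

def multichoose(n: int, r: int) -> int:
    """Combinations with repetition: r choices from n types, order ignored."""
    return math.comb(r + n - 1, r)

print(multichoose(5, 3))   # 35 ways to pick 3 scoops from 5 flavors
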
Probability rules: Complement rule

Let A be an event and Ā its complement. Then the complement rule
is as follows:

P(Ā) = 1 − P(A)

Example: Roll a die; the probability of obtaining a 1 is 1/6, and
the probability of obtaining anything other than a 1 is
1 − 1/6 = 5/6.
This result is important because in some problems it may be easier
to find P(Ā) first and then obtain P(A) from it.

Probability rules: Addition rule

Let A and B be two events. Using the addition rule of probabilities,
the probability of their union is as follows:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

If both events are mutually exclusive:

P(A ∪ B) = P(A) + P(B)

Probability rules: Addition rule

[Venn diagram illustrating the addition rule]

Probability rules: Addition rule

Example: A cell phone company found that 75% of all customers
want text messaging on their phones, 80% want photo capability,
and 65% want both. What is the probability that a customer will
want at least one of these?

P(A) = 0.75,  P(B) = 0.80,  P(A ∩ B) = 0.65

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.75 + 0.80 − 0.65 = 0.90

Conditional probability
Consider a pair of events, A and B. Suppose that we are concerned
about the probability of A, given B. This problem can be
approached using the concept of conditional probability.
Let A and B be two events. The conditional probability of event A,
given that event B has occurred, is denoted by the symbol P(A|B)
and is found as follows:

P(A|B) = P(A ∩ B)/P(B),  provided that P(B) > 0

Similarly,

P(B|A) = P(A ∩ B)/P(A),  provided that P(A) > 0

Conditional probability
Example: A cell phone company found that 75% of all customers
want text messaging (event A) on their phones, 80% want photo
capability (event B), and 65% want both. What are the probabilities
that a person who wants text messaging also wants photo capability,
and that a person who wants photo capability also wants text
messaging?
The probability that a person who wants photo capability also wants
text messaging is the conditional probability of event A, given
event B:

P(A|B) = P(A ∩ B)/P(B) = 0.65/0.80 = 0.8125

The probability that a person who wants text messaging also wants
photo capability:

P(B|A) = P(A ∩ B)/P(A) = 0.65/0.75 ≈ 0.8667

The multiplication rule of probabilities

Let A and B be two events. Using the multiplication rule of
probabilities, the probability of their intersection can be derived
from conditional probability as

P(A ∩ B) = P(A|B)P(B)

and also as

P(A ∩ B) = P(B|A)P(A)

The multiplication rule of probabilities

When the conditional probability of text messaging, given photo
capability,

P(A|B) = 0.65/0.80 = 0.8125

is multiplied by the probability of photo capability, we have the
joint probability of both messaging and photo capability:

P(A ∩ B) = P(A|B)P(B) = (0.8125)(0.80) = 0.65

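
The phone-feature numbers tie the last few rules together; a short
sketch with variable names of my own choosing:

p_text = 0.75    # P(A): wants text messaging
p_photo = 0.80   # P(B): wants photo capability
p_both = 0.65    # P(A ∩ B): wants both

print(p_both / p_photo)                       # P(A|B) = 0.8125
print(round(p_both / p_text, 4))              # P(B|A) ≈ 0.8667
print(round(p_text + p_photo - p_both, 4))    # addition rule: 0.90
print((p_both / p_photo) * p_photo)           # multiplication rule: 0.65
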
Bivariate Probabilities

In this section we introduce a class of problems that involve two
distinct sets of events, which we label A1, A2, ..., Ak and
B1, B2, ..., Bk. These events can be studied using a contingency
table or a tree diagram.
The events Ai and Bj are mutually exclusive and collectively
exhaustive within their sets, but intersections (Ai ∩ Bj) can occur
between events from the two sets.
Two sets of events, considered jointly in this way, are called
bivariate, and the probabilities are called bivariate probabilities.

Bivariate Probabilities

       B1           B2           ...   Bj
A1     (A1 ∩ B1)    (A1 ∩ B2)    ...   (A1 ∩ Bj)
A2     (A2 ∩ B1)    (A2 ∩ B2)    ...   (A2 ∩ Bj)
...    ...          ...          ...   ...
Ai     (Ai ∩ B1)    (Ai ∩ B2)    ...   (Ai ∩ Bj)

Table 2: Contingency Table

Bivariate Probabilities

Example: Consider a potential advertiser who wants to know both
the income and other relevant characteristics of the audience for a
particular television show. Families may be categorized, using Ai,
as to whether they regularly, occasionally, or never watch a
particular series. In addition, they can be categorized, using Bj,
according to low, middle, and high income.
Then we have nine possible cross-classifications, with i = 3 and
j = 3.

Contingency Table

Viewing frequency   High Income   Middle Income   Low Income   Total
Regular             0.04          0.13            0.04         0.21
Occasional          0.10          0.11            0.06         0.27
Never               0.13          0.17            0.22         0.52
Total               0.27          0.41            0.32         1.00

Table 3: Contingency Table

Decision Tree

[Tree diagram of the viewing-frequency and income events]

Joint and Marginal Probabilities

In the context of bivariate probabilities, the intersection
probabilities P(Ai ∩ Bj) are called joint probabilities. The
probabilities for individual events, P(Ai) or P(Bj), are called
marginal probabilities. Marginal probabilities lie at the margin of
the table and can be computed by summing the corresponding row
or column:

P(Ai) = P(Ai ∩ B1) + P(Ai ∩ B2) + ... + P(Ai ∩ Bj)

P(A2) = P(A2 ∩ B1) + P(A2 ∩ B2) + P(A2 ∩ B3)
      = 0.10 + 0.11 + 0.06 = 0.27

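
A minimal sketch of this computation over Table 3; the dict encoding
and function names below are my own, not from the slides:

joint = {
    ("regular", "high"): 0.04,    ("regular", "middle"): 0.13,    ("regular", "low"): 0.04,
    ("occasional", "high"): 0.10, ("occasional", "middle"): 0.11, ("occasional", "low"): 0.06,
    ("never", "high"): 0.13,      ("never", "middle"): 0.17,      ("never", "low"): 0.22,
}

def marginal_viewing(a):
    """P(A_i): sum the joint probabilities across a row of the table."""
    return sum(p for (ai, _), p in joint.items() if ai == a)

def marginal_income(b):
    """P(B_j): sum the joint probabilities down a column of the table."""
    return sum(p for (_, bj), p in joint.items() if bj == b)

print(round(marginal_viewing("occasional"), 2))   # 0.27, as above
print(round(marginal_income("high"), 2))          # 0.27
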
Conditional Probabilities

The conditional probability can be obtained easily from the table
because we have all the joint probabilities and the marginal
probabilities. For example, the probability that a high-income
family regularly watches the show is as follows:

P(A1|B1) = P(A1 ∩ B1)/P(B1) = 0.04/0.27 ≈ 0.15

Independent events

We can also check whether or not paired events are statistically
independent. Recall that events Ai and Bj are independent if and
only if their joint probability is the product of their marginal
probabilities:

P(Ai ∩ Bj) = P(Ai)P(Bj)

P(A2 ∩ B1) = 0.10

P(A2) × P(B1) = 0.27 × 0.27 = 0.0729 ≠ 0.10

Hence, events A2 and B1 are not statistically independent.

Practical Example

Suppose that in a best-of-five tournament, Sevilla F.C. has a 55%
chance of winning any game. Solve the probability tree by finding
the conditional probabilities at each node and the unconditional
probability that this particular team will win the series. (A code
sketch follows the questions below.)
• What is the unconditional probability that this team will
become champion?
• Provided that the team wins the first two games, what is the
updated probability that the team wins the whole series?
• Provided that the team actually loses the first two games, what
is the updated probability that the team wins the whole series?
• Provided that the team loses the first game but wins the second
game, what is the updated probability that the team wins the
series?

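
One way to check the tree numerically is to enumerate every
completion of the five games; the helper series_win_prob and the
enumeration approach below are my own, not from the slides:

from itertools import product

P_WIN = 0.55   # assumed per-game win probability

def series_win_prob(given=()):
    """P(team wins best-of-five | results so far in `given`),
    found by enumerating every completion of the 5-game tree."""
    total = 0.0
    for rest in product((1, 0), repeat=5 - len(given)):
        games = tuple(given) + rest
        p = 1.0
        for g in rest:                 # probability of this completion
            p *= P_WIN if g else 1 - P_WIN
        if sum(games) >= 3:            # 3 wins take the series
            total += p
    return total

print(round(series_win_prob(), 4))        # unconditional: ≈ 0.5931
print(round(series_win_prob((1, 1)), 4))  # after winning the first two
print(round(series_win_prob((0, 0)), 4))  # after losing the first two
print(round(series_win_prob((0, 1)), 4))  # lose game 1, win game 2

Playing out all five games hypothetically, even when the series is
decided early, does not change these probabilities.
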
Bayes Theorem

Bayes' theorem describes the probability of an event based on prior
knowledge of conditions that might be related to the event.
Let A1, ..., An be n mutually exclusive (disjoint) events whose
probabilities sum to 1, and let B be an event. Then:

P(Aj|B) = P(Aj)P(B|Aj) / [P(A1)P(B|A1) + ... + P(An)P(B|An)]

In particular, if we have two mutually exclusive events A1 and A2:

P(Aj|B) = P(Aj)P(B|Aj) / [P(A1)P(B|A1) + P(A2)P(B|A2)]

Bayes Theorem

Suppose that 5% of the population has a given disease. Let A1 be
the event "has the disease" and A2 the event "does not have the
disease". We know that, selecting a person at random, we have
P(A1) = 0.05 and therefore P(A2) = 0.95.
P(A1) is called the prior probability because it is assigned before
any empirical data are obtained.
The diagnostic technique for the disease is not very accurate. Let
B be the event "test shows the disease is present".
Historical evidence shows that if a person actually has the disease,
the probability that the test indicates the presence of the disease is
P(B|A1) = 0.90. The probability for the test to be positive when
the person does not have the disease is P(B|A2) = 0.15.

Bayes Theorem

What is the probability that a randomly selected person whose test
is positive actually has the disease? In other words, what is
P(A1|B)?
P(A1|B) is called the posterior probability, that is, the revised
probability based on additional information.

P(A1|B) = P(A1)P(B|A1) / [P(A1)P(B|A1) + P(A2)P(B|A2)]
        = (0.05 × 0.90) / (0.05 × 0.90 + 0.95 × 0.15) = 0.24

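
A short check of the disease-testing numbers (variable names mine):

p_disease = 0.05            # prior P(A1)
p_no_disease = 0.95         # P(A2)
p_pos_given_disease = 0.90  # P(B|A1), test sensitivity
p_pos_given_healthy = 0.15  # P(B|A2), false-positive rate

p_pos = (p_disease * p_pos_given_disease
         + p_no_disease * p_pos_given_healthy)        # P(B) = 0.1875
posterior = p_disease * p_pos_given_disease / p_pos   # P(A1|B)
print(round(posterior, 2))                            # 0.24
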
Bayes Theorem

Conclusion: A randomly selected person has a probability of 5% of
having the disease. A randomly selected person who tested positive
has a probability of 24% of having the disease.

The Monty Hall Problem

(Clip: the blackjack scene from the movie "21".)
We are going to solve the Monty Hall problem using Bayes' theorem.
Doors A, B and C are now our events, as we have seen before.
Conditions:
• Monty needs to open a door.
• He can't open the door with the car behind it, or my door.
Let's assume we pick door A; then Monty opens door B.
Monty wouldn't open C if the car was behind C, so we only need to
calculate 2 posteriors:
• P(door = A | opens = B), the probability A is correct if Monty
opened B.
• P(door = C | opens = B), the probability C is correct if Monty
opened B.

The Monty Hall Problem: Prior information

The probability of any door being correct before we pick a door is
1/3: prizes are randomly arranged behind the doors and we have no
other information. So the prior probability of any door being
correct is 1/3.
• P(door = A), the prior probability that door A contains the
car, is 1/3.
• P(door = C), the prior probability that door C contains the
car, is 1/3.

The Monty Hall Problem: Likelihoods

• If the car is actually behind door A, then Monty can open door
B or C, so the probability of opening either is 50%.
• If the car is actually behind door C, then Monty can only open
door B. He cannot open A, the door we picked, and he also
cannot open door C because it has the car behind it.
The likelihood that Monty opened door B if door A is correct:

P(opens = B | door = A) = 1/2

The likelihood that Monty opened door B if door C is correct:

P(opens = B | door = C) = 1

The Monty Hall Problem: Total probability

P(door = A) P(opens = B | door = A) = 1/3 × 1/2 = 1/6

P(door = C) P(opens = B | door = C) = 1/3 × 1 = 1/3

P(opens = B) = 1/6 + 1/3 = 3/6 = 1/2

The Monty Hall Problem: Posterior

P(door = A | opens = B) = (1/6)/(1/2) = 1/3

P(door = C | opens = B) = (1/3)/(1/2) = 2/3

This leaves us with a higher probability of winning if we change
doors after Monty opens a door.

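
A simulation can corroborate the posterior. In this sketch (the
setup and names are mine) we always pick door A and let Monty
choose at random when he has a choice:

import random

random.seed(0)
trials = 200_000
opens_b = car_c_and_opens_b = 0
for _ in range(trials):
    car = random.choice("ABC")
    if car == "A":                      # Monty may open B or C
        opened = random.choice("BC")
    else:                               # Monty must open the goat door
        opened = "C" if car == "B" else "B"
    if opened == "B":
        opens_b += 1
        car_c_and_opens_b += (car == "C")
print(car_c_and_opens_b / opens_b)      # P(door = C | opens = B) ≈ 2/3
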
