
Probability

Introduction
Although descriptive statistics consist of a set of
useful graphical and numerical methods, we are
particularly interested in developing statistical
inference about a population from a sample.
Our primary objective in this and the following
two chapters is to develop the probability-
based tools that are at the basis of statistical
inference.
However, probability can also play a critical role
in decision making, a subject we explore in
Chapter 22.
6-1 Random Experiment
A random experiment is an action or process
that leads to one of several possible outcomes.
The list of possible outcomes of an experiment
must be exhaustive and mutually exclusive.
Examples:
Illustration 1. Experiment: Flip a coin.
Outcomes: Heads and tails
Illustration 2. Experiment: Record marks on a
statistics test (out of 100).
Outcomes: Numbers between 0 and
100
Illustration 3. Experiment: Record grade on a
statistics test.
Outcomes: A, B, C, D, and F
6-1 Sample Space and Probability
Requirements
A sample space of a random experiment is a
list of all possible outcomes of the experiment.
The outcomes must be exhaustive and
mutually exclusive.
Using set notation, we represent the sample
space and its k outcomes as:

S = {O1, O2, O3, …, Ok}
When we begin the task of assigning
probabilities to the outcomes, we must obey
two probability requirements:
1. The probability of any outcome must lie
between 0 and 1. That is,

0 ≤ P(Oi) ≤ 1 for each i


2. The sum of the probabilities of all the
outcomes in a sample space must be 1.
That is,
P(O1) + P(O2) + ⋯ + P(Ok) = 1
6-1a Three Approaches to Assigning
Probabilities
The classical approach is used by
mathematicians to determine probability based
on games of chance, and it consists of equally
likely outcomes.
The relative frequency approach is applied to
experimentation or historical data, and it defines
probability as the long-run relative frequency
with which an outcome occurs.
The subjective approach assigns probabilities
based on a subjective judgment, and it is used
when the classical approach is not reasonable
and there are no available experimentations or
historical data to apply the relative frequency
approach.
6-1b Defining Events
An individual outcome of a sample space is
called a simple event.
An event is a collection or set of one or more
simple events in a sample space.
Example:
From illustration 2 (6-1 Random Experiment
slide), we can define the event to achieve a
grade of A as the set of numbers that lie
between 80 and 100, inclusive.
Using set notation, we have:

𝐴 = {80, 81, 82, … , 100}


6-1c Probability of an Event
The probability of an event is the sum of the
probabilities of the simple events that constitute
the event.
Example:
Suppose that in illustration 3 (6-1 Random
Experiment slide), we employed the relative
frequency approach to assign probabilities to
the simple events as follows:
P(A) = .20, P(B) = .30, P(C) = .25, P(D) = .15, P(F) = .10
The probability of the event, pass the course, is:

P(Pass the course) = P(A) + P(B) + P(C) + P(D)
= 0.2 + 0.3 + 0.25 + 0.15 = 0.9
6-1d Interpreting Probability
No matter what method was used to assign
probability, we always interpret it using the
relative frequency approach for an infinite
number of experiments
Examples:
Subjective approach: an investor assumed that
there is a 65% probability that a particular stock’s
price will increase over the next month.
Interpretation: if we had an infinite number of
stocks with the exact same economic and
market characteristics as the one the investor
will buy, 65% of them will increase in price over
the next month.
Classical approach: the probability that a
balanced die lands on the number 5 is 1/6.
Interpretation: the proportion of times that a
5 is observed on a balanced die thrown an
infinite number of times.
6-2a Intersection
The intersection of events A and B is the event
that occurs when both A and B occur.
It is denoted as:
A and B
The probability of the intersection is called the
joint probability.
Example 6.1 – Determinants of Success
among Mutual Fund Managers – Part 1*
Suppose that a potential investor examined the
relationship between how well the mutual fund
performs and which university awarded the
manager’s MBA.
After the analysis, the following table of joint
probabilities was developed:

MBA PROGRAM   MUTUAL FUND OUTPERFORMS MARKET   MUTUAL FUND DOES NOT OUTPERFORM MARKET
Top 20        .11                              .29
Non top 20    .06                              .54

Analyze these probabilities and interpret the results.
*See notes for the source of this example.
Solution:
Let us represent the events as follows:
• A1 = Fund manager graduated from a top-
20 MBA program
• A2 = Fund manager did not graduate from a
top-20 MBA program
• B1 = Fund outperforms the market
• B2 = Fund does not outperform the market
Thus, the joint probabilities are:

• P(A1 and B1) = .11
• P(A2 and B1) = .06
• P(A1 and B2) = .29
• P(A2 and B2) = .54
6-2b Marginal Probabilities
Marginal probabilities are calculated in the
margins of the table, and they are computed by
adding across rows or down columns.
In Example 6.1, adding across the two rows and
column produces the marginal probabilities:
First row: P(A1) = P(A1 and B1) + P(A1 and B2) = .11 + .29 = .40
Second row: P(A2) = P(A2 and B1) + P(A2 and B2) = .06 + .54 = .60
First column: P(B1) = P(A1 and B1) + P(A2 and B1) = .11 + .06 = .17
Second column: P(B2) = P(A1 and B2) + P(A2 and B2) = .29 + .54 = .83
We can visualize marginal and joint probabilities
as follows:

MBA PROGRAM   MUTUAL FUND OUTPERFORMS MARKET   MUTUAL FUND DOES NOT OUTPERFORM MARKET   TOTALS
Top 20        P(A1 and B1) = .11               P(A1 and B2) = .29                       P(A1) = .40
Non top 20    P(A2 and B1) = .06               P(A2 and B2) = .54                       P(A2) = .60
Totals        P(B1) = .17                      P(B2) = .83                              1.00
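The row and column sums are easy to check with a few lines of code. A minimal Python sketch (not part of the original slides; the dictionary keys follow the event labels defined above):

# Joint probabilities from Example 6.1, keyed by (MBA program, fund performance)
joint = {("A1", "B1"): .11, ("A1", "B2"): .29,
         ("A2", "B1"): .06, ("A2", "B2"): .54}
# Add across rows for P(A1), P(A2); add down columns for P(B1), P(B2)
p_A1 = joint[("A1", "B1")] + joint[("A1", "B2")]   # .40
p_A2 = joint[("A2", "B1")] + joint[("A2", "B2")]   # .60
p_B1 = joint[("A1", "B1")] + joint[("A2", "B1")]   # .17
p_B2 = joint[("A1", "B2")] + joint[("A2", "B2")]   # .83
print(round(p_A1, 2), round(p_A2, 2), round(p_B1, 2), round(p_B2, 2))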

6-2c Conditional Probability


The conditional probability is the probability of
one event given the occurrence of another
related event.
The probability of event A given event B is:

P(A|B) = P(A and B) / P(B)

The probability of event B given event A is:

P(B|A) = P(A and B) / P(A)

In Example 6.1, the conditional probability that a
fund managed by a graduate of a top-20 MBA
program (event A1) will outperform the market
(event B1) is:

P(B1|A1) = P(A1 and B1) / P(A1) = .11 / .40 = .275

Example 6.1 – Determinants of Success


among Mutual Fund Managers – Part 2
Suppose that in Example 6.1 we select one
mutual fund at random and discover that it did
not outperform the market.
What is the probability that a graduate of a top-
20 MBA program manages it?
Solution:
We can reformulate the conditional probability
we need to calculate as follows:
The probability that a randomly selected mutual
fund is managed by a graduate of a top-20
MBA program (event A1), given the fact that the
fund did not outperform the market (event B2).

P(A1|B2) = P(A1 and B2) / P(B2) = .29 / .83 = .349

6-2d Independence
One of the objectives of calculating conditional
probability is to determine whether two events
are related.
In particular, we would like to know whether
they are independent events.
Two events A and B are said to be independent
if:

𝑃 (𝐴|𝐵) = 𝑃(𝐴)
or:

𝑃(𝐵|𝐴) = 𝑃(𝐵)
In other words, whether one event happens has
no influence on the likelihood that the other
event happens.
Example 6.1 – Determinants of Success
among Mutual Fund Managers – Part 3
Determine whether the event that the manager
graduated from a top-20 MBA program and the
event the fund outperforms the market are
independent events.
Solution:
A1 = Fund manager graduated from a top-20
MBA program
B1 = Fund outperforms the market
To determine whether A1 and B1 are
independent events, we need to calculate the
probability of A1 given B1:

P(A1|B1) = P(A1 and B1) / P(B1) = .11 / .17 = .647
The marginal probability that a manager
graduated from a top-20 MBA program is:

𝑃(𝐴! ) = .40
Since the two probabilities are not equal, we
conclude that the two events are dependent.
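The comparison can be reproduced in a few lines. A small Python sketch, assuming the joint table from Part 1 (the variable names are mine):

joint = {("A1", "B1"): .11, ("A1", "B2"): .29,
         ("A2", "B1"): .06, ("A2", "B2"): .54}
p_B1 = joint[("A1", "B1")] + joint[("A2", "B1")]   # .17
p_A1 = joint[("A1", "B1")] + joint[("A1", "B2")]   # .40
p_A1_given_B1 = joint[("A1", "B1")] / p_B1         # ≈ .647
# Independence would require P(A1|B1) == P(A1); here the two values differ
print(round(p_A1_given_B1, 3), round(p_A1, 2))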
6-2e Complement Rule and Union of Two
Events
The complement of event A is the event that
occurs when event A does not occur. The
complement of event A is denoted by A^C.
The complement rule defined here derives
from the fact that the probability of an event
and the probability of the event’s complement
must sum to 1.

P(A^C) = 1 − P(A)
for any event A.
The union of events A and B is the event that
occurs when either A or B or both occur.
It is denoted as:
𝐴 or 𝐵
Example 6.1 – Determinants of Success
among Mutual Fund Managers – Part 4
Determine the probability that a randomly
selected fund outperforms the market, or the
manager graduated from a top-20 MBA
program.
Solution:
We want to compute the probability of the
union of two events:
P(A1 or B1)
The union A1 or B1 occurs when any of the
following joint events occurs:
• A1 and B1
• A2 and B1
• A1 and B2
• We can visualize the union A1 or B1 by
highlighting the corresponding three joint
probabilities in the table:

MBA PROGRAM   MUTUAL FUND OUTPERFORMS MARKET   MUTUAL FUND DOES NOT OUTPERFORM MARKET
Top 20        P(A1 and B1) = .11               P(A1 and B2) = .29
Non top 20    P(A2 and B1) = .06               P(A2 and B2) = .54

• P(A1 or B1) = .11 + .06 + .29 = .46


• Notice that the only joint probability not
representing an event that is part of the union
is the joint probability of the events A2 and B2,
which is the probability that the union does not
occur. Thus, the probability of the union can
also be written as:

• P(A1 or B1) = 1 − P(A2 and B2) = 1 − .54 = .46
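Both routes to the union probability are easy to verify numerically; a brief Python sketch under the same joint table (not part of the original slides):

joint = {("A1", "B1"): .11, ("A1", "B2"): .29,
         ("A2", "B1"): .06, ("A2", "B2"): .54}
# Sum of the three joint probabilities that make up the union
p_union = joint[("A1", "B1")] + joint[("A2", "B1")] + joint[("A1", "B2")]
# Complement shortcut: 1 minus the only joint probability outside the union
p_union_alt = 1 - joint[("A2", "B2")]
print(round(p_union, 2), round(p_union_alt, 2))   # .46 and .46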
6-3b Multiplication Rule
The multiplication rule is used to calculate
the joint probability of two events. It is based
on the formula for conditional probability:

P(B|A) = P(A and B) / P(A)

We derive the multiplication rule by solving for P(A and B):

P(A and B) = P(A)P(B|A)


Recall that, if A and B are independent
events, P(B|A) = P(B).
It follows that the joint probability of two
independent events is simply the product of
the probabilities of the two events:

𝑃 (𝐴 and 𝐵) = 𝑃(𝐴)𝑃(𝐵)
Examples – Selecting Two Students
Example 6.5 – Without Replacement
A graduate statistics course has seven male
and three female students. The professor
wants to select two students at random.
What is the probability that the two students
chosen are female?
Solution:
Let us define the following events:
A. the first chosen student is female.
B. the second chosen student is also female.
We need: P(A and B) = P(A)P(B|A)
With: P(A) = 3/10 and P(B|A) = 2/9
Thus: P(A and B) = (3/10)(2/9) = 6/90 = .067

Example 6.6 – With Replacement


The professor needs to select one student at
random as a substitute for the next two
classes.
What is the probability that the students
selected for the two classes are both females?
Solution:
This time the same student can be selected
for both classes.
Therefore, A and B are now independent
events.
We need: P(A and B) = P(A)P(B)
With: P(A) = 3/10 and P(B) = 3/10
Thus: P(A and B) = (3/10)(3/10) = 9/100 = .09
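A short sketch with exact fractions makes the two cases easy to compare (Python's fractions module; not part of the original slides):

from fractions import Fraction

# Example 6.5, without replacement: 3 females out of 10, then 2 out of the remaining 9
p_without = Fraction(3, 10) * Fraction(2, 9)    # 6/90, reduced to 1/15
# Example 6.6, with replacement: the two selections are independent
p_with = Fraction(3, 10) * Fraction(3, 10)      # 9/100
print(p_without, round(float(p_without), 3))    # 1/15 ≈ 0.067
print(p_with, float(p_with))                    # 9/100 = 0.09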

6-3c Addition Rule


The addition rule calculates the probability of
the union of two events. That is, the
probability that event A, or event B, or both
occur is:

𝑃(𝐴 or 𝐵) = 𝑃(𝐴) + 𝑃 (𝐵) − 𝑃(𝐴 and 𝐵)


This is because:

P(A1) = P(A1 and B1) + P(A1 and B2)

P(B1) = P(A1 and B1) + P(A2 and B1)

Plugging P(A1) and P(B1) into the addition rule:

P(A1 or B1) = P(A1 and B1) + P(A1 and B2) + P(A1 and B1) + P(A2 and B1) − P(A1 and B1)

If we now simplify P(A1 or B1), we obtain the
sum of the joint probabilities as seen in Part 4
of Example 6.1:

P(A1 or B1) = P(A1 and B2) + P(A1 and B1) + P(A2 and B1)
Example 6.7 – Applying the Addition Rule
In a large city, two newspapers are published,
the Sun and the Post.
The circulation departments report that 22% of
the city’s households have a subscription to
the Sun and 35% subscribe to the Post. A
survey reveals that 6% of all households
subscribe to both newspapers.
What proportion of the city’s households
subscribe to at least one newspaper?
Solution: We can reformulate the question
as: “what is the probability of selecting a
household at random that subscribes to the
Sun or the Post or both?”

P(Sun or Post) = P(Sun) + P(Post) − P(Sun and Post)
P(Sun or Post) = .22 + .35 − .06 = .51
Interpret: There is a 51% probability that a
randomly selected household subscribes to
one or the other or both papers.
6-3d Probability Trees
In a probability tree, the events in an
experiment are represented by branches,
which are lines linked to each other. Then, we
calculate the joint probabilities by multiplying
the probabilities on the linked branches.
Parallel branches from the same node are
mutually exclusive and can be added
together.
Consider again the example of the probability
of selecting two female students at random
from a graduate statistics course that has
seven male and three female students, either
without or with replacement.
*See notes for additional considerations.
Probability Tree for Example 6.5

Probability Tree for Example 6.6

Example 6.8 - Probability of Passing the Bar


Exam
Suppose that the pass rate for law school
graduates taking the bar exam for the first
time is 72%.
Candidates who fail the first exam may take it
again later. Of those who fail their first test,
88% pass on their second attempt.
Find the probability that a randomly
selected law school graduate becomes a
lawyer.
Assume that candidates cannot take the exam
more than twice.
Solution:
The included probability tree describes the
experiment. Note that we use the
complement rule to determine the probability
of failing each exam, i.e., P(Fail) and
P(Fail|Fail).
P(Becoming a lawyer) = P(Pass) + P(Fail)P(Pass|Fail)
= .72 + (.28)(.88) = .9664
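The tree collapses to two mutually exclusive branches, which can be added directly; a minimal Python sketch of the same arithmetic (not from the original slides):

p_pass_first = .72                      # pass on the first attempt
p_fail_first = 1 - p_pass_first         # complement rule: .28
p_pass_second_given_fail = .88          # pass on the second attempt, given a first failure

# "Becomes a lawyer" = pass the first time, or fail first and pass second
p_lawyer = p_pass_first + p_fail_first * p_pass_second_given_fail
print(round(p_lawyer, 4))               # 0.9664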
6-4 Bayes’s Law
Conditional probability measures the
likelihood that event B occurs given that a
possible cause of the event A has occurred.
Bayes’s Law is used when we witness event B,
and we need to compute the probability of
one of its possible causes (event A).

𝑃(𝐵 |𝐴) ⇒ 𝑃(𝐴|𝐵)


Terminology:
The probabilities P(A) and P(A^C) are called
prior probabilities because they are
determined prior to determining whether
event B has taken place.
The conditional probability P(B|A) is called a
likelihood probability.
The conditional probability P(A|B) is called a
posterior probability (or revised
probability), because the prior probability is
revised after event B has taken place.
Example 6.9 – Building a Probability Tree
Should an MBA Applicant Take a
Preparatory Course?
Suppose that a survey of MBA students
reveals that among GMAT scores of at least
650, 52% took a preparatory course, whereas
among GMAT scores of less than 650, only
23% took a preparatory course.
An applicant to an MBA program knows that
the probability of scoring at least 650 is 10%,
and he is willing to purchase the prep course
only if the probability of achieving at least 650
at least doubles, or 20%.
We can begin by defining the events A and B
as follows:
A = GMAT score is 650 or more
B = Took preparatory course
Then, we can define the provided probabilities as:
P(A) = .10
P(B|A) = .52
P(B|A^C) = .23
Using the complement and multiplication rules,
we can complete the probability tree shown to
the right.

Example 6.9 – Solving a Probability Tree


with Bayes’s Law
Should an MBA Applicant Take a
Preparatory Course?
We need to calculate the probability that an
MBA applicant scores at least 650 (event A),
given that he has taken a preparatory course
(event B).
Using the conditional probability formula, we
can write:

P(A|B) = P(A and B) / P(B)

The probability tree provides: P(A and B) = .052
From the probability tree we can also derive
the marginal probability P(B) as:

P(B) = P(A and B) + P(A^C and B) = .052 + .207 = .259
Thus, the probability that an MBA applicant
scores at least 650, given that he has taken a
preparatory course is:

P(A|B) = P(A and B) / P(B) = .052 / .259 = .201, which is greater than 20%.
6-4a Bayes’s Law Formula
For those who prefer an algebraic approach
rather than a probability tree, Bayes’s Law can
be expressed as a formula:

P(Ai|B) = P(Ai)P(B|Ai) / [P(A1)P(B|A1) + P(A2)P(B|A2) + ⋯ + P(Ak)P(B|Ak)]
where:
B is the given event,
A1, A2,…, Ak are the events with known prior
probabilities P(A1), P(A2),…, P(Ak),
P(B|A1), P(B|A2),…, P(B|Ak) are the likelihood
probabilities,
P(Ai|B), with i = 1, 2,…, k are the posterior
probabilities we seek.
Example 6.9 – Solving with Bayes’s Law
Formula
Should an MBA Applicant Take a
Preparatory Course?
We define the events as follows:
A1 = GMAT score is 650 or more
A2 = GMAT score is less than 650
B = Student took preparatory course
The provided prior probabilities are:
P(A1) = .10 (probability student scored 650 or
more)
P(A2) = 1 − .10 = .90 (complement probability
– probability student scored less than 650)
The provided conditional (likelihood)
probabilities are:
P(B|A1) = .52 (probability student took prep
course among those scoring 650 or more)
P(B|A2) = .23 (probability student took prep
course among those scoring less than 650)
The Bayes’s Law formula yields the probability
a student scores 650 or more after taking the
prep course:
P(A1|B) = P(A1)P(B|A1) / [P(A1)P(B|A1) + P(A2)P(B|A2)]
= .10(.52) / [.10(.52) + .90(.23)] = .052 / (.052 + .207) = .201
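The same posterior can be computed directly from the priors and likelihoods; a small Python sketch (not part of the original slides; the variable names are mine):

p_A1, p_A2 = .10, .90            # priors: GMAT >= 650, GMAT < 650
p_B_given_A1 = .52               # likelihood: took the course, given >= 650
p_B_given_A2 = .23               # likelihood: took the course, given < 650

p_B = p_A1 * p_B_given_A1 + p_A2 * p_B_given_A2      # marginal ≈ .259
p_A1_given_B = p_A1 * p_B_given_A1 / p_B             # posterior ≈ .201
print(round(p_B, 3), round(p_A1_given_B, 3))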
6-5 Identifying the Correct Method
The key issue is to determine whether joint
probabilities are provided or are required:
Joint probabilities are provided
1. Compute marginal probabilities by adding
across rows and down columns.
2. Use the joint and marginal probabilities to
compute conditional probabilities.
3. Determine whether the events described by
the table are independent.
4. Apply the addition rule to compute the
probability that either of two events occur.
Joint probabilities are required
1. Apply probability rules or build a probability
tree.
2. Use the multiplication rule to calculate the
probability of intersections.
3. Apply addition and complement rules for
mutually exclusive events.
4. Compute posterior probabilities using
Bayes's Law.
Chapter Summary
• The first step in assigning probability is to
create an exhaustive and mutually exclusive
list of outcomes.
• The second step is to use the classical,
relative frequency, or subjective approach
and assign probability to the outcomes.
• A variety of methods are available to
compute the probability of other events.
These methods include probability rules
and trees.
• An important application of these rules is
Bayes’s Law, which allows us to compute
conditional probabilities from other forms
of probability.
Chapter 7: Random Variables and Discrete
Probability Distributions
7-1 Random Variables and Probability
Distributions
A random variable is a function or rule that
assigns a number to each outcome of an
experiment.
There are two types of random variables:
A discrete random variable takes on a
countable number of values.
• e.g., values on the roll of two dice: 2, 3, 4, ...,
12
A continuous random variable takes on an
uncountable number of values.
• e.g., time: 30.1 minutes? 30.0001 minutes?
30.0000001 minutes?
• We will consider continuous random
variables in Chapter 8.
A probability distribution is a table, formula,
or graph that describes the values of a
random variable and the probability
associated with these values.
• P(X = x) or simply P(x).
7-1a Discrete Probability Distributions
The probabilities of the values of a discrete
random variable may be derived by means of
probability tools such as tree diagrams or by
applying one of the definitions of probability.
These two conditions apply:

1. 0 ≤ P(x) ≤ 1 for all x

2. Σ_all x P(x) = 1
Example 7.1 – Probability Distribution of
Persons per Household
The following table, published in The
Statistical Abstract of the United States,
summarizes the number of persons living in a
household.
Number of Persons   Millions of Households
1                   35.2
2                   43.5
3                   19.5
4                   16.2
5                   7.3
6                   2.8
7 or more           1.6
Total               126.1

Question: Develop the probability distribution


for the number of persons per household.
Solution: The probability of each value of X,
the number of persons per household, is
computed as the relative frequency.

Number of Persons   P(x)
1                   35.2 / 126.1 = .279
2                   43.5 / 126.1 = .345
3                   19.5 / 126.1 = .155
4                   16.2 / 126.1 = .128
5                   7.3 / 126.1 = .058
6                   2.8 / 126.1 = .022
7 or more           1.6 / 126.1 = .013
Total               126.1 / 126.1 = 1.000

The probability of a household with at least 4


persons:

𝑃(𝑋 ≥ 4) = .128 + .058 + .022 + .013 = .221
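The relative-frequency calculation translates directly into code; a minimal Python sketch (not from the original slides; the last category is treated as exactly 7 persons, as in Example 7.3):

# Millions of households from the table above
households = {1: 35.2, 2: 43.5, 3: 19.5, 4: 16.2, 5: 7.3, 6: 2.8, 7: 1.6}
total = sum(households.values())                      # 126.1
p = {x: count / total for x, count in households.items()}
p_at_least_4 = sum(px for x, px in p.items() if x >= 4)
print(round(p_at_least_4, 3))                         # ≈ .221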


Example 7.2 – Probability Distribution of
the Number of Sales
A mutual fund salesperson has arranged to
call on three people. Based on past
experience, the salesperson knows that there
is a 20% chance of closing a sale on each call.
Determine the probability distribution of the
number of sales the salesperson will make.
Let S denote closing a sale: P(S) = .2
Thus, S^C is not closing a sale: P(S^C) = .8
We can use the multiplication rule for
independent events to develop a probability
tree, and use it to calculate the probability
distribution of making a number of X sales:
x   Events                              P(X = x)
0   S^C S^C S^C                         .512
1   S S^C S^C, S^C S S^C, S^C S^C S     .128 + .128 + .128 = .384
2   S S S^C, S S^C S, S^C S S           .032 + .032 + .032 = .096
3   S S S                               .008

Where, for example:
for X = 0, P(S^C S^C S^C) = P(S^C) P(S^C) P(S^C) = (.8)(.8)(.8) = .512
for X = 1, P(S S^C S^C) = P(S) P(S^C) P(S^C) = (.2)(.8)(.8) = .128
for X = 2, P(S S S^C) = P(S) P(S) P(S^C) = (.2)(.2)(.8) = .032
for X = 3, P(S S S) = P(S) P(S) P(S) = (.2)(.2)(.2) = .008
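The same distribution can be built by enumerating every branch of the probability tree; a short Python sketch (not part of the original slides), where S marks a sale and N no sale:

from itertools import product

p_sale = .2
dist = {k: 0.0 for k in range(4)}
# Each branch is a sequence of three independent calls
for branch in product("SN", repeat=3):
    p_branch = 1.0
    for call in branch:
        p_branch *= p_sale if call == "S" else 1 - p_sale
    dist[branch.count("S")] += p_branch
print({x: round(p, 3) for x, p in dist.items()})   # {0: .512, 1: .384, 2: .096, 3: .008}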
7-1c Describing the Population/Probability
Distribution
Because a discrete probability distribution
represents a population, we can describe it by
computing various parameters.
Population mean (expected value)

E(X) = μ = Σ_all x x P(x)

Population variance (*see notes for shortcut


calculations)

V(X) = σ² = Σ_all x (x − μ)² P(x)

Shortcut calculations for population variance:

V(X) = σ² = Σ_all x x² P(x) − μ²

Population standard deviation

σ = √σ²
Example 7.3 – Describing the Population of
the Number of Persons per Household
Find the mean, variance, and standard
deviation for the population of the number of
persons per household in Example 7.1.
For this example, we will assume that the last
category is exactly seven persons.
The mean of X is:

E(X) = μ = Σ_all x x P(x)
= 1P(1) + 2P(2) + ⋯ + 7P(7)
= 1(.279) + 2(.345) + ⋯ + 7(.013)
= 2.46
The variance of X is:

V(X) = σ² = Σ_all x (x − μ)² P(x)
= (1 − 2.46)²(.279) + (2 − 2.46)²(.345) + ⋯ + (7 − 2.46)²(.013) = 1.931
The standard deviation of X is:

σ = √σ² = √1.931 = 1.39
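A few lines of Python reproduce these parameters from the probability distribution (a sketch, not part of the original slides):

from math import sqrt

p = {1: .279, 2: .345, 3: .155, 4: .128, 5: .058, 6: .022, 7: .013}
mu = sum(x * px for x, px in p.items())                    # ≈ 2.46
var = sum((x - mu) ** 2 * px for x, px in p.items())       # ≈ 1.93
print(round(mu, 2), round(var, 2), round(sqrt(var), 2))    # 2.46, 1.93, 1.39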


7-1d Laws of Expected Value and Variance
Laws of Expected Value

1. E(c) = c
2. E(X + c) = E(X) + c
3. E(cX) = cE(X)
where c is a constant.
Laws of Variance

1. V(c) = 0
2. V(X + c) = V(X)
3. V(cX) = c²V(X)
Example 7.4 – Describing the Population of
Monthly Profit
The monthly sales at a computer store have a
mean of $25,000 and a standard deviation of
$4,000. Profit is calculated by multiplying sales
by 30% and subtracting a fixed cost of $6,000.
Question: find the mean and standard
deviation of monthly profit.
We can describe the relationship between
profit and sales as: Profit = .30(Sales) − 6,000
Solution: expected value
The expected or mean profit is:

E(Profit) = E[.30(Sales) − 6,000]

Applying the 2nd law of expected value, we obtain:

E(Profit) = E[.30(Sales)] − 6,000

Applying the 3rd law yields the mean of monthly profit:

E(Profit) = .30E(Sales) − 6,000 = .30(25,000) − 6,000 = $1,500
Solution: variance and standard deviation
The variance is:

V(Profit) = V[.30(Sales) − 6,000]

Applying the 2nd law of variance, we obtain:

V(Profit) = V[.30(Sales)]

Applying the 3rd law yields:

V(Profit) = (.30)²V(Sales) = .09(4,000)² = 1,440,000

Thus, the standard deviation of monthly profit is:

σ_Profit = √1,440,000 = $1,200
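The two laws can be applied in code just as easily; a minimal Python sketch of the same calculation (not from the original slides):

mean_sales, sd_sales = 25_000, 4_000

# Profit = .30 * Sales - 6,000
mean_profit = .30 * mean_sales - 6_000        # laws of expected value: $1,500
var_profit = (.30 ** 2) * sd_sales ** 2       # laws of variance: 1,440,000
sd_profit = var_profit ** 0.5                 # $1,200
print(mean_profit, sd_profit)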
7-2 Bivariate Distributions
A bivariate distribution provides joint
probabilities of the combination of two
variables. We refer to the probability
distribution of one variable we saw previously
as a univariate distribution.
A joint probability distribution of X and Y is a
table or formula that lists the joint
probabilities for all pairs of values x and y, and
is denoted P(x,y).
These two conditions apply to a discrete
bivariate distribution:

1. 0 ≤ P(x, y) ≤ 1 for all pairs of values (x, y)

2. Σ_all x Σ_all y P(x, y) = 1
Example 7.5 – Bivariate Distribution of the
Number of House Sales
Xavier and Yvette are real estate agents. Let X
denote the number of houses that Xavier will
sell in a month and Y denote the number of
houses Yvette will sell in a month.
An analysis of their past monthly
performances has the following joint
probabilities:
Bivariate probability distribution

         X = 0   X = 1   X = 2
Y = 0     .12     .42     .06
Y = 1     .21     .06     .03
Y = 2     .07     .02     .01

For example, the probability that Xavier sells 0


houses and Yvette sells 1 house in the month
is P(0,1) = .21.
As we did in Chapter 6, we can calculate the
marginal probabilities by summing across
rows or down columns.
The marginal probability distributions of X
and Y are as follows:
x    P(X = x)
0    .12 + .21 + .07 = .4
1    .42 + .06 + .02 = .5
2    .06 + .03 + .01 = .1

y    P(Y = y)
0    .12 + .42 + .06 = .6
1    .21 + .06 + .03 = .3
2    .07 + .02 + .01 = .1

Example 7.5 – Describing the Bivariate


Distribution of the Number of House Sales
As we did with the univariate distribution, we
can describe the bivariate distribution in
Example 7.5 by computing the mean,
variance, and standard deviation of each
variable utilizing the respective marginal
probabilities.
Marginal Distribution of X
E(X) = μX = Σ_all x x P(x) = 0(.4) + 1(.5) + 2(.1) = 0.7
V(X) = σX² = Σ_all x (x − μX)² P(x) = (0 − .7)²(.4) + (1 − .7)²(.5) + (2 − .7)²(.1) = 0.41
σX = √0.41 = 0.64

Marginal Distribution of Y
E(Y) = μY = Σ_all y y P(y) = 0(.6) + 1(.3) + 2(.1) = 0.5
V(Y) = σY² = Σ_all y (y − μY)² P(y) = (0 − .5)²(.6) + (1 − .5)²(.3) + (2 − .5)²(.1) = 0.45
σY = √0.45 = 0.67

7-2b Covariance and Coefficient of


Correlation
The covariance and the coefficient of
correlation, previously introduced in Chapter
4, describe the relationship between the two
discrete variables of the bivariate distribution.
Covariance (*see notes for shortcut
calculation):
COV(X, Y) = σXY = Σ_all x Σ_all y (x − μX)(y − μY) P(x, y)

Shortcut calculation for population covariance:

COV(X, Y) = σXY = Σ_all x Σ_all y x y P(x, y) − μX μY

Coefficient of correlation:

ρ = σXY / (σX σY)

Example 7.6 – Covariance and Coefficient


of Correlation for the Number of House
Sales
Calculation of the covariance using the
shortcut method:
COV(X, Y) = σXY = Σ_all x Σ_all y x y P(x, y) − μX μY
= (0)(0)(.12) + (1)(0)(.42) + (2)(0)(.06) + (0)(1)(.21) + (1)(1)(.06) + (2)(1)(.03)
+ (0)(2)(.07) + (1)(2)(.02) + (2)(2)(.01) − (0.7)(0.5) = −0.15

Given the previously calculated standard deviations for X and Y:

ρ = σXY / (σX σY) = −0.15 / [(0.64)(0.67)] = −.35

There is a weak negative relationship between


the two variables: the number of houses
Xavier will sell in a month (X) and the number
of houses Yvette will sell in a month (Y).
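The shortcut calculation is easy to script; a small Python sketch over the joint table of Example 7.5 (not part of the original slides):

from math import sqrt

# P(x, y): x = Xavier's monthly sales, y = Yvette's monthly sales
joint = {(0, 0): .12, (1, 0): .42, (2, 0): .06,
         (0, 1): .21, (1, 1): .06, (2, 1): .03,
         (0, 2): .07, (1, 2): .02, (2, 2): .01}

mu_x = sum(x * p for (x, y), p in joint.items())                     # 0.7
mu_y = sum(y * p for (x, y), p in joint.items())                     # 0.5
cov = sum(x * y * p for (x, y), p in joint.items()) - mu_x * mu_y    # -0.15
sd_x = sqrt(sum((x - mu_x) ** 2 * p for (x, y), p in joint.items())) # ≈ 0.64
sd_y = sqrt(sum((y - mu_y) ** 2 * p for (x, y), p in joint.items())) # ≈ 0.67
print(round(cov, 2), round(cov / (sd_x * sd_y), 2))                  # -0.15, -0.35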
7-2c Sum of Two Variables
Of particular interest to us is the sum of two
variables. The analysis of this type of
distribution leads to an important statistical
application in finance.
To develop the probability distribution of the
sum of two variables, add together the joint
probabilities corresponding to each total
value of X + Y.
In Example 7.5, the possible values of X + Y
are:
0, 1, 2, 3, and 4 houses.
Example: if we want to calculate the
probability associated to the total sale of 2
houses, we write:

P(X + Y = 2) = P(0,2) + P(1,1) + P(2,0) = .07 + .06 + .06 = .19
We calculate the probabilities of the other
values of X + Y, similarly, producing the
following table:

x + y   Events                       P(x + y)
0       P(0,0)                       .12
1       P(0,1) + P(1,0)              .63
2       P(0,2) + P(1,1) + P(2,0)     .19
3       P(1,2) + P(2,1)              .05
4       P(2,2)                       .01

We can now compute the parameters of X +


Y as we learned to do for univariate
distributions:

E(X + Y) = 0(.12) + 1(.63) + 2(.19) + 3(.05) + 4(.01) = 1.2

V(X + Y) = (0 − 1.2)²(.12) + (1 − 1.2)²(.63) + (2 − 1.2)²(.19) + (3 − 1.2)²(.05) + (4 − 1.2)²(.01) = 0.56
7-2c Laws of Expected Value and Variance
of the Sum of Two Variables

1. E(X + Y) = E(X) + E(Y)
2. V(X + Y) = V(X) + V(Y) + 2COV(X, Y)
If X and Y are independent, COV(X, Y) = 0, thus:

V(X + Y) = V(X) + V(Y)

Applying the formulas to Example 7.5, we
obtain the same values shown for the
distribution of the sum of two variables:

E(X + Y) = E(X) + E(Y) = 0.7 + 0.5 = 1.2
V(X + Y) = V(X) + V(Y) + 2COV(X, Y) = .41 + .45 + 2(−0.15) = 0.56
σ(X+Y) = √V(X + Y) = √0.56 = .75


7-3 Mean and Variance of a Portfolio of
Two Stocks
Financial analysts lower the risk that is
associated with the stock market through
diversification. The mathematical strategy of
diversification developed by Harry Markowitz
in 1952 paved the way for the development
of modern portfolio theory (MPT).
Mean and variance for a portfolio consisting
of two stocks are determined as:

E(Rp) = w1E(R1) + w2E(R2)

V(Rp) = w1²σ1² + w2²σ2² + 2w1w2ρσ1σ2


Where w1 and w2 are the weights of
investments 1 and 2, E(R1) and E(R2) their
expected values, σ1 and σ2 their standard
deviations, and ρ the coefficient of
correlation.
Example 7.8 – Describing the Population of
the Returns on a Portfolio
An investor has decided to form a portfolio by
putting 25% into McDonald’s stock and 75%
into Cisco Systems stock.
The investor assumes that the expected
returns will be 8% and 15%, and the standard
deviations will be 12% and 22%, respectively.
a. Find the portfolio expected return.
b. Compute the standard deviation of the
returns on the portfolio assuming that:
i. the two stocks’ returns are perfectly
positively correlated.
ii. the coefficient of correlation is .5.
iii. the two stocks’ returns are
uncorrelated.
a. Expected return on the portfolio:

E(Rp) = w1E(R1) + w2E(R2) = .25(.08) + .75(.15) = .1325
b. Standard deviation of the portfolio return:

V(Rp) = w1²σ1² + w2²σ2² + 2w1w2ρσ1σ2
= (.25)²(.12)² + (.75)²(.22)² + 2ρ(.25)(.75)(.12)(.22)
= .0281 + .0099ρ

i. When ρ = 1: V(Rp) = .0281 + .0099 = .0380
σp = √V(Rp) = √.0380 = .1949

ii. When ρ = .5: V(Rp) = .0281 + (.0099)(.5) = .0331
σp = √V(Rp) = √.0331 = .1819

iii. When ρ = 0: V(Rp) = .0281 + .0099(0) = .0281
σp = √V(Rp) = √.0281 = .167
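The portfolio formulas can be evaluated for all three correlation assumptions at once; a minimal Python sketch (not from the original slides):

from math import sqrt

w1, w2 = .25, .75
mean1, mean2 = .08, .15      # expected returns
sd1, sd2 = .12, .22          # standard deviations

mean_p = w1 * mean1 + w2 * mean2                # .1325
for rho in (1.0, 0.5, 0.0):
    var_p = w1**2 * sd1**2 + w2**2 * sd2**2 + 2 * w1 * w2 * rho * sd1 * sd2
    # sigma ≈ .195, .182, .168 for rho = 1, .5, 0; small differences from the
    # slide values come from intermediate rounding
    print(rho, round(mean_p, 4), round(sqrt(var_p), 4))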


7-4 Binomial Distribution
The binomial distribution is the result of a
binomial experiment, which has the following
properties:
Binomial Experiment
1. The binomial experiment consists of a fixed
number of n trials.
2. Each trial has two possible outcomes: a
success and a failure.
3. The probability of success is p. The
probability of failure is 1 – p.
4. The trials are independent.
A Bernoulli process is a trial that satisfies
properties 2-4.
The binomial random variable is the random
variable of a binomial experiment. It is defined
as the number of successes in the
experiment’s n trials.
7-4 Examples of Binomial Experiments
1. Flip a coin 10 times.
• n = 10
• Two outcomes: heads (success) and
tails (failure).
• If coin is fair: P(heads) = P(tails) = .5
• Each coin toss is independent.
2. Draw five cards out of a shuffled deck.
• n=5
• We label success any rank and/or suit
of cards we seek, such as a five or clubs.
• If success is a club, then P(club) = 13 /
52; if it is a rank of five, then P(five) =
1/13.
• Each draw is independent only if we
replace the drawn card in the deck and
reshuffle each time.
3. A political survey of 1,500 voters about an
upcoming election.
• n = 1,500
• If there are more than two candidates,
the candidate of choice is a success, the
other a failure.
• The actual value of p is unknown. The
task of the statistics practitioner is to
estimate its value.
• The trials are independent because the
choice of one voter does not affect the
choice of others.
7-4a Binomial Random Variable
The binomial random variable is the number
of successes X in the experiment’s n trials. It
can take on values 0, 1, 2, …, n. Thus, the
random variable X is discrete.
We can draw the n trials of a binomial
experiment with a probability tree. The stages
represent the outcomes for each of the n
trials. At each stage, there are two branches
representing success (S) and failure (F).
To calculate the probability that there are X
successes in n trials, we multiply together the
probability of a success, p, or a failure, 1 – p,
for each stage of the sequence. And if there
are X successes, there must be n – X failures.

7-4a Binomial Probability Distribution


We can write the probability for each
sequence of branches that represents x
successes with probability p, and n – x failures
with probability 1 – p as:

p^x (1 − p)^(n−x)
The combinatorial formula (*see notes for an
explanation of the factorial) yields the count
of branch sequences that produces x
successes and n – x failures:
C(n, x) = n! / [x!(n − x)!]

where n! = n(n − 1)(n − 2)⋯(2)(1).
For example, 3! = 3 x 2 x 1 = 6.
Incidentally, although it may not appear to be
logical, 0! = 1.
Combining the two components yields the
binomial probability distribution:

P(x) = [n! / (x!(n − x)!)] p^x (1 − p)^(n−x)   for x = 0, 1, 2, …, n
Example 7.9 – Pat Statsdud and the
Statistics Quiz
Pat Statsdud is a (not good) student taking a
statistics course. Pat’s exam strategy is to rely
on luck for the next quiz.
The quiz consists of 10 multiple-choice
questions. Each question has five possible
answers, only one of which is correct. Pat
plans to guess the answer to each question.
a. What is the probability that Pat gets no
answers correct?
b. What is the probability that Pat gets two
answers correct
Solution:
This is a binomial experiment because:
1. n = 10
2. Two outcomes: correct and incorrect
answer.
3. Probability of correct answer: p = 1/5 = .2.
4. Answers to questions are independent.
We can apply the binomial probability
distribution to answer both questions:
a. x = 0: P(0) = [10! / (0!(10 − 0)!)] (.2)^0 (.8)^(10−0) = 1(1)(.8)^10 = .1074

b. x = 2: P(2) = [10! / (2!(10 − 2)!)] (.2)^2 (.8)^(10−2) = (9 ∙ 10 / 2)(.2)²(.8)⁸ = 45(.04)(.1678) = .3020
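The same two probabilities follow from the binomial formula in a few lines; a Python sketch using math.comb (not part of the original slides; binom_pmf is a helper name of mine):

from math import comb

n, p = 10, 0.2

def binom_pmf(x):
    # P(x) = C(n, x) * p^x * (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(round(binom_pmf(0), 4))   # ≈ .1074
print(round(binom_pmf(2), 4))   # ≈ .3020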


7-4b Cumulative Probability
The probability that a random variable is less
than or equal to a value x is called a
cumulative probability, and it is represented
as P(X ≤ x).
In the case of a discrete probability
distribution, such as the binomial distribution,
we can write:
P(X ≤ x) = P(X = 0) + P(X = 1) + ⋯ + P(X = x)

Example 7.10 – Will Pat Fail the Quiz?


Find the probability that Pat fails the quiz. For
the purpose of this exercise, a mark is
considered a failure if it is less than 50%.
Solution:
Because there are 10 questions, 50%
corresponds to a mark of 5. Because the
marks must be integers, a mark of 4 or less is
a failure.
𝑃 (𝑋 ≤ 4)
= 𝑃(0) + 𝑃(1) + 𝑃 (2) + 𝑃(3) + 𝑃(4)
= .1074 + .2684 + .3020 + .2013
+ .0881 = .9672
There is a 96.72% probability that Pat will fail
the quiz by guessing the answer for each
question.
7-4c Binomial Table
Table 1 in Appendix B of the textbook
provides cumulative binomial probabilities for
selected values of n and p.
We can use this table to answer the question
in Example 7.10, where we need P(X ≤ 4).
The table to the right shows the values of P(X
≤ 4) for n = 10 and p = .2.
We can use the table and the complement
rule to determine probabilities of the type P(X
= x) and P(X ≥ x).
For example, to find the probability that Pat
answers exactly two questions correctly:
𝑃(2) = 𝑃(𝑋 ≤ 2) − 𝑃(𝑋 ≤ 1) = .6778 − .3758
= .3020
And the probability that Pat passes the quiz is:

𝑃(𝑋 ≥ 5) = 1 − 𝑃 (𝑋 ≤ 4) = 1 − .9672
= .0328
*See notes for instructions to calculate
binomial probabilities with Excel.
Cumulative Binomial Probabilities with n =
10 and p = .2

X P(X ≤ x)

0 .1074

1 .3758

2 .6778

3 .8791

4 .9672

5 .9936
6 .9991

7 .9999

8 1.0000

7-4d Mean and Variance of a Binomial


Distribution
Statisticians have developed general formulas
for the mean, variance, and standard deviation
of a binomial random variable:

μ = np
σ² = np(1 − p)
σ = √(np(1 − p))
Example 7.11: suppose that a professor has a
class full of students like Pat.
a. What is the mean mark? μ = np = 10(.2) = 2
b. What is the standard deviation? σ = √(np(1 − p)) = √(10(.2)(.8)) = 1.26
7-5 Poisson Distribution
A binomial random variable is the number of
successes in a set number of trials, whereas a
Poisson random variable is the number of
successes in an interval of time or specific
region of space.
Poisson Experiment
1. The number of successes that occur in any
interval is independent of the number of
successes that occur in any other interval.
2. The probability of a success in an interval is
the same for all equal-size intervals.
3. The probability of a success in an interval is
proportional to the size of the interval.
4. The probability of more than one success in
an interval approaches 0 as the interval
becomes smaller.
7-5 Poisson Probability Distribution
The probability that a Poisson random variable
assumes a value of x in a specific interval is:
P(x) = e^(−μ) μ^x / x!   for x = 0, 1, 2, …
Where μ is the mean number of successes in
the interval or region and e is the base of the
natural logarithm (approximately 2.71828).
Incidentally, the variance of a Poisson random
variable is equal to its mean; that is, 𝜎 " = 𝜇.
Examples - Probability of the Number of
Typographical Errors in Textbooks …
A statistics instructor has observed that the
number of typos in new editions of textbooks is
Poisson distributed with a mean of 1.5 per 100
pages. The instructor randomly selects 100
pages of a new book.
Example 7.12 – … in 100 pages
We want to determine the probability that a
Poisson random variable with a mean of 1.5 is
equal to 0.
Using the formula for Poisson probability
distribution, with x = 0, and μ = 1.5, we get:
P(0) = e^(−μ) μ^0 / 0! = e^(−1.5) (1.5)^0 / 0! = e^(−1.5) = .2231
The probability that in the 100 pages selected
there are no errors is .2231.
Example 7.13 – ... in a textbook of 400 pages
Calculate the probability that for a book of 400
pages there are (a) no typos, and (b) no more
than five typos.
If there are 1.5 errors per 100 pages, then there
must be 4 × 1.5 = 6 errors per 400 pages of
textbook, or μ = 6.
a. Thus, the probability that there are no typos
is:

P(0) = e^(−μ) μ^0 / 0! = e^(−6) (6)^0 / 0! = e^(−6) = .002479
b. And the probability that there are five or fewer typos is:
P(X ≤ 5) = P(0) + P(1) + P(2) + P(3) + P(4) + P(5)
= .002479 + .01487 + .04462 + .08924 + .1339 + .1606 = .4457
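The Poisson formula is just as easy to evaluate directly; a short Python sketch (not from the original slides; poisson_pmf is a helper name of mine):

from math import exp, factorial

def poisson_pmf(x, mu):
    # P(x) = e^(-mu) * mu^x / x!
    return exp(-mu) * mu**x / factorial(x)

print(round(poisson_pmf(0, 1.5), 4))                        # Example 7.12: ≈ .2231
print(round(poisson_pmf(0, 6), 6))                          # Example 7.13a: ≈ .002479
print(round(sum(poisson_pmf(x, 6) for x in range(6)), 4))   # Example 7.13b: ≈ .4457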
7-5a Poisson Table
Table 2 in Appendix B of the textbook provides
cumulative Poisson probabilities for selected
values of μ.
We can use this table to answer the question in
Example 7.13, part b, where we need P(X ≤ 5).
The table to the right shows the values of P(X ≤
5) for μ = 6.
We can use the table and the complement rule
to determine probabilities of the type P(X = x)
and P(X ≥ x).
For example, to find the probability that there
are exactly ten typos:

𝑃(10) = 𝑃 (𝑋 ≤ 10) − 𝑃(𝑋 ≤ 9)


= .9574 − .9161 = .0413
And the probability that there are more than
five typos:

𝑃(𝑋 ≥ 6) = 1 − 𝑃(𝑋 ≤ 5) = 1 − .4457 = .5543


*See notes for instructions to calculate Poisson
probabilities with Excel.
Cumulative Poisson Probabilities for μ = 6

X P(X ≤ x)

0 .0025

1 .0174

2 .0620

3 .1512

4 .2851

5 .4457

6 .6063

7 .7440

8 .8472

9 .9161
10 .9574

11 .9799

12 .9912

13 .9964

Chapter Summary
• There are two types of random variables:
• A discrete random variable whose
values are countable.
• A continuous random variable that can
assume an uncountable number of
values.
• In this chapter, we defined the expected
value, variance, and standard deviation of a
population described by a discrete random
variable and represented by a discrete
probability distribution.
• We also introduced bivariate discrete
distributions based on an important
application in finance.
• Finally, we presented the two most
important discrete distributions: the
binomial and the Poisson distribution.
Chapter 8: Continuous Probability Distributions
8-1 Continuous Random Variable
A continuous random variable can assume an
uncountable number of values, and differs from
a discrete variable in several respects:
• We cannot list the possible values of a
continuous random variable because there
is an infinite number of them.
• Because there is an infinite number of
values, the probability of each individual
value is virtually 0.
• Thus, we can only calculate the probability
of a range of values.
The following requirements apply to a
probability density function f(x) whose range
is a ≤ x ≤ b.
1. f(x) ≥ 0 for all x between a and b.
2. The total area under the curve between a
and b is 1.
8-1 Probability Density Functions
Estimated Probability between 25 and 45
Years from the Histogram for Example 3.1

Probability Density Function for Example 3.1

In Example 3.1, we could only estimate the


relative frequency that an ACBL member is
between 25 and 45 by multiplying base ×
height for each dark brown rectangle, knowing
that the sample size is 200.
If we can determine a probability density
function, which is essentially a smooth
histogram drawn from a large population and
made of infinitesimally small rectangles, we can
calculate the exact probability
8-1a Uniform Distribution
To illustrate how we find the area under the
curve that describes a probability density
function, consider the uniform probability
distribution, also called the rectangular
probability distribution, which is described by
the function:
f(x) = 1 / (b − a)   where a ≤ x ≤ b
To calculate the probability that X falls between
x1 and x2, simply determine the area in the
rectangle whose base is x2 – x1 and whose
height is 1/(b – a).
Thus:

P(x1 ≤ X ≤ x2) = Base × Height = (x2 − x1) × 1/(b − a)
Uniform Distribution

P(x1 ≤ X ≤ x2)

Example 8.1 – Uniformly Distributed Gasoline


Sales
The amount of gasoline sold daily at a service
station is uniformly distributed between 2,000
and 5,000 gallons. Find the following:
a. Probability that daily sales fall between
2,500 and 3,000 gallons.
b. Probability that the service station sells at
least 4,000 gallons.
c. Probability that the station sells exactly
2,500 gallons.
Solution
The probability density function is:

f(x) = 1 / (5,000 − 2,000) = 1/3,000   when 2,000 ≤ x ≤ 5,000

a. P(2,500 ≤ X ≤ 3,000) = (3,000 − 2,500)(1/3,000) = .1667

b. P(X ≥ 4,000) = (5,000 − 4,000)(1/3,000) = .3333

c. P(X = 2,500) = 0
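Because the uniform density is a rectangle, the three answers reduce to base times height; a minimal Python sketch (not part of the original slides):

a, b = 2_000, 5_000
height = 1 / (b - a)               # 1/3,000

p_a = (3_000 - 2_500) * height     # ≈ .1667
p_b = (5_000 - 4_000) * height     # ≈ .3333
p_c = 0.0                          # a single point carries zero probability
print(round(p_a, 4), round(p_b, 4), p_c)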
8-1b Using a Continuous Distribution to
Approximate a Discrete Distribution
In our definition, we distinguish between
discrete and continuous random variables by
noting whether the number of possible values is
countable or uncountable.
In practice, we use a continuous distribution to
approximate a discrete one when the number of
values the variable can assume is countable but
large.
Example: the values of weekly income,
expressed in dollars.
It is a countable variable, but because the
outcomes are so many, it is preferable to
employ a continuous probability distribution to
determine the probabilities associated with the
variable.
8-2 Normal Distribution
The normal distribution is the most important of
all probability distributions because of its crucial
role in statistical inference.
Normal Density Function
The probability density function of a normal
random variable is:

f(x) = [1 / (σ√(2π))] e^(−(1/2)((x − μ)/σ)²)

where e = 2.71828... and π = 3.14159...
The normal distribution is described by the
mean μ and the standard deviation σ.
a. Notice how a normal distribution is
symmetric about its mean and the random
variable ranges between –∞ and + ∞.
b. Changing the value of μ shifts the
distribution left or right.
c. Increasing the value of σ makes the
distribution wider.

8-2a Calculating Normal Probabilities


To calculate the probability that a normal
random variable falls into any interval, we must
compute the area in the interval under the
curve.
However, because the area under the curve for
any interval depends on the mean, μ, and the
standard deviation, σ, we need to first
standardize the random variable by subtracting
μ and dividing by σ.
When the variable is normal, the transformed
variable is called a standard normal random
variable and denoted by Z:
Z = (X − μ) / σ
Example 8.2 – Standardize the Normal
Variable
Suppose that the daily demand for regular
gasoline at a gas station is normally distributed
with a mean of 1,000 gallons and a standard
deviation of 100 gallons.
The station manager has just opened for
business and noted that there is exactly 1,100
gallons of regular gasoline in storage.
The manager would like to know the probability
that there will be enough regular gasoline to
satisfy today’s demands before the new supply.
Solution
We label the demand for regular gasoline as X,
and we want to find
P(X < 1,100), as shown to the top right.
First, we standardize X:

P(X < 1,100) = P((X − μ)/σ < (1,100 − 1,000)/100) = P(Z < 1.00)

as shown to the bottom right.

Figure: P(X < 1,100)

Figure: P(Z < 1.00)


Example 8.2 – Calculating Probabilities
To calculate probabilities of the type P(Z < z) we
can use the standard normal probability table,
which is included in Table 3, Appendix B of the
textbook. The table lists the cumulative
probabilities for values of z ranging from –3.09
to +3.09.
We find the desired probability in the table’s left
margin:

𝑃 (𝑋 < 1,100) = 𝑃(𝑍 < 1.00) = .8413

To determine the probability that the standard


normal random variable is greater than some
value of z, we can use the complement rule.
For example, to find the probability that Z is
greater than 1.80, we write:

𝑃(𝑍 > 1.80) = 1 − 𝑃(𝑍 ≤ 1.80)


= 1 − 𝑃(𝑍 < 1.80)
Again, in the table’s left margin we find:

𝑃 (𝑍 > 1.80) = 1 − .9641 = .0359

Example 8.2 – Probability of an Interval


To determine the probability that a standard
normal random variable lies between two values
of z.
For example, we can obtain the probability of
the interval between –0.71 and 0.92 by
calculating the difference between two
cumulative probabilities:
𝑃(−0.71 < 𝑍 < 0.92)
= 𝑃(𝑍 < 0.92) − 𝑃(𝑍 < −0.71)
From the table to the right, we find:

𝑃(𝑍 < −0.71) = .2389


And

𝑃 (𝑍 < 0.92) = .8212


Thus:

𝑃 (−0.71 < 𝑍 < 0.92) = .8212 − .2389 = .5823
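The same cumulative probabilities can be obtained without the table; a small Python sketch using the standard library's statistics.NormalDist (not part of the original slides):

from statistics import NormalDist

demand = NormalDist(mu=1_000, sigma=100)
print(round(demand.cdf(1_100), 4))              # P(X < 1,100) ≈ .8413

z = NormalDist()                                # standard normal, mu = 0, sigma = 1
print(round(1 - z.cdf(1.80), 4))                # P(Z > 1.80) ≈ .0359
print(round(z.cdf(0.92) - z.cdf(-0.71), 4))     # P(-0.71 < Z < 0.92) ≈ .582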

8-2b Finding Values of Z


There is a family of problems that requires us to
determine the value of Z given a probability.
We use the notation ZA to represent the value of
z such that the area to its right under the
standard normal curve is A; that is:

P(Z > ZA) = A
To determine the value of Z associated to a
given probability, we need to use the table
backward, as shown in the next example.
In Chapter 4, we introduced percentiles, which
are measures of relative standing. The values of
ZA are the 100(1 - A)th percentiles of a standard
normal random variable.
For example, Z0.05 = 1.645, which means that
1.645 is the 95th percentile.
Example 8.4 – Finding Z.05
Find the value of a standard normal random
variable such that the probability that the
random variable is greater than this quantity is
5% (Z.05).
Solution
The probability that z is less than Z.05 must be
1 – .05 = .9500. To find Z.05, we locate .9500 on
the table.
As you can see from the figure to the right, there
are two values of Z that are equally close: .9495
and .9505.
Those two probabilities correspond to the Z-
values of 1.64 and 1.65, respectively.
Thus, we can say that: Z.05 = 1.645.
Example 8.5 – Finding −Z.05
Find the value of a standard normal random
variable such that the probability that the
random variable is less than this quantity is 5%
(−Z.05).
Solution
Because the standard normal curve is symmetric
about 0, and we know that Z.05 = 1.645, we can
say that:
−Z.05 = −1.645.

*See notes for instructions on how to calculate


probabilities from z values and z values from
probabilities using Excel.
Excel instructions:

We can use Excel to compute probabilities as well as


values of X and Z. To compute cumulative normal
probabilities P(X < x), type (in any cell)

=NORM.DIST([X],[μ],[σ],True)

(Typing “True” yields a cumulative probability. Typing


“False” will produce the value of the normal density
function, a number with little meaning.)

If you type 0 for μ and 1 for σ, you will obtain


standard normal probabilities.

Alternatively, type
=NORM.S.DIST([Z],True)

In Example 8.2, we found P(X < 1,100) = P(Z < 1.00)
= .8413. To instruct Excel to calculate this probability,
we enter:

=NORM.DIST(1100,1000,100,True)

Or

=NORM.S.DIST(1.00,True)

To calculate a value for ZA, type

=NORM.S.INV([1 - A])

In Example 8.4, we would type

=NORM.S.INV(.95)

and produce 1.6449. We calculated Z.05 = 1.645.

To calculate a value of x given the probability P(X > x)


= A, enter

=NORM.INV([1 – A],[μ],[σ])

The chapter-opening example would be solved by


typing

=NORM.INV(.99,490,61)
which yields 632.

Example 8.6 – Determining the Reorder Point


The manager of a large home-improvement
store would like to reduce the incidence of
shortages so that only 5% of orders will arrive
after inventory drops to 0. This policy is
expressed as a 95% service level. From previous
periods, the company has determined that
demand during lead time is normally distributed
with a mean of 200 and a standard deviation of
50.
Find the reorder point (ROP).
Solution
The reorder point is set so that the probability
that demand during lead time exceeds this
quantity is 5%.
We know that the standard normal value of the
reorder point (ROP) is Z.05 = 1.645.
To find ROP, we must unstandardize Z.05.
Z.05 = (ROP − μ) / σ  ⇒  ROP = μ + Z.05 σ = 200 + 1.645(50) = 282.25
which we round up to 283. The reorder policy is
to order a new batch of fans when there are 283
fans left in inventory.
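The unstandardizing step is an inverse-CDF calculation; a brief Python sketch with statistics.NormalDist (not from the original slides):

from statistics import NormalDist

lead_time_demand = NormalDist(mu=200, sigma=50)
# Demand during lead time should exceed the reorder point only 5% of the time
rop = lead_time_demand.inv_cdf(0.95)
print(round(rop, 2))   # ≈ 282.24, rounded up to 283 fans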

8-3 Exponential Distribution


Exponential Probability Density Function
A random variable X is exponentially distributed
if its probability density function is given by:

f(x) = λe^(−λx),   x ≥ 0
where e = 2.71828… and λ is the parameter of
the distribution. The image to the right shows
the exponential distribution for different values
of λ.
The mean of an exponential random variable is
equal to its standard deviation:
μ = σ = 1/λ
Probability Associated with an Exponential
Random Variable
If X is an exponential random variable:

P(X > x) = e^(−λx)
P(X < x) = 1 − e^(−λx)
P(x1 < X < x2) = e^(−λx1) − e^(−λx2)

Example 8.7 – Lifetimes of Alkaline Batteries


The lifetime of an alkaline battery (measured in
hours) is exponentially distributed with λ = .05.
a. What are the mean and standard deviation
of the battery’s lifetime?
b. Find the probability that a battery will last
between 10 and 15 hours.
c. What is the probability that a battery will
last for more than 20 hours?
Solution
a. μ = σ = 1/λ = 1/.05 = 20 hours
b. Let X be the lifetime of a battery. Then:

P(10 < X < 15) = e^(−.05(10)) − e^(−.05(15)) = e^(−0.5) − e^(−0.75)
= .6065 − .4724 = .1341
c. P(X > 20) = e^(−.05(20)) = e^(−1) = .3679
*See notes for Excel instructions.
Excel instructions:

Type (in any cell)

=EXPON.DIST([X],[λ],True)

To produce the answer for Example 8.7c, we would


find P(X < 20) and subtract it from 1.

To find P(X < 20), type

=EXPON.DIST(20,.05,True)

which outputs .6321 and hence P(X > 20) = 1 - .6321


= .3679, which is the same number we produced
manually.

Example 8.8 – Supermarket Checkout


Counter
A checkout counter at a supermarket completes
the process according to an exponential
distribution with a service rate of 6 per hour.
A customer arrives at the checkout counter.
Find the probability of the following events.
a. The service is completed in fewer than 5
minutes.
b. The customer leaves the checkout counter
more than 10 minutes after arriving.
c. The service is completed in a time between
5 and 8 minutes.
Solution
Because all the questions are about services
completed in minutes, let us also convert the
service rate to minutes:
λ = 6 / hour = .1 / minute.
Thus:

a. P(X < 5) = 1 − e^(−.1(5)) = 1 − e^(−0.5) = .3935

b. P(X > 10) = e^(−.1(10)) = e^(−1) = .3679

c. P(5 < X < 8) = e^(−.1(5)) − e^(−.1(8)) = e^(−0.5) − e^(−0.8) = .6065 − .4493 = .1572
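The three exponential probabilities follow directly from the CDF; a minimal Python sketch (not part of the original slides; expon_cdf is a helper name of mine):

from math import exp

def expon_cdf(x, lam):
    # P(X < x) = 1 - e^(-lambda * x)
    return 1 - exp(-lam * x)

lam = 0.1                                                # 6 per hour = .1 per minute
print(round(expon_cdf(5, lam), 4))                       # a. P(X < 5)  ≈ .3935
print(round(1 - expon_cdf(10, lam), 4))                  # b. P(X > 10) ≈ .3679
print(round(expon_cdf(8, lam) - expon_cdf(5, lam), 4))   # c. P(5 < X < 8) ≈ .1572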
8-4 Other Continuous Distributions
In this section, we introduce three more
continuous distributions that are used
extensively in statistical inference.
• Student t distribution
• Chi-squared distribution
• F distribution
8-4a Student t Distribution
The Student t distribution was first derived by
William S. Gosset in 1908. It is very commonly
used in statistical inference, and we will employ
it in Chapters 12, 13, 14, 16, 17, and 18.
Student t Density Function
f(t) = [Γ((ν + 1)/2) / (√(νπ) Γ(ν/2))] [1 + t²/ν]^(−(ν+1)/2)
where ν (Greek letter ‘nu’) are the degrees of
freedom, π = 3.14159…, and Г the Gamma
function (not defined here.)
The mean and variance of a Student t random
variable are:
E(t) = 0   and   V(t) = ν / (ν − 2)   for ν > 2
The Student t distribution is said to be mound-
shaped:
a. It has a greater variance than the standard
normal distribution.
b. Its variance also decreases with the degrees
of freedom.
8-4a Determining Student t Probabilities and
Values
Probabilities: calculating probabilities of the
Student t random variable manually, as we did
for the normal random variable, is not practical
because we would need a different table for
each ν. Alternatively, we can use Microsoft Excel
(*see notes.)
Values: finding values of a Student t random
variable is easier than determining values of a
normal random variable using Table 3
backward. Table 4 in Appendix B of the
textbook lists the values of tA,ν, the values of a
Student t random variable with ν degrees of
freedom such that P(t > tA,ν) = A (see figure.)
Example: find the value of t with ν = 10 and A
= .05,
We locate 10 in the first column and move
across this row until we locate
the number under the heading t.05. From Table 4
we find t.05,10 = 1.812.
Because the Student t distribution is symmetric
about 0, the
value of t such that the area to its left is A is
−tA,ν.
t.95,10 = −t.05,10 = −1.812
Note how in the last row of the Student t table,
the degrees of freedom are infinite, and the t
values are identical to the values of z.
Excel Instructions:

To compute Student t probabilities, type

=T.DIST([x],[ν],[True])

For example, to calculate P(t < 1.812) with 10 degrees of freedom, we type

=T.DIST(1.812,10,TRUE) = 0.9500

so that P(t > 1.812) = 1 − .9500 = .05, which matches the example shown with A = 0.05 and ν = 10.

To determine tA type:

=T.INV([1-A],[ν])

For example, if A = 0.05 and ν = 10, we obtain:

=T.INV(0.95,10)=1.812

8-4b Chi-Squared Distribution


The Chi-squared distribution is positively
skewed (see Figure a) and ranges between 0
and ∞. The shape of the Chi-squared
distribution also depends on the number of its
degrees of freedom. Figure (b) shows the effect
of increasing the
degrees of freedom.
Chi-Squared Density Function

1 1 " )Q ⁄"<! <R! ⁄"


𝑓 (𝛸 " ) = (𝛸 𝑒
𝛤(𝜈⁄2) 2N⁄"

where ν are the degrees of freedom, and Г the


Gamma
function (not defined here.)
The mean and variance of a chi-squared
random variable are:

E(χ²) = ν   and   V(χ²) = 2ν


8-4b Determining Chi-Squared Probabilities
and Values
Probabilities: just as for the Student t random
variable, calculating probabilities for the
chi-squared random variable manually is not
practical because we would need a different
table for each ν. Alternatively, we can use
Microsoft Excel (*see notes.)
Values: the value of χ² with ν degrees of
freedom such that the area to its right under the
chi-squared curve is equal to A is denoted by
χ²A,ν (see Table 5 in Appendix B of the textbook.)
To represent left-tail critical values, we find
χ²1−A,ν, because the complement area to its right
must be 1 − A.
Example: find the value of χ² with ν = 8 and A = .05.
We locate 8 in the first column and move across
this row until we locate the number under the
heading χ².05. From Table 5 we find χ².05,8 = 15.5.
To find the point in the same distribution such
that the area to its left is .05, we find
χ²(1−.05),8 = χ².95,8 = 2.73.
For values of degrees of freedom greater than
100, the chi-squared distribution can be
approximated by a normal distribution with
μ = ν and σ = √(2ν).
Excel Instructions:

To calculate P(χ² > x), type into any cell

=CHISQ.DIST.RT([x],[ν])

For example,

=CHISQ.DIST.RT(6.25,3) = .100

To determine χ²A,ν, type

=CHISQ.INV.RT([A],[ν])

For example,

=CHISQ.INV.RT(.100,3) = 6.25

8-4c F Distribution
The F distribution is also positively skewed (see
image) and ranges between 0 and ∞. The
shape of the F distribution depends on two
numbers of degrees of freedom.
F Density Function
f(F) = [Γ((ν1 + ν2)/2) / (Γ(ν1/2) Γ(ν2/2))] (ν1/ν2)^(ν1/2) F^(ν1/2 − 1) / [1 + ν1F/ν2]^((ν1 + ν2)/2),   F > 0
Where ν1 and ν2 are the numerator and
denominator degrees of freedom, respectively,
and Г the Gamma function (not defined here.)
The mean and variance of an F random variable
are:
E(F) = ν2 / (ν2 − 2)   for ν2 > 2

V(F) = [2ν2²(ν1 + ν2 − 2)] / [ν1(ν2 − 2)²(ν2 − 4)]   for ν2 > 4
Note how the mean of the F distribution
depends only on ν2 and that for large ν2 it
approaches 1.
8-4c Determining F Probabilities and Values
Probabilities: just as for the Student t and
chi-squared random variables, calculating
probabilities for the F random variable manually
is not practical because we would need a
different table for each pair of ν1 and ν2.
Alternatively, we can use Microsoft Excel (*see
notes.)
Values: we define FA,ν1,ν2 as the value of F with ν1
and ν2 degrees of freedom such that the area to
its right under the curve is A; that is, P(F > FA,ν1,ν2)
= A (see Table 6 in Appendix B of the textbook.)
Values of F1−A,ν1,ν2 are unavailable, but it can be
shown that F1−A,ν1,ν2 = 1 / FA,ν2,ν1 (note that the
degrees of freedom are reversed).
Example: find the value of F.05,5,7 in Table 6.
Locate the numerator degrees of freedom, 5,
across the top and the denominator degrees of
freedom, 7, down the left column. The
intersection is 3.97. Thus, F.05,5,7 = 3.97.
Remember that the order of the degrees of
freedom is important: F.05,5,7 ≠ F.05,7,5.
Suppose that we want to determine the left tail
of an F distribution with ν1 = 4 and ν2 = 8 such
that the area to its right is .95. Thus,
F.95,4,8 = 1 / F.05,8,4 = 1 / 6.04 = 0.166.
Excel Instructions:

For probabilities, type =F.DIST.RT([X],[ν1],[ν2])

For example, =F.DIST.RT(3.97,5,7) = .05

To determine FA,ν1,ν2, type =F.INV.RT([A],[ν1],[ν2])

For example, =F.INV.RT(.05,5,7) = 3.97
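Outside Excel, the same critical values are available from scipy.stats; a small Python sketch (not part of the original slides), reproducing the t, chi-squared, and F examples of this section:

from scipy import stats

print(round(stats.t.ppf(0.95, df=10), 3))             # t.05,10 ≈ 1.812
print(round(stats.chi2.ppf(0.95, df=8), 1))           # chi-squared .05,8 ≈ 15.5
print(round(stats.chi2.ppf(0.05, df=8), 2))           # chi-squared .95,8 ≈ 2.73
print(round(stats.f.ppf(0.95, dfn=5, dfd=7), 2))      # F.05,5,7 ≈ 3.97
print(round(1 / stats.f.ppf(0.95, dfn=8, dfd=4), 3))  # F.95,4,8 = 1 / F.05,8,4 ≈ 0.166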

Chapter Summary
• This chapter dealt with continuous random
variables and their distributions.
• We showed how to compute the probability
of a range of values as the area in the
interval under the density function.
• We introduced the most important
distribution in statistics and showed how to
compute the probability that a normal
random variable falls into any interval.
• Additionally, we demonstrated the inverse
problem of finding values of a normal
random variable given a probability.
• Next, we introduced the exponential
distribution, a distribution that is particularly
useful in several management science
applications.
• Finally, we presented three more
continuous random variables and their
probability density functions. The Student t,
chi-squared, and F distributions will be used
extensively in statistical inference.
