IFT Notes R04 Common Probability Distributions
This document should be read in conjunction with the corresponding reading in the 2022 Level I
CFA® Program curriculum. Some of the graphs, charts, tables, examples, and figures are copyright
2020, CFA Institute. Reproduced and republished with permission from CFA Institute. All rights
reserved.
Required disclaimer: CFA Institute does not endorse, promote, or warrant the accuracy or quality of
the products or services offered by IFT. CFA Institute, CFA®, and Chartered Financial Analyst® are
trademarks owned by CFA Institute.
Version 1.0
values is the same for all outcomes. With six outcomes, p(x) = 1/6 for all values of X (X = 1, 2,
3, 4, 5, 6). The table below summarizes the two views of this random variable – the
probability function and the cumulative distribution function.
X = x    Probability Function p(x) = P(X = x)    Cumulative Distribution Function F(x) = P(X ≤ x)
1        1/6                                     1/6
2        1/6                                     2/6
3        1/6                                     3/6
4        1/6                                     4/6
5        1/6                                     5/6
6        1/6                                     6/6
Example
Using the table above, find the following probabilities:
1. F(4)
2. P(3 ≤ X < 6)
3. F(9)
Solution to 1:
To find F(4), we must find the cumulative probability of P(X ≤ 4) using the cumulative
distribution function (third column). From the table, we can see that P(X ≤ 4) = 4/6 = 2/3.
Therefore, the probability is 2/3.
Solution to 2:
To find P (3 ≤ X < 6), we need to find the sum of three probabilities: p(3), p(4), p(5). This is
1/6 + 1/6 + 1/6 = 3/6 = 1/2.
Solution to 3:
To find F(9), we must find the cumulative probability of P(X ≤ 9). This includes all possible
outcomes; hence the probability is 1.
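The three solutions above can be checked with a short sketch. It assumes a fair six-sided die, as in the table; the helper name `cdf` is illustrative and not part of the notes. Exact fractions are used so the results match the table entries.

```python
from fractions import Fraction

# Outcomes of a fair six-sided die; each has probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
p = {x: Fraction(1, len(outcomes)) for x in outcomes}

def cdf(x):
    """F(x) = P(X <= x): add up p(value) for every outcome not exceeding x."""
    return sum(prob for value, prob in p.items() if value <= x)

f4 = cdf(4)                                            # F(4)
p_3_to_5 = sum(p[x] for x in outcomes if 3 <= x < 6)   # P(3 <= X < 6)
f9 = cdf(9)                                            # F(9)

print(f4, p_3_to_5, f9)  # 2/3 1/2 1
```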
The probability that the random variable will take a value between x1 and x2, where x1 and x2
both lie within the range from a to b, is given by:
$$P(x_1 \le X \le x_2) = \frac{x_2 - x_1}{b - a}$$
Example
X is a uniformly distributed continuous random variable between 10 and 20. Calculate the
probability that X will fall between 12 and 18.
Solution:
$$P(12 \le X \le 18) = \frac{18 - 12}{20 - 10} = 0.6$$
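As a minimal sketch of the same calculation, the function below applies the interval formula for a continuous uniform variable on the range a to b. The function name `uniform_interval_prob` is an illustrative assumption, and the sketch rejects intervals that fall outside the range rather than guessing how to handle them.

```python
def uniform_interval_prob(a, b, x1, x2):
    """P(x1 <= X <= x2) for X uniform on [a, b], assuming a <= x1 <= x2 <= b."""
    if a >= b or not (a <= x1 <= x2 <= b):
        raise ValueError("need a < b and a <= x1 <= x2 <= b")
    return (x2 - x1) / (b - a)

prob = uniform_interval_prob(10, 20, 12, 18)
print(prob)  # 0.6
```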
The cumulative distribution function for a continuous random variable is shown below:
Example
A commodity analyst predicts that the price per ounce of gold three years from now will be
between $1,500 and $1,700. Assume gold prices follow a continuous uniform distribution.
What is the probability that the price will be less than $1,600 three years from now?
Solution:
$$F(1{,}600) = \frac{1{,}600 - 1{,}500}{1{,}700 - 1{,}500} = 0.50 = 50\%$$
The probability that the gold price will be less than $1,600 per ounce three years from now is 50%.
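The gold example uses the cumulative distribution function of a continuous uniform variable. A small sketch, with the hypothetical helper name `uniform_cdf`, also shows what happens outside the range: the function is 0 below the lower limit and 1 above the upper limit.

```python
def uniform_cdf(x, a, b):
    """F(x) = P(X <= x) for X uniform on [a, b]."""
    if x <= a:
        return 0.0   # no probability mass below the lower limit
    if x >= b:
        return 1.0   # all probability mass lies at or below the upper limit
    return (x - a) / (b - a)

p_below_1600 = uniform_cdf(1600, 1500, 1700)
print(p_below_1600)  # 0.5
```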
A Bernoulli trial is one in which there are only two possible outcomes: success or failure.
Flipping a coin is an example of a Bernoulli trial – you either get heads or tails, but nothing
else. This can be expressed as:
P (Y = 1) = p
P (Y = 0) = 1 – p
where:
p = probability that the trial is a success
In a binomial distribution, the random variable, X, is the number of successes in a given
number of Bernoulli trials. Continuing with the coin example, say we flip the coin 10 times
and we define success as ‘Heads’. Clearly with 10 flips we can get 0 to 10 successes.
The probability distribution of a binomial random variable for the probability of "x" success
in "n" trials is calculated using the following formula:
$$P(x) = P(X = x) = {}_nC_x \, p^x (1 - p)^{n - x}$$
where:
p = the probability of success on each trial
x = number of successes
n = number of trials
Instructor’s Note:
Two important points help illustrate the intuition behind the formula:
The successes can be in any order. That is why we use the combination function and not
the permutation function.
The events are independent. That is why we simply multiply the probability of each
event.
Example
If we flip a fair coin (p = 0.5) ten times (n = 10), what is the probability of seven successes?
Solution:
$$P(7) = P(X = 7) = {}_{10}C_7 \times 0.5^7 \times 0.5^3 = 0.117$$
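The coin-flip calculation can be reproduced with a short sketch. The function name `binomial_pmf` is illustrative; `math.comb` supplies the combination count.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p7 = binomial_pmf(7, 10, 0.5)
print(round(p7, 3))  # 0.117

# The probabilities over all possible success counts (0 to 10) add up to 1.
total = sum(binomial_pmf(x, 10, 0.5) for x in range(11))
```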
Mean and variance of a binomial variable
The mean and variance of a binomial variable can be calculated as:
Random Variable       Mean    Variance
Bernoulli, B(1, p)    p       p(1 – p)
Binomial, B(n, p)     np      np(1 – p)
For our coin-flip example, the mean value of the binomial random variable is np = 10 × 0.5 =
5. The intuition: if we repeat the binomial experiment many times, where each experiment
consists of 10 coin flips, we will average 5 successes per experiment. Because p = 0.5, the
number of successes is distributed symmetrically around the mean. The average number of
successes per experiment moves closer to the expected value as the number of experiments grows.
Example
Over the last 10 years, Abro corporation’s EPS increased year over year six times and
decreased year over year four times. You decide to model the number of EPS increases for
the next decade as a binomial random variable.
1. If success is defined as an increase in the annual EPS, determine the probability of
success.
2. What is the probability that EPS will increase in exactly 5 of the next 10 years?
3. Calculate the expected number of yearly EPS increases during the next 10 years.
4. Calculate the variance and standard deviation of the number of yearly EPS increases
during the next 10 years.
Solution to 1:
There are only two possible outcomes: increase in the EPS and no increase in the EPS.
Probability of success: p = 6/10 = 0.6
Probability of failure: 1 – p = 1 – 0.6 = 0.4
Solution to 2:
Using the binomial model:
$$P(X = x) = {}_nC_x \, p^x q^{n - x}, \quad q = 1 - p$$
$$P(X = 5) = {}_{10}C_5 \times 0.6^5 \times 0.4^5$$
$$P(X = 5) = 252 \times 0.07776 \times 0.01024 = 0.2007 = 20.07\%$$
Solution to 3:
Expected number of yearly EPS increases: E(X) = np = 10 × 0.6 = 6
Solution to 4:
Variance = np (1 – p) = 6 × 0.4 = 2.4
Standard Deviation = √2.4 = 1.549
For the earlier coin-flip example (n = 10, p = 0.5), the variance is np(1 – p) = 10 × 0.5 × (1 – 0.5) = 2.5.
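The four parts of the EPS example can be sketched in a few lines. This assumes, as the example does, that the historical frequency of increases (6 out of 10) is a reasonable estimate of the success probability; the variable names are illustrative.

```python
from math import comb, sqrt

n = 10        # number of years modeled
p = 6 / 10    # historical frequency of EPS increases, used as P(success)

prob_5 = comb(n, 5) * p**5 * (1 - p)**(n - 5)   # P(X = 5), roughly 20%
mean = n * p                                    # expected number of increases
variance = n * p * (1 - p)
std_dev = sqrt(variance)

print(f"mean={mean:.1f} variance={variance:.1f} std_dev={std_dev:.3f}")
print(f"P(X = 5) = {prob_5:.4f}")
```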
Binomial tree
A binomial tree can be used to model stock price movements. Refer to the tree diagram
below. ‘S’ represents the initial stock price. ‘u’ represents an up move and ‘d’ represents a
down move. The nodes show each possible value of the stock after 1, 2 and 3 time periods.
The expected stock price after each period is equal to the sum of possible stock prices at the
end of the period multiplied by their respective probabilities.
Example
Consider an initial stock price of $100. In one time period, the stock can either rise by a
factor of 1.1 or go down by a factor of 1/1.1. In any given time period, the probability of an
up move is 0.6 and the probability of a down move is 0.4. After two periods, what are the
possible stock prices and their respective probabilities? What is the expected stock price?
Solution:
uuS = 1.1 x 1.1 x 100 = 121 with probability 0.6 x 0.6 = 0.36
udS = 1.1 x 1/1.1 x 100 = 100 with probability 0.6 x 0.4 = 0.24
duS = 1/1.1 x 1.1 x 100 = 100 with probability 0.4 x 0.6 = 0.24
ddS = 1/1.1 x 1/1.1 x 100 = 82.64 with probability 0.4 x 0.4 = 0.16
Expected stock price = 121 x 0.36 + 100 x 0.24 + 100 x 0.24 + 82.64 x 0.16 = $104.78
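A minimal sketch of the two-period tree follows. It enumerates every path of up and down moves, which is practical only for a small number of periods; the variable names are illustrative assumptions. Because the sketch does not round the down-down price to 82.64 before weighting, the expected price agrees with the hand calculation to the cent.

```python
from itertools import product

s0, u, p_up, periods = 100.0, 1.1, 0.6, 2
d = 1 / u   # a down move is the reciprocal of an up move

nodes = []      # (path, terminal price, path probability)
expected = 0.0
for path in product("ud", repeat=periods):
    price, prob = s0, 1.0
    for move in path:
        price *= u if move == "u" else d
        prob *= p_up if move == "u" else 1 - p_up
    nodes.append(("".join(path), price, prob))
    expected += price * prob

for name, price, prob in nodes:
    print(name, round(price, 2), round(prob, 2))
print(round(expected, 2))  # 104.78
```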
4. Normal Distribution
The normal distribution is the most extensively used probability distribution in quantitative
work. A normal distribution is symmetrical and bell-shaped as shown in the graph below:
In this figure ‘m’ stands for mean, 1s means one standard deviation, 2s means two standard
deviations, and so on. We can make the following probability statements for a normal
distribution:
Approximately 68% of all observations fall in the interval m ± 1s.
Approximately 95% of all observations fall in the interval m ± 2s.
Approximately 99% of all observations fall in the interval m ± 3s.
The intervals indicated above are easy to remember but are only approximate for the stated
probabilities. More precise intervals (confidence intervals) are:
90% of all observations are in the interval m ± 1.65s.
95% of all observations are in the interval m ± 1.96s.
99% of all observations are in the interval m ± 2.58s.
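The interval probabilities above can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `central_prob` is illustrative. It shows that the whole-number multiples give only approximate coverage (the three-standard-deviation interval covers slightly more than 99%), while 1.65, 1.96 and 2.58 land close to 90%, 95% and 99%.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def central_prob(k):
    """Probability that a normal observation lies within k standard deviations of its mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3, 1.65, 1.96, 2.58):
    print(k, round(central_prob(k), 4))
```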
The characteristics of a normal distribution are as follows:
The normal distribution is completely described by two parameters – its mean, µ, and
variance, $\sigma^2$. We indicate this as $X \sim N(\mu, \sigma^2)$.
The normal distribution has a skewness of 0 (it is symmetric) and a kurtosis
(measure of peakedness) of 3. Due to the symmetry, the mean, median and mode are
all equal for a normal random variable.
A linear combination of two or more normal random variables is also normally
distributed.
Standard normal distribution
The normal distribution with mean (µ) = 0 and standard deviation (σ) = 1 is called the
standard normal distribution.
The formula for standardizing a random variable X is:
$$Z = \frac{X - \mu}{\sigma}$$
where:
µ is the population mean.
σ is the population standard deviation.
To find the probability that a standard normal variable is less than or equal to 0.5, for
example, locate the row that contains 0.50, look at the 0 column, and find the entry 0.6915.
Thus, P(Z ≤ 0.5) = 0.6915 or 69.15%.
The probability that a standard normal variable is less than or equal to 0 is 0.5000. This
follows from symmetry: the standard normal distribution is symmetric around its mean of 0, so
half of the probability lies at or below 0. The table above validates this fact.
For a non-negative number x, we can use N(x) directly from the table. For a negative number
–x, N(-x) = 1.0 – N(x). Essentially, we are using the fact that the normal distribution is
symmetric around the mean.
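The table lookups and the symmetry rule can be reproduced with a short sketch that uses `statistics.NormalDist` in place of a printed z-table. The name `N` mirrors the notation in the text.

```python
from statistics import NormalDist

N = NormalDist().cdf   # N(x) = P(Z <= x) for a standard normal Z

print(round(N(0.5), 4))   # 0.6915
print(round(N(0.0), 4))   # 0.5

# Symmetry: N(-x) = 1 - N(x), so a table of non-negative x values is enough.
left_tail = N(-0.5)
via_symmetry = 1.0 - N(0.5)
print(round(left_tail, 4), round(via_symmetry, 4))  # 0.3085 0.3085
```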
Example
A portfolio has a mean return of 15% and a standard deviation of return of 20% per year.
What is the probability that the portfolio return would be below 18%? We are given the
following information from the z-table: P(Z < 0.15) = 0.5596, P(Z > 0.15) = 0.4404, P(Z <
0.18) = 0.5714, P(Z > 0.18) = 0.4286.
Solution:
$$P\left(Z < \frac{X - \mu}{\sigma}\right) = P\left(Z < \frac{0.18 - 0.15}{0.20}\right) = P(Z < 0.15) = 0.5596$$
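As a rough check on the portfolio example, the sketch below standardizes the 18% threshold and then evaluates the standard normal cumulative probability. It assumes portfolio returns are normally distributed, as the example does; the variable names are illustrative.

```python
from statistics import NormalDist

mean_return, sd_return, threshold = 0.15, 0.20, 0.18

# Step 1: standardize the threshold, Z = (X - mu) / sigma.
z_score = (threshold - mean_return) / sd_return

# Step 2: look up P(Z < z_score) on the standard normal distribution.
prob_below = NormalDist().cdf(z_score)

# Equivalent shortcut: evaluate the non-standard normal distribution directly.
prob_direct = NormalDist(mu=mean_return, sigma=sd_return).cdf(threshold)

print(round(z_score, 2), round(prob_below, 4))  # 0.15 0.5596
```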
Univariate v/s multivariate distribution
Univariate distribution
$$SF_B = \frac{15 - 8}{10} = 0.7$$
Since A has a higher safety first ratio, the investor should select portfolio A.
Roy’s safety-first criterion
It states that an optimal portfolio minimizes the probability that the actual portfolio return
will fall below the target return.
If we are given the holding period return over any time period, we can calculate the
equivalent continuously compounded rate of return for that period as:
r = ln (HPR +1)
Example
The holding period return of a stock was 10% over a period of one year. What is the
equivalent continuously compounded rate of return for the year?
Solution:
r = ln (0.1 +1) = 0.0953 = 9.53%
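A minimal sketch of the conversion, with the illustrative function name `continuously_compounded`. It also converts back with the exponential function to confirm that the two rates describe the same holding period return.

```python
from math import exp, log

def continuously_compounded(hpr):
    """Continuously compounded return equivalent to a holding period return."""
    return log(1 + hpr)

r = continuously_compounded(0.10)
print(round(r, 4))  # 0.0953

# Round trip: exponentiating r recovers the original holding period return.
recovered_hpr = exp(r) - 1
```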
The relationship between the chi-square and F-distributions is as follows: If $\chi_1^2$ is one
chi-square random variable with m degrees of freedom and $\chi_2^2$ is another chi-square random
variable with n degrees of freedom, then $F = (\chi_1^2/m)/(\chi_2^2/n)$ follows an F-distribution
with m numerator and n denominator degrees of freedom.
Summary
LO.a: Define a probability distribution and compare and contrast discrete and
continuous random variables and their probability functions.
A random variable is a variable whose outcome cannot be predicted. A probability
distribution lists all possible outcomes of a random variable along with their associated
probabilities.
A discrete random variable is one for which the number of possible outcomes can be
counted. It has measurable probabilities associated with each specific outcome.
A continuous random variable is one for which we cannot count the number of possible
outcomes. Therefore, probabilities cannot be associated with specific outcomes; instead, they
have to be assigned to ranges of outcomes.
LO.b: Calculate and interpret probabilities for a random variable, given its cumulative
distribution function.
The probability that an outcome will be less than or equal to a specific value x is given by
the cumulative distribution function, F(x) = P(X ≤ x). For a continuous random variable, this
is the area under the probability density function to the left of that value.
LO.c: Describe the properties of a discrete uniform random variable, and calculate and
interpret probabilities given the discrete uniform distribution function.
A discrete uniform random variable is one where all possible outcomes are equally likely.
For example, the roll of a die.
Probabilities for a discrete uniform distribution: If the total number of outcomes is n, then
the probability of each outcome = 1/n.
LO.d: Describe the properties of the continuous uniform distribution, and calculate
and interpret probabilities given a continuous uniform distribution.
The continuous uniform distribution is defined over a range from a lower limit ‘a’ to an
upper limit ‘b’. The probability that the random variable will take a value between x1 and x2,
where x1 and x2 both lie within the range, is given by:
$$P(x_1 \le X \le x_2) = \frac{x_2 - x_1}{b - a}$$
LO.e: Describe the properties of a Bernoulli random variable and a binomial random
variable, and calculate and interpret probabilities given the binomial distribution
function.
A Bernoulli trial is an experiment that has only two possible outcomes: a success or a failure.
For example, the toss of a coin.
If the experiment is carried out n times, the number of successes (denoted by X) is called a
binomial random variable.
$$Z = \frac{X - \mu}{\sigma}$$
LO.j: Calculate and interpret probabilities using the standard normal distribution.
The Z-table is used to find the probability that X will be less than or equal to a given value.
Suppose we have a normal random variable, X, with µ = 10 and σ = 2. If the value of X is 11,
we standardize X with Z = (11 – 10)/2 = 0.5.
The probability that we will observe a value less than 11 for X ~ N(µ = 10, σ = 2) is exactly
the same as the probability that we will observe a value less than 0.5 for Z ~ N(0, 1).
LO.k: Define shortfall risk, calculate the safety-first ratio, and identify an optimal
portfolio using Roy’s safety-first criterion.
Shortfall risk is the risk that portfolio value will fall below some minimum acceptable level
over some time horizon.
$$\text{SF Ratio} = \frac{E(R_p) - R_L}{\sigma_p}$$
To select the optimal portfolio according to Roy’s criterion, follow these steps:
Calculate each portfolio’s SF-Ratio.
Choose the portfolio with the highest SF-Ratio.
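The two steps can be sketched as follows. The inputs are hypothetical and chosen only for illustration: portfolio B uses an expected return of 15, a target return of 8 and a standard deviation of 10 (all in percent), and portfolio A's figures are invented for this sketch. The function name `safety_first_ratio` is also an assumption.

```python
def safety_first_ratio(expected_return, target_return, std_dev):
    """SF Ratio = [E(Rp) - RL] / sigma_p."""
    return (expected_return - target_return) / std_dev

target = 8.0  # minimum acceptable return RL, in percent

# Hypothetical portfolios: name -> (expected return %, standard deviation %).
portfolios = {"A": (12.0, 5.0), "B": (15.0, 10.0)}

# Step 1: calculate each portfolio's SF Ratio.
ratios = {name: safety_first_ratio(er, target, sd)
          for name, (er, sd) in portfolios.items()}

# Step 2: choose the portfolio with the highest SF Ratio.
best = max(ratios, key=ratios.get)
print(ratios, best)
```

With these illustrative inputs, A has the higher ratio even though B has the higher expected return, because A's return exceeds the target by more standard deviations.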
LO.l: Explain the relationship between normal and lognormal distributions and why
the lognormal distribution is used to model asset prices.
If x is a normally distributed random variable, then $e^x$ follows a lognormal distribution.
Equivalently, a variable is lognormally distributed if its natural logarithm is normally
distributed.
A lognormal distribution is often used to model asset prices because asset prices cannot be
negative, and a lognormal random variable is bounded below by zero.
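A small simulation illustrates why exponentiating a normal variable suits price modeling. This is a sketch only: the seed, the sample size and the standard normal parameters are arbitrary assumptions, not values from the notes.

```python
import random
from math import exp, log

rng = random.Random(0)   # fixed seed so the run is repeatable

# Draws from a normal distribution can be negative.
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Exponentiating each draw gives lognormal values, which are always positive.
lognormal_draws = [exp(x) for x in normal_draws]

print(min(normal_draws) < 0)      # the normal sample includes negative values
print(min(lognormal_draws) > 0)   # the lognormal sample does not

# Taking logs recovers the original normal draws.
recovered = [log(v) for v in lognormal_draws]
```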
LO.m: Calculate and interpret a continuously compounded rate of return, given a
specific holding period return.
For continuous compounding, the EAR is given by: $\text{EAR} = e^r - 1$.
If we are given the holding period return over any time period, we can calculate the
equivalent continuously compounded rate of return for that period as: r = ln(HPR +1)
LO.n: Describe the properties of the Student’s t-distribution, and calculate and
interpret its degrees of freedom.
The properties of a Student’s t-distribution are:
It is symmetrical, bell-shaped and similar to a normal distribution.
It has a lower peak and fatter tails as compared to a normal distribution.
It is defined by a single parameter, degrees of freedom (df) = n – 1, where n is the sample size.