Chapter 3
Common Distributions
UNSW Sydney, Open Learning

Outline:
Common Discrete Distributions
Common Continuous Distributions
Supplementary Material
In the previous chapters, we saw that random variables may be characterised by probability mass functions (discrete case) and probability density functions (continuous case).
3.1 Bernoulli Distribution
The Bernoulli distribution is very important in statistics because it can be used to model the response to any Bernoulli trial.

Definition
A Bernoulli trial is an experiment with exactly two possible outcomes. The outcomes are often labelled success and failure.
Example
Coin tossing is a Bernoulli trial. We may define, for example,
success = heads
failure = tails.
In a Bernoulli trial there are only two possible responses. Such trials are modelled using the Bernoulli distribution.

Definition
For a Bernoulli trial, we define the random variable X as
X = 1 if the trial results in a success, and X = 0 otherwise.
Result
If X is a Bernoulli random variable defined according to a Bernoulli trial with probability of success p ∈ (0, 1), then the probability mass function of X is
P(X = x) = p^x (1 − p)^(1−x),   x = 0, 1.

Definition
A constant such as p in a probability mass function is called a parameter.
Derivation:
P(X = 0) = p^0 (1 − p)^1 = 1 − p
P(X = 1) = p^1 (1 − p)^0 = p.

A common abbreviation is X ∼ Bernoulli(p).

Notation
The symbol ∼ is commonly used in Statistics for the phrase "is distributed as".
Example
Consider tossing a coin. If the coin is fair, we have p = 1/2 and
P(X = x) = (1/2)^x (1 − 1/2)^(1−x) = 1/2,   x = 0, 1.
This is consistent with the notion that heads and tails are equally likely for a fair coin.
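As a quick sketch in R (the value of p below is illustrative), a Bernoulli(p) variable can be handled as a binomial with size 1:

# Bernoulli(p) as the special case Bin(1, p)
p <- 0.5                          # illustrative probability of success
dbinom(0:1, size = 1, prob = p)   # P(X = 0) and P(X = 1)
rbinom(10, size = 1, prob = p)    # ten simulated Bernoulli trials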
3.2 Binomial Distribution
The Binomial distribution arises when several independent Bernoulli trials are performed in succession. Hence, the Bernoulli distribution is a special case of the Binomial distribution.

Definition
Consider a sequence of n independent Bernoulli trials, each with probability of success p. If X is the number of successes among the n trials, then X has a binomial distribution with parameters n and p, written X ∼ Bin(n, p).

For example, the mathematical expression
X ∼ Bin(29, 0.72)
says that X has a binomial distribution based on n = 29 trials, each with probability of success 0.72.
Example
Write down a distribution that could be used to model X.
1. The number of patients who survive a new type of surgery, out of 12 patients who each have a 20% chance of surviving.
Solution:
Here, X is the number of patients who survive the new type of surgery, with parameters n = 12 and probability of success p = 0.20. That is,
X ∼ Bin(12, 0.20).
Example
Write down a distribution that could be used to model X.
2. The number of patients who visited a doctor and were sick, out of 36 patients, assuming that, in general, 70% of people who visit the doctor are sick.
Solution:
Here, X is the number of the 36 patients who were sick, with parameters n = 36 and probability of success (being sick) p = 0.70. That is,
X ∼ Bin(36, 0.70).
Example
Write down a distribution that could be used to model X.
3. The number of plants that are flowering, out of 52 randomly selected plants in the wild, when the proportion of plants in the wild flowering at the time is p.
Solution:
Here, X is the number of flowering plants, with parameters n = 52 and probability of success (flowering) p. That is,
X ∼ Bin(52, p).
Example
Write down a distribution that could be used to model X.
4. The number of plasma screen televisions a store sells in a month, out of the store's total stock of 18, each with probability 0.6 of being sold.
Solution:
Here, X is the number of plasma screen televisions the store sells in the month, with parameters n = 18 and probability of success (being sold) p = 0.60. That is,
X ∼ Bin(18, 0.60).
Example
Write down a distribution that could be used to model X.
5. The number of plasma screen televisions returned to the store due to faults, out of nine televisions sold, when the return rate for this type of television is 8%.
Solution:
Here, X is the number of televisions returned due to faults, with parameters n = 9 and probability of success (being returned) p = 0.08. That is,
X ∼ Bin(9, 0.08).
In each example, we required the assumption of independence of responses across the n units in order to use the binomial distribution.
Result
If X ∼ Bin(n, p), then its probability mass function is given by
P(X = x) = C(n, x) p^x (1 − p)^(n−x),   x = 0, 1, . . . , n,   0 < p < 1,
where C(n, x) = n!/(x!(n − x)!) is the binomial coefficient.
This result follows from the fact that there are C(n, x) ways in which x successes can be arranged among the n trials, each arrangement having probability p^x (1 − p)^(n−x).
Results
If X ∼ Bin(n, p), then
1. E(X) = np (see Supplementary Material)
2. Var(X) = np(1 − p)
3. mX(u) = (p e^u + 1 − p)^n (see Supplementary Material).
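As a quick numerical check in R (the parameter values below are illustrative, not from the notes), the pmf sums to one and reproduces E(X) = np and Var(X) = np(1 − p):

# Check of the Bin(n, p) results
n <- 12; p <- 0.20
x <- 0:n
pmf <- dbinom(x, size = n, prob = p)   # P(X = x) = C(n, x) p^x (1 - p)^(n - x)
sum(pmf)                               # total probability: 1
sum(x * pmf)                           # E(X) = np = 2.4
sum((x - n * p)^2 * pmf)               # Var(X) = np(1 - p) = 1.92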
Example
Adam pushes ten pieces of toast off a table. Seven of these landed butter side down.
What distribution could be used to model the number of slices of toast that land butter side down? Assume a 50:50 chance of each slice landing butter side down.
Solution:
Let X be the number of slices of toast that land butter side down. Then X is a binomial random variable with parameters n = 10 and probability of success (a slice landing butter side down) p = 0.5; that is, X ∼ Bin(10, 1/2).
Example
Adam pushes ten pieces of toast off a table. Seven of these landed butter side down.
What is the expected number of pieces of toast that land butter side down, and what is the standard deviation?
Solution:
We know that the expected value of a binomial random variable is
E(X) = np = 10 × 1/2 = 5
and the standard deviation is
sqrt(Var(X)) = sqrt(np(1 − p)) = sqrt(10 × 1/2 × (1 − 1/2)) ≈ 1.58.
Example
Adam pushes ten pieces of toast off a table. Seven of these landed butter side down.
What is the probability that exactly seven slices land butter side down?
Solution:
Here we want the probability P(X = 7). That is,
P(X = 7) = C(10, 7) (1/2)^7 (1 − 1/2)^(10−7) = 120 × (1/2)^10 ≈ 0.1172.
Example
Adam pushes ten pieces of toast off a table. Seven of these landed butter side down.
What is the probability that at least seven slices land butter side down?
Solution:
Here we want the probability P(X ≥ 7). That is,
P(X ≥ 7) = [C(10, 7) + C(10, 8) + C(10, 9) + C(10, 10)] × (1/2)^10 ≈ 0.1719.
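The toast probabilities above can be checked in R; this is only a verification sketch of the hand calculations:

# Toast example, X ~ Bin(10, 0.5)
dbinom(7, size = 10, prob = 0.5)                      # P(X = 7), about 0.1172
sum(dbinom(7:10, size = 10, prob = 0.5))              # P(X >= 7), about 0.1719
pbinom(6, size = 10, prob = 0.5, lower.tail = FALSE)  # same tail probability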
As mentioned earlier, the Binomial distribution generalises the Bernoulli distribution.
Result
X has a Bernoulli distribution with parameter p if and only if X ∼ Bin(1, p).
3.3 Geometric Distribution
The Geometric distribution arises when a Bernoulli trial is repeated until the first success. In this case, X, the number of trials up to and including the first success, has a geometric distribution with parameter p.
Results
If X has a geometric distribution with parameter p ∈ (0, 1), then
1. X has probability mass function P(X = x) = (1 − p)^(x−1) p,   x = 1, 2, 3, . . .
2. E(X) = 1/p
3. Var(X) = (1 − p)/p^2.
The above form of the geometric distribution is used for modelling the number of trials up to and including the first success.
By contrast, the following form of the geometric distribution is used for modelling Y, the number of failures before the first success:
P(Y = x) = P(X = x + 1) = (1 − p)^x p,   x = 0, 1, 2, . . .
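The two conventions can be compared in R; note that R's dgeom() uses the failures-before-the-first-success form, so the trial-count form needs an offset (p = 0.3 is illustrative):

# Geometric pmf under both conventions
p <- 0.3
x <- 1:5
(1 - p)^(x - 1) * p       # P(X = x): trials up to and including the first success
dgeom(x - 1, prob = p)    # same values via R's failures-based convention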
Example
Suppose we repeatedly flip a coin with probability p of coming up heads and probability 1 − p of coming up tails, where 0 < p < 1. Then the number of flips up to and including the first head has a geometric distribution with parameter p.
3.4 Hypergeometric Distribution
Hypergeometric random variables arise when counting the number of binary (success/failure or yes/no) responses when objects are sampled without replacement from a finite population in which the total number of successes is known.
If X is the number of successes in a sample of n objects drawn without replacement from a population of N objects, m of which are successes, we write
X ∼ hypergeometric(N, m, n).
Results
If X has a hypergeometric distribution with parameters N, m and n, then
1. its probability mass function is given by
P(X = x) = C(m, x) C(N − m, n − x) / C(N, n),   max(0, n + m − N) ≤ x ≤ min(m, n)
2. E(X) = n (m/N)
3. Var(X) = n (m/N) (1 − m/N) (N − n)/(N − 1).
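A short R sketch of these results (parameter values illustrative); in dhyper(x, m, n, k), m is the number of successes in the population, n the number of failures, and k the sample size:

# Hypergeometric pmf and mean
N <- 18; m <- 5; samp <- 6
x <- max(0, samp + m - N):min(m, samp)
pmf <- dhyper(x, m = m, n = N - m, k = samp)
sum(pmf)        # total probability: 1
sum(x * pmf)    # E(X) = samp * m / N = 6 * 5 / 18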
The hypergeometric distribution can be considered a finite-population version of the binomial distribution.
It can be shown that as N gets large, a hypergeometric distribution with parameters N, m and n approaches the distribution of the random variable Y ∼ Bin(n, m/N).
A suggestion of this can be seen in the fact that E(X) = n (m/N) is equal to its binomial counterpart E(Y), where Y ∼ Bin(n, m/N).
Also, Var(X) = n (m/N) (1 − m/N) (N − n)/(N − 1) only differs from its binomial counterpart by the finite population correction factor (N − n)/(N − 1), which tends to one as N gets large.
Example
A box contains balls numbered from one to eighty. You may select ten numbers from the integers one to eighty. Twenty balls are specified randomly (think of these as red balls). You win the major prize if all ten of your numbers are on red balls. What is the probability of winning the major prize?
Solution:
This hypergeometric distribution has parameters N = 80, n = 10 and m = 20. That is, let X be the number of your selected numbers that are on red balls. Then
P(X = x) = C(20, x) C(60, 10 − x) / C(80, 10)
and
P(win major prize) = P(X = 10) = C(20, 10) C(60, 0) / C(80, 10) = C(20, 10) / C(80, 10) ≈ 1.12 × 10^(−7).
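The winning probability can be confirmed in R (a check of the calculation above):

# Lottery example, X ~ hypergeometric(80, 20, 10)
choose(20, 10) / choose(80, 10)      # direct computation, about 1.12e-07
dhyper(10, m = 20, n = 60, k = 10)   # same value via dhyper()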
Example
Write down a distribution that could be used to model X.
1. The number of patients a town doctor sees who are sick: 800 people want to see a doctor, 500 of these are sick, and the doctor only has time to see 32 of the 800 people (who are selected effectively at random from those who want to see the doctor).
Solution:
Let X be the number of sick patients the doctor sees. Here we have N = 800, m = 500 and n = 32. That is,
X ∼ hypergeometric(800, 500, 32).
Example
Write down a distribution that could be used to model X.
2. The number of plants that are flowering, out of 52 randomly selected plants in the wild, from a population of 800 plants, of which 650 are flowering.
Solution:
Let X be the number of plants that are flowering, with N = 800, m = 650 and n = 52. That is,
X ∼ hypergeometric(800, 650, 52).
Example
Write down a distribution that could be used to model X.
3. The number of faulty plasma screen televisions returned to a store, when five of the store's eighteen televisions were faulty, and six of the eighteen were sold.
Solution:
Let X be the number of faulty plasma screen televisions returned to the store, with N = 18, m = 5 and n = 6. That is,
X ∼ hypergeometric(18, 5, 6).
3.5 Poisson Distribution
The Poisson distribution often arises when the random
variable of interest is a count.
Definition
The random variable X has a Poisson distribution with parameter λ > 0 if its probability mass function is given by
P(X = x) = e^(−λ) λ^x / x!,   x = 0, 1, 2, 3, . . .
A common abbreviation is X ∼ Poisson(λ).
Results
If X ∼ Poisson(λ), then
1. E(X) = λ
2. Var(X) = λ
3. mX(u) = e^(λ(e^u − 1)).
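A quick R check of these results (λ = 1.4 is illustrative; the sum is truncated at a point where the remaining probability is negligible):

# Poisson mean and variance
lambda <- 1.4
x <- 0:50
pmf <- dpois(x, lambda)
sum(x * pmf)                 # E(X) = lambda
sum((x - lambda)^2 * pmf)    # Var(X) = lambda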
Example
Suggest a distribution that could be useful for studying X.
1. The number of workplace accidents in a month, when the average number of accidents per month is 1.4.
Solution:
Let X be the number of workplace accidents per month, with parameter λ = 1.4. That is,
X ∼ Poisson(1.4).
Example
Suggest a distribution that could be useful for studying X.
2. The number of people calling a helpline per day.
Solution:
Let X be the number of people calling the helpline per day, with parameter λ equal to the average number of people calling the helpline daily. That is,
X ∼ Poisson(λ).
Example
Suggest a distribution that could be useful for studying X.
3. The number of ATM customers overnight, while the bank is closed, when the average number is 5.6.
Solution:
Let X be the number of ATM customers overnight while the bank is closed, with parameter λ = 5.6. That is,
X ∼ Poisson(5.6).
Example
4. If, on average, five servers go offline during the day, what is the chance that no more than one server will go offline? Assume that servers go offline independently.
Solution:
Let X be the number of servers that go offline during the day, with parameter λ = 5. That is, X ∼ Poisson(5). We want
P(no more than one server goes offline during the day)
= P(X ≤ 1)
= P(X = 0) + P(X = 1)
= e^(−5) 5^0 / 0! + e^(−5) 5^1 / 1! = e^(−5) + 5 e^(−5) = 6 e^(−5) ≈ 0.0404.
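This can be checked in R:

# Server example, X ~ Poisson(5)
ppois(1, lambda = 5)    # P(X <= 1), about 0.0404
6 * exp(-5)             # same value, matching the hand calculation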
Common Continuous Distributions
So far, we have considered discrete random variables X for which P(X = x) > 0 for certain values of x.
Definition
A random variable X is continuous if P(X = x) = 0 for all x ∈ R.
3.6 Uniform Distribution
The uniform distribution is the simplest common distribution for continuous random variables.
Definition
A continuous random variable X that can take values in the interval (a, b) with equal probability is said to have a uniform distribution on (a, b), and has probability density function given by
fX(x; a, b) = 1/(b − a),   a < x < b;  a < b.
A common abbreviation is X ∼ Uniform(a, b).
Note that fX(x; a, b) = 1/(b − a), a < x < b, is simply a constant function over the interval (a, b) and zero otherwise. That is,
fX(x; a, b) = 1/(b − a) for a < x < b (with a < b), and fX(x; a, b) = 0 otherwise.
Graphs of Different Uniform Density Functions
Results
If X ∼ Uniform(a, b), then
1. E(X) = (a + b)/2
2. Var(X) = (b − a)^2 / 12
3. mX(u) = (e^(bu) − e^(au)) / ((b − a) u).
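A numerical sketch of the mean and variance results (a = 2, b = 5 are illustrative):

# Uniform(a, b) mean and variance by numerical integration
a <- 2; b <- 5
integrate(function(x) x * dunif(x, a, b), a, b)$value            # E(X) = (a + b)/2 = 3.5
integrate(function(x) (x - 3.5)^2 * dunif(x, a, b), a, b)$value  # Var(X) = (b - a)^2/12 = 0.75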
Note that there is also a discrete version of the uniform distribution, which is useful for modelling an experiment with k equally likely outcomes, such as the roll of a die.
3.7 Exponential Distribution
The Exponential distribution is the simplest common distribution for describing the probability structure of continuous positive random variables, such as lifetimes.
Definition
A random variable X is said to have an exponential distribution with parameter β > 0 if its probability density function is given by
fX(x; β) = (1/β) e^(−x/β),   x > 0.
A common abbreviation is X ∼ exp(β).
Graphs of Various Exponentials
The cumulative distribution function of X is given by
FX(x; β) = P(X ≤ x) = ∫_0^x (1/β) e^(−y/β) dy = 1 − e^(−x/β),   x > 0.
Results
If X has an exponential distribution with parameter β > 0, then
1. E(X) = β
2. Var(X) = β^2
3. mX(u) = 1/(1 − βu),   u < 1/β.
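A short R check of the cdf formula (β = 2 and x = 1.5 are illustrative); note that R's pexp() uses the rate parameterisation, rate = 1/β:

# Exponential cdf
beta <- 2; x <- 1.5
pexp(x, rate = 1 / beta)   # F(x)
1 - exp(-x / beta)         # same value from 1 - e^(-x/beta)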
The exponential distribution is closely related to the Poisson distribution: if events occur at an average rate of λ per unit time (so that the number of events in a unit of time is Poisson(λ)), then the waiting time between events has an exponential distribution with mean β = 1/λ.
Example
If, on average, five servers go offline during the day, what is the chance that no servers will go offline in the next hour?
(Hint: note that an hour is 1/24 of a day.)
Solution:
Let T be the time (in days) until the next server goes offline, with parameter β = 1/λ = 1/5, where λ = 5 servers per day. Then T ∼ exp(1/5) and
P(no servers go offline in the next hour) = P(T > 1/24) = e^(−(1/24)/(1/5)) = e^(−5/24) ≈ 0.8119.
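A check of this calculation in R:

# Server waiting-time example, T ~ exp(1/5) in days
beta <- 1 / 5
pexp(1 / 24, rate = 1 / beta, lower.tail = FALSE)   # P(T > 1/24), about 0.8119
exp(-(1 / 24) / beta)                               # same value by the cdf formula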
An important property of the exponential distribution is lack of memory, or the memoryless property: if X ∼ exp(β), then P(X > s + t | X > s) = P(X > t) for all s, t > 0.
3.8 Special Functions Arising in Statistics
There are three more special distributions that we consider:
Gamma distribution
Normal distribution
Beta distribution
Before defining them, we first need some special functions that arise in statistics.
3.8.1 Gamma Function
The Gamma function is essentially an extension of the factorial function (e.g. 4! = 4 × 3 × 2 × 1 = 24) to the real numbers.
Definition
The Gamma function at x > 0 is given by
Γ(x) = ∫_0^∞ t^(x−1) e^(−t) dt.
Results
Some basic results for the Gamma function are
1. Γ(x) = (x − 1) Γ(x − 1)
2. Γ(n) = (n − 1)!,   n = 1, 2, 3, . . .
3. Γ(1/2) = √π
4. If m is a non-negative integer, then Γ(m + 1) = ∫_0^∞ x^m e^(−x) dx = m!
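These results are easy to check in R (the non-integer argument 4.7 is illustrative):

# Basic Gamma function checks
gamma(5)                  # Gamma(5) = 4! = 24
gamma(0.5)^2              # Gamma(1/2)^2 = pi
gamma(4.7) / gamma(3.7)   # Gamma(x)/Gamma(x - 1) = x - 1 = 3.7
factorial(6)              # m! for the integer case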
Example
Suppose X has probability density function
fX(x) = (1/120) x^5 e^(−x),   x > 0.
Find E(X) using the Gamma function.
Solution:
E(X) = ∫_{−∞}^{∞} x fX(x) dx   (definition)
     = ∫_0^∞ x (1/120) x^5 e^(−x) dx
     = (1/120) ∫_0^∞ x^6 e^(−x) dx
     = (1/120) Γ(7)   (definition of the Gamma function)
     = (1/120) 6! = (6 × 5 × 4 × 3 × 2 × 1)/120 = 6.
3.8.2 Beta Function
Definition
The Beta function at x, y > 0 is given by
B(x, y) = ∫_0^1 t^(x−1) (1 − t)^(y−1) dt.
Result
For all x, y > 0,
B(x, y) = Γ(x) Γ(y) / Γ(x + y).
Example
Suppose X has a probability density function on (0, 1) for which the required expectation reduces, via the Beta function, to
168 B(5, 6) = 168 Γ(5) Γ(6) / Γ(11) = 168 × 4! 5! / 10! = 2/15.
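The Beta–Gamma identity, and the quantity appearing in the example, can be checked in R:

# Beta function checks
beta(5, 6)                        # built-in Beta function B(5, 6)
gamma(5) * gamma(6) / gamma(11)   # same value via Gamma functions
168 * beta(5, 6)                  # the value in the example, 2/15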
3.8.3 The Digamma and Trigamma Functions
The digamma and trigamma functions will be required later, but we give their definitions here since they are closely related to the Gamma function.
Definition
For x > 0,
digamma(x) = d/dx ln{Γ(x)}
trigamma(x) = d²/dx² ln{Γ(x)}.
Unlike other mathematical functions such as sin(x), tan^(−1)(x) and log10(x), the digamma and trigamma functions are unavailable on ordinary hand-held calculators, but they are available in R.
Graphs of Digamma, Trigamma, and Other Gamma Functions (see the R documentation in the Supplementary Material)
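In R the functions are available directly (x = 3 is illustrative):

# Digamma and trigamma in R
digamma(3)               # d/dx ln Gamma(x) at x = 3, about 0.9228
trigamma(3)              # d^2/dx^2 ln Gamma(x) at x = 3, about 0.3949
psigamma(3, deriv = 1)   # same as trigamma(3)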
3.8.4 The Φ Function and its Inverse
The final special function we will consider is routinely denoted by the capital phi symbol, Φ. It is defined as follows.
Definition
For all x ∈ R,
Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^(−t²/2) dt.
Results
1. lim_{x→−∞} Φ(x) = 0
2. lim_{x→∞} Φ(x) = 1
3. Φ(0) = 1/2
4. Φ is a monotonically increasing function over R.
The previous result shows that the inverse of Φ, denoted by Φ^(−1), is well defined for all 0 < x < 1.
Examples are
Φ^(−1)(1/2) = 0 and Φ^(−1)(0.975) = 1.95996… ≈ 1.96.
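In R, Φ and Φ^(−1) are pnorm() and qnorm() for the standard normal:

# Phi and its inverse
pnorm(0)       # Phi(0) = 0.5
qnorm(0.5)     # Phi^(-1)(1/2) = 0
qnorm(0.975)   # Phi^(-1)(0.975), about 1.959964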
3.9 Normal Distribution
A particularly important family of continuous random variables is the family of those following the normal distribution.
Definition
The random variable X is said to have a normal distribution with parameters µ, −∞ < µ < ∞, and σ² > 0 if X has probability density function
fX(x; µ, σ) = (1/(σ√(2π))) e^(−(x−µ)²/(2σ²)),   −∞ < x < ∞.
A common abbreviation is X ∼ N(µ, σ²).
Normal density functions are bell-shaped curves, symmetric about µ. The following figures show four different normal density functions.
The normal distribution is important because of the
central limit theorem, which will be discussed in
Chapter 6.
Results
If X ∼ N(µ, σ²), then
1. E(X) = µ
2. Var(X) = σ²
3. mX(u) = e^(µu + σ²u²/2).
The special case of µ = 0 and σ² = 1 is known as the standard normal distribution. A standard normal random variable is commonly denoted by Z.
3.9.1 Computing Normal Distribution Probabilities
Consider the problem
P(Z ≤ 0.47) = (1/√(2π)) ∫_{−∞}^{0.47} e^(−x²/2) dx = Φ(0.47).
This integral has no closed-form solution, but it can be evaluated numerically, for example with the R function pnorm():
> pnorm(0.47)
[1] 0.6808225
That is,
P(Z ≤ 0.47) ≈ 0.6808.
Some other examples are
P (Z ≤ 1) = Φ(1) ≈ 0.8413 and P (Z ≤ 1.54) = Φ(1.54) ≈ 0.9382.
How about probabilities concerning non-standard normal random variables?
Result
If X ∼ N(µ, σ²), then
Z = (X − µ)/σ ∼ N(0, 1).
Example
Find P(X ≤ 12), where X ∼ N(10, 9).
Solution:
P(X ≤ 12) = P((X − 10)/3 ≤ (12 − 10)/3)
          = P(Z ≤ 0.67), where Z ∼ N(0, 1)
          = Φ(0.67) ≈ 0.7486.
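A quick R check (note that pnorm() takes the standard deviation, not the variance):

# P(X <= 12) for X ~ N(10, 9), so sd = 3
pnorm(12, mean = 10, sd = 3)   # about 0.7475 using the exact z = 2/3
pnorm(0.67)                    # about 0.7486 using the rounded z-value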
Example
Young men's heights are approximately normally distributed, with a mean of 174 cm and a standard deviation of 6.4 cm.
1. What percentage of these men are taller than six feet (i.e. 182.9 cm)?
2. What is the chance that a randomly selected young man is 170-something cm tall?
3. Find a range of heights that contains 95% of young men.
Solution:
Let X be the height of a young man; we are given that X ∼ N(174, 6.4²).
1. We want P(X > 182.9). That is,
P(X > 182.9) = P((X − 174)/6.4 > (182.9 − 174)/6.4)
             = P(Z > 1.3906)
             = 1 − P(Z ≤ 1.3906) ≈ 0.0822.
2. We want P(170 < X < 180). That is,
P(170 < X < 180) = P((170 − 174)/6.4 < (X − 174)/6.4 < (180 − 174)/6.4)
                 = P(−0.625 < Z < 0.9375)
                 = Φ(0.9375) − Φ(−0.625) ≈ 0.5598.
3. We want to find c such that
0.95 = P(|X − 174|/6.4 < c) = P(|Z| < c)
⟹ c = Φ^(−1)(0.975) = 1.96
⟹ Range: 174 ± 1.96 × 6.4 = (161.456, 186.544).
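All three parts can be checked in R:

# Height example, X ~ N(174, 6.4^2)
pnorm(182.9, mean = 174, sd = 6.4, lower.tail = FALSE)   # part 1: about 0.0822
pnorm(180, 174, 6.4) - pnorm(170, 174, 6.4)              # part 2: about 0.5598
174 + c(-1, 1) * qnorm(0.975) * 6.4                      # part 3: about (161.5, 186.5)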
3.10 Gamma Distribution
Definition
A random variable X is said to have a Gamma distribution with parameters α > 0 (shape parameter) and β > 0 (scale parameter) if X has probability density function
fX(x; α, β) = e^(−x/β) x^(α−1) / (Γ(α) β^α),   x > 0.
A common abbreviation is X ∼ Gamma(α, β).
Gamma density functions are skewed curves on the positive half-line.
Graphs of Various Gamma Density Functions
Results
If X ∼ Gamma(α, β), then
1. E(X) = αβ
2. Var(X) = αβ²
3. mX(u) = (1/(1 − βu))^α,   u < 1/β.
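A numerical sketch of the mean and variance (α = 3, β = 2 are illustrative); note that R's dgamma() calls β the scale argument:

# Gamma(alpha, beta) mean and variance by numerical integration
alpha <- 3; beta <- 2
integrate(function(x) x * dgamma(x, shape = alpha, scale = beta), 0, Inf)$value
# E(X) = alpha * beta = 6
integrate(function(x) (x - 6)^2 * dgamma(x, shape = alpha, scale = beta), 0, Inf)$value
# Var(X) = alpha * beta^2 = 12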
As mentioned previously, the Gamma distribution generalises the Exponential distribution.
Result
X has an exponential distribution (i.e. X ∼ exp(β)) if and only if X ∼ Gamma(1, β).
3.11 Beta Distribution
For completeness, we conclude the chapter with the Beta
Distribution.
Definition
A random variable X is said to have a Beta distribution with parameters α > 0 and β > 0 if its probability density function is
fX(x; α, β) = x^(α−1) (1 − x)^(β−1) / B(α, β),   0 < x < 1.
Results
If X has a Beta distribution with parameters α > 0 and β > 0, then
1. E(X) = α/(α + β)
2. Var(X) = αβ / ((α + β + 1)(α + β)²).
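A numerical sketch of these results (α = 3, β = 6 are illustrative):

# Beta(alpha, beta) mean and variance by numerical integration
a <- 3; b <- 6
m <- integrate(function(x) x * dbeta(x, a, b), 0, 1)$value      # E(X) = a/(a + b) = 1/3
integrate(function(x) (x - m)^2 * dbeta(x, a, b), 0, 1)$value   # Var(X) = a*b/((a+b+1)(a+b)^2), about 0.0222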
Supplementary Material
Supplementary Material: Binomial Mean
E(X) = Σ_{x=0}^{n} x C(n, x) p^x (1 − p)^(n−x)
     = Σ_{x=0}^{n} x [n!/(x!(n − x)!)] p^x (1 − p)^(n−x)
     = Σ_{x=1}^{n} [n!/((x − 1)!(n − x)!)] p^x (1 − p)^(n−x)
(since the x = 0 term vanishes).
Supplementary Material: Binomial Mean (continued)
Substituting y = x − 1 and m = n − 1,
E(X) = Σ_{y=0}^{m} [(m + 1)!/(y!(m − y)!)] p^(y+1) (1 − p)^(m−y)
     = (m + 1) p Σ_{y=0}^{m} [m!/(y!(m − y)!)] p^y (1 − p)^(m−y)
     = np Σ_{y=0}^{m} [m!/(y!(m − y)!)] p^y (1 − p)^(m−y).
The remaining sum is the total probability of a Bin(m, p) distribution, which equals 1 (by the Binomial Theorem below). Hence
E(X) = np.
Supplementary Material: Binomial Moment Generating Function
mX(u) = E(e^(uX)) = Σ_{x=0}^{n} e^(ux) C(n, x) p^x (1 − p)^(n−x)
      = Σ_{x=0}^{n} C(n, x) (p e^u)^x (1 − p)^(n−x)
      = (p e^u + 1 − p)^n,
by the Binomial Theorem below with a = p e^u and b = 1 − p.
Supplementary Material: Binomial Theorem
Binomial Theorem
Let a and b be constants. Then
(a + b)^m = Σ_{y=0}^{m} C(m, y) a^y b^(m−y) = Σ_{y=0}^{m} [m!/(y!(m − y)!)] a^y b^(m−y).
Supplementary Material - R Documentation on Digamma and other Gamma Functions
Description
Special mathematical functions related to the beta and gamma functions.
Usage
beta(a, b)
lbeta(a, b)
gamma(x)
lgamma(x)
psigamma(x, deriv = 0)
digamma(x)
trigamma(x)
Arguments
a, b non-negative numeric vectors.
x, n numeric vectors.
k, deriv integer vectors.
The functions beta and lbeta return the beta function and the natural logarithm of the beta function,
B(a, b) = Γ(a) Γ(b) / Γ(a + b).
The functions gamma and lgamma return Γ(x) and the natural logarithm of the absolute value of the gamma function. The gamma function (see Abramowitz and Stegun, Section 6.2.1, page 255) is defined by
Γ(x) = ∫_0^∞ t^(x−1) e^(−t) dt,
for all real x except zero and negative integers (when NaN is returned).
Supplementary Material - R Documentation on Digamma and other Gamma Functions (continued)
The functions digamma and trigamma return the first and second derivatives of the logarithm of the gamma function. psigamma(x, deriv) (deriv >= 0) computes the deriv-th derivative of Ψ(x).
That is,
digamma(x) = Ψ(x) = d/dx ln Γ(x) = Γ′(x)/Γ(x),
where Ψ and its derivatives, the psigamma() functions, are often called the polygamma functions (see Abramowitz and Stegun, page 260).
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (For gamma and lgamma.)