
School of Mathematics and Statistics

UNSW Sydney

Introduction to Probability and Stochastic Processes

OPEN LEARNING
CHAPTER 3

Common Distributions

Outline:

Common Discrete Distributions

3.1 Bernoulli Distribution
3.2 Binomial Distribution
3.3 Geometric Distribution
3.4 Hypergeometric Distribution
3.5 Poisson Distribution
Common Continuous Distributions

3.6 Uniform Distribution
3.7 Exponential Distribution
3.8 Special Functions Arising in Statistics
  ① Gamma Function
  ② Beta Function
  ③ The Digamma and Trigamma Functions
  ④ The Φ Function and its Inverse
3.9 Normal Distribution
3.10 Gamma Distribution
3.11 Beta Distribution

Supplementary Material
In the previous chapters, we saw that random variables
may be characterised by probability mass functions
(discrete case) and probability density functions
(continuous case).

Any non-negative function that sums to one is a legal
probability mass function.

Any non-negative function that integrates to one is a legal
probability density function.

Certain families of probability mass functions and
probability density functions are particularly useful in
Statistics.

This chapter covers the most common distributions.
3.1 Bernoulli Distribution

The Bernoulli distribution is very important to statistics
because it can be used to model responses to any Bernoulli
trial.

Definition
A Bernoulli trial is an experiment with two possible
outcomes. The outcomes are often labelled success or
failure.

Example
Coin tossing is a Bernoulli trial. We may define, for
example,
success = heads
failure = tails.

Some other examples of Bernoulli trials are

dead or alive
sick or not sick
flowering or not flowering
sold or not sold
faulty or not faulty.
In each of these Bernoulli trials, there are only two possible
responses. Such trials are modelled using the Bernoulli
distribution.

Definition
For a Bernoulli trial, we define the random variable X as

X = 1 if the trial results in a success, and X = 0 otherwise.

Then X is said to have a Bernoulli distribution.
Result
If X is a Bernoulli random variable defined according to a
Bernoulli trial (see the definition above) with a probability
of success p ∈ (0, 1), then the probability mass function of
X is

P(X = x) = p^x (1 − p)^{1−x}, x = 0, 1.

Definition
The constant p in the probability mass function is called a
parameter.
Derivation:

P(X = 0) = p^0 (1 − p)^1 = 1 − p
P(X = 1) = p^1 (1 − p)^0 = p.

We denote a Bernoulli random variable X as

X ∼ Bernoulli(p).

Notation
The symbol ∼ is commonly used in Statistics for the phrase
is distributed as or has distribution.
Example
Consider tossing a coin. If the coin is fair, we have p = 1/2
and

P(X = x) = (1/2)^x (1 − 1/2)^{1−x} = 1/2, x = 0, 1.

This is consistent with the notion that heads and tails are
equally likely for a fair coin.
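A quick numerical illustration of this in R (the software used later in these notes): rbinom() with size = 1 draws Bernoulli variates. This is a minimal simulation sketch added for illustration, not part of the original slides.

set.seed(1)                               # for reproducibility
x <- rbinom(10000, size = 1, prob = 0.5)  # 10,000 Bernoulli(1/2) trials
mean(x)                                   # sample proportion of successes, near 0.5
table(x) / length(x)                      # empirical P(X = 0) and P(X = 1), both near 1/2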
3.2 Binomial Distribution

The Binomial distribution arises when several Bernoulli
trials are repeated in succession. Hence, the Bernoulli
distribution is a special case of a Binomial distribution.

Definition
Consider a sequence of n independent Bernoulli trials, each
with a probability of success p. If

X = total number of successes,

then X is a Binomial random variable with parameters n
and p.

A common shorthand to represent a Binomial random
variable is

X ∼ Bin(n, p).
The mathematical expression

X ∼ Bin(29, 0.72)

is usually read as X has a Binomial distribution with
parameters n = 29 and p = 0.72.

We have a binomial distribution when we count the
number of times we observe a particular successful outcome
across n independent trials.
Example
Write down a distribution that could be used to model X.
1. The number of patients who survive a new type of
surgery out of 12 patients who have a 20% chance of
surviving.

Solution:
Here, X is the number of patients who survive a new type
of surgery with parameters n = 12 and the probability of
success p = 0.20. That is

X ∼ Bin(12, 0.20).

Example
Write down a distribution that could be used to model X.
2. The number of patients who visited a doctor and were
sick out of the 36 patients. Assume that, in general,
70% of the people who visit the doctor are sick.

Solution:
Here, X is the number of patients who visited the doctors
and were sick with parameters n = 36 and the probability
of success (being sick) p = 0.70. That is

X ∼ Bin(36, 0.70).

Example
Write down a distribution that could be used to model X.
3. The number of plants that are flowering out of 52
randomly selected plants in the wild, when the
proportion of plants in the wild flowering at the time is
p.

Solution:
Here, X is the number of plants flowering in the wild with
parameters n = 52 and the probability of success
(flowering) p. That is

X ∼ Bin(52, p).

Example
Write down a distribution that could be used to model X.
4. The number of plasma screen televisions a store sells in
a month, out of the store’s total stock of 18, each with
a probability of 0.6 of being sold.

Solution:
Here, X is the number of plasma screen televisions a store
sells monthly with parameters n = 18 and the probability
of success (being sold) p = 0.60. That is

X ∼ Bin(18, 0.60).

Example
Write down a distribution that could be used to model X.
5. The number of plasma screen televisions returned to
the store due to faults, out of the nine televisions sold,
when the return rate for this type of television is 8%.

Solution:
Here, X is the number of plasma screen televisions returned
to the store due to faults with parameters n = 9 and the
probability of success (return rate) p = 0.08. That is

X ∼ Bin(9, 0.08).

In each example, we required the assumption of
independence of responses across the n units to use the
binomial distribution.

This assumption is reasonable if we randomly select units
from a population much larger than the sample (as was
done in the plant example).
Result
If X ∼ Bin(n, p), then its probability mass function is given
by

P(X = x) = \binom{n}{x} p^x (1 − p)^{n−x}, x = 0, 1, . . . , n, 0 < p < 1.

This result follows from the fact that there are \binom{n}{x} ways
in which X can take the value x, and each of these ways
has probability p^x (1 − p)^{n−x} of occurring.
Results
If X ∼ Bin(n, p), then
1 E(X) = np  (See Derivation Here)
2 Var(X) = np(1 − p)  (See Derivation Here)
3 m_X(u) = (p e^u + 1 − p)^n  (See Derivation Here)
Example
Adam pushes ten pieces of toast off a table. Seven of these
landed butter side down.
What distribution could be used to model the number
of slices of toast that landed butter side down? Assume
a 50:50 chance of each slice landing butter side down.

Solution:
Let X be the number of slices of toast that landed butter
side down. So X is a binomial random variable with
parameters n = 10 and the probability of success (slice
landing butter side down) p = 0.5, that is, X ∼ Bin(10, 1/2).
Example
Adam pushes ten pieces of toast off a table. Seven of these
landed butter side down.
What is the expected number of pieces of toast that
land butter side down, and what is the standard
deviation?
Solution:
We know that the expected value of a binomial random
variable is

E(X) = np = 10 × 1/2 = 5

and the standard deviation is

√Var(X) = √(np(1 − p)) = √(10 × 1/2 × (1 − 1/2)) ≈ 1.58.
Example
Adam pushes ten pieces of toast off a table. Seven of these
landed butter side down.
What is the probability that exactly seven slices land
butter side down?

Solution:
Here we want the probability P(X = 7). That is,

P(X = 7) = \binom{10}{7} (1/2)^7 (1 − 1/2)^{10−7} = 120 × (1/2)^{10} ≈ 0.1172.
Example
Adam pushes ten pieces of toast off a table. Seven of these
landed butter side down.
What is the probability that at least seven slices land
butter side down?

Solution:
Here we want the probability P(X ≥ 7). That is,

P(X ≥ 7) = [\binom{10}{7} + \binom{10}{8} + \binom{10}{9} + \binom{10}{10}] (1/2)^{10} ≈ 0.1719.
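Both toast probabilities can be verified with R's built-in binomial functions; a small checking sketch (dbinom() and pbinom() are the standard pmf and cdf):

dbinom(7, size = 10, prob = 0.5)          # P(X = 7) = 120 * (1/2)^10, about 0.1172
sum(dbinom(7:10, size = 10, prob = 0.5))  # P(X >= 7), about 0.1719
1 - pbinom(6, size = 10, prob = 0.5)      # same tail probability via the cdf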
As mentioned earlier, the Binomial distribution generalises
the Bernoulli distribution.

Result
X has a Bernoulli distribution with parameter p if and only
if
X ∼ Bin(1, p).

3.3 Geometric Distribution

The Geometric distribution arises when a Bernoulli trial
is repeated until the first success. In this case,

X = the number of trials until the first success,

and X is said to have a geometric distribution with
parameter p, where p is the probability of success on each
trial.

A common shorthand to represent a geometric random
variable is

X ∼ geometric(p).
Results
If X has a geometric distribution with parameter p ∈ (0, 1),
then
1 X has probability mass function

P(X = x) = (1 − p)^{x−1} p, x = 1, 2, 3, . . .

2 E(X) = 1/p
3 Var(X) = (1 − p)/p^2

The above form of the geometric distribution is used for modelling the number of
trials up to and including the first success.

By contrast, the following form of the geometric distribution is used for modelling
Y, the number of failures before the first success:

P(Y = x) = P(X = x + 1) = (1 − p)^x p, x = 0, 1, 2, . . .

In either case, the sequence of probabilities is geometric.
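Note that R's dgeom() uses the second (failures-before-first-success) form, so the trials form needs a shift of one. A small sketch with an assumed p = 0.3:

p <- 0.3
dgeom(2, prob = p)      # P(Y = 2) = (1 - p)^2 p, failures form
dgeom(3 - 1, prob = p)  # P(X = 3) in the trials form, via Y = X - 1
(1 - p)^2 * p           # direct check: both equal 0.147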
Example
Suppose we repeatedly flip a coin with probability p of
coming up heads and probability 1 − p of coming up tails,
where 0 < p < 1.

Let X be the number of flips until the first head appears.

Then, for x = 1, 2, 3, . . . , X = x if and only if the coin
shows exactly x − 1 tails followed by a head.

The probability of this is equal to (1 − p)^{x−1} p.
3.4 Hypergeometric Distribution

Hypergeometric random variables arise when counting the
number of binary (success/failure or yes/no) responses
when objects are sampled without replacement from a finite
population in which the total number of successes is known.

Suppose that a box contains N balls, of which m are red
and N − m are black. Suppose n balls are drawn at random
without replacement. Let

X = the number of red balls drawn.

Then X has a hypergeometric distribution with
parameters N, m and n, and is denoted as

X ∼ hypergeometric(N, m, n).
Results
If X has a hypergeometric distribution with parameters
N, m and n, then
1 the probability mass function is given by

P(X = x) = \binom{m}{x} \binom{N−m}{n−x} / \binom{N}{n}, max(0, n + m − N) ≤ x ≤ min(m, n)

2 E(X) = n · m/N
3 Var(X) = n · (m/N)(1 − m/N) · (N − n)/(N − 1).
The hypergeometric distribution can be considered a finite
population version of the binomial distribution.

Instead of assuming some constant probability p of success
in the population, we say that there are N units in the
population, of which m are successes.
It can be shown that as N gets large (with m/N held fixed),
a hypergeometric distribution with parameters N, m and n
approaches the distribution of Y ∼ Bin(n, m/N).

A suggestion of this can be seen in E(X) = n · m/N, which
equals its binomial counterpart E(Y ) for Y ∼ Bin(n, m/N).

Also, Var(X) = n · (m/N)(1 − m/N) · (N − n)/(N − 1) only
differs from its binomial counterpart by the finite population
correction factor (N − n)/(N − 1), which tends to one as N
gets large.
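A minimal numerical sketch of this limit in R, with the assumed choice m/N = 0.25 and n = 10; the largest pointwise gap between the two pmfs shrinks as N grows:

n <- 10
for (N in c(40, 400, 4000)) {
  m <- N / 4                         # keep m/N = 0.25 fixed
  hyper <- dhyper(0:n, m, N - m, n)  # hypergeometric pmf at 0..n
  binom <- dbinom(0:n, n, m / N)     # Bin(n, m/N) pmf at 0..n
  cat("N =", N, " max pmf difference:", max(abs(hyper - binom)), "\n")
}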
Example
A box contains balls numbered from one to eighty. You may select ten numbers
from the integers one to eighty. Twenty balls are specified randomly (think of
these as red balls). You win the major prize if all ten numbers are on red balls.
What is the probability of winning the major prize?

Solution:
This hypergeometric distribution has parameters N = 80, n = 10 and m = 20.
That is, let X be the number of your selected numbers that are on red balls. Then

P(X = x) = \binom{20}{x} \binom{60}{10−x} / \binom{80}{10}

and

P(win major prize) = P(X = 10) = \binom{20}{10} \binom{60}{0} / \binom{80}{10} = \binom{20}{10} / \binom{80}{10},

which is about 1 in 8.9 million.
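The same value can be checked with R's hypergeometric pmf, whose argument order is dhyper(x, m, N − m, n):

dhyper(10, 20, 60, 10)           # P(X = 10), about 1.12e-07
choose(20, 10) / choose(80, 10)  # direct check of the same value
1 / dhyper(10, 20, 60, 10)       # about 8.9 million to one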
Example
Write down a distribution that could be used to model X.
1. The number of patients a town doctor sees who are
sick: when 800 people want to see a doctor, 500 of
these are sick, and when the doctor only has time to
see 32 of the 800 people (who are selected effectively at
random from those that want to see the doctor).

Solution:
Let X be the number of sick patients a town doctor sees.
Here we have N = 800, m = 500 and n = 32. That is,

X ∼ hypergeometric(800, 500, 32).

Example
Write down a distribution that could be used to model X.
2. The number of plants that are flowering out of 52
randomly selected plants in the wild, from a
population of 800 plants, of which 650 are flowering.

Solution:
Let X be the number of plants that are flowering, with
N = 800, m = 650 and n = 52. That is,

X ∼ hypergeometric(800, 650, 52).
Example
Write down a distribution that could be used to model X.
3. The number of faulty plasma screen televisions
returned to a store, when five of the store’s eighteen
televisions were faulty, and six of the eighteen were
sold.

Solution:
Let X be the number of faulty plasma screen televisions
returned to a store, with N = 18, m = 5 and n = 6. That is,

X ∼ hypergeometric(18, 5, 6).
3.5 Poisson Distribution

The Poisson distribution often arises when the random
variable of interest is a count.

For example, the number of traffic accidents in a city on
any given day could be well-described by a Poisson random
variable.

The Poisson distribution is particularly useful for modelling
the number of times that rare events occur - events that
can reasonably be assumed to follow what is known as the
Poisson process.
Definition
The random variable X has a Poisson distribution with
parameter λ > 0 if its probability mass function is given by

P(X = x) = e^{−λ} λ^x / x!, x = 0, 1, 2, 3, . . .

A common abbreviation is

X ∼ Poisson(λ).
Results
If X ∼ Poisson(λ), then
1 E(X) = λ
2 Var(X) = λ
3 m_X(u) = e^{λ(e^u − 1)}.
Example
Suggest a distribution that could be useful for studying X.
1. The number of workplace accidents in a month when
the average number of accidents is 1.4.

Solution:
Let X be the number of workplace accidents per month,
with parameter λ = 1.4. That is,

X ∼ Poisson(1.4).
Example
Suggest a distribution that could be useful for studying X.
2. The number of people calling a helpline per day.

Solution:
Let X be the number of people calling a helpline per day,
with parameter λ being the average number of people
calling the helpline daily. That is,

X ∼ Poisson(λ).

Example
Suggest a distribution that could be useful for studying X.
3. The number of ATM customers overnight when a bank
is closed when the average number is 5.6.

Solution:
Let X be the number of ATM customers overnight when a
bank is closed with parameter λ = 5.6. That is

X ∼ Poisson(5.6).

Example
4. If, on average, five servers go offline during the day,
what is the chance that no more than one server will go
offline? Assume independence of servers going offline.

Solution:
Let X be the number of servers that go offline during the day,
with parameter λ = 5. That is, X ∼ Poisson(5). We want

P(no more than one server goes offline during the day)
= P(X ≤ 1)
= P(X = 0) + P(X = 1)
= e^{−5} 5^0/0! + e^{−5} 5^1/1! = e^{−5} + 5e^{−5} = 6e^{−5} ≈ 0.0404.
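A check of this value with R's Poisson functions:

dpois(0, lambda = 5) + dpois(1, lambda = 5)  # P(X = 0) + P(X = 1)
ppois(1, lambda = 5)                         # P(X <= 1) from the cdf
6 * exp(-5)                                  # closed form, about 0.0404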
Common Continuous Distributions

So far, we have considered discrete random variables X for
which P(X = x) > 0 for certain values of x.

Now, we wish to consider continuous random variables.

A continuous random variable takes uncountably many
possible values.

That is, a random variable X is continuous if its possible
values comprise either a single interval on the number
line or a union of disjoint intervals.

Continuous random variables are usually measurements
such as height, weight, the time required to walk to school,
etc.
Definition
A random variable X is continuous if

P (X = x) = 0 for all x ∈ R.

3.6 Uniform Distribution

The uniform distribution is the simplest common
distribution for continuous random variables.

Definition
A continuous random variable X that is equally likely to
take values anywhere in the interval (a, b) is said to have a
uniform distribution on (a, b), and has probability
density function given by

f_X(x; a, b) = 1/(b − a), a < x < b; a < b.

A common shorthand to denote a random variable of the
uniform distribution is

X ∼ Uniform(a, b).
Note that f_X(x; a, b) = 1/(b − a), a < x < b (a < b), is simply
a constant function over the interval (a, b) and zero
otherwise. That is,

f_X(x; a, b) = 1/(b − a) for a < x < b (with a < b), and
f_X(x; a, b) = 0 otherwise.
Graphs of Different Uniform Density Functions

Results
If X ∼ Uniform(a, b), then
1 E(X) = (a + b)/2
2 Var(X) = (b − a)^2/12
3 m_X(u) = (e^{bu} − e^{au}) / ((b − a)u).
Note that there is also a discrete version of the uniform
distribution, which is useful for modelling the outcome of an
event with k equally likely outcomes, such as a roll of a die.

This has a different formula for its expectation and
variance than the continuous case, which can be derived
from first principles, as in the sketch below.
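For instance, a first-principles check in R for a fair six-sided die (an assumed example, not from the slides):

x <- 1:6
p <- rep(1 / 6, 6)           # k = 6 equally likely outcomes
EX <- sum(x * p)             # E(X) = (1 + 6)/2 = 3.5
VX <- sum((x - EX)^2 * p)    # Var(X) = (6^2 - 1)/12, about 2.9167
c(EX, VX)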
3.7 Exponential Distribution

The Exponential distribution is the simplest common
distribution for describing the probability structure of
continuous positive random variables, such as
lifetimes.

Definition
A random variable X is said to have an exponential
distribution with parameter β > 0 if its probability
density function is given by

f_X(x; β) = (1/β) e^{−x/β}, x > 0.

The common abbreviation is

X ∼ exp(β).
Graphs of Various Exponentials

The cumulative distribution function of X is given by

F_X(x; β) = P(X ≤ x) = ∫_0^x (1/β) e^{−y/β} dy = 1 − e^{−x/β}, x > 0,

and the tail probability is

P(X > x) = 1 − P(X ≤ x) = 1 − F_X(x; β) = e^{−x/β}, x > 0.
Results
If X has an exponential distribution with parameter β > 0,
then
1 E(X) = β
2 Var(X) = β^2
3 m_X(u) = 1/(1 − βu), u < 1/β.
The exponential distribution is closely related to the
Poisson distribution.

We know that if events occur according to a Poisson
process, then the number of events counted in a given
period has a Poisson distribution with parameter λ.

It can be shown that the time between two consecutive
events has an exponential distribution with parameter
β = 1/λ.
Example
If, on average, five servers go offline during the day, what is
the chance that no servers will go offline in the next hour?
(Hint: Note that an hour is 1/24 of a day.)

Solution:
Let T be the time, in hours, until the next server goes
offline. Since λ = 5 per day corresponds to λ = 5/24 per
hour, the parameter is β = 1/λ = 24/5 hours. That is,
T ∼ exp(24/5), and we want P(T > 1):

P(T > 1) = 1 − P(T ≤ 1) = 1 − ∫_0^1 (1/β) e^{−x/β} dx
         = 1 − ∫_0^1 (5/24) e^{−5x/24} dx = e^{−5/24} ≈ 0.812.
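The same answer from R; note that pexp() is parameterised by rate = 1/β rather than the scale β used in these notes:

rate <- 5 / 24            # five events per day = 5/24 per hour
1 - pexp(1, rate = rate)  # P(T > 1 hour)
exp(-5 / 24)              # closed form, about 0.812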
An important property of the exponential distribution
is lack of memory, or the memoryless property.

That is, if X has an exponential distribution, then

P(X > s + t | X > s) = P(X > t) for all s, t > 0.

In other words, if the waiting time until the next event is
exponential, then the remaining waiting time is independent
of how long you have already been waiting, as the sketch
below illustrates.
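A numerical sketch of this property, with assumed values s = 1.5, t = 2 and β = 2 (so rate 1/2 in R's parameterisation):

s <- 1.5; t <- 2; rate <- 1 / 2                # rate = 1/beta
(1 - pexp(s + t, rate)) / (1 - pexp(s, rate))  # P(X > s + t | X > s)
1 - pexp(t, rate)                              # P(X > t); identical, about 0.3679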

Note that the exponential distribution is a special case of
the Gamma distribution, which we will discuss later.
3.8 Special Functions Arising in Statistics

There are three more special distributions that we consider:

the Gamma distribution
the Normal distribution
the Beta distribution

However, before discussing these distributions, we need to
introduce some special functions closely related to these
distributions.
3.8.1 Gamma Function

The Gamma function is essentially an extension of the
factorial function (e.g. 4! = 4 × 3 × 2 × 1 = 24) to the
positive real numbers.

Definition
The Gamma function at x > 0 is given by

Γ(x) = ∫_0^∞ t^{x−1} e^{−t} dt.
Results
Some basic results for the Gamma function are
1 Γ(x) = (x − 1) Γ(x − 1)
2 Γ(n) = (n − 1)!, n = 1, 2, 3, . . .
3 Γ(1/2) = √π
4 If m is a non-negative integer, then

Γ(m + 1) = ∫_0^∞ x^m e^{−x} dx = m!
Example
Suppose X has a probability density function

f_X(x) = (1/120) x^5 e^{−x}, x > 0.

Find E(X) using the Gamma function.

Solution:

E(X) = ∫_{−∞}^{∞} x f_X(x) dx            (definition)
     = ∫_0^∞ x · (1/120) x^5 e^{−x} dx
     = (1/120) ∫_0^∞ x^6 e^{−x} dx
     = (1/120) Γ(7)                      (definition of the Gamma function)
     = (1/120) × 6! = (6 × 5 × 4 × 3 × 2 × 1)/120 = 6.
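This value can be confirmed in R, either with the built-in gamma() function or by numerical integration (a verification sketch, not part of the original example):

gamma(7) / 120                              # Gamma(7)/120 = 6!/120 = 6
f <- function(x) x * x^5 * exp(-x) / 120    # integrand x * f_X(x)
integrate(f, lower = 0, upper = Inf)$value  # E(X), again 6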
3.8.2 Beta Function

Definition
The Beta function at x, y > 0 is given by

B(x, y) = ∫_0^1 t^{x−1} (1 − t)^{y−1} dt.

Result
For all x, y > 0,

B(x, y) = Γ(x) Γ(y) / Γ(x + y).
Example
Suppose X has a probability density function

f_X(x) = 168 x^2 (1 − x)^5, 0 < x < 1.

Find E(X^2) using the Beta function.

Solution:

E(X^2) = ∫_{−∞}^{∞} x^2 f_X(x) dx        (definition)
       = ∫_0^1 x^2 · 168 x^2 (1 − x)^5 dx
       = 168 ∫_0^1 x^4 (1 − x)^5 dx
       = 168 B(5, 6)                     (definition of the Beta function)
       = 168 Γ(5) Γ(6) / Γ(11) = 168 × (4! × 5!)/10! = 2/15 ≈ 0.133.
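A quick check with R's beta() function:

168 * beta(5, 6)                                   # 168 * B(5, 6) = 2/15, about 0.1333
168 * factorial(4) * factorial(5) / factorial(10)  # same value via Gamma(5)Gamma(6)/Gamma(11)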
3.8.3 The Digamma and Trigamma Functions

The digamma and trigamma functions will be required
later, but we give their definitions here since they are
closely related to the Gamma function.

Definition
For all x ∈ R,

digamma(x) = (d/dx) ln{Γ(x)}
trigamma(x) = (d²/dx²) ln{Γ(x)}.
Unlike other mathematical functions, such as sin(x),
tan^{−1}(x), log_{10}(x), etc., the digamma and trigamma
functions are unavailable on ordinary hand-held calculators.

However, they are just another set of special functions
available in mathematics and statistics computer software
such as Maple, Matlab, and R.

The following figure (constructed using R software)
displays them graphically.
Graphs of Digamma, Trigamma, and Other Gamma Functions

(See the R Documentation in the Supplementary Material.)
3.8.4 The Φ Function and its Inverse

The final special function we will consider is routinely
denoted by the capital phi symbol Φ. It is defined as
follows.

Definition
For all x ∈ R,

Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^{−t²/2} dt.

Note that Φ(x) cannot be simplified any further than in the
above expression since e^{−t²/2} does not have a closed-form
anti-derivative. This function is the cumulative distribution
function of the standard normal distribution.
Results
1 lim_{x→−∞} Φ(x) = 0
2 lim_{x→∞} Φ(x) = 1
3 Φ(0) = 1/2
4 Φ is a monotonically increasing function over R.
The previous result shows that the inverse of Φ, denoted by
Φ^{−1}, is well-defined for all 0 < x < 1.

Examples are

Φ^{−1}(1/2) = 0 and Φ^{−1}(0.975) = 1.95996 · · · ≈ 1.96.

We will see later that the function Φ^{−1} plays a particularly
important role in statistical inference.
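In R, Φ and Φ^{−1} are pnorm() and qnorm():

qnorm(0.5)            # 0
qnorm(0.975)          # 1.959964
pnorm(qnorm(0.975))   # recovers 0.975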
3.9 Normal Distribution

A particularly important family of continuous random
variables consists of those following the normal distribution.

Definition
The random variable X is said to have a normal
distribution with parameters µ (−∞ < µ < ∞) and σ² > 0
if X has probability density function

f_X(x; µ, σ) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)}, −∞ < x < ∞.

A common shorthand notation for a normally distributed
random variable is

X ∼ N(µ, σ²).
Normal density functions are bell-shaped curves, symmetric
about µ. The following figures show four different normal
density functions.
The normal distribution is important because of the
central limit theorem, which will be discussed in
Chapter 6.

It is also sometimes useful in practice when studying
variables with an approximately normal distribution (such
as the height of people).

This, however, is not the main use of the distribution. It is
more commonly used in making inferences about statistics
via the central limit theorem, which will be described in
Chapters 8-10.
Results
If X ∼ N(µ, σ²), then
1 E(X) = µ
2 Var(X) = σ²
3 m_X(u) = e^{µu + σ²u²/2}.
The special case of µ = 0 and σ² = 1 is known as the
standard normal distribution.

It is common to use the letter Z to denote standard normal
random variables. The common shorthand notation is
Z ∼ N(0, 1).

The standard normal density function is

f_Z(x) = (1/√(2π)) e^{−x²/2}, −∞ < x < ∞,

and the cumulative distribution function of a standard
normal random variable is

Φ(x) = F_Z(x) = P(Z ≤ x) = (1/√(2π)) ∫_{−∞}^{x} e^{−t²/2} dt.
3.9.1 Computing Normal Distribution Probabilities

Consider the problem

P(Z ≤ 0.47) = (1/√(2π)) ∫_{−∞}^{0.47} e^{−x²/2} dx.

The standard normal density function does not have a
closed-form anti-derivative, so this integral cannot be
evaluated in the usual way.

However, tables for Φ(x) = P(Z ≤ x) are available, or we
can use the statistical package RStudio by typing pnorm
at the prompt. That is,

> pnorm(0.47)
[1] 0.6808225

That is,

P(Z ≤ 0.47) ≈ 0.6808.
Some other examples are

P(Z ≤ 1) = Φ(1) ≈ 0.8413 and P(Z ≤ 1.54) = Φ(1.54) ≈ 0.9382.

In RStudio, type pnorm(1) and pnorm(1.54), respectively.

To find a probability such as P(Z > 0.81), we need to
work with the complement, i.e.,

P(Z > 0.81) = 1 − P(Z ≤ 0.81) = 1 − Φ(0.81) ≈ 1 − 0.7910 = 0.2090.
How about probabilities concerning non-standard normal
random variables?

For example, how do we find P(X ≤ 12), where X ∼ N(10, 9)?

Result
If X ∼ N(µ, σ²), then

Z = (X − µ)/σ ∼ N(0, 1).
Example
Find P(X ≤ 12), where X ∼ N(10, 9).

Solution:

P(X ≤ 12) = P((X − 10)/3 ≤ (12 − 10)/3)
          = P(Z ≤ 0.67), where Z ∼ N(0, 1)
          = Φ(0.67) ≈ 0.7486.

We can also use RStudio and type at the prompt
pnorm(12, 10, 3), which gives us 0.7475075.

In general, using RStudio, type at the prompt
pnorm(x, µ, σ).
Example
Young men’s heights are approximately normal, with a mean
of 174 cm and a standard deviation of 6.4 cm.
1 What percentage of these men are taller than six feet
(i.e. 182.9 cm)?
2 What is the chance that a randomly selected young
man is 170-something cm tall?
3 Find a range of heights that contains 95% of young
men.
Solution:
Let X be the height of a young man; we are given that
X ∼ N(174, 6.4²).
➊ We want P(X > 182.9). That is,

P(X > 182.9) = P((X − 174)/6.4 > (182.9 − 174)/6.4)
             = P(Z > 1.3906)
             = 1 − P(Z ≤ 1.3906) ≈ 0.0822.

In RStudio, type at the prompt
1 - pnorm(182.9, 174, 6.4),
which gives 0.08216959.
➋ We want P(170 < X < 180). That is,

P(170 < X < 180) = P((170 − 174)/6.4 < (X − 174)/6.4 < (180 − 174)/6.4)
                 = P(−0.625 < Z < 0.9375)
                 = Φ(0.9375) − Φ(−0.625) ≈ 0.5598.

In RStudio, type at the prompt
pnorm(180, 174, 6.4) - pnorm(170, 174, 6.4),
which gives 0.5597638.
➌ We want to find c such that

0.95 = P(|X − 174|/6.4 < c) = P(|Z| < c)
⟹ c = Φ^{−1}(0.975) = 1.96
⟹ Range: 174 ± 1.96 × 6.4 = (161.456, 186.544)

In RStudio, type at the prompt
qnorm(0.025, 174, 6.4) and qnorm(0.975, 174, 6.4),
separately, which gives (161.4562, 186.5438).
3.10 Gamma Distribution

Definition
A random variable X is said to have a Gamma
distribution with parameters α > 0 (shape parameter)
and β > 0 (scale parameter) if X has probability density
function

f_X(x; α, β) = x^{α−1} e^{−x/β} / (Γ(α) β^α), x > 0.

A common shorthand notation is

X ∼ Gamma(α, β) or X ∼ Γ(α, β).
Gamma density functions are skewed curves on the positive
half-line.

The following figure shows four different Gamma density
functions.
Graphs of Various Gamma Density Functions
Results
If X ∼ Gamma(α, β), then
1 E(X) = αβ
2 Var(X) = αβ^2
3 m_X(u) = (1/(1 − βu))^α, u < 1/β.
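A numerical sketch checking E(X) = αβ for one assumed case, α = 3 and β = 2; note that R's dgamma() takes shape and scale arguments matching α and β here:

a <- 3; b <- 2
integrate(function(x) x * dgamma(x, shape = a, scale = b), 0, Inf)$value  # 6
a * b                                                                     # alpha * beta = 6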
As mentioned previously, the Gamma distribution
generalises the Exponential distribution.

The following result makes this explicit.

Result
X has an exponential distribution (i.e. X ∼ exp(β)) if
and only if

X ∼ Gamma(1, β).
3.11 Beta Distribution

For completeness, we conclude the chapter with the Beta
distribution.

This generalises the Uniform(0, 1) distribution, which
can be thought of as a Beta distribution with α = β = 1.
Definition
A random variable X is said to have a Beta distribution
with parameters α > 0 and β > 0 if its probability density
function is

f_X(x; α, β) = x^{α−1} (1 − x)^{β−1} / B(α, β), 0 < x < 1.

Here B(α, β) is the Beta function. (See the definition of the
Beta function in Section 3.8.2.)
Results
If X has a Beta distribution with parameters α > 0 and
β > 0, then
1 E(X) = α/(α + β)
2 Var(X) = αβ / ((α + β)² (α + β + 1)).
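A numerical check of the mean formula for an assumed case α = 2, β = 3, using R's dbeta():

a <- 2; b <- 3
integrate(function(x) x * dbeta(x, a, b), 0, 1)$value  # 0.4
a / (a + b)                                            # alpha/(alpha + beta) = 0.4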
Supplementary Material

Supplementary Material - Binomial Mean

E(X) = Σ_{x=0}^{n} x \binom{n}{x} p^x (1 − p)^{n−x}
     = Σ_{x=0}^{n} x · n!/(x!(n − x)!) · p^x (1 − p)^{n−x}
     = Σ_{x=1}^{n} n!/((x − 1)!(n − x)!) · p^x (1 − p)^{n−x}
       (since the x = 0 term vanishes)

Let y = x − 1 and m = n − 1. Substituting x = y + 1 and
n = m + 1 into the last sum (and using the fact that the
limits x = 1 and x = n correspond to y = 0 and
y = n − 1 = m, respectively):
Supplementary Material - Binomial Mean (continued)

E(X) = Σ_{y=0}^{m} (m + 1)!/(y!(m − y)!) · p^{y+1} (1 − p)^{m−y}
     = (m + 1) p Σ_{y=0}^{m} m!/(y!(m − y)!) · p^y (1 − p)^{m−y}
     = np Σ_{y=0}^{m} m!/(y!(m − y)!) · p^y (1 − p)^{m−y}.
Supplementary Material - Binomial Mean (continued)

The binomial theorem says that

(a + b)^m = Σ_{y=0}^{m} m!/(y!(m − y)!) · a^y b^{m−y}.

Setting a = p and b = 1 − p, we see that

Σ_{y=0}^{m} m!/(y!(m − y)!) · p^y (1 − p)^{m−y} = (p + 1 − p)^m = 1.

With the above calculation, we conclude that

E(X) = np.
Supplementary Material - Binomial Variance

Similarly, but this time using y = x − 2 and m = n − 2, we
have

E(X(X − 1)) = Σ_{x=0}^{n} x(x − 1) \binom{n}{x} p^x (1 − p)^{n−x}
            = Σ_{x=2}^{n} n!/((x − 2)!(n − x)!) · p^x (1 − p)^{n−x}
            = n(n − 1) p² Σ_{x=2}^{n} (n − 2)!/((x − 2)!(n − x)!) · p^{x−2} (1 − p)^{n−x}
            = n(n − 1) p² Σ_{y=0}^{m} m!/(y!(m − y)!) · p^y (1 − p)^{m−y}
            = n(n − 1) p² (p + (1 − p))^m    (by the binomial theorem)
            = n(n − 1) p².

So, the variance of X ∼ Bin(n, p) is

Var(X) = E(X(X − 1)) + E(X) − [E(X)]²
       = n(n − 1)p² + np − (np)²
       = np − np² = np(1 − p).
Supplementary Material - Binomial Moment Generating Function

m_X(u) = E(e^{uX})                              (definition)
       = Σ_{x=0}^{n} e^{ux} \binom{n}{x} p^x (1 − p)^{n−x}
       = Σ_{x=0}^{n} \binom{n}{x} (p e^u)^x (1 − p)^{n−x}
       = (p e^u + 1 − p)^n,

by the Binomial Theorem with a = p e^u and b = 1 − p.
Supplementary Material - Binomial Theorem

Binomial Theorem
Let a and b be constants. Then

(a + b)^m = Σ_{y=0}^{m} \binom{m}{y} a^y b^{m−y}
          = Σ_{y=0}^{m} m!/(y!(m − y)!) · a^y b^{m−y}.
Supplementary Material - R Documentation on Digamma and Other Gamma Functions

Description
Special mathematical functions related to the beta and gamma functions.

Usage
beta(a, b)
lbeta(a, b)

gamma(x)
lgamma(x)
psigamma(x, deriv = 0)
digamma(x)
trigamma(x)

Arguments
a, b       non-negative numeric vectors.
x, n       numeric vectors.
k, deriv   integer vectors.

The functions beta and lbeta return the beta function and the natural logarithm of the beta
function,

B(a, b) = Γ(a) Γ(b) / Γ(a + b).

The functions gamma and lgamma return Γ(x) and the natural logarithm of the absolute
value of the gamma function. The gamma function (see Abramowitz and Stegun, Section 6.2.1,
page 255) is defined by

Γ(x) = ∫_0^∞ t^{x−1} e^{−t} dt,

for all real x except zero and negative integers (when NaN is returned).
Supplementary Material - R Documentation on Digamma and Other Gamma Functions (continued)

The functions digamma and trigamma return the first and second derivatives of the logarithm
of the gamma function; psigamma(x, deriv) (deriv >= 0) computes the deriv-th derivative of
Ψ(x). That is,

digamma(x) = Ψ(x) = (d/dx) ln Γ(x) = Γ′(x)/Γ(x),

where Ψ and its derivatives, the psigamma() functions, are often called the polygamma
functions (see Abramowitz and Stegun, page 260).

References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language.
Wadsworth & Brooks/Cole. (For gamma and lgamma.)

Abramowitz, M. and Stegun, I. A. (1972). Handbook of Mathematical Functions. New York:
Dover. Chapter 6: Gamma and Related Functions.
