Common Families of Distributions

Study Guide
By Yuanhao Jiang

1 Introduction
The main purpose of this chapter is to introduce the common families of distributions, giving their means and
variances and noting other useful properties and applications.

2 Discrete Distributions
A distribution is discrete when the sample space of the random variable X is countable; in most cases X
takes integer values.

Discrete Uniform Distribution


Definition: A random variable X has a discrete uniform(1, N) distribution if
P(X = x \mid N) = \frac{1}{N}, \quad x = 1, 2, \ldots, N,
where N is a specified integer.
Mean:
E X = \frac{N+1}{2}
Variance:
\mathrm{Var}\, X = \frac{(N+1)(N-1)}{12}
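A short derivation of the mean (added as a check), using \sum_{x=1}^{N} x = N(N+1)/2:

E X = \sum_{x=1}^{N} x\, P(X = x \mid N) = \frac{1}{N} \sum_{x=1}^{N} x = \frac{1}{N}\cdot\frac{N(N+1)}{2} = \frac{N+1}{2}.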

Hypergeometric Distribution
Definition: A random variable X has a hypergeometric distribution if
P(X = x \mid N, M, K) = \frac{\binom{M}{x}\binom{N-M}{K-x}}{\binom{N}{K}}, \quad x = 0, 1, \ldots, K
Mean:
E X = \frac{KM}{N}
Variance:
\mathrm{Var}\, X = \frac{KM}{N}\left(\frac{(N-M)(N-K)}{N(N-1)}\right)
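A quick numerical sanity check of these two formulas (not part of the original guide; the parameter values are made up) using scipy.stats.hypergeom. Note that scipy's argument order is (population size, number of successes, number of draws), i.e. (N, M, K) in this guide's notation:

from scipy.stats import hypergeom

N, M, K = 50, 12, 10          # population size, successes in population, sample size
rv = hypergeom(N, M, K)       # scipy order: (total, successes, draws)
print(rv.mean(), K * M / N)                                        # both 2.4
print(rv.var(), (K * M / N) * (N - M) * (N - K) / (N * (N - 1)))   # both about 1.489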

Binomial Distribution
Before we look into Binomial Distribution, we should consider the Bernoulli Distribution first.
Definition: A random variable X has a Bernoulli(p) distribution if
X = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } 1-p, \end{cases} \qquad 0 \le p \le 1.
Mean:
𝐸𝑋 = 𝑝
Variance:
𝑉𝑎𝑟 𝑋 = 𝑝(1 − 𝑝)

The binomial distribution is built from Bernoulli trials, so now let's take a look at it.
Definition: A random variable Y has a binomial(n, p) distribution if
P(Y = y \mid n, p) = \binom{n}{y} p^{y} (1-p)^{n-y}, \quad y = 0, 1, 2, \ldots, n
Mean:
E Y = np
Variance:
\mathrm{Var}\, Y = np(1-p)
MGF:
M_Y(t) = \left[p e^{t} + (1-p)\right]^{n}.
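A short reasoning step (standard, added here for completeness): a binomial(n, p) random variable can be written as Y = X_1 + \cdots + X_n, a sum of n independent Bernoulli(p) random variables, so

E Y = \sum_{i=1}^{n} E X_i = np \quad \text{and} \quad \mathrm{Var}\, Y = \sum_{i=1}^{n} \mathrm{Var}\, X_i = np(1-p),

where the variance calculation uses independence of the X_i.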

Poisson Distribution
Definition: A random variable X has a Poisson(λ) distribution if
P(X = x \mid \lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, \ldots
Mean:
E X = \sum_{x=0}^{\infty} x\, \frac{e^{-\lambda}\lambda^{x}}{x!}
    = \sum_{x=1}^{\infty} x\, \frac{e^{-\lambda}\lambda^{x}}{x!}
    = \lambda e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!}
    = \lambda e^{-\lambda} \sum_{y=0}^{\infty} \frac{\lambda^{y}}{y!}
    = \lambda
Variance: A similar calculation to the mean (using E[X(X-1)] = \lambda^{2}) gives
\mathrm{Var}\, X = \lambda
MGF:
M_X(t) = e^{\lambda(e^{t}-1)}.
A special relationship between the Poisson and binomial distributions: when n is large and p is
small, the binomial(n, p) pmf is well approximated by the Poisson(np) pmf.
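A small numerical sketch (not from the guide; the values of n and p are made up) comparing the binomial(n, p) pmf with the Poisson(np) pmf using scipy.stats:

import numpy as np
from scipy.stats import binom, poisson

n, p = 1000, 0.003                       # n large, p small
x = np.arange(0, 11)
# largest pointwise gap between the two pmfs; about 4e-4 here
print(np.max(np.abs(binom.pmf(x, n, p) - poisson.pmf(x, n * p))))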

Negative Binomial Distribution


Definition: A random variable X has a negative binomial(r, p) distribution if
P(X = x \mid r, p) = \binom{x-1}{r-1} p^{r} (1-p)^{x-r}, \quad x = r, r+1, \ldots,
An equivalent form, in terms of the number of failures Y = X - r before the r-th success, is
P(Y = y) = \binom{r+y-1}{y} p^{r} (1-p)^{y}, \quad y = 0, 1, \ldots

Mean:
E Y = r\,\frac{1-p}{p}
Variance:
\mathrm{Var}\, Y = r\,\frac{1-p}{p^{2}}
A special relationship between the Poisson and negative binomial distributions: if r → ∞ and p → 1
such that r(1-p) → λ, 0 < λ < ∞, then
E Y = r\,\frac{1-p}{p} \to \lambda \quad \text{and} \quad \mathrm{Var}\, Y = r\,\frac{1-p}{p^{2}} \to \lambda,
which are the mean and variance of the Poisson(λ) distribution.

Geometric Distribution
Definition: Geometric Distribution is the simplest of the waiting time distributions and is a special
case of the negative binomial distribution. A random variable X has a geometric(p) distribution if
P(X = x \mid p) = p(1-p)^{x-1}, \quad x = 1, 2, \ldots,
which is the negative binomial pmf
P(X = x \mid r, p) = \binom{x-1}{r-1} p^{r} (1-p)^{x-r}, \quad x = r, r+1, \ldots,
with r = 1.

Mean:
E X = E Y + 1 = \frac{1}{p}
Variance:
\mathrm{Var}\, X = \frac{1-p}{p^{2}}
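A direct derivation of the mean (a standard geometric-series argument, added as a check): with q = 1 - p,

E X = p \sum_{x=1}^{\infty} x\, q^{x-1} = p\, \frac{d}{dq}\left(\sum_{x=0}^{\infty} q^{x}\right) = p\, \frac{d}{dq}\left(\frac{1}{1-q}\right) = \frac{p}{(1-q)^{2}} = \frac{1}{p}.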
3 Continuous Distributions
In this section we will discuss some well-known continuous distributions.

Uniform Distribution
Definition: The continuous uniform distribution is defined by spreading mass uniformly over an
interval [a, b]. Its pdf is given by
f(x \mid a, b) = \begin{cases} \dfrac{1}{b-a} & \text{if } x \in [a, b] \\ 0 & \text{otherwise.} \end{cases}

Mean:
E X = \frac{b+a}{2}
Variance:
\mathrm{Var}\, X = \frac{(b-a)^{2}}{12}

Gamma Distribution
Definition: The gamma(𝛼, 𝛽) distribution is defined over an interval [0, +∞). Its pdf is given by
f(x \mid \alpha, \beta) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, x^{\alpha-1} e^{-x/\beta}, \quad 0 < x < \infty, \ \alpha > 0, \ \beta > 0

Mean:
E X = \alpha\beta
Variance:
\mathrm{Var}\, X = \alpha\beta^{2}
MGF:
M_X(t) = \left(\frac{1}{1-\beta t}\right)^{\alpha}, \quad t < \frac{1}{\beta}
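The mean can be read off from the MGF (a one-line check): since M_X(t) = (1-\beta t)^{-\alpha},

E X = M_X'(0) = \alpha\beta(1-\beta t)^{-\alpha-1}\Big|_{t=0} = \alpha\beta.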
Special cases of the gamma distribution:
When α = p/2, where p is an integer, and β = 2, the gamma pdf becomes
f(x \mid p) = \frac{1}{\Gamma(p/2)\,2^{p/2}}\, x^{(p/2)-1} e^{-x/2}, \quad 0 < x < \infty,
which is the χ² pdf with p degrees of freedom.
When α = 1, the gamma pdf becomes
f(x \mid \beta) = \frac{1}{\beta}\, e^{-x/\beta}, \quad 0 < x < \infty,
which is the exponential pdf with scale parameter β.
If X ~ exponential(β), then Y = X^{1/γ} has a Weibull(γ, β) distribution. Its pdf is given by
f_Y(y \mid \gamma, \beta) = \frac{\gamma}{\beta}\, y^{\gamma-1} e^{-y^{\gamma}/\beta}, \quad 0 < y < \infty, \ \gamma > 0, \ \beta > 0
The Weibull distribution is important for analyzing failure-time data and is very useful for modeling
hazard functions.
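The Weibull pdf follows from the usual change-of-variables formula (a short derivation, added as a check): with Y = X^{1/\gamma}, we have x = y^{\gamma} and dx/dy = \gamma y^{\gamma-1}, so

f_Y(y) = f_X(y^{\gamma})\left|\frac{dx}{dy}\right| = \frac{1}{\beta}\, e^{-y^{\gamma}/\beta}\,\gamma y^{\gamma-1} = \frac{\gamma}{\beta}\, y^{\gamma-1} e^{-y^{\gamma}/\beta}.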

Normal Distribution (Gaussian Distribution)


Definition: The normal(μ, σ²) distribution is defined on the interval (−∞, +∞). Its pdf is given by
f(x \mid \mu, \sigma^{2}) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^{2}/(2\sigma^{2})}, \quad -\infty < x < \infty

Mean:
E X = \mu
Variance:
\mathrm{Var}\, X = \sigma^{2}

Beta Distribution
Definition: The beta(α, β) distribution is defined on the interval (0, 1). Its pdf is given by
f(x \mid \alpha, \beta) = \frac{1}{B(\alpha, \beta)}\, x^{\alpha-1} (1-x)^{\beta-1}, \quad 0 < x < 1, \ \alpha > 0, \ \beta > 0
Mean:
E X = \frac{\alpha}{\alpha+\beta}
Variance:
\mathrm{Var}\, X = \frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}
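The mean follows from the identity B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha+\beta) (a short derivation, added as a check):

E X = \frac{1}{B(\alpha, \beta)} \int_{0}^{1} x^{\alpha} (1-x)^{\beta-1}\, dx = \frac{B(\alpha+1, \beta)}{B(\alpha, \beta)} = \frac{\Gamma(\alpha+1)\,\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\alpha+\beta+1)} = \frac{\alpha}{\alpha+\beta}.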

Cauchy Distribution
Definition: The Cauchy(θ) distribution is defined on the interval (−∞, +∞). Its pdf is
given by
f(x \mid \theta) = \frac{1}{\pi}\, \frac{1}{1+(x-\theta)^{2}}, \quad -\infty < x < \infty, \ -\infty < \theta < \infty
Mean: Does not exist, since
E|X| = \infty.
Variance: Does not exist.
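To see why the mean fails to exist, take θ = 0 (a short calculation, added as a check):

E|X| = \frac{2}{\pi} \int_{0}^{\infty} \frac{x}{1+x^{2}}\, dx = \frac{1}{\pi} \lim_{M\to\infty} \log(1+M^{2}) = \infty.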
Lognormal Distribution
Definition: The lognormal distribution (log X ~ n(μ, σ²)) is defined on the interval (0, +∞). Its
pdf is given by
f(x \mid \mu, \sigma^{2}) = \frac{1}{\sqrt{2\pi}\,\sigma}\,\frac{1}{x}\, e^{-(\log x-\mu)^{2}/(2\sigma^{2})}, \quad 0 < x < \infty, \ -\infty < \mu < \infty, \ \sigma > 0,
Mean:
E X = e^{\mu+\sigma^{2}/2}
Variance:
\mathrm{Var}\, X = e^{2(\mu+\sigma^{2})} - e^{2\mu+\sigma^{2}}

Double Exponential Distribution


Definition: The double exponential(μ, σ) distribution is defined on the interval (−∞, +∞). Its pdf
is given by
f(x \mid \mu, \sigma) = \frac{1}{2\sigma}\, e^{-|x-\mu|/\sigma}, \quad -\infty < x < \infty, \ -\infty < \mu < \infty, \ \sigma > 0,
Mean:
E X = \mu
Variance:
\mathrm{Var}\, X = 2\sigma^{2}

4 Exponential Families
Definition 4.1: A family of pdfs or pmfs is called an exponential family if it can be expressed as

f(x \mid \theta) = h(x)\, c(\theta) \exp\left(\sum_{i=1}^{k} w_i(\theta)\, t_i(x)\right)
Exponential families include the continuous families—normal, gamma, and beta, and the discrete
families—binomial, Poisson, and negative binomial.
To verify whether a family of pdfs or pmfs is an exponential family, we must identify the
functions h(x), c(θ), w_i(θ), and t_i(x) and show that the family has the form above.
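As an illustration (the standard binomial example, written out as a check): for the binomial(n, p) family with 0 < p < 1 and n a fixed positive integer,

f(x \mid p) = \binom{n}{x} p^{x} (1-p)^{n-x} = \binom{n}{x} (1-p)^{n} \exp\left(x \log\frac{p}{1-p}\right),

so h(x) = \binom{n}{x} for x = 0, 1, \ldots, n (and 0 otherwise), c(p) = (1-p)^{n}, w_1(p) = \log\bigl(p/(1-p)\bigr), and t_1(x) = x, which is the required form with k = 1.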

Theorem 4.2: If X is a random variable with pdf or pmf of the above form, then

E\left(\sum_{i=1}^{k} \frac{\partial w_i(\theta)}{\partial \theta_j}\, t_i(X)\right) = -\frac{\partial}{\partial \theta_j} \log c(\theta);

\mathrm{Var}\left(\sum_{i=1}^{k} \frac{\partial w_i(\theta)}{\partial \theta_j}\, t_i(X)\right) = -\frac{\partial^{2}}{\partial \theta_j^{2}} \log c(\theta) - E\left(\sum_{i=1}^{k} \frac{\partial^{2} w_i(\theta)}{\partial \theta_j^{2}}\, t_i(X)\right)
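Applying the first identity to the binomial example above (a quick check): here k = 1, w_1(p) = \log\bigl(p/(1-p)\bigr), t_1(x) = x, and c(p) = (1-p)^{n}, so

\frac{1}{p(1-p)}\, E X = -\frac{d}{dp}\, n \log(1-p) = \frac{n}{1-p},

which gives E X = np.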

Definition 4.3: The indicator function of a set A, most often denoted by I_A(x), is the function
I_A(x) = \begin{cases} 1, & x \in A \\ 0, & x \notin A \end{cases}
If we re-parameterize an exponential family in terms of η, we get

f(x \mid \eta) = h(x)\, c^{*}(\eta) \exp\left(\sum_{i=1}^{k} \eta_i\, t_i(x)\right)

The set H = \{\eta = (\eta_1, \ldots, \eta_k) : \int_{-\infty}^{\infty} h(x) \exp\bigl(\sum_{i=1}^{k} \eta_i t_i(x)\bigr)\, dx < \infty\} is called the natural
parameter space for the family. For values of η ∈ H, to ensure that the pdf integrates to 1,
we must have c^{*}(\eta) = \left[\int_{-\infty}^{\infty} h(x) \exp\left(\sum_{i=1}^{k} \eta_i t_i(x)\right) dx\right]^{-1}.

Definition 4.4: A curved exponential family is a family of densities of the form above for which the
dimension of the vector θ is equal to d < k. If d = k, the family is a full exponential family. For
example, the n(μ, μ²) family, in which the variance is tied to the mean, is a curved exponential
family (d = 1, k = 2).

5 Location and Scale Families


In this section we will discuss three approaches to constructing families of distributions. The three
types of families are called location families, scale families, and location-scale families. Each of them is
constructed by specifying a single pdf 𝑓(𝑥), called the standard pdf for the family. Then all other
pdfs in the family are generated by transforming the standard pdf in a prescribed way. We start
with a simple theorem about pdfs.

Theorem 5.1: Let f(x) be any pdf and let μ and σ > 0 be any given constants. Then the
function
g(x \mid \mu, \sigma) = \frac{1}{\sigma}\, f\!\left(\frac{x-\mu}{\sigma}\right)
is a pdf.

Definition 5.2 Assume 𝑓(𝑥) to be any pdf. Then the family of pdfs 𝑓(𝑥 − 𝜇), indexed by the
parameter 𝜇, −∞ < 𝜇 < +∞, is called the location family with standard pdf 𝑓(𝑥) and 𝜇 is
called the location parameter for the family.

Definition 5.3: Assume f(x) to be any pdf. Then for any σ > 0, the family of pdfs (1/σ) f(x/σ),
indexed by the parameter σ, is called the scale family with standard pdf f(x), and σ is called
the scale parameter of the family.

Definition 5.4: Assume f(x) to be any pdf. Then for any μ, −∞ < μ < +∞, and any σ > 0,
the family of pdfs (1/σ) f((x−μ)/σ), indexed by the parameters (μ, σ), is called the location-scale family
with standard pdf f(x); μ is called the location parameter and σ is called the scale parameter.

Theorem 5.5: Let f(·) be any pdf. Let μ be any real number, and let σ be any positive real
number. Then X is a random variable with pdf (1/σ) f((x−μ)/σ) if and only if there exists a random
variable Z with pdf f(z) such that X = σZ + μ.


Theorem 5.6: Let Z be a random variable with pdf f(z). Suppose EZ and Var Z exist. If X is a
random variable with pdf (1/σ) f((x−μ)/σ), then

E X = \sigma\, E Z + \mu \quad \text{and} \quad \mathrm{Var}\, X = \sigma^{2}\, \mathrm{Var}\, Z.


In particular, if EZ = 0 and Var Z = 1, then EX = μ and Var X = σ².
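A minimal simulation sketch of Theorems 5.5 and 5.6 (not from the guide; the values of mu and sigma and the choice of a standard normal Z are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 3.0
z = rng.standard_normal(1_000_000)     # standard pdf with EZ = 0, Var Z = 1
x = sigma * z + mu                     # X = sigma*Z + mu
print(x.mean(), x.var())               # close to mu = 2 and sigma^2 = 9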

6 Inequalities and Identities


Theorem 6.1: Let X be a random variable and let g(x) be a nonnegative function. Then for any
r > 0,
P(g(X) \ge r) \le \frac{E\, g(X)}{r}.
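A standard application (added as a worked example): taking g(x) = (x - \mu)^{2}/\sigma^{2}, where \mu = E X and \sigma^{2} = \mathrm{Var}\, X, and r = t^{2} gives Chebyshev's inequality,

P(|X - \mu| \ge t\sigma) = P\!\left(\frac{(X-\mu)^{2}}{\sigma^{2}} \ge t^{2}\right) \le \frac{1}{t^{2}}\, E\!\left(\frac{(X-\mu)^{2}}{\sigma^{2}}\right) = \frac{1}{t^{2}}.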

Theorem 6.2: Let X_{α,β} denote a gamma(α, β) random variable with pdf f(x | α, β), where
α > 1. Then for any constants a and b,

P(a < X_{\alpha,\beta} < b) = \beta\bigl(f(a \mid \alpha, \beta) - f(b \mid \alpha, \beta)\bigr) + P(a < X_{\alpha-1,\beta} < b)

Lemma 6.3 (Stein's Lemma): Let X ~ n(θ, σ²), and let g be a differentiable function satisfying
E|g′(X)| < ∞. Then
E[g(X)(X - \theta)] = \sigma^{2}\, E\, g'(X).
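For example (a quick check of the lemma): with g(x) = x we have g'(x) = 1, and the lemma reduces to the variance identity

E[X(X-\theta)] = E[(X-\theta)^{2}] + \theta\, E(X-\theta) = \sigma^{2} = \sigma^{2}\, E\, g'(X).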

Theorem 6.4: Let \chi^{2}_{p} denote a chi-squared random variable with p degrees of freedom. For any
function h(x),

E\, h(\chi^{2}_{p}) = p\, E\!\left(\frac{h(\chi^{2}_{p+2})}{\chi^{2}_{p+2}}\right)
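For instance, taking h(x) = x recovers the chi-squared mean (a quick consistency check):

E\, \chi^{2}_{p} = p\, E\!\left(\frac{\chi^{2}_{p+2}}{\chi^{2}_{p+2}}\right) = p.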

Theorem 6.5 (Hwang): Let g(x) be a function with −∞ < E g(X) < ∞ and −∞ < g(−1) < ∞. Then:
a. If X ~ Poisson(λ),
E(\lambda\, g(X)) = E(X\, g(X-1)).
b. If X ~ negative binomial(r, p),
E\bigl((1-p)\, g(X)\bigr) = E\!\left(\frac{X}{r+X-1}\, g(X-1)\right)
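As a check, taking g(x) \equiv 1 in part (a) gives E(\lambda \cdot 1) = E(X \cdot 1), i.e. E X = \lambda, recovering the Poisson mean.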
