Special Distributions (Discrete and Continuous) MIT 14.30 Spring 2006 Herman Bennett
15
Discrete Distributions
We have already seen the binomial distribution and the uniform distribution.
15.1
Hypergeometric Distribution
Let the RV $X$ be the total number of successes in a sample of $n$ elements drawn from a population of $N$ elements with a total number of $M$ successes. Then the pmf of $X$, called the hypergeometric distribution, is given by:
\[ f(x) = P(X = x) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}, \quad \text{for } x = 0, 1, \ldots, n. \tag{40} \]
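As a quick numerical illustration of the hypergeometric pmf, the sketch below evaluates it with binomial coefficients. The function name `hypergeom_pmf` and the card-deck numbers are hypothetical choices made for this example, not part of the notes.

```python
from math import comb

def hypergeom_pmf(x: int, N: int, M: int, n: int) -> float:
    """P(X = x): x successes in a sample of n drawn from N elements, M of them successes."""
    # Values of x that cannot occur get probability zero.
    if x < 0 or x > n or x > M or n - x > N - M:
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Hypothetical numbers: 5 cards drawn from a 52-card deck that contains 4 aces.
pmf = [hypergeom_pmf(x, N=52, M=4, n=5) for x in range(6)]
print(pmf[0])    # probability of drawing no aces
print(sum(pmf))  # the pmf should sum to 1 over x = 0, ..., n
```

Values of $x$ larger than $M$ receive probability zero, which is why the sum over $x = 0, \ldots, n$ still equals one.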
Caution: These notes are not necessarily self-explanatory notes. They are to be used as a complement
15.2
Negative Binomial Distribution
The binomial distribution counts the number of successes in a fixed number of trials ($n$). Suppose that, instead, we count the number of trials required to get a fixed number of successes ($r$). Let the RV $X$ be the total number of trials required to get $r$ successes. The pmf of $X$, called the negative binomial distribution, is given by:
\[ f(x) = P(X = x) = \binom{x-1}{r-1} p^{r} (1-p)^{x-r}, \quad \text{for } x = r, r+1, r+2, \ldots \tag{41} \]
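A minimal sketch of the negative binomial pmf follows, assuming $p$ is the probability of success on each independent trial. The function name `neg_binomial_pmf` and the fair-coin numbers are hypothetical.

```python
from math import comb

def neg_binomial_pmf(x: int, r: int, p: float) -> float:
    """P(X = x): the r-th success occurs exactly on trial x."""
    if x < r:
        return 0.0  # at least r trials are needed for r successes
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

# Hypothetical example: fair coin (p = 0.5), trials needed to see r = 3 heads.
print(neg_binomial_pmf(3, r=3, p=0.5))  # 0.125, i.e. three heads in a row

# Truncated sum over x = r, r + 1, ...; the omitted tail is negligible here.
total = sum(neg_binomial_pmf(x, r=3, p=0.5) for x in range(3, 200))
print(total)
```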
15.3
Poisson Distribution
A RV $X$ is said to have a Poisson distribution with parameter $\lambda$ ($\lambda > 0$) if the pmf of $X$ is:
\[ X \sim P(\lambda): \quad f(x) = P(X = x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad \text{for } x = 0, 1, 2, \ldots \tag{42} \]
If $X_1$ and $X_2$ are independent RVs that have Poisson distributions with means $\lambda_1$ and $\lambda_2$, respectively, then the RV $Y = X_1 + X_2$ has a Poisson distribution with mean $\lambda_1 + \lambda_2$ (function of RVs, Lecture Note 5).
Note:
\[ \sum_{x=0}^{\infty} f(x) = e^{-\lambda} \sum_{x=0}^{\infty} \frac{\lambda^{x}}{x!} = e^{-\lambda} e^{\lambda} = 1. \]
Unlike the two previous distributions, the Poisson distribution is not derived from a natural experiment.
Example 15.2. Assume the number of customers that visit a store daily is a random variable distributed Poisson($\lambda$). It is known that the store receives on average 20 customers per day, so $\lambda = 20$. What is the probability i) that tomorrow there will be 20 visits? ii) that during the next 2 days there will be 30 visits? iii) that tomorrow before midday there will be at least 7 visits?
A Poisson process with rate $\lambda$ per unit time is a counting process that satisfies the following two properties: i) The number of arrivals in every fixed interval of time of length $t$ has a Poisson distribution for which the mean is $\lambda t$. ii) The numbers of arrivals in any two disjoint time intervals are independent.
Poisson process: use $\lambda t$ as the mean when your experiment covers $t$ units of time. Example 15.3. Answer Example 15.2 assuming now that the number of customers that visit a certain store follows a Poisson process (with the same average of 20 visits per day).
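One way to work Example 15.3 numerically is sketched below. It reads "20 visits" and "30 visits" as exactly that many visits, and it assumes that "before midday" corresponds to half of the unit time interval ($t = 0.5$ day). That reading, the helper `poisson_pmf`, and the variable names are assumptions made for this illustration.

```python
from math import exp, factorial

def poisson_pmf(x: int, mean: float) -> float:
    """Poisson pmf with the given mean (lambda * t for a Poisson process)."""
    return exp(-mean) * mean**x / factorial(x)

RATE = 20.0  # lambda: average number of visits per day

# i) exactly 20 visits tomorrow: t = 1 day, mean = 20
p_i = poisson_pmf(20, RATE * 1)

# ii) exactly 30 visits over the next 2 days: t = 2 days, mean = 40
p_ii = poisson_pmf(30, RATE * 2)

# iii) at least 7 visits before midday.
# ASSUMPTION: "before midday" is treated as t = 0.5 day, so mean = 10.
p_iii = 1.0 - sum(poisson_pmf(x, RATE * 0.5) for x in range(7))

print(p_i, p_ii, p_iii)
```

Part iii) is the one that needs the Poisson process assumption: a daily Poisson count alone says nothing about how arrivals are spread within the day.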
Poisson vs. binomial approach. As $n \to \infty$, $p \to 0$, and $np \to \lambda$, the binomial distribution converges to the Poisson distribution with parameter $\lambda$.
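The limit can be checked numerically. The sketch below holds $np = \lambda$ fixed and reports the largest gap between the two pmfs over $x = 0, \ldots, 10$ as $n$ grows. The value $\lambda = 2$, the range of $x$, and the function names are hypothetical choices.

```python
from math import comb, exp, factorial

def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int, lam: float) -> float:
    return exp(-lam) * lam**x / factorial(x)

LAM = 2.0  # hypothetical value of lambda, with p = LAM / n so that n * p = LAM

def max_gap(n: int) -> float:
    """Largest absolute difference between the two pmfs over x = 0, ..., 10."""
    p = LAM / n
    return max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, LAM)) for x in range(11))

for n in (10, 100, 1000, 10000):
    print(n, max_gap(n))  # the gap shrinks as n grows
```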
16
Continuous Distributions
16.1
Normal Distribution
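A RV $X$ is said to have a Normal distribution with parameters $\mu$ and $\sigma^2$ ($\sigma^2 > 0$) if the pdf of $X$ is:
\[ X \sim N(\mu, \sigma^2): \quad f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2/(2\sigma^2)}, \quad \text{for } -\infty < x < \infty \tag{43} \]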
Why is the Normal distribution so important?
1. The Normal distribution has a familiar bell shape. It gives a theoretical base to the empirical observation that many random phenomena obey, at least approximately, a normal probability distribution: the further away a particular outcome is from the mean, the less likely it is to occur, and this holds symmetrically whether the deviation is above or below the mean. Examples: height or weight of individuals in a population; error made in measuring a physical quantity; level of protein in a particular seed; etc.
2. The Normal distribution gives a good approximation to other distributions, such as the Poisson and the Binomial.
3. The Normal distribution is analytically much more tractable than other bell-shaped distributions.
4. Central limit theorem (more on this later in LN7).
5. The Normal distribution is very helpful to represent population distributions (linked to point 1).
Graphic properties.
1. Bell-shaped and symmetric.
2. Centered at the mean ($\mu$), which coincides with the median.
3. Dispersion/flatness depends only on the variance ($\sigma^2$).
4. $P(\mu - \sigma < X < \mu + \sigma) = 0.6826$
5. $P(\mu - 2\sigma < X < \mu + 2\sigma) = 0.9544$
If $X \sim N(\mu, \sigma^2)$, then the RV $Z = (X - \mu)/\sigma$ is distributed $Z \sim N(0, 1)$. This distribution, $N(0, 1)$, is called the standard normal distribution, and sometimes its cdf is denoted $F_Z(z) = \Phi(z)$.
The cdf of the normal distribution does not have an analytic solution and its values must be looked up in a N (0, 1) table (see attached table).
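As an alternative to a printed table, $\Phi(z)$ can be evaluated numerically through the error function, using the identity $\Phi(z) = \tfrac{1}{2}\left[1 + \operatorname{erf}(z/\sqrt{2})\right]$. The sketch below relies on Python's standard `math.erf`; the function name `std_normal_cdf` is a hypothetical choice.

```python
from math import erf, sqrt

def std_normal_cdf(z: float) -> float:
    """Phi(z), the N(0, 1) cdf, computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Probability of falling within one and two standard deviations of the mean.
within_1sd = std_normal_cdf(1.0) - std_normal_cdf(-1.0)
within_2sd = std_normal_cdf(2.0) - std_normal_cdf(-2.0)
print(within_1sd)  # about 0.6827
print(within_2sd)  # about 0.9545
```

These agree with the values 0.6826 and 0.9544 quoted in the graphic properties, up to the rounding of a four-decimal table.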
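If $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 0, \ldots, n$, are independent RVs, then
\[ H = \sum_{i=0}^{n} (a_i X_i + b_i) \sim N\left( \sum_{i=0}^{n} (a_i \mu_i + b_i), \; \sum_{i=0}^{n} a_i^2 \sigma_i^2 \right). \tag{44} \]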
Example 16.1. Using the tools developed in Lecture Note 5, derive the distribution of $Z = (X - \mu)/\sigma$ as a transformation of the RV $X \sim N(\mu, \sigma^2)$.
Example 16.3. Assume that the RV $X$ has a normal distribution with mean 5 and standard deviation 2. Find $P(1 < X < 8)$ and $P(|X - 5| < 2)$.
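A numerical check of Example 16.3 is sketched below. It standardizes $X$ and evaluates the normal cdf through `math.erf` instead of a table; the helper `normal_cdf` is a hypothetical name.

```python
from math import erf, sqrt

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """cdf of N(mu, sigma^2), obtained by standardizing to Z = (x - mu) / sigma."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

MU, SIGMA = 5.0, 2.0

# P(1 < X < 8) = Phi(1.5) - Phi(-2)
p_interval = normal_cdf(8, MU, SIGMA) - normal_cdf(1, MU, SIGMA)

# P(|X - 5| < 2) = P(3 < X < 7) = Phi(1) - Phi(-1)
p_abs = normal_cdf(7, MU, SIGMA) - normal_cdf(3, MU, SIGMA)

print(p_interval)  # about 0.910
print(p_abs)       # about 0.683
```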
Example 16.4. Assume two types of light bulbs (A and B). The life of bulb type A is distributed normal with mean 100 (hours) and variance 16. The life of bulb type B is distributed normal with mean 110 (hours) and variance 30. i) What is the probability that bulb type A lasts for more than 110 hours? ii) If a bulb type A and a bulb type B are turned on at the same time, what is the probability that type A lasts longer than type B? iii) What is the probability that both bulbs last more than 105 hours?
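Example 16.4 can be approached as sketched below. Parts ii) and iii) assume that the two bulb lifetimes are independent, which the example does not state explicitly; under that assumption $A - B$ is normal with mean $100 - 110$ and variance $16 + 30$. The helper and variable names are hypothetical.

```python
from math import erf, sqrt

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

MU_A, VAR_A = 100.0, 16.0  # bulb type A
MU_B, VAR_B = 110.0, 30.0  # bulb type B

# i) P(A > 110)
p_i = 1.0 - normal_cdf(110, MU_A, sqrt(VAR_A))

# ii) P(A > B) = P(A - B > 0).
# ASSUMPTION: A and B are independent, so A - B ~ N(-10, 46).
p_ii = 1.0 - normal_cdf(0, MU_A - MU_B, sqrt(VAR_A + VAR_B))

# iii) P(A > 105 and B > 105) = P(A > 105) * P(B > 105), again under independence.
p_iii = (1.0 - normal_cdf(105, MU_A, sqrt(VAR_A))) * (
    1.0 - normal_cdf(105, MU_B, sqrt(VAR_B))
)

print(p_i, p_ii, p_iii)
```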
The binomial distribution can be approximated with a normal distribution. Rule of thumb: the approximation is reasonable when $\min(np, n(1-p)) \geq 5$.
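The quality of the approximation can be inspected with a small comparison such as the one below. The values $n = 50$, $p = 0.3$, $k = 15$ are hypothetical, and the half-unit continuity correction is a common refinement that is an assumption here, not something stated in these notes.

```python
from math import comb, erf, sqrt

def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

N_TRIALS, P_SUCCESS, K = 50, 0.3, 15  # hypothetical values

# Rule of thumb from the text: min(np, n(1 - p)) >= 5
rule_ok = min(N_TRIALS * P_SUCCESS, N_TRIALS * (1 - P_SUCCESS)) >= 5

mu = N_TRIALS * P_SUCCESS                             # binomial mean np
sigma = sqrt(N_TRIALS * P_SUCCESS * (1 - P_SUCCESS))  # binomial sd sqrt(np(1-p))

exact = binomial_cdf(K, N_TRIALS, P_SUCCESS)
approx = normal_cdf(K + 0.5, mu, sigma)  # ASSUMPTION: continuity correction of 0.5

print(rule_ok, exact, approx)
```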
16.2
LogNormal Distribution
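If $X$ is a RV and $\ln(X)$ is distributed $N(\mu, \sigma^2)$, then $X$ has a lognormal distribution with pdf (RV transformation):
\[ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \frac{1}{x} e^{-(\ln(x)-\mu)^2/(2\sigma^2)}, \quad \text{for } 0 < x < \infty \tag{45} \]
If $X \sim N(\mu, \sigma^2)$, then $e^X \sim \mathrm{LnN}(\mu, \sigma^2)$.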
16.3
Gamma Distribution
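A RV $X$ is said to have a gamma distribution with parameters $\alpha$ and $\beta$ ($\alpha, \beta > 0$) if the pdf of $X$ is:
\[ f(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} x^{\alpha-1} e^{-x/\beta}, \quad \text{for } 0 < x < \infty \tag{46} \]
where
\[ \Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx, \quad \text{finite if } \alpha > 0. \]
\[ \mathrm{Var}(X) = \alpha\beta^2 \]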
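Assume a Poisson process with rate $\lambda$, so that the number of events $Y$ per unit of time has a Poisson distribution with parameter $\lambda$. Denote by $X$ the waiting time for the $r$th event to occur. Then $X$ is distributed gamma with parameters $\alpha = r$ and $\beta = 1/\lambda$.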
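The link between the waiting time and the Poisson count can be checked numerically: the $r$th event has occurred by time $t$ exactly when at least $r$ events fall in $[0, t]$, so $P(X \le t) = P(N_t \ge r)$ with $N_t$ Poisson with mean $\lambda t$. The sketch below integrates the gamma pdf with a simple midpoint rule and compares it with the Poisson tail. The values $\lambda = 2$, $r = 3$, $t = 1.5$ and all function names are hypothetical.

```python
from math import exp, factorial, gamma

def gamma_pdf(x: float, alpha: float, beta: float) -> float:
    """Gamma pdf with shape alpha and scale beta."""
    return x ** (alpha - 1) * exp(-x / beta) / (gamma(alpha) * beta**alpha)

def gamma_cdf(t: float, alpha: float, beta: float, steps: int = 20_000) -> float:
    """P(X <= t) by midpoint-rule integration of the pdf over [0, t]."""
    h = t / steps
    return h * sum(gamma_pdf((i + 0.5) * h, alpha, beta) for i in range(steps))

def poisson_tail(r: int, mean: float) -> float:
    """P(N >= r) for N ~ Poisson(mean)."""
    return 1.0 - sum(exp(-mean) * mean**k / factorial(k) for k in range(r))

LAM, R, T = 2.0, 3, 1.5  # hypothetical rate, event index, and time horizon

p_wait = gamma_cdf(T, alpha=R, beta=1 / LAM)  # P(waiting time for 3rd event <= 1.5)
p_count = poisson_tail(R, LAM * T)            # P(at least 3 events in [0, 1.5])
print(p_wait, p_count)  # the two should agree up to integration error
```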
16.4
Exponential Distribution
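A RV $X$ is said to have an exponential distribution with parameter $\beta$ ($\beta > 0$) if the pdf of $X$ is:
\[ f(x) = \frac{1}{\beta} e^{-x/\beta}, \quad \text{for } 0 < x < \infty \tag{47} \]
\[ E(X) = \beta \quad \text{and} \quad \mathrm{Var}(X) = \beta^2 \]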
16.5
Chi-squared Distribution
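A RV $X$ is said to have a chi-squared distribution with parameter $p > 0$ (degrees of freedom) if the pdf of $X$ is:
\[ X \sim \chi^2(p): \quad f(x) = \frac{1}{\Gamma(p/2)\,2^{p/2}} x^{p/2-1} e^{-x/2}, \quad \text{for } 0 < x < \infty \tag{48} \]
\[ \mathrm{Var}(X) = 2p \]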
If $Y \sim N(0, 1)$, then the RV $Z = Y^2$ is distributed:
\[ Z = Y^2 \sim \chi^2(1) \quad \text{(random variable transformation).} \tag{49} \]
If $X_1 \sim \chi^2(p)$ and $X_2 \sim \chi^2(q)$ are independent, then
\[ H = X_1 + X_2 \sim \chi^2(p+q) \tag{50} \]
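A small Monte Carlo sketch can make the two results above concrete. It builds chi-squared draws as sums of squared independent $N(0,1)$ draws, adds two independent ones, and compares the sample variance with $\mathrm{Var}(X) = 2p$ applied to $p + q$ degrees of freedom. The sample mean is compared with $p + q$, the standard chi-squared mean. The values $p = 2$, $q = 3$, the sample size, the seed, and the function names are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def chi2_draw(p: int) -> float:
    """One draw built as the sum of p squared independent N(0, 1) draws."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(p))

def sample_mean_var(draws):
    """Sample mean and unbiased sample variance."""
    n = len(draws)
    mean = sum(draws) / n
    var = sum((d - mean) ** 2 for d in draws) / (n - 1)
    return mean, var

P, Q, N_DRAWS = 2, 3, 100_000  # hypothetical degrees of freedom and sample size

# H = X1 + X2 with X1 and X2 drawn independently
h = [chi2_draw(P) + chi2_draw(Q) for _ in range(N_DRAWS)]
mean_h, var_h = sample_mean_var(h)

print(mean_h)  # should be near P + Q = 5
print(var_h)   # should be near 2 * (P + Q) = 10
```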
Extensively used in Econometrics. Concept of single distribution vs. family of distributions (indexed by one or more parameters).
16.6
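Bivariate Normal Distribution
A bivariate random vector $(X_1, X_2)$ is said to have a bivariate normal distribution if the pdf of $(X_1, X_2)$ is:
\[ f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} e^{-b/(2(1-\rho^2))}, \tag{51} \]
where
\[ b = \left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right) + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2. \]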
$\rho = 0 \iff X_1$ and $X_2$ independent (only in the normal case) $\iff f_{X_1,X_2}(x_1, x_2) = f_{X_1}(x_1) f_{X_2}(x_2)$.
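The factorization at $\rho = 0$ can be verified at a point. The sketch below uses the standard form of the bivariate normal pdf, with $b = z_1^2 - 2\rho z_1 z_2 + z_2^2$ and $z_i = (x_i - \mu_i)/\sigma_i$. The evaluation point, the parameter values, and the function names are hypothetical.

```python
from math import exp, pi, sqrt

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

def bivariate_normal_pdf(x1, x2, mu1, mu2, s1, s2, rho):
    """Bivariate normal pdf with means mu1, mu2, sds s1, s2, and correlation rho."""
    z1, z2 = (x1 - mu1) / s1, (x2 - mu2) / s2
    b = z1**2 - 2 * rho * z1 * z2 + z2**2
    return exp(-b / (2 * (1 - rho**2))) / (2 * pi * s1 * s2 * sqrt(1 - rho**2))

# Hypothetical evaluation point and parameters.
x1, x2 = 1.0, -0.5
mu1, mu2, s1, s2 = 0.0, 1.0, 1.0, 2.0

product = normal_pdf(x1, mu1, s1) * normal_pdf(x2, mu2, s2)
joint_rho0 = bivariate_normal_pdf(x1, x2, mu1, mu2, s1, s2, rho=0.0)
joint_rho05 = bivariate_normal_pdf(x1, x2, mu1, mu2, s1, s2, rho=0.5)

print(joint_rho0, product)   # equal when rho = 0
print(joint_rho05, product)  # differ when rho = 0.5
```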