Probability Distributions
2B.1. Introduction
We know that the result of a random experiment is called an outcome. The outcome of a trial is
determined by chance, so all possible outcomes of an experiment are chance outcomes. Usually we
are not interested in the outcomes of an experiment as such. It is often more useful to describe
a particular property or attribute of an outcome in numerical terms. A probability distribution,
as the name suggests, is a listing of all possible outcomes of an experiment together with their
probabilities.
x:      0     1     2
P(x):   1/4   1/2   1/4
This presentation is called a discrete probability distribution (probability mass function).
Thus the probability mass function (pmf) of a random variable X that takes the values x1, x2, …, xn
is defined as f(x) = P(X = x), x = x1, x2, …, xn.
Properties of pmf
1. $f(x) \ge 0$ for all x (probabilities are non-negative)
2. $\sum_{x} f(x) = 1$ (total probability is 1)
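The two properties can be checked mechanically for any tabulated pmf. The sketch below is only an illustration: it assumes the table above assigns probabilities 1/4, 1/2 and 1/4 to x = 0, 1, 2, and the helper name `is_valid_pmf` is hypothetical, not part of the text.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every probability is non-negative and they sum to 1."""
    values = list(pmf.values())
    return all(p >= 0 for p in values) and sum(values) == 1

# Assumed reading of the table above: P(0) = 1/4, P(1) = 1/2, P(2) = 1/4.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
print(is_valid_pmf(pmf))  # True
```

Exact fractions are used so that the check "sum equals 1" is not disturbed by floating-point rounding.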
Here the total frequency is 20. When we divide the frequencies by the total frequency, we get
relative frequencies. These relative frequencies are the probabilities of the observations. So it
is possible to find the mean of a probability distribution in the same way as the simple mean is
found from a frequency distribution.
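We know that the mean of the values x1, x2, …, xn with frequencies f1, f2, …, fn is:
$$
\begin{aligned}
\bar{x} &= \frac{x_1 f_1 + x_2 f_2 + \dots + x_n f_n}{N}, \quad \text{where } N \text{ is the total frequency} \\
&= x_1 \frac{f_1}{N} + x_2 \frac{f_2}{N} + \dots + x_n \frac{f_n}{N} \\
&= x_1 p_1 + x_2 p_2 + \dots + x_n p_n \\
&= E(X)
\end{aligned}
$$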
Thus E(X) is the arithmetic mean of the distribution.
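The mean of the above distribution is
$$
\begin{aligned}
E(X) &= 1 \times \frac{2}{20} + 2 \times \frac{5}{20} + 3 \times \frac{10}{20} + 4 \times \frac{2}{20} + 5 \times \frac{1}{20} \\
&= \frac{1 \times 2 + 2 \times 5 + 3 \times 10 + 4 \times 2 + 5 \times 1}{20} \\
&= 2.75
\end{aligned}
$$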
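Similarly, $E(X^2) = x_1^2 p_1 + x_2^2 p_2 + \dots + x_n^2 p_n$ and Variance of X $= E(X^2) - [E(X)]^2$.
For the above distribution,
$$
E(X^2) = 1^2 \times \frac{2}{20} + 2^2 \times \frac{5}{20} + 3^2 \times \frac{10}{20} + 4^2 \times \frac{2}{20} + 5^2 \times \frac{1}{20} = 8.45
$$
So, Variance of X $= 8.45 - (2.75)^2 = 8.45 - 7.5625 = 0.8875$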
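The same arithmetic can be reproduced in a few lines of code. This is a minimal sketch using the values 1 to 5 and the frequencies 2, 5, 10, 2, 1 that appear in the calculation above; the variable names are illustrative only.

```python
# Values and frequencies of the distribution used above (total frequency 20).
values = [1, 2, 3, 4, 5]
freqs = [2, 5, 10, 2, 1]

n_total = sum(freqs)
probs = [f / n_total for f in freqs]  # relative frequencies used as probabilities

mean = sum(x * p for x, p in zip(values, probs))        # E(X)
mean_sq = sum(x**2 * p for x, p in zip(values, probs))  # E(X^2)
variance = mean_sq - mean**2                            # E(X^2) - [E(X)]^2

print(round(mean, 4), round(mean_sq, 4), round(variance, 4))  # 2.75 8.45 0.8875
```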
2B.5. Standard Distributions
Some populations have common characteristics and therefore share the same probability structure,
or probability distribution. Several probability distributions have been defined to describe
different populations. We shall, therefore, now describe some standard probability distributions
applicable to many real-life problems.
1. Binomial Distribution
The binomial distribution was derived by the Swiss mathematician Jacob Bernoulli. It
deals with populations whose members can be divided into two categories with reference to
the presence or absence of a particular characteristic.
For example, an item produced by a manufacturing process can be either defective or
non-defective; the sex of a newborn child is either male or female; a factory worker may be
educated or uneducated; and so on.
Random experiments that satisfy the following properties are called Bernoulli trials.
1. There are only two mutually exclusive and collectively exhaustive outcomes in the
experiment.
2. In repeated trials, the probabilities of occurrence of the events remain constant.
3. The trials are independent.
Consider a random experiment (trial) that has only two outcomes, say, success and failure.
Let p be the probability of success. Then q = 1 − p is the probability of failure. Let the trial
be repeated n times independently, and let X be the number of successes in the n trials. Then X
follows the binomial distribution with parameters n and p. The probability mass function of X is
defined as:
$$f(r) = P(X = r) = {}^{n}C_r \, p^r q^{n-r}, \quad r = 0, 1, 2, \dots, n.$$
Example 1: A machining process produces 90% good items. Write the probability mass
function of the number of good items in 10 units produced from this process.
Solution: Each item produced is either good or defective. The probability that an item is good
is p = 0.9, and hence q = 1 − 0.9 = 0.1. Also n = 10. So, the pmf is
$$f(r) = {}^{10}C_r \, (0.9)^r (0.1)^{10-r}, \quad r = 0, 1, 2, \dots, 10.$$
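To see what this pmf looks like numerically, it can be tabulated directly. The following is a minimal sketch, assuming only Python's standard `math.comb`; the function name `binomial_pmf` is a hypothetical helper, not something defined in the text.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for a binomial variable with parameters n and p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.9  # Example 1: 10 units, 90% good items
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

for r, prob in enumerate(pmf):
    print(f"f({r}) = {prob:.6f}")
print("total =", round(sum(pmf), 6))
```

The total printed at the end is a quick check of the second pmf property (the probabilities sum to 1).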
Example 2: Assuming the probability of a male birth is 1/2, find the probability that a family
of 3 children will have a) at least one girl, b) two boys and one girl, and c) at most two
girls. Also find the expected number of families with a), b), and c) out of 300 families with 3
children.
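Solution: Let X be the number of girls among the three children. Then X follows the binomial
distribution with parameters n = 3 and p = 1/2. So,
$$f(r) = {}^{3}C_r \, (0.5)^r (0.5)^{3-r} = {}^{3}C_r \, (0.5)^3, \quad r = 0, 1, 2, 3.$$
a) P(at least one girl):
$$
\begin{aligned}
P(X \ge 1) &= 1 - P(X = 0) \\
&= 1 - f(0) \\
&= 1 - {}^{3}C_0 \, (0.5)^3 \\
&= 1 - 0.125 = 0.875
\end{aligned}
$$
b) P(two boys and one girl):
$$
\begin{aligned}
P(X = 1) &= {}^{3}C_1 \, (0.5)^3 \\
&= 0.375
\end{aligned}
$$
c) P(at most two girls):
$$
\begin{aligned}
P(X \le 2) &= 1 - P(X = 3) \\
&= 1 - {}^{3}C_3 \, (0.5)^3 \\
&= 1 - 0.125 \\
&= 0.875
\end{aligned}
$$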
Expected number of families with at least one girl = 300 × P(at least one girl)
= 300 × 0.875 = 262.5 ≈ 263
Similarly, expected number of families with two boys and one girl = 300 × 0.375 = 112.5 ≈ 113
And expected number of families with at most two girls = 300 × 0.875 = 262.5 ≈ 263
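The three probabilities and the expected numbers of families in Example 2 can be cross-checked with a short script. This is a sketch under the example's own assumptions (n = 3, p = 0.5, 300 families); `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for a binomial variable with parameters n and p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 3, 0.5      # X = number of girls among 3 children
families = 300

p_at_least_one_girl = 1 - binomial_pmf(0, n, p)
p_two_boys_one_girl = binomial_pmf(1, n, p)
p_at_most_two_girls = 1 - binomial_pmf(3, n, p)

results = [
    ("at least one girl", p_at_least_one_girl),
    ("two boys and one girl", p_two_boys_one_girl),
    ("at most two girls", p_at_most_two_girls),
]
for label, prob in results:
    # Expected count is shown unrounded; the text rounds it to a whole family.
    print(f"{label}: P = {prob:.3f}, expected families = {families * prob:.1f}")
```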
Example 3: The incidence of an occupational disease in an industry is such that each worker has a 20% chance of suffering from it. What is
the probability that, out of six workers, 4 or more will contract the disease?
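No worked solution for Example 3 appears here. One way to approach it, assuming the six workers are affected independently so that the number affected is binomial with n = 6 and p = 0.2, is to add the pmf values for r = 4, 5 and 6. The sketch below is offered as an illustration of that calculation, not as the text's own solution.

```python
from math import comb

n, p = 6, 0.2  # six workers, each with a 20% chance (independence is assumed)

# P(X >= 4) = f(4) + f(5) + f(6) under the binomial model
p_four_or_more = sum(
    comb(n, r) * p**r * (1 - p)**(n - r) for r in range(4, n + 1)
)

print(round(p_four_or_more, 5))  # 0.01696
```

The three terms are 0.01536, 0.001536 and 0.000064, which add up to 0.01696.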
2. Poisson Distribution
The Poisson distribution was derived by Simeon Poisson in 1837. This distribution is useful in
cases of rare events (i.e., events whose probability is very small). Rare events occur in all
fields: for example, the number of defective articles produced by a high-quality machine; in
quality control, the number of defects per item; in insurance, the number of deaths or accidents
in a specific time period or region; and so on. The Poisson distribution is a limiting form of
the binomial distribution as n tends to infinity and p tends to zero while the mean np remains
constant. That is, when p is very small and n is very large, the Poisson distribution with
m = np closely approximates the binomial distribution and is more convenient to use.
The probability mass function of the Poisson distribution with parameter m is given by
$$f(x) = \frac{e^{-m} m^x}{x!}, \quad x = 0, 1, 2, \dots$$
Mean and Variance of Poisson distribution
Mean = m
Variance = m
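The statement that the Poisson distribution is a limiting form of the binomial can be checked numerically. The sketch below holds the mean np fixed at a hypothetical value m = 2 (chosen only for illustration) and lets n grow; the helper names are assumptions, not part of the text.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """P(X = r) for a binomial variable with parameters n and p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(x, m):
    """P(X = x) for a Poisson variable with parameter m."""
    return exp(-m) * m**x / factorial(x)

def max_gap(n, m, upto=10):
    """Largest difference between the two pmfs for x = 0..upto, with p = m/n."""
    p = m / n
    return max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, m))
               for x in range(upto + 1))

m = 2.0  # hypothetical mean, for illustration only
for n in (10, 100, 1000):
    print(f"n = {n:4d}, p = {m / n:.4f}, largest pmf gap = {max_gap(n, m):.5f}")
```

The gap shrinks as n increases, which is what the limiting argument predicts.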
Example 4: An inspection of a random sample of 100 pages printed by a press revealed 20 printing
errors. Find the probability that a page contains:
a) less than 3 errors
b) at the most 3 errors
c) at least 3 errors
d) exactly 3 errors
Solution: Let X be the number of printing errors per page. Then X follows the Poisson
distribution with parameter m = 20/100 = 0.2. So,
$$f(x) = \frac{e^{-0.2} (0.2)^x}{x!}, \quad x = 0, 1, 2, \dots$$
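The four probabilities asked for in Example 4 follow from this pmf by adding or complementing terms. The sketch below evaluates them with m = 0.2; it is an illustrative calculation, and `poisson_pmf` is a hypothetical helper name.

```python
from math import exp, factorial

def poisson_pmf(x, m):
    """P(X = x) for a Poisson variable with parameter m."""
    return exp(-m) * m**x / factorial(x)

m = 0.2  # 20 errors in 100 pages

p_less_than_3 = sum(poisson_pmf(x, m) for x in range(3))   # a) P(X < 3)
p_at_most_3 = sum(poisson_pmf(x, m) for x in range(4))     # b) P(X <= 3)
p_at_least_3 = 1 - p_less_than_3                           # c) P(X >= 3)
p_exactly_3 = poisson_pmf(3, m)                            # d) P(X = 3)

print(f"a) {p_less_than_3:.6f}")
print(f"b) {p_at_most_3:.6f}")
print(f"c) {p_at_least_3:.6f}")
print(f"d) {p_exactly_3:.6f}")
```

Part c) uses the complement of part a), since "at least 3" and "less than 3" together cover every possible value.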
3. Normal distribution
The normal distribution is a continuous distribution and is the most important of all the
theoretical distributions. Discrete distributions relate to discrete random variables involving
count data expressed in discrete numbers such as 0, 1, 2, …. In a continuous distribution, by
contrast, the variable of interest may take any value within a given range. That is, measured
data such as height, weight and temperature follow continuous distributions. In a continuous
distribution, the probability that the random variable takes any particular value is zero. For
example, the probability that a student weighs exactly 55.5 kg is zero, because there are
infinitely many possible weights in any range around 55.5 kg and probability is attached only to
intervals of values. That is, if X follows a continuous distribution, then P(X = x) = 0.
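The probability density function of the normal distribution with parameters μ and σ is given by
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty; \; -\infty < \mu < \infty; \; \sigma > 0,$$
and the distribution is denoted by N(μ, σ).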
[Figure: normal curve. About 95.45% of the area lies within μ ± 2σ and about 99.73% within μ ± 3σ.]
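The percentages 95.45% and 99.73% are the areas under the normal curve within two and three standard deviations of the mean. They can be verified with the error function from Python's standard library, as in this minimal sketch (the function name `area_within` is a hypothetical helper).

```python
from math import erf, sqrt

def area_within(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * area_within(k):.2f}%")
```

The result does not depend on μ or σ, because the area is measured in units of the standard deviation.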
The standard normal distribution is the normal distribution with mean μ = 0 and standard
deviation σ = 1. It is very useful because any normal distribution with mean μ and standard
deviation σ can be converted into the standard normal by a change of origin and scale. Without
this conversion, a separate table of areas between ordinates of the normal curve would be needed
for every combination of mean and standard deviation.
Suppose X is a normal variable with mean μ and standard deviation σ. Then the variable
$$Z = \frac{X - \mu}{\sigma}$$
is standard normal. The values obtained by this transformation are called Z scores or
standard scores.
The other type of table gives the area between z = 0 and any other positive z-value, as shown
below.
[Figure: standard normal curve showing the area between 0 and z.]
As the normal curve is perfectly symmetrical, with 50% of the area on either side of the
maximum ordinate at its mean, the areas given by tables of the first type, when subtracted from
0.5, give the areas given by tables of the second type.
Example 6: A normal variable has a mean of 10 and a standard deviation of 5. What is the
probability that the normal variable will take a value in the interval 0.2 to 19.8?
Solution: The required probability is P(0.2 < X < 19.8). Given mean = 10 and standard deviation = 5,
$$
\begin{aligned}
P(0.2 < X < 19.8) &= P\left(\frac{0.2 - 10}{5} < \frac{X - 10}{5} < \frac{19.8 - 10}{5}\right) \\
&= P(-1.96 < Z < 1.96) \\
&= 0.95
\end{aligned}
$$
(From the normal table, the area between z = 0 and z = 1.96 is 0.475. By the symmetry of the
normal distribution, the area between z = −1.96 and z = 0 is the same as the area between
z = 0 and z = 1.96. Thus the area between z = −1.96 and z = 1.96 is 0.95.)
[Figure: standard normal curve with area 0.475 between z = −1.96 and 0 and area 0.475 between 0 and z = 1.96.]
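The table lookup in Example 6 can be replaced by a direct computation of the standard normal cumulative distribution function. The sketch below builds it from the standard library's `erf`; the helper names `std_normal_cdf` and `normal_prob_between` are assumptions introduced for illustration.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """P(Z <= z) for a standard normal variable Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_prob_between(a, b, mean, sd):
    """P(a < X < b) for a normal variable X, via the Z transformation."""
    z_a = (a - mean) / sd
    z_b = (b - mean) / sd
    return std_normal_cdf(z_b) - std_normal_cdf(z_a)

prob = normal_prob_between(0.2, 19.8, mean=10, sd=5)
print(round(prob, 4))  # 0.95
```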
Solution: Let X be the time required to complete the task. Given X is normally distributed with
mean 15 and standard deviation 3.
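a) The required probability is
$$
\begin{aligned}
P(X < 8) &= P\left(Z < \frac{8 - 15}{3}\right) \\
&= P(Z < -2.33) \\
&= 0.5 - P(0 < Z < 2.33) \\
&= 0.5 - 0.4901 = 0.0099
\end{aligned}
$$
b) The required probability is
$$
\begin{aligned}
P(X > 9) &= P\left(Z > \frac{9 - 15}{3}\right) \\
&= P(Z > -2) \\
&= 0.5 + P(0 < Z < 2) \\
&= 0.5 + 0.4772 = 0.9772
\end{aligned}
$$
c) The required probability is
$$
\begin{aligned}
P(10 < X < 12) &= P(-1.67 < Z < -1) \\
&= P(0 < Z < 1.67) - P(0 < Z < 1) \\
&= 0.4525 - 0.3413 = 0.1112
\end{aligned}
$$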
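The three probabilities above can also be computed without a table. The following sketch assumes the mean 15 and standard deviation 3 stated in the solution; because it does not round z to two decimals, its results may differ from the table-based answers in the third or fourth decimal place.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """P(Z <= z) for a standard normal variable Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mean, sd = 15, 3  # parameters stated in the solution above

def cdf(x):
    """P(X <= x) for X normal with the mean and sd above."""
    return std_normal_cdf((x - mean) / sd)

p_a = cdf(8)             # a) P(X < 8)
p_b = 1 - cdf(9)         # b) P(X > 9)
p_c = cdf(12) - cdf(10)  # c) P(10 < X < 12)

print(f"a) {p_a:.4f}  b) {p_b:.4f}  c) {p_c:.4f}")
```

Part c) comes out near 0.1109 rather than 0.1112, the small gap being due to rounding −1.667 to −1.67 when the table is used.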
Example 8: An airline company has a policy of employing only Indian women whose height
is between 62 inches and 69 inches. The height of Indian women is approximately normal with
mean 64 inches and standard deviation 3 inches. Out of 1000 applicants, find the number of
applicants that would be a) too tall, b) too short, c) of acceptable height.
Solution: Let X be the height of an Indian woman. Given mean = 64 and standard deviation = 3.
When X = 62, $Z = \frac{62 - 64}{3} = -0.67$
When X = 69, $Z = \frac{69 - 64}{3} = 1.67$
a) P(an applicant is too tall) = P(Z > 1.67)
= 0.5 − P(0 < Z < 1.67)
= 0.5 − 0.4525 = 0.0475
So, the number of too tall applicants = 1000 × 0.0475 = 47.5 ≈ 48
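Only part a) is worked out above. Parts b) and c) follow the same pattern, and all three can be sketched in code as below. This is an illustration under the example's stated assumptions (mean 64, standard deviation 3, limits 62 and 69, 1000 applicants); it uses the exact normal cdf rather than table values, so the counts may differ slightly from a table-based answer.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """P(Z <= z) for a standard normal variable Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mean, sd = 64, 3        # height of applicants, in inches
low, high = 62, 69      # acceptable range, in inches
applicants = 1000

p_too_short = std_normal_cdf((low - mean) / sd)       # P(X < 62)
p_too_tall = 1 - std_normal_cdf((high - mean) / sd)   # P(X > 69)
p_acceptable = 1 - p_too_short - p_too_tall           # P(62 < X < 69)

for label, prob in [("too tall", p_too_tall),
                    ("too short", p_too_short),
                    ("acceptable", p_acceptable)]:
    print(f"{label}: P = {prob:.4f}, expected applicants = {applicants * prob:.1f}")
```

The three probabilities add up to 1 because every applicant falls into exactly one of the three groups.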
2B.6. Exercises
8. The diameters of ball bearings are normally distributed with a mean of 2.42 inches and a
standard deviation of 0.01 inches. Determine the percentage of ball bearings with
diameter:
a) between 2.4 and 2.43 inches
b) greater than 2.43 inches
c) less than 2.39 inches