
Random Variable: The Term Random Variable Is Widely Used in Statistics. A Practical

A random variable is a value that can take on different possible outcomes in a random experiment, with each outcome having an associated probability. Some key examples of random variables include the number obtained when rolling a die, or the number of heads when flipping a coin. The probability distribution of a random variable systematically arranges the probabilities of each possible value it can take. The expected value or mathematical expectation of a random variable is the weighted average of all possible values, with the probabilities as weights. Expected value is useful for analyzing situations with uncertain outcomes.

Uploaded by

Ravi

Random Variable: The term random variable is widely used in statistics. A practical
definition of a random variable is "the value of the observations in an experiment". In
other words, by a random variable we mean a real number X associated with the
outcomes of a random experiment. It can take any one of the various possible values,
each with a definite probability. For example, in a throw of a die, if X denotes the
number obtained, then X is a random variable which can take any one of the values 1, 2,
3, 4, 5 or 6, each with equal probability 1/6. Similarly, in a toss of a coin, if X denotes
the number of heads, then X is a random variable which can take either of the two
values 0 (no head, i.e. tail) or 1 (head), each with equal probability 1/2.
Let us consider a random experiment of three tosses of a coin (or three coins
tossed simultaneously). The exhaustive number of cases (total number of outcomes),
i.e. the sample space S, consists of $2^3 = 8$ points as shown below.
S = {H, T} x {H, T} x {H, T} = {HH, HT, TH, TT} x {H, T}
= {HHH, HTH, THH, TTH, HHT, HTT, THT, TTT}
If we consider the variable X, the number of heads obtained, then X is a
random variable which can take any one of the values 0, 1, 2, 3. This can be presented
as below.
Outcome:      HHH  HTH  THH  TTH  HHT  HTT  THT  TTT
Value of X:    3    2    2    1    2    1    1    0
Thus a random variable may be defined as a real-valued function on the sample
space. In other words, a random variable is a function which takes real values
determined by the outcomes of a random experiment.
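The idea of a random variable as a function on the sample space can be sketched in a few lines of Python. This is only an illustration: the function name `heads_count_by_outcome` and the string encoding of outcomes ("H"/"T") are assumptions made here, not part of the notes.

```python
from itertools import product


def heads_count_by_outcome(n_tosses=3):
    """Map each outcome of n coin tosses (e.g. 'HTH') to X, its number of heads."""
    # product("HT", repeat=n) enumerates the whole sample space S.
    outcomes = ["".join(toss) for toss in product("HT", repeat=n_tosses)]
    # The random variable X is simply a function from outcomes to real numbers.
    return {outcome: outcome.count("H") for outcome in outcomes}


x_of = heads_count_by_outcome(3)
for outcome, x in x_of.items():
    print(outcome, x)
```

The dictionary has one entry per point of the sample space, and its values are the values of X listed in the table above.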

Probability Distribution of a Random Variable


The concept of a probability distribution is analogous to that of a frequency
distribution. Just as a frequency distribution summarizes data in tabular form, a
theoretical probability distribution is a systematic arrangement of the probability values
associated with the outcomes of an experiment. A probability distribution tells us how
the total probability of 1 is distributed among the different values which the random
variable can take. It is usually represented in tabular form as given below:

x:      x1   x2   x3   ...   xn
p(x):   p1   p2   p3   ...   pn

Let us consider a random experiment of three tosses of a coin (or three coins
tossed simultaneously). The exhaustive number of cases (total number of possible
outcomes) is given by
S = {H, T} x {H, T} x {H, T} = {HH, HT, TH, TT} x {H, T}
= {HHH, HTH, THH, TTH, HHT, HTT, THT, TTT}
If we consider head as success, X is a random variable which can take the
values 0, 1, 2 or 3. The probability distribution of the number of heads is then given
below:

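-------------------------------------------------------------------------------
No. of heads (x)   Favorable events   No. of favorable cases   Probability p(x)
-------------------------------------------------------------------------------
0                  TTT                1                        1/8
1                  TTH, HTT, THT      3                        3/8
2                  HTH, THH, HHT      3                        3/8
3                  HHH                1                        1/8
-------------------------------------------------------------------------------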
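The table can be reproduced by counting favorable cases over the sample space. The sketch below assumes a fair coin, so all outcomes are equally likely. The helper name `heads_distribution` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses=3):
    """Return {x: P(X = x)} for X = number of heads in n fair coin tosses."""
    outcomes = list(product("HT", repeat=n_tosses))
    # Count how many outcomes are favorable to each value of X.
    favorable = Counter(outcome.count("H") for outcome in outcomes)
    total = len(outcomes)
    return {x: Fraction(favorable[x], total) for x in sorted(favorable)}


dist = heads_distribution(3)
for x, p in dist.items():
    print(x, p)
```

Using `Fraction` keeps the probabilities exact, so the total probability can be checked to equal 1 without rounding error.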

Example: A die is tossed twice. Getting "an odd number" is termed a success. Find the
probability distribution of the number of successes.

Solution: The favorable cases for getting an odd number in a throw of a
die are 1, 3 and 5, i.e. 3 in all.
Hence, probability of success, $P(S) = \frac{3}{6} = \frac{1}{2}$
and probability of failure, $P(F) = 1 - \frac{1}{2} = \frac{1}{2}$
If X denotes the number of successes in two throws of a die, then X is a random
variable which takes the values 0, 1, 2.
$P(X=0) = P(\text{F in the 1st throw and F in the 2nd throw}) = P(FF) = P(F) \cdot P(F) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$
$P(X=1) = P(\text{S and F}) + P(\text{F and S}) = P(S) \cdot P(F) + P(F) \cdot P(S) = \frac{1}{2} \cdot \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{2}$
$P(X=2) = P(\text{S and S}) = P(S) \cdot P(S) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$
Hence the probability distribution of X is given by
-------------------------------------
x: 0 1 2
-------------------------------------
p(x): ¼ ½ ¼
-------------------------------------

Exercise: Obtain the probability distribution of X, the number of heads in three tosses
of a coin (or a simultaneous toss of three coins).

Mathematical Expectation: The mathematical expectation, also called the expected
value, of a random variable is the weighted arithmetic mean of the variable, the weights
being the respective probabilities of the values that the random variable can possibly
assume.
Thus, if X is a random variable which can take the values $x_1, x_2, \ldots, x_n$ with
respective probabilities $p_1, p_2, \ldots, p_n$, where $\sum p_i = 1$, then the mathematical
expectation of X, denoted by E(X), is defined as
$$E(X) = p_1 x_1 + p_2 x_2 + \cdots + p_n x_n = \sum p_i x_i$$
where $\sum p_i = p_1 + p_2 + \cdots + p_n = 1$.
More compactly, if X is a random variable with probability distribution {x, p(x)}, then
$$E(X) = \sum x \, p(x)$$
Illustration 1: If it rains, a dealer of raincoats earns Rs. 4000 per day, and if it is fair, he
may lose Rs. 800 per day. What is his expectation if the probability of a fair day is 0.6?
Solution:
                   Rainy day        Fair day
Profit/loss (x):   Rs. 4000 (x1)    - Rs. 800 (x2)
Probability (p):   0.4 (p1)         0.6 (p2)
Expected profit, $E(X) = \sum x \, p(x) = x_1 p_1 + x_2 p_2$
= 4000 × 0.4 - 800 × 0.6 = 1600 - 480 = Rs. 1120
Illustration 2: Find the expected sales of Toyota cars in Kathmandu city in a week from
the following information.
Day:           Sun    Mon    Tue    Wed    Thu    Fri
Sales (Rs):    90     60     90     50     75     55
Probability:   0.25   0.18   0.12   0.05   0.20   0.20
Solution: The expected sales of Toyota cars in Kathmandu are
$E(X) = \sum x \, p(x) = x_1 p_1 + x_2 p_2 + x_3 p_3 + x_4 p_4 + x_5 p_5 + x_6 p_6$
= 90 × 0.25 + 60 × 0.18 + 90 × 0.12 + 50 × 0.05 + 75 × 0.20 + 55 × 0.20
= 22.5 + 10.8 + 10.8 + 2.5 + 15 + 11 = 72.6
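The weighted-average definition of E(X) translates directly into code. The following is a minimal sketch, with a hypothetical helper `expected_value` that also checks that the probabilities sum to 1. It is applied to the figures of Illustrations 1 and 2.

```python
import math


def expected_value(values, probabilities):
    """E(X) = sum of x * p(x) over all possible values x."""
    if len(values) != len(probabilities):
        raise ValueError("values and probabilities must have the same length")
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probabilities))


# Illustration 1: raincoat dealer (a loss is entered as a negative profit).
raincoat = expected_value([4000, -800], [0.4, 0.6])

# Illustration 2: daily sales with their probabilities.
sales = expected_value([90, 60, 90, 50, 75, 55],
                       [0.25, 0.18, 0.12, 0.05, 0.20, 0.20])

print(round(raincoat, 2), round(sales, 2))
```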
Illustration 3: A consignment is offered to two firms X and Y for Rs. 100,000. The
following table shows the probabilities with which each firm will be able to sell it at
different prices:
Selling price (Rs):   80,000   90,000   105,000   110,000
Probability for X:    0.10     0.40     0.40      0.10
Probability for Y:    0.25     0.20     0.50      0.05
Which firm, X or Y, will be more inclined towards the offer?
Solution: Expected selling price for firm X is
E(X) = 80000 × 0.1 + 90000 × 0.4 + 105000 × 0.4 + 110000 × 0.1
= 8000 + 36000 + 42000 + 11000 = 97000
Expected selling price for firm Y is
E(Y) = 80000 × 0.25 + 90000 × 0.20 + 105000 × 0.50 + 110000 × 0.05
= 20000 + 18000 + 52500 + 5500 = 96000
Both expected selling prices fall below the offer price of Rs. 100,000, but the expected
price for firm X (Rs. 97,000) is nearer to it than that for firm Y (Rs. 96,000), so firm X
will be more inclined towards the offer.
Exercise: A manufacturing company is faced with the problem of choosing among four
products to manufacture. The potential demand for each product may turn out to be
good, satisfactory or poor. The probabilities estimated for each type of demand are
given below:

Product    Probabilities of demand
           Good     Satisfactory   Poor
A          0.60     0.20           0.20
B          0.75     0.15           0.10
C          0.60     0.25           0.15
D          0.50     0.20           0.30

The estimated profit or loss under the different states of demand in respect of each
product may be taken as
Product    Profit or loss (Rs.) by demand
           Good     Satisfactory   Poor
A          40,000   10,000         11,000
B          40,000   20,000         - 7,000
C          50,000   15,000         - 8,000
D          40,000   18,000         15,000
Prepare the expected value (profit) for each product and advise the company about the
choice of the products to manufacture.

Theoretical Probability Distribution:


Statistical measures such as averages, dispersion, skewness, kurtosis and
correlation, computed for a sample frequency distribution, not only describe the nature
and form of the sample data, but also help us in formulating certain ideas about the
characteristics of the population. However, a more scientific way of drawing inferences
about the population characteristics is through the study of theoretical distributions.
In the population, the values of the variable may be distributed according to some
definite probability law which can be expressed mathematically, and the corresponding
probability distribution is known as a theoretical probability distribution.
As a random variable can take discrete or continuous values, probability
distributions are also of two types, namely discrete and continuous distributions. The
binomial and Poisson distributions are the important discrete distributions and the normal
distribution is the important continuous probability distribution.

Binomial distribution: The binomial distribution is one of the most widely used probability
distributions of a discrete random variable. It is also known as the Bernoulli distribution,
after James Bernoulli. Many experiments consist of repeated independent
trials, each trial having only two possible complementary outcomes. For example, the
possible outcomes of a trial may be head or tail, success or failure, right or wrong, alive
or dead, good or defective, infected or not infected and so on. If the probability of each
outcome remains the same throughout the trials, then such trials are called Bernoulli
trials and an experiment consisting of Bernoulli trials is called a binomial experiment. In
other words, an experiment is called a binomial experiment if it satisfies the following
four conditions:
i) The random experiment is performed repeatedly a finite and fixed number of times,
say n times.
ii) The outcome of the random experiment can be classified into two mutually
disjoint categories: success (the occurrence of the event) and failure (the
non-occurrence of the event).
iii) All the trials are independent, i.e. the result of any trial is not affected by the
preceding trials and does not affect the result of succeeding trials.
iv) The probability of success in any trial is denoted by p and the probability of failure
by q, and both are constant for each trial.
Thus, if X denotes the number of successes in n trials of a binomial experiment
satisfying the above conditions, then X is a random variable which can take any one of
the (n + 1) integer values 0, 1, 2, ..., n, and the probability of r successes out of n
trials is given by
$$p(r) = P(X = r) = {}^{n}C_{r} \, p^{r} q^{n-r}, \quad r = 0, 1, 2, \ldots, n$$
where q = 1 - p is the probability of failure on each trial. The binomial probability
distribution has two parameters, n and p.
Putting r = 0, 1, 2, ..., n, we get the probabilities of 0, 1, 2, ..., n successes
respectively in n trials, which are given below:
r     $p(r) = P(X = r) = {}^{n}C_{r} \, p^{r} q^{n-r}$
------------------------------------------------------------
0     $p(0) = P(X = 0) = {}^{n}C_{0} \, p^{0} q^{n-0} = q^{n}$
1     $p(1) = P(X = 1) = {}^{n}C_{1} \, p^{1} q^{n-1}$
2     $p(2) = P(X = 2) = {}^{n}C_{2} \, p^{2} q^{n-2}$
.     ...
.     ...
n     $p(n) = P(X = n) = {}^{n}C_{n} \, p^{n} q^{n-n} = p^{n}$
------------------------------------------------------------
These are the successive terms in the binomial expansion of $(q + p)^{n}$. As these
probabilities are the terms of a binomial expansion, the distribution is called the binomial
distribution, and there are (n + 1) outcomes including the outcome r = 0.
Note that the total probability is unity, i.e. 1:
$$\sum p(r) = p(0) + p(1) + p(2) + \cdots + p(n) = q^{n} + {}^{n}C_{1} \, p^{1} q^{n-1} + {}^{n}C_{2} \, p^{2} q^{n-2} + \cdots + p^{n} = (q + p)^{n} = 1$$
The mean of the binomial distribution = np
and variance = npq
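The three facts just stated (total probability 1, mean np, variance npq) can be checked numerically. This is a sketch only: the function names are hypothetical and n = 10, p = 0.3 are arbitrary values chosen for illustration.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) = nCr * p^r * q^(n-r) with q = 1 - p."""
    q = 1.0 - p
    return comb(n, r) * p**r * q**(n - r)


def binomial_summary(n, p):
    """Return (total probability, mean, variance) computed term by term."""
    probs = [binomial_pmf(r, n, p) for r in range(n + 1)]
    total = sum(probs)
    mean = sum(r * pr for r, pr in enumerate(probs))
    variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(probs))
    return total, mean, variance


n, p = 10, 0.3  # arbitrary illustrative parameters
total, mean, variance = binomial_summary(n, p)
print(total, mean, n * p, variance, n * p * (1 - p))
```

Up to floating-point rounding, the printed mean should agree with np and the variance with npq.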
Example 1: Four unbiased coins are tossed simultaneously. What is the probability of
getting
i) exactly 3 heads
ii) at least one head
Solution: Let r be the number of heads. Here n = 4, p = 1/2, q = 1/2.
We have, $p(r) = P(X = r) = {}^{n}C_{r} \, p^{r} q^{n-r}$
i) $P(X = 3) = {}^{4}C_{3} (1/2)^{3} (1/2)^{4-3} = 4 \times \frac{1}{16} = \frac{1}{4}$
ii) $P(X \ge 1) = 1 - P(X = 0) = 1 - (1/2)^{4} = \frac{15}{16}$
Example 2: The average percentage of failures in a certain examination is 40. What is
the probability that out of a group of 6 candidates, at least 4 passed the examination?
Here, n = 6, q = 40% = 0.4, p = 1 - q = 1 - 0.4 = 0.6
$P(r \ge 4) = ?$
$P(r \ge 4) = p(4) + p(5) + p(6)$
$= {}^{6}C_{4} (0.6)^{4} (0.4)^{2} + {}^{6}C_{5} (0.6)^{5} (0.4)^{1} + {}^{6}C_{6} (0.6)^{6} (0.4)^{0}$
= 15 × 0.1296 × 0.16 + 6 × 0.07776 × 0.4 + 0.046656
= 0.311040 + 0.186624 + 0.046656 = 0.544320
Example 3: The incidence of an occupational disease in an industry is such that the workers
have a 20% chance of suffering from it. What is the probability that out of 6 workers, 4 or
more will contract the disease?
Here, n = 6, p = 20% = 20/100 = 1/5
q = 1 - p = 1 - 1/5 = 4/5
$P(r \ge 4) = ?$
$P(r \ge 4) = p(4) + p(5) + p(6) = {}^{6}C_{4} (1/5)^{4} (4/5)^{2} + {}^{6}C_{5} (1/5)^{5} (4/5)^{1} + {}^{6}C_{6} (1/5)^{6} (4/5)^{0}$
$= \frac{1}{5^{6}} \left[ 15 \times 16 + 6 \times 4 + 1 \right] = \frac{1}{15625} \left[ 240 + 24 + 1 \right] = \frac{265}{15625} = 0.01696$
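"At least k successes" probabilities such as those in Examples 2 and 3 are sums of binomial terms from r = k to r = n. A small sketch, with the hypothetical helper name `prob_at_least`, reproduces both results:

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) = nCr * p^r * q^(n-r)."""
    return comb(n, r) * p**r * (1.0 - p) ** (n - r)


def prob_at_least(k, n, p):
    """P(X >= k) = p(k) + p(k+1) + ... + p(n)."""
    return sum(binomial_pmf(r, n, p) for r in range(k, n + 1))


# Example 2: at least 4 of 6 candidates pass, p = 0.6.
passed = prob_at_least(4, n=6, p=0.6)
# Example 3: 4 or more of 6 workers contract the disease, p = 0.2.
diseased = prob_at_least(4, n=6, p=0.2)
print(round(passed, 6), round(diseased, 5))
```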

Exercise: What are the chances of getting each combination, i.e. 2 boys, 2 girls, or 1 boy
and 1 girl, when the number of pregnancies is 2? Assume equal probability for boys and
girls.
Exercise: In a box containing 81 screws, 27 were defective. What is the probability
that a sample of 6 screws contains i) all defectives ii) no defectives iii) at least one
defective iv) exactly 3 defectives v) at most two defectives?
Exercise: In one region of the country, 20% of the children under 6 years of age were
found to be severely malnourished, i.e. in grade III and IV. If 4 children were selected at
random, what is the probability of 0, 1 and 2 children being severely malnourished?

Fitting of Binomial Distribution: If a random experiment consisting of n trials is
repeated N times satisfying the conditions of the binomial distribution, then the frequency
of r successes is given by
$$f(r) = N \, p(r) = N \, {}^{n}C_{r} \, p^{r} q^{n-r}$$
Putting r = 0, 1, 2, ..., n, we get the expected or theoretical frequencies of the
binomial distribution.

r     $f(r) = N \, p(r) = N \, {}^{n}C_{r} \, p^{r} q^{n-r}$
------------------------------------------------------------
0     $f(0) = N \, p(0) = N \, {}^{n}C_{0} \, p^{0} q^{n-0} = N q^{n}$
1     $f(1) = N \, p(1) = N \, {}^{n}C_{1} \, p^{1} q^{n-1}$
2     $f(2) = N \, p(2) = N \, {}^{n}C_{2} \, p^{2} q^{n-2}$
.     ...
.     ...
n     $f(n) = N \, p(n) = N \, {}^{n}C_{n} \, p^{n} q^{n-n} = N p^{n}$
------------------------------------------------------------
If p, the probability of success, is known, then the frequencies of r successes can
be calculated very easily. However, if p is not known and we want to fit a binomial
distribution to a given frequency distribution, we first find the mean of the given
frequency distribution by the formula $\bar{x} = \frac{\sum fx}{N}$ and equate it to np, which is the mean
of the binomial probability distribution. Hence, p can be estimated by the following
relation:
$$\bar{x} = np \;\Rightarrow\; p = \frac{\bar{x}}{n}, \quad \text{then } q = 1 - p$$
With these values of p and q the expected or theoretical binomial frequencies can be
obtained by using the above formula.
Example 4: Fit a binomial distribution to the following data:
x: 0 1 2 3 4
f: 28 62 46 10 4

x        f        fx
-----    -----    ------
0        28       0
1        62       62
2        46       92
3        10       30
4        4        16
-----    -----    ------
         N = 150  200

$\bar{x} = \frac{\sum fx}{N} = \frac{200}{150} = \frac{4}{3}$
$\bar{x} = np$, or $\frac{4}{3} = 4p \Rightarrow p = 1/3$ and $q = 1 - p = 2/3$

We have, $f(r) = N \, {}^{n}C_{r} \, p^{r} q^{n-r}$
Putting r = 0, 1, 2, 3, 4, we get the expected frequencies.
$f(0) = 150 \times {}^{4}C_{0} (1/3)^{0} (2/3)^{4}$ = 150 × 16/81 = 29.63 ≈ 30
$f(1) = 150 \times {}^{4}C_{1} (1/3)^{1} (2/3)^{3}$ = 150 × 0.3951 = 59.26 ≈ 59
$f(2) = 150 \times {}^{4}C_{2} (1/3)^{2} (2/3)^{2}$ = 150 × 0.2963 = 44.44 ≈ 44
$f(3) = 150 \times {}^{4}C_{3} (1/3)^{3} (2/3)^{1}$ = 150 × 0.0988 = 14.81 ≈ 15
$f(4) = 150 \times {}^{4}C_{4} (1/3)^{4} (2/3)^{0}$ = 150 × 0.0123 = 1.85 ≈ 2
Hence the fitted distribution is
x: 0 1 2 3 4 Total
f: 30 59 44 15 2 150
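The fitting procedure (estimate p from the sample mean, then scale the binomial probabilities by N) can be sketched as follows. The function name `fit_binomial` is an assumption for illustration. The data are those of Example 4.

```python
from math import comb


def fit_binomial(values, freqs, n):
    """Estimate p from the sample mean (mean = n * p) and return expected frequencies."""
    N = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / N
    p = mean / n
    q = 1.0 - p
    # Expected frequency of r successes: N * nCr * p^r * q^(n-r).
    expected = [N * comb(n, r) * p**r * q ** (n - r) for r in range(n + 1)]
    return p, expected


p_hat, expected = fit_binomial([0, 1, 2, 3, 4], [28, 62, 46, 10, 4], n=4)
print(round(p_hat, 4), [round(f, 2) for f in expected])
```

Rounding each expected frequency to the nearest whole number gives the fitted distribution tabulated above.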
Exercise: 8 coins are tossed at a time 256 times. Find the expected frequencies of
successes (getting a head) and tabulate the result obtained. Also obtain the values of the
mean and standard deviation of the theoretical (fitted) distribution.
Exercise: The probability of a bomb hitting a target is 1/5. Two bombs are enough to
destroy a bridge. If six bombs are aimed at the bridge, find the probability that the
bridge is destroyed.
(Hints: The bridge will be destroyed if at least two bombs hit it.)

Poisson Distribution: The Poisson distribution was derived in 1837 by the French
mathematician Simeon D. Poisson. It is a limiting form of the binomial
probability distribution and may be used under the following conditions:
(i) n, the number of trials, is indefinitely large, i.e. $n \to \infty$
(ii) p, the probability of success for each trial, is indefinitely small, i.e. $p \to 0$
(iii) np = m (say) is finite.
If the random variable X satisfies the above conditions, then the
probability mass function of the Poisson distribution is given by
$$p(r) = P(X = r) = \frac{e^{-m} m^{r}}{r!}, \quad r = 0, 1, 2, \ldots$$
where, X is the number of successes (occurrences of the event),
m = np is the mean number of occurrences, called the parameter,
e = 2.7183 is the exponential constant,
r = number of successes.
The Poisson distribution describes the behavior of rare events such as the number
of accidents on a road, the number of printing mistakes in a book, the number of
earthquakes in a year, etc. All these are discrete random variables for which the
probability of occurrence of the event is very small and the total number of possible cases
is sufficiently large.
The Poisson distribution is a good approximation of the binomial distribution when the
number of trials is greater than or equal to 20 and the probability of success is less than
or equal to 0.05 (5 percent). Hence this distribution is usually appropriate when $n \ge 20$
and $p \le 0.05$.
Remark: In the Poisson distribution, mean = variance = m.
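The remark that the mean and the variance both equal m can be checked numerically by summing the Poisson probabilities. This is a sketch only: the function names are hypothetical, and the infinite sum is truncated at an assumed cut-off `r_max`, beyond which the terms are negligible for moderate m.

```python
from math import exp, factorial


def poisson_pmf(r, m):
    """P(X = r) = e^(-m) * m^r / r!"""
    return exp(-m) * m**r / factorial(r)


def poisson_moments(m, r_max=100):
    """Return (total probability, mean, variance) from the first r_max + 1 terms."""
    probs = [poisson_pmf(r, m) for r in range(r_max + 1)]
    total = sum(probs)
    mean = sum(r * p for r, p in enumerate(probs))
    variance = sum((r - mean) ** 2 * p for r, p in enumerate(probs))
    return total, mean, variance


total, mean, variance = poisson_moments(4.0)
print(total, mean, variance)
```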
Example 1: It is known from past experience that in a certain plant there are on
average 4 industrial accidents per year. Find the probability that in a given year there
will be fewer than 4 accidents. Assume a Poisson distribution. ($e^{-4} = 0.0183$)
Solution: Given, m = 4, P(X < 4) = ?
We have, $p(r) = P(X = r) = \frac{e^{-m} m^{r}}{r!}$
$P(X < 4) = P(X=0) + P(X=1) + P(X=2) + P(X=3) = \frac{e^{-4} 4^{0}}{0!} + \frac{e^{-4} 4^{1}}{1!} + \frac{e^{-4} 4^{2}}{2!} + \frac{e^{-4} 4^{3}}{3!}$
$= e^{-4} \left( 1 + 4 + \frac{16}{2} + \frac{64}{6} \right)$
= 0.0183 × (1 + 4 + 8 + 10.67) = 0.0183 × 23.67 = 0.4332
Example 2: If 5% of the electric bulbs manufactured by a company are defective, use
the Poisson distribution to find the probability that in a sample of 100 bulbs (i) none is
defective (ii) 5 bulbs will be defective. (Given $e^{-5} = 0.007$)
Solution: Probability of an electric bulb being defective, $p = \frac{5}{100} = 0.05$, n = 100
Now, m = np = 0.05 × 100 = 5
We have, $p(r) = P(X = r) = \frac{e^{-m} m^{r}}{r!}$
(i) $P(X = 0) = \frac{e^{-5} 5^{0}}{0!} = 0.007$
(ii) $P(X = 5) = \frac{e^{-5} 5^{5}}{5!} = 0.007 \times \frac{5 \times 5 \times 5 \times 5 \times 5}{1 \times 2 \times 3 \times 4 \times 5} = 0.1823$
Example 3: The quality control manager of Marlyin's Cookies is inspecting a batch of
chocolate-chip cookies that has just been baked. If the production process is in control, the
average number of chip parts per cookie is 6. What is the probability that in any
particular cookie being inspected
(i) Fewer than five chip parts will be found?
(ii) Exactly five chip parts will be found?
(iii) Five or more chip parts will be found?
(iv) Four or five chip parts will be found?
Solution: Given, m = 6
We have, $P(X = r) = \frac{e^{-m} m^{r}}{r!}$
(i) P(X < 5) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)
$= \frac{e^{-6} 6^{0}}{0!} + \frac{e^{-6} 6^{1}}{1!} + \frac{e^{-6} 6^{2}}{2!} + \frac{e^{-6} 6^{3}}{3!} + \frac{e^{-6} 6^{4}}{4!} = e^{-6} \left[ \frac{6^{0}}{1} + \frac{6^{1}}{1} + \frac{6^{2}}{2} + \frac{6^{3}}{6} + \frac{6^{4}}{24} \right]$
= 0.00248 × [1 + 6 + 18 + 36 + 54] = 0.00248 × 115 = 0.2852
(ii) $P(X = 5) = \frac{e^{-6} 6^{5}}{5!} = \frac{0.00248 \times 7776}{120}$ = 0.1607
(iii) P(X ≥ 5) = 1 - P(X < 5) = 1 - 0.2852 = 0.7148
(iv) $P(X = 4 \text{ or } 5) = P(X = 4) + P(X = 5) = \frac{e^{-6} 6^{4}}{4!} + \frac{e^{-6} 6^{5}}{5!} = e^{-6} \left[ \frac{6^{4}}{4!} + \frac{6^{5}}{5!} \right]$
$= 0.00248 \left[ \frac{6 \times 6 \times 6 \times 6}{1 \times 2 \times 3 \times 4} + \frac{6 \times 6 \times 6 \times 6 \times 6}{1 \times 2 \times 3 \times 4 \times 5} \right]$ = 0.00248 × [54 + 64.8]
= 0.00248 × 118.8 = 0.2946
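The four events of the cookie example can be expressed through one probability function and one cumulative sum. The sketch below uses hypothetical helper names and the library value of e^(-6) rather than the rounded 0.00248, so the last decimal place may differ slightly from the hand calculation.

```python
from math import exp, factorial


def poisson_pmf(r, m):
    """P(X = r) = e^(-m) * m^r / r!"""
    return exp(-m) * m**r / factorial(r)


def poisson_cdf(k, m):
    """P(X <= k) = p(0) + p(1) + ... + p(k)."""
    return sum(poisson_pmf(r, m) for r in range(k + 1))


m = 6
fewer_than_five = poisson_cdf(4, m)                    # (i)   P(X < 5)
exactly_five = poisson_pmf(5, m)                       # (ii)  P(X = 5)
five_or_more = 1 - fewer_than_five                     # (iii) P(X >= 5), by complement
four_or_five = poisson_pmf(4, m) + poisson_pmf(5, m)   # (iv)  P(X = 4 or 5)
print(fewer_than_five, exactly_five, five_or_more, four_or_five)
```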

Fitting of Poisson Distribution: If we want to fit a Poisson distribution to a given
frequency distribution, we first compute the mean $\bar{x}$ of the given frequency distribution
and set $\bar{x} = m$. Once m is known, the various probabilities of the Poisson
distribution can be obtained by using the formula
$$p(r) = P(X = r) = \frac{e^{-m} m^{r}}{r!}$$
If N is the total frequency, then the expected or theoretical frequencies of
the Poisson distribution are given by
$$f(r) = N \, p(r) = N \, P(X = r) = N \, \frac{e^{-m} m^{r}}{r!}$$
If we first calculate the expected frequency for r = 0, then the other expected or
theoretical frequencies can be obtained very conveniently as follows.
When r = 0, $f(0) = N \, p(0) = N \, \frac{e^{-m} m^{0}}{0!} = N e^{-m}$
Then, $f(1) = N \, p(1) = N \, \frac{e^{-m} m^{1}}{1!} = f(0) \cdot \frac{m}{1}$
$f(2) = N \, p(2) = N \, \frac{e^{-m} m^{2}}{2!} = f(1) \cdot \frac{m}{2}$
$f(3) = N \, p(3) = N \, \frac{e^{-m} m^{3}}{3!} = f(2) \cdot \frac{m}{3}$
and so on.

Example 4: A systematic sample of 100 pages was taken from the Concise Oxford
Dictionary and the observed frequency distribution of foreign words per page was found to
be as follows.
No. of foreign words per page (x): 0 1 2 3 4 5 6
Frequency (f):                     48 27 12 7 4 1 1
Calculate the expected frequencies using the Poisson distribution. Also compute the
variance of the fitted distribution.
x f fx
----- ------ -------
0 48 0
1 27 27
2 12 24
3 7 21
4 4 16
5 1 5
6 1 6
------ ------ ------
N = 100 99
We have, $\bar{x} = \frac{\sum fx}{N} = \frac{99}{100} = 0.99 = m$
Then, $f(r) = N \, p(r) = N \, \frac{e^{-m} m^{r}}{r!}$
Putting r = 0, we have, $f(0) = 100 \times \frac{e^{-0.99} (0.99)^{0}}{0!} = 100 \, e^{-0.99}$ = 100 ……
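The recurrence f(r) = f(r-1) · m / r makes the fitting easy to carry out mechanically. The following is a sketch under the assumption that the mean of the observed data is used as m, as described above. The function name `fit_poisson` is hypothetical, and the data are those of the dictionary example.

```python
from math import exp


def fit_poisson(values, freqs):
    """Set m to the sample mean and build expected frequencies by recurrence."""
    N = sum(freqs)
    m = sum(x * f for x, f in zip(values, freqs)) / N
    expected = [N * exp(-m)]  # f(0) = N * e^(-m)
    for r in range(1, max(values) + 1):
        # f(r) = f(r - 1) * m / r
        expected.append(expected[-1] * m / r)
    return m, expected


m, expected = fit_poisson([0, 1, 2, 3, 4, 5, 6], [48, 27, 12, 7, 4, 1, 1])
for r, f in enumerate(expected):
    print(r, round(f, 2))
# By the remark on the Poisson distribution, the variance of the fitted
# distribution equals its mean m.
print("variance of fitted distribution:", m)
```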
Poisson Approximation to Binomial Distribution:

The binomial distribution is suitable when the number of trials (n) is small and the
probability of success p is constant in each trial. However, if the number of trials n is
large and the probability of success p is small, the Poisson
distribution becomes a reasonable approximation of the binomial distribution. Thus, the
Poisson approximation to the binomial distribution is appropriate when
(i) the number of trials is greater than or equal to 20 (i.e. $n \ge 20$), and
(ii) the probability of success p is small (generally less than or equal to 0.05).
The binomial distribution is converted into a Poisson distribution by equating the
mean of the binomial distribution to the mean of the Poisson distribution, i.e. np = m.
Let r denote the number of successes; then the probability of getting r successes is
given by
$$P(X = r) = \frac{e^{-m} m^{r}}{r!} = \frac{e^{-np} (np)^{r}}{r!}, \quad r = 0, 1, 2, 3, \ldots$$
Example 5: Five percent of the tools produced in a certain manufacturing process turn
out to be defective. Find the probability that in a sample of 30 tools chosen at random
exactly 4 will be defective by using
(i) the binomial distribution
(ii) the Poisson approximation to the binomial distribution
Solution:
Here, p = probability of being defective = 0.05
q = probability of not being defective = 1 - p = 1 - 0.05 = 0.95
n = number of trials = 30
(i) Using the binomial distribution, the probability of r defectives out of n is
$p(r) = P(X = r) = {}^{n}C_{r} \, p^{r} q^{n-r}$, r = 0, 1, 2, 3, ..., 30
Hence, the probability of exactly 4 defective tools is
$P(X = 4) = {}^{30}C_{4} (0.05)^{4} (0.95)^{30-4}$ = 27405 × 0.00000625 × 0.26352 = 0.045
(ii) Using the Poisson approximation to the binomial distribution, the probability of r
defectives is given by
$P(X = r) = \frac{e^{-m} m^{r}}{r!}$
m = np = 30 × 0.05 = 1.5
Hence the probability of exactly 4 defective tools is
$P(X = 4) = \frac{e^{-1.5} (1.5)^{4}}{4!} = \frac{0.2231 \times 5.0625}{24}$ = 0.047
(the error in the approximation = 0.047 - 0.045 = 0.002)
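The comparison between the exact binomial probability and its Poisson approximation can be reproduced with a short sketch. The helper names are hypothetical, and the parameters are those of the tools example (n = 30, p = 0.05, r = 4).

```python
from math import comb, exp, factorial


def binomial_pmf(r, n, p):
    """Exact binomial probability nCr * p^r * q^(n-r)."""
    return comb(n, r) * p**r * (1.0 - p) ** (n - r)


def poisson_pmf(r, m):
    """Poisson probability e^(-m) * m^r / r!"""
    return exp(-m) * m**r / factorial(r)


n, p, r = 30, 0.05, 4
exact = binomial_pmf(r, n, p)
approx = poisson_pmf(r, n * p)  # m = np
print(round(exact, 4), round(approx, 4), round(approx - exact, 4))
```

Trying larger n with smaller p (keeping np fixed) is a simple way to see the approximation error shrink.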

Exercises:
1. The chance that a bottle of medicine produced by a manufacturer is defective
is 1/1000. The bottles are packed in boxes containing 500 bottles each. A drug
manufacturer buys 100 boxes from the producer of bottles. How many boxes are
expected to contain (i) no defective bottles (ii) fewer than 2 defective bottles?
2. The probability of getting no misprint in a page of a book is $e^{-4}$. What is the
probability that a page contains more than 2 misprints?

Normal Distribution:

A distribution is discrete if the variable takes distinct values such as 1, 2, 3, …,


but not those in between. Continuous means that all values are possible for the variable.
In this section, we shall turn to cases in which the variable can take on any value within
a given range and in which the probability distribution is continuous. Normal
distribution is one of the most used and best-known continuous theoretical distributions
in statistics. Several mathematicians were instrumental in its development, including the
eighteenth-century mathematician-astronomer Karl Gauss. In honor of his work, the
normal probability distribution is often called the Gaussian distribution.
There are two basic reasons why the normal distribution occupies such a
prominent place in statistics. First, it has some properties that make it applicable to a
great many situations in which it is necessary to make inferences by taking samples. We
will find that the normal distribution is a useful sampling distribution. Second, the
normal distribution comes close to fitting the actual observed frequency distributions of
many phenomena relating to economics, business, social and physical science,
including human characteristics such as weights, heights and IQs and so on.
A continuous random variable X is said to be normally distributed with mean $\mu$
and standard deviation $\sigma$ if it has a probability density function represented by the
equation
$$p(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}}, \quad -\infty < x < \infty$$
where, x = the value of the random variable
$\mu$ = mean of the distribution of the random variable
$\sigma$ = the standard deviation of the distribution
$\pi$ = 3.14159 (approximately 22/7)
e = 2.7183

Remarks: 1. The mean $\mu$ and standard deviation $\sigma$ are called the parameters of the
normal distribution. Symbolically, the statement that x follows the normal distribution with
mean $\mu$ and standard deviation $\sigma$ (or variance $\sigma^{2}$) is written as $x \sim N(\mu, \sigma^{2})$.
2. If x is a random variable following the normal distribution with mean $\mu$ and
standard deviation $\sigma$, then the random variable z defined as
$$z = \frac{x - \mu}{\sigma}$$
is called the standard normal variate.
The standard normal distribution has zero mean and standard deviation unity
and is denoted by N(0,1)
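The density formula and the standardization z = (x - μ)/σ can be written out directly. The sketch below uses hypothetical function names and compares the hand-written density with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The values μ = 18 and σ = 6.45 are used purely as illustrative parameters.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def standardize(x, mu, sigma):
    """Standard normal variate z = (x - mu) / sigma."""
    return (x - mu) / sigma


def normal_pdf(x, mu, sigma):
    """Normal density written exactly as in the formula above."""
    z = standardize(x, mu, sigma)
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


mu, sigma = 18.0, 6.45  # illustrative parameters only
x = 22.0
by_formula = normal_pdf(x, mu, sigma)
by_library = NormalDist(mu, sigma).pdf(x)
print(standardize(x, mu, sigma), by_formula, by_library)
```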

Some important properties of normal distribution:

1. It is perfectly symmetrical about the mean and its curve is bell-shaped.
2. In a normal distribution, Mean = Median = Mode.
3. It is uni-modal, i.e. it has only one mode.
4. The ordinate at the mean of the normal curve divides the total area under the
normal curve into two equal parts, each of area 0.5.
5. The two tails of the normal distribution extend indefinitely on both sides and never
touch the horizontal axis.
6. The normal distribution is a limiting case of the binomial and Poisson distributions.

7. Area Property:

(i) The area under the normal probability curve between the ordinates at $x = \mu - \sigma$
and $x = \mu + \sigma$ is 0.6826, i.e. $\mu \pm \sigma$ covers 68.26% of the observations.
(ii) The area under the normal probability curve between the ordinates at $x = \mu - 2\sigma$
and $x = \mu + 2\sigma$ is 0.9544, i.e. $\mu \pm 2\sigma$ covers 95.44% of the
observations.
(iii) The area under the normal probability curve between the ordinates at $x = \mu - 3\sigma$
and $x = \mu + 3\sigma$ is 0.9973, i.e. $\mu \pm 3\sigma$ covers 99.73% of the
observations.

We have, $z = \frac{x - \mu}{\sigma}$
When $x = \mu + \sigma$, $z = \frac{\mu + \sigma - \mu}{\sigma} = 1$; when $x = \mu - \sigma$, $z = \frac{\mu - \sigma - \mu}{\sigma} = -1$
When $x = \mu + 2\sigma$, $z = \frac{\mu + 2\sigma - \mu}{\sigma} = 2$; when $x = \mu - 2\sigma$, $z = \frac{\mu - 2\sigma - \mu}{\sigma} = -2$
When $x = \mu + 3\sigma$, $z = \frac{\mu + 3\sigma - \mu}{\sigma} = 3$; when $x = \mu - 3\sigma$, $z = \frac{\mu - 3\sigma - \mu}{\sigma} = -3$

Hence the area under the standard normal probability curve
(i) between the ordinates at $z = \pm 1$ is 0.6826
(ii) between the ordinates at $z = \pm 2$ is 0.9544
(iii) between the ordinates at $z = \pm 3$ is 0.9973
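The three areas can be computed from the error function, since for a standard normal variate P(-k < z < k) = erf(k/√2). The sketch below uses a hypothetical helper name. Values read from a printed z table may differ from the computed ones in the fourth decimal place.

```python
from math import erf, sqrt


def area_within(k):
    """Area under the standard normal curve between z = -k and z = +k."""
    return erf(k / sqrt(2))


areas = {k: area_within(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(f"z = +/-{k}: {area:.4f}")
```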

Example 1: The mean inside diameter of 500 washers produced by a machine is 5.02 mm
and the standard deviation is 0.05 mm. The purpose for which these washers are
intended allows a tolerance in the diameter of 4.96 to 5.08 mm; otherwise
the washers are considered defective. Determine the percentage of defective washers
produced by the machine, assuming the diameters are normally distributed.
Given, N = 500, $\mu$ = 5.02 mm, $\sigma$ = 0.05 mm
We have, $z = \frac{x - \mu}{\sigma}$
When x = 4.96, $z = \frac{4.96 - 5.02}{0.05} = \frac{-0.06}{0.05} = -1.2$
Again, when x = 5.08, $z = \frac{5.08 - 5.02}{0.05} = \frac{0.06}{0.05} = 1.2$

Now, P(4.96 < x < 5.08) = P(-1.2 < z < 1.2)
= P(-1.2 < z < 0) + P(0 < z < 1.2)
= 0.3849 + 0.3849 = 0.7698
Therefore, the probability of non-defective washers = 0.7698
Therefore, the probability of defective washers = 1 - 0.7698 = 0.2302
Therefore, the percentage of defective washers = 0.2302 × 100 = 23.02%

Example 2: A banker claimed that the life of a regular saving account opened in his
bank averages 18 months with a standard deviation of 6.45 months. What is the
probability that:
(i) There will still be money in a saving account between 20 and 22 months after it is
opened by a depositor?
(ii) The account will be closed (no money in the deposit) after 2 years?
(Given: the area under the standard normal curve between 0 and z is 0.1217 for z = 0.31,
0.2324 for z = 0.62 and 0.3238 for z = 0.93)
Solution: With the usual notation, $\mu$ = 18 months, $\sigma$ = 6.45 months
(i) The probability that there will still be money in the saving account between 20 and
22 months is P(20 < X < 22).
Let $z = \frac{x - \mu}{\sigma}$
When $x_1 = 20$, $z_1 = \frac{20 - 18}{6.45} = 0.31$, and
when $x_2 = 22$, $z_2 = \frac{22 - 18}{6.45} = 0.62$
Therefore, P(20 < X < 22) = P(0.31 < z < 0.62) = P(0 < z < 0.62) - P(0 < z < 0.31)
= 0.2324 - 0.1217 = 0.1107
(ii) The probability that the account will be closed (no money in the deposit) after two
years (24 months) is P(x > 24).
When x = 24, $z = \frac{24 - 18}{6.45} = 0.93$
$P(x > 24) = P(z > 0.93) = P(0 < z < \infty) - P(0 < z < 0.93)$ = 0.5 - 0.3238 = 0.1762
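Interval probabilities of this kind are differences of cumulative probabilities. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) with a hypothetical helper `prob_between`. Because it does not round z to two decimals, its results can differ from the table-based answers in the third or fourth decimal place.

```python
from statistics import NormalDist


def prob_between(lo, hi, mu, sigma):
    """P(lo < X < hi) for X ~ N(mu, sigma^2), as a difference of CDF values."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(hi) - dist.cdf(lo)


# Washers: diameters within the tolerance 4.96 mm to 5.08 mm are non-defective.
non_defective = prob_between(4.96, 5.08, mu=5.02, sigma=0.05)
defective_pct = 100 * (1 - non_defective)

# Saving accounts: life between 20 and 22 months, and life beyond 24 months.
between_20_22 = prob_between(20, 22, mu=18, sigma=6.45)
beyond_24 = 1 - NormalDist(18, 6.45).cdf(24)

print(round(defective_pct, 2), round(between_20_22, 4), round(beyond_24, 4))
```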

Example 3: The mean yield per one-acre plot is 662 kg with a standard deviation of 32 kg.
Assuming a normal distribution, how many one-acre plots in a batch of 1000 plots would you
expect to have yielded (i) over 700 kg, (ii) below 650 kg, and (iii) what is the lowest
yield of the best 100 plots?
Solution: Let the random variable x denote the yield (in kg) per one-acre plot; then
$\mu$ = 662 and $\sigma$ = 32
The standard normal variate z is given by $z = \frac{x - \mu}{\sigma}$
(i) The probability that a plot has yielded over 700 kg is given by P(x > 700).
When x = 700, $z = \frac{700 - 662}{32} = 1.19$
Therefore, $P(x > 700) = P(z > 1.19) = P(0 < z < \infty) - P(0 < z < 1.19)$
= 0.5 - 0.3830 = 0.1170
Hence, in a batch of 1000 plots, the expected number of plots with yields over
700 kg is 1000 × 0.1170 = 117
(ii) The probability that a plot has a yield below 650 kg is P(x < 650).
When x = 650, $z = \frac{650 - 662}{32} = -0.38$
Therefore, $P(x < 650) = P(z < -0.38) = P(-\infty < z < 0) - P(-0.38 < z < 0)$
= 0.5 - P(0 < z < 0.38) (by symmetry)
= 0.5 - 0.1480 = 0.3520
Hence, in a batch of 1000 plots, the expected number of plots with yields below
650 kg is 1000 × 0.3520 = 352
(iii) Let the lowest yield of the best 100 plots be $x_1$; then
$P(X > x_1) = 100/1000 = 0.1$
When $X = x_1$, $z = \frac{x_1 - 662}{32} = z_1$ (say) ………… (i)
Therefore, $P(X > x_1) = P(z > z_1) = 0.1$
or $P(z > z_1) = P(0 < z < \infty) - P(0 < z < z_1) = 0.1$
or $0.5 - P(0 < z < z_1) = 0.1$
or $P(0 < z < z_1) = 0.4$
The area closest to 0.4 in the z table is 0.3997, at z = 1.28
Therefore, $z_1 = 1.28$
Now from equation (i), $\frac{x_1 - 662}{32} = 1.28$, or $x_1 = 662 + 1.28 \times 32 = 702.96$ kg
Hence the best 100 plots have yields over 702.96 kg.
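Part (iii) is an inverse problem: given a tail probability, find the cut-off value. With `statistics.NormalDist` (Python 3.8 or later) this is a call to `inv_cdf`. The sketch below also recomputes parts (i) and (ii). It uses unrounded z values, so the counts differ by a plot or two from the table-based answers.

```python
from statistics import NormalDist

yield_dist = NormalDist(mu=662, sigma=32)
plots = 1000

# (i) expected number of plots yielding over 700 kg
over_700 = plots * (1 - yield_dist.cdf(700))
# (ii) expected number of plots yielding below 650 kg
below_650 = plots * yield_dist.cdf(650)
# (iii) lowest yield of the best 100 plots: P(X > x1) = 0.1, i.e. P(X <= x1) = 0.9
cutoff_best_100 = yield_dist.inv_cdf(1 - 100 / plots)

print(round(over_700), round(below_650), round(cutoff_best_100, 2))
```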

Example 4: The average weight of babies at birth is 3.05 kg with a standard deviation of
0.39 kg. If the birth weights are normally distributed, would you regard:
(a) a weight of 4 kg as abnormal? (b) a weight of 2.5 kg as normal?
Solution:
(a) The normal limits of weight at 1.96 s.d. (3.05 ± 1.96 × 0.39) are 2.29 kg and 3.81
kg, and about 95% of birth weights lie within these limits. A weight of 4 kg falls outside
them and hence is regarded as abnormal.
(b) The weight of 2.5 kg lies within the normal limits of 2.29 kg and 3.81 kg, and as such
it is not taken as abnormal.
Ex1: Which probability distribution would you use to find binomial probabilities in
the following situations: binomial, Poisson or normal?
(a) 112 trials, probability of success 0.06
(b) 15 trials, probability of success 0.4
(c) 650 trials, probability of success 0.02
(d) 59 trials, probability of success 0.1 [(a) normal, (b) binomial (c)Poisson (d)
normal ]

Ex2: The mean weight of products is 68.22 grams with a variance of 10.8 grams.
How many products in a batch of 1000 would you expect (i) to be over 72 grams (ii)
between 70 and 72 grams.
Ex3: Assume the mean height of soldiers to be 68.52 inches with a variance of 10.8
inches. How many soldiers in a regiment of 1000 would you expect to be over six feet
tall?
Ex4. The average daily sales of 500 branch offices were Rs 150 thousand and the
standard deviation Rs 150 thousand. Assuming the distribution to be normal indicate
how many branches have sales between
(i) Rs 120 thousand and Rs 145 thousand
(ii) Rs 140 thousand and Rs 165 thousand
Ex5: As result of tests of 2000 electric bulbs manufactured by a company it was
found that life time of the bulbs was normally distributed with an average life of 2040
hours and a standard deviation of 60 hours. On the basis of this information, estimate
the number of bulbs that is expected to burn for (a) more than 2150 hours (b) less than
1960 hours.

Some additional exercises and examples

Problems on Binomial distribution:


1. An accountant is to audit 24 accounts of a firm. Sixteen of these are of highly valued
customers. If the accountant selects 4 of the accounts at random, what is the
probability that he chooses at least one highly valued account?
Hint: Let the probability of selecting a highly valued account be p = 16/24 = 2/3.
$\therefore$ q = 1 - 2/3 = 1/3, n = 4, and we need $P(X \ge 1)$.
We have $p(r) = P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r}$, so
$P(X \ge 1) = P(X=1) + P(X=2) + P(X=3) + P(X=4) = $ …………………..
Alternatively, $P(X \ge 1) = 1 - P(X=0)$. Ans. 80/81
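The complement shortcut in the hint is easy to confirm with exact fractions. This is a small sketch that follows the hint's binomial model, i.e. it assumes the four selections are treated as independent trials with p = 2/3; the variable names are illustrative.

```python
from fractions import Fraction

p = Fraction(16, 24)   # probability of a highly valued account, as in the hint
q = 1 - p              # probability of any other account
n = 4                  # number of accounts selected

# P(X >= 1) = 1 - P(X = 0) = 1 - q^n
p_at_least_one = 1 - q**n
print(p_at_least_one)  # 80/81
```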
2. The screws produced by a certain machine were checked by examining samples of
seven. The following table shows the distribution of 128 samples according to the
number of defective items they contained.
No. of defectives: 0 1 2 3 4 5 6 7 Total
No. of samples: 7 6 19 35 30 23 7 1 128
Fit a binomial distribution and find the expected frequencies if the chance of screw
being defective is ½ . Find the mean and variance of the fitted distribution.
Solution:
No. of defectives (x)   No. of samples (f)   fx
---------------------   ------------------   ---
0                       7                    0
1                       6                    6
2                       19                   38
3                       35                   105
4                       30                   120
5                       23                   115
6                       7                    42
7                       1                    7
---------------------   ------------------   ---
Total                   N = 128              433
We have $\bar{x} = \frac{\sum fx}{N} = \frac{433}{128} = 3.3828$.
Now $\bar{x} = np$, so $p = \frac{\bar{x}}{n} = \frac{3.3828}{7} = 0.48 \approx 0.5$ and $q = 1 - 0.5 = 0.5$.
The expected frequency of r successes (defectives) is given by
$f(r) = N\,p(r) = N\,{}^{n}C_{r}\,p^{r}q^{n-r} = 128\,{}^{7}C_{r}\left(\frac{1}{2}\right)^{r}\left(\frac{1}{2}\right)^{7-r} = 128\,{}^{7}C_{r}\left(\frac{1}{2}\right)^{7} = 128\,{}^{7}C_{r}\cdot\frac{1}{128} = {}^{7}C_{r}$
Putting r = 0, 1, 2, ..., 7, we have
$f(0) = {}^{7}C_{0} = 1$      $f(1) = {}^{7}C_{1} = 7$
$f(2) = {}^{7}C_{2} = 21$     $f(3) = {}^{7}C_{3} = 35$
$f(4) = {}^{7}C_{4} = 35$     $f(5) = {}^{7}C_{5} = 21$
$f(6) = {}^{7}C_{6} = 7$      $f(7) = {}^{7}C_{7} = 1$
Mean of the binomial distribution, np = 7 × 0.5 = 3.5
and variance of the binomial distribution, npq = 7 × 0.5 × 0.5 = 1.75
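The fitting steps above (sample mean, then expected frequencies N·p(r) with p = 1/2) can be sketched in a few lines of Python. The names `observed` and `expected` are illustrative; `math.comb` requires Python 3.8 or later.

```python
from math import comb

# Observed number of samples containing r = 0..7 defective screws.
observed = [7, 6, 19, 35, 30, 23, 7, 1]
n = 7                      # screws per sample
N = sum(observed)          # total number of samples

# Sample mean number of defectives per sample (sum of f*x over N).
sample_mean = sum(r * f for r, f in enumerate(observed)) / N

# Expected frequencies under a binomial model with the given p = 1/2.
p = 0.5
expected = [N * comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]

print(sample_mean)
print(expected)
print(n * p, n * p * (1 - p))   # mean and variance of the fitted distribution
```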
3. Harry Ohme is in charge of the electronics section of a large department store. He
has noticed that the probability that a customer who is just browsing will buy
something is 0.3. Suppose that 15 customers browse in the electronic section each
hour.
(a) What is the probability that at least one browsing customer will buy
something during a specified hour?
(b) What is the probability that at least four browsing customers will buy
something during a specified hour?
(c) What is the probability that no more than four browsing customers will buy
something during a specified hour?
4. A quality control engineer takes daily samples of 10 electronic components and
checks them for inspection. On 200 consecutive working days, he obtained 112 days
with zero defective, 76 days with one defective and 12 days with two defectives. If
these samples can be looked upon as has been drawn from the binomial population,
obtain the expected or theoretical number of days with zero defective, one defective
and two or more defectives.
Solution: Number of trials (sample size), n = 10
No. of defectives (x)   No. of days (f)   fx
---------------------   ---------------   ---
0                       112               0
1                       76                76
2                       12                24
---------------------   ---------------   ---
Total                   N = 200           100
We have $\bar{x} = \frac{\sum fx}{N} = \frac{100}{200} = \frac{1}{2}$.
Now $\bar{x} = np$, so $p = \frac{\bar{x}}{n} = \frac{1}{2 \times 10} = \frac{1}{20} = 0.05$.
Thus q = 1 - p = 1 - 0.05 = 0.95.
The theoretical or expected frequency of r successes out of n trials is given by
$f(r) = N\,p(r) = N\,{}^{n}C_{r}\,p^{r}q^{n-r} = 200\,{}^{10}C_{r}\,(0.05)^{r}(0.95)^{10-r}$
Then the expected frequencies of 0, 1, and 2 or more successes are:
$f(0) = 200 \times {}^{10}C_{0}\,(0.05)^{0}(0.95)^{10} = 200 \times 0.59874 = 119.75 \approx 120$
$f(1) = 200 \times {}^{10}C_{1}\,(0.05)^{1}(0.95)^{9} = 200 \times 10 \times 0.05 \times (0.95)^{9} = 63.02 \approx 63$
Again, the expected frequency of getting 2 or more is
$f(r \ge 2) = N\,[1 - p(0) - p(1)] = N - f(0) - f(1) = 200 - 120 - 63 = 17$
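Here p is not given but estimated from the data, and the last class is open-ended ("two or more"). A minimal sketch of that procedure, with illustrative names, might look like this:

```python
from math import comb

# Observed days with 0, 1 and 2 defectives in daily samples of 10 components.
days = {0: 112, 1: 76, 2: 12}
n = 10
N = sum(days.values())

# Estimate p from the sample mean, using mean = n * p.
mean = sum(x * f for x, f in days.items()) / N
p = mean / n

def binom_pmf(r):
    """Binomial probability of r defectives in a sample of n."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

f0 = N * binom_pmf(0)
f1 = N * binom_pmf(1)
f2_plus = N - f0 - f1      # open-ended class: two or more defectives

print(round(f0), round(f1), round(f2_plus))
```

Subtracting the unrounded f(0) and f(1) from N and rounding afterwards gives the same count for the last class as the worked solution, which subtracts the rounded values.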

Problems on Poisson Distribution


1. A manufacturer of pins knows that on average 2% of the production is defective.
He sells pins in boxes of 100 and guarantees that not more than two pins will be
defective. What is the probability that a box selected randomly (i) will meet the
guaranteed quality (ii) will not meet the guaranteed quality? (Given $e^{-1} = 0.3679$,
$e^{-2} = 0.1353$)
2. In a certain factory turning out razor blades, there is a small chance 1/500 for any
blade to be defective. The blades are supplied in packets of 10. Use the Poisson
distribution to calculate the approximate number of packets containing no defective,
one defective and two defective blades in a consignment of 10,000 packets, given
that $e^{-0.02} = 0.9802$.

Problems on Normal Distribution


1. The marks of 500 candidates in an examination are normally distributed with a mean
of 45 and standard deviation of 20 marks.
(a) Given that the pass mark is 40, estimate the number of candidates who passed
the examination.
(b) If 5% of the candidates obtained a distinction by scoring x marks or more,
estimate the value of x.
(c) If 400 candidates are to be passed, what should be the lowest mark for passing? (Q.7,
2007, PU)
Solution: Let the random variable x denote the marks obtained by the candidates.
Then x follows a normal distribution with mean $\mu$ and standard deviation $\sigma$.
Given, N = 500, $\mu = 45$ and $\sigma = 20$
(a) We have $z = \frac{x - \mu}{\sigma}$.
When x = 40, $z = \frac{40 - 45}{20} = \frac{-5}{20} = -0.25$
If the pass mark is 40, the required area under the normal curve is the area above
x = 40, which can be written as
$P(40 \le x < \infty) = P(-0.25 \le z < \infty) = P(-0.25 \le z \le 0) + P(0 \le z < \infty)$
= 0.0987 + 0.5 = 0.5987
$\therefore$ The expected number of students who passed the examination = 500 ×
0.5987 = 299.35 ≈ 299

(b) Let x denote the lowest mark of the 5% of candidates who scored a distinction.
Then $P(X \ge x) = 0.05$.
We have $z = \frac{X - \mu}{\sigma} = \frac{x - 45}{20} = z_1$ ……………………….(i)
$P(X \ge x) = 0.05$ or $P(z \ge z_1) = 0.05$
or $P(0 \le z < \infty) - P(0 \le z \le z_1) = 0.05$
or $0.5 - P(0 \le z \le z_1) = 0.05$
or $P(0 \le z \le z_1) = 0.45$
From the normal table, the value of z corresponding to the probability 0.45 is
1.645, i.e. $z_1 = 1.645$.
From equation (i), we have
$\frac{x - 45}{20} = 1.645$ or x - 45 = 32.9 or x = 45 + 32.9 = 77.9
Hence, the lowest score of the top 5% is 77.9 marks.
(c) The percentage of candidates to be passed = $\frac{400}{500} \times 100 = 80\%$, i.e. 20% of the
candidates are to be failed. Let x denote the highest score of the lowest 20% of the
candidates. Then
$P(X \le x) = 20\% = 0.20$
We have $z = \frac{X - \mu}{\sigma} = \frac{x - 45}{20} = -z_2$ ………………………(ii)
$P(X \le x) = 0.20$ or $P(z \le -z_2) = 0.20$
or $P(-\infty < z \le 0) - P(-z_2 \le z \le 0) = 0.20$
or $0.5 - P(-z_2 \le z \le 0) = 0.20$
or $P(-z_2 \le z \le 0) = 0.30$
From the normal table, the value closest to 0.30 is 0.2995, at z = 0.84, so $z_2 = 0.84$.
From equation (ii), $\frac{x - 45}{20} = -0.84$ or x - 45 = -16.8 or x = 45 - 16.8 = 28.2
Hence, if 400 candidates out of 500 are to be passed, the lowest mark for passing
= 28.2 marks.
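Parts (a) to (c) use the same three moves: standardise, read an area, or invert an area to get a mark. A minimal sketch with Python's `statistics.NormalDist` (names are illustrative) lets the reader check each result; small differences from the worked values come from the table's rounding of z.

```python
from statistics import NormalDist

marks = NormalDist(mu=45, sigma=20)
n_candidates = 500

# (a) expected number scoring above the pass mark of 40
passed = n_candidates * (1 - marks.cdf(40))

# (b) mark exceeded by only the top 5% of candidates (95th percentile)
distinction_mark = marks.inv_cdf(0.95)

# (c) pass mark that lets 400 of 500 through, i.e. fails the lowest 20%
pass_mark_for_400 = marks.inv_cdf(1 - 400 / n_candidates)

print(passed, distinction_mark, pass_mark_for_400)
```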

2. We have a training program designed to upgrade the supervisory skills of
production-line supervisors. Because the program is self-administered, supervisors
require different numbers of hours to complete the program. A study of past
participants indicates that the mean length of time spent on the program is 500 hours
and that this normally distributed random variable has a standard deviation of 100
hours.
(a) What is the probability that a candidate selected at random will take between
500 and 650 hours to complete the training program?
(b) What is the probability that a candidate selected at random will take more than
700 hours to complete the program?
(c) What is the probability that a candidate selected at random will require fewer
than 580 hours to complete the program?
(d) What is the probability that a candidate chosen at random will take between 420
and 570 hours to complete the program?

3. In a certain examination 20% of the students, who appeared in Statistics paper, got
less than 30 marks and 97% of the students got less than 62 marks. Assuming the
distribution to be normal, find the mean and standard deviation of the distribution.
Solution: Let x be the random variable of marks obtained by the students, which is
normally distributed.
According to the question, we have
P(x < 30) = 0.20 ………..(i)
P(x < 62) = 0.97 ……….(ii)
The standard normal variate is $z = \frac{x - \mu}{\sigma}$.
When x = 30, $z = \frac{30 - \mu}{\sigma} = -z_1$ …………(iii)
When x = 62, $z = \frac{62 - \mu}{\sigma} = z_2$ …………(iv)
Now, P(x < 30) = 0.20 or $P(Z < -z_1) = 0.20$ or $P(-\infty < Z < 0) - P(-z_1 < Z < 0) = 0.20$
or $0.5 - P(0 < Z < z_1) = 0.20$ or $P(0 < Z < z_1) = 0.30$
From the normal table, the value closest to 0.30 is 0.2995, at Z = 0.84.
$\therefore z_1 = 0.84$
From equation (iii) we get $\frac{30 - \mu}{\sigma} = -0.84$ or $30 - \mu = -0.84\,\sigma$ ………….(v)
Again, P(x < 62) = 0.97 or $P(Z < z_2) = 0.97$ or $P(-\infty < Z < 0) + P(0 < Z < z_2) = 0.97$
or $0.5 + P(0 < Z < z_2) = 0.97$ or $P(0 < Z < z_2) = 0.47$
From the normal table, the value closest to 0.47 is 0.4699, at Z = 1.88.
$\therefore z_2 = 1.88$
From equation (iv) we get $\frac{62 - \mu}{\sigma} = 1.88$ or $62 - \mu = 1.88\,\sigma$ ………………..(vi)
Subtracting (v) from (vi), we have
$62 - \mu = 1.88\,\sigma$
$30 - \mu = -0.84\,\sigma$
------------------------------
$32 = 2.72\,\sigma$ or $\sigma = 32 / 2.72 = 11.76$
Substituting the value of $\sigma$ in eqn. (v), we get $30 - \mu = -0.84 \times 11.76$ or $\mu =
30 + 9.88 = 39.88$
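The two percentile conditions form a pair of linear equations in the mean and standard deviation. The sketch below solves them with unrounded z values from `statistics.NormalDist`; the variable names are illustrative, and the results differ from the table-based values only in the second decimal place.

```python
from statistics import NormalDist

std_normal = NormalDist()            # mean 0, s.d. 1

# z values with 20% and 97% of the area to their left
z_30 = std_normal.inv_cdf(0.20)      # negative, plays the role of -z1
z_62 = std_normal.inv_cdf(0.97)      # positive, plays the role of z2

# 30 = mu + z_30 * sigma and 62 = mu + z_62 * sigma; subtract to isolate sigma
sigma = (62 - 30) / (z_62 - z_30)
mu = 30 - z_30 * sigma

print(mu, sigma)
```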
Some More Illustrations on Theoretical Distributions

Illustration 1: What are the chances of getting each combination, viz.
(i) two boys, (ii) two girls and (iii) one boy and one girl, when the number
of pregnancies is 2? Assume equal probability for boy and girl.
Solution:
Let p denote the probability of getting a boy (success) and q the
probability of getting a girl (failure).
Assuming equal probability for boy and girl, we have p = 1/2 and
q = 1/2.
Number of trials (pregnancies), n = 2
If the random variable X denotes the number of boys, then by the
binomial probability law, the probability of r boys is given by
$P(X=r) = p(r) = {}^{n}C_{r}\,p^{r}q^{n-r}$, r = 0, 1, 2, ..., n
(i) $P(X=2) = p(2) = {}^{2}C_{2}\left(\frac{1}{2}\right)^{2}\left(\frac{1}{2}\right)^{2-2} = \frac{1}{4} = 0.25$
(ii) $P(X=0) = p(0) = {}^{2}C_{0}\left(\frac{1}{2}\right)^{0}\left(\frac{1}{2}\right)^{2-0} = \frac{1}{4} = 0.25$
(iii) $P(X=1) = p(1) = {}^{2}C_{1}\left(\frac{1}{2}\right)^{1}\left(\frac{1}{2}\right)^{2-1} = 2 \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{2} = 0.5$
Illustration 2: In a box containing 90 screws, 27 were defective.
What is the probability that out of a sample of 6 screws (i) all are defective
(ii) none is defective (iii) at least one is defective (iv) exactly 3 are defective
(v) at most two are defective?
Solution:
Total number of screws in the box = 90
Number of defective screws in the box = 27
Let the probability of a defective screw (success) be p = 27/90 = 0.3.
Then the probability of a non-defective screw (failure) is q =
1 - p = 1 - 0.3 = 0.7
Sample size (number of trials) = n = 6.
Using the binomial distribution, we have $P(X=r) = p(r) = {}^{n}C_{r}\,p^{r}q^{n-r}$, r = 0, 1,
2, ..., n
(i) Probability that all screws are defective
$P(X=6) = p(6) = {}^{6}C_{6}\,(0.3)^{6}(0.7)^{6-6} = 0.00073$
(ii) Probability of no defective
$P(X=0) = p(0) = {}^{6}C_{0}\,(0.3)^{0}(0.7)^{6-0} = 0.1176$
(iii) Probability of at least one defective
$P(X \ge 1) = 1 - P(X < 1) = 1 - P(X=0) = 1 - 0.1176 = 0.8824$
(iv) Probability of exactly 3 defectives
$P(X=3) = p(3) = {}^{6}C_{3}\,(0.3)^{3}(0.7)^{6-3} = 0.1852$
(v) Probability of at most two defectives
$P(X \le 2)$ = P(X=0 or 1 or 2) = P(X=0) + P(X=1) + P(X=2)
$= {}^{6}C_{0}\,(0.3)^{0}(0.7)^{6-0} + {}^{6}C_{1}\,(0.3)^{1}(0.7)^{6-1} + {}^{6}C_{2}\,(0.3)^{2}(0.7)^{6-2}$
= 0.1176 + 0.3025 + 0.3241 = 0.7442
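Each part of this illustration is a direct evaluation of the binomial probability law. The sketch below wraps the formula in a small helper (the name `binom_pmf` is illustrative, not from the notes) and follows the solution's binomial model with n = 6 and p = 0.3.

```python
from math import comb

n, p = 6, 0.3

def binom_pmf(r, n=n, p=p):
    """P(X = r) for a binomial(n, p) random variable."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

all_defective = binom_pmf(6)                          # part (i)
none_defective = binom_pmf(0)                         # part (ii)
at_least_one = 1 - none_defective                     # part (iii)
exactly_three = binom_pmf(3)                          # part (iv)
at_most_two = sum(binom_pmf(r) for r in range(3))     # part (v)

for value in (all_defective, none_defective, at_least_one,
              exactly_three, at_most_two):
    print(round(value, 4))
```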
Illustration 3: The incidence of occupational disease in an industry
is such that the workers have a 20% chance of suffering from it. What
is the probability that out of six workmen, 4 or more will contract the
disease.
Solution:
p = probability of suffering from the occupational disease = 0.2
q = probability of not suffering from the occupational disease = 1 - p
= 1 - 0.2 = 0.8
n = number of workmen (trials) = 6
Now, the probability that out of 6 workmen, 4 or more will contract
the disease is
$P(X \ge 4)$ = P(X=4 or 5 or 6) = P(X=4) + P(X=5) + P(X=6)
………………………………………….
………………………………………Self exercise.
Illustration 4: A salesman makes a sale on the average to 40
percent of the consumers he contacts. If 4 customers are contacted
today, what is the probability that he makes sales to exactly two?
What assumption is required for your answer?
Solution: Let X represent the number of sales the salesman makes in
a day. Then the possible values of X are 0, 1, 2, 3, 4
Here, n=4, p= 0.4, q=0.6
Using the binomial probability law, we have, P(X=r) = p(r) =
…………Self exercise………
Assumptions: (i) When the salesman contacts a customer, the contact
results in either a sale or no sale.
(ii) The probability of making a sale remains the same for each customer.
(iii) The salesman contacts each customer independently.
Illustration 5: Out of 8000 families in a city with 4 children each,
what percentage would be expected to have (i) 2 boys and 2 girls (ii)
1 boy and 3 girls (iii) no boy (iv) at least one boy (v) at most 2 boys?
Assume equal probabilities for boys and girls.
Solution: Number of children per family, n = 4
Let the probability of having a boy (success) = p = 1/2
Then the probability of having a girl (failure) = q = 1 - p = 1/2
Total number of families, N = 8000
(i) For a family with 4 children, the probability of having
2 boys and 2 girls is given by
$P(X=2) = p(2) = {}^{4}C_{2}\left(\frac{1}{2}\right)^{2}\left(\frac{1}{2}\right)^{4-2} = 6\left(\frac{1}{2}\right)^{4} = \frac{3}{8}$
Hence, the percentage of families having 2 boys and 2 girls =
$\frac{3}{8} \times 100 = 37.5\%$
(ii) For a family with 4 children, the probability of having
1 boy and 3 girls is given by
$P(X=1) = p(1) = {}^{4}C_{1}\left(\frac{1}{2}\right)^{1}\left(\frac{1}{2}\right)^{4-1} = 4\left(\frac{1}{2}\right)^{4} = 4 \times \frac{1}{16} = \frac{1}{4}$
Hence, the percentage of families having 1 boy and 3 girls =
$\frac{1}{4} \times 100 = 25\%$
………Calculate the remaining parts in a similar way………………………………………

Illustration 6: One fifth of one percent of the blades produced by a blade
manufacturing factory turn out to be defective. The blades are
supplied in packets of 10. Use the Poisson distribution to calculate the
approximate number of packets containing no defective, one
defective and two defective blades respectively in a consignment of
100,000 packets. (Given $e^{-0.02} = 0.9802$.)
Solution: Let the random variable X denote the number of defective
blades in a packet of 10. Then it is given that
$m = np = 10 \times \frac{1}{500} = 0.02$
Using the Poisson probability law, we have $P(X=r) = p(r) = \frac{e^{-m} m^{r}}{r!}$
$P(X=0) = \frac{e^{-0.02}\,(0.02)^{0}}{0!} = e^{-0.02} = 0.9802$
Therefore, the required number of packets containing no defective
blade is given by
f(0) = N p(0) = 100,000 × 0.9802 = 98,020
$P(X=1) = e^{-0.02} \times m = 0.9802 \times 0.02 = 0.0196$
Therefore, the required number of packets containing one defective
blade is given by
f(1) = N p(1) = 100,000 × 0.0196 = 1,960
$P(X=2) = \frac{e^{-0.02}\,(0.02)^{2}}{2!} = 0.9802 \times 0.0002 = 0.000196$
Therefore, the required number of packets containing two defective
blades is given by
f(2) = N p(2) = 100,000 × 0.000196 = 19.6 ≈ 20
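The three packet counts follow directly from the Poisson probability law with m = np. A minimal sketch (the helper name `poisson_pmf` is illustrative) is:

```python
from math import exp, factorial

def poisson_pmf(r, m):
    """P(X = r) for a Poisson random variable with mean m."""
    return exp(-m) * m**r / factorial(r)

m = 10 * (1 / 500)     # mean number of defective blades per packet of 10
N = 100_000            # packets in the consignment

# Expected number of packets with 0, 1 and 2 defective blades.
expected_packets = [N * poisson_pmf(r, m) for r in range(3)]
print([round(f, 1) for f in expected_packets])
```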

Illustration 7: 1582 drivers were asked how many accidents they
had caused. 951 had caused no accident, 501 had caused one
accident each, 100 had caused two accidents and 30 had caused
three accidents each. Which distribution is appropriate for these data
and why? Also fit the distribution.
Solution:
The probability of an accident is very small and the number of trials for a
driver is quite large, so the Poisson distribution can suitably be applied.
For the observed distribution, the mean number of accidents is
$m = \frac{951 \times 0 + 501 \times 1 + 100 \times 2 + 30 \times 3}{1582} = \frac{791}{1582} = 0.5$
Now, to fit the Poisson distribution, the expected frequencies can be
computed from
$f(r) = N\,p(r) = N\,\frac{e^{-m} m^{r}}{r!} = 1582\,\frac{e^{-0.5}\,(0.5)^{r}}{r!}$, r = 0, 1, 2, 3
Putting r = 0, 1, 2 and 3, the expected frequencies can be presented in
tabular form:
No. of accidents (r)   Probability p(r) = e^{-0.5} (0.5)^r / r!   Expected frequency f(r) = N p(r)
--------------------   ---------------------------------------   --------------------------------
0                      p(0) = 0.6065                              f(0) = 1582 × p(0) = 959.53 ≈ 960
1                      p(1) = 0.3033                              f(1) = 1582 × p(1) = 479.76 ≈ 480
2                      p(2) = 0.0758                              f(2) = 1582 × p(2) = 119.94 ≈ 120
3                      p(3) = 0.0126                              f(3) = 1582 × p(3) = 19.99 ≈ 20
Alternative Method:
$f(0) = N\,\frac{e^{-m} m^{0}}{0!} = N e^{-m} = 1582\,e^{-0.5} = 959.53 \approx 960$
$f(1) = f(0) \times \frac{m}{1} = 959.53 \times 0.5 = 479.76 \approx 480$
$f(2) = f(1) \times \frac{m}{2} = 479.76 \times \frac{0.5}{2} = 119.94 \approx 120$
$f(3) = f(2) \times \frac{m}{3} = 119.94 \times \frac{0.5}{3} = 19.99 \approx 20$

Illustration 8: A manufacturer of pins knows that on average 2%
of production is defective. He sells pins in boxes of 100 and guarantees
that not more than two pins will be defective. What is the probability
that a box randomly selected (i) will meet the guaranteed quality (ii)
will not meet the guaranteed quality? (Given $e^{-1} = 0.3679$,
$e^{-2} = 0.1353$.)
Solution:
p = probability of a defective pin = 0.02
n = number of pins per box = 100
m = average number of defective pins per box = np = 100 × 0.02 = 2
Now the probability of r defective pins in the selected box is
computed by using the Poisson law as
$P(X=r) = p(r) = \frac{e^{-m} m^{r}}{r!}$, r = 0, 1, 2, …………
(i) The probability that the box will meet the guaranteed quality, i.e. that the
number of defective pins per box is not more than 2, is given by
$P(X \le 2)$ = P(X=0 or 1 or 2) = p(0) + p(1) + p(2)
=……………………………………………….= …………………= 0.6765
(ii) The probability that the box will not meet the guaranteed quality =
probability that the number of defective pins is more than two, given
by
$P(X > 2) = 1 - P(X \le 2)$ = …………. = 1 - 0.6765 = 0.3235

Illustration 9: Fit a Poisson distribution to the following data:

No. of deaths:   0     1    2   3   4
Frequencies:     120   58   5   2   1

Solution: To fit the Poisson distribution, we first have to compute
the average number of deaths from the given distribution as
$m = \frac{\sum fx}{N} = \frac{0 \times 120 + 1 \times 58 + 2 \times 5 + 3 \times 2 + 4 \times 1}{120 + 58 + 5 + 2 + 1} = \frac{78}{186} = 0.42$
Now the expected frequencies for 0, 1, 2, 3 and 4 deaths are computed
as
$f(r) = N\,p(r) = N\,\frac{e^{-m} m^{r}}{r!}$, r = 0, 1, 2, 3, 4
$f(0) = 186\,e^{-0.42} = 122.21 \approx 122$
$f(1) = 186\,\frac{e^{-0.42}\,(0.42)^{1}}{1!} = 51.33 \approx 51$
$f(2) = 186\,\frac{e^{-0.42}\,(0.42)^{2}}{2!} = 10.78 \approx 11$
$f(3) = 186\,\frac{e^{-0.42}\,(0.42)^{3}}{3!} = 1.51 \approx 2$
$f(4) = 186\,\frac{e^{-0.42}\,(0.42)^{4}}{4!} = 0.16 \approx 0$
Hence the expected frequencies for 0, 1, 2, 3 and 4 deaths are 122, 51, 11, 2 and 0
respectively.

Illustration 9: Assuming that the typing mistakes per page
committed by a typist follow a Poisson distribution, find the
expected frequencies for the following distribution of typing
mistakes:
Mistakes per page:   0    1    2    3    4    5
No. of pages:        40   30   20   15   10   5
(Given $e^{-1.5} = 0.22313$)
Solution: Self exercise (Do as above)

Illustration 10: The mean life of tubes supplied by a company is
1500 hrs. with a standard deviation of 500 hrs. Assuming the data
to be normally distributed, what percentage of the tubes are
expected to have life: (i) less than 1000 hrs. (ii) more than 2000
hrs. (iii) between 1200 and 1800 hrs. (iv) between 1000 and 1400
hrs. (v) between 1600 and 2500 hrs. (vi) more than 1200 hrs. (vii)
less than 1600 hrs.
Solution: Let X be the random variable of the life of tubes supplied by
the company. Assuming that the lives of the tubes are normally
distributed, we have
$\mu = 1500$ hrs., $\sigma = 500$ hrs.
The standard normal variate is $Z = \frac{X - \mu}{\sigma}$.
(i) When X = 1000, $Z = \frac{1000 - 1500}{500} = -1$
The probability that a tube has a life of less than 1000 hrs. is given
by
$P(X < 1000) = P(Z < -1) = P(-\infty < Z < 0) - P(-1 < Z < 0)$
$= P(0 < Z < \infty) - P(0 < Z < 1) = 0.5 - 0.3413 = 0.1587$
Hence, the percentage of tubes with expected life less than
1000 hrs. = 100 × 0.1587 = 15.87%
(ii) When X = 2000 hrs., $Z = \frac{X - \mu}{\sigma} = \frac{2000 - 1500}{500} = 1$
The probability that a tube has a life of more than 2000 hrs. is
given by
$P(X > 2000) = P(Z > 1) = P(0 < Z < \infty) - P(0 < Z < 1)$
= 0.5 - 0.3413 = 0.1587
Hence, the percentage of tubes with expected life of more
than 2000 hrs. = 100 × 0.1587 = 15.87%
(iii) When $X_1 = 1200$ hrs., $Z_1 = \frac{1200 - 1500}{500} = -0.6$
When $X_2 = 1800$ hrs., $Z_2 = \frac{1800 - 1500}{500} = 0.6$
The probability that a tube has a life between 1200 and 1800
hrs. is given by
$P(1200 < X < 1800) = P(-0.6 < Z < 0.6) = P(-0.6 < Z < 0) + P(0 < Z < 0.6)$
$= 2\,P(0 < Z < 0.6) = 2 \times 0.2257 = 0.4514$
Hence, the percentage of tubes with expected life between
1200 and 1800 hrs. = 100 × 0.4514 = 45.14%
(iv) When $X_1 = 1000$ hrs., $Z_1 = \frac{1000 - 1500}{500} = -1$
When $X_2 = 1400$ hrs., $Z_2 = \frac{1400 - 1500}{500} = -0.2$
Now the probability that a tube has a life between 1000 and
1400 hrs. is given by
$P(1000 < X < 1400) = P(-1 < Z < -0.2) = P(-1 < Z < 0) - P(-0.2 < Z < 0)$
= 0.3413 - 0.0793 = 0.2620
Hence, the percentage of tubes with expected life between
1000 and 1400 hrs. = 100 × 0.2620 = 26.2%
(v) When $X_1 = 1600$ hrs., $Z_1 = \frac{1600 - 1500}{500} = 0.2$
When $X_2 = 2500$ hrs., $Z_2 = \frac{2500 - 1500}{500} = 2$
Now the probability that a tube has a life between 1600 and
2500 hrs. is given by
$P(1600 < X < 2500) = P(0.2 < Z < 2) = P(0 < Z < 2) - P(0 < Z < 0.2)$
= 0.4772 - 0.0793 = 0.3979
Hence, the percentage of tubes with expected life between
1600 and 2500 hrs. = 100 × 0.3979 = 39.79%
(vi) When X = 1200, $Z = \frac{1200 - 1500}{500} = -0.6$
Now, the probability that a tube has a life of more than 1200 hrs.
is given by
$P(X > 1200) = P(Z > -0.6) = P(-0.6 < Z < 0) + P(0 < Z < \infty)$
$= P(0 < Z < 0.6) + P(0 < Z < \infty) = 0.2257 + 0.5 = 0.7257$
Hence, the percentage of tubes with expected life more than
1200 hrs. = 100 × 0.7257 = 72.57%
(vii) When X = 1600, $Z = \frac{1600 - 1500}{500} = 0.2$
Now, the probability that a tube has a life of less than 1600 hrs.
is given by
$P(X < 1600) = P(Z < 0.2) = P(-\infty < Z < 0) + P(0 < Z < 0.2)$
= 0.5 + 0.0793 = 0.5793
Hence, the percentage of tubes with expected life less than
1600 hrs. = 100 × 0.5793 = 57.93%
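All seven parts reduce to differences of the cumulative distribution function. The sketch below uses `statistics.NormalDist` and a small helper (`pct_between` is an illustrative name) so each percentage can be checked; agreement with the table-based answers is to about two decimal places.

```python
from statistics import NormalDist

life = NormalDist(mu=1500, sigma=500)   # tube life in hours

def pct_between(lo, hi):
    """Percentage of tubes with life between lo and hi hours."""
    return 100 * (life.cdf(hi) - life.cdf(lo))

results = {
    "(i) less than 1000": 100 * life.cdf(1000),
    "(ii) more than 2000": 100 * (1 - life.cdf(2000)),
    "(iii) 1200 to 1800": pct_between(1200, 1800),
    "(iv) 1000 to 1400": pct_between(1000, 1400),
    "(v) 1600 to 2500": pct_between(1600, 2500),
    "(vi) more than 1200": 100 * (1 - life.cdf(1200)),
    "(vii) less than 1600": 100 * life.cdf(1600),
}
for part, pct in results.items():
    print(part, round(pct, 2))
```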
Illustration 11: The sales tax officer has reported that the average
sales of the 500 businesses that he has to deal with during a year
amount to Rs.36,000 with a standard deviation of Rs.10,000.
Assuming that the sales in these businesses are normally
distributed, find:
(i) the number of businesses, the sales of which are over
Rs.40,000.
(ii) the percentage of business, the sales of which are likely to
range between Rs.30,000 and Rs.40,000
(iii) the probability that the sales of a business selected at random
will be over Rs.30,000.
(Same as above, do for your practice yourself)
Illustration 12: Time taken by a construction company to construct
a flyover is a normal variate with mean 400 labour days and
standard deviation of 100 labour days. If the company promises to
construct the flyover in 450 days or less and agrees to pay a
penalty of Rs.10,000 for each labour day spent in excess of 450,
what is the probability that (i) the company pays a penalty of at
least Rs.200,000? (ii) the company takes at most 500 days to
complete the flyover?
Solution: Let the random variable X denote the time (in labour days)
to construct the flyover. Then X is normally distributed with mean $\mu$
= 400 days and standard deviation $\sigma$ = 100 days.
(i) Since the penalty for each labour day of delay (excess over 450 days)
is Rs.10,000, paying a penalty of at least Rs.200,000 means the company
takes at least 200,000/10,000 = 20 days over and above the 450
promised days, i.e. at least 470 days, to complete the flyover.
When X = 470, $Z = \frac{470 - 400}{100} = 0.7$
Now, the probability that the construction of the flyover takes at least
470 days is given by
$P(X \ge 470) = P(Z \ge 0.7) = 0.5 - P(0 \le Z \le 0.7)$
= 0.5 - 0.2580 = 0.2420
(ii) When X = 500, $Z = \frac{500 - 400}{100} = 1$
Now, the probability that the construction of the flyover takes at most
500 days is given by
$P(X \le 500) = P(Z \le 1) = 0.5 + P(0 \le Z \le 1)$
= 0.5 + 0.3413 = 0.8413
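The key step in part (i) is translating a penalty amount into a number of labour days before any normal probability is computed. A minimal sketch of both parts, with illustrative variable names, is:

```python
from statistics import NormalDist

build_time = NormalDist(mu=400, sigma=100)   # labour days

promised_days = 450
penalty_per_day = 10_000      # Rs. per labour day beyond the promise
penalty_floor = 200_000       # Rs., the "at least" penalty in part (i)

# Days needed before the penalty reaches the floor: 450 + 20 = 470.
days_needed = promised_days + penalty_floor / penalty_per_day

p_penalty = 1 - build_time.cdf(days_needed)  # part (i): P(X >= 470)
p_within_500 = build_time.cdf(500)           # part (ii): P(X <= 500)

print(days_needed, round(p_penalty, 4), round(p_within_500, 4))
```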
Illustration 13: The mean inner diameter (in inches) of a sample of
2000 tubes produced by a machine is 0.502 and the standard
deviation is 0.005. The purpose for which these tubes are intended
allows a tolerance in the diameter of 0.496 to 0.508;
otherwise the tubes are considered to be defective. What
percentage of tubes produced by the machine is defective, if the
diameters are found to be normally distributed?
(Do yourself for practice).
Illustration 14: The daily earnings of 2,000 workers in a factory are
normally distributed with a mean of Rs.200 and a variance of
400. Estimate the lowest daily earnings of the 197 highest paid
workers and the highest daily earnings of the 197 lowest paid
workers.
Solution: Let the random variable X denote the daily earnings of
the 2,000 workers in the given factory. Then X is normally distributed
with mean $\mu$ = Rs.200 and variance $\sigma^2 = 400$, so $\sigma$ = Rs.20. The
standard normal variate Z is
$Z = \frac{X - 200}{20}$
(i) The proportion of the 197 highest paid workers is 197/2000 = 0.0985.
Let $x_1$ be the lowest daily earnings of the 197 highest paid workers.
We have to determine $x_1$ such that
$P(X > x_1) = 0.0985$
When $X = x_1$, $Z = \frac{x_1 - 200}{20} = z_1$ (say)
Then $P(Z > z_1) = 0.0985$ or $P(0 < Z < z_1) = 0.5 - 0.0985 = 0.4015$
or $z_1 = \frac{x_1 - 200}{20} = 1.29$ (from the normal table)
$\therefore x_1 = 200 + 20 \times 1.29$ = Rs.225.8
Hence the lowest daily earnings of the 197 highest paid workers are
Rs.225.8.
(ii) Similarly, the highest daily earnings of the 197 lowest paid
workers can be calculated by finding the value of Z which cuts off an
area of 0.0985 in the left tail. Thus,
$P(X < x_2) = 0.0985$ or $P(Z < z_2) = 0.0985$
or $P(z_2 < Z < 0) = 0.5 - 0.0985 = 0.4015$ or, by symmetry, $P(0 < Z < -z_2) = 0.4015$
so that $z_2 = \frac{x_2 - 200}{20} = -1.29$
$\therefore x_2 = 200 - 20 \times 1.29$ = Rs.174.2
Hence the highest daily earnings of the 197 lowest paid workers are Rs.174.2.
Illustration 15: The marks obtained by the students in an
examination are known to be normally distributed. If 10% of the
students got less than 40 marks, while 15% got over 80, what are
the mean and standard deviation of marks?
Solution: Let X be the random variable of marks obtained by the
students in an examination.
According to the statement, we have
P(X < 40) = 0.10 ……………(i)
P(X > 80) = 0.15 ……………(ii)
The standard normal variate is $Z = \frac{X - \mu}{\sigma}$.
When X = 40, $Z = \frac{40 - \mu}{\sigma} = -z_1$ …………….(iii)
(Less than 40 marks lies in the left tail of the curve)
When X = 80, $Z = \frac{80 - \mu}{\sigma} = z_2$ ………………(iv)
(More than 80 marks lies in the right tail of the curve)
Then, P(X < 40) = 0.10 or $P(Z < -z_1) = 0.10$ or $P(-\infty < Z < 0) - P(-z_1 < Z < 0) = 0.10$
or $0.5 - P(0 < Z < z_1) = 0.10$ or $P(0 < Z < z_1) = 0.40$
From the normal table, the value closest to 0.40 is 0.3997, at
Z = 1.28.
$\therefore z_1 = 1.28$
Again, from equation (iii), we get $\frac{40 - \mu}{\sigma} = -1.28$
or $\mu = 40 + 1.28\,\sigma$ ………………………………(v)
Again, P(X > 80) = 0.15 or $P(Z > z_2) = 0.15$ or $P(0 < Z < \infty) - P(0 < Z < z_2) = 0.15$
or $0.5 - P(0 < Z < z_2) = 0.15$ or $P(0 < Z < z_2) = 0.35$
From the normal table, the value closest to 0.35 is 0.3508, at Z
= 1.04.
$\therefore z_2 = 1.04$
From (iv), we get $\frac{80 - \mu}{\sigma} = 1.04$
or $\mu = 80 - 1.04\,\sigma$ ………………………………(vi)
Subtracting equation (vi) from (v), we have
$0 = -40 + 2.32\,\sigma$ or $2.32\,\sigma = 40$ or $\sigma = 17.24$
Substituting the value of $\sigma$ in equation (v), we get $\mu = 40 + 1.28 \times
17.24 = 40 + 22.07 = 62.07$

Illustration 17: The marks of 500 candidates in an examination are
normally distributed with a mean of 45 and standard deviation of
20 marks.
(i) Given that the pass mark is 40, estimate the number of candidates
who passed the examination.
(ii) If 5% of the candidates obtained a distinction by scoring x marks
or more, estimate the value of x.
(iii) estimate the interquartile range of the distribution.
(iv) If 400 candidates are to be passed, what should be the lowest mark
for passing?
