
Topic 4B Common Continuous Distributions

The document discusses common continuous probability distributions, focusing on the uniform and exponential distributions. It defines the uniform distribution, provides examples of probability calculations, and explains key concepts such as expectation, variance, median, and cumulative distribution functions. Additionally, it covers the exponential distribution, including its properties and examples related to real-world scenarios.

Uploaded by

Bageya Alexis

Common Continuous Probability Distributions

6.1 The Uniform/Rectangular distribution


Definition 1 A continuous random variable $X$ is said to be uniformly distributed (or to have a uniform distribution) over an interval $(a, b)$ if its probability density function is given by:
\[
f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b, \\ 0, & \text{otherwise.} \end{cases}
\]

It can be shown that $f(x)$ is indeed a pdf as follows:
\[
\int_{-\infty}^{\infty} f(x)\,dx = \int_a^b \frac{1}{b-a}\,dx = \frac{1}{b-a}\,[x]_a^b = \frac{b-a}{b-a} = 1.
\]

Example 1 If X is uniformly distributed over (0,10), calculate:

(a) P(X < 3),

(b) P(X > 6),

(c) P(3 < X < 8).

Solution

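The pdf of $X$ is given as:
\[
f(x) = \begin{cases} \dfrac{1}{10}, & 0 < x < 10, \\ 0, & \text{otherwise.} \end{cases}
\]

(a) $P(X < 3) = \int_0^3 \frac{1}{10}\,dx = \frac{3}{10}$.

(b) $P(X > 6) = \int_6^{10} \frac{1}{10}\,dx = \frac{2}{5}$.

(c) $P(3 < X < 8) = \int_3^8 \frac{1}{10}\,dx = \frac{1}{2}$.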
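Because the uniform density is constant, each probability in Example 1 is the length of the event interval divided by $b - a$. The sketch below checks the three answers with exact fractions. The helper name `uniform_prob` is an illustrative choice, not part of the notes, and it assumes integer endpoints so that `Fraction` can be used.

```python
from fractions import Fraction


def uniform_prob(lo, hi, a, b):
    """Return P(lo < X < hi) for X uniform on (a, b), as an exact fraction."""
    left = max(lo, a)     # clip the event to the support (a, b)
    right = min(hi, b)
    if right <= left:
        return Fraction(0)
    return Fraction(right - left, b - a)


# Example 1: X uniform on (0, 10)
p_a = uniform_prob(0, 3, 0, 10)    # P(X < 3)
p_b = uniform_prob(6, 10, 0, 10)   # P(X > 6)
p_c = uniform_prob(3, 8, 0, 10)    # P(3 < X < 8)
print(p_a, p_b, p_c)  # 3/10 2/5 1/2
```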

Elements of Probability & Statistics – MTH1202

6.1.1 Expectation and Variance of a Uniform Distribution


The expectation or mean of a uniform distribution is given by:
\[
E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_a^b \frac{x}{b-a}\,dx = \frac{1}{b-a}\left[\frac{x^2}{2}\right]_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{b+a}{2}.
\]
The variance is given by:
\[
\mathrm{Var}(X) = \frac{(b-a)^2}{12}.
\]
Proof: Start by computing $E(X^2)$:
\[
E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_a^b \frac{x^2}{b-a}\,dx = \frac{1}{b-a}\cdot\frac{1}{3}\,(b^3 - a^3) = \frac{1}{3}\,(b^2 + ab + a^2).
\]
Hence,
\[
\begin{aligned}
\mathrm{Var}(X) = E(X^2) - [E(X)]^2 &= \frac{1}{3}\,(b^2 + ab + a^2) - \frac{1}{4}\,(b+a)^2 \\
&= \frac{1}{12}\,(4b^2 + 4ab + 4a^2 - 3b^2 - 6ab - 3a^2) \\
&= \frac{1}{12}\,(a^2 - 2ab + b^2) = \frac{(b-a)^2}{12}.
\end{aligned}
\]

6.1.2 The Median


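Let $m$ be the median, then,
\[
\int_{-\infty}^{m} f(x)\,dx = \frac{1}{2} \;\Rightarrow\; \int_a^m \frac{1}{b-a}\,dx = \frac{1}{2} \;\Rightarrow\; \frac{1}{b-a}\,[x]_a^m = \frac{1}{2} \;\Rightarrow\; \frac{m-a}{b-a} = \frac{1}{2} \;\Rightarrow\; m = \frac{b+a}{2}.
\]
Thus, the median equals the mean. This is expected because the uniform density is symmetric about $(a+b)/2$: whenever a distribution is symmetric and its mean exists, the mean and median coincide.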

6.1.3 The Cumulative Distribution Function


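The cumulative distribution function (cdf), $F(x)$, of a uniform distribution is given by:
\[
F(x) = \begin{cases} 0, & x < a, \\ \dfrac{x-a}{b-a}, & a \le x < b, \\ 1, & x \ge b. \end{cases}
\]
You can show this by using the definition of $F(x)$. For $a \le x < b$,
\[
F(x) = \int_{-\infty}^{x} f(u)\,du = \int_a^x \frac{1}{b-a}\,du = \frac{x-a}{b-a},
\]
while $F(x) = 0$ for $x < a$ (no probability has accumulated yet) and $F(x) = 1$ for $x \ge b$ (all of it has), which gives the piecewise form above.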

© B.K. Nannyonga, H.W. Kayondo, N. Muyinda, & B.N. Kirenga
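The piecewise cdf of the uniform distribution, and its inverse on $[a, b]$, can be sketched in a few lines. The function names `uniform_cdf` and `uniform_quantile` are illustrative assumptions; the quantile simply solves $(x - a)/(b - a) = p$ for $x$, which is the kind of calculation needed when a question asks for the value exceeded by a given percentage.

```python
def uniform_cdf(x, a, b):
    """F(x) for X uniform on (a, b)."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def uniform_quantile(p, a, b):
    """Return the x in [a, b] with F(x) = p, for 0 <= p <= 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return a + p * (b - a)


a, b = 0.0, 10.0
print(uniform_cdf(3.0, a, b))        # P(X < 3) for X uniform on (0, 10)
print(uniform_quantile(0.5, a, b))   # the median, (a + b) / 2
```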

Example 2 Let X be a uniform continuous random variable that denotes the current
measured in a thin copper wire in milliamperes. Assume that the range of X is [0, 20]
mA.

(a) Write down the probability density function of X.

(b) What is the probability that a measurement of current is between 5 and 10 milliamperes?

(c) Find the mean, variance and standard deviation of X.

Solution

(a) $f(x) = \frac{1}{20} = 0.05$ for $0 \le x \le 20$.

(b) The requested probability is shown in Figure 6.1.
\[
P(5 < X < 10) = \int_5^{10} f(x)\,dx = \int_5^{10} 0.05\,dx = [0.05x]_5^{10} = 5(0.05) = 0.25.
\]

Figure 6.1: The shaded area is the needed probability $\int_5^{10} 0.05\,dx$.

(c) Substituting into the formulae for mean and variance seen above with $a = 0$ and $b = 20$, we obtain $E(X) = 10$ mA and $\mathrm{Var}(X) = 20^2/12 = 33.33$ mA$^2$. Consequently, the standard deviation is $\sqrt{33.33} = 5.77$ mA.

Note: We could have used definitions and pdf to obtain the mean, variance and
standard deviation.
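As a rough illustration of the note above, the mean and variance in Example 2 can also be obtained directly from the definitions $E(X) = \int x f(x)\,dx$ and $E(X^2) = \int x^2 f(x)\,dx$. The sketch below uses a simple midpoint rule; the helper `midpoint_integral` and the step count are assumptions made for illustration, not part of the notes.

```python
def midpoint_integral(g, lo, hi, n=10_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


a, b = 0.0, 20.0                      # current in mA, uniform on [0, 20]


def pdf(x):
    return 1.0 / (b - a)              # constant density 0.05 on the support


mean = midpoint_integral(lambda x: x * pdf(x), a, b)
second_moment = midpoint_integral(lambda x: x * x * pdf(x), a, b)
variance = second_moment - mean ** 2
sd = variance ** 0.5

print(round(mean, 2), round(variance, 2), round(sd, 2))
```

The three printed values should agree, up to rounding, with $E(X) = 10$, $\mathrm{Var}(X) = 33.33$ and a standard deviation of $5.77$.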

Exercise
1. Suppose X has a continuous uniform distribution over the interval
[2.5, 6.5].

(a) Determine the mean, variance and standard deviation of X.


(b) What is P(X < 2.5)?

2. Suppose Y has a continuous uniform distribution over the interval [−2, 2].

(a) Determine the mean, variance and standard deviation of Y.


(b) Determine the value of k such that P(−k < Y < k) = 0.9.

3. The net weight in kg of babies delivered at Kawempe referral hospital is uniform for 2.2 < x < 4.6 kg.

(a) Determine the mean, variance and standard deviation of the weight
for the babies.
(b) Determine the cumulative distribution function of the weight of the
babies.
(c) Determine P(X < 3.2).

4. The age of an undergraduate student offering a Bachelor’s degree at any University in Uganda is uniformly distributed between 18.5 and 24.4 years.

(a) Determine the cumulative distribution function for the age of an


undergraduate student.
(b) Determine the proportion of University students whose age exceeds
22 years.
(c) Which age is exceeded by 80% of the University students offering a Bachelor’s degree?
(d) Determine the mean and variance of age for an undergraduate
student.

5. Suppose the time it takes for a bank teller to serve a customer is uniformly
distributed between 1.5 and 2.5 minutes.

(a) What are the mean and variance of the time it takes a bank teller to serve a customer?
(b) What is the probability that it takes less than 2 minutes for a customer to be served?
(c) Of the next 50 customers, how many are expected to be served within 2.2 minutes each?
(d) Determine the cumulative distribution function of the time it takes
for a bank teller to serve a customer.

6. The time it takes a hematology cell counter to complete a test on a blood sample is uniformly distributed on [50, 75] seconds.

(a) Write down the pdf for the distribution.
(b) What percentage of tests require more than 70 seconds to complete?
(c) What percentage of tests require less than one minute to complete?
(d) Determine the mean and variance of the time to complete a test on a sample.

6.2 The Exponential Distribution


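A continuous random variable $X$ whose probability density function is given by:
\[
f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0, \\ 0, & x < 0, \end{cases}
\]
for some $\lambda > 0$ is said to be an exponential random variable, or is said to be exponentially distributed with parameter $\lambda$.
$f(x)$ is indeed a pdf since,
\[
\int_0^{\infty} f(x)\,dx = \int_0^{\infty} \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_0^{\infty} = 1.
\]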

If $X$ is an exponentially distributed random variable, then:

(1) $P(a < X < b) = \int_a^b f(x)\,dx$.

(2) $P(X < a) = \int_0^a \lambda e^{-\lambda x}\,dx = 1 - e^{-\lambda a}$.

(3) $P(X > a) = 1 - P(X < a) = 1 - (1 - e^{-\lambda a}) = e^{-\lambda a}$.

(4) For $a, b > 0$,
\[
P(X > a + b \mid X > a) = \frac{P[(X > a + b) \cap (X > a)]}{P(X > a)} = \frac{P(X > a + b)}{P(X > a)} = \frac{e^{-\lambda (a+b)}}{e^{-\lambda a}} = e^{-\lambda b} = P(X > b).
\]
Thus:
(a) $P(X > 75 \mid X > 30) = P(X > 45) = e^{-45\lambda}$, and
(b) $P(X > 1200 \mid X > 850) = P(X > 350) = e^{-350\lambda}$.
(c) $P(X > q \mid X > p) = P(X > q - p)$, provided $q > p$.
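Property (4) is usually called the memoryless property of the exponential distribution. A small numerical check, computed from the definition of conditional probability, is sketched below. The rate `lam = 0.02` is an arbitrary value chosen only for illustration, and the function names are not from the notes.

```python
import math


def exp_survival(x, lam):
    """P(X > x) for an exponential random variable with rate lam."""
    return math.exp(-lam * x) if x > 0 else 1.0


def conditional_survival(a, b, lam):
    """P(X > a + b | X > a), from the definition of conditional probability."""
    return exp_survival(a + b, lam) / exp_survival(a, lam)


lam = 0.02                                  # hypothetical rate, for illustration
lhs = conditional_survival(30, 45, lam)     # P(X > 75 | X > 30)
rhs = exp_survival(45, lam)                 # P(X > 45)
print(math.isclose(lhs, rhs))               # True
```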
Example 3 Suppose that the length of a phone call in minutes is an exponential random variable with parameter $\lambda = \frac{1}{10}$. If someone arrives at a public telephone booth immediately ahead of you, find the probability that you will have to wait:
(a) more than 10 minutes,
(b) between 10 and 20 minutes.
Solution
Let $X$ be the length of the call made by the person in the booth. Then,

(a) $P(X > 10) = \int_{10}^{\infty} \frac{1}{10} e^{-\frac{1}{10}x}\,dx = e^{-1} = 0.368$.

(b) $P(10 < X < 20) = \int_{10}^{20} \frac{1}{10} e^{-\frac{1}{10}x}\,dx = e^{-1} - e^{-2} = 0.233$.

6.2.1 Expectation and Variance of Exponential Distribution


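If a random variable $X$ is exponentially distributed with parameter $\lambda$ and pdf $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, $\lambda > 0$, then the mean and the variance of $X$ are given by:
\[
E(X) = \frac{1}{\lambda} \quad \text{and} \quad \mathrm{Var}(X) = \frac{1}{\lambda^2}.
\]
Proof:
\[
E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^{\infty} \lambda x e^{-\lambda x}\,dx = \left[-x e^{-\lambda x}\right]_0^{\infty} + \int_0^{\infty} e^{-\lambda x}\,dx = \left[-\frac{e^{-\lambda x}}{\lambda}\right]_0^{\infty} = \frac{1}{\lambda}.
\]
To find $\mathrm{Var}(X)$, we first find $E(X^2)$,
\[
E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_0^{\infty} \lambda x^2 e^{-\lambda x}\,dx.
\]
Using integration by parts twice gives:
\[
E(X^2) = \left[-x^2 e^{-\lambda x}\right]_0^{\infty} - \frac{2}{\lambda}\left[x e^{-\lambda x}\right]_0^{\infty} + \frac{2}{\lambda}\int_0^{\infty} e^{-\lambda x}\,dx = -\frac{2}{\lambda^2}\left[e^{-\lambda x}\right]_0^{\infty} = \frac{2}{\lambda^2}.
\]
Hence
\[
\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}.
\]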

6.2.2 The Cumulative distribution function


The cumulative distribution function, $F(x)$, of an exponentially distributed random variable $X$ is given by $F(x) = 1 - e^{-\lambda x}$, $x \ge 0$. This follows from the definition:
\[
F(x) = \int_{-\infty}^{x} f(u)\,du = \int_0^x \lambda e^{-\lambda u}\,du = \left[-e^{-\lambda u}\right]_0^x = 1 - e^{-\lambda x}, \quad x \ge 0.
\]

Example 4 The lifetime of a particular type of bulb has an exponential distribution


with mean lifetime of 1000 hours.
(a) Find the probability that a bulb is still working after 1300 hours.
(b) Given that it was still working after 1300 hours, find the probability that it is still working after 1500 hours.
(c) Find the standard deviation of the lifetime of this type of light bulb.

Solution

Let $X$ be a random variable representing the lifetime of the light bulb in hours. Then $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$. Since $E(X) = \frac{1}{\lambda}$ and we are given that $E(X) = 1000$, then $\frac{1}{\lambda} = 1000 \Rightarrow \lambda = 0.001$ and as such, $f(x) = 0.001 e^{-0.001x}$.

(a) $P(X > x) = e^{-\lambda x} \Rightarrow P(X > 1300) = e^{-0.001 \times 1300} = e^{-1.3} = 0.273$.

(b) $P(X > 1500 \mid X > 1300) = P(X > 200) = e^{-0.001 \times 200} = e^{-0.2} = 0.819$.

(c) Standard deviation $= \sqrt{\frac{1}{\lambda^2}} = \frac{1}{\lambda} = 1000$ hours.
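The three parts of Example 4 can be reproduced with a few lines of code. This is only a sketch of the arithmetic; part (b) is computed from the ratio of survival probabilities rather than by quoting the memoryless property, so the two routes can be compared. Variable names are illustrative.

```python
import math

mean_life = 1000.0            # hours, given in Example 4
lam = 1.0 / mean_life         # rate parameter, since E(X) = 1 / lambda

p_a = math.exp(-lam * 1300)                              # P(X > 1300)
p_b = math.exp(-lam * 1500) / math.exp(-lam * 1300)      # P(X > 1500 | X > 1300)
sd = math.sqrt(1.0 / lam ** 2)                           # standard deviation

print(round(p_a, 3), round(p_b, 3), round(sd, 1))
```

The conditional probability in (b) agrees with $e^{-0.2}$, the value obtained from $P(X > 200)$.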

6.2.3 The Median


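From the definition of the median, if $m$ denotes the median of a random variable $X$, then $\int_0^m f(x)\,dx = \frac{1}{2}$. This implies that for an exponential distribution,
\[
\int_0^m \lambda e^{-\lambda x}\,dx = \frac{1}{2} \;\Rightarrow\; \left[-e^{-\lambda x}\right]_0^m = \frac{1}{2} \;\Rightarrow\; -e^{-\lambda m} + 1 = \frac{1}{2} \;\Rightarrow\; e^{-\lambda m} = \frac{1}{2} \;\Rightarrow\; m = \frac{\ln 2}{\lambda}.
\]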

Example 5 The length of time for a person to be served at a cafeteria is a random variable
having an exponential distribution with mean 4 minutes. What is the probability that a
person is served in less than 3 minutes on at least 4 of the next 6 days?

Solution

First obtain the probability of being served in less than 3 minutes on any day. Let $X$ represent the length of time it takes to serve a person. With mean $= 4$ we have $\lambda = \frac{1}{4}$, so that
\[
P(X < 3) = \frac{1}{4}\int_0^3 e^{-\frac{1}{4}x}\,dx = 1 - e^{-0.75} = 0.5276.
\]
Next we treat each day as a Bernoulli trial in which the event "served in less than 3 minutes" is a success with probability $p = 0.5276$, and $q = 1 - p$. Let $Y$ be the number of successes in 6 days. Hence,
\[
P(\text{at least 4 successes in 6 days}) = P(Y \ge 4) = \sum_{y=4}^{6} \binom{6}{y} p^y q^{6-y} = 0.3968.
\]
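Example 5 combines an exponential probability with a binomial sum. The sketch below carries out both steps; the helper `binom_pmf` is an illustrative name built on `math.comb`. Because the code keeps $p$ unrounded, the final figure can differ from the tabulated answer in the fourth decimal place.

```python
import math


def binom_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


lam = 1.0 / 4.0                       # mean service time is 4 minutes
p = 1.0 - math.exp(-lam * 3)          # P(served in less than 3 minutes)

# P(at least 4 successes in 6 days)
prob = sum(binom_pmf(k, 6, p) for k in range(4, 7))

print(round(p, 4), round(prob, 3))
```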

Example 6 Let X be an exponential random variable with mean 6. Find:

(a) P(X < 4|X < 2),

(b) The median of X.

6.2.4 Link between Exponential and Poisson


The "waiting times" between successive events in a Poisson process can be shown to follow an exponential distribution with the same rate parameter.

Example 7 On a busy road, accidents occur at random at the rate of 3 accidents per
day. Find the probability that after a particular accident has occurred, at least one day
goes without another accident.

Solution
Let $X$ represent the "number of accidents in a day"; then $X$ has a Poisson distribution with parameter $\lambda = 3$. If $T$ represents "the time in days between successive accidents", then $T$ is exponentially distributed with parameter $\lambda = 3$ and $f(t) = 3e^{-3t}$. Thus,
\[
P(\text{at least one day without accident}) = P(T > 1) = e^{-3} = 0.050.
\]
So there is only a 5% chance that a whole day will pass without another accident.
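One way to see the link in Example 7 is that "waiting more than one day for the next accident" is the same event as "zero accidents in the next day". The sketch below computes both probabilities, one from the exponential survival function and one from the Poisson pmf, and confirms that they coincide. The function names are illustrative.

```python
import math

rate = 3.0   # accidents per day, from Example 7


def poisson_pmf(k, mu):
    """P(N = k) for N ~ Poisson(mu)."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def exp_survival(t, lam):
    """P(T > t) for T exponential with rate lam."""
    return math.exp(-lam * t)


p_wait = exp_survival(1.0, rate)            # P(T > 1 day)
p_no_accident = poisson_pmf(0, rate * 1.0)  # P(no accidents in 1 day)

print(round(p_wait, 3), math.isclose(p_wait, p_no_accident))
```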

6.3 The Gamma Distribution


The Gamma function
The function $\Gamma$ whose values are defined by:
\[
\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx, \quad \alpha > 0,
\]
is called a gamma function.


(i) Integrating by parts with $u = x^{\alpha-1}$ and $dv = e^{-x}\,dx$, we obtain:
\[
\Gamma(\alpha) = \left[-e^{-x} x^{\alpha-1}\right]_0^{\infty} + \int_0^{\infty} e^{-x} (\alpha-1) x^{\alpha-2}\,dx = (\alpha-1)\int_0^{\infty} x^{\alpha-2} e^{-x}\,dx
\]
for $\alpha > 1$. This yields $\Gamma(\alpha) = (\alpha-1)\Gamma(\alpha-1)$.

(ii) From $\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx$ for $\alpha > 0$, it can be seen that $\Gamma(1) = \int_0^{\infty} e^{-x}\,dx$.

(iii) Repeated application of the recursion formula $\Gamma(\alpha) = (\alpha-1)\Gamma(\alpha-1)$ gives:
\[
\Gamma(\alpha) = (\alpha-1)(\alpha-2)\Gamma(\alpha-2) = (\alpha-1)(\alpha-2)(\alpha-3)\Gamma(\alpha-3)
\]
and so forth. Note that when $\alpha = n$, where $n$ is an integer with $n \ge 2$,
\[
\Gamma(n) = (n-1)(n-2)\cdots(1)\,\Gamma(1).
\]
But $\Gamma(1) = \int_0^{\infty} e^{-x}\,dx = 1$, hence $\Gamma(n) = (n-1)!$.

(iv) One important property of the gamma function is that $\Gamma(\frac{1}{2}) = \sqrt{\pi}$. The proof is left to the reader as an exercise.
Example 8 Evaluate $\int_0^{\infty} x^4 e^{-x}\,dx$.
Solution
This is a gamma function with $\alpha - 1 = 4 \Rightarrow \alpha = 5$. Therefore,
\[
\int_0^{\infty} x^4 e^{-x}\,dx = \Gamma(5) = 4! = 24.
\]
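The properties of the gamma function listed above can be spot-checked with Python's standard `math.gamma`. This is a numerical sanity check rather than a proof; the value `alpha = 4.7` is an arbitrary non-integer chosen for illustration.

```python
import math

# Example 8: Gamma(5) = 4! = 24
gamma_5 = math.gamma(5)

# Recursion: Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1), for alpha > 1
alpha = 4.7                      # arbitrary illustrative value
lhs = math.gamma(alpha)
rhs = (alpha - 1) * math.gamma(alpha - 1)

# Gamma(1/2) equals the square root of pi
half = math.gamma(0.5)

print(gamma_5, math.factorial(4))
print(math.isclose(lhs, rhs), math.isclose(half, math.sqrt(math.pi)))
```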

The Gamma distribution

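A continuous random variable $X$ is said to have a gamma distribution with parameters $\alpha > 0$ and $\beta > 0$ if its pdf is given by:
\[
f(x) = \begin{cases} \dfrac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}, & x > 0, \\ 0, & \text{otherwise,} \end{cases}
\]
and we write $X \sim \text{Gamma}(\alpha, \beta)$.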


The special gamma distribution for which α = 1 is called an exponential
distribution with parameter β.
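We can show that the function $f(x)$ is indeed a pdf by showing that:
\[
\int_0^{\infty} f(x)\,dx = 1.
\]
Now,
\[
\int_0^{\infty} f(x)\,dx = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_0^{\infty} x^{\alpha-1} e^{-\beta x}\,dx,
\]
and letting $u = \beta x \Rightarrow du = \beta\,dx$, we obtain:
\[
\begin{aligned}
\int_0^{\infty} f(x)\,dx &= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_0^{\infty} \left(\frac{u}{\beta}\right)^{\alpha-1} e^{-u}\,\frac{du}{\beta} \\
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_0^{\infty} \frac{u^{\alpha-1}}{\beta^{\alpha}}\, e^{-u}\,du \\
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \cdot \frac{1}{\beta^{\alpha}} \int_0^{\infty} u^{\alpha-1} e^{-u}\,du \\
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \cdot \frac{1}{\beta^{\alpha}} \cdot \Gamma(\alpha) = 1.
\end{aligned}
\]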

The Mean and Variance of Gamma distribution

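Suppose that $X \sim \text{Gamma}(\alpha, \beta)$, then,
\[
E(X) = \frac{\alpha}{\beta} \quad \text{and} \quad \mathrm{Var}(X) = \frac{\alpha}{\beta^2}.
\]
Proof: Using the substitution $u = \beta x$,
\[
\begin{aligned}
E(X) = \int_0^{\infty} x f(x)\,dx &= \int_0^{\infty} \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha} e^{-\beta x}\,dx \\
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_0^{\infty} x^{\alpha} e^{-\beta x}\,dx \\
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_0^{\infty} \left(\frac{u}{\beta}\right)^{\alpha} e^{-u}\,\frac{1}{\beta}\,du \\
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \left(\frac{1}{\beta}\right)^{\alpha+1} \int_0^{\infty} u^{\alpha} e^{-u}\,du \\
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \left(\frac{1}{\beta}\right)^{\alpha+1} \int_0^{\infty} u^{(\alpha+1)-1} e^{-u}\,du \\
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \left(\frac{1}{\beta}\right)^{\alpha+1} \Gamma(\alpha+1) \\
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)\,\beta^{\alpha+1}} \cdot \alpha\,\Gamma(\alpha) = \frac{\alpha}{\beta}.
\end{aligned}
\]
Similarly, $E(X^2) = \dfrac{\alpha(\alpha+1)}{\beta^2}$ (check this as an exercise). So,
\[
\mathrm{Var}(X) = \frac{\alpha(\alpha+1)}{\beta^2} - \frac{\alpha^2}{\beta^2} = \frac{\alpha}{\beta^2}.
\]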
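The formulas $E(X) = \alpha/\beta$ and $\mathrm{Var}(X) = \alpha/\beta^2$ can be checked numerically for a particular case. The sketch below integrates the gamma pdf with a midpoint rule for the illustrative choice $\alpha = 3$, $\beta = 2$. The truncation point `upper = 40.0`, the step count, and the helper names are assumptions made for this example; the tail beyond the truncation point is negligible for these parameters.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma(alpha, beta) density with rate parameter beta."""
    if x <= 0:
        return 0.0
    return beta ** alpha / math.gamma(alpha) * x ** (alpha - 1) * math.exp(-beta * x)


def midpoint_integral(g, lo, hi, n=20_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


alpha, beta = 3.0, 2.0     # illustrative parameters
upper = 40.0               # truncation point standing in for infinity

total = midpoint_integral(lambda x: gamma_pdf(x, alpha, beta), 0.0, upper)
mean = midpoint_integral(lambda x: x * gamma_pdf(x, alpha, beta), 0.0, upper)
second = midpoint_integral(lambda x: x * x * gamma_pdf(x, alpha, beta), 0.0, upper)
variance = second - mean ** 2

print(round(total, 4), round(mean, 4), round(variance, 4))
```

For these parameters the theory gives total area $1$, mean $3/2$ and variance $3/4$.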

Recall that:
(i) If $\alpha = 1$, then $f(x) = \beta e^{-\beta x}$ for $\beta > 0$, $x > 0$; this is equivalent to an $\exp(\beta)$ distribution.
(ii) If $\beta = \frac{1}{2}$ and $\alpha = \frac{n}{2}$, then the gamma distribution is called the chi-square ($\chi^2$) distribution with $n$ degrees of freedom. The mean of a chi-square is $n$ and the variance is $2n$. The pdf of a chi-square can be written as:
\[
f(x) = \begin{cases} \dfrac{1}{2^{n/2}\,\Gamma(\frac{n}{2})}\, x^{\frac{n-2}{2}} e^{-\frac{x}{2}}, & x > 0, \\ 0, & \text{elsewhere.} \end{cases}
\]

The chi-square plays a very important role in sampling theory.

6.4 The Normal Distribution


The normal distribution is by far the most important continuous distribution in
the entire field of statistics. It is also referred to as the Gaussian distribution. By
the end of this topic, you should be able to:
• define a normal distribution.
• compute the mean µ and the variance σ² of the normal random variable.
• describe the standard normal variable Z.
• find probabilities from Normal tables.
Normal Random Variable


A random variable $X$ is a normal random variable, or $X$ is said to be normally distributed with parameters $\mu$ and $\sigma^2$, if the density of $X$ is given by:
\[
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad \text{for } -\infty < x < \infty.
\]
The distribution defined by the density function above is a normal distribution, and we say that $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$, written as $X \sim N(\mu, \sigma^2)$.

The Normal Curve


The graph of f (x) is bell shaped. It is important to note that:
(a) f (x) > 0 for all x.
(b) f (x) decreases (to zero) as the distance between x and µ increases (to
infinity).
(c) f (x) is symmetric about the mean µ for all x.
(d) the mean, median and mode are all equal to $\mu$.
(e) the maximum value of the curve given by $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$ occurs when $x = \mu$.
(f) there are points of inflexion at $x = \mu - \sigma$ and $x = \mu + \sigma$.
(g) for any fixed value of $\mu$, the curve $f(x)$ heightens and narrows as $\sigma^2$ decreases; in particular, given any $x > 0$, the area under the curve above the x-axis and between $\mu - x$ and $\mu + x$ increases as $\sigma^2$ decreases.

The graph of $f(x)$ is called a normal curve. Once $\mu$ and $\sigma^2$ are specified, the curve is completely determined, and it takes different positions and shapes for different values of $\mu$ and $\sigma^2$. A typical normal curve is given in Figure 6.2 and from this, we can observe that:

(i) approximately 68% of the data falls within 1 standard deviation of the mean.
(ii) approximately 95% of the data falls within 2 standard deviations of the mean.
(iii) approximately 99.7% of the data falls within 3 standard deviations of the mean.

Figure 6.2: The normal curve with µ = 0 and σ² = 1.

(h) The total area under the curve and above the horizontal axis equals 1.
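The 68%, 95% and 99.7% figures quoted for the normal curve can be reproduced from the standard normal cdf, which is expressible through the error function as $\frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. The sketch below is a minimal illustration using `math.erf`; the function names are illustrative.

```python
import math


def std_normal_cdf(z):
    """P(Z < z) for a standard normal Z, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def within_k_sd(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal X."""
    return std_normal_cdf(k) - std_normal_cdf(-k)


for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```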
The normal distribution was introduced by the French mathematician
Abraham de Moivre in 1733 and was used by him to approximate probabilities
associated with binomial random variables when the binomial parameter n
is large. This result was later extended by Laplace and others and is now
encompassed in a probability theorem known as the central limit theorem,
which gives a theoretical base to the often noted empirical observation that,
in practice, many random phenomena obey, at least approximately, a normal
probability distribution. Some examples of this behaviour are the height of a
person, the velocity in any direction of a molecule in gas, and the error made in
measuring a physical quantity.
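Exercise: Show that $f(x)$ is indeed a probability density function (that is, the area under the normal curve equals 1).
Hint: you need to show that:
\[
\int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\,dx = 1.
\]
By making the substitution $y = \frac{x-\mu}{\sigma}$ you observe that:
\[
\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}y^2}\,dy.
\]
It then suffices to show that $\int_{-\infty}^{\infty} e^{-\frac{1}{2}y^2}\,dy = \sqrt{2\pi}$.

Let $I = \int_{-\infty}^{\infty} e^{-\frac{1}{2}y^2}\,dy$, then,
\[
I^2 = \int_{-\infty}^{\infty} e^{-\frac{1}{2}y^2}\,dy \cdot \int_{-\infty}^{\infty} e^{-\frac{1}{2}x^2}\,dx = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(y^2 + x^2)}\,dy\,dx.
\]
By changing variables to polar coordinates, let $x = r\cos\theta$ and $y = r\sin\theta$, so that $dy\,dx = r\,d\theta\,dr$. Thus,
\[
I^2 = \int_0^{\infty}\int_0^{2\pi} r e^{-\frac{1}{2}r^2}\,d\theta\,dr = 2\pi\int_0^{\infty} r e^{-\frac{1}{2}r^2}\,dr = \left[-2\pi e^{-\frac{1}{2}r^2}\right]_0^{\infty} = 2\pi.
\]
Hence $I = \sqrt{2\pi}$, and so
\[
\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\,dx = \frac{1}{\sqrt{2\pi}} \cdot \sqrt{2\pi} = 1.
\]
Hence the area under the normal curve is equal to 1, which proves that $f(x)$ is a probability density function (pdf).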
The mean of a normal distribution is $\mu$ and the variance is $\sigma^2$. The proof of this will be done in a second year course unit (Probability Theory) using the moment generating function technique.
The normal distribution arises in many statistical problems and forms the foundation for the theory of statistical inference.
Let $X \sim \text{Normal}(\mu, \sigma^2)$. Consider the random variable $Z = \dfrac{X-\mu}{\sigma}$. Then,
\[
E(Z) = E\!\left(\frac{X}{\sigma}\right) - E\!\left(\frac{\mu}{\sigma}\right) = \frac{\mu}{\sigma} - \frac{\mu}{\sigma} = 0
\]
and, since subtracting the constant $\mu/\sigma$ does not change the variance,
\[
\mathrm{Var}(Z) = \frac{1}{\sigma^2}\,\mathrm{Var}(X) = \frac{\sigma^2}{\sigma^2} = 1.
\]
In this case, $Z$ is called a standard normal random variable, or $Z$ is said to have a standard (unit) normal distribution. Thus if $X \sim N(\mu, \sigma^2)$, then $Z \sim N(0, 1)$. The Z-score indicates how many standard deviations an element is from the mean. If a Z-score is equal to zero, it is on the mean. If it is +1, it is 1 standard deviation above the mean. Negative values indicate that the raw scores were below the mean.

Normal probabilities and normal tables


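The standard normal density is
\[
f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}},
\]
and its cumulative distribution function gives the probabilities
\[
\phi(z) = P(Z < z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{t^2}{2}}\,dt.
\]
The use of this cumulative distribution function gives all cumulative probabilities as defined in normal tables. Whenever $X$ assumes a value $x$, the corresponding value of $Z$ is given by $z = \dfrac{x-\mu}{\sigma}$. Therefore if $X$ falls between two values, $X = x_1$ and $X = x_2$, then the standard normal $Z$ falls between the corresponding values:
\[
z_1 = \frac{x_1 - \mu}{\sigma} \quad \text{and} \quad z_2 = \frac{x_2 - \mu}{\sigma}.
\]
Thus, we have the following:
\[
\begin{aligned}
P(x_1 < X < x_2) &= \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\,dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-\frac{1}{2}z^2}\,dz \\
&= P(z_1 < Z < z_2),
\end{aligned}
\]
and
\[
P(a < X < b) = P\!\left(\frac{a-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right) = P\!\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right).
\]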

The probabilities of N(0, 1) are tabulated and published in statistical tables. The
following examples will guide us through on how to read normal tables as
well as applying normal distribution when solving mathematical and statistical
problems.

Example 9 If X is a normal random variable with mean µ = 3 and variance σ² = 16, find:

(a) P(X < 11),

(b) P(X > −1),

(c) P(2 < X < 7).

Solution

(a) $P(X < 11) = P\!\left(\frac{X-\mu}{\sigma} < \frac{11-3}{4}\right) = P(Z < 2)$. The shaded area under the graph corresponds to $P(Z < 2)$. From the tables, it follows that $P(Z < 2) = 0.9772$.

(b) $P(X > -1) = P\!\left(\frac{X-\mu}{\sigma} > \frac{-1-3}{4}\right) = P(Z > -1)$. The shaded area under the graph corresponds to $P(Z > -1)$. From the tables, it follows that $P(Z > -1) = 0.8413$.

(c) $P(2 < X < 7) = P\!\left(\frac{2-3}{4} < \frac{X-\mu}{\sigma} < \frac{7-3}{4}\right) = P(-0.25 < Z < 1)$. The shaded area under the graph corresponds to $P(-0.25 < Z < 1)$. From the tables, it follows that $P(-0.25 < Z < 1) = \phi(1) - \phi(-0.25) = 0.8413 - 0.4013 = 0.44$.
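Table look-ups such as those in Example 9 can be cross-checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8), which takes the standard deviation rather than the variance, so $\sigma = \sqrt{16} = 4$ is passed. Small differences from table values in the fourth decimal place are due to rounding.

```python
from statistics import NormalDist

X = NormalDist(mu=3, sigma=4)      # variance 16 means sigma = 4

p_a = X.cdf(11)                    # P(X < 11)
p_b = 1 - X.cdf(-1)                # P(X > -1)
p_c = X.cdf(7) - X.cdf(2)          # P(2 < X < 7)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```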

Example 10 Suppose that the distribution of monthly earnings for all people who
possess a bachelor’s degree is known to be bell-shaped and symmetric with a mean
of $2116 and a standard deviation of $455. What percentage of individuals with a
bachelor’s degree earn:
(a) less than $1661 per month?
(b) more than $1206 per month?
(c) between $3026 and $3481 per month?

Solution

Let $X$ be the random variable representing monthly earnings (in dollars) for all people who possess a bachelor’s degree. Then $X \sim \text{Normal}(2116, 455^2)$.

(a) $P(X < 1661) = P\!\left(Z < \frac{1661-2116}{455}\right) = P(Z < -1)$.
The shaded area under the graph corresponds to $P(Z < -1)$. From the tables, it follows that $P(Z < -1) = 0.1587$. Therefore, approximately 16% of the people who possess a bachelor’s degree earn less than \$1661 per month.

(b) $P(X > 1206) = P\!\left(Z > \frac{1206-2116}{455}\right) = P(Z > -2)$.
The shaded area under the graph corresponds to $P(Z > -2)$. From the tables, it follows that $P(Z > -2) = 1 - 0.0228 = 0.9772$. Therefore, approximately 98% of the people who possess a bachelor’s degree earn more than \$1206 per month.

(c) $P(3026 < X < 3481) = P\!\left(\frac{3026-2116}{455} < Z < \frac{3481-2116}{455}\right) = P(2 < Z < 3)$.
The shaded area under the graph corresponds to $P(2 < Z < 3)$. From the tables, it follows that $P(2 < Z < 3) = \phi(3) - \phi(2) = 0.9987 - 0.9772 = 0.0215$. Therefore, approximately 2% of the people who possess a bachelor’s degree earn between \$3026 and \$3481 per month.
Example 11 The distribution of heights for all American women aged 18 to 24 is
normally distributed with a mean height of 62.5 inches and a variance of 6.25 square inches. What percentage of American women aged 18 to 24 is:
(a) between 60 inches and 70 inches tall?
(b) less than 65 inches tall?
(c) between 55 inches and 60 inches tall?
Solution
Let $Y$ be a random variable that represents heights for all American women aged 18 to 24. Then $Y \sim \text{Normal}(62.5, 6.25)$, so $\sigma = 2.5$.

(a) $P(60 < Y < 70) = P\!\left(\frac{60-62.5}{2.5} < Z < \frac{70-62.5}{2.5}\right) = P(-1 < Z < 3)$.
The shaded area under the graph corresponds to $P(-1 < Z < 3)$. From the tables, it follows that $P(-1 < Z < 3) = \phi(3) - \phi(-1) = 0.9987 - 0.1587 = 0.84$. Therefore, approximately 84% of American women aged 18 to 24 are between 60 and 70 inches tall.

(b) $P(Y < 65) = P\!\left(Z < \frac{65-62.5}{2.5}\right) = P(Z < 1)$. The shaded area under the graph corresponds to $P(Z < 1)$. From the tables, it follows that $P(Z < 1) = 0.8413$. Therefore, approximately 84% of American women aged 18 to 24 are less than 65 inches tall.

(c) $P(55 < Y < 60) = P\!\left(\frac{55-62.5}{2.5} < Z < \frac{60-62.5}{2.5}\right) = P(-3 < Z < -1)$.
The shaded area under the graph corresponds to $P(-3 < Z < -1)$. From the tables, it follows that $P(-3 < Z < -1) = \phi(-1) - \phi(-3) = 0.1587 - 0.0013 = 0.1574$. Therefore, approximately 16% of American women aged 18 to 24 are between 55 and 60 inches tall.
Remark: It is good practice to always draw the normal curve and shade the area you are looking for, as done in the previous examples.
Example 12 The I.Q scores of all teenagers in a certain country are known to follow a
normal distribution with a mean of 100 and a standard deviation of 10.5.
(a) What percentage of teenagers in that country have I.Q scores between 89.5 and
121?
(b) What is the probability of finding a teenager in that country with an I.Q score of
less than 79?
(c) How many out of 1000 teenagers in that country have I.Q scores between 110.5
and 131.5?
Solution
Let $X$ be a random variable representing I.Q scores for teenagers in that country. Then, $X \sim \text{Normal}(100, 10.5^2)$.

(a) $P(89.5 < X < 121) = P\!\left(\frac{89.5-100}{10.5} < Z < \frac{121-100}{10.5}\right) = P(-1 < Z < 2) = \phi(2) - \phi(-1)$. From the tables, $P(-1 < Z < 2) = 0.9772 - 0.1587 = 0.8185$. Therefore, approximately 82% of teenagers in that country have I.Q scores between 89.5 and 121.

(b) $P(X < 79) = P\!\left(Z < \frac{79-100}{10.5}\right) = P(Z < -2)$. From the tables, $P(Z < -2) = 0.0228$. Therefore the probability that a teenager in that country has an I.Q score of less than 79 is 0.0228.

(c) $P(110.5 < X < 131.5) = P\!\left(\frac{110.5-100}{10.5} < Z < \frac{131.5-100}{10.5}\right) = P(1 < Z < 3) = \phi(3) - \phi(1)$. From the tables, $P(1 < Z < 3) = 0.9987 - 0.8413 = 0.1574$. Therefore, approximately 157 teenagers out of 1000 in that country have I.Q scores between 110.5 and 131.5.

Example 13 Suppose that the test scores for a particular college entrance exam are
distributed according to a bell-shaped, symmetric distribution with a mean of 450 and a
variance of 10000.
(a) What percentage of the students who take the entrance exam score between 350
and 650?
(b) Any student who scores higher than 550 is automatically admitted to the college.
What is the probability that a student who takes the entrance exam will be
automatically admitted to the college?
(c) What percentage of the students who take the entrance exam score between 250
and 350?

Solution

Let $C$ be a random variable representing test scores for a particular college entrance exam. Then, $C \sim \text{Normal}(450, 100^2)$.

(a) $P(350 < C < 650) = P\!\left(\frac{350-450}{100} < Z < \frac{650-450}{100}\right) = P(-1 < Z < 2) = \phi(2) - \phi(-1)$. From the tables, $P(-1 < Z < 2) = 0.9772 - 0.1587 = 0.8185$. Therefore, approximately 82% of the students score between 350 and 650.

(b) $P(C > 550) = P\!\left(Z > \frac{550-450}{100}\right) = P(Z > 1)$. From the tables, $P(Z > 1) = 1 - 0.8413 = 0.1587$. Therefore, the probability that a student who takes the entrance exam will be automatically admitted to the college is 0.1587.

(c) $P(250 < C < 350) = P\!\left(\frac{250-450}{100} < Z < \frac{350-450}{100}\right) = P(-2 < Z < -1) = \phi(-1) - \phi(-2)$. From the tables, $P(-2 < Z < -1) = 0.1587 - 0.0228 = 0.1359$. Therefore, approximately 13.6% of the students score between 250 and 350.

Example 14 For one of the examinations done by students offering a Bachelor’s degree
at the University, the average score was 72% and the standard deviation was 8%. If top
10% performers obtained A’s and the scores were normally distributed, what was the
lowest possible A and highest possible B?

Solution

From the figure, we need to first find a z-value that leaves an area of 0.1 to the right and 0.9 to the left. From the tables, the z-value is 1.28. It follows that
\[
\frac{x - 0.72}{0.08} = 1.28.
\]
This gives $x = 0.8224$. Therefore the highest possible B is 82% and the lowest possible A is 83%.
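Example 14 is an inverse look-up: the probability is known and the cut-off score is wanted. A sketch using the `inv_cdf` method of `statistics.NormalDist` is given below, working in percentage points rather than proportions. Rounding to whole marks with `math.ceil` and `math.floor` is an assumption that marks are awarded as whole percentages.

```python
import math
from statistics import NormalDist

scores = NormalDist(mu=72, sigma=8)     # marks in percent

z = NormalDist().inv_cdf(0.90)          # z-value with area 0.9 to its left
cutoff = scores.inv_cdf(0.90)           # same as 72 + 8 * z

lowest_a = math.ceil(cutoff)            # assumes whole-number marks
highest_b = math.floor(cutoff)

print(round(z, 2), round(cutoff, 2), lowest_a, highest_b)
```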

Normal approximation to Binomial


The normal distribution provides a close approximation to the binomial when $n$ is large and $p$ is close to $\frac{1}{2}$. The approximation is fairly accurate when $np$ and $nq$ are at least 5.

To investigate the normal approximation to the binomial distribution, we drew a histogram for a binomial with parameters $n = 15$ and $p = 0.45$, together with a normal curve with mean $\mu = np = 15 \times 0.45 = 6.75$ and a variance of $npq = 15 \times 0.45 \times 0.55 = 3.7125$. Figure 6.3 depicts this.
It can be observed that the normal and binomial distributions are close under the parameters used. We also compared the binomial probability $P(X = 6) = 0.1914$ with the normal approximation,
\[
P(5.5 < X < 6.5) = P\!\left(\frac{5.5-6.75}{1.9268} < Z < \frac{6.5-6.75}{1.9268}\right) = P(-0.65 < Z < -0.13) = 0.1905.
\]
The two answers agree to two decimal places.
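The comparison between the exact binomial probability and its normal approximation for $n = 15$, $p = 0.45$ can be reproduced as sketched below. The helper `binom_pmf` is an illustrative name. Since the code does not round the z-values to two decimals as the tables require, its approximate value may differ slightly from the tabulated figure in the fourth decimal place.

```python
import math
from statistics import NormalDist


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 15, 0.45
mu = n * p                                  # 6.75
sigma = math.sqrt(n * p * (1 - p))          # about 1.9268

exact = binom_pmf(6, n, p)                  # P(X = 6)

# P(5.5 < X < 6.5) under the approximating normal curve
approx_dist = NormalDist(mu, sigma)
approx = approx_dist.cdf(6.5) - approx_dist.cdf(5.5)

print(round(exact, 4), round(approx, 3))
```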

Figure 6.3: The normal curve approximation to binomial(15,0.45).

Example 15 During a measles outbreak in one village, the probability that a baby recovered from it was 0.6. If 120 babies were known to have contracted measles, what is the probability that fewer than one-half survived?

Solution

Let $X$ be the binomial random variable that represents the number of babies that survived. Since $n = 120$, we can obtain fairly accurate results using the normal approximation with:
\[
\mu = np = 120 \times 0.6 = 72, \qquad \sigma = \sqrt{npq} = \sqrt{120 \times 0.6 \times 0.4} = 5.3666.
\]
The required probability is given by:
\[
P(X < 60) = \sum_{x=0}^{59} \text{bin}(120, 0.6) \approx P\!\left(Z < \frac{59.5 - 72}{5.3666}\right) = P(Z < -2.33) = 0.0099.
\]

Example 16 A multiple-choice quiz has 75 questions, each with 4 possible answers,


of which only one is the right answer. What is the probability that sheer guess work
results in between 20 and 30 right answers?

Solution

Let $X$ represent the number of right answers due to guesswork; then $X \sim \text{bin}(75, 0.25)$. We need to work out:
\[
P(20 \le X \le 30) = \sum_{x=20}^{30} \text{bin}(75, 0.25).
\]
Using the normal approximation with:
\[
\mu = np = 75 \times 0.25 = 18.75, \qquad \sigma = \sqrt{npq} = \sqrt{75 \times 0.25 \times 0.75} = 3.75,
\]
the required probability is given by:
\[
P(20 \le X \le 30) = \sum_{x=20}^{30} \text{bin}(75, 0.25) \approx P\!\left(\frac{19.5 - 18.75}{3.75} \le Z \le \frac{30.5 - 18.75}{3.75}\right) = P(0.2 \le Z \le 3.13) = 0.4198.
\]

Example 17 Find the percentage error in approximating $\sum_{x=1}^{4} \text{bin}(20, 0.1)$ by the normal approximation.

Solution

The exact solution for $\sum_{x=1}^{4} \text{bin}(20, 0.1)$ is given by:
\[
\sum_{x=1}^{4} \text{bin}(20, 0.1) = \sum_{k=1}^{4} \binom{20}{k} (0.1)^k (0.9)^{20-k} = 0.8352.
\]
For the normal approximation of $\sum_{x=1}^{4} \text{bin}(20, 0.1)$, we use $\mu = np = 20 \times 0.1 = 2$ and $\sigma = \sqrt{npq} = \sqrt{20 \times 0.1 \times 0.9} = 1.3416$. The normal approximation is thus given as:
\[
\sum_{x=1}^{4} \text{bin}(20, 0.1) \approx P\!\left(\frac{0.5 - 2}{1.3416} \le Z \le \frac{4.5 - 2}{1.3416}\right) = P(-1.12 \le Z \le 1.86) = 0.8372.
\]
This results in an error of $|0.8372 - 0.8352| = 0.0020$. The percentage error is then given as:
\[
\%\,\text{error} = \frac{0.0020}{0.8352} \times 100\% \approx 0.2\%.
\]
Remark: We subtract or add 0.5 to the original binomial values since we are approximating a discrete distribution by a continuous distribution. The 0.5 in this case is called the continuity correction factor.
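The percentage-error calculation in Example 17, including the continuity correction of 0.5 on each side, can be sketched as follows. The names used are illustrative. The code keeps full precision throughout, so its figures can differ in the last decimal place from values read off tables with z rounded to two decimals.

```python
import math
from statistics import NormalDist


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 20, 0.1
lo, hi = 1, 4                                     # P(1 <= X <= 4)

exact = sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))

mu = n * p
sigma = math.sqrt(n * p * (1 - p))
normal = NormalDist(mu, sigma)
# Continuity correction: widen the interval by 0.5 on each side
approx = normal.cdf(hi + 0.5) - normal.cdf(lo - 0.5)

pct_error = abs(approx - exact) / exact * 100

print(round(exact, 4), round(approx, 3), round(pct_error, 1))
```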

Exercise

1. Given that a random variable X is normally distributed with mean µ = 30


and standard deviation σ = 4, find:

(a) the area below 22,


(b) the area above 24,
(c) the area between 32 and 40,
(d) the area between 20 and 28,
(e) the area between 26 and 38,
(f) the x value that has 20% of the area to the left tail,
(g) the x value that has 12% of the area to the right tail.

2. Suppose Y is normally distributed with mean 16 and a standard deviation


of 2.2, find:

(a) P(Y < 14),


(b) P(15 < Y < 18),
(c) the value of k such that P(Y < k) = 0.2,
(d) the value of k such that P(Y > k) = 0.88.

3. A soft-drink machine is regulated so that it discharges an average of 298


millilitres of soda per 300-millilitre bottle. If the amount of soda in the
packed bottles is normally distributed with a standard deviation of 10
millilitres,

(a) what fraction of the bottles will contain more than 285 millilitres of
soda?
(b) what is the probability that a packed bottle contains between 275 and
307 millilitres of soda?
(c) How many bottles will likely overflow if 290-millilitre bottles are
used for a special order made for 500 bottles?
(d) below what value is the smallest 18% of the packed soda bottles?

4. A certain student commutes daily from her hostel to the University. On


average, the journey takes 18 minutes, with a standard deviation of 2.5
minutes. Assume that her journey times are normally distributed.

(a) What is the probability that a journey will take at least a quarter of an hour?
(b) If lectures begin at 8:00 A.M. and she leaves her hostel at 7:40 A.M. every day, on how many days is she expected to be late for lectures, given that she attends lectures 5 days a week?

(c) Find the probability that 3 of the next 5 journeys will take at least 20
minutes?
5. For a Psychology exam done last year at a certain University, grades were
approximately normally distributed with a mean of 72% and a standard
deviation of 8.5%, find:

(a) the pass mark if the lowest 14% of the students failed the exam,
(b) the highest mark for a B if the top 8% of the students obtained A’s.
6. In a certain continuous assessment test (CAT) done and marked out of 30
at the department of mathematics, the average was 24.6, with a standard
deviation of 1.5. Given that 8 students scored between 26.4 and 28.2, and
that the CAT scores were approximately normally distributed, how many
students sat for the CAT?
7. The average life of a certain type of car battery is 2 years, with a standard deviation of 3 months. The company replaces, free of charge, all car batteries that fail while under guarantee. If the company is willing to replace only 5% of the batteries, how long a guarantee should it offer, assuming that the lives of car batteries follow a normal distribution?
8. Suppose Y has a binomial distribution with parameters n = 15 and p = 0.35. Find the error in approximating P(2 ≤ Y < 5) by the normal approximation.

9. A pair of dice is rolled 180 times. Using the normal approximation to the binomial, find the probability that a total of 7 occurs:
(a) at least 25 times (answer: 0.8643),
(b) between 33 and 41 times inclusive (answer: 0.2978),
(c) exactly 30 times (answer: 0.0796).
10. According to the last census done in Uganda, approximately 46% of all households own television sets in their respective homes. What is the probability that between 480 and 510 inclusive of the next 1000 randomly selected households in Uganda own television sets in their homes? Hint: Use the normal approximation to the binomial.