Normal Distribution With Solved Examples
Jonathan Marchini
1 Introduction
In previous lectures we have considered discrete datasets and discrete probability
distributions. In practice many datasets that we collect from experiments consist
of continuous measurements. For example, Figures 1, 2, 3 and 4 show histograms
of real datasets consisting of continuous measurements. From such samples of
continuous data we might want to test whether the data is consistent with a spe-
cific population mean value or whether there is a significant difference between
2 groups of data. To answer these questions we need a probability model for the
data. The Normal distribution is one such model and is used extensively throughout
statistics.
[Figures 1 to 4: histograms of real datasets of continuous measurements (vertical axis: Frequency).]
[Figure: a probability distribution, P(X) plotted against X for X from 0 to 20.]
3 The Normal Distribution
There are many possible probability density functions over a continuous range
of values. The Normal distribution describes a special class of such
distributions that are symmetric and can be described by the distribution mean µ
and the standard deviation σ (or variance σ²). Four different Normal distributions
are shown in Figure 7 together with the values of µ and σ. These plots illustrate
how changing the values of µ and σ alters the positions and shapes of the
distributions: µ sets the centre of the curve and σ sets its spread. If X is
Normally distributed with mean µ and standard deviation σ we write
X ∼ N(µ, σ²)
µ and σ are the parameters of the distribution.
[Figure 7: four Normal density curves (vertical axis: density), with panels µ = 100, σ = 10; µ = 100, σ = 5; µ = 130, σ = 10; µ = 100, σ = 15.]
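The effect of µ and σ on the curve can be checked numerically. The sketch below is illustrative only: the function name normal_pdf is our own, and it evaluates the standard formula for the N(µ, σ²) density at the centre of each of the four panels described for Figure 7.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

# The four (mu, sigma) pairs from the panels of Figure 7.
for mu, sigma in [(100, 10), (100, 5), (130, 10), (100, 15)]:
    peak = normal_pdf(mu, mu, sigma)  # the curve is highest at x = mu
    print(f"mu={mu}, sigma={sigma}: peak density {peak:.4f} at x={mu}")
```

Changing µ moves the peak without changing its height, while a smaller σ gives a taller, narrower curve.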
3.1 Calculating probabilities from the Normal distribution
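For a discrete probability distribution we calculate the probability of being less
than some value x, i.e. P (X < x), by simply summing up the probabilities of the
values less than x. For a continuous probability distribution the corresponding
probability is the area under the density curve to the left of x. For example,
suppose Z ∼ N(0, 1) and we want P (Z < 0).
[Figure: N(0, 1) density with the area to the left of 0 shaded, P(Z < 0).]
For this example we can calculate the required area as we know the distribution is
symmetric and the total area under the curve is equal to 1, i.e. P (Z < 0) = 0.5.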
What about P (Z < 1.0)?
[Figure: N(0, 1) density with the area to the left of 1 shaded, P(Z < 1).]
Calculating this area is not easy (for those Mathematicians who recognize this
area as a definite integral and try to do the integral by hand, please note that
the integral cannot be evaluated analytically) and so we use probability tables.
Probability tables are tables of probabilities that have been calculated on a
computer. All we have to do is identify the right probability in the table and
copy it down! Obviously it is impossible to tabulate all possible probabilities
for all possible Normal distributions so only one special Normal distribution,
N(0, 1), has been tabulated. The N(0, 1) distribution is called the standard
Normal distribution.
The tables allow us to read off probabilities of the form P (Z < z). Most of
the table in the formula book has been reproduced in Table 1. From this table
we can identify that P (Z < 1.0) = 0.8413 (the entry in row 1.0, column 0.00).
z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0   0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1   0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
Table 1: N(0, 1) probability table. Each entry is P (Z < z), where z is the row
label plus the column heading.
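The tabulated values can be reproduced on a computer. The sketch below is not how the formula book was produced; it simply uses the known identity P(Z < z) = ½(1 + erf(z/√2)) with Python's math.erf. The function name phi is our own choice.

```python
import math

def phi(z):
    """P(Z < z) for Z ~ N(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(round(phi(0.0), 4))   # 0.5, by symmetry
print(round(phi(1.0), 4))   # 0.8413, the entry in row 1.0, column 0.00
print(round(phi(0.92), 4))  # 0.8212, the entry in row 0.9, column 0.02
```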
Once we know how to read the tables we can calculate lots of other probabilities.
Example 1 P (Z > 0.92)
We know that P (Z > 0.92) = 1 − P (Z < 0.92) and we can read P (Z < 0.92) = 0.8212
from the tables, so P (Z > 0.92) = 1 − 0.8212 = 0.1788.
Example 2 P (Z > −0.5)
The Normal distribution is symmetric so we know that P (Z > −0.5) = P (Z <
0.5) = 0.6915
Example 3 P (Z < −0.76)
By symmetry
P (Z < −0.76) = P (Z > 0.76) = 1 − P (Z < 0.76)
= 1 − 0.7764
= 0.2236
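Example 4 P (−0.64 < Z < 0.43)
[Figure: the area between −0.64 and 0.43 is the area to the left of 0.43 minus the area to the left of −0.64.]
We can calculate this using
P (−0.64 < Z < 0.43) = P (Z < 0.43) − P (Z < −0.64)
= 0.6664 − (1 − 0.7389)
= 0.4053
Example 5 P (Z < 0.567)
From tables we know that P (Z < 0.56) = 0.7123 and P (Z < 0.57) = 0.7157.
To calculate P (Z < 0.567) we interpolate between these two values. Since 0.567
lies 0.7 of the way from 0.56 to 0.57,
P (Z < 0.567) = 0.3 × 0.7123 + 0.7 × 0.7157
= 0.7147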
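The table manipulations used in this section (complement, symmetry, difference of two areas, interpolation) can be sketched in a few lines. The dictionary below holds only the Table 1 entries needed here, and the helper names below, above, between and interpolate are our own.

```python
# Table 1 entries used in this section: z -> P(Z < z)
TABLE = {0.43: 0.6664, 0.50: 0.6915, 0.56: 0.7123, 0.57: 0.7157,
         0.64: 0.7389, 0.76: 0.7764, 0.92: 0.8212}

def below(z):
    """P(Z < z) from the table; negative z is handled by symmetry."""
    if z >= 0:
        return TABLE[round(z, 2)]
    return 1.0 - TABLE[round(-z, 2)]

def above(z):
    """P(Z > z) as the complement of P(Z < z)."""
    return 1.0 - below(z)

def between(a, b):
    """P(a < Z < b) as the difference of two left-hand areas."""
    return below(b) - below(a)

def interpolate(z, z_lo, z_hi):
    """Linear interpolation between two neighbouring table entries."""
    w = (z - z_lo) / (z_hi - z_lo)
    return (1.0 - w) * below(z_lo) + w * below(z_hi)

print(round(above(0.92), 4))                     # P(Z > 0.92)
print(round(above(-0.5), 4))                     # P(Z > -0.5)
print(round(below(-0.76), 4))                    # P(Z < -0.76)
print(round(between(-0.64, 0.43), 4))            # P(-0.64 < Z < 0.43)
print(round(interpolate(0.567, 0.56, 0.57), 4))  # P(Z < 0.567)
```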
3.2 Standardization
All of the probabilities above were calculated for the standard Normal distribution
N(0, 1). If we want to calculate probabilities from different Normal distributions
we convert the probability to one involving the standard Normal distribution. This
process is called standardization.
For example, suppose X ∼ N(3, 4) and we want to calculate P (X < 6.2). We convert
this probability to one involving the N(0, 1) distribution by
(i) Subtracting the mean µ
(ii) Dividing by the standard deviation σ
Subtracting the mean re-centers the distribution on zero. Dividing by the standard
deviation re-scales the distribution so it has standard deviation 1. If we also
transform the boundary point of the area we wish to calculate we obtain the
equivalent boundary point for the N(0, 1) distribution. This process is illustrated
in the figure below. In this example, P (X < 6.2) = P (Z < 1.6) = 0.9452 where
Z ∼ N(0, 1).
[Figure: N(3, 4) with boundary 6.2; subtracting 3 gives N(0, 4) with boundary 3.2; dividing by 2 gives N(0, 1) with boundary 1.6.]
This process can be described by the following rule
If X ∼ N(µ, σ²) and Z = (X − µ)/σ
then Z ∼ N(0, 1)
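The standardization rule can be sketched directly in code. This is a minimal illustration, assuming Python 3.8 or later for statistics.NormalDist, whose cdf method plays the role of the probability table; the function name standardize is our own.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Map a boundary point x of N(mu, sigma^2) to the N(0, 1) scale."""
    return (x - mu) / sigma

# X ~ N(3, 4): the variance is 4, so the standard deviation is 2.
z = standardize(6.2, mu=3, sigma=2)
p = NormalDist().cdf(z)  # P(Z < z) for the standard Normal

print(round(z, 2), round(p, 4))  # 1.6 0.9452
```

Note that the second parameter in N(µ, σ²) is the variance, so the divisor is its square root.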
Example 6
Suppose we know that the birth weight of babies is Normally distributed with
mean 3500g and standard deviation 500g. What is the probability that a baby is
born that weighs less than 3100g?
Drawing a rough diagram of the process can help you to avoid any confusion
about which probability (area) you are trying to calculate.
P (X < 3100) = P ((X − 3500)/500 < (3100 − 3500)/500)
= P (Z < −0.8) where Z ∼ N(0, 1)
= 1 − P (Z < 0.8)
= 1 − 0.7881
= 0.2119
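As a check on the birth weight calculation, the sketch below follows the same two steps (standardize, then use symmetry) and compares the result with a direct evaluation. It assumes statistics.NormalDist from the Python standard library; the variable names are our own.

```python
from statistics import NormalDist

mu, sigma = 3500, 500          # birth weight parameters, in grams
z = (3100 - mu) / sigma        # standardized boundary, -0.8

std_normal = NormalDist()      # N(0, 1)
p_by_symmetry = 1.0 - std_normal.cdf(-z)     # 1 - P(Z < 0.8)
p_direct = NormalDist(mu, sigma).cdf(3100)   # P(X < 3100) without tables

print(round(p_by_symmetry, 4), round(p_direct, 4))
```

Both routes give approximately 0.2119, in agreement with the table-based answer.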
3.3 Linear combinations of Normal random variables
Suppose two rats A and B have been trained to navigate a large maze. The time it
takes rat A is normally distributed with mean 80 seconds and standard deviation
10 seconds. The time it takes rat B is normally distributed with mean 78 seconds
and standard deviation 13 seconds. On any given day what is the probability that
rat A runs the maze faster than rat B?
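Let X and Y be the times taken by rats A and B, so X ∼ N(80, 10²) and
Y ∼ N(78, 13²), and assume the two times are independent. Let D = X − Y be the
difference in the times. Then
D ∼ N(80 − 78, 10² + 13²) = N(2, 269)
and rat A runs the maze faster than rat B when D < 0.
[Figure: N(2, 269) with boundary 0; standardizing with Z = (D − 2)/16.40 gives N(0, 1) with boundary (0 − 2)/16.40 = −0.122.]
P (D < 0) = P ((D − 2)/√269 < (0 − 2)/√269)
= P (Z < −0.122) where Z ∼ N(0, 1)
= 1 − P (Z < 0.122)
= 1 − (0.8 × 0.5478 + 0.2 × 0.5517)
= 0.45142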
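The rat A versus rat B calculation can be checked numerically. The sketch below assumes, as the calculation itself requires, that the two times are independent, so that the difference D has mean 80 − 78 and variance 10² + 13². Variable names are our own.

```python
import math
from statistics import NormalDist

mu_a, sd_a = 80, 10   # rat A: mean and standard deviation (seconds)
mu_b, sd_b = 78, 13   # rat B

# D = (time of A) - (time of B); variances add for independent variables.
mu_d = mu_a - mu_b
var_d = sd_a ** 2 + sd_b ** 2
d = NormalDist(mu_d, math.sqrt(var_d))

p_a_faster = d.cdf(0)  # P(D < 0)
print(mu_d, var_d, round(p_a_faster, 3))
```

The result is about 0.451; the small difference from 0.45142 comes from interpolating in a four-figure table.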
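Other rules that are often used are
If X and Y are two independent normal variables such that
X ∼ N(µ₁, σ₁²) and Y ∼ N(µ₂, σ₂²)
then
X + Y ∼ N(µ₁ + µ₂, σ₁² + σ₂²)
aX ∼ N(aµ₁, a²σ₁²)
aX + bY ∼ N(aµ₁ + bµ₂, a²σ₁² + b²σ₂²)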
Example 7 Suppose two rats A and B have been trained to navigate a large maze.
The time it takes rat A is normally distributed with mean 80 seconds and standard
deviation 10 seconds. The time it takes rat B is normally distributed with mean 78
seconds and standard deviation 13 seconds. On any given day what is the proba-
bility that the average time the rats take to run the maze is greater than 82 seconds?
Let A = (X + Y)/2 = ½X + ½Y be the average time of rats A and B, where X and Y
are the times taken by rats A and B.
Then A ∼ N(½ × 80 + ½ × 78, (½)² × 10² + (½)² × 13²) = N(79, 67.25)
[Figure: N(79, 67.25) with boundary 82; standardizing with Z = (A − 79)/8.20 gives N(0, 1) with boundary (82 − 79)/8.20 = 0.366.]
P (A > 82) = P ((A − 79)/√67.25 > (82 − 79)/√67.25)
= P (Z > 0.366) where Z ∼ N(0, 1)
= 1 − P (Z < 0.366)
= 1 − (0.4 × 0.6406 + 0.6 × 0.6443)
= 0.35718
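The rule for aX + bY can be written as a small helper and applied to the average of the two times. This is a sketch under the independence assumption stated in the rule; the function name linear_combination is our own, and statistics.NormalDist is used only to hold a mean and a standard deviation.

```python
import math
from statistics import NormalDist

def linear_combination(a, x, b, y):
    """Distribution of aX + bY for independent Normal X and Y."""
    mu = a * x.mean + b * y.mean
    var = a ** 2 * x.variance + b ** 2 * y.variance
    return NormalDist(mu, math.sqrt(var))

rat_a = NormalDist(80, 10)
rat_b = NormalDist(78, 13)

average = linear_combination(0.5, rat_a, 0.5, rat_b)
p = 1.0 - average.cdf(82)  # P(average > 82)

print(round(average.mean, 2), round(average.variance, 2), round(p, 3))
```

The helper takes the coefficients a and b explicitly so that the same function covers X + Y (a = b = 1) and X − Y (a = 1, b = −1).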
3.4 Using the Normal tables backwards
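Example 8
The marks of 500 candidates in an examination are normally distributed with a
mean of 45 marks and a standard deviation of 20 marks. Let X ∼ N(45, 400) be a
candidate's mark. The mark x of interest satisfies
⇒ P (X < x) = 0.8
[Figure: N(45, 400) with boundary x; standardizing with Z = (X − 45)/20 gives N(0, 1) with boundary (x − 45)/20 = 0.84.]
Standardizing this probability we get
P ((X − 45)/20 < (x − 45)/20) = 0.8
⇒ P (Z < (x − 45)/20) = 0.8 where Z ∼ N(0, 1)
Using the tables backwards, the entry closest to 0.8 is P (Z < 0.84) = 0.7995, so
(x − 45)/20 = 0.84
⇒ x = 45 + 20 × 0.84 = 61.8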
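Reading the tables backwards corresponds to inverting the cumulative probability. The sketch below assumes Python 3.8 or later, where statistics.NormalDist provides an inv_cdf method; it finds the z value with P(Z < z) = 0.8 and then undoes the standardization.

```python
from statistics import NormalDist

mu, sigma = 45, 20                 # exam marks: mean and standard deviation

z = NormalDist().inv_cdf(0.8)      # z with P(Z < z) = 0.8, close to 0.84
x = mu + sigma * z                 # undo Z = (X - mu) / sigma

print(round(z, 2), round(x, 1))
```

The exact z value is slightly above the tabulated 0.84, so the mark found this way is marginally higher than the one obtained from the four-figure table.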
4 The Normal approximation to the Binomial
Under certain conditions we can use the Normal distribution to approximate the
Binomial distribution. This can be very useful when we need to sum up a large
number of Binomial probabilities to calculate the probability that we want.
For example, Figure 8 compares a Bin(300, 0.5) and a N(150, 75) which both
have the same mean and variance. The figure shows that the distributions are very
similar.
[Figure 8: Bin(300, 0.5) probabilities P(X = x) and the N(150, 75) density, both plotted for X from 100 to 200.]
In general,
If X ∼ Bin(n, p) then
µ = np
σ² = npq where q = 1 − p
and X is approximately distributed as
X ∼ N(np, npq)
provided that
n > 10 and p ≈ 1/2
OR n > 30 and p moving away from 1/2
Example 8
Suppose X ∼ Bin(12, 0.5). What is P (4 ≤ X ≤ 7)?
For this distribution we have
µ = np = 6
σ² = npq = 3
so we can use a Normal approximation N(6, 3).
Unfortunately, it’s not quite so simple. We have to take into account the fact
that we are using a continuous distribution to approximate a discrete distribution.
This is done using a continuity correction. The continuity correction appropriate
for this example is illustrated in the figure below
[Figure: the values 0 to 12 of X, with the continuity-corrected interval from 3.5 to 7.5 marked.]
P (3.5 < X < 7.5) = P ((3.5 − 6)/√3 < (X − 6)/√3 < (7.5 − 6)/√3)
= P (−1.443 < Z < 0.866) where Z ∼ N(0, 1)
= 0.732
The exact answer is 0.733 so in this case the approximation is very good.
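The comparison between the exact Binomial probability and the continuity-corrected Normal approximation can be reproduced as follows. This is a sketch using only the Python standard library (math.comb needs Python 3.8 or later); variable names are our own.

```python
import math
from statistics import NormalDist

n, p = 12, 0.5
q = 1 - p

# Exact: sum the Binomial probabilities for X = 4, 5, 6, 7.
exact = sum(math.comb(n, k) * p ** k * q ** (n - k) for k in range(4, 8))

# Approximation: N(np, npq) with the interval widened to (3.5, 7.5).
approx_dist = NormalDist(n * p, math.sqrt(n * p * q))
approx = approx_dist.cdf(7.5) - approx_dist.cdf(3.5)

print(round(exact, 3), round(approx, 3))
```

Without the continuity correction (using 4 and 7 as the boundaries) the Normal area would miss half a unit at each end, which is why the interval is widened.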
5 The Normal approximation to the Poisson
We can also use the Normal distribution to approximate a Poisson distribution
under certain conditions.
In general,
If X ∼ Po(λ) then
µ = λ
σ² = λ
and, for large λ, X is approximately distributed as
X ∼ N(λ, λ)
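Example 9 A radioactive source emits particles at an average rate of 25 particles
per second. What is the probability that in 1 second the count is less than 27
particles?
Let X be the count in 1 second, so X ∼ Po(25), which we approximate by N(25, 25).
The count is less than 27 when it is at most 26, so with the continuity correction
we calculate P (X < 26.5).
[Figure: N(25, 25) with boundary 26.5; standardizing with Z = (X − 25)/5 gives N(0, 1) with boundary (26.5 − 25)/5 = 0.3.]
P (X < 26.5) = P ((X − 25)/5 < (26.5 − 25)/5)
= P (Z < 0.3) where Z ∼ N(0, 1)
= 0.6179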
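The quality of the Poisson approximation can be checked in the same way. The sketch below compares the Normal value at the continuity-corrected boundary 26.5 used in the worked solution with the exact Poisson probability of a count of at most 26, which is the event that boundary represents. Variable names are our own.

```python
import math
from statistics import NormalDist

lam = 25  # average number of particles per second

# Exact: P(X <= 26) for X ~ Po(25), summing the Poisson probabilities.
exact = sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(27))

# Approximation: N(lam, lam) evaluated at the corrected boundary 26.5.
approx = NormalDist(lam, math.sqrt(lam)).cdf(26.5)

print(round(exact, 3), round(approx, 4))
```

The two values agree to within roughly 0.01, which gives a feel for the size of the error at λ = 25.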
Lecture 6 : The Normal Distribution
Jonathan Marchini
Continuous data
[Figures: histograms of continuous datasets (vertical axis: Frequency).]
For continuous data we don’t have equally spaced
discrete values so instead we use a curve or function
that describes the probability density over the range of
the distribution.
The curve is chosen so that the area under the curve is
equal to 1.
If we observe a sample of data from such a distribution
we should see that the values occur in regions where
the density is highest.
A continuous probability distribution
[Figure: a continuous probability density curve (vertical axis: density).]
[Figure: Normal density curves for X from 50 to 150, including panels µ = 130, σ = 10 and µ = 100, σ = 15.]
[Figure: N(0, 1) density with the area to the left of 0 shaded.]
Symmetry ⇒ P (Z < 0) = 0.5
What about P (Z < 1.0)?
Calculating this area is not easy and so we use probability tables. Probability
tables are tables of probabilities that have been calculated on a computer. All
we have to do is identify the right probability in the table and copy it down!
Only one special Normal distribution, N(0, 1), has
been tabulated.
The N(0, 1) distribution is called
the standard Normal distribution.
The tables allow us to read off probabilities of the form
P (Z < z).
The Normal distribution is symmetric so we know that
P (Z > −0.5) = P (Z < 0.5) = 0.6915
Example 3
By symmetry
P (Z < −0.76) = P (Z > 0.76) = 1 − P (Z < 0.76)
= 1 − 0.7764
= 0.2236
Example 4
We can calculate this probability as
P (−0.64 < Z < 0.43) = P (Z < 0.43) − P (Z < −0.64)
= 0.6664 − (1 − 0.7389)
= 0.4053
Example 5
[Figure: N(3, 4) with boundary 6.2; subtracting 3 gives N(0, 4) with boundary 3.2; dividing by 2 gives N(0, 1) with boundary 1.6.]
We convert this probability to one involving the
N(0, 1) distribution by
(i) Subtracting the mean µ
(ii) Dividing by the standard deviation σ
Subtracting the mean re-centers the distribution on
zero. Dividing by the standard deviation re-scales the
distribution so it has standard deviation 1. If we also
transform the boundary point of the area we wish to
calculate we obtain the equivalent boundary point for
the N(0, 1) distribution.
⇒ P (X < 6.2) = P (Z < 1.6) = 0.9452 where Z ∼ N(0, 1)
This process can be described by the following rule
If X ∼ N(µ, σ²) and Z = (X − µ)/σ
then
Z ∼ N(0, 1)
Example 6
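[Figure: D ∼ N(2, 269) with boundary 0; standardizing with Z = (D − 2)/16.40 gives N(0, 1) with boundary (0 − 2)/16.40 = −0.122.]
P (D < 0) = P ((D − 2)/√269 < (0 − 2)/√269)
= P (Z < −0.122) where Z ∼ N(0, 1)
= 0.45142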
Other rules that are often used are
If X and Y are two independent normal
variables such that
X ∼ N(µ₁, σ₁²) and Y ∼ N(µ₂, σ₂²)
then
X + Y ∼ N(µ₁ + µ₂, σ₁² + σ₂²)
aX ∼ N(aµ₁, a²σ₁²)
aX + bY ∼ N(aµ₁ + bµ₂, a²σ₁² + b²σ₂²)
Using the Normal tables backwards
⇒ P (X < x) = 0.8
[Figure: X ∼ N(45, 400) with boundary x; standardizing with Z = (X − 45)/20 gives Z ∼ N(0, 1) with boundary (x − 45)/20 = 0.84.]
Standardizing this probability we get
P ((X − 45)/20 < (x − 45)/20) = 0.8
⇒ P (Z < (x − 45)/20) = 0.8
[Figure: two panels for X from 100 to 200, one showing probabilities P(X = x) and one showing a density.]
In general
If X ∼ Bin(n, p) then
µ = np
σ² = npq where q = 1 − p
For large n and p not too small or too large, X is approximately distributed as
X ∼ N(np, npq)
[Figure: N(25, 25) with boundary 26.5; standardizing with Z = (X − 25)/5 gives N(0, 1) with boundary (26.5 − 25)/5 = 0.3.]
P (X < 26.5) = P ((X − 25)/5 < (26.5 − 25)/5)
= P (Z < 0.3) where Z ∼ N(0, 1)
= 0.6179