Normal Distribution
Distribution shape: Frequency distributions can assume many shapes.
The three most important shapes are
1. positively skewed,
2. symmetric, and
3. negatively skewed.
In a positively skewed or right-skewed distribution, the majority of the data values fall to the
left of the mean and cluster at the lower end of the distribution; the “tail” is to the right. Also, the
mean is to the right of the median, and the mode is to the left of the median. For example, if an
instructor gave an examination and most of the students did poorly, their scores would tend to
cluster on the left side of the distribution. A few high scores would constitute the tail of the
distribution, which would be on the right side. Another example of a positively skewed distribution
is the incomes of the population of the United States. Most of the incomes cluster about the low
end of the distribution; those with high incomes are in the minority and are in the tail at the right
of the distribution.
In a symmetric distribution, the data values are evenly distributed on both sides of the mean. In
addition, when the distribution is unimodal, the mean, median, and mode are the same and are at
the center of the distribution. Examples of symmetric distributions are IQ scores and heights of
adult males.
Qaisar Sohail
M PHIL statistics
Email: [email protected]
When the majority of the data values fall to the right of the mean and cluster at the upper end of
the distribution, with the tail to the left, the distribution is said to be negatively skewed or left-
skewed. Also, the mean is to the left of the median, and the mode is to the right of the median. As
an example, a negatively skewed distribution results if the majority of students score very high on
an instructor’s examination. These scores will tend to cluster to the right of the distribution.
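The ordering of mean, median, and mode described for skewed distributions can be checked on a small data set. The sketch below uses made-up exam scores (an assumption, not data from these notes): one list where most students did poorly and a mirror-image list where most did well.

```python
from statistics import mean, median, mode

# Hypothetical exam scores: most students did poorly, a few scored high.
right_skewed = [35, 40, 40, 40, 45, 45, 50, 55, 70, 95]
# Mirror image: most students scored high, a few scored low.
left_skewed = [100 - s for s in right_skewed]

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(name, "mode:", mode(data), "median:", median(data), "mean:", mean(data))
```

In the first list the mode lies below the median and the mean above it; in the second list the order is reversed, as the text describes.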
What is normal?
Medical researchers have determined so-called normal intervals for a person’s blood pressure,
cholesterol, triglycerides, and the like. For example, the normal range of systolic blood pressure is
110 to 140. The normal interval for a person’s triglycerides is from 30 to 200 milligrams per
deciliter (mg/dl). By measuring these variables, a physician can determine whether a patient’s
values fall within the normal interval or whether some type of treatment is needed to correct a
condition and avoid future illnesses. If we study the blood pressure, cholesterol, or
triglycerides of a large group of people, we find that most of them fall within the normal
interval. For example, if we study the age at menarche in Indian women, the age at menarche of
most women will fall between 13 and 15 years; very few women will fall between 9 and 11 years
(the lower end of the scale) or between 16 and 18 years (the higher end of the scale). Such a
frequency distribution is known as a normal distribution or symmetric distribution, and a
histogram of the data follows the normal curve.
Normal distribution: The normal distribution was first discovered by Abraham de Moivre in 1733.
Carl Friedrich Gauss and the French mathematician Pierre-Simon Laplace derived the normal
distribution in 1809. The normal distribution, also known as the Gaussian distribution, is a
probability distribution that is symmetric about the mean, showing that data near the mean are
more frequent in occurrence than data far from the mean. In graph form, the normal distribution
appears as a bell curve.
A random variable X is normally distributed with mean µ and variance σ² if its probability
density function is
$$P(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \qquad -\infty < x < \infty,$$
where π ≈ 3.1416 and e ≈ 2.7183.
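The density formula can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function name `normal_pdf` is chosen here for illustration and does not come from the notes.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Height of the curve at the mean when mu = 0 and sigma = 1.
print(round(normal_pdf(0.0), 4))
# The curve is symmetric about the mean: equal heights at mu - d and mu + d.
print(normal_pdf(160.0, 165.0, 5.0) == normal_pdf(170.0, 165.0, 5.0))
```

The curve is highest at x = µ and falls off equally on both sides, which is the bell shape described above.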
The Standard Normal Distribution: Since each normally distributed variable has its own
mean and standard deviation, as stated earlier, the shape and location of these curves will vary.
In practical applications, then, you would have to have a table of areas under the curve for each
variable. To simplify this situation, statisticians use what is called the standard normal
distribution.
The standard normal distribution is a special case of the normal distribution. It is the
distribution that occurs when a normal random variable has a mean of zero and a
standard deviation of one.
The formula for the standard normal distribution is
$$P(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^{2}}{2}}$$
All normally distributed variables can be transformed into the standard normally distributed
variable by using the formula for the standard score:
$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} \quad\text{or}\quad z = \frac{x - \mu}{\sigma}$$
Z score: The normal random variable of a standard normal distribution is called a standard
score or a z score. The location of any element in a normal distribution can be expressed in
terms of how many standard deviations it lies above or below the mean of the distribution. This
is the z score of the element. If the element lies above the mean, it has a positive z score; if
it lies below the mean, it has a negative z score.
For example, a heart rate of 85 beats/min in the distribution shown in the figure lies 1.5
standard deviations above the mean, so it has a z score of +1.5. A heart rate of 65 beats/min
lies 0.5 standard deviation below the mean, so its z score is −0.5.
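The heart rate example can be reproduced with a one-line standardization function. The mean of 70 beats/min and standard deviation of 10 beats/min used below are not stated in the text; they are the values implied by the two z scores quoted (85 ↔ +1.5 and 65 ↔ −0.5), so treat them as an inference.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Inferred from the two z scores in the text, not given explicitly.
MEAN_HR = 70.0  # beats/min
SD_HR = 10.0    # beats/min

print(z_score(85, MEAN_HR, SD_HR))  # 1.5
print(z_score(65, MEAN_HR, SD_HR))  # -0.5
```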
Example: The National Center for Health Statistics at the CDC gives the following estimate of
the mean body mass index (BMI = weight / height²) for 15-year-old boys:
x̄ = 19.83
Relative to the variability in BMI for 15-year-old boys in general, Fred’s BMI (25 in the
calculations that follow) may be close to the mean or far away.
Thus, the extremeness of Fred’s BMI is quantified by its distance from the mean BMI relative
to the SD of BMI.
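The z score gives us this kind of information:
$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} \quad\text{or}\quad z = \frac{x - \mu}{\sigma}$$
Case 1 (SD = 10): $z = \frac{25 - 19.83}{10} = 0.517$, so Fred’s BMI is 0.517 SD above the mean.
Case 2 (SD = 2): $z = \frac{25 - 19.83}{2} = 2.585$, so Fred’s BMI is 2.585 SD above the mean.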
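The two cases differ only in the standard deviation, which a short loop makes explicit. This is a sketch; the variable names are illustrative, and the two SD values (10 and 2) are the ones used in the cases above.

```python
def z_score(x, mu, sigma):
    """Standard score of x for a distribution with mean mu and SD sigma."""
    return (x - mu) / sigma

fred_bmi = 25.0
mean_bmi = 19.83

# Same distance from the mean, two different amounts of spread.
results = {sd: z_score(fred_bmi, mean_bmi, sd) for sd in (10.0, 2.0)}
for sd, z in results.items():
    print(f"SD = {sd}: z = {z:.3f}")
```

The same BMI of 25 is unremarkable when the spread is large and quite extreme when the spread is small.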
An exact answer to this question depends upon the distribution of the variable you are
interested in.
There are three basic types of problems, and all three are summarized in the Procedure Table.
Note that this table is presented as an aid in understanding how to use the standard normal
distribution table and in visualizing the problems. After learning the procedures, you should
not find it necessary to refer to the Procedure Table for every problem.
Solution:
The table gives the area under the normal distribution curve to the left of any z value given to
two decimal places. For example, the area to the left of a z value of 1.39 is found by looking
up 1.3 in the left column and 0.09 in the top row. Where the two lines meet gives an area of
0.9177.
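The same lookup can be done without a printed table. The sketch below computes the area to the left of z from the error function in Python's standard library; the helper name `phi` is an illustrative choice.

```python
import math

def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(round(phi(1.39), 4))  # 0.9177, matching the table entry
print(phi(0.0))             # 0.5: half of the area lies to the left of the mean
```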
Example 1:
Example 2:
Example 3:
Note: In a continuous distribution, the probability of any exact z value is 0, since the area
would be represented by a vertical line above the value, and a vertical line in theory has no
area. So P(a ≤ z ≤ b) = P(a < z < b).
Example:
Exercise:
Find the probabilities for each, using the standard normal distribution.
Example:
Example:
Example:
The mean height of 500 medical students is 165 cm and the standard deviation is 5 cm. Assuming
that height is normally distributed, find how many students will have a height between 153 cm
and 180 cm.
Solution:
Given that µ = 165 cm and σ = 5 cm, the z scores of the two limits are
$$z_1 = \frac{153 - 165}{5} = -2.4$$
$$z_2 = \frac{180 - 165}{5} = 3$$
So
P(153 < X < 180) = P(−2.4 < Z < 3) = P(−2.4 < Z < 0) + P(0 < Z < 3)
Using Table D,
= 0.4918 + 0.4987
= 0.9905
Hence the required number of students whose heights are between 153 cm and 180 cm is
0.9905 × 500 = 495 (approximately).
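The height calculation can be cross-checked numerically. This is a sketch that replaces the table lookup with the error function from the standard library; `phi` and `prob_between` are illustrative helper names.

```python
import math

def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_between(lo, hi, mu, sigma):
    """P(lo < X < hi) for X normal with mean mu and standard deviation sigma."""
    return phi((hi - mu) / sigma) - phi((lo - mu) / sigma)

n_students = 500
p = prob_between(153.0, 180.0, mu=165.0, sigma=5.0)
print(round(p, 4))            # proportion of students between 153 cm and 180 cm
print(round(p * n_students))  # expected number of students in that range
```

Multiplying the probability by the 500 students gives about 495, not 450.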
Example:
A hospital records the weight of every newborn child at the hospital. The distribution of
weight is normally shaped, with mean µ = 2.9 kg and standard deviation σ = 0.45 kg (the value
used in the calculations of the solution).
Find the
Solution:
Given that
µ = 2.9 kg
σ = 0.45 kg
1. To find the percentage of newborns who weighed under 2.1 kg, first convert 2.1 kg to a
z score:
$$z = \frac{x - \mu}{\sigma} = \frac{2.1 - 2.9}{0.45} = -1.78$$
From Table F,
P(z < −1.78) = 0.0375
Hence the percentage of newborns who weighed under 2.1 kg is 0.0375 × 100 = 3.75%.
2. To find the percentage of newborns who weighed between 1.8 kg and 4 kg, first convert both
values to z scores:
$$z = \frac{x - \mu}{\sigma} = \frac{1.8 - 2.9}{0.45} = -2.44$$
$$z = \frac{x - \mu}{\sigma} = \frac{4 - 2.9}{0.45} = 2.44$$
so
P(−2.44 < z < 2.44) = P(−2.44 < z < 0) + P(0 < z < 2.44)
From area Table E,
= 0.4927 + 0.4927 = 0.9854
Hence the required percentage of newborns weighing between 1.8 kg and 4 kg is
0.9854 × 100 = 98.5%.
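3. To find how many newborns weighed less than 2.5 kg, convert 2.5 kg to a z score:
$$z = \frac{x - \mu}{\sigma} = \frac{2.5 - 2.9}{0.45} = -0.89$$
The area to the left of z = −0.89 is P(z < −0.89) = 0.1867.
Thus, with the total of 1500 newborns used in the calculation, the required number of babies
who weighed less than 2.5 kg is 0.1867 × 1500 = 280 (approximately).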
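The three parts of the newborn-weight example can be checked in one short script. This is a sketch under two assumptions taken from the worked arithmetic rather than from the problem statement: a standard deviation of 0.45 kg and a total of 1500 newborns. The results differ slightly from the table-based figures because the z scores are not rounded to two decimals here.

```python
import math

def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

MU = 2.9         # mean weight in kg
SIGMA = 0.45     # SD in kg, as used in the worked calculations (assumption)
N_BABIES = 1500  # total newborns, as used in part 3 (assumption)

def cdf(x):
    """P(X < x) for newborn weight X."""
    return phi((x - MU) / SIGMA)

p_under_2_1 = cdf(2.1)              # part 1
p_between = cdf(4.0) - cdf(1.8)     # part 2
n_under_2_5 = cdf(2.5) * N_BABIES   # part 3

print(f"under 2.1 kg: {100 * p_under_2_1:.2f}%")
print(f"between 1.8 kg and 4 kg: {100 * p_between:.2f}%")
print(f"under 2.5 kg: about {n_under_2_5:.0f} babies")
```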
Example: Assume that the age at onset of disease X is normally distributed with a mean of 50
years and a standard deviation of 12 years. What is the probability that an individual afflicted
with X developed it before the age of 35 years?
Solution: Given that
µ=50 years
𝜎 = 12 years
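The remaining steps of this solution are not reproduced in the notes. A minimal sketch of how they could be carried out, assuming the usual approach of standardizing 35 years and taking the area to its left (the helper name `phi` is illustrative):

```python
import math

def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 50.0, 12.0          # given mean and SD of age at onset, in years
z = (35.0 - mu) / sigma         # standard score of age 35
p_before_35 = phi(z)            # P(X < 35) = P(Z < z)

print(z)                        # -1.25
print(round(p_before_35, 3))    # probability of onset before age 35
```

Under these assumptions the probability comes out at roughly 0.106, or about 10.6%.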
Exercise:
Q1. The pulse rate of healthy male adults follows a normal distribution with a mean of 75 per
minute and a standard deviation of 4 per minute. Find the percentage of individuals having a
pulse rate beyond 85 per minute.
Q2. Systolic B.P reading of a large male population is normally distributed with mean 100 and
standard deviation 15. What is the 90th percentile of systolic B.P reading?
Q3. The average annual salary for all U.S. teachers is $47,750. Assume that the distribution is
normal and the standard deviation is $5680. Find the probability that a randomly selected teacher
earns
a. Between $35,000 and $45,000 a year (answer: 0.3031)
b. More than $40,000 a year (answer: 0.9131)
c. If you were applying for a teaching position and were offered $31,000 a year, how would you
feel (based on this information)?
Q4. The average daily jail population in the United States is 706,242. If the distribution is
normal and the standard deviation is 52,145, find the probability that on a randomly selected day,
the jail population is
a. Greater than 750,000
b. Between 600,000 and 700,000
Q5. The average number of calories in a 1.5-ounce chocolate bar is 225. Suppose that the
distribution of calories is approximately normal with σ = 10. Find the probability that a randomly
selected chocolate bar will have
a. Between 200 and 220 calories
b. Less than 200 calories
Q6. The average monthly mortgage payment including principal and interest is $982 in the
United States. If the standard deviation is approximately $180 and the mortgage payments are
approximately normally distributed, find the probability that a randomly selected monthly
payment is
a. More than $1000
b. More than $1475
c. Between $800 and $1150
Note: Solve every question with a diagram of the normal distribution.
Table E
Table D
Table F