Ch-6 Normal Distribution Lecture Notes
Normal Distribution
Introduction
1. Symmetry: The curve is symmetrical, with the mean 𝜇, median, and mode all equal
and located at the center.
2. 68-95-99.7 Rule: About 68% of the data falls within one standard deviation of the mean,
95% within two standard deviations, and 99.7% within three standard deviations.
3. Standard Normal Distribution: Standardizing using 𝑍 = (𝑋 − 𝜇) / 𝜎 leads to the standard
normal distribution with a mean of 0 and a standard deviation of 1.
Figure 1: normal distribution formula
Here, 𝑥 is the value of the variable; 𝑓(𝑥) is the probability density function; 𝜇 (mu) is the
mean; and 𝜎 (sigma) is the standard deviation.
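As a rough sketch, the standard normal density formula, 𝑓(𝑥) = 1 / (𝜎 √(2𝜋)) · exp(−(𝑥 − 𝜇)² / (2𝜎²)), can be typed out in R and compared with the built-in dnorm. The function name normal_pdf and the values 𝜇 = 75 and 𝜎 = 10 are illustrative assumptions, not part of Figure 1.

```r
# Normal density written out term by term (normal_pdf is a hypothetical helper name)
normal_pdf <- function(x, mu, sigma) {
  1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
}

# Illustrative inputs (assumed values)
x <- c(55, 65, 75, 85, 95)
by_hand  <- normal_pdf(x, mu = 75, sigma = 10)
built_in <- dnorm(x, mean = 75, sd = 10)

# TRUE means the hand-written formula agrees with dnorm
all.equal(by_hand, built_in)
```

The density is largest at 𝑥 = 𝜇 and takes equal values at points the same distance on either side of the mean, which is the symmetry property listed above.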
Standardizing a normal distribution is the process of transforming its values into standard
units so that they follow the standard normal distribution. The standard normal distribution,
also known as the Z distribution, has a mean 𝜇 of 0 and a standard deviation 𝜎 of 1.
Standardizing makes it easier to compare and analyze different normal distributions.
The formula for standardizing a normal distribution is:
𝑧 = (𝑋 − 𝜇) / 𝜎
Here:
• 𝑧 is the standard score or z-score;
• 𝑋 is the raw score from the original distribution;
• 𝜇 is the mean of the original distribution; and
• 𝜎 is the standard deviation of the original distribution.
When you standardize a variable using this formula, the resulting 𝑧 score tells you how many
standard deviations an observation or data point is from the mean of the distribution. If 𝑧 is
positive, the data point is above the mean, and if 𝑧 is negative, it’s below the mean.
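The standardizing formula can be sketched in R as follows. The helper name z_score and the raw scores, mean, and standard deviation used here are illustrative assumptions.

```r
# z = (X - mu) / sigma, applied to a vector of raw scores
# (z_score is a hypothetical helper name)
z_score <- function(x, mu, sigma) {
  (x - mu) / sigma
}

# Assumed example: raw scores from a distribution with mean 75 and sd 10
raw <- c(60, 75, 90)
z <- z_score(raw, mu = 75, sigma = 10)
z        # -1.5, 0, 1.5

# The sign shows the side of the mean: -1 below, 0 at the mean, 1 above
sign(z)
```

A score of 60 is 1.5 standard deviations below the mean, and a score of 90 is 1.5 standard deviations above it.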
Empirical Rule
The empirical rule for normal distributions describes where most of the data in a normal
distribution will appear, and it states the following:
• about 68.3% of the observations will fall within +/-1 standard deviation of the mean;
• about 95.4% of the observations will fall within +/-2 standard deviations; and
• about 99.7% of the observations will fall within +/-3 standard deviations.
Example: Exam Scores
Consider a dataset of final exam scores in a statistics class. Assume the scores follow a
normal distribution with a mean 𝜇 of 75 and a standard deviation 𝜎 of 10.
1. Within One Standard Deviation: About 68% of the scores will fall within the range of
𝜇 ± 𝜎, which is 65 to 85.
2. Within Two Standard Deviations: Approximately 95% of the scores will fall within the
range of 𝜇 ± 2𝜎, which is 55 to 95.
3. Within Three Standard Deviations: Around 99.7% of the scores will fall within the range
of 𝜇 ± 3𝜎, which is 45 to 105.
The empirical rule provides a quick way to judge how normally distributed data are spread
around the mean, which makes it a useful tool for interpreting scores such as these.
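The ranges and percentages in the exam score example can be checked with pnorm. This is a small sketch using the example's 𝜇 = 75 and 𝜎 = 10; the variable names are assumptions made for illustration.

```r
# Exam score example: mean 75, standard deviation 10
mu <- 75
sigma <- 10
k <- 1:3                       # number of standard deviations from the mean

lower <- mu - k * sigma        # 65, 55, 45
upper <- mu + k * sigma        # 85, 95, 105

# P(lower < X < upper) is the difference of two cumulative probabilities
coverage <- pnorm(upper, mean = mu, sd = sigma) - pnorm(lower, mean = mu, sd = sigma)

data.frame(k, lower, upper, coverage = round(coverage, 4))
```

The coverage column comes out at roughly 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.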
Question:
Assume that the test scores of a college entrance exam fit a normal distribution. Furthermore,
the mean test score is 72, and the standard deviation is 15.2. What percentage of students
score 84 or more on the exam?
# Upper-tail probability P(X >= 84) for mean 72 and standard deviation 15.2
pnorm(84, mean = 72, sd = 15.2, lower.tail = FALSE)
[1] 0.2149176
Question:
Suppose that SAT scores are normally distributed, and that the mean SAT score is 1000, and
the standard deviation of all SAT scores is 100. How high must you score so that only 10% of
the population scores higher than you?
# Only 10% score higher, so 90% score lower: this is the 90th percentile
qnorm(0.90, mean = 1000, sd = 100)
[1] 1128.155
Question:
A musical act expects the number of tickets sold for an upcoming concert to be normally
distributed with a mean of 450 tickets and a standard deviation of 60 tickets.
(1) Compute the probability that between 400 and 450 tickets are sold.
(2) Compute the probability that fewer than 300 tickets are sold.
(3) Compute the probability that at least 550 tickets are sold.
(4) Compute the probability that between 400 and 500 tickets are sold.
Solution: (Student activity)
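After working the four parts by hand, one possible way to check the answers is with pnorm. This is a sketch rather than an official solution, and the names p1 to p4 are assumptions made for illustration.

```r
# Ticket sales: mean 450, standard deviation 60
mu <- 450
sigma <- 60

# (1) between 400 and 450: difference of two cumulative probabilities
p1 <- pnorm(450, mean = mu, sd = sigma) - pnorm(400, mean = mu, sd = sigma)

# (2) fewer than 300: lower-tail probability
p2 <- pnorm(300, mean = mu, sd = sigma)

# (3) at least 550: upper-tail probability
p3 <- pnorm(550, mean = mu, sd = sigma, lower.tail = FALSE)

# (4) between 400 and 500: difference of two cumulative probabilities
p4 <- pnorm(500, mean = mu, sd = sigma) - pnorm(400, mean = mu, sd = sigma)

round(c(p1, p2, p3, p4), 4)
```

Because 400 and 500 are the same distance from the mean, the answer to part (4) should be twice the answer to part (1).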
Definition
The Central Limit Theorem (CLT) asserts that as the sample size increases, the distribution
of sample means from any population with finite variance tends toward a normal distribution,
irrespective of the shape of the population distribution.
Example
Imagine you have a population of people’s heights, and this population distribution is not
normal. It might be skewed or have a different shape.
The CLT allows us to make statistical inferences about the average height of the population,
assuming we have a sufficiently large sample size, even if the individual heights are not normally
distributed.
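A small simulation can illustrate this idea. The sketch below uses a right-skewed exponential population as a stand-in for a non-normal population (an assumption, not real height data) and compares the skewness of sample means for a small and a larger sample size. The helper names sample_means and skewness, the sample sizes, and the number of repetitions are all illustrative choices.

```r
set.seed(1)  # makes the simulation reproducible

# Draw `reps` samples of size n from a skewed population and keep each sample mean
# (exponential with rate 1 is an assumed stand-in population)
sample_means <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1)))
}

# Simple skewness measure: close to 0 for a symmetric, normal-looking shape
skewness <- function(v) {
  mean((v - mean(v))^3) / sd(v)^3
}

means_small <- sample_means(2)    # sample size 2
means_large <- sample_means(50)   # sample size 50

c(n_2 = skewness(means_small), n_50 = skewness(means_large))
```

The skewness of the sample means should be noticeably closer to 0 for the larger sample size, which is the pattern the CLT describes.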
Examples (Student activity)
Consider the following dataset of final exam scores for a statistics class:
72, 85, 64, 78, 93, 56, 80, 88, 70, 75, 82, 68, 90, 77, 84
By Hand Calculations
1. Percentage below 85: Calculate 𝑍 using 𝑍 = (𝑋 − 𝜇) / 𝜎 and find the cumulative
probability.
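2. Percentage between 60 and 80: Convert 60 and 80 to 𝑍 and subtract the two cumulative
probabilities. The 68-95-99.7 rule gives only a rough check unless the 𝑍 values are whole numbers.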
3. Minimum score for top 10%: Find the 𝑍 for the 90th percentile and convert it back
to the raw score.
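The three hand calculations can be checked in R along the following lines. This sketch assumes the scores are roughly normal and uses the sample mean and the sample standard deviation (sd divides by n − 1) in place of 𝜇 and 𝜎. The notes do not say which values to use, so treat these choices and the variable names as assumptions.

```r
scores <- c(72, 85, 64, 78, 93, 56, 80, 88, 70, 75, 82, 68, 90, 77, 84)

m <- mean(scores)   # stand-in for mu (assumption)
s <- sd(scores)     # stand-in for sigma (assumption; sample sd, divides by n - 1)

# 1. Percentage below 85: standardize, then take the cumulative probability
z_85 <- (85 - m) / s
below_85 <- pnorm(z_85)

# 2. Percentage between 60 and 80: difference of two cumulative probabilities
z_60 <- (60 - m) / s
z_80 <- (80 - m) / s
between_60_80 <- pnorm(z_80) - pnorm(z_60)

# 3. Minimum score for the top 10%: Z at the 90th percentile, converted back
cutoff_top10 <- m + qnorm(0.90) * s

c(below_85 = below_85, between_60_80 = between_60_80, cutoff_top10 = cutoff_top10)
```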
Normal Distribution Functions in R
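In R, three essential functions related to the normal distribution are dnorm, pnorm, and
qnorm: dnorm returns the density 𝑓(𝑥), pnorm returns the cumulative probability up to a
given value, and qnorm returns the value (quantile) for a given cumulative probability.
These functions are particularly useful for working with normal distributions in statistical
analysis and hypothesis testing.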
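A brief sketch of the three functions on the standard normal distribution (mean 0 and standard deviation 1 are R's defaults). The input values are illustrative assumptions.

```r
# dnorm: height of the density curve at a point
d0 <- dnorm(0)          # about 0.3989, the peak of the standard normal curve

# pnorm: cumulative probability P(Z <= z)
p196 <- pnorm(1.96)     # about 0.975

# qnorm: the value with a given cumulative probability (inverse of pnorm)
q975 <- qnorm(0.975)    # about 1.96

# qnorm undoes pnorm: this recovers 1.5 up to rounding error
round_trip <- qnorm(pnorm(1.5))

c(d0 = d0, p196 = p196, q975 = q975, round_trip = round_trip)
```

For a non-standard normal distribution, the same functions take mean and sd arguments, as in the pnorm and qnorm questions earlier in these notes.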