
Normal Distribution Normal Probability Distribution: Mean Continuous Random Variable (X)

The document introduces the normal distribution, its key properties, and related concepts. It defines the normal distribution and explains that it is the most important theoretical distribution in statistics. It presents the normal probability density function and describes the bell curve shape of the distribution. It discusses how the mean, median, and mode are equal for a normal distribution and compares properties of two normal distributions.

Uploaded by

Marvin Arce
Copyright
© All Rights Reserved

INTRODUCTION TO NORMAL DISTRIBUTION

Lesson Objectives
At the end of this lesson you shall be able to:
• define what a normal distribution is;
• enumerate the properties of a normal distribution;
• determine the family of normal distributions; and
• discuss the relationship between the mean, the median and the mode in a normal distribution.

NORMAL DISTRIBUTION

The normal distribution (or normal probability distribution) is the most important theoretical
continuous distribution used in statistics because:
1. it has played a central role in the development of inferential statistics;
2. many real-world random variables exhibit frequency (or relative frequency) distributions that
closely resemble the normal distribution; and
3. it can conveniently be used to approximate many other probability distributions, such as the
binomial and the Poisson.

The normal probability distribution is defined by the normal probability density function (or normal
probability function), given below, in which the continuous random variable X takes a specific value
X = x.

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad \text{for } -\infty < x < \infty$$

where:
e (Euler’s number, the base of the natural logarithm) = 2.71828…
π = 3.14159…
μ = E(X) = mean of the normal distribution
σ² = E[(X − μ)²] = variance of the normal distribution
σ = √σ² = standard deviation of the normal distribution

−∞ < x < ∞ means that the function is defined for all real numbers.

The graph of a typical normal distribution is shown below.

[Figure: normal curve. Vertical axis: density. Horizontal axis: continuous random variable (X), centered on the mean. Marked areas: 68.3%, 95.4% and 99.7%.]

Properties of the Normal Distribution


The horizontal axis represents specific x values of the continuous variable X and the vertical axis
represents specific values of the normal probability density function, f(x). The smooth curve, called the
normal curve, was constructed by calculating f(x) values for a sufficient number of X = x values. You can
see that the resulting curve has a bell-like shape that is completely
symmetrical about the vertical line above the mean, μ. Thus, 50% of the area under the curve is to the left of
this vertical line, and 50% is to its right.
It cannot be shown in the graph, but the curve extends continuously outward to both plus and
minus infinity, getting closer and closer to the horizontal axis in both directions but not reaching it.
As with any continuous probability distribution, the total area under the curve is 1.0, and the graph
also shows the percentage of the area lying above the intervals μ ± σ (68.3%), μ ±2 σ (95.4%) and μ ±3 σ
(99.7%). These percentages are the empirical rule for a normal distribution.
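
The empirical-rule percentages can be checked numerically. The sketch below is only an illustration, not part of the original lesson: it assumes the standard identity that the area of any normal curve within k standard deviations of the mean equals erf(k/√2), and the function name `area_within` is a hypothetical helper.

```python
import math


def area_within(k: float) -> float:
    """Area under a normal curve between mu - k*sigma and mu + k*sigma."""
    # For any normal distribution this area equals erf(k / sqrt(2)),
    # regardless of the particular values of mu and sigma.
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"mu ± {k} sigma: {area_within(k) * 100:.1f}%")
```

Rounded to one decimal place, the three printed areas agree with the 68.3%, 95.4% and 99.7% quoted in the text.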
The equation for the normal distribution was first published in 1733 by French mathematician
Abraham de Moivre (1667 – 1754), who used it to approximate the binomial distribution. While the French
mathematician-astronomer Pierre Simon de Laplace (1749 – 1827) extended de Moivre’s work, it is the
German mathematician-astronomer-physicist Carl Friedrich Gauss (1777 – 1855) who is credited with
being the first to really explore its properties and uses. Because of this, the normal distribution is also called
the Gaussian distribution.
If the frequency (or relative frequency) distribution of a set of data can be reasonably fit by the
normal curve, then the data is said to be normally distributed. This statement is often made even when the
empirical distribution only approximates a normal distribution by being unimodal, roughly mound-shaped,
and essentially symmetrical.
Many real-world continuous random variables generate such distributions: the heights in a
population of men, the aptitude test scores of job applicants, the weights in a population of melons, the
diastolic blood pressures in a population of women, and so on. Although the normal curve is common for
real-world variables, the name “normal” does not imply that any other curve is “abnormal”.

The Family of Normal Probability Distribution


Each continuous probability distribution is really a family of distributions, with the specific
distribution being considered determined by its parameter (or parameters). For the normal distribution you
can see in the equation of the normal probability density function that there are two parameters: the mean μ
and the variance σ² (some statistics books say μ and σ).

Example:
Use the normal probability density function to calculate f(x) for X = 1 for a normal distribution with the
parameters μ = 0 and σ² = 1.

Solution:
The equation is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Inserting μ = 0 and σ² = 1. Note that the equation requires σ, hence σ = √1:

$$f(x) = \frac{1}{\sqrt{1}\,\sqrt{2\pi}}\, e^{-\frac{(x-0)^2}{2(1)}}$$

Insert the value of π (3.1416) and X = x = 1:

$$f(x) = \frac{1}{\sqrt{1}\,\sqrt{2(3.1416)}}\, e^{-\frac{(1-0)^2}{2(1)}}$$

Simplify:

$$f(x) = \frac{1}{\sqrt{6.2832}}\, e^{-\frac{1}{2}}$$

$$f(x) = \frac{1}{2.5066}\, e^{-\frac{1}{2}}$$

$$f(x) = 0.3989\, e^{-\frac{1}{2}}$$

Insert the value of e (2.7183):

$$f(x) = 0.3989\,(2.7183)^{-\frac{1}{2}}$$

Simplify:

$$f(x) = 0.3989\,(0.6065)$$

$$f(x) = 0.2419$$
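
The hand calculation can be cross-checked with a short script. This is a minimal sketch, not part of the original lesson; the function name `normal_pdf` and its argument order are assumptions made for illustration. It takes the variance σ² as input, as the example does, and derives σ from it.

```python
import math


def normal_pdf(x: float, mu: float, variance: float) -> float:
    """Normal probability density f(x) for mean mu and variance sigma^2."""
    sigma = math.sqrt(variance)                      # the formula needs sigma
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * variance)
    return coefficient * math.exp(exponent)


# Parameters from the example: X = 1, mu = 0, sigma^2 = 1.
print(f"{normal_pdf(1.0, 0.0, 1.0):.5f}")  # prints 0.24197
```

The script works with unrounded values of π and e, so it gives 0.24197, while the hand calculation, which rounds each intermediate step to four decimals, arrives at 0.2419.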

Relationship Between the Mean (μ), the Median (μ̃) and the Mode

Continuous probability distributions serve as mathematical models for population relative
frequency distributions of continuous random variables. For this reason, the probability distribution and
the population distribution must be describable by comparable statistical measures: both have means,
medians and modes. In the case of the normal distribution, both distributions are unimodal and symmetrical,
so the mean, the median and the mode coincide: μ = μ̃ = mode.
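
The equality of the mean, the median and the mode can be illustrated numerically. The sketch below is an illustration under assumed values: the parameters μ = 10 and σ = 2 are hypothetical, and the mode is only approximated by searching a grid of x values for the highest density. It uses `NormalDist` from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Hypothetical normal distribution, chosen only for illustration.
dist = NormalDist(mu=10.0, sigma=2.0)

# Median: the x value with half of the area to its left.
median = dist.inv_cdf(0.5)

# Mode: the x value with the highest density, searched on a grid
# from 0 to 20 in steps of 0.01.
grid = [i / 100 for i in range(2001)]
mode = max(grid, key=dist.pdf)

print(dist.mean, median, mode)
```

All three printed values should coincide at the mean, up to floating-point rounding and the resolution of the grid.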

Example:
For the two normal distributions (A and B) shown below, which has the larger (a) μ, (b) σ and (c) σ²?

Solution:

(a) Because in any normal distribution the mean μ indicates the location on the horizontal axis
(continuous random variable) of the median of the distribution, we know that a vertical line
erected above μ to f(x) will divide the distribution into two mirror-image halves. From this you
can see in the figure above that the mean of B (μ_B) is to the right of the mean of A (μ_A).
Therefore, since values along the horizontal axis of a rectangular Cartesian coordinate system
increase to the right, we know that μ_A < μ_B.

(b) Standard deviation σ is a measure of the dispersion (or spread) of the values around the mean.
We also know from the empirical rule that 68.3% of the area in a normal distribution will
always lie within one standard deviation of the mean. The smaller the standard deviation, the
narrower the interval that holds this area, and consequently the distribution will
have a more distinct peak or greater “peakedness”. From this we can see on the figure that
σ_A > σ_B.

(c) Since σ_A > σ_B and standard deviations are never negative, it follows that σ_A² > σ_B².
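
The link between spread and peakedness used in part (b) follows from the density function itself: at x = μ the exponent is zero, so the height of the peak is 1/(σ√(2π)). The sketch below illustrates this with hypothetical values σ_A = 2 and σ_B = 1, which are assumptions and are not read from the figure.

```python
import math


def peak_height(sigma: float) -> float:
    """Height of the normal curve at x = mu, i.e. 1 / (sigma * sqrt(2*pi))."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical standard deviations with sigma_A > sigma_B.
sigma_a, sigma_b = 2.0, 1.0

print(f"peak of A: {peak_height(sigma_a):.4f}")
print(f"peak of B: {peak_height(sigma_b):.4f}")

# The distribution with the larger sigma has the lower, flatter peak,
# and its variance (sigma squared) is larger as well.
print(peak_height(sigma_a) < peak_height(sigma_b), sigma_a**2 > sigma_b**2)
```

Under these assumed values, doubling σ halves the height of the peak, which is why the wider curve in such a comparison is the one with the larger standard deviation.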

Kurtosis
Not all unimodal, symmetrical distributions are normal distributions. They may differ from a
normal in terms of kurtosis which is the degree of peakedness. Three distributions that differ in kurtosis
are shown below.
The middle distribution is a normal distribution, which is called mesokurtic (meso, middle). The
distribution on the left, which is flatter and less peaked than the normal with a relatively even distribution
of values and with shorter tails, is called platykurtic (platy, flat). The distribution to the right which is more
peaked than the normal with values concentrated in the middle and with long tails, is called leptokurtic
(lepto, slender).
A mesokurtic distribution has a kurtosis (a₄) of exactly 3 or very near to 3, with an excess kurtosis
(g₂ for a population, G₂ for a sample) of 0 or very near to 0. A distribution with a kurtosis of less than 3 and an
excess kurtosis of less than 0 is platykurtic, while a distribution with a kurtosis larger than 3 and an excess
kurtosis larger than 0 is leptokurtic.
The moment coefficient of kurtosis is computed using the following equations:

kurtosis:

$$a_4 = \frac{m_4}{(\sigma^2)^2}$$

where:

$$m_4 = \frac{\sum f(x-\mu)^4}{n}$$

$$\sigma^2 = \text{variance} = \frac{\sum f(x-\mu)^2}{n}$$

Excess kurtosis (population):

$$g_2 = a_4 - 3$$

Excess kurtosis (sample):

$$G_2 = \frac{n-1}{(n-2)(n-3)}\left[(n+1)\,g_2 + 6\right]$$

Example:
The following data describes college men’s height (sample size, n = 100) in terms of inches. From the data
set, μ=67.45. Determine the kurtosis of the said distribution.

Class Mark, x   Frequency, f   x−μ      (x−μ)²     f(x−μ)²     (x−μ)⁴      f(x−μ)⁴

61              5              -6.45    41.6025    208.0125    1730.768    8653.84
64              18             -3.45    11.9025    214.245     141.6695    2550.051
67              42             -0.45    0.2025     8.505       0.041006    1.722263
70              27             2.55     6.5025     175.5675    42.28251    1141.628
73              8              5.55     30.8025    246.42      948.794     7590.352

∑ f(x−μ)² = 852.75        ∑ f(x−μ)⁴ = 19937.59

Solving for the kurtosis, a₄:

$$a_4 = \frac{m_4}{(\sigma^2)^2}$$

Solving m₄:

$$m_4 = \frac{\sum f(x-\mu)^4}{n}$$

Insert ∑ f(x−μ)⁴ = 19937.59 and n = 100:

$$m_4 = \frac{19937.59}{100} = 199.3759$$

Solving σ²:

$$\sigma^2 = \frac{\sum f(x-\mu)^2}{n}$$

Insert ∑ f(x−μ)² = 852.75 and n = 100:

$$\sigma^2 = \frac{852.75}{100} = 8.5275$$

$$a_4 = \frac{199.3759}{(8.5275)^2} = \frac{199.3759}{72.7183} = 2.7418$$

Solving excess kurtosis, g₂:

$$g_2 = a_4 - 3 = 2.7418 - 3 = -0.2582$$

But since our data is from a sample and not a population, we will compute excess kurtosis for a sample, G₂.

$$G_2 = \frac{n-1}{(n-2)(n-3)}\left[(n+1)\,g_2 + 6\right]$$

$$G_2 = \frac{100-1}{(100-2)(100-3)}\left[(100+1)(-0.2582) + 6\right]$$

$$G_2 = \frac{99}{(98)(97)}\left[(101)(-0.2582) + 6\right]$$

$$G_2 = \frac{99}{9506}\left[-26.0782 + 6\right]$$

$$G_2 = 0.0104\,(-20.0782)$$

$$G_2 = -0.2088$$

Since the kurtosis (2.7418) is less than 3 and the excess kurtosis (−0.2088) is less than zero, the
distribution is not a normal distribution but a slightly platykurtic one.
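
The whole kurtosis calculation can be reproduced from the class marks and frequencies alone. The sketch below is an illustration, not part of the original lesson; the variable names are assumptions, while the formulas are the ones given in this section.

```python
# Grouped data from the example: class marks (inches) and their frequencies.
class_marks = [61, 64, 67, 70, 73]
frequencies = [5, 18, 42, 27, 8]

n = sum(frequencies)
mean = sum(f * x for x, f in zip(class_marks, frequencies)) / n

# Second and fourth moments about the mean, weighted by frequency.
variance = sum(f * (x - mean) ** 2 for x, f in zip(class_marks, frequencies)) / n
m4 = sum(f * (x - mean) ** 4 for x, f in zip(class_marks, frequencies)) / n

a4 = m4 / variance**2                                      # kurtosis
g2 = a4 - 3                                                # excess kurtosis (population)
G2 = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)    # excess kurtosis (sample)

print(f"n = {n}, mean = {mean:.2f}")
print(f"variance = {variance:.4f}, m4 = {m4:.4f}")
print(f"a4 = {a4:.4f}, g2 = {g2:.4f}, G2 = {G2:.4f}")
```

Without intermediate rounding the script gives a sample excess kurtosis of about −0.2091 rather than −0.2088; the small difference comes from rounding 99/9506 to 0.0104 in the hand calculation, and the conclusion (slightly platykurtic) is unchanged.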

ACTIVITY

1. Calculate f(x) for X = 1 for the normal distribution with the following parameters:
a. μ = 0 and σ² = 4
b. μ = 0 and σ² = 9
c. μ = 0 and σ² = 15
