NE 2207 Part 5
• The range is of limited use. It depends only on the two extreme values, so it gives only a rough idea of the variation.
Note
For any data set,
$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$$
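The note can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `sum_of_deviations` and the second data set are assumptions chosen for illustration. Exact fractions are used so that rounding error does not hide the result.

```python
from fractions import Fraction

def sum_of_deviations(data):
    """Return the sum of (x_i - mean) over the data, using exact arithmetic."""
    values = [Fraction(x) for x in data]
    mean = sum(values) / len(values)
    return sum(x - mean for x in values)

# Data set used in the examples of these notes
print(sum_of_deviations([2, 3, 5, 6]))       # prints 0

# Hypothetical data set whose mean is not a whole number
print(sum_of_deviations([1, 2, 4, 10, 11]))  # prints 0
```

The result is zero for any data set because the sum of the values equals n times the mean.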
Example
Data: 2, 3, 5, 6
$\bar{x} = 4$
$$\text{MD} = \frac{1}{4}\left(|2 - 4| + \cdots + |6 - 4|\right) = 1.5$$
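The same calculation can be sketched in code. The function name `mean_deviation` is an assumption made for illustration. It averages the absolute deviations from the mean, as in the example above.

```python
def mean_deviation(data):
    """Average absolute deviation of the data from its mean."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n

print(mean_deviation([2, 3, 5, 6]))  # prints 1.5
```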
Variance
$$s^2 = \frac{1}{n - 1} \sum (x - \bar{x})^2$$
Example
Data: 2, 3, 5, 6
$\bar{x} = 4$
$$s^2 = \frac{1}{n - 1} \sum (x - \bar{x})^2$$
$$= \frac{1}{4 - 1}\left((2 - 4)^2 + (3 - 4)^2 + (5 - 4)^2 + (6 - 4)^2\right)$$
$$= \frac{10}{3} \approx 3.33$$
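As a rough check of the arithmetic, the variance formula can be written out directly and compared with Python's standard library. The helper name `sample_variance` is an assumption. `statistics.variance` is the standard-library function for the sample variance with divisor n − 1.

```python
import math
import statistics

def sample_variance(data):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

data = [2, 3, 5, 6]
s2 = sample_variance(data)
print(round(s2, 2))             # prints 3.33

# The standard library agrees with the hand-written formula
print(math.isclose(s2, statistics.variance(data)))  # prints True

# The standard deviation s is the square root of the variance
print(round(math.sqrt(s2), 2))  # prints 1.83
```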
Note
The division is by $n - 1$ because the number of free values (degrees of freedom) is
$n - 1$. If $n = 4$ and we know 3 values of $(x_i - \bar{x})$, the 4th one can be calculated, since the deviations must sum to zero.
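A small sketch of this idea, using the data 2, 3, 5, 6 from the examples: given any three of the four deviations, the last one is forced, because all the deviations sum to zero. The function name `missing_deviation` is an assumption made for illustration.

```python
def missing_deviation(known_deviations):
    """Recover the one unknown deviation, given that all deviations sum to zero."""
    return -sum(known_deviations)

data = [2, 3, 5, 6]
mean = sum(data) / len(data)
deviations = [x - mean for x in data]   # [-2.0, -1.0, 1.0, 2.0]

# Pretend only the first three deviations are known
recovered = missing_deviation(deviations[:3])
print(recovered, deviations[3])  # prints 2.0 2.0
```

Only n − 1 of the deviations can vary freely, which is why the sum of squares is divided by n − 1.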
Empirical Rule
If a distribution (histogram) appears to be symmetric and bell-shaped, we expect
that approximately
• 68% of the data values will fall in the interval $(\bar{x} - s, \bar{x} + s)$
(within one standard deviation of the sample mean)
• 95% of the data values will fall in the interval $(\bar{x} - 2s, \bar{x} + 2s)$
(within two standard deviations of the sample mean)
• 99.7% of the data values will fall in the interval $(\bar{x} - 3s, \bar{x} + 3s)$
(within three standard deviations of the sample mean)
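These percentages match what an exactly normal distribution would give. The sketch below assumes normality, which the notes do not state. They only require a symmetric, bell-shaped histogram, so real data will match these figures only approximately. The function name `normal_coverage` is an assumption. It uses the standard identity P(|Z| < k) = erf(k / √2) for a standard normal variable Z.

```python
import math

def normal_coverage(k):
    """Probability that a normal value lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * normal_coverage(k), 1))
# prints:
# 1 68.3
# 2 95.4
# 3 99.7
```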
Example
Let the mean and SD of commuting time (minutes) of workers be 60 and 10,
respectively. Let the histogram be more or less symmetric and bell-shaped. We
then have:
$\bar{x} - s = 60 - 10 = 50$
$\bar{x} + s = 60 + 10 = 70$
Approximately 68% of workers have a commuting time between 50 and 70 minutes.
$\bar{x} - 2s = 60 - 2 \times 10 = 40$
$\bar{x} + 2s = 60 + 2 \times 10 = 80$
Approximately 95% of workers have a commuting time between 40 and 80 minutes.
$\bar{x} - 3s = 60 - 3 \times 10 = 30$
$\bar{x} + 3s = 60 + 3 \times 10 = 90$
Approximately 99.7% of workers have a commuting time between 30 and 90 minutes.
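The three intervals in this example can be generated from the mean and SD alone. This is a minimal sketch. The function name `empirical_intervals` and its return layout are assumptions made for illustration. The percentages are the 68%, 95% and 99.7% of the empirical rule.

```python
def empirical_intervals(mean, sd):
    """Return {k: (lower, upper, percent)} for k = 1, 2, 3 standard deviations."""
    percents = {1: 68, 2: 95, 3: 99.7}
    return {k: (mean - k * sd, mean + k * sd, pct) for k, pct in percents.items()}

# Commuting-time example: mean 60 minutes, SD 10 minutes
intervals = empirical_intervals(60, 10)
for k, (lower, upper, pct) in intervals.items():
    print(f"within {k} SD: about {pct}% of workers between {lower} and {upper} minutes")
```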
[Figure: commuting time axis (minutes) marked at 30, 40, 50, 60, 70, 80, 90]