Chapter 4 Measures of Variability
INSTRUCTIONAL MATERIAL #4
We learned in the preceding chapter that the measures of central tendency describe location along an
ordered scale. Location alone does not fully characterize a data distribution, so additional types of statistical
analysis are required. Consider the following scores made by two groups of students.
GROUP A                          GROUP B
STUDENT   SCORE   GRADE          STUDENT   SCORE   GRADE
Alex      100     A              Joy       84      C+
Ben       90      B              Helen     83      C+
Candy     80      C+             Lyn       80      C+
Doris     75      C              Noel      78      C
Ellen     55      F              May       75      C
$\sum x = 400$                   $\sum x = 400$
$n = 5$                          $n = 5$
$\bar{x} = \frac{400}{5} = 80$   $\bar{x} = \frac{400}{5} = 80$
$Md = 80$                        $Md = 80$
As indicated in Table 1, the mean and the median are equal for both groups. Averages alone therefore do not
adequately describe the differences in achievement between the two groups of students. To differentiate their
performance, it is necessary to use another kind of measure, known as variability. The measures of central tendency
and variability taken together provide a better picture of a data set than the measures of central tendency alone.
The range is the simplest measure of dispersion. It is equal to the difference between the highest score and
the lowest score of the set of scores. The range involves only the two most extreme scores in a distribution; hence,
it is not a reliable measure. Its advantage is that it readily gives a rough estimate of variability.
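As a quick illustration, the range of the two groups of scores tabulated earlier can be computed as below. This is a minimal sketch; the function name `score_range` is chosen here for illustration only.

```python
# Scores of the two groups of students from the table above.
group_a = [100, 90, 80, 75, 55]
group_b = [84, 83, 80, 78, 75]


def score_range(scores):
    """Return the range: highest score minus lowest score."""
    return max(scores) - min(scores)


range_a = score_range(group_a)
range_b = score_range(group_b)
print(range_a, range_b)  # 45 9
```

Although both groups share the same mean and median, the range of Group A is five times that of Group B, which already hints at the difference in spread.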
Unlike the range, the quartile deviation does not depend on the two extreme values of a distribution. It is
computed as one-half of the difference between $Q_3$ and $Q_1$. The interquartile range is the difference
between $P_{75}$ and $P_{25}$.
Example 1. Using the data found in the table, find the interquartile range and the quartile deviation.
33-39 7 3.50
TOTAL 200 100.00
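SOLUTION:
$P_{25} = Q_1 = 53.72$ and $P_{75} = Q_3 = 73.53$
$$\text{Interquartile Range} = Q_3 - Q_1 = 73.53 - 53.72 = 19.81$$
$$\text{Quartile Deviation} = \frac{1}{2}(Q_3 - Q_1)$$
$$\text{Quartile Deviation} = \frac{1}{2}(19.81) = 9.905$$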
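The two definitions can be checked with a short sketch that starts from the quartile values quoted in the solution. The function names are illustrative, and the quartiles are taken as given rather than recomputed from the frequency table.

```python
def interquartile_range(q1, q3):
    """Interquartile range: difference between the third and first quartiles."""
    return q3 - q1


def quartile_deviation(q1, q3):
    """Quartile deviation: one-half of the interquartile range."""
    return (q3 - q1) / 2


# Quartile values quoted in the solution of Example 1.
q1, q3 = 53.72, 73.53
iqr = interquartile_range(q1, q3)
qd = quartile_deviation(q1, q3)
print(round(iqr, 2), round(qd, 3))
```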
The mean absolute deviation considers the variation of the individual scores in a distribution. It is equal to
the sum of the absolute values of the differences between each score and the mean, divided by the number of
scores.
Example 2. The table shows the mean absolute deviation of the gross sales made by the four medical
representatives during the first six months of 2021.
Table 3. Gross Sales (in 100-thousands) Made by Four Medical Representatives During the First Six Months of
2002.
Medical Representative   Jan   Feb   March   April   May   June   Mean   Mean Absolute Deviation
Alba                     6     5     9       2       8     3      5.5    2.167
$$MAD = \frac{\sum |x - \bar{X}|}{N}$$
$$MAD = \frac{13.0}{6} = 2.167$$
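The computation for Alba can be reproduced with the following minimal sketch; the function name `mean_absolute_deviation` is an illustrative choice.

```python
def mean_absolute_deviation(values):
    """Average of the absolute differences between each value and the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


# Alba's gross sales for January to June, taken from the table.
alba_sales = [6, 5, 9, 2, 8, 3]
alba_mean = sum(alba_sales) / len(alba_sales)
alba_mad = mean_absolute_deviation(alba_sales)
print(alba_mean, round(alba_mad, 3))  # 5.5 2.167
```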
The variance is a measure of variability that considers the position of each observation relative to the mean
of the set of scores. It is obtained by taking the sum of the squared deviations from the mean and dividing it by N
(or by n - 1 for a sample). The formulas for ungrouped data are presented below.
a. Population Variance: $$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
where:
$x$ = score
$\mu$ = population mean
$N$ = population size
b. Sample Variance: $$S^2 = \frac{\sum (x - \bar{X})^2}{n - 1}$$
where:
$x$ = score
$\bar{X}$ = sample mean
$n$ = sample size
A measurement that gives a better idea of how the data entries differ from the mean is the standard
deviation. It is computed by extracting the square root of the variance. The formula for the standard deviation, as
with the variance, differs slightly depending on whether one is using an entire population or just a sample. The
formula for the sample standard deviation is
$$s = \sqrt{\frac{\sum (x - \bar{X})^2}{n - 1}}$$
while the formula for the population standard deviation is
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
Example 3. A student was investigating the effect of synthetic fertilizer on the growth of peanut seedlings. A random
sample of those seedlings yielded the following heights in inches. Find the mean, variance, and standard deviation.
x      $x - \bar{X}$     $(x - \bar{X})^2$
2      2 - 6 = -4        16
3      3 - 6 = -3        9
4      4 - 6 = -2        4
5      5 - 6 = -1        1
6      6 - 6 = 0         0
8      8 - 6 = 2         4
10     10 - 6 = 4        16
10     10 - 6 = 4        16
$\sum x = 48$            $\sum (x - \bar{X})^2 = 66$
$$\text{Mean } (\bar{X}) = \frac{48}{8} = 6$$
$$\text{Variance } (s^2) = \frac{66}{7} = 9.43$$
$$\text{Standard Deviation } (s) = \sqrt{9.43} = 3.07$$
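The steps of Example 3 can be sketched in code as follows. The helper functions are written out by hand to mirror the formulas, and Python's standard `statistics` module is used only as a cross-check; the function names are illustrative.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


# Seedling heights (in inches) from Example 3.
heights = [2, 3, 4, 5, 6, 8, 10, 10]
mean = sum(heights) / len(heights)
s2 = sample_variance(heights)
s = math.sqrt(s2)
print(mean, round(s2, 2), round(s, 2))  # 6.0 9.43 3.07

# Cross-check against the standard library.
assert math.isclose(s2, statistics.variance(heights))
assert math.isclose(population_variance(heights), statistics.pvariance(heights))
```

Because the heights are a sample, the divisor is n - 1 = 7. Treating the same eight values as a whole population would divide by 8 instead and give a smaller variance.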
For grouped data, the variance and standard deviation are calculated using the following formulas.
SAMPLE VARIANCE
$$S^2 = \frac{\sum fm^2}{n - 1} - \frac{\left(\sum fm\right)^2}{n(n - 1)}$$
where:
$f$ = corresponding frequency
$m$ = class mark or midpoint of each class interval
$n$ = sample size
POPULATION VARIANCE
$$\sigma^2 = \frac{\sum fm^2}{N} - \left(\frac{\sum fm}{N}\right)^2$$
where:
$f$ = corresponding frequency
$m$ = class mark or midpoint of each class interval
$N$ = population size
Example 4. Table 5 presents the ages of the managers in a popular fast-food store. Assume that these comprise the
entire population.
Table 5. Computation of the Mean and Standard Deviation of the Ages of the Manager Respondents
Age (years)   Number of Managers (f)   Midpoint (m)   $fm$     $fm^2$
53 – 57       9                        55             495      27,225
48 – 52       27                       50             1,350    67,500
43 – 47       30                       45             1,350    60,750
38 – 42       35                       40             1,400    56,000
33 – 37       29                       35             1,015    35,525
28 – 32       15                       30             450      13,500
23 – 27       5                        25             125      3,125
              N = 150                                 $\sum fm = 6,185$   $\sum fm^2 = 263,625$
$$\text{Mean } (\mu) = \frac{6,185}{150} = 41.23$$
$$\text{Standard Deviation } (\sigma) = \sqrt{\frac{263,625}{150} - (41.23)^2} = \sqrt{57.587} = 7.589$$
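The grouped-data computation of Example 4 can be sketched as follows, using the population formula (the variance is the mean of $fm^2$ minus the square of the mean). The variable names are illustrative. Note that the code carries the unrounded mean through the calculation, so it yields a standard deviation of about 7.57, slightly below the 7.589 obtained when the mean is first rounded to 41.23.

```python
import math

# (midpoint m, frequency f) for each age class in Table 5.
classes = [(55, 9), (50, 27), (45, 30), (40, 35), (35, 29), (30, 15), (25, 5)]

population_size = sum(f for _, f in classes)
sum_fm = sum(f * m for m, f in classes)
sum_fm2 = sum(f * m ** 2 for m, f in classes)

grouped_mean = sum_fm / population_size
# Population variance: sum(f * m^2) / N - (sum(f * m) / N)^2
grouped_variance = sum_fm2 / population_size - grouped_mean ** 2
grouped_sd = math.sqrt(grouped_variance)

print(population_size, sum_fm, sum_fm2)
print(round(grouped_mean, 2), round(grouped_sd, 2))
```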
A frequency distribution can assume almost any shape. The shape of the frequency distribution influences
the relationship among the measures of central tendency. If the distribution is symmetric and unimodal, then the
mean, the median, and the mode will all coincide. But some frequency distributions are asymmetrical. Distributions
of this kind, which have a pronounced "tail" on one side or the other, are skewed.
SKEWNESS
Skewness refers to the symmetry or asymmetry of the frequency distribution. A frequency distribution is positively
skewed if its tail extends farther to the right of the mode than it does to the left. It is negatively skewed if its tail
extends farther to the left of the mode than it does to the right.
As shown in this distribution, only a few individuals received the higher scores. The frequency polygon in Figure 5.1
is positively skewed because the tail of the distribution extends to the right, in the direction of the higher (more
positive) score values. It follows that the mean is higher than the median.
This polygon is negatively skewed, since the tail of the distribution goes off to the left. This implies that there are
more high scores, so the values cluster to the right. It follows that the mean is lower than the median.
The degree of skewness can be measured with Pearson's coefficient of skewness:
$$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$
In a perfectly symmetrical distribution the value of Sk is 0, and in general its value falls between -3 and 3.
An Sk value greater than 0 indicates that the frequency polygon is skewed to the right, while an Sk value
less than 0 indicates that the frequency polygon is skewed to the left.
KURTOSIS
Kurtosis refers to the flatness or peakedness of one distribution in relation to another. Figure 5.3 shows the
three types of Kurtosis.
Curve B: Mesokurtic; K = 3
Curve A is leptokurtic because its curve is more peaked than the others. Curve C is platykurtic because it is less
peaked than Curve B. Curve B is a normal curve and it is mesokurtic.
where:
$x$ = score
$\bar{X}$ = sample mean
$n$ = sample size
$s$ = standard deviation
where:
$f$ = corresponding frequency
$x$ = class mark
$\bar{X}$ = sample mean
$n$ = sample size
$s$ = sample standard deviation
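As an illustration of how these symbols combine, the sketch below assumes the common moment-based form $K = \frac{\sum (x - \bar{X})^4}{n s^4}$, with $s$ taken as the sample standard deviation (n - 1 divisor). Both the exact form and the divisor are assumptions made here, since textbooks differ on them. The function names are illustrative, and the seedling heights from Example 3 are reused only as sample input.

```python
import statistics


def kurtosis(values):
    """Moment-based kurtosis: sum((x - mean)^4) / (n * s^4).

    Assumption: s is the sample standard deviation (n - 1 divisor).
    """
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    fourth_power_sum = sum((x - mean) ** 4 for x in values)
    return fourth_power_sum / (n * s ** 4)


def classify_kurtosis(k, tolerance=1e-9):
    """Label a kurtosis value relative to the mesokurtic value K = 3."""
    if abs(k - 3) <= tolerance:
        return "mesokurtic"
    return "leptokurtic" if k > 3 else "platykurtic"


heights = [2, 3, 4, 5, 6, 8, 10, 10]
k = kurtosis(heights)
print(round(k, 2), classify_kurtosis(k))
```

For these eight values the result is well below 3, so under the stated assumptions the distribution would be labeled platykurtic.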
Example 5. Using the data below, solve for the skewness of the distributions.
Group A
$$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$
$$Sk = \frac{3(72.12 - 70.10)}{15.25} = 0.397$$
Group B
$$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$
$$Sk = \frac{3(67.10 - 65.25)}{10.12} = 0.548$$
The skewness value of Group A is 0.397, while that of Group B is 0.548. Both distributions are positively skewed,
which means that in both groups most scores lie toward the low end, with a tail of a few high scores. Group A has a
lower skewness value than Group B, so its distribution is closer to symmetric. Its larger standard deviation (15.25
versus 10.12) indicates that the scores of Group A are more dispersed than those of Group B.
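The two skewness values can be verified with a short sketch of the coefficient used above. The function names are illustrative; the means, medians, and standard deviations are the ones given in Example 5.

```python
def pearson_skewness(mean, median, std_dev):
    """Coefficient of skewness: 3 * (mean - median) / standard deviation."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return 3 * (mean - median) / std_dev


def skew_direction(sk):
    """Describe the direction of skew from the sign of Sk."""
    if sk > 0:
        return "skewed to the right"
    if sk < 0:
        return "skewed to the left"
    return "symmetrical"


sk_a = pearson_skewness(72.12, 70.10, 15.25)
sk_b = pearson_skewness(67.10, 65.25, 10.12)
print(round(sk_a, 3), skew_direction(sk_a))  # 0.397 skewed to the right
print(round(sk_b, 3), skew_direction(sk_b))  # 0.548 skewed to the right
```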
SCORE     FREQUENCY (f)   MIDPOINT (m)   $x - \bar{x}$   $(x - \bar{x})^4$   $f(x - \bar{x})^4$
46 – 50   8               48             18.26           111,173.96          889,391.68
32 – 45   10              38.5           8.76            5,888.66            58,886.60
25 – 31   16              28             -1.74           9.17                146.72
11 – 24   12              17.5           -12.24          22,445.31           269,343.72
0 – 10    4               5              -24.74          374,626.75          1,498,507.00
          N = 50                                                             $\sum f(x - \bar{x})^4 = 2,716,275.72$