
Chapter 4 Measures of Variability

The document discusses different statistical measures used to analyze data, including measures of central tendency, variability, and dispersion. It provides examples and formulas for calculating measures like range, quartile deviation, mean absolute deviation, variance, and standard deviation.

Uploaded by

Angelyn Alias
Copyright
© All Rights Reserved

MATH 122 : ENGINEERING DATA ANALYSIS

INSTRUCTIONAL MATERIAL #4

INSTRUCTOR: ENGR. NOVEL KEITH T. SOLIS



We learned in the preceding chapter that the measures of central tendency describe location along an
ordered scale. Location alone does not fully describe a data distribution, so additional types of statistical
analysis are required. Consider the following scores made by two groups of students.

Table 1. Scores of Two Groups of Students

GROUP A                              GROUP B
STUDENT   SCORE   GRADE              STUDENT   SCORE   GRADE
Alex      100     A                  Joy       84      C+
Ben       90      B                  Helen     83      C+
Candy     80      C+                 Lyn       80      C+
Doris     75      C                  Noel      78      C
Ellen     55      F                  May       75      C
$\sum x = 400$                       $\sum x = 400$
$n = 5$                              $n = 5$
$\bar{x} = 400/5 = 80$               $\bar{x} = 400/5 = 80$
$Md = 80$                            $Md = 80$

As indicated in Table 1, the mean and the median are equal for both groups. Averages alone therefore do not
adequately describe the differences in achievement between the two groups of students. To differentiate their
performance, it is necessary to use another kind of measure known as variability. The measures of central tendency and
variability taken together provide a better picture of a data set than the measures of central tendency alone.

THE RANGE (R)

The range is the simplest measure of dispersion. It is equal to the difference between the highest score and
the lowest score of the set of scores. The range involves only the two most extreme scores in a distribution; hence,
it is not a reliable measure. Its advantage is that it readily gives a rough estimate of variability.
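
As a minimal sketch, the range of each group in Table 1 can be checked with a few lines of Python. The helper name `value_range` is only an illustrative choice; the scores are copied from Table 1.

```python
from statistics import mean, median

# Scores copied from Table 1
group_a = [100, 90, 80, 75, 55]
group_b = [84, 83, 80, 78, 75]

def value_range(scores):
    """Range R = highest score - lowest score."""
    return max(scores) - min(scores)

for name, scores in (("Group A", group_a), ("Group B", group_b)):
    print(name,
          "mean:", mean(scores),
          "median:", median(scores),
          "range:", value_range(scores))
```

Both groups share the same mean and median, yet the range separates them clearly: the scores of Group A are spread over a much wider interval than those of Group B.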

THE QUARTILE DEVIATION

Unlike the range, the quartile deviation does not depend on the two extreme values of a distribution. It is
obtained by taking one-half of the difference between $Q_3$ and $Q_1$. The interquartile range is the difference
between $P_{75}$ and $P_{25}$.

Example 1. Using the data found in Table 2, find the interquartile range and the quartile deviation.

Table 2. Score Distribution of the 2nd Year BS Civil Engineering Students

SCORE     NUMBER OF STUDENTS     PERCENTAGE
89-95     8                      4.00
82-88     16                     8.00
75-81     22                     11.00
68-74     29                     14.50
61-67     44                     22.00
54-60     32                     16.00
47-53     23                     11.50
40-46     19                     9.50
33-39     7                      3.50
TOTAL     200                    100.00

SOLUTION:

$$P_{25} = Q_1 = 53.72 \qquad P_{75} = Q_3 = 73.53$$

$$\text{Interquartile Range} = P_{75} - P_{25}$$

$$\text{Interquartile Range} = 73.53 - 53.72 = 19.81$$

$$\text{Quartile Deviation} = \frac{1}{2}(Q_3 - Q_1)$$

$$\text{Quartile Deviation} = \frac{1}{2}(19.81) = 9.905$$
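
The quartiles in the solution can be reproduced by linear interpolation within the class that contains the quartile. The sketch below assumes integer class limits, so each class boundary lies 0.5 below its lower limit; the function name `grouped_percentile` is an illustrative choice and does not come from the handout.

```python
# Class intervals from Table 2 as (lower limit, upper limit, frequency),
# listed from the lowest class to the highest.
classes = [
    (33, 39, 7), (40, 46, 19), (47, 53, 23), (54, 60, 32), (61, 67, 44),
    (68, 74, 29), (75, 81, 22), (82, 88, 16), (89, 95, 8),
]

def grouped_percentile(classes, p):
    """Interpolate the p-th percentile (0 < p < 100) from grouped data."""
    n = sum(f for _, _, f in classes)
    target = p / 100 * n      # position of the percentile in the ordered data
    cum = 0                   # cumulative frequency below the current class
    for lower, upper, f in classes:
        if cum + f >= target:
            lower_boundary = lower - 0.5   # assumes integer class limits
            width = upper - lower + 1      # class size
            return lower_boundary + (target - cum) / f * width
        cum += f
    raise ValueError("percentile lies outside the data")

q1 = grouped_percentile(classes, 25)
q3 = grouped_percentile(classes, 75)
iqr = q3 - q1
quartile_deviation = iqr / 2

print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}")
print(f"IQR = {iqr:.2f}, QD = {quartile_deviation:.3f}")
```

Because the code carries the unrounded quartiles through the subtraction, its last digits may differ slightly from a hand computation that rounds $Q_1$ and $Q_3$ first.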

THE MEAN ABSOLUTE DEVIATION

The mean absolute deviation considers the variation of the individual scores in a distribution. This
measurement is equal to the summation of the absolute value of the difference between each score and the mean
divided by the number of scores.

Example 2. The table shows the mean absolute deviation of the gross sales made by the four medical
representatives during the first six months of 2021.

Table 3. Gross Sales (in 100-thousands) Made by Four Medical Representatives During the First Six Months of
2002.

Medical Representative   Jan   Feb   March   April   May   June   Mean   Mean Absolute Deviation
Alba                     6     5     9       2       8     3      5.50   2.167
Benson                   4     5     4       4       4     5      4.33   0.444
Garcia                   3     8     6       2       9     2      5.00   2.667
Mercado                  7     3     6       9       2     5      5.33   2.000

Sample Calculation of Mean Absolute Deviation

Sales (Med Rep A)     $x - \bar{X}$     $|x - \bar{X}|$
6                     6 - 5.5           0.5
5                     5 - 5.5           0.5
9                     9 - 5.5           3.5
2                     2 - 5.5           3.5
8                     8 - 5.5           2.5
3                     3 - 5.5           2.5
TOTAL                                   13.0

Mean Absolute Deviation

$$\mathrm{MAD} = \frac{\sum |x - \bar{X}|}{N}$$

$$\mathrm{MAD} = \frac{13.0}{6} = 2.167$$
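
The same calculation can be repeated for all four representatives. This is a small sketch using the monthly figures from Table 3; the function name `mean_absolute_deviation` is illustrative.

```python
from statistics import mean

# Monthly gross sales from Table 3 (January to June)
sales = {
    "Alba":    [6, 5, 9, 2, 8, 3],
    "Benson":  [4, 5, 4, 4, 4, 5],
    "Garcia":  [3, 8, 6, 2, 9, 2],
    "Mercado": [7, 3, 6, 9, 2, 5],
}

def mean_absolute_deviation(values):
    """MAD = sum(|x - mean|) / N."""
    center = mean(values)
    return sum(abs(x - center) for x in values) / len(values)

for rep, values in sales.items():
    print(f"{rep}: mean = {mean(values):.2f}, "
          f"MAD = {mean_absolute_deviation(values):.3f}")
```

For Alba, the function reproduces the hand calculation 13.0 / 6 ≈ 2.167. A small MAD, as for Benson, indicates sales that stay close to the monthly mean.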

THE VARIANCE (UNGROUPED DATA)

The variance is a measure of variability that considers the position of each observation relative to the mean
of the set of scores. It is obtained by dividing the sum of the squared deviations from the mean by N for a
population, or by n - 1 for a sample. The formulas for ungrouped data are presented below.

a. Population Variance: $$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$

where:
$x$ = score
$\mu$ = population mean
$N$ = population size

b. Sample Variance: $$s^2 = \frac{\sum (x - \bar{X})^2}{n - 1}$$

where:
$x$ = score
$\bar{X}$ = sample mean
$n$ = sample size

THE STANDARD DEVIATION (UNGROUPED DATA)

A measurement that will give you a better idea of how the data entries differ from the mean is the standard
deviation. It is computed by extracting the square root of the variance. The formula for the standard deviation, as in
variance, differs slightly depending on whether one is using an entire population or just a sample. The formula for
the sample standard deviation is

$$s = \sqrt{\frac{\sum (x - \bar{X})^2}{n - 1}}$$

For the population standard deviation

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$

Example 3. A student was investigating the effect of synthetic fertilizer on the growth of peanut seedlings. A random
sample of those seedlings yielded the following heights in inches. Find the mean, variance, and standard deviation.

Table 4. Heights of Peanut Seedlings (in inches)

$x$              $x - \bar{X}$      $(x - \bar{X})^2$
2                2 - 6 = -4         16
3                3 - 6 = -3         9
4                4 - 6 = -2         4
5                5 - 6 = -1         1
6                6 - 6 = 0          0
8                8 - 6 = 2          4
10               10 - 6 = 4         16
10               10 - 6 = 4         16
$\sum x = 48$                       $\sum (x - \bar{X})^2 = 66$

$$\text{Mean } (\bar{X}) = \frac{48}{8} = 6$$

$$\text{Variance } (s^2) = \frac{66}{7} = 9.43$$

$$\text{Standard Deviation } (s) = \sqrt{9.43} = 3.07$$
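
Example 3 can be checked with Python's standard `statistics` module, which provides separate functions for the sample formulas (divisor n - 1) and the population formulas (divisor N). The sketch below computes both from the same heights so that the effect of the divisor is visible; only the sample versions apply to this example, since the seedlings are a random sample.

```python
import statistics

heights = [2, 3, 4, 5, 6, 8, 10, 10]  # Table 4, in inches

sample_mean = statistics.mean(heights)
sample_var = statistics.variance(heights)   # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(heights)       # square root of the sample variance

pop_var = statistics.pvariance(heights)     # sum of squared deviations / N
pop_sd = statistics.pstdev(heights)         # square root of the population variance

print(f"mean = {sample_mean}")
print(f"sample:     variance = {sample_var:.2f}, sd = {sample_sd:.2f}")
print(f"population: variance = {pop_var:.2f}, sd = {pop_sd:.2f}")
```

The sample variance is always the larger of the two, because the same sum of squared deviations is divided by a smaller number.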

THE VARIANCE & STANDARD DEVIATION (GROUPED DATA)

For grouped data, the variance and standard deviation are calculated using the following formulas.

SAMPLE VARIANCE

$$s^2 = \frac{\sum f m^2}{n - 1} - \frac{\left(\sum f m\right)^2}{n(n - 1)}$$

where:
$f$ = corresponding frequency
$m$ = class mark or midpoint of each class interval
$n$ = sample size

POPULATION VARIANCE

$$\sigma^2 = \frac{\sum f m^2}{N} - \left(\frac{\sum f m}{N}\right)^2$$

where:
$f$ = corresponding frequency
$m$ = class mark or midpoint of each class interval
$N$ = population size

SAMPLE STANDARD DEVIATION

$$s = \sqrt{\frac{\sum f m^2}{n - 1} - \frac{\left(\sum f m\right)^2}{n(n - 1)}}$$

POPULATION STANDARD DEVIATION

$$\sigma = \sqrt{\frac{\sum f m^2}{N} - \left(\frac{\sum f m}{N}\right)^2}$$

Example 4. Table 5 presents the ages of the managers in a popular fast-food store. Assume that these comprise the entire
population.

Table 5. Computation of the Mean and Standard Deviation of the Ages of the Manager Respondents

Age (years)   Number of Managers (f)   Midpoint (m)   $fm$      $fm^2$
53 – 57       9                        55             495       27,225
48 – 52       27                       50             1,350     67,500
43 – 47       30                       45             1,350     60,750
38 – 42       35                       40             1,400     56,000
33 – 37       29                       35             1,015     35,525
28 – 32       15                       30             450       13,500
23 – 27       5                        25             125       3,125
              N = 150                                 $\sum fm = 6{,}185$   $\sum fm^2 = 263{,}625$

$$\text{Mean } (\mu) = \frac{6{,}185}{150} = 41.23$$

$$\text{Standard Deviation } (\sigma) = \sqrt{\frac{263{,}625}{150} - (41.23)^2} = \sqrt{57.587} = 7.589$$
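
The column totals and the population standard deviation of Example 4 can be recomputed from the midpoints and frequencies in Table 5. The sketch below is one way to do it; the variable names are illustrative. Because the code keeps the mean unrounded, it gives a standard deviation of about 7.57, slightly smaller than the value obtained when the mean is first rounded to 41.23.

```python
from math import sqrt

# (midpoint m, frequency f) pairs from Table 5
ages = [(55, 9), (50, 27), (45, 30), (40, 35), (35, 29), (30, 15), (25, 5)]

N = sum(f for _, f in ages)                 # population size
sum_fm = sum(f * m for m, f in ages)        # sum of f*m
sum_fm2 = sum(f * m**2 for m, f in ages)    # sum of f*m^2

mu = sum_fm / N                             # population mean
pop_var = sum_fm2 / N - mu**2               # grouped population variance
pop_sd = sqrt(pop_var)

print(f"N = {N}, sum fm = {sum_fm}, sum fm^2 = {sum_fm2}")
print(f"mean = {mu:.2f}, variance = {pop_var:.3f}, sd = {pop_sd:.3f}")
```

The difference illustrates why intermediate results are best carried at full precision and rounded only at the end.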

SKEWNESS AND KURTOSIS

Frequency distributions can assume almost any shape. The shape of a frequency distribution influences
the relationship among the measures of central tendency. If the distribution is symmetric and unimodal, then the
mean, the median, and the mode all coincide. Some frequency distributions, however, are asymmetrical. Distributions
of this kind, which have a pronounced "tail" on one side or the other, are skewed.

SKEWNESS

Skewness refers to the symmetry or asymmetry of the frequency distribution. A frequency distribution is positively
skewed if its tail extends farther to the right of the mode than it does to the left. It is negatively skewed if its tail
extends farther to the left of the mode than it does to the right.

Figure 1. Positively Skewed


As shown in this distribution, only a few individuals received the higher scores. The frequency polygon in Figure 1
is positively skewed because the tail of the distribution extends to the right, in the direction of the higher (more
positive) score values. It follows that the mean is higher than the median.

Figure 2. Negatively Skewed

This polygon is negatively skewed, since the tail of the distribution goes off to the left. This implies that there are
more high scores, so the values cluster to the right. It follows that the mean is lower than the median.

Pearsonian Coefficient of Skewness

Operationally, this statistical tool has the following formula:

$$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$

where, for a perfectly symmetrical distribution, the value of Sk is 0. In general, its value falls between -3 and
3.

An Sk value that is greater than 0 indicates that the frequency polygon is skewed to the right, while an Sk value
that is less than 0 indicates that the frequency polygon is skewed to the left.


KURTOSIS

Kurtosis refers to the flatness or peakedness of one distribution in relation to another. Figure 3 shows the
three types of kurtosis.

Figure 3. Types of Kurtosis

Curve A: Leptokurtic; K > 3

Curve B: Mesokurtic; K = 3

Curve C: Platykurtic; K < 3

Curve A is leptokurtic because its curve is more peaked than the others. Curve C is platykurtic because it is less
peaked than Curve B. Curve B is a normal curve and it is mesokurtic.

Kurtosis Formula for Ungrouped Data


$$K = \frac{\sum (x - \bar{X})^4}{n s^4}$$

where:
$x$ = score
$\bar{X}$ = sample mean
$n$ = sample size
$s$ = standard deviation

Kurtosis Formula for Grouped Data


$$K = \frac{\sum f (x - \bar{X})^4}{n s^4}$$

where:
$f$ = corresponding frequency
$x$ = class mark
$\bar{X}$ = sample mean
$n$ = sample size
$s$ = sample standard deviation

Example 5. Using the data below, solve for the skewness of the two distributions.

Data                   Group A     Group B
Mean                   72.12       67.10
Median                 70.10       65.25
Standard Deviation     15.25       10.12

Solving for skewness of the data:

Group A

$$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$

$$Sk = \frac{3(72.12 - 70.10)}{15.25} = 0.397$$

Group B

$$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$

$$Sk = \frac{3(67.10 - 65.25)}{10.12} = 0.548$$

The skewness value of Group A is 0.397, while that of Group B is 0.548. Both values are positive, which means
that in both groups most scores lie toward the lower end, with a tail toward the higher scores. Group A has a lower
skewness value than Group B, so its distribution is closer to symmetric. Its larger standard deviation (15.25 versus
10.12) indicates that the scores of Group A are more dispersed than those of Group B.
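
The Pearsonian coefficient is simple enough to wrap in a small helper. The sketch below applies it to the summary statistics of Example 5; the function name `pearson_skewness` is an illustrative choice.

```python
def pearson_skewness(mean, median, std_dev):
    """Pearsonian coefficient of skewness: Sk = 3 * (mean - median) / sd."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return 3 * (mean - median) / std_dev

sk_a = pearson_skewness(72.12, 70.10, 15.25)   # Group A
sk_b = pearson_skewness(67.10, 65.25, 10.12)   # Group B

for name, sk in (("Group A", sk_a), ("Group B", sk_b)):
    direction = "right" if sk > 0 else "left" if sk < 0 else "not"
    print(f"{name}: Sk = {sk:.3f} ({direction} skewed)")
```

The sign of the result depends only on whether the mean lies above or below the median, which is why a positive Sk corresponds to a tail on the right.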

Example 6. The following is an illustrative example of the computation of kurtosis.

SCORE      FREQUENCY (f)   MIDPOINT (m)   $(x - \bar{X})$   $(x - \bar{X})^4$   $f(x - \bar{X})^4$
46 – 50    8               48             18.26             111,173.96          889,391.68
32 – 45    10              38.5           8.76              5,888.66            58,886.60
25 – 31    16              28             -1.74             9.17                146.72
11 – 24    12              17.5           -12.24            22,445.31           269,343.72
0 – 10     4               5              -24.74            374,626.75          1,498,507.00
           N = 50                                                               $\sum f(x - \bar{X})^4 = 2{,}716{,}275.72$

Mean = 29.74     SD = 13.00


$$K = \frac{\sum f (x - \bar{X})^4}{n s^4}$$

$$K = \frac{2{,}716{,}275.72}{(50)(13.00)^4} = \frac{2{,}716{,}275.72}{1{,}428{,}050}$$

$$K \approx 1.90 \text{ (platykurtic)}$$
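
As a cross-check, the grouped kurtosis formula $K = \sum f(x - \bar{X})^4 / (n s^4)$ can be evaluated directly from the midpoints and frequencies of Example 6. The sketch below takes the mean and standard deviation as given in the example rather than recomputing them; the names `classes` and `classify` are illustrative.

```python
# (midpoint m, frequency f) pairs from the table in Example 6
classes = [(48, 8), (38.5, 10), (28, 16), (17.5, 12), (5, 4)]
mean = 29.74   # as stated in the example
sd = 13.00     # as stated in the example

n = sum(f for _, f in classes)
# Sum of f * (class mark - mean)^4 over all classes
fourth_power_sum = sum(f * (m - mean) ** 4 for m, f in classes)
K = fourth_power_sum / (n * sd ** 4)

def classify(k):
    """Label a kurtosis value relative to the normal-curve value of 3."""
    if k > 3:
        return "leptokurtic"
    if k < 3:
        return "platykurtic"
    return "mesokurtic"

print(f"n = {n}, sum f(x - mean)^4 = {fourth_power_sum:,.2f}")
print(f"K = {K:.2f} ({classify(K)})")
```

Since the frequencies are already included in the sum of fourth powers, the sum is divided once by $n s^4$; the resulting K is below 3, so the distribution is flatter than the normal curve.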
