Module 5 PDF

This document discusses measures of variability in data distributions, including skewness and kurtosis. It provides examples and formulas for calculating several measures of absolute variability from data sets, including: 1. Range, which is the difference between the highest and lowest values. 2. Interquartile range and quartile deviation, which describe the spread of the middle 50% of the data. 3. Mean deviation, which takes into account differences between all values and the mean. It also covers calculating variance and standard deviation, both for raw data sets and grouped data, as measures of variability around the mean. Formulas are provided for computing these statistics.

Uploaded by

yoonginism
Copyright
© All Rights Reserved

Lesson 5: Measures of Variability, Skewness and Kurtosis

Measures of Absolute Variability


Measures of absolute variability are used to compare two or more data sets that
have the same mean and the same units of measurement.
Examples:
• 71, 77, 79, 82, 86, 89, 90    x̄ = 82
• 78, 79, 80, 81, 84, 86, 86    x̄ = 82
These two distributions have the same mean, 82, but the spread of the scores
around the mean is quite different. Hence, the measures of central tendency alone
do not completely describe a distribution. It is therefore necessary to study other
statistical measures in order to describe the characteristics of the distribution
further.
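The point about equal means and unequal spread can be checked with a short Python sketch. The names `set_1`, `set_2` and `simple_range` are illustrative and do not come from the module; the spread is measured here by the simple difference between the largest and smallest score.

```python
from statistics import mean

# The two example data sets from the text.
set_1 = [71, 77, 79, 82, 86, 89, 90]
set_2 = [78, 79, 80, 81, 84, 86, 86]

def simple_range(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)

# Both means are equal, but the first set is spread much more widely.
print("Set 1: mean =", mean(set_1), "range =", simple_range(set_1))
print("Set 2: mean =", mean(set_2), "range =", simple_range(set_2))
```

The two sets share the same mean, yet the first covers more than twice the distance between its extremes, which is why a measure of variability is needed alongside the mean.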

5.1 Absolute Measures of Variability

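Range

The range (R) of grouped data is the difference between the highest upper class
boundary (UCB) and the lowest lower class boundary (LCB).

Example 1:
CI
5-9      lower scores
10-14
15-19
20-24    higher scores

R = highest UCB − lowest LCB
  = 24.5 − 4.5
R = 20 (inclusive range)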
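The inclusive range in Example 1 can be sketched in Python as follows. The function name `inclusive_range` is hypothetical, and the code assumes integer class limits, so that each class boundary lies 0.5 beyond its limit.

```python
def inclusive_range(class_intervals):
    """Range from class boundaries: highest UCB minus lowest LCB.

    class_intervals is a list of (lower_limit, upper_limit) pairs.
    Assumption: limits are integers, so boundaries are limit -/+ 0.5.
    """
    lowest_lcb = min(low for low, _ in class_intervals) - 0.5
    highest_ucb = max(high for _, high in class_intervals) + 0.5
    return highest_ucb - lowest_lcb

# Class intervals from Example 1.
intervals = [(5, 9), (10, 14), (15, 19), (20, 24)]
print(inclusive_range(intervals))
```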


Quartile Deviation

Interquartile Range: $IR = Q_3 - Q_1$

Semi-interquartile Range (quartile deviation):
$$QD = \frac{Q_3 - Q_1}{2}$$

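Example 2: Using the given set of observations, determine the

1. Interquartile range (IR)
2. Quartile deviation (QD)

Set of observations:
10, 8, 6, 4, 14, 11, 16, 7

Solution: Arrange the observations in order (4, 6, 7, 8, 10, 11, 14, 16) and solve
for Q3 and Q1. The position of the p-th percentile is p(n+1)/100.

Position of P75:
$$\frac{p(n+1)}{100} = \frac{75(8+1)}{100} = 6.75$$
The 6th and 7th ordered values are 11 and 14, so
$$Q_3 = P_{75} = 11 + 0.75(14 - 11) = 11 + 2.25 = 13.25$$

Position of P25:
$$\frac{p(n+1)}{100} = \frac{25(8+1)}{100} = 2.25$$
The 2nd and 3rd ordered values are 6 and 7, so
$$Q_1 = P_{25} = 6 + 0.25(7 - 6) = 6 + 0.25 = 6.25$$

Hence,
interquartile range:
$$Q_3 - Q_1 = 13.25 - 6.25 = 7$$
quartile deviation:
$$\frac{Q_3 - Q_1}{2} = \frac{13.25 - 6.25}{2} = \frac{7}{2} = 3.5$$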
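The position rule p(n+1)/100 used in Example 2 can be written as a small Python function. This is a sketch only: the name `percentile` is illustrative, and clamping positions that fall outside the data to the smallest or largest value is an assumption that the module does not discuss.

```python
def percentile(values, p):
    """p-th percentile using the position rule p(n + 1)/100.

    The whole part of the position selects an ordered value, and the
    fractional part interpolates linearly toward the next ordered value.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = p * (n + 1) / 100        # 1-based, possibly fractional
    lower_rank = int(position)          # whole part of the position
    fraction = position - lower_rank    # fractional part of the position
    if lower_rank < 1:                  # assumption: clamp below the data
        return ordered[0]
    if lower_rank >= n:                 # assumption: clamp above the data
        return ordered[-1]
    lower_value = ordered[lower_rank - 1]
    upper_value = ordered[lower_rank]
    return lower_value + fraction * (upper_value - lower_value)

observations = [10, 8, 6, 4, 14, 11, 16, 7]
q1 = percentile(observations, 25)
q3 = percentile(observations, 75)
print("Q1 =", q1, "Q3 =", q3)
print("IR =", q3 - q1, "QD =", (q3 - q1) / 2)
```

Other software may use a different interpolation rule for quartiles, so results can differ slightly from this (n+1) method.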

The quartile deviation is used when the median is the preferred measure of
central tendency, that is, when there are scattered or extreme scores in the
distribution. It describes the spread of the scores around the median, that is, of
the middle 50% of the cases in a distribution.
The Average or Mean Deviation

The average or mean deviation takes into account all the values in a given
distribution. The formula for mean deviation is
$$MD = \frac{\sum |x - \bar{x}|}{n}$$
where
$|x - \bar{x}|$ is read as the absolute value of the difference between x and x̄.
By absolute value, the sign of the difference between x and x̄ is disregarded.
x = score or value
x̄ = mean
n = number of cases

Example 3. Given the two sets of data, let us solve for the mean deviations:

Set A: 28, 29, 32, 37 and 39

Set B: 25, 32, 33, 40 and 45

Solution: Set A

x          |x − x̄|
28         |28 − 33| = 5
29         |29 − 33| = 4
32         |32 − 33| = 1
37         |37 − 33| = 4
39         |39 − 33| = 6
∑ = 165    ∑ = 20        (mean = 33)

$$MD = \frac{\sum |x - \bar{x}|}{n} = \frac{20}{5} = 4$$

Solution: Set B

x          |x − x̄|
25         |25 − 35| = 10
32         |32 − 35| = 3
33         |33 − 35| = 2
40         |40 − 35| = 5
45         |45 − 35| = 10
∑ = 175    ∑ = 30        (mean = 35)

$$MD \text{ or } AD = \frac{\sum |x - \bar{x}|}{n} = \frac{30}{5} = 6$$
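The mean deviation of Example 3 can be reproduced with a few lines of Python. The function name `mean_deviation` is illustrative; the code simply averages the absolute differences from the mean.

```python
def mean_deviation(values):
    """Average absolute distance of the values from their arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

set_a = [28, 29, 32, 37, 39]
set_b = [25, 32, 33, 40, 45]
print("MD of Set A:", mean_deviation(set_a))
print("MD of Set B:", mean_deviation(set_b))
```

Set B has the larger mean deviation, so its scores lie farther from their mean on average.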

Grouped Data

For grouped data, the mean deviation can be calculated using the following
formula:
$$MD = \frac{\sum f_i |x_i - \bar{x}|}{n}$$
where x_i = class marks
x̄ = grouped data mean
n = total frequency
f_i = frequency per class interval
Example 4. Compute the range, quartile deviation and mean deviation

CI f
61 – 65 5
66 – 70 8
71 – 75 12
76 – 80 6
81 – 85 4

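Solution:
a. Range = highest UCB − lowest LCB = 85.5 − 60.5 = 25

b. Quartile deviation

CI       f     <cf
61-65    5     5
66-70    8     13
71-75    12    25
76-80    6     31
81-85    4     35
c = 5    n = 35

$$P_{25} = l_{25} + \left(\frac{\frac{25n}{100} - {<}cfb_{25}}{f_{25}}\right)c \qquad P_{75} = l_{75} + \left(\frac{\frac{75n}{100} - {<}cfb_{75}}{f_{75}}\right)c$$

Here l is the lower boundary of the class containing the percentile, <cfb is the
cumulative frequency before that class, f is the frequency of that class, and c is
the class width.

For P25: 25(35)/100 = 8.75, which falls in the class 66-70.
$$P_{25} = 65.5 + \left(\frac{8.75 - 5}{8}\right)5 = 65.5 + 2.34 = 67.84$$

For P75: 75(35)/100 = 26.25, which falls in the class 76-80.
$$P_{75} = 75.5 + \left(\frac{26.25 - 25}{6}\right)5 = 75.5 + 1.04 = 76.54$$

Hence,
$$\text{Quartile deviation} = \frac{Q_3 - Q_1}{2} = \frac{76.54 - 67.84}{2} = \frac{8.7}{2} = 4.35$$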
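The grouped percentile formula used in part b can be sketched in Python. The function name `grouped_percentile` and the tuple layout of the frequency table are assumptions made for illustration, and integer class limits are assumed so that boundaries lie 0.5 beyond the limits.

```python
def grouped_percentile(classes, p):
    """p-th percentile of a frequency table.

    classes: list of (lower_limit, upper_limit, frequency) in ascending order.
    Implements P = l + ((p*n/100 - cf_before) / f) * c, where l is the lower
    class boundary, f the class frequency and c the class width.
    """
    n = sum(frequency for _, _, frequency in classes)
    target = p * n / 100
    cumulative_before = 0
    for lower, upper, frequency in classes:
        # The percentile class is the first one whose cumulative
        # frequency reaches the target position.
        if frequency > 0 and cumulative_before + frequency >= target:
            lower_boundary = lower - 0.5
            width = (upper + 0.5) - lower_boundary
            return lower_boundary + (target - cumulative_before) / frequency * width
        cumulative_before += frequency
    raise ValueError("p must be between 0 and 100")

# Frequency distribution from Example 4.
table = [(61, 65, 5), (66, 70, 8), (71, 75, 12), (76, 80, 6), (81, 85, 4)]
q1 = grouped_percentile(table, 25)
q3 = grouped_percentile(table, 75)
print("Q1 =", round(q1, 2), "Q3 =", round(q3, 2))
print("QD =", round((q3 - q1) / 2, 2))
```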
c. Mean deviation

CI       f_i   x_i   f_i x_i   |x_i − x̄|                f_i |x_i − x̄|
61-65    5     63    315       |63 − 72.43| = 9.43      5(9.43) = 47.15
66-70    8     68    544       |68 − 72.43| = 4.43      8(4.43) = 35.44
71-75    12    73    876       0.57                     6.84
76-80    6     78    468       5.57                     33.42
81-85    4     83    332       10.57                    42.28
c = 5    35          2535                               165.13

$$\bar{x} = \frac{\sum f_i x_i}{n} = \frac{2535}{35} = 72.43$$

$$MD \text{ or } AD = \frac{\sum f_i |x_i - \bar{x}|}{n} = \frac{165.13}{35} = 4.718 \text{ or } 4.72$$
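The grouped mean deviation can be checked with a short Python sketch. The name `grouped_mean_deviation` and the (class mark, frequency) layout are illustrative. The code keeps the mean unrounded, so its result may differ from the hand computation in the third decimal place.

```python
def grouped_mean_deviation(classes):
    """Mean deviation of a frequency table given as (class_mark, frequency) pairs."""
    n = sum(f for _, f in classes)
    mean = sum(x * f for x, f in classes) / n
    return sum(f * abs(x - mean) for x, f in classes) / n

# Class marks and frequencies from Example 4.
table = [(63, 5), (68, 8), (73, 12), (78, 6), (83, 4)]
print(round(grouped_mean_deviation(table), 2))
```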

Variance and Standard Deviation

Variance and standard deviation of ungrouped data

The population variance and standard deviation

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} \qquad \sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$

µ = population mean    N = population size

The formulas for the variance and standard deviation of ungrouped sample data are

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \qquad s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

Example 5. Compute the variance and standard deviation

1. 10, 12, 14, 15, 17, 18, 18, 24

x_i        (x_i − x̄)           (x_i − x̄)²
10         (10 − 16) = −6      36
12         (12 − 16) = −4      16
14         −2                  4
15         −1                  1
17         1                   1
18         2                   4
18         2                   4
24         8                   64
∑ = 128                        ∑ = 130

Solution:
$$\bar{x} = \frac{\sum x_i}{n} = \frac{128}{8} = 16$$

Variance:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{130}{8 - 1} = \frac{130}{7} = 18.57$$

Standard deviation:
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{130}{7}} = \sqrt{18.57} = 4.31$$
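The sample variance and standard deviation of Example 5 can be reproduced in Python. The function names are illustrative; the code follows the sample formulas with n − 1 in the denominator.

```python
from math import sqrt

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_std(values):
    """Square root of the sample variance."""
    return sqrt(sample_variance(values))

scores = [10, 12, 14, 15, 17, 18, 18, 24]
print("variance =", round(sample_variance(scores), 2))
print("standard deviation =", round(sample_std(scores), 2))
```

Python's standard `statistics` module offers `variance` and `stdev`, which use the same n − 1 denominator.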
For grouped data:

The population variance and standard deviation

$$\sigma^2 = \frac{\sum f_i (x_i - \mu)^2}{N} \qquad \sigma = \sqrt{\frac{\sum f_i (x_i - \mu)^2}{N}}$$

µ = population mean    N = population size

The formulas for the variance and standard deviation of grouped sample data are

$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1} \qquad s = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{n - 1}}$$

Example 6. Compute the variance and standard deviation

         (1)   (2)   (3) = (1)(2)   (4) = (2) − 72.43        (5) = (4)²    (6) = (5)(1)
CI       f_i   x_i   f_i x_i        (x_i − x̄)                (x_i − x̄)²    f_i (x_i − x̄)²
61-65    5     63    315            (63 − 72.43) = −9.43     88.9249       444.6245
66-70    8     68    544            −4.43                    19.6249       156.9992
71-75    12    73    876            0.57                     0.3249        3.8988
76-80    6     78    468            5.57                     31.0249       186.1494
81-85    4     83    332            10.57                    111.7249      446.8996
c = 5    35          2535                                                  1238.5715

$$\bar{x} = \frac{\sum f_i x_i}{n} = \frac{2535}{35} = 72.43$$

$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1} = \frac{1238.5715}{35 - 1} = \frac{1238.5715}{34} = 36.4286 \text{ or } 36.43$$

$$s = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{1238.5715}{34}} = \sqrt{36.4286} = 6.0356 \text{ or } 6.04$$
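The grouped sample variance of Example 6 can be sketched in Python as below. The function names and the (class mark, frequency) layout are illustrative assumptions. Because the code does not round the mean to 72.43, its intermediate sums differ very slightly from the table.

```python
from math import sqrt

def grouped_sample_variance(classes):
    """Sample variance of a frequency table given as (class_mark, frequency) pairs."""
    n = sum(f for _, f in classes)
    mean = sum(x * f for x, f in classes) / n
    return sum(f * (x - mean) ** 2 for x, f in classes) / (n - 1)

def grouped_sample_std(classes):
    """Square root of the grouped sample variance."""
    return sqrt(grouped_sample_variance(classes))

# Class marks and frequencies from Example 6.
table = [(63, 5), (68, 8), (73, 12), (78, 6), (83, 4)]
print("variance =", round(grouped_sample_variance(table), 2))
print("standard deviation =", round(grouped_sample_std(table), 2))
```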

5.2 Relative Measures of Variability


The range, the quartile deviation, the average or mean deviation, the
standard deviation and the variance give us the distances of the scores from
the measures of central tendency. They are categorized as measures of
absolute variability since they are expressed in the units of the original data
(or, for the variance, the square of those units).

If two or more distributions with different units are to be compared, it is more
appropriate to use measures of relative variability. These measures are the
coefficient of variation and the standard scores.

Coefficient of Variation
$$CV = \frac{S}{\bar{X}}(100\%)$$
Where:
CV stands for the coefficient of variation;
S is the standard deviation;
X̄ is the mean

Example 7. Department store A has mean weekly sales of 340 bags with a
standard deviation of 12. Department store B has mean weekly sales of 550 bags
with a standard deviation of 15. In relative terms, which store has the greater
variability in its weekly sales?

Department store A:
$$CV = \frac{S}{\bar{X}}(100\%) = \frac{12}{340}(100\%) = 3.53\%$$

Department store B:
$$CV = \frac{S}{\bar{X}}(100\%) = \frac{15}{550}(100\%) = 2.73\%$$

Thus, Department store A has the larger variation relative to its mean.

Example 8. The mean score on a statistics test of class A is 75 with a standard
deviation of 13, while class B has a mean score of 86 with a standard deviation of
16. Which class has the larger variation from the mean?

Class A:
$$CV = \frac{13}{75}(100\%) = 17.33\%$$

Class B:
$$CV = \frac{16}{86}(100\%) = 18.60\%$$

Thus, class B has the larger variation relative to its mean.
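Examples 7 and 8 both apply the same ratio, which can be sketched in Python. The function name `coefficient_of_variation` is illustrative, and rejecting a zero mean is an added assumption, since the ratio is undefined in that case.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return std_dev / mean * 100

# Example 7: weekly sales of two department stores.
print("Store A:", round(coefficient_of_variation(12, 340), 2), "%")
print("Store B:", round(coefficient_of_variation(15, 550), 2), "%")

# Example 8: statistics test scores of two classes.
print("Class A:", round(coefficient_of_variation(13, 75), 2), "%")
print("Class B:", round(coefficient_of_variation(16, 86), 2), "%")
```

Store B has the larger standard deviation in absolute terms, but store A varies more relative to its mean, which is the comparison the coefficient is designed for.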

Coefficient of Quartile Deviation

$$CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1}(100\%)$$
Example 9. Consider the data from Example 2: 10, 8, 6, 4, 14, 11, 16, 7

Solution: Arrange the observations in order (4, 6, 7, 8, 10, 11, 14, 16) and solve
for Q3 and Q1.

Position of P75:
$$\frac{p(n+1)}{100} = \frac{75(8+1)}{100} = 6.75 \qquad Q_3 = 13.25$$

Position of P25:
$$\frac{p(n+1)}{100} = \frac{25(8+1)}{100} = 2.25 \qquad Q_1 = 6.25$$

Hence,
interquartile range:
$$Q_3 - Q_1 = 13.25 - 6.25 = 7$$
quartile deviation:
$$\frac{Q_3 - Q_1}{2} = \frac{13.25 - 6.25}{2} = \frac{7}{2} = 3.5$$

Hence, the coefficient of quartile deviation is

$$CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1}(100\%) = \frac{13.25 - 6.25}{13.25 + 6.25}(100\%) = \frac{7}{19.5}(100\%)$$

$$CQD = 0.35897(100\%) = 35.897\% \text{ or } 35.90\%$$

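Example 10. Consider the data of Example 4. Compute the coefficient of quartile
deviation.

CI       f     <cf
61-65    5     5
66-70    8     13
71-75    12    25
76-80    6     31
81-85    4     35
c = 5    n = 35

Solution:

$$Q_1 = P_{25} = l_{25} + \left(\frac{\frac{25n}{100} - {<}cfb_{25}}{f_{25}}\right)c \qquad Q_3 = P_{75} = l_{75} + \left(\frac{\frac{75n}{100} - {<}cfb_{75}}{f_{75}}\right)c$$

For P25: 25(35)/100 = 8.75
$$P_{25} = 65.5 + \left(\frac{8.75 - 5}{8}\right)5 = 65.5 + 2.34 = 67.84$$

For P75: 75(35)/100 = 26.25
$$P_{75} = 75.5 + \left(\frac{26.25 - 25}{6}\right)5 = 75.5 + 1.04 = 76.54$$

Hence,
$$\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}(100\%) = \frac{76.54 - 67.84}{76.54 + 67.84}(100\%) = \frac{8.7}{144.38}(100\%) = 0.0603(100\%) = 6.03\%$$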
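Once Q1 and Q3 are known, the coefficient of quartile deviation in Examples 9 and 10 is a single ratio. A minimal Python sketch follows; the function name is illustrative, and the quartile values passed in are the ones already computed in the examples.

```python
def coefficient_of_quartile_deviation(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1), expressed as a percentage."""
    return (q3 - q1) / (q3 + q1) * 100

# Example 9: quartiles of the ungrouped data.
print(round(coefficient_of_quartile_deviation(6.25, 13.25), 2))

# Example 10: quartiles of the grouped data.
print(round(coefficient_of_quartile_deviation(67.84, 76.54), 2))
```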

5.3 Measures of Skewness and Kurtosis

The formula for skewness is

$$SK = \frac{3(\bar{x} - md)}{S}$$

Where x̄ = mean
md = median
S = standard deviation
Skewness can be classified according to the skewness coefficient.
• If SK > 0, the mean is greater than the median and the distribution is POSITIVELY SKEWED.
• If SK < 0, the mean is less than the median and the distribution is NEGATIVELY SKEWED.
• If SK = 0, the distribution is treated as SYMMETRICAL, as in the normal distribution.

Example 11. Consider the data below and compute the Pearsonian coefficient of
skewness.

10, 12, 14, 15, 17, 18, 18, 24

Solution:
$$\bar{x} = \frac{\sum x_i}{n} = \frac{128}{8} = 16 \qquad md = \frac{15 + 17}{2} = 16$$

Standard deviation:
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{130}{7}} = \sqrt{18.57} = 4.31$$

Hence,
$$SK = \frac{3(\bar{x} - md)}{S} = \frac{3(16 - 16)}{4.31} = 0$$

Therefore, by this coefficient, the distribution is classified as symmetrical.
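Example 11 can be reproduced with a short Python sketch of the Pearsonian coefficient 3(mean − median)/s. The function names are illustrative. The classification follows the usual convention that a positive coefficient (mean above the median) indicates positive skew and a negative coefficient indicates negative skew.

```python
from math import sqrt
from statistics import mean, median

def pearson_skewness(values):
    """Pearsonian coefficient of skewness: 3 * (mean - median) / s."""
    n = len(values)
    x_bar = mean(values)
    # Sample standard deviation with n - 1 in the denominator.
    s = sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))
    return 3 * (x_bar - median(values)) / s

def classify_skewness(sk):
    """Label a skewness coefficient by its sign."""
    if sk > 0:
        return "positively skewed"
    if sk < 0:
        return "negatively skewed"
    return "symmetrical"

scores = [10, 12, 14, 15, 17, 18, 18, 24]
sk = pearson_skewness(scores)
print(sk, classify_skewness(sk))
```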

Measures of Kurtosis

Kurtosis is the measure of the degree of peakedness or flatness of a
distribution. There are three types of kurtosis: leptokurtosis, platykurtosis,
and mesokurtosis.
Leptokurtic or tall distributions have an unusually large number of scores or
values at the center of the distribution. They are more peaked than the normal curve
since the scores are concentrated within a very narrow interval at the center. Their
tails are high and long.
Platykurtic distributions are flat distributions. The values or scores are
distributed over a wider range about the center, making the hump of the curve flat.
They are flatter than the normal distribution, and their tails are short.
Mesokurtic distributions are the normal or symmetrical distributions. The
values or scores are moderately distributed about the center of the distribution, so
the curve is neither too peaked nor too flat.

For a normal distribution, kurtosis is equal to 3 and its curve is
mesokurtic. If the kurtosis is higher than 3, the curve is leptokurtic, and if
it is lower than 3, the curve is platykurtic.

For raw data

$$K = \frac{\sum (x_i - \bar{x})^4}{n s^4}$$
Where x_i = individual scores
x̄ = mean of the raw data
s = standard deviation of the raw data

Example 12. Compute the kurtosis

10, 12, 14, 15, 17, 18, 18, 24

x_i        (x_i − x̄)    (x_i − x̄)²    (x_i − x̄)⁴
10         −6           36            1296
12         −4           16            256
14         −2           4             16
15         −1           1             1
17         1            1             1
18         2            4             16
18         2            4             16
24         8            64            4096
∑ = 128                 ∑ = 130       ∑ = 5698

Solution:

$$\bar{x} = \frac{\sum x_i}{n} = \frac{128}{8} = 16$$

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{130}{8 - 1} = \frac{130}{7} = 18.57 \qquad s^4 = (s^2)^2 = (18.57)^2 = 344.8449$$

$$K = \frac{\sum (x_i - \bar{x})^4}{n s^4} = \frac{5698}{8(344.8449)} = \frac{5698}{2758.7592} = 2.065 \text{ or } 2.07$$

Since K is less than 3, the distribution is platykurtic.
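The kurtosis computation of Example 12 can be sketched in Python. The function names are illustrative. The code follows the formula as given here, with the sample variance (n − 1 denominator) squared to obtain s⁴; statistical libraries may define kurtosis differently, so their results need not match.

```python
def kurtosis(values):
    """K = sum((x - mean)^4) / (n * s^4), with s^2 the sample variance."""
    n = len(values)
    mean = sum(values) / n
    s_squared = sum((x - mean) ** 2 for x in values) / (n - 1)
    fourth_power_sum = sum((x - mean) ** 4 for x in values)
    return fourth_power_sum / (n * s_squared ** 2)

def classify_kurtosis(k):
    """Compare K with 3, the value for a normal distribution."""
    if k > 3:
        return "leptokurtic"
    if k < 3:
        return "platykurtic"
    return "mesokurtic"

scores = [10, 12, 14, 15, 17, 18, 18, 24]
k = kurtosis(scores)
print(round(k, 3), classify_kurtosis(k))
```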

For grouped data

$$K = \frac{\sum f (x - \bar{x})^4}{n s^4}$$

Where x = class midpoints
f = frequency of each class interval
x̄ = mean of the grouped data
s = standard deviation of the grouped data

Example 13. Compute the kurtosis

Solution:

         (1)   (2)   (3) = (1)(2)   (4) = (2) − 72.43   (5) = (4)²    (6) = (1)(5)      (7) = (6)(5)
CI       f_i   x_i   f_i x_i        (x_i − x̄)           (x_i − x̄)²    f_i (x_i − x̄)²    f_i (x_i − x̄)⁴
61-65    5     63    315            −9.43               88.9249       444.6245          39538.1892
66-70    8     68    544            −4.43               19.6249       156.9992          3081.0936
71-75    12    73    876            0.57                0.3249        3.8988            1.26672012
76-80    6     78    468            5.57                31.0249       186.1494          5775.26652
81-85    4     83    332            10.57               111.7249      446.8996          49929.81312
c = 5    35          2535                                             1238.5715         98325.62916

$$\bar{x} = \frac{\sum f_i x_i}{n} = \frac{2535}{35} = 72.43$$

$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1} = \frac{1238.5715}{35 - 1} = \frac{1238.5715}{34} = 36.4286$$

$$s^4 = (s^2)^2 = (36.4286)^2 = 1327.042898$$

$$K = \frac{\sum f (x - \bar{x})^4}{n s^4} = \frac{98325.62916}{35(1327.042898)} = \frac{98325.62916}{46446.50143} = 2.117 \text{ or } 2.12$$

Since K is less than 3, the distribution is platykurtic.
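The grouped kurtosis of Example 13 can be sketched in Python as below. The function name and the (class midpoint, frequency) layout are illustrative assumptions. The mean is kept unrounded, so the result agrees with the hand computation only to about three decimal places.

```python
def grouped_kurtosis(classes):
    """K = sum(f * (x - mean)^4) / (n * s^4) for (midpoint, frequency) pairs.

    s^2 is the grouped sample variance with n - 1 in the denominator.
    """
    n = sum(f for _, f in classes)
    mean = sum(x * f for x, f in classes) / n
    s_squared = sum(f * (x - mean) ** 2 for x, f in classes) / (n - 1)
    fourth_power_sum = sum(f * (x - mean) ** 4 for x, f in classes)
    return fourth_power_sum / (n * s_squared ** 2)

# Class midpoints and frequencies from Example 13.
table = [(63, 5), (68, 8), (73, 12), (78, 6), (83, 4)]
k = grouped_kurtosis(table)
print(round(k, 3), "platykurtic" if k < 3 else "not platykurtic")
```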
