Measure of Variability - Data Management PDF
and Location
Solution:
[Figure: number line marking distances of 3, 2, 1, 0, 1, 2, 3 units on either side of the mean]
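Average Deviation for Ungrouped Data
Population AD: $AD = \frac{\sum |X - \mu|}{N}$
Sample AD: $AD = \frac{\sum |X - \bar{X}|}{n}$
where $X$ is an individual value, $\mu$ is the population mean, $\bar{X}$ is the sample mean, and $N$ and $n$ are the population and sample sizes.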
Average Deviation for Grouped Data
Population AD: $AD = \frac{\sum f\,|X - \mu|}{N}$
Sample AD: $AD = \frac{\sum f\,|X - \bar{X}|}{n}$
where $f$ is the class frequency and $X$ is the class midpoint.
Example 1 for Average Deviation
Solution:
Compute the mean:
$\bar{X} = \frac{\sum X}{n} = \frac{550 + 420 + 560 + 500 + 700 + 670 + 860 + 480}{8} = \frac{4{,}740}{8} = 592.50$
Solution: AD Ungrouped Data
$X$      $X - \bar{X}$    $|X - \bar{X}|$
550      -42.5            42.5
420      -172.5           172.5
560      -32.5            32.5
500      -92.5            92.5
700      107.5            107.5
670      77.5             77.5
860      267.5            267.5
480      -112.5           112.5
$\sum X = 4{,}740$    $\sum (X - \bar{X}) = 0$    $\sum |X - \bar{X}| = 905$
(First row: 550 - 592.5 = -42.5.)
$AD = \frac{\sum |X - \bar{X}|}{n} = \frac{905}{8} = 113.125 \approx 113.13$
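The ungrouped calculation above can be checked with a few lines of Python. This is only a sketch: the function name `average_deviation` is made up for illustration, and the data are the eight values from the example.

```python
# Values from the worked example above.
values = [550, 420, 560, 500, 700, 670, 860, 480]


def average_deviation(data):
    """Mean of the absolute deviations from the arithmetic mean."""
    mean = sum(data) / len(data)
    return sum(abs(x - mean) for x in data) / len(data)


ad = average_deviation(values)
print(round(ad, 3))  # 113.125
```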
Example: AD Grouped Data
Amount of Electric Bill    Number of Families
700 – 849                  2
850 – 999                  9
1,000 – 1,149              15
1,150 – 1,299              9
1,300 – 1,449              5
Solution: AD Grouped Data
Class Limits       f     Midpoint X    fX
700 – 849          2     774.5         1,549.00
850 – 999          9     924.5         8,320.50
1,000 – 1,149      15    1,074.5       16,117.50
1,150 – 1,299      9     1,224.5       11,020.50
1,300 – 1,449      5     1,374.5       6,872.50
Total              40                  $\sum fX = 43{,}880$
$\bar{X} = \frac{\sum fX}{n} = \frac{43{,}880}{40} = 1{,}097$
Solution (continuation)
$AD = \frac{\sum f\,|X - \bar{X}|}{n} = \frac{5{,}070}{40} = 126.75$
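The grouped calculation can be reproduced the same way, working from class midpoints. The sketch below uses a hypothetical helper, `grouped_average_deviation`, and takes the upper limit of the last class as 1,449, which is the value implied by its midpoint of 1,374.5.

```python
# (lower limit, upper limit, frequency) for each class of the electric-bill example.
# The last upper limit is taken as 1,449, consistent with the midpoint 1,374.5.
classes = [
    (700, 849, 2),
    (850, 999, 9),
    (1000, 1149, 15),
    (1150, 1299, 9),
    (1300, 1449, 5),
]


def grouped_average_deviation(rows):
    """Return (mean, average deviation) computed from class midpoints."""
    n = sum(f for _, _, f in rows)
    mids = [((lo + hi) / 2, f) for lo, hi, f in rows]
    mean = sum(f * x for x, f in mids) / n
    ad = sum(f * abs(x - mean) for x, f in mids) / n
    return mean, ad


grouped_mean, grouped_ad = grouped_average_deviation(classes)
print(grouped_mean, grouped_ad)  # 1097.0 126.75
```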
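Standard Deviation
Standard deviation estimated from the range: $s \approx \frac{\text{range}}{4}$
Solution:
$s \approx \frac{\text{range}}{4} = \frac{HV - LV}{4} = \frac{860 - 420}{4} = \frac{440}{4} = 110$
(HV = highest value, LV = lowest value.)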
Variance
Sample Variance:
For ungrouped data:
$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$  or  $s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$
For grouped data:
$s^2 = \frac{\sum f (X - \bar{X})^2}{n - 1}$  or  $s^2 = \frac{\sum fX^2 - \frac{(\sum fX)^2}{n}}{n - 1}$
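Standard Deviation
Sample Standard Deviation:
For ungrouped data:
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$  or  $s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$
For grouped data:
$s = \sqrt{\frac{\sum f (X - \bar{X})^2}{n - 1}}$  or  $s = \sqrt{\frac{\sum fX^2 - \frac{(\sum fX)^2}{n}}{n - 1}}$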
Sample Variance and SD
Sample Variance: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
Sample SD: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
where $X$ is an individual value, $\bar{X}$ is the sample mean, and $n$ is the sample size.
Example: SD & Variance Ungrouped Data
Solution:
Compute the mean:
$\bar{X} = \frac{\sum X}{n} = \frac{550 + 420 + 560 + 500 + 700 + 670 + 860 + 480}{8} = \frac{4{,}740}{8} = 592.50$
Solution 1: SD & Variance Ungrouped Data
$X$      $X - \bar{X}$    $(X - \bar{X})^2$
550      -42.5            1,806.25
420      -172.5           29,756.25
560      -32.5            1,056.25
500      -92.5            8,556.25
700      107.5            11,556.25
670      77.5             6,006.25
860      267.5            71,556.25
480      -112.5           12,656.25
$\sum X = 4{,}740$    $\sum (X - \bar{X}) = 0$    $\sum (X - \bar{X})^2 = 142{,}950$
(First row: 550 - 592.5 = -42.5.)
$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{142{,}950}{8 - 1} = 20{,}421.43$
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{142{,}950}{8 - 1}} = \sqrt{20{,}421.43} = 142.90$
Solution 2: SD & Variance Ungrouped Data
$X$      $X^2$
550      302,500
420      176,400
560      313,600
500      250,000
700      490,000
670      448,900
860      739,600
480      230,400
$\sum X = 4{,}740$    $\sum X^2 = 2{,}951{,}400$
$s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} = \frac{2{,}951{,}400 - \frac{(4{,}740)^2}{8}}{8 - 1} = \frac{2{,}951{,}400 - 2{,}808{,}450}{7} = 20{,}421.43$
$s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}} = \sqrt{20{,}421.43} = 142.90$
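Both solutions should give the same answer, and a short Python check confirms it. The sketch below computes the sample variance from the definition and from the shortcut formula; the variable names are illustrative, and `statistics.variance` from the standard library is used only as a cross-check.

```python
import math
import statistics

values = [550, 420, 560, 500, 700, 670, 860, 480]
n = len(values)
mean = sum(values) / n

# Solution 1: definition, sum of squared deviations over n - 1.
var_definition = sum((x - mean) ** 2 for x in values) / (n - 1)

# Solution 2: shortcut, (sum of X^2 - (sum of X)^2 / n) over n - 1.
var_shortcut = (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)

# Cross-check against the standard library (sample variance, n - 1 divisor).
var_library = statistics.variance(values)

sd = math.sqrt(var_definition)
print(round(var_definition, 2), round(sd, 2))  # 20421.43 142.9
```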
Example: Variance & SD Grouped Data
Class Limits    f     X     fX
18 – 26         3     22    66
27 – 35         5     31    155
36 – 44         9     40    360
45 – 53         14    49    686
54 – 62         11    58    638
63 – 71         6     67    402
72 – 80         2     76    152
Total           50          $\sum fX = 2{,}459$
$\bar{X} = \frac{\sum fX}{n} = \frac{2{,}459}{50} = 49.18$
Solution 1 (continuation)
Variance: $s^2 = \frac{\sum f (X - \bar{X})^2}{n - 1} = \frac{8{,}827.38}{50 - 1} = 180.15$
Standard Deviation: $s = \sqrt{\frac{\sum f (X - \bar{X})^2}{n - 1}} = \sqrt{180.15} = 13.42$
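Solution 2:
$s^2 = \frac{\sum fX^2 - \frac{(\sum fX)^2}{n}}{n - 1} = \frac{129{,}761 - \frac{(2{,}459)^2}{50}}{50 - 1} = \frac{129{,}761 - 120{,}933.62}{49} = 180.15$
$s = \sqrt{\frac{\sum fX^2 - \frac{(\sum fX)^2}{n}}{n - 1}} = \sqrt{180.15} = 13.42$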
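The grouped shortcut formula is easy to mistype by hand, so a quick numerical check is useful. The sketch below recomputes the three sums from the midpoints and frequencies in the table; the list name `midpoint_freq` is an illustrative choice.

```python
import math

# (class midpoint X, frequency f) from the grouped-data example.
midpoint_freq = [(22, 3), (31, 5), (40, 9), (49, 14), (58, 11), (67, 6), (76, 2)]

n = sum(f for _, f in midpoint_freq)               # 50
sum_fx = sum(f * x for x, f in midpoint_freq)      # 2,459
sum_fx2 = sum(f * x * x for x, f in midpoint_freq)  # 129,761

# Shortcut formula: (sum fX^2 - (sum fX)^2 / n) / (n - 1).
grouped_var = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
grouped_sd = math.sqrt(grouped_var)

print(round(grouped_var, 2), round(grouped_sd, 2))  # 180.15 13.42
```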
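Population Variance and SD
Population Variance: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
Population SD: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
where $X$ is an individual value, $\mu$ is the population mean, and $N$ is the population size.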
Example for Variance & SD
$\mu = \frac{\sum X}{N} = \frac{55{,}000 + 59{,}500 + 62{,}500 + 57{,}000 + 61{,}000}{5} = \frac{295{,}000}{5} = 59{,}000$
Solution for Variance & SD
$X$        $X - \mu$    $(X - \mu)^2$
55,000     -4,000       16,000,000
59,500     500          250,000
62,500     3,500        12,250,000
57,000     -2,000       4,000,000
61,000     2,000        4,000,000
$\sum X = 295{,}000$    $\sum (X - \mu) = 0$    $\sum (X - \mu)^2 = 36{,}500{,}000$
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{36{,}500{,}000}{5} = 7{,}300{,}000$
$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} = \sqrt{7{,}300{,}000} = 2{,}701.85$
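For population data the divisor is N rather than n - 1. Python's standard library provides `statistics.pvariance` and `statistics.pstdev` for this case, so the example can be verified as follows. The list name `population` is an illustrative choice; note that 36,500,000 / 5 evaluates to 7,300,000.

```python
import statistics

# The five population values from the example.
population = [55_000, 59_500, 62_500, 57_000, 61_000]

mu = statistics.mean(population)
pop_var = statistics.pvariance(population)  # divides by N
pop_sd = statistics.pstdev(population)      # square root of the above

# The variance is 7,300,000 and the SD is about 2,701.85.
print(mu, pop_var, round(pop_sd, 2))
```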
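Quartiles, Deciles & Percentiles
Quartiles
  Ungrouped (position): $Q_k = \frac{k(N + 1)}{4}$
  Grouped: $Q_k = LB + \left(\frac{\frac{kN}{4} - cf}{f}\right)(i)$
Deciles
  Ungrouped (position): $D_k = \frac{k(N + 1)}{10}$
  Grouped: $D_k = LB + \left(\frac{\frac{kN}{10} - cf}{f}\right)(i)$
Percentiles
  Ungrouped (position): $P_k = \frac{k(N + 1)}{100}$
  Grouped: $P_k = LB + \left(\frac{\frac{kN}{100} - cf}{f}\right)(i)$
where $LB$ is the lower boundary of the class containing the ranked value, $cf$ is the cumulative frequency before that class, $f$ is the frequency of that class, and $i$ is the class size.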
Example for Quartiles
Find the 1st, 2nd, and 3rd quartiles of the ages of 9
middle-management employees of a certain
company. The ages are 53, 45, 59, 48, 54, 46, 51, 58,
and 55.
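Solution:
Arrange the ages in ascending order: 45, 46, 48, 51, 53, 54, 55, 58, 59.
Position of $Q_1 = \frac{1(N + 1)}{4} = \frac{1(9 + 1)}{4} = \frac{10}{4} = 2.5$, so $Q_1$ is midway between the 2nd and 3rd values:
$Q_1 = \frac{46 + 48}{2} = \frac{94}{2} = 47$
Position of $Q_2 = \frac{2(9 + 1)}{4} = 5$, so $Q_2$ is the 5th value: $Q_2 = 53$
Position of $Q_3 = \frac{3(9 + 1)}{4} = 7.5$, so $Q_3$ is midway between the 7th and 8th values:
$Q_3 = \frac{55 + 58}{2} = \frac{113}{2} = 56.5$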
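The position rule k(N + 1)/4 can be sketched in Python as below. The function name `quantile_by_position` is hypothetical, and the linear interpolation for fractional positions is an assumption; for a position ending in .5 it coincides with averaging the two neighbouring values, as done in the solution.

```python
ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]


def quantile_by_position(data, k, parts):
    """Value at 1-based position k(N + 1)/parts in the sorted data.

    Fractional positions are resolved by linear interpolation between
    the two neighbouring values (an assumption of this sketch).
    """
    ordered = sorted(data)
    pos = k * (len(ordered) + 1) / parts
    lower = int(pos)       # 1-based index of the value at or below pos
    frac = pos - lower
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


q1 = quantile_by_position(ages, 1, 4)
q2 = quantile_by_position(ages, 2, 4)
q3 = quantile_by_position(ages, 3, 4)
print(q1, q2, q3)  # 47.0 53.0 56.5
```

Passing `parts=10` or `parts=100` gives the corresponding decile or percentile positions.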
$Q_3 = LB + \left(\frac{\frac{3N}{4} - cf}{f}\right)(i) = 53.5 + \left(\frac{\frac{3(50)}{4} - 31}{11}\right)(9) = 58.82$
Solution for D7
$D_7 \text{ (Ranked Value)} = \frac{7N}{10} = \frac{7(50)}{10} = 35$
Class Limits    f     cf
18 – 26         3     3
27 – 35         5     8
36 – 44         9     17
45 – 53         14    31    (cf before the D7 class)
54 – 62         11    42    (D7 class, f = 11)
63 – 71         6     48
72 – 80         2     50
Total           50
LB = 54 - 0.5 = 53.5
$D_7 = LB + \left(\frac{\frac{7N}{10} - cf}{f}\right)(i) = 53.5 + \left(\frac{\frac{7(50)}{10} - 31}{11}\right)(9) = 56.77$
Solution for P22
$P_{22} \text{ (Ranked Value)} = \frac{22N}{100} = \frac{22(50)}{100} = 11$
Class Limits    f     cf
18 – 26         3     3
27 – 35         5     8     (cf before the P22 class)
36 – 44         9     17    (P22 class, f = 9)
45 – 53         14    31
54 – 62         11    42
63 – 71         6     48
72 – 80         2     50
Total           50
LB = 36 - 0.5 = 35.5
$P_{22} = LB + \left(\frac{\frac{22N}{100} - cf}{f}\right)(i) = 35.5 + \left(\frac{\frac{22(50)}{100} - 8}{9}\right)(9) = 38.5$
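The grouped formula follows the same three steps each time: find the ranked value kN/parts, locate the class whose cumulative frequency first reaches it, then interpolate inside that class. A minimal sketch, assuming integer class limits so that the lower boundary is the lower limit minus 0.5 (the helper name `grouped_quantile` is hypothetical):

```python
# (lower class limit, frequency) for the distribution used above; class size i = 9.
distribution = [(18, 3), (27, 5), (36, 9), (45, 14), (54, 11), (63, 6), (72, 2)]


def grouped_quantile(rows, k, parts, width):
    """LB + ((kN/parts - cf) / f) * i for the class containing the ranked value."""
    n = sum(f for _, f in rows)
    target = k * n / parts          # ranked value
    cf = 0                          # cumulative frequency before the current class
    for lower_limit, f in rows:
        if cf + f >= target:
            lb = lower_limit - 0.5  # lower boundary, assuming integer limits
            return lb + (target - cf) / f * width
        cf += f
    raise ValueError("ranked value exceeds the total frequency")


q3_grouped = grouped_quantile(distribution, 3, 4, 9)
d7 = grouped_quantile(distribution, 7, 10, 9)
p22 = grouped_quantile(distribution, 22, 100, 9)
print(round(q3_grouped, 2), round(d7, 2), round(p22, 2))  # 58.82 56.77 38.5
```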
Midhinge
$\text{Midhinge} = \frac{Q_1 + Q_3}{2}$
Example for Midhinge
Solution:
$\text{Midhinge} = \frac{Q_1 + Q_3}{2} = \frac{47 + 56.5}{2} = \frac{103.5}{2} = 51.75$
Solution:
$\text{Quartile Deviation (QD)} = \frac{Q_3 - Q_1}{2} = \frac{56.5 - 47}{2} = \frac{9.5}{2} = 4.75$
The QD is 4.75.
Coefficient of Variation (CV)
Sample: $CV = \frac{s}{\bar{X}}(100\%)$
Population: $CV = \frac{\sigma}{\mu}(100\%)$
Example 1 for Coefficient of Variation
The average age of the engineers at VSAS Pipeline
Corporation is 33 years, with a standard deviation of
3; the average monthly salary of the engineers is
₱45,000, with a standard deviation of ₱3,150.
Determine the coefficient of variation of age and of
salary.
Solution:
Age: $CV = \frac{s}{\bar{X}}(100\%) = \frac{3}{33}(100\%) \approx 9.09\%$
Salary: $CV = \frac{s}{\bar{X}}(100\%) = \frac{3{,}150}{45{,}000}(100\%) = 7\%$
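Because the CV is unit-free, it allows age (in years) and salary (in pesos) to be compared directly. A small sketch with a hypothetical helper `coefficient_of_variation`; evaluating 3/33 gives about 9.09%, and 3,150/45,000 gives 7%, so age is relatively more variable than salary.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100


cv_age = coefficient_of_variation(3, 33)
cv_salary = coefficient_of_variation(3_150, 45_000)

print(round(cv_age, 2), round(cv_salary, 2))  # 9.09 7.0
```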
Solution:
Commission: $CV = \frac{s}{\bar{X}}(100\%) = \frac{1{,}350}{12{,}500}(100\%) = 10.8\%$
Sales: $CV = \frac{s}{\bar{X}}(100\%) = \frac{\sqrt{56.25}}{60}(100\%) = \frac{7.5}{60}(100\%) = 12.5\%$
$k = \frac{3{,}000}{1{,}500} = 2.0$
$1 - \frac{1}{k^2} = 1 - \frac{1}{2.0^2} = 1 - \frac{1}{4} = 1 - 0.25 = 0.75$
Therefore, at least 75% of the data values will fall
between ₱22,000 and ₱28,000.
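The bound 1 - 1/k² used here is Chebyshev's theorem, which holds for any distribution when k > 1. The sketch below reproduces the arithmetic; the mean of 25,000 is not stated in the surviving text and is inferred as the midpoint of the interval 22,000 to 28,000, and the function name is illustrative.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


mean = 25_000   # assumption: midpoint of the stated interval
sd = 1_500      # from the solution above
upper = 28_000

k = (upper - mean) / sd
proportion = chebyshev_lower_bound(k)
print(k, proportion)  # 2.0 0.75
```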
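Empirical Rule
For a bell-shaped (normal) distribution, about 68% of the values lie within 1 standard deviation of the mean, about 95% within 2 standard deviations, and about 99.7% within 3 standard deviations.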
Kurtosis
$kurt = \left\{\left[\frac{n(n + 1)}{(n - 1)(n - 2)(n - 3)}\right]\left[\sum_{i=1}^{n}\left(\frac{X_i - \bar{X}}{s}\right)^4\right]\right\} - \frac{3(n - 1)^2}{(n - 2)(n - 3)}$
Three Types of Kurtosis
Leptokurtic
Mesokurtic
Platykurtic
Example for Kurtosis
Solution:
Compute the mean:
$\bar{X} = \frac{\sum X}{n} = \frac{550 + 420 + 560 + 500 + 700 + 670 + 860 + 480}{8} = \frac{4{,}740}{8} = 592.50$
Compute the standard deviation, using $\sum (X - \bar{X})^2 = 142{,}950$ from the table below:
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{142{,}950}{8 - 1}} = \sqrt{20{,}421.43} = 142.90$
$X$      $X - \bar{X}$    $(X - \bar{X})^2$    $\left(\frac{X - \bar{X}}{s}\right)^4$
550      -42.5            1,806.25             0.0078
420      -172.5           29,756.25            2.1232
560      -32.5            1,056.25             0.0027
500      -92.5            8,556.25             0.1755
700      107.5            11,556.25            0.3202
670      77.5             6,006.25             0.0865
860      267.5            71,556.25            12.2779
480      -112.5           12,656.25            0.3841
Total    4,740    0       142,950              15.3779
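Solution (continuation)
$kurt = \left\{\left[\frac{n(n + 1)}{(n - 1)(n - 2)(n - 3)}\right]\left[\sum_{i=1}^{n}\left(\frac{X_i - \bar{X}}{s}\right)^4\right]\right\} - \frac{3(n - 1)^2}{(n - 2)(n - 3)}$
$= \left[\frac{8(9)}{(7)(6)(5)}\right](15.3779) - \frac{3(7)^2}{(6)(5)} \approx 5.27 - 4.90 = 0.37$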
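The kurtosis formula has several factors that are easy to misplace, so evaluating it in code is a useful check. A minimal sketch, with the function name `sample_kurtosis` chosen for illustration:

```python
import math

values = [550, 420, 560, 500, 700, 670, 860, 480]


def sample_kurtosis(data):
    """Kurtosis by the formula above, with s the sample SD (n - 1 divisor)."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    fourth_powers = sum(((x - mean) / s) ** 4 for x in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth_powers - correction


kurt = sample_kurtosis(values)
print(round(kurt, 2))  # 0.37
```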
$sk = \frac{3(\bar{X} - \text{median})}{s}$
$sk = \frac{n}{(n - 1)(n - 2)}\left[\sum\left(\frac{X - \bar{X}}{s}\right)^3\right]$
Example for Pearson’s Coefficient of Skewness
Solution:
Compute the mean:
$\bar{X} = \frac{\sum X}{n} = \frac{550 + 420 + 560 + 500 + 700 + 670 + 860 + 480}{8} = \frac{4{,}740}{8} = 592.50$
Solution (continuation)
$X$      $X - \bar{X}$    $(X - \bar{X})^2$
550      -42.5            1,806.25
420      -172.5           29,756.25
560      -32.5            1,056.25
500      -92.5            8,556.25
700      107.5            11,556.25
670      77.5             6,006.25
860      267.5            71,556.25
480      -112.5           12,656.25
$\sum X = 4{,}740$    $\sum (X - \bar{X}) = 0$    $\sum (X - \bar{X})^2 = 142{,}950$
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{142{,}950}{8 - 1}} = \sqrt{20{,}421.43} = 142.90$
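Solution (continuation)
Add a column for $\left(\frac{X - \bar{X}}{s}\right)^3$ to the table of $X$, $X - \bar{X}$, and $(X - \bar{X})^2$; its total is 4.59.
$sk = \frac{n}{(n - 1)(n - 2)}\left[\sum\left(\frac{X - \bar{X}}{s}\right)^3\right] = \frac{8}{(8 - 1)(8 - 2)}(4.59) = 0.19047619(4.59) = 0.87$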
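Both skewness formulas given earlier can be evaluated on the same eight values. The sketch below is illustrative only (the variable names are not from the slides); it reproduces the 0.87 obtained above and also shows what the median-based formula gives for comparison.

```python
import math
import statistics

values = [550, 420, 560, 500, 700, 670, 860, 480]
n = len(values)
mean = sum(values) / n
s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

# Formula based on cubed standardized deviations.
sum_cubes = sum(((x - mean) / s) ** 3 for x in values)
sk_moment = n / ((n - 1) * (n - 2)) * sum_cubes

# Formula based on the mean and the median: 3(mean - median) / s.
median = statistics.median(values)
sk_median = 3 * (mean - median) / s

print(round(sum_cubes, 2), round(sk_moment, 2), round(sk_median, 2))  # 4.59 0.87 0.79
```

Both values are positive, indicating a longer right tail; the two formulas measure skewness differently and need not agree numerically.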
[Figure: boxplot marking X_lowest, Q1, Q2 = Median, Q3, and X_highest on a scale from 0 to 60]
Example for Boxplot
Solution:
Recall that Q1 = 47, Median = 53, and Q3 = 56.5
Lowest Value = 45
Highest Value = 59
Solution for Boxplot
Summary Measure       Added Constant ($Y_i = X_i + k$)    Constant Multiplier ($Y_i = hX_i$)
Range                 Unaffected                          Range(Y) = h Range(X)
Standard Deviation    Unaffected                          s(Y) = |h| s(X)
Quartiles             $Q_i(Y) = Q_i(X) + k$               $Q_i(Y) = h\,Q_i(X)$
IQR                   Unaffected                          IQR(Y) = h IQR(X)
Example 1
Summary Statistics
Range 22
Standard deviation 7.33
First quartile 64
Third quartile 72
Interquartile range 14
Summary Statistics (after adding the constant k = 2)
Range 22 (unaffected)
Standard deviation 7.33 (unaffected)
First quartile 64 + 2 = 66
Third quartile 72 + 2 = 74
Interquartile range 14 (unaffected)
Example 2
Summary Statistics
Range 700
Standard deviation 222.28
First quartile 1,100
Third quartile 1,400
Interquartile range 300
Summary Statistics (after multiplying by h = 1.05)
Range 1.05(700) = 735
Standard deviation 1.05(222.28) = 233.39
First quartile 1.05(1,100) = 1,155
Third quartile 1.05(1,400) = 1,470
Interquartile range 1.05(300) = 315
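The two transformation rules can be verified numerically on any data set. The sketch below uses a small hypothetical list of values (not the data behind the examples, which are not shown) and the standard library's `statistics.quantiles`, whose default method uses (N + 1)-based positions.

```python
import math
import statistics

# Hypothetical data, used only to illustrate the rules.
x = [60, 64, 66, 70, 72, 75, 82]


def summary(data):
    """Range, sample SD, first and third quartiles, and IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "range": max(data) - min(data),
        "sd": statistics.stdev(data),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }


k, h = 2, 1.05
base = summary(x)
shifted = summary([v + k for v in x])   # Y = X + k
scaled = summary([h * v for v in x])    # Y = hX

# Adding k moves the quartiles by k and leaves the spread measures alone.
print(shifted["q1"] - base["q1"], shifted["range"] - base["range"])
# Multiplying by h multiplies every measure by h (for h > 0).
print(round(scaled["sd"] / base["sd"], 2), round(scaled["iqr"] / base["iqr"], 2))
```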
“Success doesn’t come to you, you go to it.”
– Ben Francia