Lecture Week 3
Fractiles are measures of location or position which include not only central location but also any position
based on the number of equal divisions in a given distribution. If we divide the distribution into four equal
divisions, then we have quartiles denoted by Q1, Q2, Q3, and Q4. The most commonly used fractiles are the
quartiles, deciles, and percentiles.
Q3 is the 3rd quartile: $Q_3 = \dfrac{3N}{4}$th item.
This means that 75% of the observations lie below this value.
$Q_2 = \dfrac{2N}{4} = \dfrac{2(23)}{4} = \dfrac{23}{2} = 11.5$th item, which is 100.
This means that the score of 100 is higher than 50% of the items in the distribution.
$Q_3 = \dfrac{3N}{4} = \dfrac{3(23)}{4} = \dfrac{69}{4} = 17.25$th item, which is 102.
$D_1 = \dfrac{N}{10} = \dfrac{23}{10} = 2.3$th item, which is 95.
$D_4 = \dfrac{4N}{10} = \dfrac{4(23)}{10} = \dfrac{92}{10} = 9.2$th item, which is 100.
$D_5 = \dfrac{5N}{10} = \dfrac{5(23)}{10} = \dfrac{115}{10} = 11.5$th item, which is 100.
$D_7 = \dfrac{7N}{10} = \dfrac{7(23)}{10} = \dfrac{161}{10} = 16.1$th item, which is 102.
$P_{10} = \dfrac{10N}{100} = \dfrac{10(23)}{100} = \dfrac{23}{10} = 2.3$th item, which is 95.
$P_{25} = \dfrac{25N}{100} = \dfrac{25(23)}{100} = \dfrac{23}{4} = 5.75$th item, which is 98.
$P_{50} = \dfrac{50N}{100} = \dfrac{50(23)}{100} = \dfrac{23}{2} = 11.5$th item, which is 100.
$P_{70} = \dfrac{70N}{100} = \dfrac{70(23)}{100} = \dfrac{1{,}610}{100} = 16.1$th item, which is 102.
Note that the median is equal to Q2, D5, and P50.
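The positions used above all follow one rule: the kth fractile of N ordered items sits at position kN divided by the number of equal divisions (4, 10, or 100). A minimal Python sketch of that arithmetic, assuming N = 23 as in the worked values; the helper name `fractile_position` is hypothetical and not part of the lecture:

```python
def fractile_position(k, n, divisions):
    """Position (k * n / divisions) of the k-th fractile among n ordered items."""
    return k * n / divisions


n = 23  # number of observations in the worked values above

print("Q3 position:", fractile_position(3, n, 4))       # 3N/4
print("D7 position:", fractile_position(7, n, 10))      # 7N/10
print("P25 position:", fractile_position(25, n, 100))   # 25N/100

# Q2, D5, and P50 share one position, which is why all three equal the median.
median_positions = {
    fractile_position(2, n, 4),
    fractile_position(5, n, 10),
    fractile_position(50, n, 100),
}
print("Median positions:", median_positions)
```

The sketch only locates the item; reading off the value still requires the ordered data.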
Quartiles
$Q_k = LL + i\left(\dfrac{\frac{kN}{4} - cf}{f_m}\right)$
where Qk = kth quartile
LL = lower class boundary of the kth quartile class
cf = less than cumulative frequency below the kth quartile class
fm = frequency of the kth quartile class
i = class size
N = total number of observations
Deciles
$D_k = LL + i\left(\dfrac{\frac{kN}{10} - cf}{f_m}\right)$
where Dk = kth decile
LL = lower class boundary of the kth decile class
cf = less than cumulative frequency below the kth decile class
fm = frequency of the kth decile class
i = class size
N = total number of observations
Percentiles
$P_k = LL + i\left(\dfrac{\frac{kN}{100} - cf}{f_m}\right)$
where Pk = kth percentile
LL = lower class boundary of the kth percentile class
cf = less than cumulative frequency below the kth percentile class
fm = frequency of the kth percentile class
i = class size
N = total number of observations
Example 2:
Find the Q1, D6, and P95 of the data in table 1.
Table 1
Weights of 50 Pieces of Jackfruits Sold in Supermarket Y
Weights (in pounds)    No. of Pieces    <CF
50 – 54                5                5
55 – 59                19               24
60 – 64                22               46
65 – 69                3                49
70 – 74                1                50
Solution:
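1. $Q_1 = \dfrac{N}{4} = \dfrac{50}{4} = 12.5$th item.
Thus, the 1st quartile class is 55 – 59 since it is where the 12.5th item is found. LL = 54.5, cf = 5, fm = 19, i = 5, and N = 50.
$Q_1 = LL + i\left(\dfrac{\frac{N}{4} - cf}{f_m}\right)$
$= 54.5 + 5\left(\dfrac{12.5 - 5}{19}\right)$
$= 54.5 + 1.97$
$= 56.47$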
2. $D_6 = \dfrac{6N}{10} = \dfrac{6(50)}{10} = 6(5) = 30$th item.
Thus, the 6th decile class is 60 – 64 since it is where the 30th item is found. LL = 59.5, cf = 24, fm = 22, i = 5, and N = 50.
$D_6 = LL + i\left(\dfrac{\frac{6N}{10} - cf}{f_m}\right)$
$= 59.5 + 5\left(\dfrac{30 - 24}{22}\right)$
$= 59.5 + 1.36$
$= 60.86$
3. $P_{95} = \dfrac{95N}{100} = \dfrac{95(50)}{100} = \dfrac{95}{2} = 47.5$th item.
Thus, the 95th percentile class is 65 – 69. LL = 64.5, cf = 46, fm = 3, and i = 5.
$P_{95} = LL + i\left(\dfrac{\frac{95N}{100} - cf}{f_m}\right)$
$= 64.5 + 5\left(\dfrac{47.5 - 46}{3}\right)$
$= 64.5 + 2.5$
$= 67$
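The three grouped-data formulas differ only in the divisor (4, 10, or 100), so one routine can cover them. The following is a sketch under stated assumptions rather than a definitive implementation: class limits are integers (so the lower class boundary is the lower limit minus 0.5), and `grouped_fractile` is a hypothetical helper name. It uses the jackfruit frequencies from Table 1.

```python
def grouped_fractile(classes, k, divisions):
    """Interpolate the k-th fractile from (lower_limit, upper_limit, frequency) rows.

    Applies LL + i * ((k*N/divisions - cf) / fm), assuming integer class
    limits so that the lower class boundary is lower_limit - 0.5.
    """
    n = sum(freq for _, _, freq in classes)
    target = k * n / divisions          # position kN/4, kN/10, or kN/100
    cf = 0                              # cumulative frequency below the class
    for lower, upper, freq in classes:
        if cf + freq >= target:         # first class whose <CF reaches the target
            width = upper - lower + 1   # class size i
            return (lower - 0.5) + width * (target - cf) / freq
        cf += freq
    raise ValueError("fractile position exceeds the total frequency")


# Table 1: weights of 50 jackfruits as (lower limit, upper limit, frequency)
jackfruit = [(50, 54, 5), (55, 59, 19), (60, 64, 22), (65, 69, 3), (70, 74, 1)]

q1 = grouped_fractile(jackfruit, 1, 4)
d6 = grouped_fractile(jackfruit, 6, 10)
p95 = grouped_fractile(jackfruit, 95, 100)
print(round(q1, 2), round(d6, 2), round(p95, 2))
```

Rounded to two decimals, the results agree with the hand computations for Q1, D6, and P95.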
Measures of Variation
Events of nature vary from time to time. People keep on changing their location, motion, physical
appearance, skin reaction to different chemicals, height, weight, hair color, eye color, ideas, and even values in
life. Usually, the heights of a group of people of the same race tend to cluster around a common value. For
example, if the mean height of Filipino males is approximately 5 feet and 6 inches, then this means that most
Filipino male adults have heights that are clustering about this value. The extent of the clustering of the heights of
the Filipino males about a central value is known as variation. The measures of variation will enable you to know
how varied the observations are, whether there are extreme values in the distribution, or whether the values are
very close to each other. If the measure of variation is zero, it means that there is no variation at all and that the
observations are all alike, or homogeneous. Otherwise, they are heterogeneous. The common measures of
variation are the range, mean absolute deviation, variance, standard deviation, coefficient of variation, quartile
deviation, and the percentile range.
Range
The range is the simplest form of measuring the variation of a distribution. To get the range, subtract the
lowest score or observation from the highest score.
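As a quick illustration, the range can be computed in one line. This sketch assumes the seven ages that these notes use in their mean absolute deviation example; the variable names are illustrative only.

```python
# Ages taken from the mean absolute deviation example in these notes
ages = [34, 35, 45, 56, 32, 25, 40]

# Range = highest observation minus lowest observation
data_range = max(ages) - min(ages)
print(data_range)
```

Because only the two extreme values enter the calculation, a single outlier can change the range considerably.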
Mean Absolute Deviation
$MAD = \dfrac{\sum |x - \bar{x}|}{N}$ (for ungrouped data)
where MAD = mean absolute deviation
x = raw score
x̄ = mean score
N = number of observations
$MAD = \dfrac{\sum f|x - \bar{x}|}{N}$ (for grouped data)
where MAD = mean absolute deviation
f = frequency
x = class mark
x̄ = mean score
N = number of observations
Example 2:
Take the MAD of the ages of the scientists in example 1.
Solution:
The ages are 34, 35, 45, 56, 32, 25, and 40.
Mean Age: $\bar{x} = \dfrac{34 + 35 + 45 + 56 + 32 + 25 + 40}{7} = \dfrac{267}{7} = 38.14$
x        x − x̄      |x − x̄|
34       -4.14       4.14
35       -3.14       3.14
45       6.86        6.86
56       17.86       17.86
32       -6.14       6.14
25       -13.14      13.14
40       1.86        1.86
Total                53.14
$MAD = \dfrac{53.14}{7} = 7.59$
Therefore, the mean absolute deviation is 7.59.
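The table-based computation can be checked with a short Python sketch. The function name `mean_absolute_deviation` is a hypothetical helper; it applies the ungrouped formula directly to the seven ages.

```python
def mean_absolute_deviation(values):
    """Average of the absolute deviations from the mean (ungrouped data)."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


ages = [34, 35, 45, 56, 32, 25, 40]
mad = mean_absolute_deviation(ages)
print(round(mad, 2))
```

The code keeps the unrounded mean internally, so small differences from a hand computation that rounds the mean to 38.14 first are possible in other data sets.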
Variance
Variance is another measure of variation which can be used instead of the range. The variance considers
the deviation of each observation from the mean. To obtain the variance of a distribution, first, square the
deviation from the mean of each raw score and add them together. Then, divide the resulting sum by N or the
total number of cases.
1. Population Variance for Ungrouped Data
$\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$
where σ² = population variance
x = raw score
μ = population mean
N = number of observations
2. Population Variance for Grouped Data
$\sigma^2 = \dfrac{\sum f(x - \mu)^2}{N}$
where σ² = population variance
f = frequency
x = class mark
μ = population mean
N = number of observations
Example 3:
Find the population and sample variances of the following distribution: 34, 35, 45, 56, 32, 25, and 40
Solution:
$\bar{x} = \dfrac{267}{7} = 38.14$
x            |x − x̄|     (x − x̄)²
34           4.14        17.1396
35           3.14        9.8596
45           6.86        47.0596
56           17.86       318.9796
32           6.14        37.6996
25           13.14       172.6596
40           1.86        3.4596
Total 267    53.14       606.86
1. Population Variance
$\sigma^2 = \dfrac{\sum (x - \mu)^2}{N} = \dfrac{606.86}{7} = 86.7$
2. Sample Variance
$V = \dfrac{\sum (x - \bar{x})^2}{N - 1} = \dfrac{606.86}{6} = 101.14$
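The two variances differ only in the divisor, N for the population and N − 1 for the sample. A minimal Python sketch for the same seven values, with `variances` as a hypothetical helper name:

```python
def variances(values):
    """Return (population variance, sample variance) for ungrouped data."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)  # sum of squared deviations
    return sum_sq / n, sum_sq / (n - 1)


data = [34, 35, 45, 56, 32, 25, 40]
population_var, sample_var = variances(data)
print(round(population_var, 2), round(sample_var, 2))
```

The population value prints as about 86.69, which the worked solution rounds to 86.7.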
Example 4:
Compute for the population and sample variances for the data in table 1.
Table 1
IQ Scores
IQ Scores    f     x      fx        x²        fx²        f(x − x̄)²
75 – 79      10    77     770       5,929     59,290     1,876.9
80 – 84      12    82     984       6,724     80,688     908.28
85 – 89      25    87     2,175     7,569     189,225    342.25
90 – 94      34    92     3,128     8,464     287,776    57.46
95 – 99      19    97     1,843     9,409     178,771    754.11
100 – 104    15    102    1,530     10,404    156,060    1,915.35
N = 115                   10,430              951,810    5,854.35
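Solution:
$\bar{x} = \dfrac{10{,}430}{115} = 90.70$
Sample Variance
$V = \dfrac{N\sum fx^2 - \left(\sum fx\right)^2}{N(N - 1)}$
$= \dfrac{115(951{,}810) - (10{,}430)^2}{115(115 - 1)}$
$= \dfrac{109{,}458{,}150 - 108{,}784{,}900}{13{,}110}$
$= 51.35$
Population Variance
$\sigma^2 = \dfrac{\sum f(x - \mu)^2}{N}$
$= \dfrac{5{,}854.35}{115}$
$= 50.91$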
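For grouped data, the same quantities can be computed from the class marks and frequencies. This is a sketch assuming each class is represented by its class mark, as in the table of IQ scores; `grouped_variances` is a hypothetical helper name.

```python
def grouped_variances(class_marks, freqs):
    """Return (population variance, sample variance) for grouped data."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(class_marks, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(class_marks, freqs))
    # Sample variance via the computing formula [N*sum(fx^2) - (sum fx)^2] / [N(N - 1)]
    sample = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
    # Population variance via squared deviations from the mean, divided by N
    mean = sum_fx / n
    population = sum(f * (x - mean) ** 2 for x, f in zip(class_marks, freqs)) / n
    return population, sample


iq_marks = [77, 82, 87, 92, 97, 102]
iq_freqs = [10, 12, 25, 34, 19, 15]
iq_population_var, iq_sample_var = grouped_variances(iq_marks, iq_freqs)
print(round(iq_population_var, 2), round(iq_sample_var, 2))
```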
Standard Deviation
The standard deviation, σ for a population or s for a sample (written SD in the formulas below), is the square root of the variance.
In symbols,
Population Standard Deviation (σ)
$\sigma = \sqrt{\sigma^2}$
Sample Standard Deviation (SD)
$SD = \sqrt{V}$
Unless specified, the sample standard deviation will be used in all the examples and exercises throughout the
book.
Example 5:
Compute for the population and sample standard deviations for the data in table 1.
Solution:
Population Variance
$\sigma^2 = 50.91$
Therefore, the value of the population standard deviation is
$\sigma = \sqrt{50.91} = 7.14$
Sample Variance
V = 51.35
The sample standard deviation is
$SD = \sqrt{51.35} = 7.17$
Example 6:
Find the standard deviation for the distribution in table 2.
Table 2
Scores in the Statistics Final Exam
Class Interval    f      x     fx        fx²
27 – 29           12     28    336       9,408
30 – 32           23     31    713       22,103
33 – 35           60     34    2,040     69,360
36 – 38           45     37    1,665     61,605
39 – 41           51     40    2,040     81,600
42 – 44           75     43    3,225     138,675
45 – 47           28     46    1,288     59,248
48 – 50           33     49    1,617     79,233
51 – 53           18     52    936       48,672
54 – 56           10     55    550       30,250
Total             355          14,410    600,154
$\bar{x} = \dfrac{14{,}410}{355} = 40.59$
$V = \dfrac{355(600{,}154) - (14{,}410)^2}{355(355 - 1)}$
$= \dfrac{5{,}406{,}570}{125{,}670}$
$= 43.02$
$SD = \sqrt{43.02}$
$= 6.56$
Therefore, the standard deviation of the score is 6.56.
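The column totals and the final result for Table 2 can be reproduced with a few lines of Python. This is a sketch that uses the sample-variance computing formula from the solution; the variable names are illustrative only.

```python
import math

# Table 2: class marks and frequencies of the Statistics final exam scores
marks = [28, 31, 34, 37, 40, 43, 46, 49, 52, 55]
freqs = [12, 23, 60, 45, 51, 75, 28, 33, 18, 10]

n = sum(freqs)                                          # total frequency N
sum_fx = sum(f * x for x, f in zip(marks, freqs))       # column total of fx
sum_fx2 = sum(f * x * x for x, f in zip(marks, freqs))  # column total of fx^2

# Sample variance, then its square root as the sample standard deviation
variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
sd = math.sqrt(variance)
print(n, sum_fx, sum_fx2, round(variance, 2), round(sd, 2))
```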
Coefficient of Variation
When it is necessary to compare the variability of two or more groups, the task is easy if the means are
the same. For example, you can easily compare which group is more varied in height between the following
groups:
Group 1: mean = 156 cm, standard deviation = 6 cm
Group 2: mean = 156 cm, standard deviation = 10 cm
Clearly, one can say that Group 2 is more varied because it has a higher standard deviation. The task
becomes more difficult if the means are not equal and the units are different, such as when comparing the weights
of two groups belonging to different age brackets or different genders. To compare the variability of the weights of
9 girls, having a mean weight of 100 pounds and a standard deviation of 5 with that of the weight of 12 boys
having a mean of 160 pounds and a standard deviation of 8, a statistic called the coefficient of variation could help
you. The formula is given by:
$CV = \dfrac{SD}{\bar{x}} \times 100\%$
where SD = standard deviation
x̄ = mean
Since the standard deviation and the mean have the same units, their units cancel out, so CV has no unit.
Example 7:
Suppose two groups of students are to be compared in terms of height.
Group Mean Height Standard Deviation CV
Male 162 cm 10 cm 6.17%
Female 148 cm 4 cm 2.70%
Solution:
Male $CV = \dfrac{10}{162} \times 100\% = 6.17\%$
Female $CV = \dfrac{4}{148} \times 100\% = 2.70\%$
Comparing the relative variations in height of the male and female students, it can be seen that the heights of the
male students have a higher coefficient of variation than those of the female students. Thus, the male students’
heights are more varied.
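The comparison above can be sketched in Python. The function name `coefficient_of_variation` is a hypothetical helper; it simply divides the standard deviation by the mean and expresses the ratio as a percentage.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean (unitless)."""
    return sd / mean * 100


male_cv = coefficient_of_variation(10, 162)    # mean 162 cm, SD 10 cm
female_cv = coefficient_of_variation(4, 148)   # mean 148 cm, SD 4 cm
print(round(male_cv, 2), round(female_cv, 2))
print("More varied group:", "male" if male_cv > female_cv else "female")
```

Because the result is unitless, the same function works when the two groups are measured in different units.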
Example 8:
Compare the variability of the heights and weights of the students given in the following data:
                      x̄         s        CV
Height (in cm)        168 cm    12 cm    7.14%
Weight (in pounds)    200 lb    20 lb    10.00%
From the results, it can be seen that the weights of the students are more varied than the heights.
Quartile Deviation
The quartile deviation is another way of determining the spread of a distribution in terms of quartiles. The
quartile deviation formula is shown below:
$QD = \dfrac{Q_3 - Q_1}{2}$
where QD = quartile deviation
Q3 = 3rd quartile
Q1 = 1st quartile
Example 9:
Find the QD of the following scores:
23 25 25 30 35 39 40 44 47 51 60
Solution:
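Using the distribution in table 2 (N = 355):
For Q1: $\dfrac{N}{4} = \dfrac{355}{4} = 88.75$; hence, LL = 32.5, cf = 35, i = 3, and fm = 60.
$Q_1 = 32.5 + 3\left(\dfrac{88.75 - 35}{60}\right) = 32.5 + 2.69 = 35.19$
For Q3: $\dfrac{3N}{4} = \dfrac{3(355)}{4} = 266.25$; hence, LL = 44.5, cf = 266, i = 3, and fm = 28.
$Q_3 = 44.5 + 3\left(\dfrac{266.25 - 266}{28}\right) = 44.5 + 0.027 = 44.53$
$QD = \dfrac{Q_3 - Q_1}{2} = \dfrac{44.53 - 35.19}{2} = \dfrac{9.34}{2} = 4.67$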
Hence, the quartile deviation is 4.67.
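The figures in this solution (Q1 = 35.19 and Q3 = 44.53) correspond to the frequencies of Table 2, the Statistics final exam scores. A sketch that reproduces them, assuming integer class limits and using a hypothetical helper named `grouped_quartile`:

```python
def grouped_quartile(classes, k):
    """k-th quartile from (lower_limit, upper_limit, frequency) rows.

    Applies LL + i * ((k*N/4 - cf) / fm), assuming integer class limits so
    that the lower class boundary is lower_limit - 0.5.
    """
    n = sum(freq for _, _, freq in classes)
    target = k * n / 4                  # position kN/4
    cf = 0                              # cumulative frequency below the class
    for lower, upper, freq in classes:
        if cf + freq >= target:         # quartile class found
            width = upper - lower + 1   # class size i
            return (lower - 0.5) + width * (target - cf) / freq
        cf += freq
    raise ValueError("quartile position exceeds the total frequency")


table2 = [(27, 29, 12), (30, 32, 23), (33, 35, 60), (36, 38, 45), (39, 41, 51),
          (42, 44, 75), (45, 47, 28), (48, 50, 33), (51, 53, 18), (54, 56, 10)]

q1 = grouped_quartile(table2, 1)
q3 = grouped_quartile(table2, 3)
qd = (q3 - q1) / 2
print(round(q1, 2), round(q3, 2), round(qd, 2))
```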
Percentile Range
The percentile range, PR, is the difference between the 90th percentile (P90) and the 10th percentile (P10). In
symbols,
PR = P90 – P10
Example 12:
The following data represent the scores of students in a Physics final examination:
100 100 111 111 112 120 121 122 123
130 132 133 135 140 145 145 146 150
150 155 160 164 165 165 170 171 175 180
Calculate the percentile range of the scores.
Solution: