Lesson 5
Quartiles, deciles, and percentiles are measures of position that divide a distribution into equal parts, or subgroups.
Percentiles divide the distribution into one hundred parts.
Deciles divide the distribution into ten subgroups.
Quartiles divide the distribution into four subgroups.
These measures are also useful for locating the position of an individual within a group.
They may also be used to categorize data.
When presenting or analysing a data set, it is sometimes important to group subjects into several equal
groups.
For example, to create four equal groups we need the values that split the data such that 25% of the
observations are in each group.
The cut-off points are called quartiles, and there are three (3) of them (the middle one is also called
the median).
20, 22, 25, 27, 28, 29, 30, 31, 33, 35, 39, 40
Other values likely to be encountered are deciles, which split the data into 10 parts, and percentiles
(also called centiles), which split the data into 100 parts.
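The idea of four equal groups described above can be illustrated with a short Python sketch. The function name `split_into_equal_groups` is hypothetical, and the sketch assumes the number of observations is an exact multiple of the number of groups, as it is for the 12 values listed earlier.

```python
# Split an ordered data set into equal-size groups (four groups = quarters).
data = [20, 22, 25, 27, 28, 29, 30, 31, 33, 35, 39, 40]


def split_into_equal_groups(values, parts):
    """Return `parts` consecutive groups of equal size from the sorted values.

    Assumption: len(values) is an exact multiple of `parts`.
    """
    ordered = sorted(values)
    size, remainder = divmod(len(ordered), parts)
    if remainder:
        raise ValueError("data cannot be split into equal groups")
    return [ordered[i * size:(i + 1) * size] for i in range(parts)]


groups = split_into_equal_groups(data, 4)
for number, group in enumerate(groups, start=1):
    print(f"Group {number}: {group}")
```

Each group holds 3 of the 12 observations, that is, 25% of the data.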
Quartiles
20, 22, 25, 27, 28, 29, 30, 31, 33, 35, 39, 40
1st quartile, Q1, or 25th percentile—the number that separates the lowest 25% of the group
from the highest 75% of the group.
2nd quartile, Median, or 50th percentile—the number in the middle of the group, when
arranged from smallest to largest.
3rd quartile, Q3, or 75th percentile—the number that separates the lowest 75% of the group
from the highest 25% of the group.
Maximum, or (rarely) “100th percentile”—the largest number in the group.
Quartiles are used to summarize a group of numbers. Instead of looking at a big list of numbers,
you are looking at just a few numbers that give you a picture of what’s going on in the big list.
Quartiles are great for reporting on a set of data and for making box and whisker plots. Quartiles
are especially useful when you’re working with data that isn’t symmetrically distributed, or a data
set that has outliers.
All numerical summaries—like mean, median, and mode—give you a few numbers to summarize
a large group of data, but what’s special about quartiles is that they split the data up into four
equal-size groups.
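For ungrouped data arranged from lowest to highest, the position of each quartile is:
Q1 = n/4
Q2 = 2n/4
Q3 = 3n/4
where n is the number of observations.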
Example: Find the first, second and third quartiles of the ages of 10 middle management employees of a
certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 57, 58, and 55.
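Step 1: Arrange the ages from lowest to highest.
45, 46, 48, 51, 53, 54, 55, 57, 58, 59
Step 2: Locate the position of each quartile.
Q1 = n/4
Q2 = 2n/4
Q3 = 3n/4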
Q1 = 10/4
= 2.5th observation
Since the 2.5th observation falls between the 2nd and 3rd of the ordered ages (46 and 48), we take the average of the two values.
Q1 = (46 + 48)/2
= 94/2
Q1 = 47
Q2 = 2(10)/4
= 20/4
= 5th observation
Q2 = 53
Q3 = 3(10)/4
= 30/4
= 7.5th observation
Since the 7.5th observation falls between the 7th and 8th of the ordered ages (55 and 57), we take the average of the two values.
Q3 = (55 + 57)/2
= 112/2
Q3 = 56
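The positional rule used in this example can be sketched in Python as a check on the arithmetic. The helper names `value_at_position` and `quartile` are hypothetical. Averaging the two neighbouring observations for every fractional position is an assumption, since the lesson only shows it for positions ending in .5.

```python
import math


def value_at_position(ordered, position):
    """Return the value at a 1-based position in an ordered list.

    Whole positions return that observation. Fractional positions return the
    average of the two neighbouring observations (assumption: the lesson only
    demonstrates this for positions ending in .5).
    """
    if position < 1 or position > len(ordered):
        raise ValueError("position is outside the data")
    lower = math.floor(position)
    if position == lower:
        return ordered[lower - 1]
    return (ordered[lower - 1] + ordered[lower]) / 2


def quartile(values, k):
    """k-th quartile (k = 1, 2, 3) located at position k*n/4."""
    ordered = sorted(values)
    return value_at_position(ordered, k * len(ordered) / 4)


ages = [53, 45, 59, 48, 54, 46, 51, 57, 58, 55]
q1, q2, q3 = (quartile(ages, k) for k in (1, 2, 3))
print(q1, q2, q3)
```

Library routines such as Python's `statistics.quantiles` use different interpolation rules, so they can return slightly different values for the same data.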
In descriptive statistics, a box plot or boxplot is a method for graphically depicting groups of
numerical data through their quartiles. Box plots may also have lines extending from the boxes
indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and
box-and-whisker diagram.
Example:
The following is a list of scores resulting from an English examination administered to 40 students:
Scores (x)
91 61 46 62 54
62 93 90 99 76
48 83 59 96 66
94 52 51 59 62
89 100 92 70 59
91 73 68 49 54
85 43 78 50 45
98 69 77 42 46
Solution:
Scores (x), arranged from lowest to highest:
42 43 45 46 46 48 49 50 51 52
54 54 59 59 59 61 62 62 62 66
68 69 70 73 76 77 78 83 85 89
90 91 91 92 93 94 96 98 99 100
Q1 = n/4
= 40/4
= 10th score
Q1 = 52
Q2 = 2n/4
= 2(40)/4
= 80/4
= 20th score
Q2 = 66
Q3 = 3n/4
= 3(40)/4
= 120/4
= 30th score
Q3 = 89
Example: The following 40 values are arranged from lowest to highest.
59 60 61 62 62 63 63 64 64 64 65 65 65 65 65 65 65 65 65 66 66
67 67 68 68 69 70 70 70 70 70 71 71 72 72 73 74 74 75 77
Construct a box plot with the following properties. The calculator
instructions for the minimum and maximum values, as well as the quartiles,
follow the example.
Minimum value =
Maximum value =
Range =
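As a minimal sketch, the minimum, maximum and range of the 40 values above can be read off with Python's built-in `min` and `max`. The variable names are illustrative only.

```python
# The 40 values from the example, already in ascending order.
values = [
    59, 60, 61, 62, 62, 63, 63, 64, 64, 64, 65, 65, 65, 65, 65, 65, 65, 65,
    65, 66, 66, 67, 67, 68, 68, 69, 70, 70, 70, 70, 70, 71, 71, 72, 72, 73,
    74, 74, 75, 77,
]

minimum = min(values)
maximum = max(values)
value_range = maximum - minimum  # range = largest value minus smallest value

print("Minimum value =", minimum)
print("Maximum value =", maximum)
print("Range =", value_range)
```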
Deciles
Deciles are solved in exactly the same way as quartiles, except that the denominator is 10 instead of 4.
Dk = kn/10
Where:
D = decile
k = decile number, from 1, 2, … 9
n = sample size
The following is a list of scores resulting from an English examination administered to 40 students
(arranged in an array from lowest to highest, reading down each column). Solve for D3, D5 and D8.
42 54 68 90
43 54 69 91
45 59 70 91
46 59 73 92
46 59 76 93
48 61 77 94
49 62 78 96
50 62 83 98
51 62 85 99
52 66 89 100
Solution:
D3 = 3n/10
= 3(40)/10
= 120/10
= 12th score
D3 = 54
D5 = 5n/10
= 5(40)/10
= 200/10
= 20th score
D5 = 66
D8 = 8n/10
= 8(40)/10
= 320/10
= 32nd score
D8 = 91
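The decile positions above can be checked with a short Python sketch. The function name `decile` is hypothetical. It handles only whole-number positions, which is all this example needs, and it uses the 40 raw examination scores listed earlier in the lesson.

```python
# The 40 examination scores, in the order originally listed.
scores = [
    91, 61, 46, 62, 54, 62, 93, 90, 99, 76,
    48, 83, 59, 96, 66, 94, 52, 51, 59, 62,
    89, 100, 92, 70, 59, 91, 73, 68, 49, 54,
    85, 43, 78, 50, 45, 98, 69, 77, 42, 46,
]


def decile(values, k):
    """k-th decile (k = 1..9) located at position k*n/10.

    Sketch only: raises an error when the position is not a whole number.
    """
    ordered = sorted(values)
    position = k * len(ordered) / 10
    if position != int(position):
        raise ValueError("fractional position not handled in this sketch")
    return ordered[int(position) - 1]  # positions are 1-based


d3, d5, d8 = decile(scores, 3), decile(scores, 5), decile(scores, 8)
print(d3, d5, d8)
```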
Percentiles
Pk = kn/100
Where:
P = percentile
k = percentile number, from 1, 2, 3, … 99
n = sample size
Below are the scores of 40 students in an English examination. Solve for P50, P66, P98.
42 54 68 90
43 54 69 91
45 59 70 91
46 59 73 92
46 59 76 93
48 61 77 94
49 62 78 96
50 62 83 98
51 62 85 99
52 66 89 100
Note that the median, Q2, D5 and P50 all fall at the same position in the above distribution (the 20th
score), so they all have the same value, 66.
So the Md = Q2 = D5 = P50
Md = 20th = 66
Q2 = 20th = 66
D5 = 20th = 66
P50 = 20th = 66
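The equality of these positions follows directly from the position formulas, as the sketch below shows. The function name `position` is hypothetical. The sketch reports positions only, because the lesson does not state how fractional positions such as those of P66 and P98 should be resolved.

```python
def position(k, parts, n):
    """Position of the k-th cut-off when n observations are split into `parts` parts."""
    return k * n / parts


n = 40
q2_pos = position(2, 4, n)      # second quartile
d5_pos = position(5, 10, n)     # fifth decile
p50_pos = position(50, 100, n)  # fiftieth percentile
print(q2_pos, d5_pos, p50_pos)

# Positions of the other percentiles asked for in the exercise.
p66_pos = position(66, 100, n)
p98_pos = position(98, 100, n)
print(p66_pos, p98_pos)
```

Because 2/4, 5/10 and 50/100 are the same fraction, the three positions coincide for any sample size n.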
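For grouped data, the quartiles are computed with the formula:
$$Q_k = LB + \left(\frac{\frac{kN}{4} - cf}{f}\right) i$$
Where:
Qk = quartile
LB = lower boundary of the quartile class
N = population (total frequency)
k = quartile location (1, 2 or 3)
cf = cumulative frequency of the class before the quartile class
f = frequency of the quartile class
i = class interval (class width)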
Example: Determine the Q1, Q2 and Q3 of the frequency distribution on the ages of 50 people taking
travel tours.
Class Frequency
18-26 3
27-35 5
36-44 9
45-53 14
54-62 11
63-71 6
72-80 2
Solution:
Step 1: Compute the cumulative frequency (cf) of each class.
Class F cf
18-26 3 3
27-35 5 8
36-44 9 17
45-53 14 31
54-62 11 42
63-71 6 48
72-80 2 50
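The cumulative frequency column above is a running total of the frequencies. A minimal Python sketch using `itertools.accumulate` from the standard library reproduces it. The variable names are illustrative.

```python
from itertools import accumulate

classes = ["18-26", "27-35", "36-44", "45-53", "54-62", "63-71", "72-80"]
frequencies = [3, 5, 9, 14, 11, 6, 2]

# Each cf entry is the sum of all frequencies up to and including that class.
cumulative = list(accumulate(frequencies))

print("Class   F   cf")
for label, f, cf in zip(classes, frequencies, cumulative):
    print(f"{label:<6} {f:>3} {cf:>4}")
```

The last cumulative frequency always equals N, here 50.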
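Step 2: Compute the rank value of Q1.
Q1 rank value = N/4
= 50/4
= 12.5
Step 3: Identify the Q1 class by locating the 12.5th rank in the table.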
Class F cf
18-26 3 3
27-35 5 8
36-44 9 17 Q1 class
45-53 14 31
54-62 11 42
63-71 6 48
72-80 2 50
Step 4: Determine the values of LB, cf, f, i and N.
LB = 36-0.5
=35.5
cf = 8
f=9
i = 45-36 = 9
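Step 5: Apply the formula to compute the value of the first quartile.
$$Q_k = LB + \left(\frac{\frac{kN}{4} - cf}{f}\right) i$$
$$Q_1 = 35.5 + \left(\frac{\frac{1(50)}{4} - 8}{9}\right) 9$$
Q1 = 40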
Class F cf
18-26 3 3
27-35 5 8
36-44 9 17
45-53 14 31 Q2 Class
54-62 11 42
63-71 6 48
72-80 2 50
LB = 45-0.5 = 44.5
$$Q_k = LB + \left(\frac{\frac{kN}{4} - cf}{f}\right) i$$
$$Q_2 = 44.5 + \left(\frac{\frac{2(50)}{4} - 17}{14}\right) 9$$
Q2 = 49.64
Class F cf
18-26 3 3
27-35 5 8
36-44 9 17
45-53 14 31
54-62 11 42 Q3 Class
63-71 6 48
72-80 2 50
LB = 54-0.5
LB = 53.5
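$$Q_k = LB + \left(\frac{\frac{kN}{4} - cf}{f}\right) i$$
$$Q_3 = LB + \left(\frac{\frac{3N}{4} - cf}{f}\right) i$$
$$Q_3 = 53.5 + \left(\frac{\frac{3(50)}{4} - 31}{11}\right) 9$$
Q3 = 58.82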
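The three quartile computations for the grouped data can be verified with a small Python sketch of the formula. The function name `grouped_value` is hypothetical. The values of LB, cf, f and i are taken from the tables in the worked example and are passed in by hand.

```python
def grouped_value(lb, rank, cf, f, i):
    """Apply LB + ((rank - cf) / f) * i.

    lb   : lower boundary of the class containing the rank
    rank : kN/4 for quartiles
    cf   : cumulative frequency of the class before it
    f    : frequency of the class containing the rank
    i    : class interval (class width)
    """
    return lb + ((rank - cf) / f) * i


N = 50  # total frequency of the travel-tour distribution
q1 = grouped_value(lb=35.5, rank=1 * N / 4, cf=8, f=9, i=9)
q2 = grouped_value(lb=44.5, rank=2 * N / 4, cf=17, f=14, i=9)
q3 = grouped_value(lb=53.5, rank=3 * N / 4, cf=31, f=11, i=9)

print(round(q1, 2), round(q2, 2), round(q3, 2))
```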
$$D_k = LB + \left(\frac{\frac{kN}{10} - cf}{f}\right) i$$
Where:
Dk = decile
N= population
k = decile location
i = class interval
Example:
Determine the D7 of the frequency distribution on the ages of 50 people taking travel tours.
Class f
18-26 3
27-35 5
36-44 9
45-53 14
54-62 11
63-71 6
72-80 2
Solution:
Class f cf
18-26 3 3
27-35 5 8
36-44 9 17
45-53 14 31
54-62 11 42
63-71 6 48
72-80 2 50
Step 2: Compute the rank value of D7.
D7 (Rank value) = 7(N)/10 = 7(50)/10 = 35
Step 3: Identify the D7 class by locating the 35th rank in the table.
Class f cf
18-26 3 3
27-35 5 8
36-44 9 17
45-53 14 31
54-62 11 42 D7 class
63-71 6 48
72-80 2 50
LB = 54-0.5
LB = 53.5
cf = 31
f = 11
i = 54-45 = 9
Step 5: Apply the formula to compute the value of the seventh decile.
$$D_k = LB + \left(\frac{\frac{kN}{10} - cf}{f}\right) i$$
$$D_7 = 53.5 + \left(\frac{\frac{7(50)}{10} - 31}{11}\right) (9)$$
D7 = 56.77
$$P_k = LB + \left(\frac{\frac{kN}{100} - cf}{f}\right) i$$
Where:
Pk = percentile
N = population
k = percentile location
i = class interval
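Example:
Determine the P22 of the frequency distribution on the ages of 50 people taking travel tours.
Step 1: Compute the rank value of P22.
P22 rank value = 22N/100 = 22(50)/100 = 11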
Step 2: Identify the P22 class by locating the 11th rank in the table.
Class f cf
18-26 3 3
27-35 5 8
36-44 9 17 P22 Class
45-53 14 31
54-62 11 42
63-71 6 48
72-80 2 50
LB = 36-0.5 = 35.5
i = 45-36 = 9
cf = 8
f = 9
$$P_k = LB + \left(\frac{\frac{kN}{100} - cf}{f}\right) i$$
$$P_{22} = 35.5 + \left(\frac{\frac{22(50)}{100} - 8}{9}\right) (9)$$
P22 = 38.5
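The full step-by-step procedure (cumulative frequencies, rank value, class identification, formula) can be sketched as one Python function that works for quartiles, deciles and percentiles alike. The function name `grouped_measure` and its parameters are hypothetical. It assumes equal-width classes with whole-number limits, so that the lower boundary is the lower limit minus 0.5 and the class interval is the difference between consecutive lower limits, as in the worked examples.

```python
from itertools import accumulate


def grouped_measure(lower_limits, frequencies, k, parts):
    """k-th cut-off for grouped data; parts = 4, 10 or 100.

    Assumptions: equal-width classes with whole-number limits, so
    LB = lower limit - 0.5 and i = difference between consecutive lower limits.
    """
    n = sum(frequencies)
    rank = k * n / parts                         # rank value kN/parts
    width = lower_limits[1] - lower_limits[0]    # class interval i
    cumulative = list(accumulate(frequencies))   # cf column
    for index, running_total in enumerate(cumulative):
        if running_total >= rank:                # first class that reaches the rank
            cf_before = cumulative[index - 1] if index > 0 else 0
            lb = lower_limits[index] - 0.5
            return lb + ((rank - cf_before) / frequencies[index]) * width
    raise ValueError("rank exceeds total frequency")


lower_limits = [18, 27, 36, 45, 54, 63, 72]
frequencies = [3, 5, 9, 14, 11, 6, 2]

d7 = grouped_measure(lower_limits, frequencies, k=7, parts=10)
p22 = grouped_measure(lower_limits, frequencies, k=22, parts=100)
print(round(d7, 2), round(p22, 2))
```

Locating the class as the first one whose cumulative frequency reaches the rank mirrors the "identify the class" step in the worked examples.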