032 Measures of Position
Measures of Position
Measures of position describe the relative location of an observation within a set of data.
Quartiles and percentiles are two of the most popular measures of position.
An additional measure of central tendency, the midquartile, is defined using quartiles.
Quartiles are part of the 5-number summary.
Quartiles
Quartiles: Values of the variable that divide the ranked data into quarters; each set of data has three quartiles.
1. The first quartile, Q1, is a number such that at most 25% of the data are smaller in value than Q1 and at most 75% are larger.
2. The second quartile, Q2, is the median.
3. The third quartile, Q3, is a number such that at most 75% of the data are smaller in value than Q3 and at most 25% are larger.
[Figure: ranked data in increasing order from L (lowest) to H (highest), divided by Q1, Q2, and Q3 into four parts of 25% each]
For grouped data, the median Q2 is found by interpolating within the median class:

$\text{Median} = L + \dfrac{\frac{N}{2} - s}{f} \times c$

L = LCL of the median class Q2 (= 69.5)
N = Σf = total frequency (= 20)
s = total frequency before the median class Q2 (= 9)
f = frequency of the median class Q2 (= 5)
c = class size (= 74.5 - 69.5 = 5)

Class Interval   Class Limit    cf
50-54            49.5-54.5       1
55-59            54.5-59.5       2
60-64            59.5-64.5       4
65-69            64.5-69.5       9
70-74            69.5-74.5      14   (Median Class)
75-79            74.5-79.5      16
80-84            79.5-84.5      18
85-89            84.5-89.5      20

$\text{Median} = 69.5 + \dfrac{\frac{20}{2} - 9}{5} \times (74.5 - 69.5) = 70.5$
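The interpolation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `grouped_quantile` and its parameters are assumptions, and the function takes the lower class limits and cumulative frequencies straight from the table.

```python
def grouped_quantile(lower_limits, cum_freqs, width, fraction):
    """Interpolate a quantile from a cumulative frequency table.

    fraction is the share of the data below the quantile (0.5 for the median).
    All names here are illustrative; they do not come from the slides.
    """
    n = cum_freqs[-1]            # N: total frequency
    target = fraction * n        # N/2 for the median
    prev_cf = 0                  # s: total frequency before the current class
    for lower, cf in zip(lower_limits, cum_freqs):
        if cf >= target:         # first class whose cf reaches the target
            f = cf - prev_cf     # f: frequency of that class
            return lower + (target - prev_cf) / f * width
        prev_cf = cf
    raise ValueError("fraction must be between 0 and 1")


lower_limits = [49.5, 54.5, 59.5, 64.5, 69.5, 74.5, 79.5, 84.5]
cum_freqs = [1, 2, 4, 9, 14, 16, 18, 20]

median = grouped_quantile(lower_limits, cum_freqs, width=5, fraction=0.5)
print(median)  # 70.5
```

Passing a different `fraction` (for example 0.25 or 0.75) applies the same interpolation to the other quartiles.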
First Quartile Q1
$Q_1 = L + \dfrac{\frac{N}{4} - s}{f} \times c$

L = LCL of the Q1 class (= 64.5)
N = Σf = total frequency (= 20)
s = total frequency before the Q1 class (= 4)
f = frequency of the Q1 class (= 5)

In the frequency table, the Q1 class is 65-69 (class limits 64.5-69.5): the cumulative frequency rises from 4 to 9 there, so this class contains the N/4 = 5th value.
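The slide lists the inputs for Q1 but stops before the final value. As a worked sketch, substituting those inputs and assuming the class size c = 69.5 - 64.5 = 5 (read off the class limits of the Q1 class) gives:

```latex
Q_1 = 64.5 + \frac{\frac{20}{4} - 4}{5} \times (69.5 - 64.5)
    = 64.5 + \frac{1}{5} \times 5
    = 65.5
```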
Third Quartile Q3
$Q_3 = L + \dfrac{\frac{3N}{4} - s}{f} \times c$

L = LCL of the Q3 class (= 74.5)
N = Σf = total frequency (= 20)
s = total frequency before the Q3 class (= 14)
f = frequency of the Q3 class (= 2)
c = class size (= 79.5 - 74.5 = 5)

In the frequency table, the Q3 class is 75-79 (class limits 74.5-79.5): the cumulative frequency rises from 14 to 16 there, so this class contains the 3N/4 = 15th value.

$Q_3 = 74.5 + \dfrac{\frac{3 \times 20}{4} - 14}{2} \times (79.5 - 74.5) = 74.5 + 2.5 = 77.0$
Percentiles
Percentiles: Values of the variable that divide a set of ranked data into 100 equal subsets; each set of data has 99 percentiles. The kth percentile, Pk, is a value such that at most k% of the data are smaller in value than Pk and at most (100 - k)% of the data are larger.
[Figure: ranked data from L to H, with at most k% of the data below Pk and at most (100 - k)% above it]
Notes:
The 1st quartile and the 25th percentile are the same: $Q_1 = P_{25}$.
The median, the 2nd quartile, and the 50th percentile are all the same: $\tilde{x} = Q_2 = P_{50}$.
[Figure: ranked data divided by the percentiles P1, P2, P3, ..., P97, P98, P99 into 100 parts of 1% each]
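Pk = the (kN/100)th value of the ranked data.
If kN/100 is an integer A, the depth is d(Pk) = A.5: Pk is halfway between the value of the data in the Ath position and the value of the next data.
If kN/100 is not an integer, the depth d(Pk) is the next larger integer, and Pk is the value of the data in that position.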
Example
Example: The following data represent the pH levels of a random sample of swimming pools in a town. Find: 1) the first quartile, 2) the third quartile, and 3) the 37th percentile.

5.6   6.0   6.7   7.0
5.6   6.1   6.8   7.3
5.8   6.2   6.8   7.4
5.9   6.3   6.8   7.4
6.0   6.4   6.9   7.5

Solutions (N = 20; the data are ranked down each column):
1) k = 25: (20)(25)/100 = 5, depth = 5.5, Q1 = 6
2) k = 75: (20)(75)/100 = 15, depth = 15.5, Q3 = 6.95
3) k = 37: (20)(37)/100 = 7.4, depth = 8, P37 = 6.2
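The depth rule used in these solutions can be sketched in Python as a check. This is an illustrative helper, not taken from the slides: the name `percentile_by_depth` is an assumption, and k is assumed to be a whole number from 1 to 99 so that integer arithmetic decides whether kN/100 is whole.

```python
def percentile_by_depth(data, k):
    """Return P_k by the depth rule (illustrative helper, k an integer 1..99)."""
    ranked = sorted(data)
    whole, remainder = divmod(k * len(ranked), 100)
    if remainder == 0:
        # kN/100 is an integer A: depth A.5, halfway between positions A and A + 1
        return (ranked[whole - 1] + ranked[whole]) / 2
    # otherwise the depth is the next larger integer (1-based position whole + 1)
    return ranked[whole]


ph = [5.6, 5.6, 5.8, 5.9, 6.0, 6.0, 6.1, 6.2, 6.3, 6.4,
      6.7, 6.8, 6.8, 6.8, 6.9, 7.0, 7.3, 7.4, 7.4, 7.5]

q1 = percentile_by_depth(ph, 25)
q3 = percentile_by_depth(ph, 75)
p37 = percentile_by_depth(ph, 37)
print(f"{q1:.2f} {q3:.2f} {p37:.2f}")  # 6.00 6.95 6.20
```

Library routines such as `statistics.quantiles` use different interpolation conventions, so their results can differ slightly from the depth rule shown here.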
Position of Pk = kN/100
Example: for ranked data ending in 80, 85, 88, 91, 99, the 62nd percentile falls at position 7.44:
P62 = 62nd percentile = (79 + 80)/2 = 79.5
Percentile Pk
X     f     cf    cf %
38    1    125    100
37    1    124     99
36    3    123     98
35    5    120     96
34    9    115     92
33    8    106     85
32   17     98     78
31   23     81     65
30   24     58     46
29   18     34     27
28   10     16     13
27    3      6      5
26    1      3      3
25    0      2      2
24    2      2      2
N = 125

$P_{25} = L + \dfrac{kN - cf}{f}$, with $k = 25/100$
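The slide gives the formula for P25 but not the finished calculation. The sketch below shows one way to carry it out, under assumptions that the slide does not state: each score X is treated as a class of width 1 with lower limit L = X - 0.5, and cf in the formula is read as the cumulative frequency below the class that contains the percentile. The function name is hypothetical.

```python
def percentile_from_scores(scores, freqs, k):
    """Interpolated P_k for integer scores (assumes unit-width classes, L = X - 0.5)."""
    n = sum(freqs)                    # N: total frequency
    target = k * n / 100              # kN with k written as a fraction of 100
    below = 0                         # cf below the current score
    for x, f in sorted(zip(scores, freqs)):
        if f > 0 and below + f >= target:
            return (x - 0.5) + (target - below) / f
        below += f
    raise ValueError("k must be between 0 and 100")


scores = list(range(38, 23, -1))      # 38, 37, ..., 24 as in the table
freqs = [1, 1, 3, 5, 9, 8, 17, 23, 24, 18, 10, 3, 1, 0, 2]

p25 = percentile_from_scores(scores, freqs, 25)
print(round(p25, 2))  # 29.35
```

With these assumptions kN = 31.25, which falls in the class for X = 29 (16 scores below it, 18 in it), so P25 = 28.5 + (31.25 - 16)/18.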
Midquartile
Midquartile: The numerical value midway between the first and third quartiles:

$\text{midquartile} = \dfrac{Q_1 + Q_3}{2}$

Example: Find the midquartile for the 20 pH values in the previous example:

$\text{midquartile} = \dfrac{Q_1 + Q_3}{2} = \dfrac{6 + 6.95}{2} = \dfrac{12.95}{2} = 6.475$

Note: The mean, median, midrange, and midquartile are all measures of central tendency. They are not necessarily equal. Can you think of an example when they would be the same value?
5-Number Summary
5-Number Summary: The 5-number summary is composed of:
1. L, the smallest value in the data set
2. Q1, the first quartile (also P25)
3. $\tilde{x}$, the median (also P50 and the 2nd quartile)
4. Q3, the third quartile (also P75)
5. H, the largest value in the data set
Notes:
The 5-number summary indicates how much the data are spread out in each quarter.
The interquartile range is the difference between the first and third quartiles. It is the range of the middle 50% of the data.
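As an illustration, the 5-number summary, interquartile range, and midquartile can be computed for the 20 pH values from the earlier example. This is a sketch under the depth rule used in that example; the helper name `depth_percentile` is an assumption, and the median it reports (6.55) is computed here rather than quoted from the slides.

```python
def depth_percentile(ranked, k):
    """P_k by the depth rule for already-ranked data (illustrative, k integer 1..99)."""
    whole, remainder = divmod(k * len(ranked), 100)
    if remainder == 0:
        return (ranked[whole - 1] + ranked[whole]) / 2   # depth A.5
    return ranked[whole]                                 # next larger position


ph = sorted([5.6, 6.0, 6.7, 7.0, 5.6, 6.1, 6.8, 7.3, 5.8, 6.2,
             6.8, 7.4, 5.9, 6.3, 6.8, 7.4, 6.0, 6.4, 6.9, 7.5])

low, high = ph[0], ph[-1]
q1 = depth_percentile(ph, 25)
median = depth_percentile(ph, 50)
q3 = depth_percentile(ph, 75)

iqr = q3 - q1                  # range of the middle 50% of the data
midquartile = (q1 + q3) / 2

print(f"{low:.2f} {q1:.2f} {median:.2f} {q3:.2f} {high:.2f}")  # 5.60 6.00 6.55 6.95 7.50
print(f"{iqr:.2f} {midquartile:.3f}")                          # 0.95 6.475
```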
Box-and-Whisker Display
Box-and-Whisker Display: A graphic representation of the 5-number summary:
The five numerical values (smallest, first quartile, median, third quartile, and largest) are located on a scale, either vertical or horizontal.
The box is used to depict the middle half of the data that lies between the two quartiles.
The whiskers are line segments used to depict the other half of the data.
One line segment represents the quarter of the data that is smaller in value than the first quartile.
The second line segment represents the quarter of the data that is larger in value than the third quartile.
Example
Example: A random sample of students in a sixth grade class was selected. Their weights are given in the table below. Find the 5-number summary for this data and construct a boxplot:
63 89 99
64 90 99
88 97
Solution: L = 63, Q1 = 85, $\tilde{x}$ = 92, Q3 = 99, H = 112
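[Figure: box-and-whisker display of the weights on a scale labeled Weight, running from 60 to 110, with Q1, $\tilde{x}$, and Q3 marked on the box]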