TOPIC 2 (Part 3)
TOPIC 5.1
MEASURES OF DISPERSION
METHODS
OBJECTIVES
Calculate measures of dispersion (mean deviation, variance and
standard deviation) for ungrouped data, ungrouped data presented
in a frequency distribution table, and grouped data.
5.1.1 MEASURES OF DISPERSION FOR UNGROUPED DATA
UNGROUPED DATA
(Individual value)
Example: Age of student
20 25 19 22 16 27 22 23
Mean deviation = $\frac{\sum |x - \bar{x}|}{n}$
EXAMPLE 1
12, 6, 3, 7, 8, 10, 11
SOLUTION 1
Mean, $\bar{x} = \frac{\sum x}{n} = \frac{12 + 6 + 3 + 7 + 8 + 10 + 11}{7} = \frac{57}{7} = 8.1429$
x     $|x - \bar{x}|$
12    3.8571
6     2.1429
3     5.1429
7     1.1429
8     0.1429
10    1.8571
11    2.8571
$\sum |x - \bar{x}| = 17.1429$
Mean deviation = $\frac{\sum |x - \bar{x}|}{n} = \frac{17.1429}{7} = 2.45$
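The mean deviation calculation for Example 1 can be checked with a short Python sketch. The function name `mean_deviation` is only an illustrative choice and is not part of the course material.

```python
# Data from Example 1
data = [12, 6, 3, 7, 8, 10, 11]

def mean_deviation(values):
    """Average absolute distance of each value from the sample mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

md = mean_deviation(data)
print(round(md, 2))  # 2.45
```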
SAMPLE VARIANCE ($s^2$) FOR UNGROUPED DATA
$s^2 = \frac{1}{n-1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]$
$s = \sqrt{s^2} = \sqrt{\frac{1}{n-1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]}$
EXAMPLE 2
1, 7, 2, 5
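As a rough check of the computational formula for the sample variance, the sketch below applies it to the Example 2 data and compares the result with Python's standard `statistics.variance`. The helper name `sample_variance` is an assumption made for illustration.

```python
import math
import statistics

# Data from Example 2
data = [1, 7, 2, 5]

def sample_variance(values):
    """Computational formula: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

s2 = sample_variance(data)
s = math.sqrt(s2)

# The library routine uses the definitional formula; both should agree.
assert math.isclose(s2, statistics.variance(data))
print(round(s2, 4), round(s, 4))
```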
EXERCISE 1
REFER TEXTBOOK
PAGE 127
QUESTION 8
5.1.2 MEASURES OF DISPERSION FOR UNGROUPED DATA
(Presented in a Frequency Distribution Table)
UNGROUPED DATA (Frequency distribution table)
Example: Age of student
Age (years)    Number of Students
21             5
22             17
23             10
24             3
MEAN DEVIATION FOR UNGROUPED DATA (BUT PRESENTED IN FREQUENCY DISTRIBUTION TABLE)
Mean deviation = $\frac{1}{\sum f} \sum \left( f\,|x - \bar{x}| \right)$
where $x$ = the observation
$\bar{x}$ = sample mean
$\sum f$ = total frequency / sample size / number of observations
EXAMPLE 3
x     f
3     4
8     4
13    20
18    10
23    2
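SOLUTION 3
x     Frequency (f)    fx     $f|x - \bar{x}|$
3     4                12     41
8     4                32     21
13    20               260    5
18    10               180    47.5
23    2                46     19.5
$\sum f = 40$, $\sum fx = 530$, $\sum f|x - \bar{x}| = 134$
Mean = $\frac{\sum fx}{\sum f} = \frac{530}{40} = 13.25$
Mean deviation = $\frac{1}{\sum f} \sum \left( f\,|x - \bar{x}| \right) = \frac{1}{40}[134] = 3.35$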
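The same calculation can be sketched in Python directly from the frequency table of Example 3. Storing the table as a list of `(x, f)` pairs is an assumed layout chosen for illustration.

```python
# (value x, frequency f) pairs from Example 3
table = [(3, 4), (8, 4), (13, 20), (18, 10), (23, 2)]

total_f = sum(f for _, f in table)                 # sum of f
mean = sum(f * x for x, f in table) / total_f      # sum of fx divided by sum of f
md = sum(f * abs(x - mean) for x, f in table) / total_f

print(mean, md)  # 13.25 3.35
```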
A               B                C     D
Variable (x)    Frequency (f)    fx    $fx^2$
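$s^2 = \frac{1}{n-1}\left[\sum f x^2 - \frac{(\sum f x)^2}{n}\right]$, where $n = \sum f$
$s = \sqrt{\frac{1}{n-1}\left[\sum f x^2 - \frac{(\sum f x)^2}{n}\right]}$, where $n = \sum f$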
EXAMPLE 4
122 7
127 1
132 1
SOLUTION 4
x      Frequency (f)    fx      $fx^2$
102    2                204     20808
107    8                856     91592
112    18               2016    225792
117    13               1521    177957
122    7                854     104188
127    1                127     16129
132    1                132     17424
$\sum f = 50$, $\sum fx = 5710$, $\sum fx^2 = 653890$
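$s^2 = \frac{1}{n-1}\left[\sum f x^2 - \frac{(\sum f x)^2}{n}\right] = \frac{1}{50-1}\left[653890 - \frac{(5710)^2}{50}\right] = 36.9$
$s = \sqrt{\frac{1}{n-1}\left[\sum f x^2 - \frac{(\sum f x)^2}{n}\right]} = \sqrt{\frac{1}{50-1}\left[653890 - \frac{(5710)^2}{50}\right]} = \sqrt{36.9} = 6.1$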
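The column totals and the final values in Solution 4 can be reproduced with a few lines of Python. This is a minimal sketch; the `(x, f)` table layout and the variable names are assumptions made for illustration.

```python
import math

# (value x, frequency f) pairs from Solution 4
table = [(102, 2), (107, 8), (112, 18), (117, 13), (122, 7), (127, 1), (132, 1)]

n = sum(f for _, f in table)                 # n = sum of f
sum_fx = sum(f * x for x, f in table)        # column fx
sum_fx2 = sum(f * x * x for x, f in table)   # column fx^2

s2 = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
s = math.sqrt(s2)

print(n, sum_fx, sum_fx2)         # 50 5710 653890
print(round(s2, 1), round(s, 1))  # 36.9 6.1
```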
EXERCISE 2
5.1.3 MEASURES OF DISPERSION FOR GROUPED DATA
GROUPED DATA (Frequency distribution table)
Example: Age of Resident
Age (years)    Number of Residents
0 to 9         25
10 to 19       19
20 to 29       11
30 to 39       5
Mean deviation = $\frac{1}{\sum f} \sum \left( f\,|x - \bar{x}| \right)$
where $x$ = midpoint
$\bar{x}$ = sample mean
$\sum f$ = total frequency / sample size / number of observations
EXAMPLE 5
SOLUTION 5
Class interval    Frequency (f)    Midpoint (x)    fx     $f|x - \bar{x}|$
1 – 5             4                3               12     41
6 – 10            4                8               32     21
11 – 15           20               13              260    5
16 – 20           10               18              180    47.5
21 – 25           2                23              46     19.5
$\sum f = 40$, $\sum fx = 530$, $\sum f|x - \bar{x}| = 134$
Mean = $\frac{\sum fx}{\sum f} = \frac{530}{40} = 13.25$
Mean deviation = $\frac{1}{\sum f} \sum \left( f\,|x - \bar{x}| \right) = \frac{1}{40}[134] = 3.35$
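Sample variance:
$s^2 = \frac{1}{n-1}\left[\sum f x^2 - \frac{(\sum f x)^2}{n}\right]$
Sample standard deviation:
$s = \sqrt{\frac{1}{n-1}\left[\sum f x^2 - \frac{(\sum f x)^2}{n}\right]}$
where $x$ = class midpoint and $n = \sum f$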
EXAMPLE 6
SOLUTION 6
Class      Class boundary    Frequency (f)    Midpoint (x)    fx      $fx^2$
100-104    99.5 – 104.5      2                102             204     20808
105-109    104.5 – 109.5     8                107             856     91592
110-114    109.5 – 114.5     18               112             2016    225792
115-119    114.5 – 119.5     13               117             1521    177957
120-124    119.5 – 124.5     7                122             854     104188
125-129    124.5 – 129.5     1                127             127     16129
130-134    129.5 – 134.5     1                132             132     17424
$\sum f = 50$, $\sum fx = 5710$, $\sum fx^2 = 653890$
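$s^2 = \frac{1}{n-1}\left[\sum f x^2 - \frac{(\sum f x)^2}{n}\right] = \frac{1}{50-1}\left[653890 - \frac{(5710)^2}{50}\right] = 36.9$
$s = \sqrt{\frac{1}{n-1}\left[\sum f x^2 - \frac{(\sum f x)^2}{n}\right]} = \sqrt{\frac{1}{50-1}\left[653890 - \frac{(5710)^2}{50}\right]} = \sqrt{36.9} = 6.1$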
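For grouped data the only extra step is replacing each class by its midpoint. The sketch below derives the midpoints from the class limits of Solution 6 and then applies the same formula. The `(lower limit, upper limit, frequency)` tuple layout is an assumption made for illustration.

```python
import math

# (lower class limit, upper class limit, frequency) from Solution 6
classes = [
    (100, 104, 2), (105, 109, 8), (110, 114, 18), (115, 119, 13),
    (120, 124, 7), (125, 129, 1), (130, 134, 1),
]

# The midpoint of each class stands in for every observation in that class.
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
frequencies = [f for _, _, f in classes]

n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))

s2 = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
s = math.sqrt(s2)

print(midpoints)
print(round(s2, 1), round(s, 1))  # 36.9 6.1
```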
EXERCISE 3
EXERCISE 4
A 2Z Consultant is testing a new scale to measure the
productivity level of estate workers through a series
of training sessions. The company wants to know the
productivity level of its workers after the training
session.
Productivity level    Scale                       No. of workers
Very weak             0 and less than 2.0         15
Weak                  2.0 and less than 4.0       20
Average               4.0 and less than 6.0       40
Good                  6.0 and less than 8.0       52
Very good             8.0 and less than 10.0      43
EXERCISE 5
MEASURES OF VARIABILITY
VARIANCE
• The variance is the average of the squared deviations
from the mean
• The variance will be larger when the observations
within a frequency distribution have a wide range
• Where:
➢ Range = max – min (for ungrouped data)
➢ Range = Upper boundary of the last class – Lower
boundary of the first class. (for grouped data)
SUMMARY
• Measures of spread tell us how widely
spread the distribution is. The standard
deviation is the most commonly used
measure of spread in descriptive statistics.
TOPIC 5.2
MEASURES OF POSITION
METHODS
1 QUARTILE
2 INTERQUARTILE RANGE (IQR)
3 QUARTILE DEVIATION
4 PERCENTILE
OBJECTIVES
Calculate measures of position (quartile, interquartile range,
quartile deviation, percentile) for ungrouped data, ungrouped data
presented in a frequency distribution table, and grouped data.
5.2.1 MEASURES OF POSITION FOR UNGROUPED DATA
UNGROUPED DATA
(Individual value)
Example: Age of student
20 25 19 22 16 27 22 23
QUARTILE FOR UNGROUPED DATA
Quartiles are values that divide an ordered array into four equal
parts.
Steps in computing the quartiles of raw data
Step 1: Arrange the data in ascending order.
Step 2: Find the positions of Quartile 1, Quartile 2
and Quartile 3.
Position of Q1 = $1\left(\frac{n+1}{4}\right)$th
Position of Q2 = $2\left(\frac{n+1}{4}\right)$th
Position of Q3 = $3\left(\frac{n+1}{4}\right)$th
Step 3: Find the values of Q1, Q2 and Q3 based on
their positions.
INTERQUARTILE RANGE FOR UNGROUPED DATA
The interquartile range (IQR) is defined as
the difference between the third quartile and the
first quartile of a data set.
IQR = Q3 − Q1
The interquartile range is used to identify outliers,
and it is also used as a measure of variability.
QUARTILE DEVIATION FOR UNGROUPED DATA
Quartile deviation = $\frac{1}{2}(Q_3 - Q_1)$
EXAMPLE 7
Referring to the data on the number of
patients at the outpatient clinic
SOLUTION 7 (a)
Step 1: Obtain the array
60, 95, 98, 100, 100, 110, 112, 115
n = 8
Step 2: Determine the location of Q1, Q2 and Q3
Position of Q1 = $1\left(\frac{n+1}{4}\right)$th = $1\left(\frac{8+1}{4}\right)$th = 2.25th
Position of Q2 = $2\left(\frac{n+1}{4}\right)$th = $2\left(\frac{8+1}{4}\right)$th = 4.5th
Position of Q3 = $3\left(\frac{n+1}{4}\right)$th = $3\left(\frac{8+1}{4}\right)$th = 6.75th
To obtain the values of Q1, Q2 and Q3, linear interpolation between
the two neighbouring values in the array is required.
Q1 = 95 + 0.25(98 − 95)
= 95.75
Q2 = 100 + 0.5(100 − 100)
= 100
Q3 = 110 + 0.75(112 − 110)
= 111.5
Interpretation:
At most 96 patients visited the clinic 25% of the time.
At most 100 patients visited the clinic 50% of the time.
At most 112 patients visited the clinic 75% of the time.
SOLUTION 7 (b)
Q1 = 95.75
Q2 = 100
Q3 = 111.5
IQR = Q3 − Q1
= 111.5 − 95.75
= 15.75
SOLUTION 7 (c)
IQR = Q3 − Q1
= 111.5 − 95.75
= 15.75
Quartile deviation = $\frac{1}{2}(Q_3 - Q_1) = \frac{1}{2}(15.75) = 7.88$
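Solution 7 can be retraced with a small Python sketch of the position rule k(n + 1)/4 followed by linear interpolation. The helper name `value_at_position` is hypothetical, and the sketch assumes every position falls inside the array (between 1 and n).

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional position, by linear interpolation."""
    lower_index = int(position)            # whole part of the position
    fraction = position - lower_index      # fractional part of the position
    lower = sorted_values[lower_index - 1]
    if fraction == 0:
        return lower
    upper = sorted_values[lower_index]
    return lower + fraction * (upper - lower)

# Array from Solution 7 (a)
data = sorted([60, 95, 98, 100, 100, 110, 112, 115])
n = len(data)

# Positions 1(n + 1)/4, 2(n + 1)/4 and 3(n + 1)/4
q1, q2, q3 = (value_at_position(data, k * (n + 1) / 4) for k in (1, 2, 3))
iqr = q3 - q1
quartile_deviation = iqr / 2

print(q1, q2, q3, iqr, quartile_deviation)  # 95.75 100.0 111.5 15.75 7.875
```

Library routines such as `statistics.quantiles` or `numpy.percentile` support several position rules, so their defaults need not match the (n + 1) rule used in these notes.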
PERCENTILE FOR UNGROUPED DATA
Steps in computing the percentiles of raw data
Step 1: Arrange the data in ascending order.
Step 2: Find the position of the percentile.
Position of percentile = $p\left(\frac{n+1}{100}\right)$th
EXAMPLE 8
SOLUTION 8
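Step 2:
Determine the location of the 15th percentile:
Position of P15 = $p\left(\frac{n+1}{100}\right)$th = $15\left(\frac{8+1}{100}\right)$th = 1.35th
The 1.35th position lies between the 1st value (60) and the 2nd value (95), so by linear interpolation:
$\frac{P_{15} - 60}{1.35 - 1} = 95 - 60$
P15 = 60 + 0.35(95 − 60)
= 72.25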
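The percentile rule p(n + 1)/100 can be sketched the same way. The function name `percentile` and its error handling are illustrative assumptions. The data are the eight-value array used in the solution above.

```python
def percentile(values, p):
    """p-th percentile using the p(n + 1)/100 position rule with linear interpolation."""
    ordered = sorted(values)
    n = len(ordered)
    position = p * (n + 1) / 100           # 1-based, may be fractional
    whole = int(position)
    fraction = position - whole
    # The rule is only defined for positions between the 1st and the n-th value.
    if whole < 1 or whole > n or (whole == n and fraction > 0):
        raise ValueError("position falls outside the data")
    lower = ordered[whole - 1]
    if fraction == 0:
        return lower
    return lower + fraction * (ordered[whole] - lower)

data = [60, 95, 98, 100, 100, 110, 112, 115]
p15 = percentile(data, 15)
print(round(p15, 2))  # 72.25
```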
EXERCISE 6
The following data give the speeds (in miles per hour) of 12 cars
traveling on a highway.
67 71 57 54 57 84
77 62 61 59 58 93
5.2.2 MEASURES OF POSITION FOR UNGROUPED DATA
(Presented in a Frequency Distribution Table)
UNGROUPED DATA (Frequency distribution table)
Example: Age of student
Age (years)    Number of Students
21             5
22             17
23             10
24             3
QUARTILE FOR UNGROUPED DATA (BUT PRESENTED IN FREQUENCY DISTRIBUTION TABLE)
IQR = Q3 − Q1
The interquartile range is used to identify outliers,
and it is also used as a measure of variability.
QUARTILE DEVIATION FOR UNGROUPED DATA (BUT PRESENTED IN FREQUENCY DISTRIBUTION TABLE)
Quartile deviation = $\frac{1}{2}(Q_3 - Q_1)$
EXAMPLE 9
SOLUTION 9 (a)
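Position of Q1: $1\left(\frac{\sum f + 1}{4}\right) = 1\left(\frac{50 + 1}{4}\right)$ = 12.75th
Q1 = 112
Position of Q2: $2\left(\frac{\sum f + 1}{4}\right) = 2\left(\frac{50 + 1}{4}\right)$ = 25.5th
Q2 = 112
Position of Q3: $3\left(\frac{\sum f + 1}{4}\right) = 3\left(\frac{50 + 1}{4}\right)$ = 38.25th
Q3 = 117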
SOLUTION 9 (b)
IQR = Q3 − Q1
= 117 − 112
= 5
Quartile deviation = $\frac{1}{2}(Q_3 - Q_1) = \frac{1}{2}(5) = 2.5$
EXAMPLE 10
Temperature    Number of States
102            2
107            8
112            18
117            13
122            7
127            1
132            1
SOLUTION 10
Temperature    Frequency (f)    Cumulative frequency    Position of data
102            2                2                       1 – 2
107            8                10                      3 – 10
112            18               28                      11 – 28
117            13               41                      29 – 41
122            7                48                      42 – 48
127            1                49                      49 – 49
132            1                50                      50 – 50
$\sum f = 50$
Position of P30: $p\left(\frac{\sum f + 1}{100}\right) = 30\left(\frac{50 + 1}{100}\right)$ = 15.3th
P30 = 112
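The lookup in the "Position of data" column can be sketched with cumulative frequencies and a binary search. One assumption is added here that the slides do not cover: a fractional position that falls between two different values is assigned to the later value. None of the worked positions (12.75, 15.3, 25.5, 38.25) hits that case. The helper names are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate

# (temperature, frequency) pairs from Example 10
table = [(102, 2), (107, 8), (112, 18), (117, 13), (122, 7), (127, 1), (132, 1)]

values = [x for x, _ in table]
cumulative = list(accumulate(f for _, f in table))   # 2, 10, 28, 41, 48, 49, 50
total = cumulative[-1]

def value_at(position):
    """Value of the first row whose cumulative frequency reaches the position."""
    return values[bisect_left(cumulative, position)]

def quartile(k):
    return value_at(k * (total + 1) / 4)

def percentile(p):
    return value_at(p * (total + 1) / 100)

print(quartile(1), quartile(2), quartile(3))  # 112 112 117
print(percentile(30))                         # 112
```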
EXERCISE 7
A sample of 200 students is randomly selected from the
library. The time spent (in minutes) in the library on a
particular day was recorded.
Time spent    Number of students    Cumulative frequency    Position of data
15            44                    44                      1-44
25            56                    100                     45-100
35            62                    162                     101-162
45            28                    190                     163-190
55            10                    200                     191-200
a) Find the first, second and third quartiles.
b) Calculate the interquartile range.
c) Compute the quartile deviation.
d) Find the 65th percentile.
5.2.3 MEASURES OF POSITION FOR GROUPED DATA
GROUPED DATA (Frequency distribution table)
Example: Age of Resident
Age (years)    Number of Residents
0 to 9         25
10 to 19       19
20 to 29       11
30 to 39       5
METHOD 1: BASED ON FREQUENCY DISTRIBUTION TABLE
Step 1: Construct the cumulative frequency
distribution table.
Step 2: Create a column for the position of data.
The positions of Q1, Q2 and Q3 are $\frac{\sum f}{4}$, $\frac{2\sum f}{4}$ and $\frac{3\sum f}{4}$ respectively.
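Quartile 1, $Q_1 = L_{Q_1} + \left(\frac{\frac{\sum f}{4} - \sum f_{Q_1 - 1}}{f_{Q_1}}\right) C$
Median, $\tilde{x}$ (or second quartile, $Q_2$) $= L_m + \left(\frac{\frac{\sum f}{2} - \sum f_{m-1}}{f_m}\right) C$
Quartile 3, $Q_3 = L_{Q_3} + \left(\frac{\frac{3\sum f}{4} - \sum f_{Q_3 - 1}}{f_{Q_3}}\right) C$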
Where
$L_Q$ = lower boundary of the class containing the quartile (the quartile class)
$\sum f$ = sample size / total frequency
$\sum f_{Q-1}$ = cumulative frequency of the classes before the quartile class
$f_Q$ = number of observations (frequency) in the quartile class
$C$ = width of the quartile class
INTERQUARTILE RANGE FOR GROUPED DATA
The interquartile range (IQR) is defined as
the difference between the third quartile and the
first quartile of a data set.
IQR = Q3 − Q1
The interquartile range is used to identify outliers,
and it is also used as a measure of variability.
EXAMPLE 11
Temperature Number of
States
100-104 2
105-109 8
110-114 18
115-119 13
120-124 7
125-129 1
130-134 1
SOLUTION 11 (a)
Position of the first quartile class = $\frac{\sum f}{4} = \frac{50}{4}$ = 12.5th in the array.
Class of the first quartile: 110 – 114
$L_{Q_1} = 110 - 0.5 = 109.5$
$\sum f_{Q_1 - 1} = 10$
$f_{Q_1} = 18$
$C = 114.5 - 109.5 = 5$
Quartile 1 $= L_{Q_1} + \left(\frac{\frac{\sum f}{4} - \sum f_{Q_1 - 1}}{f_{Q_1}}\right) C = 109.5 + \left(\frac{\frac{50}{4} - 10}{18}\right) 5$
Quartile 1 = 110.2 °F
Position of the second quartile (median class) = $\frac{2n}{4} = \frac{\sum f}{2} = \frac{50}{2}$ = 25th in the array.
Class median: 110 – 114
$L_m = 110 - 0.5 = 109.5$
$\sum f_{Q_2 - 1} = 10$
$f_m = 18$
$C = 114.5 - 109.5 = 5$
Median, $\tilde{x} = L_m + \left(\frac{\frac{\sum f}{2} - \sum f_{Q_2 - 1}}{f_m}\right) C = 109.5 + \left(\frac{\frac{50}{2} - 10}{18}\right) 5$
Median, $\tilde{x}$ = 113.7 °F
The median of the high temperature in degrees Fahrenheit of the 50 states is 113.7 °F.
Position of the third quartile class = $\frac{3\sum f}{4} = \frac{3(50)}{4}$ = 37.5th in the array.
Class of the third quartile: 115 – 119
$L_{Q_3} = 115 - 0.5 = 114.5$
$\sum f_{Q_3 - 1} = 28$
$f_{Q_3} = 13$
$C = 119.5 - 114.5 = 5$
Quartile 3 $= L_{Q_3} + \left(\frac{\frac{3\sum f}{4} - \sum f_{Q_3 - 1}}{f_{Q_3}}\right) C = 114.5 + \left(\frac{\frac{3(50)}{4} - 28}{13}\right) 5$
Quartile 3 = 118.2 °F
Interpretation:
SOLUTION 11 (b)
Q1 = 110.2
Q3 = 118.2
Interquartile range (IQR) = Q3 − Q1
= 118.2 − 110.2
= 8
Quartile deviation = $\frac{1}{2}(Q_3 - Q_1) = \frac{1}{2}(118.2 - 110.2) = 4$
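The grouped-data quartile formula used in Solution 11 can be sketched as one Python function that takes the fraction of the total frequency (1/4, 2/4 or 3/4). The function name `grouped_quantile` and the `(lower boundary, upper boundary, frequency)` layout are assumptions for illustration. The sketch also assumes every class has a non-zero frequency.

```python
# (lower class boundary, upper class boundary, frequency) from Example 11
classes = [
    (99.5, 104.5, 2), (104.5, 109.5, 8), (109.5, 114.5, 18), (114.5, 119.5, 13),
    (119.5, 124.5, 7), (124.5, 129.5, 1), (129.5, 134.5, 1),
]

def grouped_quantile(classes, fraction):
    """L + ((fraction * total f - cumulative f before the class) / f) * C."""
    total = sum(f for _, _, f in classes)
    position = fraction * total
    cumulative_before = 0
    for lower, upper, f in classes:
        # The first class whose cumulative frequency reaches the position
        # is the class that contains the quantile.
        if cumulative_before + f >= position:
            width = upper - lower
            return lower + (position - cumulative_before) / f * width
        cumulative_before += f
    raise ValueError("fraction must be between 0 and 1")

q1 = grouped_quantile(classes, 1 / 4)
median = grouped_quantile(classes, 2 / 4)
q3 = grouped_quantile(classes, 3 / 4)

print(round(q1, 1), round(median, 1), round(q3, 1))  # 110.2 113.7 118.2
```

Working from the unrounded quartiles gives an interquartile range slightly below 8, because the slides subtract the values after rounding them to one decimal place.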
METHOD 1: BASED ON FREQUENCY DISTRIBUTION TABLE
Where
$L_p$ = lower boundary of the class containing the percentile
METHOD 2: BASED ON ‘LESS THAN’ OGIVE
EXAMPLE 12
Temperature    Number of States
100-104        2
105-109        8
110-114        18
115-119        13
120-124        7
125-129        1
130-134        1
SOLUTION 12
The 90th percentile location = $\frac{p\sum f}{100} = \frac{90(50)}{100}$ = 45th
Class of the 90th percentile: 120 – 124
$L_p = 119.5$
$\sum f_{p-1} = 41$
$f_p = 7$
$C = 124.5 - 119.5 = 5$
90th percentile $= L_p + \left(\frac{\frac{p\sum f}{100} - \sum f_{p-1}}{f_p}\right) C = 119.5 + \left(\frac{\frac{90(50)}{100} - 41}{7}\right) 5$
90th percentile = 122.4 °F
EXERCISE 8
A sample of 200 students is randomly selected from the
library. The time spent (in minutes) in the library on a
particular day was recorded.
Time spent    Number of students    Cumulative frequency    Position of data
10 – 19       45                    45                      1-45
20 – 29       55                    100                     46-100
30 – 39       42                    142                     101-142
40 – 49       28                    170                     143-170
50 – 59       30                    200                     171-200
(Class boundaries of the class 20 – 29: 19.5 – 29.5)
Position of the first quartile class = $\frac{\sum f}{4} = \frac{200}{4}$ = 50th in the array.
Class of the first quartile: 20 – 29
$L_{Q_1} = 19.5$
$\sum f_{Q_1 - 1} = 45$
$f_{Q_1} = 55$
$C = 29.5 - 19.5 = 10$
Quartile 1 $= L_{Q_1} + \left(\frac{\frac{\sum f}{4} - \sum f_{Q_1 - 1}}{f_{Q_1}}\right) C = 19.5 + \left(\frac{\frac{200}{4} - 45}{55}\right) 10$
Quartile 1 = 20.41
Position of the second quartile class = $\frac{\sum f}{2} = \frac{200}{2}$ = 100th in the array.
Class of the second quartile: 20 – 29
$L_{Q_2} = 19.5$
$\sum f_{Q_2 - 1} = 45$
$f_{Q_2} = 55$
$C = 29.5 - 19.5 = 10$
Quartile 2 $= L_{Q_2} + \left(\frac{\frac{\sum f}{2} - \sum f_{Q_2 - 1}}{f_{Q_2}}\right) C = 19.5 + \left(\frac{\frac{200}{2} - 45}{55}\right) 10$
Quartile 2 = 29.5
Position of the third quartile class = $\frac{3\sum f}{4} = \frac{3(200)}{4}$ = 150th in the array.
Class of the third quartile: 40 – 49
$L_{Q_3} = 40 - 0.5 = 39.5$
$\sum f_{Q_3 - 1} = 142$
$f_{Q_3} = 28$
$C = 49.5 - 39.5 = 10$
Quartile 3 $= L_{Q_3} + \left(\frac{\frac{3\sum f}{4} - \sum f_{Q_3 - 1}}{f_{Q_3}}\right) C = 39.5 + \left(\frac{\frac{3(200)}{4} - 142}{28}\right) 10$
Quartile 3 = 42.36
TOPIC 5.3
MEASURES OF SHAPE
METHODS
BASED ON
5.3.1 THE SHAPE OF THE HISTOGRAM
5.3.2 THE SHAPE OF THE STEM-AND-LEAF PLOT
5.3.3 THE SHAPE OF THE BOX-AND-WHISKER PLOT OR BOX-PLOT
5.3.4 THE CENTRAL TENDENCY MEASUREMENT (MEAN, MODE, MEDIAN)
5.3.5 VALUE OF THE COEFFICIENT OF SKEWNESS, Sk
MEASURES OF SHAPE
DISTRIBUTION SHAPES
EXAMPLE 13
A sample of 200 students is randomly selected from
the library. The time spent (in minutes) in the library
in a particular day was recorded.
EXERCISE 9
REFER TEXTBOOK
PAGE 153
QUESTION 11 (c)
MEASURES OF SHAPE
DISTRIBUTION SHAPES
EXAMPLE 14
0 | 9 5 7 6
1 | 2 0 2 3 1 2 4 3 8 9
2 | 5 3 8 2
3 | 6 1 7 8
4 | 1 4
EXERCISE 10
25 31 20 32 13
14 43 02 57 23
36 32 33 32 44
32 52 44 51 45
MEASURES OF SHAPE
BOX AND WHISKERS PLOT
EXERCISE 11
REFER TEXTBOOK
PAGE 154
QUESTIONS 13(b), 14 & 15
MEASURES OF SHAPE
RELATIONSHIP AMONG MEAN, MEDIAN & MODE
EXERCISE 12
REFER TEXTBOOK
PAGE 152
QUESTIONS 6 & 7
MEASURES OF SHAPE
5.3.5 BASED ON COEFFICIENT OF SKEWNESS, Sk
$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$
or
$S_k = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$
Sk                         Interpretation/Explanation/Comment
Sk = 0                     Symmetric / bell shaped / normal
-0.9999 ≤ Sk ≤ -0.0001     Approximately symmetric (or slightly skewed to the left)
0.0001 ≤ Sk ≤ 0.9999       Approximately symmetric (or slightly skewed to the right)
Sk ≥ 1                     Right skewed
Sk ≤ -1                    Left skewed
EXAMPLE 15
SOLUTION 15
$S_k = \frac{\text{Mean} - \text{Mode}}{s} = \frac{28 - 23}{4.2} = 1.1905$
OR
$S_k = \frac{3(\text{Mean} - \text{Median})}{s} = \frac{3(28 - 25)}{4.2} = 2.1429$
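Both versions of the coefficient, together with the interpretation table, can be sketched in Python as below. The function names are illustrative assumptions. The inputs (mean 28, mode 23, median 25, s = 4.2) are the values used in Solution 15.

```python
def skewness_from_mode(mean, mode, sd):
    """Sk = (mean - mode) / standard deviation."""
    return (mean - mode) / sd

def skewness_from_median(mean, median, sd):
    """Sk = 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / sd

def interpret(sk):
    """Map a coefficient of skewness to the wording of the interpretation table."""
    if sk == 0:
        return "symmetric"
    if sk >= 1:
        return "right skewed"
    if sk <= -1:
        return "left skewed"
    if sk > 0:
        return "approximately symmetric (slightly skewed to the right)"
    return "approximately symmetric (slightly skewed to the left)"

sk_mode = skewness_from_mode(28, 23, 4.2)
sk_median = skewness_from_median(28, 25, 4.2)

print(round(sk_mode, 4), interpret(sk_mode))      # 1.1905 right skewed
print(round(sk_median, 4), interpret(sk_median))  # 2.1429 right skewed
```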
EXERCISE 13
REFER TEXTBOOK
PAGE 128
QUESTION 14
TOPIC 5.4
COMPARING THE DISPERSION (OR CONSISTENCY) BETWEEN SAMPLES
METHOD
COEFFICIENT OF VARIATION (or RELATIVE DISPERSION)
INTERPRETATION OF COEFFICIENT OF VARIATION (CV)
CV           Interpretation/Explanation/Comment
Higher CV    The data are less consistent.
             The data are more dispersed.
Lower CV     The data are more consistent.
             The data are less dispersed.
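The comparison can be sketched in Python under the usual textbook definition CV = (s / mean) × 100%. That formula is not reproduced in the extracted slide text, so it is stated here as an assumption. The two samples below are made-up illustration data, not taken from the course material.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean (assumed definition)."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Hypothetical samples, for illustration only
sample_a = [10, 12, 11, 13, 9]
sample_b = [100, 150, 50, 120, 80]

cv_a = coefficient_of_variation(sample_a)
cv_b = coefficient_of_variation(sample_b)

# Per the interpretation table, the sample with the lower CV is the more consistent one.
more_consistent = "A" if cv_a < cv_b else "B"
print(round(cv_a, 1), round(cv_b, 1), more_consistent)
```

Because the CV is unit-free, it allows samples measured on different scales to be compared, which the standard deviation alone does not.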
EXAMPLE 16
EXERCISE 14
REFER TEXTBOOK
PAGE 128 (QUESTION 13)
PAGE 154 (QUESTION 13(a))