
TOPIC 2 (Part 3)

The document discusses measures of dispersion for grouped and ungrouped data including mean deviation, variance, and standard deviation. Examples are provided to demonstrate calculating these measures from raw data and frequency distribution tables. Objectives and steps for calculating the measures are outlined.

Uploaded by

Muhammad Zammeer

TOPIC 2

5.1 MEASURES OF DISPERSION
5.2 MEASURES OF POSITION
5.3 MEASURES OF SHAPE
5.4 COMPARING THE DISPERSION (OR CONSISTENCY) BETWEEN SAMPLES

REFERENCES

1. Allan G. Bluman, 2004. Elementary Statistics: A Step by Step, 5th edition. The McGraw-Hill Companies, Inc.
2. Lau, Phang and Zainudin, 2009. Statistics for UiTM, Second edition. Oxford Fajar Sdn. Bhd.
3. Prem S. Mann and Ken Black, 2007. Introductory Statistics, 6th edition. John Wiley & Sons.
4. Muhammad Rozi, Faridah, 2011. Business Statistics, First edition. Oxford Fajar Sdn. Bhd.

TOPIC 5.1
MEASURES OF DISPERSION

METHODS

1 THE MEAN DEVIATION (or AVERAGE ABSOLUTE DEVIATION)
2 VARIANCE
3 STANDARD DEVIATION

OBJECTIVES
Calculate measures of dispersion (mean deviation, variance and standard deviation) for:

(1) ungrouped data
(2) ungrouped data presented in a frequency distribution table
(3) grouped data

5.1.1 MEASURES OF
DISPERSION FOR
UNGROUPED DATA

3
UNGROUPED DATA
(Individual value)
Example: Age of student

20 25 19 22 16 27 22 23

MEAN DEVIATION FOR UNGROUPED DATA
❑ The mean deviation (or average absolute deviation) is calculated by summing the absolute differences between each observation and the mean.
❑ This sum is then divided by the number of observations.

$$\text{Mean deviation} = \frac{\sum |x - \bar{x}|}{n}$$

where $x$ = the observation
      $\bar{x}$ = sample mean
      $n$ = number of observations (or sample size)

EXAMPLE 1

Find the mean deviation for the following data.

12, 6, 3, 7, 8, 10, 11

SOLUTION 1

$$\bar{x} = \frac{\sum x}{n} = \frac{12 + 6 + 3 + 7 + 8 + 10 + 11}{7} = \frac{57}{7} = 8.1429$$

x      |x − x̄|
12     3.8571
6      2.1429
3      5.1429
7      1.1429
8      0.1429
10     1.8571
11     2.8571
       Σ|x − x̄| = 17.1429

$$\text{Mean deviation} = \frac{\sum |x - \bar{x}|}{n} = \frac{17.1429}{7} = 2.45$$
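The calculation in Solution 1 can be checked with a short script. This is a minimal sketch; the function name `mean_deviation` is an illustrative choice, not something defined in the slides.

```python
def mean_deviation(values):
    """Average absolute deviation of the observations from their mean."""
    n = len(values)
    mean = sum(values) / n
    # Sum the absolute differences |x - mean|, then divide by n.
    return sum(abs(x - mean) for x in values) / n


data = [12, 6, 3, 7, 8, 10, 11]
print(round(mean_deviation(data), 2))  # 2.45
```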

SAMPLE VARIANCE (s²) FOR UNGROUPED DATA

The sample variance is calculated as

$$s^2 = \frac{1}{n - 1}\left[\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right]$$

SAMPLE STANDARD DEVIATION (s) FOR UNGROUPED DATA

The sample standard deviation is calculated as

$$s = \sqrt{s^2} = \sqrt{\frac{1}{n - 1}\left[\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right]}$$

EXAMPLE 2

Find the variance and the standard deviation of the sample data below.

1, 7, 2, 5
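The slides give no worked solution for Example 2. The sketch below applies the sample variance formula for ungrouped data, using Σx and Σx², to the four values. The function name `sample_variance` is hypothetical and the printed figures are rounded to four decimals.

```python
import math


def sample_variance(values):
    """Sample variance: (sum(x^2) - (sum(x))^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)                   # for 1, 7, 2, 5 this is 15
    sum_x2 = sum(x * x for x in values)   # for 1, 7, 2, 5 this is 79
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


data = [1, 7, 2, 5]
s2 = sample_variance(data)
s = math.sqrt(s2)
print(round(s2, 4), round(s, 4))  # 7.5833 2.7538
```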

EXERCISE 1
16

REFER TEXTBOOK
PAGE 127
QUESTION 8

8
17

5.1.2 MEASURES OF
DISPERSION FOR
UNGROUPED DATA
(Presented in a Frequency
Distribution Table)

UNGROUPED DATA
(Frequency distribution table)
Example: Age of student

Age (years)   Number of Students
21            5
22            17
23            10
24            3
MEAN DEVIATION FOR UNGROUPED DATA (BUT PRESENTED IN FREQUENCY DISTRIBUTION TABLE)

$$\text{Mean deviation} = \frac{1}{\sum f}\sum \left(f\,|x - \bar{x}|\right)$$

where $x$ = the observation
      $\bar{x}$ = sample mean
      $\sum f$ = total frequency (sample size, or number of observations)

EXAMPLE 3
20

X f
3 4
8 4
13 20
18 10
23 2

Calculate the mean deviation value for the above data

SOLUTION 3

x     Frequency (f)   fx     f|x − x̄|
3     4               12     41
8     4               32     21
13    20              260    5
18    10              180    47.5
23    2               46     19.5
      Σf = 40         Σfx = 530   Σ(f|x − x̄|) = 134

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{530}{40} = 13.25$$

$$\text{Mean deviation} = \frac{1}{\sum f}\sum \left(f\,|x - \bar{x}|\right) = \frac{1}{40}[134] = 3.35$$
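The same steps can be sketched in code for a frequency table: each absolute deviation is weighted by its frequency. The function name `freq_mean_deviation` is an assumption made for illustration.

```python
def freq_mean_deviation(values, freqs):
    """Mean deviation for values given with their frequencies."""
    total = sum(freqs)                                        # sum of f
    mean = sum(f * x for x, f in zip(values, freqs)) / total  # sum of fx / sum of f
    # Weight each absolute deviation by its frequency.
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / total


x = [3, 8, 13, 18, 23]
f = [4, 4, 20, 10, 2]
print(freq_mean_deviation(x, f))  # 3.35
```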

SAMPLE VARIANCE (s²) AND SAMPLE STANDARD DEVIATION (s) FOR UNGROUPED DATA (BUT PRESENTED IN FREQUENCY DISTRIBUTION TABLE)

Step 1: Make a table as shown.

A              B               C     D
Variable (x)   Frequency (f)   fx    fx²

Step 2: Multiply the frequency by the value of x for each row, and place the product in column C.
Step 3: Find the sum of column C.
Step 4: Multiply the product fx by the value of x for each row, and place the product in column D.
Step 5: Find the sum of column D.
Step 6: Find the sample variance

$$s^2 = \frac{1}{n - 1}\left[\sum f x^2 - \frac{\left(\sum f x\right)^2}{n}\right], \qquad n = \sum f$$

Find the sample standard deviation

$$s = \sqrt{\frac{1}{n - 1}\left[\sum f x^2 - \frac{\left(\sum f x\right)^2}{n}\right]}, \qquad n = \sum f$$

EXAMPLE 4

The following table shows the high temperatures in degrees Fahrenheit (°F) for 50 states. Calculate the variance and standard deviation.

Temperature   Number of States
102           2
107           8
112           18
117           13
122           7
127           1
132           1

SOLUTION 4

x      Frequency (f)   fx      fx²
102    2               204     20808
107    8               856     91592
112    18              2016    225792
117    13              1521    177957
122    7               854     104188
127    1               127     16129
132    1               132     17424
       Σf = 50         Σfx = 5710   Σfx² = 653890

Find the sample variance

$$s^2 = \frac{1}{n - 1}\left[\sum f x^2 - \frac{\left(\sum f x\right)^2}{n}\right] = \frac{1}{50 - 1}\left[653890 - \frac{(5710)^2}{50}\right] = 36.9$$

Find the sample standard deviation

$$s = \sqrt{\frac{1}{n - 1}\left[\sum f x^2 - \frac{\left(\sum f x\right)^2}{n}\right]} = \sqrt{\frac{1}{50 - 1}\left[653890 - \frac{(5710)^2}{50}\right]} = \sqrt{36.9} = 6.1$$
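Steps 1 to 6 above amount to accumulating Σf, Σfx and Σfx² and substituting them into the formula. The sketch below does this for the temperature table; `freq_variance` is a hypothetical name.

```python
import math


def freq_variance(values, freqs):
    """Sample variance from a frequency table (columns fx and fx^2)."""
    n = sum(freqs)                                            # n = sum of f
    sum_fx = sum(f * x for x, f in zip(values, freqs))        # column C total
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))   # column D total
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


temps = [102, 107, 112, 117, 122, 127, 132]
states = [2, 8, 18, 13, 7, 1, 1]
s2 = freq_variance(temps, states)
print(round(s2, 1), round(math.sqrt(s2), 1))  # 36.9 6.1
```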

13
EXERCISE 2
27

A sample of 200 students is randomly selected from the


library. The time spent (in minutes) in the library in a
particular day was recorded.
Time spent Number of students
15 44
25 56
35 62
45 28
55 10

Calculate the mean deviation, Variance and Standard


deviation.

28

5.1.3 MEASURES OF
DISPERSION FOR GROUPED
DATA

GROUPED DATA
(Frequency distribution table)
Example: Age of Resident

Age (years)   Number of Residents
0 to 9        25
10 to 19      19
20 to 29      11
30 to 39      5

MEAN DEVIATION FOR GROUPED DATA

$$\text{Mean deviation} = \frac{1}{\sum f}\sum \left(f\,|x - \bar{x}|\right)$$

where $x$ = class midpoint
      $\bar{x}$ = sample mean
      $\sum f$ = total frequency (sample size, or number of observations)

EXAMPLE 5

Class interval   Frequency
1 – 5            4
6 – 10           4
11 – 15          20
16 – 20          10
21 – 25          2

Calculate the mean deviation value for the above data.

SOLUTION 5

Class interval   Frequency (f)   Midpoint (x)   fx     f|x − x̄|
1 – 5            4               3              12     41
6 – 10           4               8              32     21
11 – 15          20              13             260    5
16 – 20          10              18             180    47.5
21 – 25          2               23             46     19.5
                 Σf = 40                        Σfx = 530   Σ(f|x − x̄|) = 134

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{530}{40} = 13.25$$

$$\text{Mean deviation} = \frac{1}{\sum f}\sum \left(f\,|x - \bar{x}|\right) = \frac{1}{40}[134] = 3.35$$

SAMPLE VARIANCE (s²) AND SAMPLE STANDARD DEVIATION (s) FOR GROUPED DATA

Step 1: Make a table as shown. Find the class boundary and class midpoint.

A       B                C               D              E     F
Class   Class boundary   Frequency (f)   Midpoint (x)   fx    fx²

Step 2: Multiply the frequency by the value of the midpoint (x) for each row, and place the product in column E.
Step 3: Find the sum of column E.
Step 4: Multiply the product fx by the value of x for each row, and place the product in column F.
Step 5: Find the sum of column F.
Step 6: Find the sample variance

$$s^2 = \frac{1}{n - 1}\left[\sum f x^2 - \frac{\left(\sum f x\right)^2}{n}\right], \qquad n = \sum f$$

Find the sample standard deviation

$$s = \sqrt{\frac{1}{n - 1}\left[\sum f x^2 - \frac{\left(\sum f x\right)^2}{n}\right]}, \qquad n = \sum f$$

EXAMPLE 6
36

The following table shows the high temperatures in


degrees Fahrenheit (F) for 50 states. Calculate the
Variance and Standard deviation.
Temperature Number of
States
100-104 2
105-109 8
110-114 18
115-119 13
120-124 7
125-129 1
130-134 1

SOLUTION 6

Class     Class boundary   Frequency (f)   Midpoint (x)   fx      fx²
100-104   99.5 – 104.5     2               102            204     20808
105-109   104.5 – 109.5    8               107            856     91592
110-114   109.5 – 114.5    18              112            2016    225792
115-119   114.5 – 119.5    13              117            1521    177957
120-124   119.5 – 124.5    7               122            854     104188
125-129   124.5 – 129.5    1               127            127     16129
130-134   129.5 – 134.5    1               132            132     17424
                           Σf = 50                        Σfx = 5710   Σfx² = 653890

Find the sample variance

$$s^2 = \frac{1}{n - 1}\left[\sum f x^2 - \frac{\left(\sum f x\right)^2}{n}\right] = \frac{1}{50 - 1}\left[653890 - \frac{(5710)^2}{50}\right] = 36.9$$

Find the sample standard deviation

$$s = \sqrt{\frac{1}{n - 1}\left[\sum f x^2 - \frac{\left(\sum f x\right)^2}{n}\right]} = \sqrt{\frac{1}{50 - 1}\left[653890 - \frac{(5710)^2}{50}\right]} = \sqrt{36.9} = 6.1$$
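For grouped data the only extra step is replacing each class by its midpoint before applying the same formula. A minimal sketch follows. The helper names `class_midpoints` and `grouped_variance` are hypothetical, and the midpoint is taken as the average of the two class limits.

```python
import math


def class_midpoints(class_limits):
    """Midpoint of each class, taken as (lower limit + upper limit) / 2."""
    return [(lo + hi) / 2 for lo, hi in class_limits]


def grouped_variance(class_limits, freqs):
    """Sample variance of grouped data using class midpoints as x."""
    x = class_midpoints(class_limits)
    n = sum(freqs)
    sum_fx = sum(f * m for m, f in zip(x, freqs))
    sum_fx2 = sum(f * m * m for m, f in zip(x, freqs))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


classes = [(100, 104), (105, 109), (110, 114), (115, 119),
           (120, 124), (125, 129), (130, 134)]
freqs = [2, 8, 18, 13, 7, 1, 1]
s2 = grouped_variance(classes, freqs)
print(round(s2, 1), round(math.sqrt(s2), 1))  # 36.9 6.1
```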

19
EXERCISE 3
39

The following frequency distribution shows the hourly wage


of palm oil workers in an estate in Bintulu, Sarawak.
Hourly Wage (RM) Number of
Workers
3.50 – 3.59 1
3.60 – 3.69 2
3.70 – 3.79 2
3.80 – 3.89 4
3.90 – 3.99 5
4.00 – 4.09 6
4.10 – 4.19 3
4.20 – 4.29 2
a) Find the mean deviation, Variance and Standard
deviation.

EXERCISE 4

A 2Z Consultant is trying out a new scale to measure the productivity level of estate workers who go through a series of training sessions. The company wants to know the productivity level of its workers after the training sessions.

Productivity level   Scale                     No. of workers
Very weak            0 and less than 2.0       15
Weak                 2.0 and less than 4.0     20
Average              4.0 and less than 6.0     40
Good                 6.0 and less than 8.0     52
Very Good            8.0 and less than 10.0    43

Compute the mean deviation, variance and standard deviation of the productivity level of the workers.

20
EXERCISE 5
41

From a survey on daily expenses of housewives in


Town A, the following data are obtained.

Daily Expenses Number of housewives


(RM)
15 - 25 64
25 - 35 90
35 - 45 26
45 – 55 20

Calculate the mean deviation, Variance and Standard


deviation.

MEASURES OF VARIABILITY

• Measures of variability tell how widely the


observations are spread out around the
measure of central tendency
• Measures of spread increase with greater
variation on the variable of interest
• Measures of spread equal zero when there
is no variation

42

21
VARIANCE
• The variance is the average of the squared deviations
from the mean
• The variance will be larger when the observations
within a frequency distribution have a wide range

• Where:
➢ Range = max – min (for ungrouped data)
➢ Range = Upper boundary of the last class – Lower
boundary of the first class. (for grouped data)

43

SUMMARY
• Measures of spread tell us how widely
spread the distribution is. The standard
deviation is the most commonly used
measure of spread in descriptive statistics.

44

22
45

TOPIC 5.2
MEASURES OF POSITION

METHODS
46

1 QUARTILE
2 INTERQUARTILE RANGE (IQR)
3 QUARTILE DEVIATION
4 PERCENTILE

23
OBJECTIVES
Calculate measures of position
(Quartile, Interquartile Range, Quartile
Deviation, Percentile) for :

(1) ungrouped data


(2) ungrouped data but
presented in the frequency
distribution table
(3) grouped data
47

48

5.2.1 MEASURES OF
POSITION FOR
UNGROUPED DATA

24
UNGROUPED DATA
(Individual value)
Example: Age of student

20 25 19 22 16 27 22 23

49

QUARTILE FOR UNGROUPED DATA
• Quartiles are values that divide an ordered data set (array) into 4 equal quarters.
• The first quartile (Q1) is the value such that at most 25% of the measurements are less than Q1 and at most 75% are greater than Q1.
• The second quartile (Q2) is the median.
• The third quartile (Q3) is the value such that at most 75% of the measurements are less than Q3 and at most 25% are greater than Q3.

• Steps in computing the quartiles of raw data
Step 1: Arrange the data in ascending order.
Step 2: Find the position of Quartile 1, Quartile 2 and Quartile 3.

$$\text{Position of } Q_1 = 1\left(\frac{n + 1}{4}\right)\text{th}$$

$$\text{Position of } Q_2 = 2\left(\frac{n + 1}{4}\right)\text{th}$$

$$\text{Position of } Q_3 = 3\left(\frac{n + 1}{4}\right)\text{th}$$

Step 3: Find the values of Q1, Q2 and Q3 based on their positions.

INTERQUARTILE RANGE
FOR UNGROUPED
52
DATA
 The Interquartile range (IQR) is defined as
the difference between the third quartile and the
first quartile for a data set.

IQR = Q3 − Q1
 Interquartile range is used to identify outliers,
and it is also used as a measure of variability .

QUARTILE DEVIATION FOR UNGROUPED DATA

$$\text{Quartile deviation} = \frac{1}{2}\left(Q_3 - Q_1\right)$$

EXAMPLE 7
Referring to the data on the number of
patients at the outpatient clinic

110, 112, 98, 100, 115, 95, 100, 60

(a) Find Q1, Q2 and Q3.


(b) Find the Interquartile Range.
(c) Calculate the Quartile Deviation.

SOLUTION 7 (a)
• Step 1: Obtain the array
60, 95, 98, 100, 100, 110, 112, 115
n = 8

• Step 2: Determine the location of Q1, Q2 and Q3

$$\text{Position of } Q_1 = 1\left(\frac{n + 1}{4}\right)\text{th} = 1\left(\frac{8 + 1}{4}\right)\text{th} = 2.25\text{th}$$

$$\text{Position of } Q_2 = 2\left(\frac{n + 1}{4}\right)\text{th} = 2\left(\frac{8 + 1}{4}\right)\text{th} = 4.5\text{th}$$

$$\text{Position of } Q_3 = 3\left(\frac{n + 1}{4}\right)\text{th} = 3\left(\frac{8 + 1}{4}\right)\text{th} = 6.75\text{th}$$

• Step 3: Determine the values of Q1, Q2 and Q3 within the array based on the locations in Step 2.

X1    X2    X3    X4    X5    X6    X7    X8
60    95    98    100   100   110   112   115

Q1 lies at the 2.25th position (between X2 and X3), Q2 at the 4.5th position (between X4 and X5) and Q3 at the 6.75th position (between X6 and X7).

To obtain the values of the quartiles, some linear interpolation is required.
Q1 = 0.25(98 − 95) + 95
   = 95.75
Q2 = 0.5(100 − 100) + 100
   = 100
Q3 = 0.75(112 − 110) + 110
   = 111.5

Interpretation:
At most 96 patients visited the clinic 25% of the time.
At most 100 patients visited the clinic 50% of the time.
At most 112 patients visited the clinic 75% of the time.

SOLUTION 7 (b)

Q1 = 95.75
Q2 = 100
Q3 = 111.5

IQR = Q3 − Q1
    = 111.5 − 95.75
    = 15.75

SOLUTION 7 (c)

$$\text{Quartile deviation} = \frac{1}{2}\left(Q_3 - Q_1\right) = \frac{1}{2}(15.75) = 7.88$$
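Solution 7 can be reproduced with a short script. This is a sketch of the (n + 1)-based position rule with linear interpolation used in these slides; statistical packages often use other quartile conventions and may return slightly different values. The helper names `value_at_position` and `quartiles` are hypothetical.

```python
def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional position, by linear interpolation."""
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part of the position
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


def quartiles(values):
    """Q1, Q2, Q3 located at the k(n + 1)/4-th positions, k = 1, 2, 3."""
    ordered = sorted(values)
    n = len(ordered)
    return tuple(value_at_position(ordered, k * (n + 1) / 4) for k in (1, 2, 3))


patients = [110, 112, 98, 100, 115, 95, 100, 60]
q1, q2, q3 = quartiles(patients)
iqr = q3 - q1
print(q1, q2, q3, iqr, iqr / 2)  # 95.75 100.0 111.5 15.75 7.875
```

The quartile deviation prints as 7.875, which the slide rounds to 7.88.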

PERCENTILE FOR UNGROUPED DATA
• Steps in computing the percentiles of raw data
Step 1: Arrange the data in ascending order.
Step 2: Find the position of the percentile.

$$\text{Position of the } p\text{th percentile} = p\left(\frac{n + 1}{100}\right)\text{th}$$

Step 3: Find the value of the percentile based on its position.

30
EXAMPLE 8
61

Referring to the data on the number of patients at


the outpatient clinic. Determine the 15th percentile of
the following eight numbers:

110, 112, 98, 100, 115, 95, 100, 60

SOLUTION 8

• Step 1: Obtain the array
60, 95, 98, 100, 100, 110, 112, 115
n = 8

• Step 2: Determine the location of the 15th percentile:

$$p\left(\frac{n + 1}{100}\right) = 15\left(\frac{8 + 1}{100}\right) = 1.35\text{th}$$

• Step 3: Find the value of the 15th percentile within the array based on the location in Step 2.

X1    X2    X3    X4    X5    X6    X7    X8
60    95    98    100   100   110   112   115

The 1.35th position lies between X1 and X2.

To obtain the value of the 15th percentile, some linear interpolation is required.

$$\frac{P_{15} - 60}{1.35 - 1} = \frac{95 - 60}{2 - 1}$$

P15 = 0.35(95 − 60) + 60
    = 72.25
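The percentile steps translate directly into code. The sketch below follows the p(n + 1)/100 position rule from these slides, and the function name `percentile` is an illustrative choice. Library routines such as those in NumPy use different default conventions, so their results need not match.

```python
def percentile(values, p):
    """p-th percentile located at position p(n + 1)/100 with linear interpolation."""
    ordered = sorted(values)
    n = len(ordered)
    position = p * (n + 1) / 100     # 1-based position, may be fractional
    lower = int(position)
    fraction = position - lower
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


patients = [110, 112, 98, 100, 115, 95, 100, 60]
print(round(percentile(patients, 15), 2))  # 72.25
```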

EXERCISE 6
64

The following data give the speeds (in miles per hour) of 12 cars
traveling on a highway.

67 71 57 54 57 84
77 62 61 59 58 93

(a) Find Q1, Q2 and Q3.


(b) Find the Interquartile Range.
(c) Calculate the Quartile Deviation.
(d) Determine the 28th percentile
(e) Determine the 87th percentile

5.2.2 MEASURES OF POSITION FOR UNGROUPED DATA
(Presented in a Frequency Distribution Table)

UNGROUPED DATA
(Frequency distribution table)
Example: Age of student

Age (years)   Number of Students
21            5
22            17
23            10
24            3

QUARTILE FOR UNGROUPED DATA (BUT PRESENTED IN FREQUENCY DISTRIBUTION TABLE)

Step 1: Construct the cumulative frequency distribution table.
Step 2: Create a column for the position of data.
Step 3: Identify the class containing the first, second and third quartiles, which is the first class whose cumulative frequency is at least

$$1\left(\frac{\sum f + 1}{4}\right), \quad 2\left(\frac{\sum f + 1}{4}\right) \quad \text{and} \quad 3\left(\frac{\sum f + 1}{4}\right) \quad \text{respectively.}$$

67

INTERQUARTILE RANGE FOR UNGROUPED DATA


(BUT PRESENTED IN FREQUENCY
DISTRIBUTION
68 TABLE)

 The Interquartile range (IQR) is defined as


the difference between the third quartile and the
first quartile for a data set.

IQR = Q3 − Q1
 Interquartile range is used to identify outliers,
and it is also used as a measure of variability .

QUARTILE DEVIATION FOR UNGROUPED DATA (BUT PRESENTED IN FREQUENCY DISTRIBUTION TABLE)

$$\text{Quartile deviation} = \frac{1}{2}\left(Q_3 - Q_1\right)$$

EXAMPLE 9
70

The following table shows the high temperatures in degrees


Fahrenheit (F) for 50 states.
a) Find Q1, Q2 and Q3.
b) Find the Interquartile Range and Quartile deviation
Temperature Number of
States
102 2
107 8
112 18
117 13
122 7
127 1
132 1

SOLUTION 9 (a)

Temperature   Frequency (f)   Cumulative frequency   Position of data
102           2               2                      1 – 2
107           8               10                     3 – 10
112           18              28                     11 – 28
117           13              41                     29 – 41
122           7               48                     42 – 48
127           1               49                     49 – 49
132           1               50                     50 – 50
              Σf = 50

$$\text{Position of } Q_1 = 1\left(\frac{\sum f + 1}{4}\right) = 1\left(\frac{50 + 1}{4}\right) = 12.75\text{th}, \qquad Q_1 = 112$$

$$\text{Position of } Q_2 = 2\left(\frac{\sum f + 1}{4}\right) = 2\left(\frac{50 + 1}{4}\right) = 25.5\text{th}, \qquad Q_2 = 112$$

$$\text{Position of } Q_3 = 3\left(\frac{\sum f + 1}{4}\right) = 3\left(\frac{50 + 1}{4}\right) = 38.25\text{th}, \qquad Q_3 = 117$$

SOLUTION 9 (b)

IQR = Q3 − Q1
    = 117 − 112
    = 5

$$\text{Quartile deviation} = \frac{1}{2}\left(Q_3 - Q_1\right) = \frac{1}{2}(5) = 2.5$$
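The lookup in Solution 9 can be sketched as a scan of the cumulative frequencies: the quartile is the first value whose cumulative frequency reaches the required position. The function name `value_at` is hypothetical, and the rule is the one stated in Step 3 of these slides.

```python
from itertools import accumulate


def value_at(values, freqs, position):
    """First value whose cumulative frequency is at least the given position."""
    for x, cumulative in zip(values, accumulate(freqs)):
        if cumulative >= position:
            return x
    return values[-1]


temps = [102, 107, 112, 117, 122, 127, 132]
states = [2, 8, 18, 13, 7, 1, 1]
n = sum(states)

# Positions k(n + 1)/4 for k = 1, 2, 3 are 12.75, 25.5 and 38.25.
q1, q2, q3 = (value_at(temps, states, k * (n + 1) / 4) for k in (1, 2, 3))
print(q1, q2, q3, q3 - q1, (q3 - q1) / 2)  # 112 112 117 5 2.5
```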

PERCENTILE FOR UNGROUPED DATA (BUT PRESENTED IN FREQUENCY DISTRIBUTION TABLE)

Step 1: Construct the cumulative frequency distribution table.
Step 2: Create a column for the position of data.
Step 3: Determine the location of the pth percentile:

$$p\left(\frac{\sum f + 1}{100}\right)$$

37
EXAMPLE 10
75

The following table shows the high temperatures in degrees


Fahrenheit (F) for 50 states. Find the 30th percentile.

Temperature Number of
States
102 2
107 8
112 18
117 13

122 7

127 1

132 1

SOLUTION 10

Temperature   Frequency (f)   Cumulative frequency   Position of data
102           2               2                      1 – 2
107           8               10                     3 – 10
112           18              28                     11 – 28
117           13              41                     29 – 41
122           7               48                     42 – 48
127           1               49                     49 – 49
132           1               50                     50 – 50
              Σf = 50

$$\text{Position of } P_{30} = p\left(\frac{\sum f + 1}{100}\right) = 30\left(\frac{50 + 1}{100}\right) = 15.3\text{th}$$

P30 = 112

38
EXERCISE 7
77
A sample of 200 students is randomly selected from the
library. The time spent (in minutes) in the library in a
particular day was recorded.
Time Number of Cumulative Position of
spent students Frequency data
15 44 44 1-44
25 56 100 45-100
35 62 162 101-162
45 28 190 163-190
55 10 200 191-200
a) Find the first, second and third Quartiles.
b) Calculate the Interquartile Range
c) Compute the Quartile deviation.
d) Find the 65th percentile

78

5.2.3 MEASURES OF
POSITION FOR GROUPED
DATA

GROUPED DATA
(Frequency distribution table)
Example: Age of Resident

Age (years)   Number of Residents
0 to 9        25
10 to 19      19
20 to 29      11
30 to 39      5

QUARTILE FOR GROUPED DATA


80

The first, second (or median) and third quartiles can


be:

(i) calculated based on the distribution table.

(ii) Obtained using “less than” Ogive ( or “less


than "cumulative frequency curve)

METHOD 1: BASED ON FREQUENCY DISTRIBUTION TABLE

Step 1: Construct the cumulative frequency distribution table.
Step 2: Create a column for the position of data.
Step 3: Identify the class containing the first, second and third quartiles, which is the first class whose cumulative frequency is at least

$$\frac{\sum f}{4}, \quad \frac{2\sum f}{4} \quad \text{and} \quad \frac{3\sum f}{4} \quad \text{respectively.}$$

Step 4: Find the first, second and third quartiles

$$Q_1 = L_{Q_1} + \left(\frac{\dfrac{\sum f}{4} - \sum f_{Q_1 - 1}}{f_{Q_1}}\right) C$$

$$\text{Median, } \tilde{x} \text{ (or second quartile, } Q_2) = L_m + \left(\frac{\dfrac{\sum f}{2} - \sum f_{m - 1}}{f_m}\right) C$$

$$Q_3 = L_{Q_3} + \left(\frac{\dfrac{3\sum f}{4} - \sum f_{Q_3 - 1}}{f_{Q_3}}\right) C$$

where
$L_Q$ = lower boundary of the class containing the quartile (the quartile class)
$\sum f$ = sample size (the total frequency)
$\sum f_{Q-1}$ = cumulative frequency of the classes before the quartile class
$f_Q$ = number of observations (or frequency) in the quartile class
$C$ = width of the quartile class

METHOD 2: BASED ON ‘LESS THAN’ OGIVE

Step 1: Construct a ‘less than’ ogive.
Step 2: Find the position of the first, second and third quartiles.

$$\text{Position of } Q_1 = \frac{\sum f}{4}, \qquad \text{Position of } Q_2 = \frac{2\sum f}{4}, \qquad \text{Position of } Q_3 = \frac{3\sum f}{4}$$

Step 3: Locate each position on the y-axis (cumulative frequency) and read the corresponding quartile value from the x-axis.

42
INTERQUARTILE RANGE
FOR GROUPED
85
DATA
 The Interquartile range (IQR) is defined as
the difference between the third quartile and the
first quartile for a data set.

IQR = Q3 − Q1
 Interquartile range is used to identify outliers,
and it is also used as a measure of variability .

QUARTILE DEVIATION FOR GROUPED DATA

$$\text{Quartile deviation} = \frac{1}{2}\left(Q_3 - Q_1\right)$$

43
EXAMPLE 11
87

The following table shows the high temperatures in degrees


Fahrenheit (F) for 50 states.
a) Find Q1, Q2 and Q3.
b) Find the Interquartile Range and Quartile deviation

Temperature Number of
States
100-104 2
105-109 8
110-114 18
115-119 13
120-124 7
125-129 1
130-134 1

SOLUTION 11 (a)

Temperature   Class boundary   Frequency (f)   Cumulative frequency   Position of data
100-104       99.5 – 104.5     2               2                      1 – 2
105-109       104.5 – 109.5    8               10                     3 – 10
110-114       109.5 – 114.5    18              28                     11 – 28
115-119       114.5 – 119.5    13              41                     29 – 41
120-124       119.5 – 124.5    7               48                     42 – 48
125-129       124.5 – 129.5    1               49                     49 – 49
130-134       129.5 – 134.5    1               50                     50 – 50
                               Σf = 50

First quartile

Position of the first quartile class = Σf/4 = 50/4 = 12.5th in the array.
Class of the first quartile: 110 – 114
L_Q1 = 110 − 0.5 = 109.5
Σf_(Q1−1) = 10
f_Q1 = 18
C = 114.5 − 109.5 = 5

$$Q_1 = L_{Q_1} + \left(\frac{\dfrac{\sum f}{4} - \sum f_{Q_1 - 1}}{f_{Q_1}}\right) C = 109.5 + \left(\frac{\dfrac{50}{4} - 10}{18}\right) 5 = 110.2\ ^\circ\text{F}$$

Second quartile (median)

Position of the second quartile (median class) = Σf/2 = 50/2 = 25th in the array.
Median class: 110 – 114
L_m = 110 − 0.5 = 109.5
Σf_(m−1) = 10
f_m = 18
C = 114.5 − 109.5 = 5

$$\tilde{x} = L_m + \left(\frac{\dfrac{\sum f}{2} - \sum f_{m - 1}}{f_m}\right) C = 109.5 + \left(\frac{\dfrac{50}{2} - 10}{18}\right) 5 = 113.7\ ^\circ\text{F}$$

The median of the high temperatures of the 50 states is 113.7 °F.

Third quartile

Position of the third quartile class = 3Σf/4 = 3(50)/4 = 37.5th in the array.
Class of the third quartile: 115 – 119
L_Q3 = 115 − 0.5 = 114.5
Σf_(Q3−1) = 28
f_Q3 = 13
C = 119.5 − 114.5 = 5

$$Q_3 = L_{Q_3} + \left(\frac{\dfrac{3\sum f}{4} - \sum f_{Q_3 - 1}}{f_{Q_3}}\right) C = 114.5 + \left(\frac{\dfrac{3(50)}{4} - 28}{13}\right) 5 = 118.2\ ^\circ\text{F}$$

92

Interpretation:

25% of the states have a high temperature below 110.2 °F and the other 75% of the states have a high temperature above 110.2 °F.
50% of the states have a high temperature below 113.7 °F and the other 50% of the states have a high temperature above 113.7 °F.
75% of the states have a high temperature below 118.2 °F and the other 25% of the states have a high temperature above 118.2 °F.

SOLUTION 11 (b)

Q1 = 110.2
Q3 = 118.2

Interquartile Range (IQR) = Q3 − Q1
                          = 118.2 − 110.2
                          = 8

$$\text{Quartile deviation} = \frac{1}{2}\left(Q_3 - Q_1\right) = \frac{1}{2}(118.2 - 110.2) = 4$$
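The grouped-data quartile formula used in Solution 11 can be sketched as a single function that finds the quartile class from the cumulative frequencies and interpolates inside it. The name `grouped_quartile` and its argument layout are assumptions made for illustration. Equal class widths are assumed, as in the example.

```python
def grouped_quartile(lower_boundaries, width, freqs, k):
    """k-th quartile (k = 1, 2, 3) of grouped data: L + ((k*n/4 - F) / f) * C."""
    n = sum(freqs)
    target = k * n / 4               # position k * (sum of f) / 4
    cum_before = 0                   # cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, freqs):
        if f > 0 and cum_before + f >= target:
            return lower + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("k must be 1, 2 or 3")


lowers = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5]
freqs = [2, 8, 18, 13, 7, 1, 1]
q1, q2, q3 = (grouped_quartile(lowers, 5, freqs, k) for k in (1, 2, 3))
print(round(q1, 1), round(q2, 1), round(q3, 1))  # 110.2 113.7 118.2
```

The slides compute the interquartile range from the rounded quartiles (118.2 − 110.2 = 8). Using the unrounded values gives a slightly smaller figure.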

PERCENTILE FOR GROUPED


DATA
94

The Percentile can be:

(i) calculated based on the distribution table.

(ii) Obtained using “less than” Ogive ( or “less


than "cumulative frequency curve)

METHOD 1: BASED ON FREQUENCY DISTRIBUTION TABLE

Step 1: Construct the cumulative frequency distribution table.
Step 2: Create a column for the position of data.
Step 3: Determine the location of the pth percentile:

$$\frac{p\sum f}{100}$$

Step 4: Find the value of the pth percentile

$$p\text{th percentile} = L_p + \left(\frac{\dfrac{p\sum f}{100} - \sum f_{p - 1}}{f_p}\right) C$$

where
$L_p$ = lower boundary of the class containing the percentile (the percentile class)
$\sum f$ = sample size (or the total frequency)
$\sum f_{p-1}$ = cumulative frequency of the classes before the percentile class
$f_p$ = number of observations (or frequency) in the percentile class
$C$ = width of the percentile class

METHOD 2: BASED ON ‘LESS THAN’ OGIVE

Step 1: Construct a ‘less than’ ogive.
Step 2: Find the position of the percentile

$$\frac{p\sum f}{100}$$

Step 3: Locate the position on the y-axis (cumulative frequency) and read the corresponding percentile value from the x-axis.

EXAMPLE 12
98

The following table shows the high temperatures in degrees


Fahrenheit (F) for 50 states. Calculate the 10th percentile and
the 90th percentile.

Temperature Number of
States
100-104 2
105-109 8
110-114 18
115-119 13
120-124 7
125-129 1
130-134 1

SOLUTION 12

Temperature   Class boundary   Frequency (f)   Cumulative frequency   Position of data
100-104       99.5 – 104.5     2               2                      1 – 2
105-109       104.5 – 109.5    8               10                     3 – 10
110-114       109.5 – 114.5    18              28                     11 – 28
115-119       114.5 – 119.5    13              41                     29 – 41
120-124       119.5 – 124.5    7               48                     42 – 48
125-129       124.5 – 129.5    1               49                     49 – 49
130-134       129.5 – 134.5    1               50                     50 – 50
                               Σf = 50

The 10th percentile

Location = pΣf/100 = 10(50)/100 = 5th
Class of the 10th percentile: 105 – 109
L_p = 104.5
Σf_(p−1) = 2
f_p = 8
C = 109.5 − 104.5 = 5

$$P_{10} = L_p + \left(\frac{\dfrac{p\sum f}{100} - \sum f_{p - 1}}{f_p}\right) C = 104.5 + \left(\frac{\dfrac{10(50)}{100} - 2}{8}\right) 5 = 106.4\ ^\circ\text{F}$$

The 90th percentile

Location = pΣf/100 = 90(50)/100 = 45th
Class of the 90th percentile: 120 – 124
L_p = 119.5
Σf_(p−1) = 41
f_p = 7
C = 124.5 − 119.5 = 5

$$P_{90} = L_p + \left(\frac{\dfrac{p\sum f}{100} - \sum f_{p - 1}}{f_p}\right) C = 119.5 + \left(\frac{\dfrac{90(50)}{100} - 41}{7}\right) 5 = 122.4\ ^\circ\text{F}$$
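The percentile formula for grouped data follows the same pattern as the quartile formula, with position pΣf/100. The sketch below takes each class as a pair of boundaries so the width C is read from the percentile class itself. The name `grouped_percentile` is hypothetical.

```python
def grouped_percentile(boundaries, freqs, p):
    """p-th percentile of grouped data: L_p + ((p*n/100 - F) / f_p) * C."""
    n = sum(freqs)
    position = p * n / 100           # location of the percentile
    cum_before = 0                   # cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cum_before + f >= position:
            return lower + (position - cum_before) / f * (upper - lower)
        cum_before += f
    raise ValueError("p must lie between 0 and 100")


boundaries = [(99.5, 104.5), (104.5, 109.5), (109.5, 114.5), (114.5, 119.5),
              (119.5, 124.5), (124.5, 129.5), (129.5, 134.5)]
freqs = [2, 8, 18, 13, 7, 1, 1]
print(round(grouped_percentile(boundaries, freqs, 10), 1))  # 106.4
print(round(grouped_percentile(boundaries, freqs, 90), 1))  # 122.4
```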

EXERCISE 8

A sample of 200 students is randomly selected from the library. The time spent (in minutes) in the library on a particular day was recorded.

Time spent   Number of students   Cumulative frequency   Position of data
10 – 19      45                   45                     1-45
20 – 29      55                   100                    46-100
30 – 39      42                   142                    101-142
40 – 49      28                   170                    143-170
50 – 59      30                   200                    171-200

a) Find the first, second and third quartiles.
b) Calculate the Interquartile Range.
c) Compute the Quartile deviation.
d) Find the 85th percentile.

SOLUTION TO EXERCISE 8 (a)

First quartile

Position of the first quartile class = Σf/4 = 200/4 = 50th in the array.
Class of the first quartile: 20 – 29
L_Q1 = 19.5
Σf_(Q1−1) = 45
f_Q1 = 55
C = 29.5 − 19.5 = 10

$$Q_1 = L_{Q_1} + \left(\frac{\dfrac{\sum f}{4} - \sum f_{Q_1 - 1}}{f_{Q_1}}\right) C = 19.5 + \left(\frac{\dfrac{200}{4} - 45}{55}\right) 10 = 20.41$$

Second quartile

Position of the second quartile class = Σf/2 = 200/2 = 100th in the array.
Class of the second quartile: 20 – 29
L_Q2 = 19.5
Σf_(Q2−1) = 45
f_Q2 = 55
C = 29.5 − 19.5 = 10

$$Q_2 = L_{Q_2} + \left(\frac{\dfrac{\sum f}{2} - \sum f_{Q_2 - 1}}{f_{Q_2}}\right) C = 19.5 + \left(\frac{\dfrac{200}{2} - 45}{55}\right) 10 = 29.5$$

Third quartile

Position of the third quartile class = 3Σf/4 = 3(200)/4 = 150th in the array.
Class of the third quartile: 40 – 49
L_Q3 = 39.5
Σf_(Q3−1) = 142
f_Q3 = 28
C = 49.5 − 39.5 = 10

$$Q_3 = L_{Q_3} + \left(\frac{\dfrac{3\sum f}{4} - \sum f_{Q_3 - 1}}{f_{Q_3}}\right) C = 39.5 + \left(\frac{\dfrac{3(200)}{4} - 142}{28}\right) 10 = 42.36$$
106

TOPIC 5.3
MEASURES OF SHAPE

53
METHODS
107

BASED ON
5.3.1 THE SHAPE OF THE HISTOGRAM
5.3.2 THE SHAPE OF THE STEM-AND-LEAF PLOT
5.3.3 THE SHAPE OF THE BOX-AND-WHISKER PLOT
OR BOX-PLOT
5.3.4 THE CENTRAL TENDENCY MEASUREMENT
(MEAN, MODE, MEDIAN)
5.3.5 VALUE OF THE COEFFICIENT OF SKEWNESS,
Sk

MEASURES OF SHAPE
108

5.3.1 BASED ON SHAPE OF


HISTOGRAM

DISTRIBUTION SHAPES

• Bell-shaped: has a single peak and tapers off at either end. Symmetric.
• Uniform: basically flat or rectangular.
• J-shaped: has a few data values on the left side and increases as one moves to the right.
• Right-skewed: the peak of the distribution is to the left and the data values taper off to the right.
• Left-skewed: data values are clustered to the right and taper off to the left.
• Unimodal: distribution with one peak.
• Bimodal: has two peaks of the same height.
• U-shaped: has two peaks, on the left and right sides of the distribution.

55
EXAMPLE 13
111
A sample of 200 students is randomly selected from
the library. The time spent (in minutes) in the library
in a particular day was recorded.

Time spent Number of students


10 – 19 45
20 – 29 55
30 – 39 42
40 - 49 30
50 - 59 28

Construct a Histogram and comment on the


distribution.

EXERCISE 9
112

REFER TEXTBOOK
PAGE 153
QUESTION 11 (c)

56
MEASURES OF SHAPE
113

5.3.2 BASED ON SHAPE OF


STEM-AND-LEAF PLOT

DISTRIBUTION SHAPES

• Bell-shaped: has a single peak and tapers off at either end. Symmetric.
• Uniform: basically flat or rectangular.
• J-shaped: has a few data values on the left side and increases as one moves to the right.
• Right-skewed: the peak of the distribution is to the left and the data values taper off to the right.
• Left-skewed: data values are clustered to the right and taper off to the left.
• Unimodal: distribution with one peak.
• Bimodal: has two peaks of the same height.
• U-shaped: has two peaks, on the left and right sides of the distribution.

EXAMPLE 14
116

State the shape of the distribution.

0 9 5 7 6
1 2 0 2 3 1 2 4 3 8 9
2 5 3 8 2
3 6 1 7 8
4 1 4

58
EXERCISE 10
117

 At an outpatient testing center, the number of


cardiograms performed each day for 20 days is
shown. Construct a stem and leaf plot for the data
and comment on the distribution.

25 31 20 32 13
14 43 02 57 23
36 32 33 32 44
32 52 44 51 45

MEASURES OF SHAPE

5.3.3 BASED ON BOX-AND-WHISKER PLOT

BOX AND WHISKERS PLOT

❑ Purpose: To find out what information can be discovered about the data, such as the center and spread.
❑ A box-and-whisker plot involves five specific values:
1) The lowest value (minimum)
2) Q1
3) The median
4) Q3
5) The highest value (maximum)

❑ If the median is near the center of the box, the distribution is approximately symmetric.
❑ If the median falls to the left of the center of the box, the distribution is positively skewed.
❑ If the median falls to the right of the center of the box, the distribution is negatively skewed.

[Figure: three box-and-whisker plots, for symmetric data, left-skewed data and right-skewed data. Each plot marks the smallest value, Q1, the median, Q3 and the largest value.]

EXERCISE 11
122

REFER TEXTBOOK
PAGE 154
QUESTIONS 13(b), 14 & 15

61
MEASURES OF SHAPE
123

5.3.4 BASED ON CENTRAL


TENDENCY MEASUREMENT
(MEAN, MEDIAN
& MODE)

RELATIONSHIP AMONG MEAN, MEDIAN & MODE

• As discussed in the previous topic, a histogram or a frequency distribution curve can take either a skewed shape or a symmetrical shape.
• Knowing the values of the mean, median and mode can give us some idea about the shape of the frequency curve.

62
RELATIONSHIP AMONG MEAN, MEDIAN
& MODE
125

Mean, median, and mode for a symmetric


histogram and frequency distribution curve

RELATIONSHIP AMONG MEAN, MEDIAN


& MODE
126

Mean, median, and mode for a histogram and


frequency distribution curve skewed to
the right

63
RELATIONSHIP AMONG MEAN, MEDIAN
& MODE
127

Mean, median, and mode for a histogram and


frequency distribution curve skewed to the left

EXERCISE 12
128

REFER TEXTBOOK
PAGE 152
QUESTIONS 6 & 7

64
MEASURES OF SHAPE
129

5.3.5 BASED ON
COEFFICIENT OF
SKEWNESS, Sk

• Used to determine the skewness of data (symmetric, approximately symmetric, skewed to the left, slightly skewed to the left, skewed to the right or slightly skewed to the right).
• Also called the Skewness Coefficient or Pearson’s Measure of Skewness.

$$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$$

or

$$S_k = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$$

INTERPRETATION OF PEARSON’S MEASURE OF SKEWNESS

Sk                        Interpretation/Explanation/Comment
Sk = 0                    Symmetric / bell-shaped / normal
−0.9999 ≤ Sk ≤ −0.0001    Approximately symmetric (or slightly skewed to the left)
0.0001 ≤ Sk ≤ 0.9999      Approximately symmetric (or slightly skewed to the right)
Sk ≥ 1                    Right skewed
Sk ≤ −1                   Left skewed

66
EXAMPLE 15
133

The length of stay of cancer patients warded in Hospital Sultanah Bahiyah was recorded in a frequency distribution. From the record, the mean is 28 days, the median is 25 days and the mode is 23 days. The standard deviation is 4.2 days. Find the skewness coefficient. What is the type of distribution?

SOLUTION 15

$$S_k = \frac{\text{Mean} - \text{Mode}}{s} = \frac{28 - 23}{4.2} = 1.1905$$

OR

$$S_k = \frac{3(\text{Mean} - \text{Median})}{s} = \frac{3(28 - 25)}{4.2} = 2.1429$$

• Since the value of Sk is more than 1, the shape of the distribution is skewed to the right.
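Both versions of Pearson's coefficient and the interpretation table can be sketched as follows. The function names are hypothetical, and the classification simply encodes the thresholds from the table in these slides.

```python
def skewness_from_mode(mean, mode, sd):
    """Pearson's coefficient of skewness using the mode."""
    return (mean - mode) / sd


def skewness_from_median(mean, median, sd):
    """Pearson's coefficient of skewness using the median."""
    return 3 * (mean - median) / sd


def describe_skewness(sk):
    """Interpretation following the table: |Sk| >= 1 is skewed, otherwise near symmetric."""
    if sk == 0:
        return "symmetric"
    if sk >= 1:
        return "right skewed"
    if sk <= -1:
        return "left skewed"
    side = "right" if sk > 0 else "left"
    return f"approximately symmetric (slightly skewed to the {side})"


sk_mode = skewness_from_mode(28, 23, 4.2)
sk_median = skewness_from_median(28, 25, 4.2)
print(round(sk_mode, 4), describe_skewness(sk_mode))      # 1.1905 right skewed
print(round(sk_median, 4), describe_skewness(sk_median))  # 2.1429 right skewed
```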

67
EXERCISE 13
135

REFER TEXTBOOK
PAGE 128
QUESTION 14

136

TOPIC 5.4
COMPARING THE DISPERSION
(OR CONSISTENCY)
BETWEEN SAMPLES

68
METHOD
137

COEFFICIENT OF VARIATION
(or RELATIVE DISPERSION)

▪ Used to compare distributions with different means and variances.
▪ Gives the ratio of the standard deviation to the arithmetic mean, expressed as a percentage.

$$CV = \frac{\text{Sample standard deviation}}{\text{Sample mean}} \times 100\% = \frac{s}{\bar{X}} \times 100\%$$

69
INTERPRETATION OF COEFFICIENT OF VARIATION (CV)

CV          Interpretation/Explanation/Comment
Higher CV   The data are less consistent. The data are more dispersed.
Lower CV    The data are more consistent. The data are less dispersed.

EXAMPLE 16
140

During the first six months of 2010, the mean share price of Company A was RM1.90 with a standard deviation of RM0.50, while the mean share price of Company B was RM8.00 with a standard deviation of RM0.85. Which company’s share price is more consistent?
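The slides give no worked solution for Example 16. The sketch below applies the CV formula to both companies, with `coefficient_of_variation` as a hypothetical helper name. Under the interpretation table, the company with the lower CV is the more consistent one.

```python
def coefficient_of_variation(sd, mean):
    """Coefficient of variation: standard deviation over mean, as a percentage."""
    return sd / mean * 100


cv_a = coefficient_of_variation(0.50, 1.90)   # Company A
cv_b = coefficient_of_variation(0.85, 8.00)   # Company B
print(round(cv_a, 1), round(cv_b, 1))  # 26.3 10.6

more_consistent = "Company A" if cv_a < cv_b else "Company B"
print(more_consistent)  # Company B
```

Company B has the lower CV, so its share price is relatively less dispersed even though its standard deviation in ringgit is larger.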

70
EXERCISE 14
141

REFER TEXTBOOK
PAGE 128 (QUESTION 13)
PAGE 154 (QUESTION 13(a))
