EPSC 224 Topic 9
Topic 9 Handout
Copyright
Copyright© Egerton University
Published 2015
All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system or transmitted in any form or by any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written permission of the copyright owner.
Introduction
Imagine you have a distribution of numeric values, such as the ages of First Year students:
17, 18, 18, 19, 20, 20, 21, 17, 22, 20, 19, 18, 18, 20, 20, 40 and 40. If you were
asked to state one value that best captures and communicates the distribution, which
value would you choose? That question is the focus of this topic: the measures of
central tendency. Enjoy the discussion.
Learning Outcomes
By the end of this topic you should be able to:
$$\bar{X} = \frac{\sum X}{N}$$
X̄ – mean
∑X – summation of X
N – number of scores
Example
The mean of the scores 7, 7, 6, 5, 4, 4, 4, 3 is calculated as:
$$\bar{X} = \frac{7+7+6+5+4+4+4+3}{8} = \frac{40}{8} = 5$$
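As a minimal sketch of the formula ∑X/N, the ages listed in the Introduction can be averaged in Python. The function name `arithmetic_mean` and the variable names are illustrative assumptions, not part of the course material.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by their count (sum X / N)."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)


# Ages of the First Year students listed in the Introduction.
ages = [17, 18, 18, 19, 20, 20, 21, 17, 22, 20, 19, 18, 18, 20, 20, 40, 40]

mean_age = arithmetic_mean(ages)
print(f"N = {len(ages)}, sum = {sum(ages)}, mean = {mean_age:.2f}")
```

Here N = 17 and ∑X = 367, so the mean age is about 21.59, a value pulled upwards by the two students aged 40.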
$$\bar{X} = \frac{\sum fx}{\sum f} = \frac{\sum fx}{N}$$
f – frequency
x – midpoints of classes
N – total number of scores (∑f)
E.g. determine the mean of the data represented in the table below.

Class    f     x (midpoint)   fx     cf
65-69    1     67             67     50
60-64    3     62             186    49
55-59    4     57             228    46
50-54    7     52             364    42
45-49    9     47             423    35
40-44    11    42             462    26
35-39    8     37             296    15
30-34    4     32             128    7
25-29    2     27             54     3
20-24    1     22             22     1
         ∑f = 50              ∑fx = 2230

$$\text{Mean} = \frac{\sum fx}{\sum f} = \frac{2230}{50} = 44.60$$
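The grouped-data calculation can be sketched in Python as follows. This is an illustration only: the function name `grouped_mean` and the tuple layout are assumptions, and the class 65-69 (frequency 1) is included so that the totals match the stated ∑f = 50 and ∑fx = 2230.

```python
# Each class is (lower limit, upper limit, frequency).
# The 65-69 class is an assumption made so that the totals agree with
# the stated sums: total f = 50 and total fx = 2230.
classes = [
    (20, 24, 1),
    (25, 29, 2),
    (30, 34, 4),
    (35, 39, 8),
    (40, 44, 11),
    (45, 49, 9),
    (50, 54, 7),
    (55, 59, 4),
    (60, 64, 3),
    (65, 69, 1),
]


def grouped_mean(classes):
    """Mean of grouped data: sum of f * midpoint divided by sum of f."""
    total_f = 0
    total_fx = 0.0
    for lower, upper, f in classes:
        midpoint = (lower + upper) / 2  # x, the class midpoint
        total_f += f
        total_fx += f * midpoint
    return total_fx / total_f


print(f"Grouped mean = {grouped_mean(classes):.2f}")
```

Because every score in a class is represented by the class midpoint, the result is an approximation of the mean of the raw scores.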
Each separate sample mean receives weight proportional to the respective sample size.
Example
The mean yearly income for 200 civil servants in Nakuru is 21,600/=, whereas the mean
yearly income for 400 municipal workers is 19,500/=. What is the overall average salary
for these public servants?
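$$\bar{X} = \frac{(200 \times 21600) + (400 \times 19500)}{200 + 400} = \frac{12120000}{600} = 20200$$
The overall average salary is 20,200/=.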
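The same weighting can be written as a short Python sketch. The function name `combined_mean` and the (sample size, sample mean) pair layout are illustrative assumptions.

```python
def combined_mean(groups):
    """Overall mean of several samples.

    groups is a list of (sample_size, sample_mean) pairs; each sample
    mean is weighted by its sample size.
    """
    total_n = sum(n for n, _ in groups)
    if total_n == 0:
        raise ValueError("at least one observation is required")
    weighted_sum = sum(n * mean for n, mean in groups)
    return weighted_sum / total_n


# 200 civil servants with mean 21,600/= and 400 municipal workers
# with mean 19,500/=, as in the example above.
overall = combined_mean([(200, 21600), (400, 19500)])
print(f"Overall mean income = {overall:,.0f}/=")
```

Note that simply averaging the two means would give 20,550/=, which overstates the result because the smaller group would then count as much as the larger one.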
The median is the 50th percentile in a group of scores. It is the point below and above
which 50% of the scores fall. It divides ranked scores into two equal parts such that it
exceeds and is exceeded by the same number of observations. The determination of the
median is dependent on the type of data. For ungrouped data, the median is obtained as
follows: -
Example
The data below represent scores in a CAT:
9, 9, 6, 4, 7, 6, 8, 2, 9, 10, 3
In order of magnitude:
2, 3, 4, 6, 6, 7, 8, 9, 9, 9, 10
The median is 7, the 6th of the 11 ordered scores.
N.B. If the number of scores is even, the median lies halfway between the two middle scores.
In this case, determine the middle pair and average them.
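The procedure for ungrouped data can be sketched in Python as below. The function name `median_ungrouped` is an illustrative assumption, and the ten-score list used for the even case is hypothetical (the CAT scores with the last one dropped).

```python
def median_ungrouped(scores):
    """Median of raw scores: middle value, or average of the middle pair."""
    if not scores:
        raise ValueError("the median of an empty list is undefined")
    ordered = sorted(scores)  # arrange in order of magnitude
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: the single middle score
    # even count: halfway between the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2


cat_scores = [9, 9, 6, 4, 7, 6, 8, 2, 9, 10, 3]
odd_case = median_ungrouped(cat_scores)

# Hypothetical even-sized set: the same scores without the last one.
even_case = median_ungrouped(cat_scores[:-1])

print(odd_case, even_case)
```

For the eleven CAT scores the function returns 7; for the hypothetical ten scores the middle pair is 7 and 8, giving 7.5.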
If similar scores surround the median, e.g. 7, 7, 7, determine the exact lower and exact
upper limits of 7, i.e. 6.5 and 7.5. Divide the interval into 3 equal portions and add the
value of 2 portions to the lower limit, e.g. 6.5 + 0.67 = 7.17; this then becomes the median.
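For grouped data, determine the median class, then obtain the median as follows:
$$\text{Median} = L + \left(\frac{\frac{N+1}{2} - F}{f}\right) i$$
L – exact lower limit of the median class
F – cumulative frequency below the median class
f – frequency of the median class
N – total number of scores
i – class interval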
Example
Class    f     x (midpoint)   fx     cf
65-69    1     67             67     50
60-64    3     62             186    49
55-59    4     57             228    46
50-54    7     52             364    42
45-49    9     47             423    35
40-44    11    42             462    26
35-39    8     37             296    15
30-34    4     32             128    7
25-29    2     27             54     3
20-24    1     22             22     1
         ∑f = 50              ∑fx = 2230
In the distribution repeated above, there are 50 scores, hence the median lies between the 25th
and 26th scores. This falls in the class interval 40-44. Hence:
$$\text{Median} = 39.5 + \left(\frac{\frac{50+1}{2} - 15}{11}\right) \times 5 = 39.5 + \frac{10.5 \times 5}{11}$$
= 39.5 + 4.7727
= 44.2727
= 44.27
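The grouped-median calculation can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `grouped_median` is hypothetical, class limits are whole numbers (so the exact lower limit is the stated lower limit minus 0.5), the position of the median is taken as (N+1)/2 as in the working above, and the class 65-69 (frequency 1) is included so that the frequencies total the stated 50.

```python
# Each class is (lower limit, upper limit, frequency), in ascending order.
# The 65-69 class is an assumption made so that the frequencies total 50.
classes = [
    (20, 24, 1),
    (25, 29, 2),
    (30, 34, 4),
    (35, 39, 8),
    (40, 44, 11),
    (45, 49, 9),
    (50, 54, 7),
    (55, 59, 4),
    (60, 64, 3),
    (65, 69, 1),
]


def grouped_median(classes):
    """Median = L + (((N + 1) / 2 - F) / f) * i for grouped data."""
    n = sum(f for _, _, f in classes)
    position = (n + 1) / 2  # position of the median score
    cumulative = 0  # F: cumulative frequency below the current class
    for lower, upper, f in classes:
        if cumulative + f >= position:
            exact_lower = lower - 0.5  # L, assuming whole-number limits
            width = upper - lower + 1  # i, the class interval
            return exact_lower + (position - cumulative) / f * width
        cumulative += f
    raise ValueError("at least one class with a non-zero frequency is required")


print(f"Grouped median = {grouped_median(classes):.2f}")
```

The loop stops at the class 40-44, where L = 39.5, F = 15, f = 11 and i = 5, reproducing the value 44.27.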
(iii) It is not affected at all by extreme observations since it is a positional average.
(iv) It can be calculated when dealing with a distribution having open ended classes.
(iii) It is affected by fluctuations of sampling and thus is a less stable average than the
mean.
Topic Summary
The mean, median and mode are all measures of central tendency, so which one should you
use? The choice depends on why the measure is required. The mode is most useful when a
researcher requires a quick and relatively informal indication of a typical data value.
The median is most useful when one requires the midpoint of the data distribution. The
mean is most useful when knowledge of the exact values of the observations is important.
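To see how the three measures behave on the same data, the sketch below applies Python's standard `statistics` module to the ages from the Introduction, once with and once without the two students aged 40. The helper name `summarize` is an illustrative assumption.

```python
import statistics

# Ages of the First Year students listed in the Introduction.
ages = [17, 18, 18, 19, 20, 20, 21, 17, 22, 20, 19, 18, 18, 20, 20, 40, 40]


def summarize(values):
    """Return the mean, median and mode of a list of values."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }


with_extremes = summarize(ages)
without_extremes = summarize([age for age in ages if age != 40])

for label, result in (("all ages", with_extremes),
                      ("without the two 40s", without_extremes)):
    print(f"{label}: mean = {result['mean']:.2f}, "
          f"median = {result['median']}, mode = {result['mode']}")
```

With all 17 ages the mean is about 21.59 while the median and mode are both 20. Dropping the two extreme ages moves the mean to about 19.13, the median to 19, and leaves the mode at 20, which illustrates why the mean is the measure most sensitive to extreme observations.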