21MDS13 7. Measures of Central Tendency


Measures of Central Tendency

A measure of central tendency is a single value that attempts to
describe a set of data by identifying the central position within that
set of data. As such, measures of central tendency are sometimes
called measures of average. They are also called summary
statistics or descriptive statistics. We can think of central tendency
as the tendency of data to cluster around a middle value. The mean
(often called the average) is most likely the measure of central
tendency that you are most familiar with, but there are others, such
as the median and the mode. Each of these measures calculates the
location of the central point using a different method.

The different measures of central tendency are


arithmetic mean (or mean)
median
mode
geometric mean
harmonic mean and
weighted mean
Under different conditions, some measures of central tendency
become more appropriate to use than others. In the following
sections, we will look at these measures and learn how to calculate
them and under what conditions they are most appropriate to be
used.
Arithmetic mean
The arithmetic mean (or mean or average) is the most popular and
well known measure of central tendency. It can be used with both
discrete and continuous data, although its use is most often with
continuous data. The mean is defined as the sum of all the values in
the data set divided by the number of values (observations) in the
data set. So, if we have n values in a data set and they have
values x1,x2, …,xn, the sample mean, usually denoted by x¯ ( "x
bar"), is :
$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$
This formula is usually written in a slightly different manner using
the Greek capital letter, ∑, pronounced "sigma", which means "sum
of...":
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
This formula is for the ungrouped or raw data.
Example

Calculate the mean for 6.8, 6.6, 5.2, 5.6, 5.8

Solution

$$\bar{x} = \frac{6.8 + 6.6 + 5.2 + 5.6 + 5.8}{5} = \frac{30}{5} = 6$$

For the following data, which represent the marks obtained by 20
students in a subject, calculate the mean.

75, 89, 92, 100, 100, 84, 89, 88, 45, 67, 77, 80, 100, 33, 60, 79, 85,
99, 100, 69

(Mean = 80.55)
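
The definition of the mean for raw data can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the function name arithmetic_mean is an assumption made here, and the data are the marks of the 20 students given above.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    if not values:
        raise ValueError("need at least one value")
    return sum(values) / len(values)


# Marks obtained by 20 students (from the exercise above).
marks = [75, 89, 92, 100, 100, 84, 89, 88, 45, 67,
         77, 80, 100, 33, 60, 79, 85, 99, 100, 69]

mean_marks = arithmetic_mean(marks)
print(round(mean_marks, 2))  # 80.55
```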

Grouped Data

The mean for grouped data is obtained from the following formula:

$$\bar{x} = \frac{\sum f x}{n}$$

where x = the mid-point of the individual class
f = the frequency of the individual class
n = the sum of the frequencies, or total frequency, in the sample.

This method of calculation is called the direct method. There is
another method, called the short-cut method, which is easier to
calculate.

Short-cut method

$$\bar{x} = A + \frac{\sum f d}{n} \times c$$

where

$$d = \frac{x - A}{c}$$

A = assumed mean (normally the mid value corresponding to the
maximum frequency)
x = mid values
n = total frequency
c = width of the class interval

Example

Given the following frequency distribution, calculate the arithmetic
mean.

Marks (x)              : 64  63  62  61  60  59
Number of students (f) :  8  18  12   9   7   6

Solution

x        f      fx     d = (x - A)/c    fd
64       8     512      2               16
63      18    1134      1               18
62      12     744      0                0
61       9     549     -1               -9
60       7     420     -2              -14
59       6     354     -3              -18
Total   60    3713                      -7

Direct method

$$\bar{x} = \frac{\sum f x}{n} = \frac{3713}{60} = 61.88$$

Short-cut method

$$\bar{x} = A + \frac{\sum f d}{n} \times c$$

Here A = 62 and c = 1, so

$$\bar{x} = 62 + \frac{-7}{60} \times 1 = 61.88$$

For the frequency distribution of seed yield of sesamum given in the
table, calculate the mean yield per plot.

Yield (in g)    No. of plots (f)   Mid x    fx
64.5-84.5        3                  74.5    3 x 74.5 = 223.5
84.5-104.5       5                  94.5    472.5
104.5-124.5      7                 114.5    801.5
124.5-144.5     20                 134.5    2690.0
144.5-164.5     17                 154.5    2626.5
164.5-184.5     10                 174.5    1745.0
184.5-204.5      6                 194.5    1167.0
Total           68                          9726.0

$$\bar{x} = \frac{\sum f x}{n} = \frac{9726}{68} = 143.03$$

Yield (in g)    No. of plots (f)   Mid x    d = (x - A)/c    fd
64.5-84.5        3                  74.5    -3               -9
84.5-104.5       5                  94.5    -2              -10
104.5-124.5      7                 114.5    -1               -7
124.5-144.5     20                 134.5     0                0
144.5-164.5     17                 154.5     1               17
164.5-184.5     10                 174.5     2               20
184.5-204.5      6                 194.5     3               18
Total           68                                           29

A = 134.5 and c = 20

The mean yield per plot is 134.5 + (29/68 × 20) = 143.03 g
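
The direct and short-cut methods can be compared with a short Python sketch. The function names are illustrative assumptions, not from the notes; the mid-points and frequencies are those of the yield table above, with A = 134.5 and c = 20.

```python
def grouped_mean_direct(midpoints, freqs):
    """Direct method: sum(f * x) / n."""
    n = sum(freqs)
    return sum(f * x for x, f in zip(midpoints, freqs)) / n


def grouped_mean_shortcut(midpoints, freqs, assumed_mean, width):
    """Short-cut method: A + (sum(f * d) / n) * c, with d = (x - A) / c."""
    n = sum(freqs)
    total_fd = sum(f * (x - assumed_mean) / width
                   for x, f in zip(midpoints, freqs))
    return assumed_mean + total_fd / n * width


# Mid-points and frequencies of the yield classes.
midpoints = [74.5, 94.5, 114.5, 134.5, 154.5, 174.5, 194.5]
freqs = [3, 5, 7, 20, 17, 10, 6]

direct = grouped_mean_direct(midpoints, freqs)
shortcut = grouped_mean_shortcut(midpoints, freqs,
                                 assumed_mean=134.5, width=20)
print(round(direct, 2), round(shortcut, 2))  # 143.03 143.03
```

Both methods give the same value for any choice of A; the short-cut method only makes the hand arithmetic smaller.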

The same frequency distribution can be worked with a different
assumed mean; the result is unchanged.

Marks (x)              : 64  63  62  61  60  59
Number of students (f) :  8  18  12   9   7   6

x        f      fx     d = (x - A)/c    fd
59       6     354     -2              -12
60       7     420     -1               -7
61       9     549      0                0
62      12     744      1               12
63      18    1134      2               36
64       8     512      3               24
Total   60    3713                      53

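Direct method

$$\bar{x} = \frac{\sum f x}{n} = \frac{3713}{60} = 61.88$$

Short-cut method

$$\bar{x} = A + \frac{\sum f d}{n} \times c$$

Here A = 61 and c = 1, so

$$\bar{x} = 61 + \left(\frac{53}{60} \times 1\right) = 61.88$$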
Merits and demerits of Arithmetic mean

Merits (advantages)
1. It is rigidly defined.
2. It is easy to understand and easy to calculate.
3. If the number of observations is sufficiently large, it is more
accurate and more reliable.
4. It is a calculated value and is not based on its position in the
series.
5. It is possible to calculate even if some of the details of the data
are lacking.
6. Of all averages, it is affected least by fluctuations of sampling.
7. It provides a good basis for comparison.

Demerits (disadvantages)
1. It cannot be obtained by inspection nor located through a
frequency graph.
2. It cannot be used in the study of qualitative phenomena that are not
capable of numerical measurement, e.g. intelligence, beauty, honesty.
3. It can ignore any single observation only at the risk of losing its
accuracy.
4. It is affected very much by extreme values.
5. It cannot be calculated for open-end classes.

Median

The median is the middlemost item that divides the group into two equal
parts, one part comprising all values greater than, and the other all
values less than, that item.

Ungrouped or Raw data

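Arrange the given values in ascending order. If the number of values
is odd, the median is the middle value. If the number of values is even,
the median is the mean of the middle two values. By formula:

For any n (odd or even),

$$\text{Median} = M_d = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ value}$$

where a fractional position such as 3.5 means the average of the 3rd
and 4th values. Equivalently, when n is even,

$$\text{Median} = \text{average of the } \left(\frac{n}{2}\right)^{\text{th}} \text{ and } \left(\frac{n}{2} + 1\right)^{\text{th}} \text{ values}$$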
Example

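For the data 45, 60, 48, 100, 65, calculate the median.

Solution

Here n = 5.
First arrange the values in ascending order:
45, 48, 60, 65, 100

$$\text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ value} = \left(\frac{5+1}{2}\right)^{\text{th}} \text{ value} = 3^{\text{rd}} \text{ value} = 60$$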
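Example 5

If the values are 45, 48, 60, 65, 65, 100 g, calculate the median.

Here n = 6.

$$\text{Median} = \text{average of the } \left(\frac{n}{2}\right)^{\text{th}} \text{ and } \left(\frac{n}{2} + 1\right)^{\text{th}} \text{ values}$$

$$\left(\frac{n}{2}\right)^{\text{th}} = \left(\frac{6}{2}\right)^{\text{th}} = 3^{\text{rd}} \text{ value} = 60 \quad \text{and} \quad \left(\frac{n}{2} + 1\right)^{\text{th}} = \left(\frac{6}{2} + 1\right)^{\text{th}} = 4^{\text{th}} \text{ value} = 65$$

$$\text{Median} = \frac{60 + 65}{2} = 62.5 \text{ g}$$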

For example, in a data set of {3, 13, 2, 34, 11, 26, 47}, the sorted
order becomes {2, 3, 11, 13, 26, 34, 47}. The median is the number in
the middle {2, 3, 11, 13, 26, 34, 47}, which in this instance is 13
since there are three numbers on either side.

To find the median value in a list with an even number of values,
one must determine the middle pair, add them, and divide by two.
Again, arrange the numbers in order from lowest to highest.

For example, in a data set of {3, 13, 2, 34, 11, 17, 27, 47}, the sorted
order becomes {2, 3, 11, 13, 17, 27, 34, 47}. The median is the
average of the two numbers in the middle, 13 and 17, which in this
case is fifteen: (13 + 17) ÷ 2 = 15.
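
The odd and even cases for raw data can be sketched in Python as below. The function name median is chosen here for illustration (Python's standard library offers statistics.median for the same purpose); the data sets are the ones used in the examples above.

```python
def median(values):
    """Middle value of the sorted data; mean of the middle pair if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one value")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # the ((n + 1) / 2)th value
    return (ordered[mid - 1] + ordered[mid]) / 2  # (n/2)th and (n/2 + 1)th


print(median([3, 13, 2, 34, 11, 26, 47]))      # 13
print(median([3, 13, 2, 34, 11, 17, 27, 47]))  # 15.0
print(median([45, 48, 60, 65, 65, 100]))       # 62.5
```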
Grouped data

In a grouped distribution, values are associated with frequencies.
Grouping can be in the form of a discrete frequency distribution or a
continuous frequency distribution. Whatever may be the type of
distribution, cumulative frequencies have to be calculated to know the
total number of items.

Discrete Series

Step 1: Find the cumulative frequencies.

Step 2: Find n/2.

Step 3: Locate the cumulative frequency just greater than n/2.

Step 4: The corresponding value of x is the median.

Example 6

The following data pertain to the number of insects per plant. Find the
median number of insects per plant.

Number of insects per plant (x) : 1  2  3  4   5   6  7  8  9  10  11  12
Number of plants (f)            : 1  3  5  6  10  13  9  5  3   2   2   1

Solution

Form the cumulative frequency table:

x        f     cf
1        1      1
2        3      4
3        5      9
4        6     15
5       10     25
6       13     38
7        9     47
8        5     52
9        3     55
10       2     57
11       2     59
12       1     60
Total   60

$$\text{Median} = \text{size of the } \left(\frac{n}{2}\right)^{\text{th}} \text{ item} = \text{size of the } \left(\frac{60}{2}\right)^{\text{th}} \text{ item} = 30^{\text{th}} \text{ item}$$

The cumulative frequency just greater than 30 is 38, and the value of x
corresponding to 38 is 6. Hence the median is 6 insects per plant.
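
The four steps for a discrete series can be sketched in Python as follows. The function name discrete_median is an illustrative assumption; the data are the insect counts from the example above.

```python
from itertools import accumulate


def discrete_median(values, freqs):
    """Return the x whose cumulative frequency is the first to exceed n/2."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    # accumulate() yields the running (cumulative) frequencies.
    for x, cf in zip(values, accumulate(freqs)):
        if cf > n / 2:
            return x


insects = list(range(1, 13))                      # x = 1, 2, ..., 12
plants = [1, 3, 5, 6, 10, 13, 9, 5, 3, 2, 2, 1]   # f

median_insects = discrete_median(insects, plants)
print(median_insects)  # 6
```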

Continuous Series

The steps given below are followed for the calculation of the median in
a continuous series.

Step 1: Find the cumulative frequencies.

Step 2: Find n/2.

Step 3: See where n/2 lies among the cumulative frequencies. It will lie
between two cumulative frequencies. Corresponding to these two values
there are two "less than" class values; the class they bound is the
median class, and l is the lower of the two. cf is the cumulative
frequency up to l, f is the frequency of the median class and c is the
class interval. Substitute these values in the formula

$$\text{Median} = l + \frac{\frac{n}{2} - cf}{f} \times c$$

and calculate the median.

Example
Calculate the median for the following data.

35, 33, 42, 50, 47, 53, 36, 37, 43, 41, 41.5, 50.5, 55, 57, 39.5, 52.5,
60.25, 61.5, 53.5, 44.5

Calculate the median for the following data.

class   50-60   60-70   70-80   80-90   90-100   100-110   110-120   120-130
f       11      14      22      32      24       16        10        9

class   15-19   20-24   25-29   30-34   35-39   40-44   45-49   50-54   55-59
f       7       8       17      29      32      26      14      7       5
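
The continuous-series steps can be sketched in Python, assuming the usual interpolation formula Median = l + ((n/2 - cf) / f) × c with the symbols defined in Step 3. The function name grouped_median is an illustrative assumption; the data are the first frequency table above (classes 50-60 to 120-130, width 10).

```python
def grouped_median(lower_limits, freqs, width):
    """Median = l + ((n/2 - cf) / f) * c for a continuous series."""
    n = sum(freqs)
    half = n / 2
    cf_before = 0  # cumulative frequency up to the lower limit l
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cf_before + f >= half:
            # This class contains the (n/2)th item: the median class.
            return lower + (half - cf_before) / f * width
        cf_before += f
    raise ValueError("total frequency must be positive")


lower_limits = [50, 60, 70, 80, 90, 100, 110, 120]
freqs = [11, 14, 22, 32, 24, 16, 10, 9]

grouped_med = grouped_median(lower_limits, freqs, width=10)
print(grouped_med)
```

Here n = 138, so n/2 = 69 falls in the class 80-90 (cf = 47, f = 32), giving 80 + (22/32) × 10 = 86.875. The second table has inclusive class limits (15-19, 20-24, ...), so its class boundaries would need adjusting before the same function is applied.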

Merits of Median

1. Median is not influenced by extreme values because it is a positional
average.
2. Median can be calculated in case of distribution with open-end
intervals.
3. Median can be located even if the data are incomplete.

Demerits of Median

1. A slight change in the series may bring drastic change in median value.
2. In case of even number of items or continuous series, median is an
estimated value other than any value in the series.
3. It is not suitable for further mathematical treatment except its use in
calculating mean deviation.
4. It does not take into account all the observations.

Mode

The mode is the value that occurs most frequently. It is an actual
value, which has the highest concentration of items in and around it;
it shows the centre of concentration of the frequency around a given
value. Therefore, it is preferred where the purpose is to know the point
of highest concentration. It is, thus, a positional measure. The mode
can be located even for qualitative factors such as ability, honesty etc.

It is very important in business, for example in finding the size of
ready-made shirt, or the size of shoes or chappals, that sells most.

Computation of the mode

Ungrouped or Raw Data

For ungrouped data or a series of individual observations, the mode is
often found by mere inspection. It can also be found by arranging the
values in an array and seeing which value occurs the greatest number of
times.

Example

Find the mode for the following data

2 , 7, 10, 15, 10, 17, 8, 10, 2

arranging the values

2,2,7,8,10,10,10,15, 17

Mode = 10 (as 10 repeats three times)

In some cases the mode may be absent while in some cases there may be

more than one mode.

Example

(1) 12, 10, 15, 24, 30

(2) 10, 13, 14, 18, 20, 17, 18

(3) 7, 10, 15, 12, 7, 14, 24, 10, 7, 20, 10

In the first data set there is no mode.

In the second data set 18 is the mode (unimodal).

In the third data set the modes are 7 and 10, which occur 3 times each
(bimodal).
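
Finding the mode of raw data by counting can be sketched in Python as below. The function name modes is an illustrative assumption; it returns every value tied for the highest count, and an empty list when no value repeats, matching the three cases above.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values, or [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == top)


print(modes([12, 10, 15, 24, 30]))                         # []
print(modes([10, 13, 14, 18, 20, 17, 18]))                 # [18]
print(modes([7, 10, 15, 12, 7, 14, 24, 10, 7, 20, 10]))    # [7, 10]
```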

Grouped Data
For a discrete distribution, find the highest frequency; the
corresponding value of x is the mode.
Example:

Find the mode for the following data.

Weight of sorghum in g (x)    No. of ear heads (f)
60                             4
65                             6
70                            16
75                             8
80                             7
85                             4

The maximum frequency is 16. The corresponding x value is 70.

∴ mode = 70 g.

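Grouped Data

For a continuous distribution, the mode is calculated from the modal
class (the class with the highest frequency):

$$\text{Mode} = l + \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \times c$$

where l = lower limit of the modal class
f0 = the frequency of the class preceding the modal class
f2 = the frequency of the class succeeding the modal class
f1 = the frequency of the modal class
and c = class interval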
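
The modal-class formula can be sketched in Python as below. This is an illustration under stated assumptions, not the worked example of the notes: the function name grouped_mode is hypothetical, a missing neighbouring class is treated as having frequency 0, and the data reuse the frequency table from the median exercise (classes 50-60 to 120-130, width 10).

```python
def grouped_mode(lower_limits, freqs, width):
    """Mode = l + (f1 - f0) / (2*f1 - f0 - f2) * c for the modal class."""
    i = max(range(len(freqs)), key=freqs.__getitem__)  # modal class index
    f1 = freqs[i]
    # Assumption: a class with no neighbour on one side uses frequency 0.
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined: neighbouring frequencies equal f1")
    return lower_limits[i] + (f1 - f0) / denominator * width


lower_limits = [50, 60, 70, 80, 90, 100, 110, 120]
freqs = [11, 14, 22, 32, 24, 16, 10, 9]

grouped_mod = grouped_mode(lower_limits, freqs, width=10)
print(round(grouped_mod, 2))  # 85.56
```

Here the modal class is 80-90 with f1 = 32, f0 = 22 and f2 = 24, so the mode is 80 + (10/18) × 10.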

Example
Merits
1. It is easy to calculate and in some cases it can be found out by
inspection.
2. This is not affected by extreme values
3. This can be calculated for open end classes also.

Demerits

1. It is not based on all values
2. It is not capable of further statistical analysis

Geometric mean

The geometric mean of a data set containing n observations is the nth
root of the product of the values. If x1, x2, …, xn are the n
observations, then

$$\text{G.M.} = \sqrt[n]{x_1 x_2 \cdots x_n} = (x_1 x_2 \cdots x_n)^{1/n}$$

$$\log \text{GM} = \frac{1}{n} \log (x_1 x_2 \cdots x_n) = \frac{1}{n} (\log x_1 + \log x_2 + \dots + \log x_n) = \frac{\sum \log x_i}{n}$$

$$\text{GM} = \text{Antilog}\left(\frac{\sum \log x_i}{n}\right)$$

For grouped data

$$\text{GM} = \text{Antilog}\left(\frac{\sum f \log x_i}{n}\right)$$

GM is used in studies like bacterial growth, cell division, etc.

Example

If the weights of sorghum ear heads are 45, 60, 48, 100, 65 g, find the
geometric mean.

Weight of ear head x (g)    log x
45                          1.653
60                          1.778
48                          1.681
100                         2.000
65                          1.813
Total                       8.925

Here n = 5

$$\text{GM} = \text{Antilog}\left(\frac{\sum \log x_i}{n}\right) = \text{Antilog}\left(\frac{8.925}{5}\right) = \text{Antilog}(1.785) = 60.95 \text{ grams}$$
Grouped Data

Find the geometric mean for the following data.

Weight of sorghum in g (x)    No. of ear heads (f)    log x    f log x
50                             5                      1.699     8.495
63                            10                      1.799    17.99
65                             5                      1.813     9.065
130                           15                      2.114    31.71
135                           15                      2.130    31.95
Total                         50                      9.555    99.21

Here n = 50

$$\text{GM} = \text{Antilog}\left(\frac{\sum f \log x_i}{n}\right) = \text{Antilog}\left(\frac{99.21}{50}\right) = \text{Antilog}(1.9842) = 96.43 \text{ grams}$$

Continuous distribution

Example

For the frequency distribution of weights of fruits given in the table
below, calculate the geometric mean.

Weights of fruits (in g)    No. of fruits (f)
60-80                       22
80-100                      38
100-120                     45
120-140                     35
140-160                     20
Total                       160

Weights of fruits (in g)    No. of fruits (f)    Mid x    log x    f log x
60-80                       22                    70      1.845    40.59
80-100                      38                    90      1.954    74.25
100-120                     45                   110      2.041    91.85
120-140                     35                   130      2.114    73.99
140-160                     20                   150      2.176    43.52
Total                       160                                    324.2

Here n = 160

$$\text{GM} = \text{Antilog}\left(\frac{\sum f \log x_i}{n}\right) = \text{Antilog}\left(\frac{324.2}{160}\right) = \text{Antilog}(2.02625) = 106.23 \text{ grams}$$
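
The log and antilog procedure can be sketched in Python as below. The function name geometric_mean and its optional freqs argument are illustrative assumptions; the data are the ear-head weights and the fruit mid-points from the examples above. Because the code keeps full precision instead of three-decimal log tables, its results can differ from the hand-calculated values in the second decimal place.

```python
import math


def geometric_mean(values, freqs=None):
    """Antilog of (sum of f * log x) / n; freqs defaults to 1 for raw data."""
    if freqs is None:
        freqs = [1] * len(values)
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs positive values")
    n = sum(freqs)
    log_sum = sum(f * math.log10(x) for x, f in zip(values, freqs))
    return 10 ** (log_sum / n)  # antilog to base 10


# Raw data: weights of sorghum ear heads (g).
gm_raw = geometric_mean([45, 60, 48, 100, 65])

# Continuous distribution: class mid-points and frequencies of fruit weights.
gm_fruit = geometric_mean([70, 90, 110, 130, 150], [22, 38, 45, 35, 20])

print(round(gm_raw, 1), round(gm_fruit, 1))
```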

Merits

1. It is based on all observations
2. It is rigidly defined
3. It is more suitable for data pertaining to ratios, rates and percentages
4. It is not affected much by the presence of extreme values

Demerits

1. It cannot be used when negative values are present in the data set
2. It is difficult to calculate
Harmonic mean (H.M.)

The harmonic mean of a set of observations is defined as the reciprocal
of the mean of the reciprocals of the given values. If x1, x2, …, xn are
n observations,

$$\text{H.M.} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$

For a frequency distribution

$$\text{H.M.} = \frac{n}{\sum f_i \left(\frac{1}{x_i}\right)}$$

where n is the total frequency.

H.M. is used when we are dealing with speed, rates, etc.

Example

From the given data 5, 10, 17, 24, 30, calculate the H.M.

x        1/x
5        0.2000
10       0.1000
17       0.0588
24       0.0417
30       0.0333
Total    0.4338

$$\text{H.M.} = \frac{5}{0.4338} = 11.526$$

Example

Data on the number of tomatoes per plant are given below. Calculate the
harmonic mean.

Number of tomatoes per plant : 20  21  22  23  24  25
Number of plants             :  4   2   7   1   3   1

Solution

Number of tomatoes    No. of plants (f)    1/x       f(1/x)
per plant (x)
20                    4                    0.0500    0.2000
21                    2                    0.0476    0.0952
22                    7                    0.0454    0.3178
23                    1                    0.0435    0.0435
24                    3                    0.0417    0.1251
25                    1                    0.0400    0.0400
Total                 18                             0.8216

$$\text{H.M.} = \frac{n}{\sum f_i \left(\frac{1}{x_i}\right)} = \frac{18}{0.8216} = 21.91$$

Merits of H.M
1. It is rigidly defined
2. It is based on all observations
3. It is the most suitable average when it is desired to give greater weight
to smaller observations and less weight to the larger ones

Demerits of H.M
1. It is not easily understood
2. It is difficult to compute
3. It gives greater importance to small items and is therefore, useful only
when small items have to be given greater importance
4. It is rarely used in grouped data

For the data 3,13,11, 15, 5, 4,2 calculate GM and HM.

(GM = 5.93 and HM = 4.61)
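
The harmonic mean formulas above can be sketched in Python as below. The function name harmonic_mean and its optional freqs argument are illustrative assumptions; the data are taken from the two worked examples and the exercise. Results computed at full precision can differ slightly from the hand values, which use reciprocals rounded to four decimals.

```python
def harmonic_mean(values, freqs=None):
    """n / sum(f / x); freqs defaults to 1 for raw data."""
    if freqs is None:
        freqs = [1] * len(values)
    if any(x <= 0 for x in values):
        raise ValueError("harmonic mean needs positive values")
    n = sum(freqs)
    return n / sum(f / x for x, f in zip(values, freqs))


hm_raw = harmonic_mean([5, 10, 17, 24, 30])
hm_tomato = harmonic_mean([20, 21, 22, 23, 24, 25], [4, 2, 7, 1, 3, 1])
hm_exercise = harmonic_mean([3, 13, 11, 15, 5, 4, 2])

print(round(hm_raw, 2), round(hm_tomato, 2), round(hm_exercise, 2))
```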

Weighted mean or combined mean

If there are n observations x1, x2, …, xn with corresponding weights
w1, w2, …, wn, then the weighted mean is given by ∑wx / ∑w. This is also
called the combined mean, as the formula is the same: if the averages
and the numbers of observations in two or more related groups are known,
then the mean of the entire group can be calculated.

In a class there are 10 girls and 12 boys. The average mark of the girls
in a test is 65 and that of the boys is 60. Calculate the class average.

The class average is given by (10 × 65 + 12 × 60) / (10 + 12) = 1370/22 =
62.27
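
The weighted (combined) mean ∑wx / ∑w can be sketched in Python as below. The function name weighted_mean is an illustrative assumption; the group sizes act as the weights and the group averages as the values.

```python
def weighted_mean(values, weights):
    """sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Group averages (girls, boys) weighted by group sizes (10, 12).
class_average = weighted_mean([65, 60], [10, 12])
print(round(class_average, 2))  # 62.27, i.e. 1370 / 22
```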

The following are the important properties which a good average should
satisfy:
1. It should be easy to understand.
2. It should be simple to compute.
3. It should be rigidly defined.
4. It should be based on all the observations.
5. It should not be affected by extreme values / observations.
6. It should be capable of being used for further statistical analysis.
7. It should be stable from sample to sample.

When to use different averages


The proper average to be used depends upon the nature of the data,
the nature of the frequency distribution and, importantly, the purpose.
If the data are qualitative, only the mode can be used. For example,
when we are interested in knowing the size of shirt that is sold
most, we can use the mode. On the other hand, if the data are
quantitative we can use any one of the averages: mean, median or
mode. When the frequency distribution is skewed (not symmetrical),
the median or mode will be the proper average. In the case of raw data
in which extreme values are present, the median or mode is better. In
the case of a symmetrical distribution the mean, median or mode can be
used; however, the mean is preferred over the other two. When we are
dealing with rates, speed (travelling speed) and prices (share price),
use the harmonic mean. If we are interested in relative change, as in
the case of bacterial growth, the geometric mean is more appropriate.

An important property of the mean is that it includes every value in
your data set as part of the calculation. In addition, the mean is the
only measure of central tendency where the sum of the deviations
of each value from the mean is always zero. Its main disadvantage
is that it is affected by extreme observations. The median and mode are
not affected in this way, but they are not based on all observations.
When you have ordinal data, the median or mode is usually the best
choice. For categorical data, you have to use the mode.
There is an empirical relationship between the mean, median and mode,
which holds approximately for moderately skewed distributions. This
relationship in equation form is:

Mean – Mode = 3(Mean – Median).

Relation between arithmetic mean (AM), GM and HM

For any set of positive data, AM ≥ GM ≥ HM.

For the following data verify the above relationship:

85, 96, 76, 108, 85, 80, 100, 85, 70 and 95

(AM = 88.00, GM = 87.31 and HM = 86.63)
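
The relationship can be checked numerically with Python's standard statistics module, whose mean, geometric_mean and harmonic_mean functions are used here (geometric_mean requires Python 3.8 or later). This is only a numerical check on the ten values above, not a proof of the inequality.

```python
import statistics

data = [85, 96, 76, 108, 85, 80, 100, 85, 70, 95]

am = statistics.mean(data)            # arithmetic mean
gm = statistics.geometric_mean(data)  # geometric mean
hm = statistics.harmonic_mean(data)   # harmonic mean

print(round(am, 2), round(gm, 2), round(hm, 2))
print(am >= gm >= hm)  # True
```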

The annual percentage growth rates of profits of a business corporation
from the year 2000 to 2005 are given below. Calculate the geometric
mean.

50, 72, 54, 82, 93

(GM = 68.26)

Find the G.M. for the following data, which give the diameters of
defective screws obtained in a factory.

Diameter (in cm) : 5  15  25  35
Number           : 5   8   3   4

An investor buys Rs. 1200 worth of shares in a company each month.
During the first 5 months he bought the shares at Rs. 10, Rs. 12, Rs. 15,
Rs. 20 and Rs. 24 per share. After 5 months, what is the average price
he has paid per share?

A cyclist pedals from his house to his college at a speed of 10 km per
hour and back from the college at 15 km per hour. Find the average
speed.
