Chapter 3
Thus, in the computation of an ideal average the entire set of data at our disposal should be used and there
should not be any loss of information resulting from not using the available data. Obviously, if the whole
data is not used in computing the average, it will be unrepresentative of the distribution.
In other words, the average should possess some important and interesting mathematical properties so that its
use in further statistical theory is enhanced. For example, if we are given the averages and sizes (frequencies)
of a number of different groups then for an ideal average we should be in a position to compute the average
of the combined group. If an average is not amenable to further algebraic manipulation, then obviously its
use will be very much limited for further applications in statistical theory.
By this we mean that if we take independent random samples of the same size from a given population and
compute the average for each of these samples then, for an ideal average, the values so obtained from
different samples should not vary much from one another. The difference in the values of the average for
different samples is attributed to the so-called fluctuations of sampling. This property is also explained by
saying that an ideal average should possess sampling stability.
By extreme observations we mean very small or very large observations. Such observations should not unduly
affect the value of a good average.
General Rounding Rule In statistics the basic rounding rule is that when computations are done in the
calculation, rounding should not be done until the final answer is calculated. When rounding is done in the
intermediate steps, it tends to increase the difference between that answer and the exact one. But in the
textbook and solutions manual, it is not practical to show long decimals in the intermediate calculations;
hence, the values in the examples are carried out to enough places (usually three or four) to obtain the same
answer that a calculator would give after rounding on the last step.
Definition If $X_1, X_2, \ldots, X_N$ are the values of a variable $X$, then the arithmetic mean of $X$, denoted by $\bar{X}$, is defined
as
\[ \bar{X} = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N} \]
For the case of discrete grouped data, if $X$ assumes $k$ distinct values $X_1, X_2, \ldots, X_k$ with respective frequencies
$f_1, f_2, \ldots, f_k$, then $\bar{X}$ (the arithmetic mean of $X$) is
\[ \bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{N}, \quad \text{where } N = \sum_{i=1}^{k} f_i \]
For the case of a frequency distribution of a continuous variable grouped into class intervals, if there are $k$
classes with respective midpoints $X_1, X_2, \ldots, X_k$ and respective frequencies $f_1, f_2, \ldots, f_k$, then
the arithmetic mean $\bar{X}$ of $X$ is
\[ \bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{N}, \quad \text{where } N = \sum_{i=1}^{k} f_i \]
Example 3.1 Compute the Arithmetic mean for the following frequency distributions.
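a)
Values ($x_i$)         20  21  22  23  24  25
Frequencies ($f_i$)     5   5   7   6   6   7
Solution a)
\[ \bar{x} = \frac{\sum_{i=1}^{6} f_i x_i}{\sum_{i=1}^{6} f_i} = \frac{5 \times 20 + 5 \times 21 + 7 \times 22 + 6 \times 23 + 6 \times 24 + 7 \times 25}{36} = \frac{816}{36} = 22.66666667 \approx 22.7 \]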
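The grouped-data formula can be checked numerically. The following is a minimal sketch; the function name `grouped_mean` is illustrative and not part of the text.

```python
def grouped_mean(values, freqs):
    """Arithmetic mean of discrete grouped data: sum(f_i * x_i) / sum(f_i)."""
    total = sum(f * x for x, f in zip(values, freqs))
    n = sum(freqs)
    return total / n


# Data of Example 3.1 a)
values = [20, 21, 22, 23, 24, 25]
freqs = [5, 5, 7, 6, 6, 7]

mean = grouped_mean(values, freqs)
print(round(mean, 1))  # rounded to one more decimal place than the raw data
```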
Rounding Rule for the Mean The mean should be rounded to one more decimal place than occurs in the raw
data. For example, if the raw data are given in whole numbers, the mean should be rounded to the nearest tenth.
If the data are given in tenths, the mean should be rounded to the nearest hundredth, and so on.
Example 3.2 The mean of the observations 2, 4, 6, and 8 is $\frac{2 + 4 + 6 + 8}{4} = \frac{20}{4} = 5$. If we subtract 2 from each of
these observations, we get the new observations 0, 2, 4 and 6. The mean of the new observations is
$\frac{0 + 2 + 4 + 6}{4} = 3$, which is the mean of the original data minus the constant 2, i.e., $5 - 2 = 3$.
2. If we add an arbitrary constant to each of the observations the mean is also increased by the same constant
value.
3. If we divide each observation of a set by an arbitrary non-zero constant, the mean is divided by the same
constant.
Example 3.3 If we have a set of data 2, 4, 6 and 8 then their mean is 5. If we divide the given set of data by 2,
we get a new set of data 1, 2, 3 and 4. The mean of the new set of data is $\frac{1 + 2 + 3 + 4}{4} = 2.5$, which is equal to
$\frac{5}{2} = 2.5$.
4. If a wrong figure has been used when computing the mean, then the correct mean can be obtained without
Example 3.4 The arithmetic mean of 20 observations is 20. But while calculating this, an observation 13 was
misread as 30. Compute the correct mean.
Solution
Given $n = 20$, $\bar{x} = 20$.
Then
\[ \bar{x}(\text{correct}) = \bar{x}(\text{wrong}) + \frac{\text{correct value} - \text{wrong value}}{n} = 20 + \frac{13 - 30}{20} = 19.15 \approx 19.2 \]
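The correction used in Example 3.4 can be sketched in a few lines. The function name `corrected_mean` and its parameter names are illustrative only.

```python
def corrected_mean(wrong_mean, n, wrong_value, correct_value):
    # Replacing one value changes the total by (correct - wrong),
    # so the mean changes by that difference divided by n.
    return wrong_mean + (correct_value - wrong_value) / n


# Example 3.4: 13 was misread as 30 among 20 observations with mean 20.
fixed = corrected_mean(wrong_mean=20, n=20, wrong_value=30, correct_value=13)
print(round(fixed, 2))
```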
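5. The sum of the deviations of the observations from their arithmetic mean is always equal to zero, i.e., if
$X_1, X_2, \ldots, X_N$ denote the values of a variable $X$ and $\bar{X}$ denotes their mean, then
\[ \sum_{i=1}^{N} (X_i - \bar{X}) = 0 \]
Proof:
\[ \sum_{i=1}^{N} (X_i - \bar{X}) = (X_1 - \bar{X}) + (X_2 - \bar{X}) + \cdots + (X_N - \bar{X}) = \sum_{i=1}^{N} X_i - N\bar{X} = N\bar{X} - N\bar{X} = 0, \quad \text{since } \bar{X} = \frac{\sum_{i=1}^{N} X_i}{N} \]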
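6. If $Y$ is a linear function of $X$ then $\bar{Y}$ is the same linear function of $\bar{X}$, i.e., if $Y_i = aX_i + b$, $i = 1, 2, \ldots, N$, where $a$
and $b$ are any given constants, then $\bar{Y} = a\bar{X} + b$.
Proof:
\[ \bar{Y} = \frac{1}{N} \sum_{i=1}^{N} (aX_i + b) = \frac{1}{N} \left( a \sum_{i=1}^{N} X_i + Nb \right) = \frac{a \sum_{i=1}^{N} X_i}{N} + \frac{Nb}{N} = a\bar{X} + b \]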
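7. The sum of the squares of deviations of the given set of observations is minimum when taken from the
arithmetic mean.
Proof Mathematically, for a given frequency distribution, the sum $S = \sum_i f_i (X_i - A)^2$ is minimum when $A = \bar{X}$.
Here we use the principle of maxima and minima in differential calculus.
For $S$ to be minimum we need $\frac{dS}{dA} = 0$ and $\frac{d^2S}{dA^2} > 0$.
\[ \frac{dS}{dA} = \sum_i f_i \cdot 2 (X_i - A)(-1) = -2 \sum_i f_i (X_i - A) = 0 \]
\[ \implies \sum_i f_i X_i - \sum_i f_i A = 0 \implies A = \frac{\sum_i f_i X_i}{\sum_i f_i} = \bar{X} \]
Again,
\[ \frac{d^2S}{dA^2} = -2 \sum_i f_i (-1) = 2 \sum_i f_i = 2N > 0, \]
so $S$ attains its minimum at $A = \bar{X}$.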
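Properties 5, 6 and 7 can be illustrated numerically on the small data set 2, 4, 6, 8 used in the earlier examples. This is only a sketch: the constants `a` and `b` and the list of candidate values for A are arbitrary choices made for illustration, and a numerical check is of course not a proof.

```python
data = [2, 4, 6, 8]
mean = sum(data) / len(data)

# Property 5: deviations from the mean sum to zero.
deviation_sum = sum(x - mean for x in data)

# Property 6: if Y = aX + b, then mean(Y) = a * mean(X) + b.
a, b = 3, 2
y = [a * x + b for x in data]
mean_y = sum(y) / len(y)

# Property 7: S(A) = sum((x - A)^2) is smallest when A is the mean.
def sum_sq(candidate):
    return sum((x - candidate) ** 2 for x in data)

candidates = [mean - 1, mean - 0.5, mean, mean + 0.5, mean + 1]
best = min(candidates, key=sum_sq)

print(deviation_sum, mean_y, best)
```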
b) Weighted mean
In the computation of arithmetic mean we assumed that all items are of equal importance. It may not be so.
Importance of different items can be shown by attaching suitable weights to them relative to their importance. If
$w_1, w_2, \ldots, w_n$ are the weights assigned to the values $x_1, x_2, \ldots, x_n$ respectively, then the weighted mean is given as:
\[ \bar{x}_W = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \]
(When $w_1 = w_2 = \cdots = w_n = w$, the weighted mean reduces to the arithmetic mean.)
Example 3.5 A student was registered for five courses with 4,4,3,2 and 3 credit hours. She obtained B, A, C, D
and A grades respectively. The grading system is of the form A= 4, B=3, C=2, D=1 and F=0. Find the GPA of
the student.
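Solution Let $w_1 = 4$, $w_2 = 4$, $w_3 = 3$, $w_4 = 2$ and $w_5 = 3$, because the credit hours are the weights of the courses.
Then $x_1 = 3$, $x_2 = 4$, $x_3 = 2$, $x_4 = 1$ and $x_5 = 4$.
Therefore,
\[ \text{GPA} = \bar{x}_W = \frac{\sum_{i=1}^{5} w_i x_i}{\sum_{i=1}^{5} w_i} = \frac{4 \times 3 + 4 \times 4 + 3 \times 2 + 2 \times 1 + 3 \times 4}{4 + 4 + 3 + 2 + 3} = \frac{48}{16} = 3.00 \]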
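The GPA computation is a direct application of the weighted mean. A minimal sketch follows; the function name `weighted_mean` and the variable names are illustrative.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Example 3.5: grade points for B, A, C, D, A weighted by credit hours.
grade_points = [3, 4, 2, 1, 4]
credit_hours = [4, 4, 3, 2, 3]

gpa = weighted_mean(grade_points, credit_hours)
print(gpa)  # 3.0

# With equal weights the weighted mean reduces to the arithmetic mean.
equal = weighted_mean(grade_points, [1] * len(grade_points))
```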
c) Combined mean
Let there be two sets of observations on the variable $X$. Let $n_1$ and $\bar{x}_1$ denote the number of observations and
the mean of $X$ in the 1st set, and $n_2$ and $\bar{x}_2$ denote the number of observations and the mean of $X$ in the 2nd set.
Then the mean of the combined set of $n_1 + n_2$ observations on $X$, denoted by $\bar{x}_{12}$, is given by
\[ \bar{x}_{12} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} \]
Example 3.6 In a test given to two sections of a statistics course the average grade is 60.98. Section 1 has a
mean of 57.30, section 2 a mean of 65.30. If there are 27 students in section 1, how many students are there in
section2?
Solution Let $n_1$ and $n_2$ be the numbers of students in section 1 and section 2 respectively, and let $\bar{x}_1$ and $\bar{x}_2$ be the mean
marks of students in section 1 and section 2 respectively for the statistics course. Again let $\bar{x}_{12}$ be the combined
mean grade of the two sections in the statistics course.
Hence, we are given that $n_1 = 27$, $\bar{x}_1 = 57.30$, $\bar{x}_2 = 65.30$ and $\bar{x}_{12} = 60.98$. We are to find $n_2$.
\[ \bar{x}_{12} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} \implies 60.98 = \frac{27 \times 57.3 + n_2 \times 65.3}{27 + n_2} \implies n_2 = 23 \]
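Solving the combined-mean formula for the unknown group size gives $n_2 = n_1(\bar{x}_{12} - \bar{x}_1)/(\bar{x}_2 - \bar{x}_{12})$. The sketch below applies this rearrangement to Example 3.6; the function names are illustrative and the rearrangement assumes $\bar{x}_2 \neq \bar{x}_{12}$.

```python
def combined_mean(sizes, means):
    """Mean of the pooled observations from several groups."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


def second_group_size(n1, mean1, mean2, mean12):
    # Rearranged from mean12 = (n1*mean1 + n2*mean2) / (n1 + n2);
    # requires mean2 != mean12.
    return n1 * (mean12 - mean1) / (mean2 - mean12)


n2 = second_group_size(n1=27, mean1=57.30, mean2=65.30, mean12=60.98)
print(round(n2))

# Consistency check: pooling 27 and 23 students reproduces the combined mean.
check = combined_mean([27, 23], [57.30, 65.30])
```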
Generally, if we have $k$ different sets of data with $n_1, n_2, \ldots, n_k$ observations and $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$
arithmetic means respectively, then the arithmetic mean for the combined set of observations is given by the
relation
\[ \bar{x}_{12 \ldots k} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k} \]
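Example 3.7 The mean mark obtained by 300 candidates in statistics is 46. The mean of the top 100 of
them was found to be 70 and the mean of the last 100 was known to be 20. What is the mean of the remaining
100 candidates?
Solution Here we are given 3 different sets of data with $n_1 = n_2 = n_3 = 100$, $\bar{x}_1 = 70$, $\bar{x}_3 = 20$ and $\bar{x}_{123} = 46$.
We require $\bar{x}_2$.
Using the formula for the combined mean of three sets of data, we have
\[ \bar{x}_{123} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3}{n_1 + n_2 + n_3} \implies 46 = \frac{100 \times 70 + 100 \bar{x}_2 + 100 \times 20}{300} \implies \bar{x}_2 = 48 \]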
Merits
1. It is rigidly defined (the definition should be clear and unambiguous so that it leads to one and only
one interpretation by different persons)
Demerits
3. It cannot be used for qualitative characteristics such as intelligence, honesty, beauty, etc.
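Definition Let $X$ be a variable with values $x_1, x_2, \ldots, x_n$. Then the geometric mean of $X$, denoted by G.M or
$M_g$, is defined as:
\[ G.M = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n} = \left( \prod_{i=1}^{n} x_i \right)^{1/n} = \sqrt[n]{x_1 \cdot x_2 \cdot x_3 \cdots x_n} \quad (\text{for } x_i > 0 \text{ only}) \]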
Example 3.8 Suppose the profits earned by the Sur Construction Company on five projects were 3, 4, 4, 6 and 5
percent, respectively. What is the geometric mean profit?
Solution
\[ G.M = \sqrt[5]{3 \times 4 \times 4 \times 6 \times 5} = \sqrt[5]{1440} \approx 4.28225 \]
The geometric mean profit is 4.28225 percent. The arithmetic mean profit is 4.4 percent, found by
(3+4+4+6+5)/5. It is always true that the arithmetic mean is greater than the geometric mean for any series of
positive values, unless the items being averaged all have the same value, in which case the two averages are
equal.
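A geometric mean is conveniently computed through logarithms, which avoids forming a very large product. The following sketch reproduces Example 3.8; the function name `geometric_mean` is illustrative.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values, computed via logs."""
    if any(v <= 0 for v in values):
        raise ValueError("the geometric mean is defined for positive values only")
    return math.exp(sum(math.log(v) for v in values) / len(values))


profits = [3, 4, 4, 6, 5]  # percent, Example 3.8
gm = geometric_mean(profits)
am = sum(profits) / len(profits)

print(round(gm, 5), am)
```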
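The above form of the formula is used when dealing with ungrouped data. For discrete grouped data, the
formula of the geometric mean becomes:
\[ G.M = \left( x_1^{f_1} \cdot x_2^{f_2} \cdots x_m^{f_m} \right)^{1/n} \]
where $x_i$ is the $i$th value, $f_i$ is its frequency,
$m$ is the number of distinct values
and $n = \sum_{i=1}^{m} f_i$.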
Example 3.9 Find the geometric mean for the data given in the table below.
$x_i$   1   2   4   6
$f_i$   2   1   2   3
Solution
\[ n = \sum_i f_i = 2 + 1 + 2 + 3 = 8 \]
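For grouped data each value enters the product $f_i$ times, so the geometric mean is the $n$th root of $\prod x_i^{f_i}$ with $n = \sum f_i$. The sketch below evaluates this for the table of Example 3.9; the function name `grouped_geometric_mean` is illustrative.

```python
import math


def grouped_geometric_mean(values, freqs):
    """(x_1^f_1 * ... * x_m^f_m)^(1/n) with n = sum(f_i), computed via logs."""
    n = sum(freqs)
    log_sum = sum(f * math.log(x) for x, f in zip(values, freqs))
    return math.exp(log_sum / n)


# Table of Example 3.9
values = [1, 2, 4, 6]
freqs = [2, 1, 2, 3]

gm = grouped_geometric_mean(values, freqs)
print(round(gm, 2))
```

For continuous grouped data the same function can be called with the class marks in place of the values.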
For continuous grouped data, we use the same formula by letting the class marks represent their respective
classes.
Example 3.10 Find the geometric mean for the following continuous grouped data on the percentage increase in
salary of 16 employees of a company.
% increase in salary     0-4   5-9   10-14   15-19
Number of employees       5     6      3       2
Solution The class marks are 2, 7, 12 and 17 for the 1st, 2nd, 3rd and 4th class respectively.
Therefore,
Geometric mean is especially useful in averaging ratios, percentages, and rates of increase between two periods.
G.M. is the appropriate average to be used for computing the average rate of growth of population or average
increase in the rate of profits, sales, production etc., or the rate of money.
Let $P_0$ be the initial value of the variable (i.e. the value of the variable at the beginning), let $P_n$ be its value at the
end of period $n$, and let $r$ be the rate of growth per unit period.
Growth for period 1 is $P_0 r$ and thus the value of the variable at the end of period 1 is
\[ P_0 + P_0 r = P_0 (1 + r) \]
Growth for period 2 is $P_0 (1 + r) r$ and consequently the value of the variable at the end of the 2nd period is
\[ P_0 (1 + r) + P_0 (1 + r) r = P_0 (1 + r)(1 + r) = P_0 (1 + r)^2 \]
Proceeding similarly, the value of the variable at the end of period 3 is
\[ P_0 (1 + r)^2 + P_0 (1 + r)^2 r = P_0 (1 + r)^3 \]
In general, $P_n = P_0 (1 + r)^n$, so that
\[ r = \left( \frac{P_n}{P_0} \right)^{1/n} - 1 \]
Suppose that, instead of the values of the variable increasing at a constant rate in each period, the rate per unit period is
different, say $r_1, r_2, \ldots, r_n$ for the 1st, 2nd, ..., and $n$th period respectively. Then, as discussed above, we get
the value at the end of period 1 $= P_0 (1 + r_1)$
the value at the end of period 2 $= P_0 (1 + r_1)(1 + r_2)$
.
.
.
$P_n$ = the value at the end of period $n$ $= P_0 (1 + r_1)(1 + r_2) \cdots (1 + r_n)$
If $r$ is assumed to be the constant rate of growth per unit period, then we get
\[ P_n = P_0 (1 + r)^n \]
Hence, equating the two expressions for $P_n$, the average rate of growth over the $n$ periods is given by:
\[ (1 + r)^n = (1 + r_1)(1 + r_2) \cdots (1 + r_n) \]
If $r_1, r_2, r_3, \ldots, r_n$ denote the percentage growth per unit period for the $n$ periods respectively, then we have
\[ 1 + \frac{r}{100} = \left[ \left(1 + \frac{r_1}{100}\right) \left(1 + \frac{r_2}{100}\right) \cdots \left(1 + \frac{r_n}{100}\right) \right]^{1/n} \]
Thus we see that if rates are given as percentages then the average percentage growth rate can be obtained by
subtracting 100 from the G.M. of $(100 + r_1), (100 + r_2), \ldots, (100 + r_n)$.
Example 3.11 Find the average rate of increase in a population which increased by 20% in the first decade, by 30% in
the next and by 40% in the third.
Solution Here $r_1 = 20$, $r_2 = 30$, $r_3 = 40$ and $n = 3$. Hence, the average percentage rate of growth per decade is
\[ r = \sqrt[3]{120 \times 130 \times 140} - 100 \approx 29.74\% \]
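When the growth rates differ from period to period, the average rate is obtained from the geometric mean of the growth factors $(1 + r_i/100)$. The following sketch applies this to the three decade rates of Example 3.11; the function name `average_growth_percent` is illustrative.

```python
def average_growth_percent(rates_percent):
    """Average percentage growth per period from per-period percentage rates."""
    product = 1.0
    for r in rates_percent:
        product *= 1 + r / 100  # growth factor for one period
    n = len(rates_percent)
    return (product ** (1 / n) - 1) * 100


avg = average_growth_percent([20, 30, 40])
print(round(avg, 2))
```

Note that the result is smaller than the arithmetic mean of the three rates (30%), as expected for a geometric average of unequal values.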
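Example 3.12 The population of a country was 300 million in 1951. It became 520 million in 1969. Calculate the
percentage compound rate of growth per annum.
Solution Here $P_0 = 300$, $P_n = 520$ and $n = 1969 - 1951 = 18$ years.
If $r$ is the compound rate of growth per annum, then by the formula:
\[ P_n = P_0 (1 + r)^n \implies (1 + r)^{18} = \frac{520}{300} = \frac{26}{15} \]
\[ 1 + r = \left( \frac{26}{15} \right)^{1/18} \]
\[ r = \left( \frac{26}{15} \right)^{1/18} - 1 \approx 0.0310 = 3.10\% \]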
Example 3.13 A certain store made profits of Birr 5,000, Birr 10,000 and Birr 80,000 in 1965, 1966 and 1967
respectively. Determine the average rate of growth of this store's profits.
Solution Rate of growth of profits from 1965 to 1966 is $\frac{10,000}{5,000} \times 100 = 200\%$
Rate of growth of profits from 1966 to 1967 is $\frac{80,000}{10,000} \times 100 = 800\%$
The average rate of growth of the store's profits from 1965 to 1967 is the geometric
mean of 200 and 800.
Example 3.14 The price of a certain commodity increases from Birr 60 to Birr 140 in a period of 4 years. Find
the average percentage rate of growth of the price per year.
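Solution Here $P_0$ = Birr 60, $P_n$ = Birr 140 and $n = 4$.
Then
\[ r = \left( \frac{P_n}{P_0} \right)^{1/n} - 1 = \left( \frac{140}{60} \right)^{1/4} - 1 = 0.235930917 \]
that is, about 23.6% per year.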
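The compound growth rate between an initial and a final value follows from $P_n = P_0(1 + r)^n$. A minimal sketch for Example 3.14 is given below; the function name `compound_growth_rate` is illustrative.

```python
def compound_growth_rate(p0, pn, n):
    """Constant rate r per period such that pn = p0 * (1 + r) ** n."""
    return (pn / p0) ** (1 / n) - 1


# Example 3.14: price rises from Birr 60 to Birr 140 over 4 years.
r = compound_growth_rate(p0=60, pn=140, n=4)
print(round(r * 100, 1), "% per year")

# Applying the rate for 4 years should recover the final price.
final_price = 60 * (1 + r) ** 4
```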
Merits
1. Geometric Mean is rigidly defined
2. It is based on all observations
3. It is suitable for further mathematical treatment
Demerits
1. Because of its abstract mathematical character, geometric mean is not easy to understand and to
calculate for a non-mathematical person.
2. If any one of the observations is zero, the geometric mean becomes zero, and if any one of the
observations is negative, the geometric mean may become imaginary, regardless of the magnitude of the
other items.
Another measure of central tendency which is only occasionally used is the harmonic mean.
The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of series of observations.
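Let the values of the variable $X$ be $x_1, x_2, \ldots, x_n$. Then the harmonic mean of $X$, denoted by H.M, is defined as:
\[ H.M = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} \]
or, equivalently,
\[ \frac{1}{H.M} = \frac{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}{n} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{x_i} \]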
a) 2, 4 and 8 b) 2, 4, 3, 5, 6, 8
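Solution a)
\[ \frac{1}{H.M} = \frac{\frac{1}{2} + \frac{1}{4} + \frac{1}{8}}{3} = \frac{7}{24} \implies H.M = \frac{24}{7} = 3.43 \]
b)
\[ \frac{1}{H.M} = \frac{\frac{1}{2} + \frac{1}{4} + \frac{1}{3} + \frac{1}{5} + \frac{1}{6} + \frac{1}{8}}{6} = \frac{189/120}{6} = \frac{189}{720} \implies H.M = \frac{720}{189} = 3.81 \]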
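The two harmonic means above can be reproduced exactly with rational arithmetic. The sketch below uses Python's `fractions.Fraction`, so it assumes integer observations; the function name `harmonic_mean` is illustrative.

```python
from fractions import Fraction


def harmonic_mean(values):
    """n divided by the sum of reciprocals (exact, for non-zero integers)."""
    return len(values) / sum(Fraction(1, v) for v in values)


hm_a = harmonic_mean([2, 4, 8])
hm_b = harmonic_mean([2, 4, 3, 5, 6, 8])

print(hm_a, round(float(hm_a), 2))
print(hm_b, round(float(hm_b), 2))
```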
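For discrete grouped data, the same formula is used with slight modification. For $k$ values $x_1, x_2, \ldots, x_k$ with
frequencies $f_1, f_2, \ldots, f_k$ respectively, the harmonic mean is given as:
\[ H.M = \frac{\sum_{i=1}^{k} f_i}{\sum_{i=1}^{k} \frac{f_i}{x_i}}, \quad \text{or equivalently} \quad \frac{1}{H.M} = \frac{\sum_{i=1}^{k} \frac{f_i}{x_i}}{\sum_{i=1}^{k} f_i} \]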
Example 3.16 Find the harmonic mean for the following discrete grouped data
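$X_i$   3   6   5   4
$f_i$   2   3   1   4
Solution
\[ \sum_{i=1}^{4} f_i = 2 + 3 + 1 + 4 = 10 \]
Therefore,
\[ H.M = \frac{\sum f_i}{\sum \frac{f_i}{X_i}} = \frac{10}{\frac{2}{3} + \frac{3}{6} + \frac{1}{5} + \frac{4}{4}} \approx 4.23 \]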
For continuous grouped data, we apply the same formula as with the discrete grouped data by taking the class
marks as class representatives.
Example 3.17 Find the harmonic mean of the following continuous grouped data on the percentage increase in
salary of 16 employees of a company.
% increase in salary     0-4   5-9   10-14   15-19
Number of employees       5     6      3       2
Solution
$\sum f_i = 16$, $X_1 = 2$, $X_2 = 7$, $X_3 = 12$, $X_4 = 17$
and $f_1 = 5$, $f_2 = 6$, $f_3 = 3$ and $f_4 = 2$.
Therefore,
\[ H.M = \frac{16}{\frac{5}{2} + \frac{6}{7} + \frac{3}{12} + \frac{2}{17}} = 4.30 \]
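For grouped data the harmonic mean is the total frequency divided by the sum of $f_i / x_i$. The sketch below evaluates this for the discrete table of Example 3.16 and for the class marks of Example 3.17; the function name `grouped_harmonic_mean` is illustrative.

```python
def grouped_harmonic_mean(values, freqs):
    """sum(f_i) / sum(f_i / x_i); values must be non-zero."""
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))


# Example 3.16: discrete grouped data.
hm_discrete = grouped_harmonic_mean([3, 6, 5, 4], [2, 3, 1, 4])

# Example 3.17: class marks 2, 7, 12, 17 stand in for the classes.
hm_classes = grouped_harmonic_mean([2, 7, 12, 17], [5, 6, 3, 2])

print(round(hm_discrete, 4), round(hm_classes, 4))
```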
Merits
5. Since the reciprocals of the values of the variables are involved, it gives greater weightage to smaller
observations and as such is not very much affected by one or two big observations.
6. Sometimes the variable may be in the form 'x per y', e.g. km per hour, birr per kg, kg per cubic
cm, etc. In such cases, the harmonic mean would be the proper average if equal units of x were
considered, while the arithmetic mean would be appropriate if equal units of y were considered.
Demerits
1. It is not easy to understand and calculate.
2. Its value cannot be obtained if any one of the observations is zero.
3. It is not a representative figure of the distribution unless the phenomenon requires greater weightage
to be given to smaller items. As such, it is hardly used in business problems.
Relationship among Arithmetic Mean, Geometric Mean and Harmonic Mean
1. The arithmetic mean (A.M.), the geometric mean (G.M.) and the harmonic mean (H.M.) of a series of N-
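\[ H.M. = \frac{4}{\frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5}} = 3.116883117 \]
\[ G.M. = \sqrt[4]{2 \times 3 \times 4 \times 5} = 3.30975092 \]
\[ A.M. = \frac{2 + 3 + 4 + 5}{4} = 3.5 \]
\[ (G.M.)^2 = A.M. \times H.M. \]
\[ (\sqrt{30})^2 = \frac{11}{2} \times \frac{60}{11} \]
\[ 30 = 30 \]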
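The ordering A.M. ≥ G.M. ≥ H.M. for positive values, and the identity (G.M.)² = A.M. × H.M., which holds for two positive values, can be checked with Python's standard `statistics` module (the `geometric_mean` function requires Python 3.8 or later). The pair 5 and 6 is assumed here because it reproduces the figures 11/2, 60/11 and 30 shown above.

```python
import math
import statistics

# Four values: the three means are ordered A.M. >= G.M. >= H.M.
data = [2, 3, 4, 5]
am = statistics.mean(data)
gm = statistics.geometric_mean(data)
hm = statistics.harmonic_mean(data)
print(am, round(gm, 8), round(hm, 9))

# Two values: the square of the G.M. equals A.M. times H.M.
pair = [5, 6]
am2 = statistics.mean(pair)            # 11/2
gm2 = statistics.geometric_mean(pair)  # sqrt(30)
hm2 = statistics.harmonic_mean(pair)   # 60/11
identity_holds = math.isclose(gm2 ** 2, am2 * hm2)
```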
138,143,141,139,152,148,130 and 267 kg. Here the mean is 161 kg. but this cannot be said to be a
representative value, because seven out of the eight given values are smaller than 161.
In cases of this sort, where the data contain a few extreme values widely different from the majority of the
values, the mean should not be used.
The center point for such problems can be described using a measure of central tendency called the median.
Definition If the given values of a variable $X$ are arranged in increasing or decreasing order of magnitude, then
the middle-most value in this arrangement is called the median of $X$ (denoted by $M_d$ or $\tilde{x}$).
The median may alternatively be defined as a value of $X$ such that half of the given values of $X$ are smaller than
or equal to it and half are greater than or equal to it.
i) When the number of values, $n$, is odd, the middle-most value, i.e. the $\left( \frac{n+1}{2} \right)^{\text{th}}$ value in the arrangement, will be
the unique median of $X$:
\[ \tilde{x} = M_d = \text{the } \left( \frac{n+1}{2} \right)^{\text{th}} \text{ observation.} \]
a) 0, 5, -100, -20, 80
b) 6, 7, 9, 12, 16, 20
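Solution a) Arranging the data in ascending order we have: -100, -20, 0, 5, 80
Then $M_d = 0$.
b) The data are already in ascending order and $n = 6$ is even, so the median is the mean of the two middle values:
\[ \tilde{X} = \frac{9 + 12}{2} = \frac{21}{2} = 10.5 \]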
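The median of ungrouped data can be sketched as follows: sort the values, then take the middle value when n is odd or the mean of the two middle values when n is even. The function name `median` is illustrative.

```python
def median(values):
    """Middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([0, 5, -100, -20, 80]))   # 0
print(median([6, 7, 9, 12, 16, 20]))   # 10.5
```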
For discrete grouped data, the median is obtained by using the same formula as with the ungrouped data after
arranging the values in an increasing order.
Values ($x_i$)         6   3   0   2   5   1   4
Frequencies ($f_i$)    1  20   1  15   6   6  15
Arranging the values in increasing order gives:
$x_i$   0   1   2   3   4   5   6
$f_i$   1   6  15  20  15   6   1
For continuous grouped data, the exact median cannot be obtained unless the original raw data was retained.
There are two popular ways of locating the median for grouped data; the graphic method and the algebraic
interpolation method.
Here the ogive of the distribution is first drawn. Then through the point $\frac{n}{2}$ on the vertical axis a line parallel to
the x-axis is drawn, which intersects the ogive at a point. From this point a perpendicular is dropped to the x-axis.
The point at which it meets the x-axis is the median $M_d$.
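\[ M_d = \tilde{x} = l + \frac{c}{f} \left( \frac{n}{2} - C \right) \]
where $l$ is the lower class boundary of the median class, $c$ is its width, $f$ is its frequency, $n$ is the total number of
observations and $C$ is the cumulative frequency of all classes lower than the median class.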
Example 3.23 Find the mean and median for the following data.
Number of absent days   < 5   < 10   < 15   < 20   < 25   < 30   < 35   < 40   < 45
Number of students       29    224    465    582    634    644    650    653    655
Solution:
The data should be rearranged first. The frequencies given are "less than" cumulative frequencies. To calculate the
frequencies in the different class intervals, subtract each cumulative frequency from the one immediately following.
Number of absent days   0-5   5-10   10-15   15-20   20-25   25-30   30-35   35-40   40-45   Total
Number of students       29    195     241     117      52      10       6       3       2     655
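Taking the class midpoints 2.5, 7.5, ..., 42.5 as the $x_i$,
\[ \bar{x} = \frac{\sum_{i=1}^{9} f_i x_i}{\sum_{i=1}^{9} f_i} = \frac{8432.5}{655} = 12.8740458 \approx 12.87 \text{ days} \]
For the median,
\[ M_d = \tilde{x} = l + \frac{c}{f} \left( \frac{n}{2} - C \right), \qquad \frac{n}{2} = 327.5 \]
\[ M_d = 10 + \frac{5}{241} (327.5 - 224) \]
\[ M_d = 12.1473029 \approx 12.15 \text{ days} \]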
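The steps of Example 3.23 (recovering class frequencies from the "less than" cumulative frequencies, then computing the grouped mean and the interpolated median) can be sketched as below. The variable names are illustrative, and the sketch assumes classes of equal width 5 starting at 0, as in the example.

```python
upper_limits = [5, 10, 15, 20, 25, 30, 35, 40, 45]
cumulative = [29, 224, 465, 582, 634, 644, 650, 653, 655]
width = 5

# Class frequencies: difference between successive cumulative frequencies.
freqs = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]

# Grouped mean, using class midpoints as representatives.
midpoints = [u - width / 2 for u in upper_limits]
n = cumulative[-1]
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n

# Median by interpolation: Md = l + (c / f) * (n/2 - C).
half = n / 2
k = next(i for i, cf in enumerate(cumulative) if cf >= half)  # median class index
lower = upper_limits[k] - width              # l: lower boundary of the median class
below = cumulative[k - 1] if k > 0 else 0    # C: cumulative frequency below it
median = lower + (width / freqs[k]) * (half - below)

print(round(mean, 2), round(median, 2))
```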
arranged in ascending or descending order of magnitude e.g., to find the average intelligence, average beauty,
average honesty etc., among a group of people.
Demerits
1. In case of even number of observations for an ungrouped data, median cannot be determined exactly.
2. Median, being a positional average, is not based on each and every item of the distribution
3. Median is not suitable for further mathematical treatment i.e., given the sizes and the median values of
different groups we cannot compute the median of the combined groups.
4. Median is relatively less stable than mean, particularly for small samples since it is affected more by
fluctuations of sampling as compared with arithmetic mean.
The quartiles are measures which divide a given set of data into four equal parts. There are three quartiles,
usually denoted by $Q_1$, $Q_2$ and $Q_3$. They are obtained after arranging the data in increasing order
and are known as the first, second and third quartiles respectively.
The deciles divide a given set of data into ten equal parts. There are nine deciles, usually denoted by $D_1, D_2, \ldots,
D_9$. These measures are obtained after arranging the data in increasing order and are known as the first
decile, second decile, third decile, etc.
Similarly, the percentiles divide a given set of data into a hundred equal parts. There are 99 percentiles,
denoted by $P_1, P_2, \ldots, P_{99}$ for the first, second, third, etc. percentiles respectively. Generally, $P_i$ is used to denote
the $i$th percentile.
Percentile Formula
The percentile corresponding to a given value $X$ is computed by using the following formula:
\[ \text{Percentile} = \frac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \times 100\% \]
Example 3.24: A teacher gives a 20-point test to 10 students. The scores are shown here. Find the percentile
rank of a score of 12.
18, 15, 12, 6, 8, 2, 3, 5, 20, 10
Solution
Arrange the data in order from lowest to highest.
2, 3, 5, 6, 8, 10, 12, 15, 18, 20
Then substitute into the formula.
\[ \text{Percentile} = \frac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \times 100\% \]
Since there are six values below a score of 12, the solution is
\[ \text{Percentile} = \frac{6 + 0.5}{10} \times 100\% = 65\text{th percentile} \]
Thus, a student whose score was 12 did better than 65% of the class.
Example 3.25 Using the data in Example 3.24, find the percentile rank for a score of 6.
Solution
There are three values below 6. Thus
\[ \text{Percentile} = \frac{3 + 0.5}{10} \times 100\% = 35\text{th percentile} \]
A student who scored 6 did better than 35% of the class.
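The percentile formula, (number of values below X + 0.5) divided by the total number of values, times 100, translates directly into code. A minimal sketch using the ten test scores follows; the function name `percentile_rank` is illustrative.

```python
def percentile_rank(data, x):
    """(number of values below x + 0.5) / (total number of values) * 100."""
    below = sum(1 for v in data if v < x)
    return (below + 0.5) * 100 / len(data)


scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]

print(percentile_rank(scores, 12))  # 65.0
print(percentile_rank(scores, 6))   # 35.0
```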
Procedure Table
Finding a Data Value Corresponding to a Given Percentile
Step 1 Arrange the data in order from lowest to highest.
Step 2 Substitute into the formula
\[ c = \frac{n \cdot p}{100} \]
where
$n$ = total number of values
$p$ = percentile
Step 3A If c is not a whole number, round up to the next whole number. Starting at the lowest value, count
over to the number that corresponds to the rounded-up value.
Step 3B If c is a whole number, use the value halfway between the cth and (c+1)st values when counting up
from the lowest value.
Example 3.26 Using the scores in Example 3.24, find the value corresponding to the 25th percentile.
Solution
Step 1 Arrange the data in order from lowest to highest. 2, 3, 5, 6, 8, 10, 12, 15, 18, 20
Step 2 Compute
\[ c = \frac{n \cdot p}{100} = \frac{10 \times 25}{100} = 2.5 \]
Step 3 If $c$ is not a whole number, round it up to the next whole number; in this case, $c = 3$. (If $c$ is a whole
number, see Example 3.27.) Start at the lowest value and count over to the third value, which is 5. Hence, the
value 5 corresponds to the 25th percentile.
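The procedure table (compute c = n·p/100, round up if c is not a whole number, otherwise average the c-th and (c+1)-st values) can be sketched as follows. The function name `value_at_percentile` is illustrative, and the sketch assumes 0 < p < 100.

```python
import math


def value_at_percentile(data, p):
    """Data value corresponding to the p-th percentile (0 < p < 100)."""
    ordered = sorted(data)              # Step 1
    c = len(ordered) * p / 100          # Step 2
    if not c.is_integer():              # Step 3A: round up and count over
        return ordered[math.ceil(c) - 1]
    c = int(c)                          # Step 3B: halfway between c-th and (c+1)-st
    return (ordered[c - 1] + ordered[c]) / 2


scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]

print(value_at_percentile(scores, 25))  # c = 2.5, so the 3rd value: 5
print(value_at_percentile(scores, 60))  # c = 6, so halfway between 10 and 12: 11.0
```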
Example 3.27 Using the data set in Example 3.24, find the value that corresponds to the 60th percentile.
Solution
The values of $Q_2$, $M_d$, $D_5$ and $P_{50}$ are equal, as are those of $Q_3$ and $P_{75}$. This is not a coincidence. The relationship
always holds because $M_d$, $Q_2$, $D_5$ and $P_{50}$ each divide the set of numbers into two equal parts; similarly, $Q_3$ and $P_{75}$
are the values below which 75% of the numbers lie.
i) $M_d = Q_2 = D_5 = P_{50}$
ii) $Q_1 = P_{25}$
iii) $Q_3 = P_{75}$
iv) $D_1 = P_{10}$, $D_2 = P_{20}$, $D_3 = P_{30}$, ..., $D_9 = P_{90}$
Procedure Table
Finding Data Values Corresponding to Q1, Q2, and Q3
Step 1 Arrange the data in order from lowest to highest.
Step 2 Find the median of the data values. This is the value for Q2.
Step 3 Find the median of the data values that fall below Q2. This is the value for Q1.
Step 4 Find the median of the data values that fall above Q2. This is the value for Q3.
Example 3.28 Find Q1, Q2, and Q3 for the data set 15, 13, 6, 5, 12, 50, 22, 18.
Solution
Step 1 Arrange the data in order.
5, 6, 12, 13, 15, 18, 22, 50
Step 2 Find the median ($Q_2$).
\[ M_d = \frac{13 + 15}{2} = 14 \]
Step 3 Find the median of the data values less than 14.
5, 6, 12, 13
\[ Q_1 = \frac{6 + 12}{2} = 9 \]
So $Q_1$ is 9.
Step 4 Find the median of the data values greater than 14.
15, 18, 22, 50
\[ Q_3 = \frac{18 + 22}{2} = 20 \]
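The median-of-halves procedure for the quartiles can be sketched as below. The function names are illustrative. One assumption is made explicit in the code: when n is odd, the middle observation is left out of both halves, which is one common reading of "the data values that fall below (above) Q2".

```python
def _median(ordered):
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(data):
    """Q1, Q2, Q3 as the medians of the lower half, the whole set and the upper half."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for odd n the middle observation is excluded from both halves.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return _median(lower), _median(ordered), _median(upper)


q1, q2, q3 = quartiles([15, 13, 6, 5, 12, 50, 22, 18])
print(q1, q2, q3)  # 9.0 14.0 20.0
```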
For discrete grouped data, the quantiles are obtained by using the same set of formulae as with the ungrouped data
after arranging the values in increasing order.
Example 3.29 The data given below is the distribution of 99 students according to the total number of credits
they are taking in a semester.
Number of students 8 10 10 16 20 25 10 99
Find the 1st quartile, 7th decile and the 60th percentile.
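Solution The data is already arranged according to numerical size.
Therefore, the median is the $\left( \frac{n+1}{2} \right)^{\text{th}}$, i.e. the 50th, observation, and
$Q_1$ = the median of the values less than the 50th observation = the $\left( \frac{49+1}{2} \right)^{\text{th}}$, i.e. the 25th, observation.
For the 60th percentile,
\[ c = \frac{n \cdot p}{100} = \frac{99 \times 60}{100} = 59.4 \]
Start at the lowest value and count over to the 60th value, which is 17 credit hours. Hence, the value 17
corresponds to the 60th percentile.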
Note: The inclusion of a third column on cumulative distribution may be helpful in detecting the values.
For continuous grouped data the exact quantiles cannot be obtained unless we can restore the ungrouped data. In
such a case, we develop an approximating formula which is analogous to that of the median to each of the
quantiles. These can be given as:
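\[ Q_i = l + \frac{c}{f} \left( \frac{i n}{4} - C \right) \]
Where $l$ = the lower class boundary of the $Q_i$ class
\[ D_i = l + \frac{c}{f} \left( \frac{i n}{10} - C \right) \]
Where $l$ = the lower class boundary of the $D_i$ class
$n$ = total number of observations
$C$ = cumulative frequency of all classes lower than the $D_i$ class
\[ P_i = l + \frac{c}{f} \left( \frac{i n}{100} - C \right) \]
Where $l$ = the lower class boundary of the $P_i$ class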
Example 3.30 The frequency distribution of the scores of 50 students in a final examination is given in the table
below.
Scores    Frequency   Cumulative frequency
46-50         4             4
51-55         8            12
56-60        15            27
61-65         5            32
66-70         9            41
71-75         5            46
76-80         3            49
81-100        1            50
$Q_3$ is the score of the $\frac{3}{4}(50)^{\text{th}}$ = 37.5th student. This student is in the fifth class. The intermediate values needed
for the calculation of $Q_3$ are:
$l = 65.5$, $C = 32$, $f = 9$, $c = 5$
Therefore,
\[ Q_3 = 65.5 + \frac{5}{9} (37.5 - 32) = 65.5 + 3.06 = 68.56 \]
ii) To find $D_5$
$D_5$ is the score of the $\left( \frac{5}{10} \times 50 \right)^{\text{th}}$ = 25th student. This student is in the third class. The values needed for its
calculation are
$l = 55.5$, $C = 12$, $f = 15$, and $c = 5$
$P_{40}$ is the score of the $\left( \frac{40}{100} \times 50 \right)^{\text{th}}$ = 20th student. The 20th student is in the third class. The values needed for its
determination are
$l = 55.5$, $C = 12$, $f = 15$, $c = 5$
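All three interpolation formulas have the form l + (c/f)(position − C), where the position is i·n/4, i·n/10 or i·n/100. The sketch below applies this to the score distribution of Example 3.30. The function name `grouped_quantile` and the tuple layout are illustrative, and the class boundaries (45.5, 50.5, ...) are inferred from the values l = 65.5 and l = 55.5 used in the example.

```python
def grouped_quantile(classes, position):
    """Interpolated quantile for grouped data.

    classes: list of (lower_boundary, width, frequency) in ascending order.
    position: i*n/4 for quartiles, i*n/10 for deciles, i*n/100 for percentiles.
    """
    cumulative = 0  # C: cumulative frequency of the classes already passed
    for lower, width, freq in classes:
        if cumulative + freq >= position:
            return lower + (width / freq) * (position - cumulative)
        cumulative += freq
    raise ValueError("position exceeds the total frequency")


# Example 3.30 (boundaries assumed to lie 0.5 below each lower class limit).
classes = [(45.5, 5, 4), (50.5, 5, 8), (55.5, 5, 15), (60.5, 5, 5),
           (65.5, 5, 9), (70.5, 5, 5), (75.5, 5, 3), (80.5, 20, 1)]
n = sum(freq for _, _, freq in classes)

q3 = grouped_quantile(classes, 3 * n / 4)
d5 = grouped_quantile(classes, 5 * n / 10)
p40 = grouped_quantile(classes, 40 * n / 100)
print(round(q3, 2), round(d5, 2), round(p40, 2))
```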
A mode can also be obtained for a numerical set of data; but mode is especially useful in describing nominal
and ordinal data.
For ungrouped data (raw data), the mode is the value of the observation(s) with the highest frequency, if any.
Example 3.32 Find the mode(s) of each of the following sets of data
i) 3 5 5 4 6 5 4 5 5 4 7 8 5
ii) 3 8 8 7 4 7 2 9 7 8 1
iii) 1 3 5 8 3 8 5 1 1 3 8 5
Solution i) To make the detection of the mode(s) easier we group values with the same magnitude together.
3 4 4 4 5 5 5 5 5 5 6 7 8
The most frequent value is 5 with frequency 6. Therefore, $M_o = 5$. Such distributions (sets of data) which have a
unique mode are known as uni-modal.
Sometimes, two or more values may share the highest frequency. In such cases the values with the
highest frequency are jointly the modes of the distribution.
ii) 1 2 3 4 7 7 7 8 8 8 9
The most frequent values are 7 and 8 with frequency 3. Therefore, $M_o$ = 7 and 8. Sets of data with two modes are
known as bi-modal.
Generally, sets of data with two or more modes are known as multi-modal.
iii) 1 1 1 3 3 3 5 5 5 8 8 8
All the values appear with the same frequency. In such cases, we say the set has no mode, i.e. the mode is
non-existent.
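The three cases (one mode, several modes, no mode) can be sketched with a frequency count. The function name `modes` is illustrative; it returns an empty list when every distinct value occurs equally often, following the convention stated above.

```python
from collections import Counter


def modes(data):
    """Values with the highest frequency; empty list if all values are equally frequent."""
    counts = Counter(data)
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []  # no mode
    return sorted(v for v, c in counts.items() if c == top)


print(modes([3, 5, 5, 4, 6, 5, 4, 5, 5, 4, 7, 8, 5]))   # [5]
print(modes([3, 8, 8, 7, 4, 7, 2, 9, 7, 8, 1]))         # [7, 8]
print(modes([1, 3, 5, 8, 3, 8, 5, 1, 1, 3, 8, 5]))      # []
```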
For discrete grouped data, the mode is the value(s) with the highest frequency and is obtained by inspecting the
grouped data.
Example 3.33 Find the mode for the following discrete grouped data.
xi 2 4 3 5 8 7
fi 3 7 1 4 7 6
Solution: By inspecting the data, the highest frequency is 7 and the values with that frequency are 4 and 8;
therefore, $M_o$ = 4 and 8.
The mode of a continuous frequency distribution very often can be approximated by the midpoint of the modal
class, i.e. the class with the greatest frequency density (for classes of equal width, the class with the largest frequency).
Thus for Example 3.30, $M_o = 58$, the midpoint of the 3rd class with frequency 15. This method of locating the
mode is quite satisfactory when the frequency densities in the class immediately before the modal class (the
pre-modal class) and immediately after the modal class (the post-modal class) are approximately equal. When this
condition is not met, more satisfactory results can be obtained by algebraic interpolation with the following
formula, under the following assumptions.
iii. The modal class, the class in which the mode is expected to corresponds with the class
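\[ M_o = \hat{x} = l + \frac{f_p - f_{p-1}}{(f_p - f_{p-1}) + (f_p - f_{p+1})} \times c \]
where $l$ is the lower class boundary of the modal class, $c$ is its width, $f_p$ is its frequency, and $f_{p-1}$ and $f_{p+1}$ are the
frequencies of the pre-modal and post-modal classes respectively.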
Example 3.34 The following distribution was obtained from the age distribution of 228 housewives.
Age (in years)     15-19   20-24   25-29   30-34   35-39   40-44   45-49
Number of women      6      19      50      57      48      27      21
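The interpolation for the mode adds to the lower boundary of the modal class the fraction d1/(d1 + d2) of the class width, where d1 and d2 are the amounts by which the modal frequency exceeds the pre-modal and post-modal frequencies. The sketch below applies this standard form to the age distribution above. The function name `grouped_mode` is illustrative, and the code assumes integer class limits with equal widths, so that each class boundary lies 0.5 below its lower limit.

```python
def grouped_mode(lower_limits, freqs, width):
    """Mode of grouped data by interpolation within the modal class."""
    p = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal class
    f_p = freqs[p]
    f_prev = freqs[p - 1] if p > 0 else 0               # pre-modal frequency
    f_next = freqs[p + 1] if p + 1 < len(freqs) else 0  # post-modal frequency
    lower_boundary = lower_limits[p] - 0.5              # assumption: integer limits
    d1 = f_p - f_prev
    d2 = f_p - f_next
    return lower_boundary + d1 / (d1 + d2) * width


lower_limits = [15, 20, 25, 30, 35, 40, 45]
freqs = [6, 19, 50, 57, 48, 27, 21]

mode = grouped_mode(lower_limits, freqs, width=5)
print(mode)
```

With these data the modal class is 30-34 (frequency 57), so the result lies between 29.5 and 34.5.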
Merits
1. It is easy to calculate and understand in case of ungrouped data. In some cases it can be located merely by
2. It is not at all affected by extreme observations and as such is preferred to arithmetic mean while dealing with
extreme observations.
3. It can be conveniently obtained in the case of open end classes which do not pose any problem here.
Demerits
1. It is not rigidly defined i.e., it may not exist (when no two values in a set are alike or when all values are
equally frequent).
2. It may not be unique when the set of data is multi-modal.
3. It is not based on all observations.
4. It is not suitable for further mathematical treatment.
5. As compared with mean, mode is affected to a great extent by the fluctuations of sampling.