Chapter 2
SUMMARY STATISTICS
2.1 Introduction
In this chapter, we discuss measures of average (central tendency) and measures of
dispersion (spread). These measures are important for statistical reporting and analysis.
They are descriptive and involve no statistical inference, but the information they
provide is important for decision making.
\[ \text{G.M.} = \sqrt[n]{x_1 x_2 \cdots x_n}. \]
It is not a commonly used measure of average, but it is still applicable in the physical sciences.
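As a minimal sketch of the geometric mean formula, the function below takes the n-th root of the product of n positive values. The name `geometric_mean` and the two sample values are illustrative assumptions, not data from this chapter.

```python
import math

def geometric_mean(values):
    """Return the n-th root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    # Averaging the logarithms is equivalent to taking the n-th root of the
    # product, and it avoids overflow when the product is very large.
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical values chosen for illustration only.
sample_values = [2.0, 8.0]
gm = geometric_mean(sample_values)
print(round(gm, 6))
```

For the two values 2 and 8 the product is 16, so the geometric mean is the square root of 16.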
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
2.2.1.2 Median
Another measure of central tendency is the median. It is defined as the middle value
of the data set when the data are arranged in order (ascending or descending). The
median is a suitable measure of average for data with extreme values. It also gives a
general overview of a large mass of data for which computing the arithmetic mean
might be tedious. It can be denoted by $\tilde{x}$.
For an odd number of observations $n$, the median is the $\left(\frac{n+1}{2}\right)$th
observation. However, if there is an even number of observations, it is the average of
the $\left(\frac{n}{2}\right)$th and $\left(\frac{n+2}{2}\right)$th observations.
2.2.1.3 Mode
This is the value (or values) with the highest frequency in the data set. It can be used
to determine the most favourable outcome of an experiment and to help decide what
measures to take in response to that outcome. It is commonly denoted by $\hat{x}$.
Example 2.1
Compute the mean, median and mode for the following data.
3, 4, 6, 8, 3, 5, 9, 11, 7, 10
Solution
We first arrange the data in ascending order as follows:
3, 3, 4, 5, 6, 7, 8, 9, 10, 11
Then, the arithmetic mean is given by
\[ \bar{x} = \frac{\sum x}{n} = \frac{3+3+4+5+6+7+8+9+10+11}{10} = \frac{66}{10} = 6.6. \]
The mode is the value with the highest frequency. In this case the highest frequency is 2,
which belongs to the value 3. Therefore, the mode is 3.
There is an even number of observations, n = 10. In this case, the median is the average
of the fifth and sixth observations, which are 6 and 7. Hence,
\[ \text{Median} = \frac{6+7}{2} = 6.5. \]
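The three measures in Example 2.1 can be checked with a short Python sketch. The function names are illustrative; the median follows the position rule stated for odd and even n, and the mode function returns every value that shares the highest frequency.

```python
from collections import Counter

def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the ordered data, using 1-based positions."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the (n + 1)/2-th observation.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the n/2-th and (n + 2)/2-th observations.
    return (ordered[n // 2 - 1] + ordered[(n + 2) // 2 - 1]) / 2

def modes(values):
    """All values that occur with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

# Data from Example 2.1.
data = [3, 4, 6, 8, 3, 5, 9, 11, 7, 10]
print(mean(data), median(data), modes(data))
```

Returning a list from `modes` is a design choice for this sketch, since a data set may have more than one mode.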
\[ \bar{x} = \frac{\sum_{i} f_i x_i}{\sum_{i} f_i}, \]
Example 2.2
The heights (in inches) of 100 male students at ABC College were recorded as follows:

Height (inches)   60 – 62   63 – 65   66 – 68   69 – 71   72 – 74
Frequency            5        18        42        27         8
Solution
(a) For the arithmetic mean we summarize the summations in the following table.
(a) Mean $= \dfrac{\sum f_i x_i}{\sum f_i} = \dfrac{6745}{100} = 67.45$
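The summations behind the grouped mean can be reproduced with the sketch below. It assumes, as is usual for grouped data, that each class is represented by its class mark (the midpoint of the class limits); the variable names are illustrative.

```python
# Class limits and frequencies from Example 2.2.
classes = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]
freqs = [5, 18, 42, 27, 8]

# Assumption: each class is represented by its midpoint (class mark).
marks = [(lo + hi) / 2 for lo, hi in classes]

sum_f = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, marks))
grouped_mean = sum_fx / sum_f

# Print the working table row by row, then the totals and the mean.
for (lo, hi), f, x in zip(classes, freqs, marks):
    print(f"{lo}-{hi}  f={f:3d}  x={x:5.1f}  fx={f * x:7.1f}")
print(sum_f, sum_fx, grouped_mean)
```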
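(b) The median is the value contained in the class 66 – 68. From this class we have
$L = 65.5$, $f_m = 42$, $C_b = 23$, $h = 3$.
Then,
\[ \text{Median} = L + \left(\frac{\frac{N}{2} - C_b}{f_m}\right) h = 65.5 + \left(\frac{50 - 23}{42}\right)(3) = 67.43. \]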
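The step of locating the median class and its cumulative frequency can be sketched as follows. This is an illustration under stated assumptions: classes have equal width and integer limits, so each lower class boundary lies 0.5 below the lower class limit. The function name `grouped_median` is hypothetical.

```python
def grouped_median(lower_limits, freqs, width):
    """Interpolated median for grouped data.

    Assumes equal-width classes with integer limits, so the lower class
    boundary is the lower class limit minus 0.5.
    """
    half = sum(freqs) / 2              # N / 2
    cum_before = 0                     # cumulative frequency before the class
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cum_before + f >= half:
            boundary = lower - 0.5     # L, the lower class boundary
            return boundary + (half - cum_before) / f * width
        cum_before += f
    raise ValueError("frequencies must contain at least one positive count")

# Heights data from Example 2.2.
lower_limits = [60, 63, 66, 69, 72]
freqs = [5, 18, 42, 27, 8]
median_height = grouped_median(lower_limits, freqs, width=3)
print(round(median_height, 2))
```

The loop stops at the first class whose cumulative frequency reaches N/2, which is how the values L = 65.5 and C_b = 23 arise for this data.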
(c) The class with the highest frequency is 66 – 68 and thus it is the modal class.
From this class, we find that
$L = 65.5$, $h = 3$, $f_m = 42$, $f_b = 18$, $f_a = 27$,
implying that $\Delta_1 = f_m - f_b = 42 - 18 = 24$ and $\Delta_2 = f_m - f_a = 42 - 27 = 15$.
Therefore,
\[ \text{Mode} = L + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) h = 65.5 + \frac{24}{39}(3) = 65.5 + 1.846 = 67.35. \]
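The modal-class calculation can be sketched in the same way. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as having frequency 0 (when the modal class is the first or last class) is an assumption of this sketch, not something stated in the text.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Mode of grouped data by interpolation inside the modal class."""
    m = max(range(len(freqs)), key=lambda i: freqs[i])   # modal class index
    f_m = freqs[m]
    # Assumption: a missing neighbour class counts as frequency 0.
    f_b = freqs[m - 1] if m > 0 else 0
    f_a = freqs[m + 1] if m + 1 < len(freqs) else 0
    d1 = f_m - f_b          # difference with the class before
    d2 = f_m - f_a          # difference with the class after
    if d1 + d2 == 0:
        raise ValueError("modal class is not distinct from its neighbours")
    return lower_boundaries[m] + d1 / (d1 + d2) * width

# Heights data from Example 2.2, using lower class boundaries.
lower_boundaries = [59.5, 62.5, 65.5, 68.5, 71.5]
freqs = [5, 18, 42, 27, 8]
mode_height = grouped_mode(lower_boundaries, freqs, width=3)
print(round(mode_height, 2))
```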
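\[ \text{MAD} = \frac{\sum |X_i - \bar{X}|}{N}, \]
where $\bar{X}$ is the population mean and $N$ is the population size.
Similarly, for a sample we have
\[ \text{MAD} = \frac{\sum |x_i - \bar{x}|}{n}, \]
where $n$ is the sample size.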
\[ \sigma^2 = \frac{\sum (X_i - \bar{X})^2}{N}. \]
The sample variance, denoted by $s^2$, is given by the formula
\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}. \]
The standard deviation is defined as the square root of the variance. It is denoted by
$\sigma$ for a population and $s$ for a sample. It is also denoted in general by $SD(X)$.
From the above formulas we have
\[ \sigma = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N}} \quad \text{and} \quad s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}. \]
Example 2.3
Compute the mean absolute deviation, the sample standard deviation and the quartile
deviation of the following data: 10, 12, 8, 16, 8, 20, 21, 17
Solution
We first compute the arithmetic mean as
\[ \bar{x} = \frac{10+12+8+16+8+20+21+17}{8} = \frac{112}{8} = 14. \]
Then, the data are arranged in order and the various calculations are summarized in the
table below.

$x_i$     $x_i - \bar{x}$     $|x_i - \bar{x}|$     $(x_i - \bar{x})^2$
  8            -6                   6                     36
  8            -6                   6                     36
 10            -4                   4                     16
 12            -2                   2                      4
 16             2                   2                      4
 17             3                   3                      9
 20             6                   6                     36
 21             7                   7                     49
Total                              36                    190
Then, the mean absolute deviation is given by
\[ \text{MAD} = \frac{\sum |x_i - \bar{x}|}{n} = \frac{36}{8} = 4.5. \]
The sample standard deviation is
\[ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{190}{7}} = 5.21. \]
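The deviation columns and the two results of Example 2.3 can be reproduced with the sketch below. It uses the eight values that appear in the worked solution (including 17); the variable names are illustrative.

```python
import math

# The eight observations used in the worked solution of Example 2.3.
data = [10, 12, 8, 16, 8, 20, 21, 17]
n = len(data)
x_bar = sum(data) / n

# Columns of the working table: absolute and squared deviations.
abs_devs = [abs(x - x_bar) for x in data]
sq_devs = [(x - x_bar) ** 2 for x in data]

mad = sum(abs_devs) / n                    # mean absolute deviation
s = math.sqrt(sum(sq_devs) / (n - 1))      # sample standard deviation

print(x_bar, sum(abs_devs), sum(sq_devs))
print(mad, round(s, 2))
```

Note the different divisors: the mean absolute deviation divides by n, while the sample variance divides by n - 1.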
We can also obtain the sample variance without prior computation of the arithmetic
mean, $\bar{x}$. This alternative formula is given by
\[ s^2 = \frac{1}{n-1}\left[\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right]. \]
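As a quick check that the alternative formula agrees with the definition, the sketch below computes the sample variance both ways for the eight values used in the worked solution of Example 2.3. Variable names are illustrative.

```python
# The eight observations used in the worked solution of Example 2.3.
data = [8, 8, 10, 12, 16, 17, 20, 21]
n = len(data)

# Alternative formula: only the sum and the sum of squares are needed.
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)
shortcut_var = (sum_x2 - sum_x ** 2 / n) / (n - 1)

# Definition: squared deviations from the mean.
x_bar = sum_x / n
definition_var = sum((x - x_bar) ** 2 for x in data) / (n - 1)

print(sum_x, sum_x2, shortcut_var, definition_var)
```

Both routes give 190/7 here. In floating-point work with very large values the alternative formula can lose precision, because it subtracts two nearly equal numbers, so the definition is often the safer choice in software.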
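\[
\begin{aligned}
Q_3 &= \tfrac{3}{4}(n+1)\text{th observation} = \tfrac{27}{4}\text{th observation} \\
    &= 6.75\text{th observation} \\
    &= 6\text{th} + 0.75\,(7\text{th} - 6\text{th}) \\
    &= 17 + 0.75\,(20 - 17) = 17 + 2.25 = 19.25
\end{aligned}
\]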
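The positional rule used for the upper quartile can be sketched as a small function. The name `quantile_by_position` is hypothetical, and clamping to the smallest or largest observation when the position falls outside 1..n is an assumption of this sketch.

```python
import math

def quantile_by_position(values, fraction):
    """Value at the fraction*(n + 1)-th ordered observation.

    A fractional position is resolved by linear interpolation between the
    two neighbouring ordered observations.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = fraction * (n + 1)      # 1-based, may be fractional
    lower = math.floor(position)
    weight = position - lower
    # Assumption: clamp to the extremes if the position is outside 1..n.
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + weight * (ordered[lower] - ordered[lower - 1])

# The eight observations used in the worked solution of Example 2.3.
data = [10, 12, 8, 16, 8, 20, 21, 17]
q3 = quantile_by_position(data, 0.75)
print(q3)
```

Other software may use a different positional convention for quartiles, so results can differ slightly from this rule.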
\[ \bar{x} = \frac{\sum_i f_i x_i}{n}, \quad \text{where } n = \sum_i f_i. \]
Then, the mean absolute deviation for the sample is given by
\[ \text{MAD} = \frac{\sum f_i |x_i - \bar{x}|}{n}. \]
\[ s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n-1}. \]
Alternatively we use
\[ s^2 = \frac{1}{n-1}\left[\sum f_i x_i^2 - \frac{\left(\sum f_i x_i\right)^2}{n}\right]. \]
\[ Q_3 = L + \left(\frac{\frac{3}{4}N - C_b}{f}\right) h, \]
where
L = lower boundary of the class which contains the upper quartile
$C_b$ = cumulative frequency before the class which contains the upper quartile
h = the class size
f = the class frequency
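The upper-quartile formula for grouped data can be sketched as below. For illustration it is applied to the heights data of Example 2.2, for which the text does not print an upper quartile; the function name `grouped_upper_quartile` is hypothetical.

```python
def grouped_upper_quartile(lower_boundaries, freqs, width):
    """Q3 for grouped data: L + ((3N/4 - C_b) / f) * h."""
    total = sum(freqs)                 # N
    target = 3 * total / 4             # 3N / 4
    cum_before = 0                     # C_b
    for boundary, f in zip(lower_boundaries, freqs):
        # The quartile class is the first class whose cumulative
        # frequency reaches 3N/4.
        if f > 0 and cum_before + f >= target:
            return boundary + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("no class reaches three quarters of the total frequency")

# Heights data from Example 2.2, using lower class boundaries.
lower_boundaries = [59.5, 62.5, 65.5, 68.5, 71.5]
freqs = [5, 18, 42, 27, 8]
q3_height = grouped_upper_quartile(lower_boundaries, freqs, width=3)
print(round(q3_height, 2))
```

Here 3N/4 = 75 falls in the class 69 – 71, whose lower boundary is 68.5 and whose preceding cumulative frequency is 65.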
Example 2.4
Compute the sample standard deviation and the quartile deviations for the data in the
following table.

Class   15 – 19   20 – 24   25 – 29   30 – 34   35 – 39   40 – 44   45 – 49
Freq       4         5         8         5         8         6         4
Solution
Calculations are summarized in the table below.

Class      Freq ($f_i$)   Class mark ($x_i$)   $f_i x_i$   $f_i x_i^2$
15 – 19         4                17                68          1156
20 – 24         5                22               110          2420
25 – 29         8                27               216          5832
30 – 34         5                32               160          5120
35 – 39         8                37               296         10952
40 – 44         6                42               252         10584
45 – 49         4                47               188          8836
TOTAL          40                                1290         44900
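The column totals can be checked, and then fed into the alternative variance formula for grouped data, with the sketch below. It is an illustration rather than the printed solution; variable names are assumptions.

```python
import math

# Class marks and frequencies from the table for Example 2.4.
marks = [17, 22, 27, 32, 37, 42, 47]
freqs = [4, 5, 8, 5, 8, 6, 4]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, marks))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, marks))

# Alternative formula: s^2 = [sum(f x^2) - (sum(f x))^2 / n] / (n - 1).
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
s = math.sqrt(variance)

print(n, sum_fx, sum_fx2)
print(round(variance, 4), round(s, 3))
```

With these totals the bracketed term is 44900 - 1290²/40 = 3297.5, which is then divided by n - 1 = 39.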