Stats
Measure of Central Tendency: When two or more data sets are to be compared, the data must first be
condensed, but condensing a data set into a frequency distribution and presenting it visually is not
enough for comparison. It is then necessary to summarize each data set in a single value. Such a value
usually lies somewhere in the center of the data and represents the entire data set; hence it is called
a measure of central tendency or an average. Since a measure of central tendency (i.e. an average)
indicates the location or the general position of the distribution on the X-axis, it is also known
as a measure of location or position. The common measures are:
1. Arithmetic Mean
2. Geometric Mean
3. Harmonic Mean
4. Mode
5. Median
Arithmetic Mean or Simply Mean: “A value obtained by dividing the sum of all the observations
by the number of observations is called arithmetic mean”
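Step deviation method:
$\bar{x} = A + \frac{\sum u}{n} \times h$ (ungrouped data)
$\bar{x} = A + \frac{\sum fu}{n} \times h$; here $n = \sum f$ (grouped data)
where $u = \frac{x_i - A}{h}$ and h is the common width of the class intervals.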
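As a quick check of the step deviation method, the following Python sketch computes a grouped mean both directly, as $\sum fx / n$, and through $\bar{x} = A + \frac{\sum fu}{n} \times h$ with $u = (x_i - A)/h$. The class midpoints, the frequencies, and the choices A = 25 and h = 10 are made-up illustration values, not data from these notes.

```python
def mean_direct(midpoints, freqs):
    """Grouped mean computed directly as sum(f*x) / sum(f)."""
    return sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)


def mean_step_deviation(midpoints, freqs, A, h):
    """Grouped mean via A + (sum(f*u) / n) * h, where u = (x - A) / h."""
    n = sum(freqs)
    sum_fu = sum(f * (x - A) / h for x, f in zip(midpoints, freqs))
    return A + sum_fu / n * h


# Hypothetical grouped data: class midpoints and their frequencies.
midpoints = [5, 15, 25, 35, 45]
freqs = [2, 5, 8, 4, 1]

direct = mean_direct(midpoints, freqs)
stepped = mean_step_deviation(midpoints, freqs, A=25, h=10)
print(direct, stepped)  # both are 23.5 up to floating-point rounding
```

Any other choice of A gives the same mean; a value near the center of the data only keeps the hand arithmetic small.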
Chapter 03 Measures of Central Tendency
Properties of Arithmetic Mean: The following are the properties of arithmetic mean:
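The sum of squared deviations from the mean is smaller than the sum of squared deviations from
any arbitrary value or provisional mean A, i.e. $\sum (x_i - \bar{x})^2 \le \sum (x_i - A)^2$
Proof: Taking
$\sum (x_i - A)^2 = \sum (x_i - \bar{x} + \bar{x} - A)^2$
$= \sum [(x_i - \bar{x}) + (\bar{x} - A)]^2$
$= \sum [(x_i - \bar{x})^2 + (\bar{x} - A)^2 + 2(x_i - \bar{x})(\bar{x} - A)]$
$= \sum (x_i - \bar{x})^2 + \sum (\bar{x} - A)^2 + 2 \sum (x_i - \bar{x})(\bar{x} - A)$
$= \sum (x_i - \bar{x})^2 + n(\bar{x} - A)^2 + 2(\bar{x} - A) \sum (x_i - \bar{x})$
$= \sum (x_i - \bar{x})^2 + n(\bar{x} - A)^2$, since $\sum (x_i - \bar{x}) = 0$
Because $n(\bar{x} - A)^2 \ge 0$, it follows that $\sum (x_i - A)^2 \ge \sum (x_i - \bar{x})^2$.
Note: If $A = \bar{x}$ then $\sum (x_i - A)^2 = \sum (x_i - \bar{x})^2$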
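The identity reached in the proof, $\sum (x_i - A)^2 = \sum (x_i - \bar{x})^2 + n(\bar{x} - A)^2$, can be checked numerically. The Python sketch below does so for a small made-up data set and a few arbitrary values of A; the data and the helper name `sum_sq_dev` are illustrative assumptions.

```python
def sum_sq_dev(xs, a):
    """Sum of squared deviations of the observations xs from the value a."""
    return sum((x - a) ** 2 for x in xs)


xs = [2, 4, 4, 5, 10]   # hypothetical observations
n = len(xs)
mean = sum(xs) / n

for A in (3, 5, 7.5):
    lhs = sum_sq_dev(xs, A)
    rhs = sum_sq_dev(xs, mean) + n * (mean - A) ** 2
    # lhs and rhs agree, and lhs is never below the sum taken about the mean.
    print(A, lhs, rhs, lhs >= sum_sq_dev(xs, mean))
```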
The arithmetic mean is affected by the change of origin and scale, i.e. when a constant is added to
or subtracted from each value of a variable, or if each value of a variable is multiplied or divided
by a constant, then the arithmetic mean is affected by these changes.

Variable        Mean
$X_i$           $\bar{X}$
$X_i \pm a$     $\bar{X} \pm a$
$a X_i$         $a \bar{X}$
$X_i / a$       $\bar{X} / a$
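The effect of a change of origin and scale on the mean can be seen in a few lines of Python. The data and the constant a = 5 are made-up values chosen only so that the arithmetic is easy to follow.

```python
def mean(xs):
    """Simple arithmetic mean: sum of the observations divided by their number."""
    return sum(xs) / len(xs)


xs = [10, 20, 30, 40]   # hypothetical observations
a = 5                   # hypothetical constant

print(mean(xs))                    # 25.0
print(mean([x + a for x in xs]))   # 30.0, i.e. mean + a
print(mean([x - a for x in xs]))   # 20.0, i.e. mean - a
print(mean([a * x for x in xs]))   # 125.0, i.e. a * mean
print(mean([x / a for x in xs]))   # 5.0, i.e. mean / a
```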
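If k subgroups consist of $n_1, n_2, \ldots, n_k$ observations having respective means
$\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$, then the mean of all the data, or combined mean, is
denoted by $\bar{x}_c$ or $\bar{\bar{x}}$ and is defined by:
$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k}$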
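A short Python sketch of the combined mean follows. It builds two made-up subgroups, combines their sizes and means with the formula above, and compares the result with the mean of the pooled observations; the function name `combined_mean` and the data are assumptions for illustration.

```python
def combined_mean(sizes, means):
    """Combined mean: sum(n_i * mean_i) / sum(n_i) over the subgroups."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


# Hypothetical subgroups of unequal size.
groups = [[1, 2, 3], [10, 20]]
sizes = [len(g) for g in groups]
means = [sum(g) / len(g) for g in groups]

pooled = [x for g in groups for x in g]
print(combined_mean(sizes, means))   # combined mean from sizes and means
print(sum(pooled) / len(pooled))     # same value from the pooled raw data
```

The check against the pooled data shows why the subgroup means must be weighted by their sizes: simply averaging the two means (2 and 15) would give 8.5 rather than 7.2.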
Weighted Arithmetic Mean: Up to now we have discussed the simple, or unweighted, A.M. In calculating
the simple arithmetic mean we assume that all the values of a variable have equal importance. But the
values need not all have the same relative importance. Whenever the mean of values that are not of
equal importance is required, we assign to each value a numerical quantity that expresses its relative
importance. Such numerical quantities are technically called weights. We then replace the formula of
the simple A.M with the formula of the weighted A.M, i.e.
$\bar{X}_w = \frac{\sum wx}{\sum w}$
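The weighted mean $\bar{X}_w = \sum wx / \sum w$ can be sketched in Python as below. The marks and weights are made-up values; with equal weights the function reduces to the simple A.M, which the last line illustrates.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


marks = [70, 80, 90]    # hypothetical values
weights = [1, 2, 3]     # hypothetical relative importance of each value

print(weighted_mean(marks, weights))     # 500 / 6, about 83.33
print(weighted_mean(marks, [1, 1, 1]))   # 80.0, the simple arithmetic mean
```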
Geometric Mean: “The nth root of the product of “n” positive values is called geometric mean”
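$G = \text{Antilog}\left(\frac{\sum \log x}{n}\right)$ (ungrouped data)
$G = \text{Antilog}\left(\frac{\sum f \log x}{n}\right)$; here $n = \sum f$ (grouped data)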
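The logarithmic form of the geometric mean translates directly into Python: take base-10 logs, average them (weighted by frequency if the data are grouped), and take the antilog. The function name and the sample values below are illustrative assumptions.

```python
import math


def geometric_mean(values, freqs=None):
    """G = antilog(sum(f * log x) / n), with n = sum(f); all values must be positive."""
    if freqs is None:
        freqs = [1] * len(values)   # ungrouped data: every value counted once
    n = sum(freqs)
    mean_log = sum(f * math.log10(x) for x, f in zip(values, freqs)) / n
    return 10 ** mean_log           # antilog to base 10


print(geometric_mean([2, 8]))            # about 4, the square root of 2 * 8
print(geometric_mean([1, 10, 100]))      # about 10
print(geometric_mean([2, 8], [3, 1]))    # about 2.83, the 4th root of 2*2*2*8
```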
Harmonic Mean: “The reciprocal of the arithmetic mean of the reciprocals of the values is called
harmonic mean”
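In symbols the definition reads $H = n / \sum (1/x)$. The Python sketch below applies it to a made-up average-speed problem: a journey covered at 40 km/h one way and 60 km/h on the way back over the same distance. The figures are assumptions for illustration only.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals: n / sum(1 / x)."""
    n = len(values)
    return n / sum(1 / x for x in values)


speeds = [40, 60]   # hypothetical speeds over two equal distances

print(harmonic_mean(speeds))        # about 48, the true average speed
print(sum(speeds) / len(speeds))    # 50.0, the arithmetic mean overstates it
```

The value 48 agrees with total distance divided by total time, 2d / (d/40 + d/60), for any distance d.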
Mode in case of Ungrouped Data: “A value that occurs most frequently in a data is called mode”
OR
“If two or more values occur the same number of times, and more frequently than the other values,
then there is more than one mode”
Mode in case of Discrete Grouped Data: “A value which has the largest frequency in a set of data
is called mode”
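Both definitions of the mode can be sketched in Python with a frequency count. The helper names and data are hypothetical. Note one assumption: when every value occurs equally often, these helpers return all the values, whereas by hand such data would usually be described as having no mode.

```python
from collections import Counter


def modes(data):
    """All values of ungrouped data that occur with the highest frequency."""
    counts = Counter(data)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def modes_from_frequencies(values, freqs):
    """Mode(s) of discrete grouped data: the value(s) with the largest frequency."""
    top = max(freqs)
    return [v for v, f in zip(values, freqs) if f == top]


print(modes([3, 5, 5, 7, 8]))                               # [5]
print(modes([1, 2, 2, 3, 3]))                               # [2, 3], two modes
print(modes_from_frequencies([0, 1, 2, 3], [4, 9, 6, 1]))   # [1]
```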
Mode in case of Continuous Grouped Data: In case of continuous grouped data, mode would lie in the
class that carries the highest frequency. This class is called the modal class. The formula used to
compute the value of mode, is given below:
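$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
where l is the lower class boundary of the modal class, $f_m$ is the frequency of the modal class,
$f_1$ and $f_2$ are the frequencies of the classes preceding and following the modal class, and h is
the width of the modal class.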
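A minimal Python sketch of the grouped-mode formula follows. It assumes equal class widths, takes $f_1$ and $f_2$ as the frequencies of the classes just before and just after the modal class, and treats a missing neighbour (modal class at either end) as frequency 0. That last choice, the function name, and the data are assumptions made for illustration.

```python
def grouped_mode(lower_bounds, freqs, h):
    """Mode = l + (fm - f1) / ((fm - f1) + (fm - f2)) * h for the modal class."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])   # index of the modal class
    fm = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0                    # class before the modal class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0       # class after the modal class
    return lower_bounds[i] + (fm - f1) / ((fm - f1) + (fm - f2)) * h


# Hypothetical classes 10-20, 20-30, 30-40 with frequencies 5, 12, 7.
print(grouped_mode([10, 20, 30], [5, 12, 7], h=10))   # 20 + 7/12 * 10, about 25.83
```

The formula is undefined when the modal class and both neighbours share the same frequency, since the denominator is then zero.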
Median: “When the observations are arranged in ascending or descending order, then a value,
that divides a distribution into two equal parts, is called median”
The number of values above the median balances (equals) the number of values below the
median, i.e. 50% of the data fall above the median and 50% fall below it.
$\text{Median} = \text{size of } \left(\frac{n+1}{2}\right)\text{th observation}$
For discrete grouped data, $n = \sum f$.
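In continuous grouped data, when finding the median, we first construct the class
boundaries if the classes are discontinuous. Then we find the cumulative frequencies and
use the following two steps: first locate the median class, i.e. the class containing the
$\frac{n}{2}$th observation, then apply the formula
$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right)$; here $n = \sum f$
where l is the lower class boundary of the median class, h is its width, f is its frequency, and C is
the cumulative frequency of the class preceding it.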
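The two steps for the grouped median, locating the median class from the cumulative frequencies and then applying $l + \frac{h}{f}\left(\frac{n}{2} - C\right)$, can be sketched in Python as follows. Here l is the lower class boundary of the median class, f its frequency, and C the cumulative frequency before it. Equal class widths, the function name, and the data are assumptions for illustration.

```python
def grouped_median(lower_bounds, freqs, h):
    """Median of continuous grouped data with equal class width h."""
    n = sum(freqs)
    target = n / 2
    cum = 0                              # C: cumulative frequency before the current class
    for l, f in zip(lower_bounds, freqs):
        if f > 0 and cum + f >= target:  # first class whose cumulative frequency reaches n/2
            return l + h / f * (target - cum)
        cum += f
    raise ValueError("no observations")


# Hypothetical class boundaries 0-10, 10-20, ..., 40-50.
lower_bounds = [0, 10, 20, 30, 40]
freqs = [2, 5, 8, 4, 1]
print(grouped_median(lower_bounds, freqs, h=10))   # 20 + (10/8) * (10 - 7) = 23.75
```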
Quartiles: “When the observations are arranged in increasing order then the values, that divide
the whole data into four (4) equal parts, are called quartiles”
These values are denoted by Q1, Q2 and Q3. It is to be noted that 25% of the data falls below Q1, 50% of
the data falls below Q2 and 75% of the data falls below Q3.
Deciles: “When the observations are arranged in increasing order then the values, that divide the
whole data into ten (10) equal parts, are called deciles”
These values are denoted by D1, D2,…,D9. It is to be noted that 10% of the data falls below D1, 20% of
the data falls below D2,…, and 90% of the data falls below D9.
Percentiles: “When the observations are arranged in increasing order then the values, that divide
the whole data into hundred (100) equal parts, are called percentiles”
These values are denoted by P1, P2,…,P99. It is to be noted that 1% of the data falls below P1, 2% of the
data falls below P2,…, and 99% of the data falls below P99.
Measure                           Ungrouped data                                        Discrete grouped data (here $n = \sum f$)
Quartiles, j = 1, 2, 3            $Q_j$ = size of $\frac{j(n+1)}{4}$th observation      $Q_j$ = size of $\frac{j(n+1)}{4}$th observation
Deciles, j = 1, 2, ..., 9         $D_j$ = size of $\frac{j(n+1)}{10}$th observation     $D_j$ = size of $\frac{j(n+1)}{10}$th observation
Percentiles, j = 1, 2, ..., 99    $P_j$ = size of $\frac{j(n+1)}{100}$th observation    $P_j$ = size of $\frac{j(n+1)}{100}$th observation
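The position rule "size of the $\frac{j(n+1)}{k}$th observation" (k = 4 for quartiles, 10 for deciles, 100 for percentiles) can be sketched in Python as below. When the position is not a whole number, this sketch interpolates linearly between the two neighbouring ordered observations; that interpolation convention is an assumption, since the notes do not state how fractional positions are handled. The function name and data are also illustrative.

```python
def positional_quantile(data, j, k):
    """Value at the j(n+1)/k-th ordered observation (1-based), with linear interpolation."""
    xs = sorted(data)
    n = len(xs)
    pos = j * (n + 1) / k        # 1-based position in the ordered data
    lo = int(pos)                # whole part of the position
    frac = pos - lo              # fractional part of the position
    if lo < 1:
        return xs[0]             # position before the first observation
    if lo >= n:
        return xs[-1]            # position at or beyond the last observation
    return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])


data = [2, 4, 6, 8, 10, 12, 14]              # hypothetical observations, n = 7
print(positional_quantile(data, 1, 4))       # Q1: 2nd observation, 4.0
print(positional_quantile(data, 2, 4))       # Q2 (median): 4th observation, 8.0
print(positional_quantile(data, 3, 4))       # Q3: 6th observation, 12.0
print(positional_quantile(range(1, 9), 1, 4))   # position 2.25, giving 2.25
```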
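For continuous grouped data:
Quartiles      $Q_j = l + \frac{h}{f}\left(\frac{jn}{4} - C\right)$; here $n = \sum f$
Deciles        $D_j = l + \frac{h}{f}\left(\frac{jn}{10} - C\right)$; here $n = \sum f$
Percentiles    $P_j = l + \frac{h}{f}\left(\frac{jn}{100} - C\right)$; here $n = \sum f$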
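The three grouped formulas differ only in the divisor (4, 10, or 100), so one Python function with a parameter k can cover them all. In this sketch l, f, and C are taken to be the lower class boundary, the frequency, and the preceding cumulative frequency of the class containing the $\frac{jn}{k}$th observation. Equal class widths, the function name, and the data are assumptions for illustration.

```python
def grouped_quantile(lower_bounds, freqs, h, j, k):
    """l + (h / f) * (j*n/k - C) for the class containing the (j*n/k)-th observation."""
    n = sum(freqs)
    target = j * n / k
    cum = 0                              # C: cumulative frequency before the current class
    for l, f in zip(lower_bounds, freqs):
        if f > 0 and cum + f >= target:  # first class whose cumulative frequency reaches the target
            return l + h / f * (target - cum)
        cum += f
    raise ValueError("no observations")


# Hypothetical class boundaries 0-10, 10-20, ..., 40-50.
lower_bounds = [0, 10, 20, 30, 40]
freqs = [2, 5, 8, 4, 1]

print(grouped_quantile(lower_bounds, freqs, 10, j=1, k=4))     # Q1 = 16.0
print(grouped_quantile(lower_bounds, freqs, 10, j=3, k=4))     # Q3 = 30.0
print(grouped_quantile(lower_bounds, freqs, 10, j=1, k=10))    # D1 = 10.0
print(grouped_quantile(lower_bounds, freqs, 10, j=50, k=100))  # P50 = 23.75, the median
```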
The main object (purpose) of the average is to give a bird’s eye view (summary) of the statistical
data. The average removes all the unnecessary details of the data and gives a concise (to the
point or short) picture of the huge data under investigation.
Average is also of great use for the purpose of comparison (i.e. the comparison of two or more
groups in which the units of the variables are same) and for the further analysis of the data.
Averages are very useful for computing various other statistical measures such as dispersion,
skewness, kurtosis etc.
Requisites (desirable qualities) of a Good Average: An average will be considered as good if:
It is mathematically defined.
It utilizes all the values given in the data.
It is not much affected by the extreme values.
It can be calculated in almost all cases.
It can be used in further statistical analysis of the data.
It does not give misleading results.
A.M is an appropriate average for all the situations where there are no extreme values in the
data.
G.M is an appropriate average for calculating average percent increase in sales, population,
production, etc. It is one of the best averages for the construction of index numbers.
H.M is an appropriate average for calculating the average rate of increase of profits of a firm or
finding average speed of a journey or the average price at which articles are sold.
Mode is an appropriate average in case of qualitative data, e.g. when someone speaks of the
opinion of an average person, he is probably referring to the most frequently expressed opinion,
which is the modal opinion.
Median is an appropriate average in a highly skewed distribution e.g. in the distribution of
wages, incomes etc.
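The remark about skewed distributions can be illustrated with Python's standard `statistics` module. The income figures below are made up; one extreme value pulls the arithmetic mean far above most of the data, while the median stays near the typical values.

```python
import statistics

# Hypothetical incomes with one extreme value.
incomes = [20, 22, 25, 27, 30, 500]

print(statistics.mean(incomes))     # 104, pulled upward by the extreme value
print(statistics.median(incomes))   # 26.0, close to the typical incomes
```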
Q: What is a measure of location? What is the purpose served by it? What are its desirable
qualities?
Measure of location: A central value that represents the whole data is called an average. Since
an average is a value usually lying somewhere in the center and representing the entire data set,
it is called a measure of central tendency. A measure of central tendency indicates the location or
the general position of the data on the X-axis; therefore it is also known as a measure of location
or position.
Purpose:
It removes all the unnecessary details of the data and gives a concise picture of the huge
data.
It is used for the purpose of comparison.
It is very useful in computing other statistical measures such as dispersion, skewness and
kurtosis etc.
Desirable qualities:
It is mathematically defined.
It utilizes all the observations given in a data.
It is not much affected by the extreme values.
It is capable of further algebraic treatment.
It is not affected by fluctuations of sampling.