Measures of Central Tendency
Statistics
Sir R.A. Fisher defined statistics as follows: "The science of statistics is essentially a branch
of applied mathematics and may be regarded as mathematics applied to observational data."
A.L. Bowley: i) Statistics is a device for abbreviating and classifying statements and making
clear the relations. ii) Statistics is the science of measurement of social phenomena regarded as a
whole in all its manifestations.
Lovitt: Statistics is a science which deals with collecting, classifying, presenting,
comparing and interpreting numerical data collected to throw light on any sphere of enquiry.
W.A. Wallis and H.V. Roberts: i) Statistics is not a body of substantive knowledge, but a body
of methods for obtaining knowledge.
ii) Statistics is a body of methods for making wise decisions in the face of uncertainty.
Abstract
In any research, a large amount of data is collected, and to describe it meaningfully one needs to
summarize it. The bulk of the data can be reduced by organizing it into a frequency table or
histogram. A frequency distribution organizes the mass of data into a few meaningful
categories. Collected data can also be summarized by a single index or value that represents the
entire data set. Such measures also help in comparing data sets.
Introduction
Central tendency
Classification and the frequency curve give an idea of the shape of a frequency
distribution. In most frequency distributions the class frequencies are not all the same.
The frequency is small at first, increases to a maximum in the middle part of the data,
and then falls again. In other words, the frequency curve is bell-shaped. The observations
are not spread uniformly; most of them cluster in the central part of the data. This
property of the observations is described as central tendency.
Example
Q. The marks obtained by the students are given below. Find the arithmetic mean of the
frequency distribution.
Marks (x)    Frequency (f)    fx
10           5                50
20           10               200
30           12               360
40           21               840
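The table stops before the mean is worked out. As a minimal sketch, assuming the four rows shown are the whole distribution and that the columns are the marks x, the frequency f and their product fx, the mean follows from the usual formula mean = (sum of fx) / (sum of f). The names rows and grouped_mean are illustrative only.

```python
# Rows as they appear in the table: (marks x, frequency f).
# Assumption: these four rows are the complete distribution.
rows = [(10, 5), (20, 10), (30, 12), (40, 21)]


def grouped_mean(rows):
    """Arithmetic mean of a frequency distribution: sum(f * x) / sum(f)."""
    total_f = sum(f for _, f in rows)
    total_fx = sum(x * f for x, f in rows)
    return total_fx / total_f


total_f = sum(f for _, f in rows)        # 5 + 10 + 12 + 21 = 48
total_fx = sum(x * f for x, f in rows)   # 50 + 200 + 360 + 840 = 1450
mean = grouped_mean(rows)

print(total_f, total_fx, round(mean, 2))  # 48 1450 30.21
```

Under these assumptions the mean is 1450 / 48, about 30.21 marks. If the x values are class midpoints rather than exact marks, the same calculation applies and the result is an approximation to the mean of the raw data.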
Demerits of Mean
1) It is strongly affected by extreme values.
2) It often does not coincide with any value in the set of observations.
3) It cannot be calculated for a frequency distribution with open-ended classes.
4) It does not convey any information about the spread or trend of the data.
5) It is not a suitable measure of central value for a highly skewed distribution.
Additional points about the arithmetic mean:
1. Representation of Central Tendency: The arithmetic mean is one of the most common
measures of central tendency. It aims to capture the typical or average value of a dataset.
By summing up all the values and dividing by the total count, it provides a single
representative value.
2. Sensitivity to Outliers: One important characteristic of the arithmetic mean is that it is
sensitive to extreme values or outliers in the dataset. A single outlier can significantly
affect the value of the mean. Therefore, when dealing with datasets that may contain
outliers, it's important to consider alternative measures, such as the median or trimmed
mean.
3. Suitable for Numerical Data: The arithmetic mean is primarily used for datasets
consisting of numerical values. It is not applicable to categorical or ordinal data, as the
calculation relies on the numerical values of the dataset.
4. Limitations with Skewed Distributions: The arithmetic mean may not accurately
represent the typical value in datasets with skewed distributions. Skewness refers to the
asymmetry of the data. In such cases, the median, which represents the middle value
when the data is sorted, can be a more appropriate measure of central tendency.
5. Sample Mean vs. Population Mean: The arithmetic mean can be calculated for both a
sample and a population. When calculating the mean for a sample, it represents the
average value of the observed sample data. The population mean, on the other hand,
represents the average value of an entire population. Depending on the context and
purpose, the sample mean or the population mean may be used.
6. Continuous and Discrete Data: The arithmetic mean can be calculated for both
continuous and discrete data. Continuous data refers to measurements that can take on
any value within a range (e.g., height, weight), while discrete data refers to specific
values (e.g., number of siblings, number of goals scored). The arithmetic mean can
handle both types of data.
7. Additivity Property: The arithmetic mean has an additivity property, which means that if
you have two separate datasets and calculate the mean for each, then the mean of the
combined dataset is equal to the weighted average of the individual means. This property
is useful in various statistical calculations.
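The additivity property in point 7 can be checked numerically. The sketch below uses two small made-up datasets (a and b are illustrative, not taken from the text) and weights each mean by its number of observations, which is the weighting assumed here.

```python
def combined_mean(mean_a, n_a, mean_b, n_b):
    """Mean of two pooled datasets from their means and sizes.

    Each mean is weighted by the number of observations behind it.
    """
    return (n_a * mean_a + n_b * mean_b) / (n_a + n_b)


# Hypothetical datasets, used only for illustration.
a = [2, 4, 6]     # mean 4.0, n = 3
b = [10, 20]      # mean 15.0, n = 2

mean_a = sum(a) / len(a)
mean_b = sum(b) / len(b)

pooled = sum(a + b) / len(a + b)                           # mean of all five values
combined = combined_mean(mean_a, len(a), mean_b, len(b))   # weighted average of the means

print(pooled, combined)  # 8.4 8.4
```

Note that a plain average of the two means, (4 + 15) / 2 = 9.5, would not match the pooled mean unless both datasets had the same size.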
Overall, the arithmetic mean is a widely used and intuitive measure of central tendency. It
provides a useful summary statistic for understanding and analyzing datasets, but it is important
to consider its limitations and potential alternatives depending on the characteristics of the data.
Median
The median is another measure of central tendency, similar to the arithmetic mean. While the
mean represents the average value of a dataset, the median represents the middle value when the
dataset is arranged in ascending or descending order.
To calculate the median, follow these steps:
1. Arrange the dataset in ascending or descending order.
2. If the dataset has an odd number of values, the median is the middle value.
3. If the dataset has an even number of values, the median is the average of the two middle
values.
4. Equivalently, in terms of positions in the ordered set: if the number of observations n is
odd, the median is the ((n + 1)/2)th observation; if n is even, it is the mean of the
(n/2)th and (n/2 + 1)th observations.
For example, let's calculate the median of the following dataset: 3, 5, 7, 12, 15.
Step 1: Arrange the dataset in ascending order: 3, 5, 7, 12, 15.
Step 2: Since the dataset has an odd number of values (5), the median is the middle value, which
is 7.
Therefore, the median of the dataset is 7.
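The steps for the median can be sketched in a few lines of Python. This is one straightforward way to write it, not the only one; the function name median is chosen here for illustration, and the even-sized dataset in the second call is a made-up example.

```python
def median(values):
    """Median by the sort-and-pick-the-middle rule."""
    ordered = sorted(values)          # step 1: arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # odd n: the ((n + 1)/2)th observation, which is index n // 2 (0-based)
        return ordered[mid]
    # even n: mean of the (n/2)th and (n/2 + 1)th observations
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 5, 7, 12, 15]))  # 7
print(median([3, 5, 7, 12]))      # 6.0
```

The first call reproduces the worked example; the second shows the even case, where the two middle values 5 and 7 are averaged.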
The median is particularly useful when dealing with skewed distributions or when the dataset
contains outliers. Unlike the mean, the median is largely unaffected by extreme values, so it
provides a more robust estimate of central tendency.
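A small comparison makes the effect of an outlier concrete. The sketch uses the dataset from the example above and a hypothetical altered copy in which the largest value, 15, is replaced by 150; it relies on the mean and median functions from Python's standard statistics module.

```python
from statistics import mean, median

data = [3, 5, 7, 12, 15]
# Hypothetical variant: the largest value is replaced by an extreme one.
with_outlier = [3, 5, 7, 12, 150]

print(mean(data), median(data))                  # 8.4 7
print(mean(with_outlier), median(with_outlier))  # 35.4 7
```

The single extreme value moves the mean from 8.4 to 35.4, while the median stays at 7 because the middle observation of the ordered data is unchanged.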
Demerits of Median
1) A slight change in the series may bring a drastic change in the median value.
Mode
The mode is another measure of central tendency that represents the most frequently occurring
value(s) in a dataset or the observation with maximum frequency. Unlike the mean and median,
which focus on the average or middle values, the mode identifies the value(s) that appear most
frequently.
Example:
1. The mode of {4, 2, 4, 3, 2, 2} is 2 because it occurs three times, which is more than any other
number.
2. Find the mode of the following observations.
x    10   12   14   16   18
f     2   12   23   16    8
The mode is 14, the value with the highest frequency (23).
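Both mode examples can be reproduced with a short sketch. The helper name modes is illustrative; it returns a list because a dataset can have more than one most frequent value. For the frequency table, the mode is simply the x whose f is largest.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)
    if not counts:
        raise ValueError("mode of an empty dataset is undefined")
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


# Example 1: raw observations.
print(modes([4, 2, 4, 3, 2, 2]))  # [2]

# Example 2: frequency table (x values with their frequencies f).
x = [10, 12, 14, 16, 18]
f = [2, 12, 23, 16, 8]
mode_x = max(zip(x, f), key=lambda pair: pair[1])[0]  # x with the largest f
print(mode_x)  # 14
```

If two values tie for the highest frequency, modes returns both, whereas the max call for the table keeps only the first one it meets.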