
Measures of Central Tendency

The document discusses various measures of central tendency including the mean, median, and mode. It provides definitions and formulas for calculating the mean, or arithmetic average. The mean is calculated by adding all values in a data set and dividing by the total number of values. It is affected by outliers but is widely used. The median represents the middle value of a sorted data set and is not impacted by outliers. The document also discusses situations where different measures may be more appropriate depending on the type and distribution of the data.

Uploaded by

Pranjal Kulkarni

Literature Review

Measures of Central Tendency – Mean, Median and Mode

Mrs. Pranjal Tarte


P.V.G.’S College of Science and Commerce, Pune-09

Statistics
Sir R.A. Fisher defined statistics as follows: "The science of statistics is essentially a branch of applied
mathematics and may be regarded as mathematics applied to observational data."
A.L. Bowley: i) Statistics is a device for abbreviating and classifying statements and making
clear the relations. ii) Statistics is the science of the measurement of social phenomena regarded as a
whole in all their manifestations.
Lovitt: Statistics is a science which deals with collecting, classifying, presenting,
comparing and interpreting numerical data collected to throw light on any sphere of enquiry.
W.A. Wallis and H.V. Roberts: i) Statistics is not a body of substantive knowledge, but a body
of methods for obtaining knowledge.
ii) Statistics is a body of methods for making wise decisions in the face of uncertainty.

Abstract
In any research, a large volume of data is collected and, to describe it meaningfully, one needs to
summarize it. The bulk of the data can be reduced by organizing it into a frequency
table or histogram. A frequency distribution organizes the heap of data into a few meaningful
categories. Collected data can also be summarized as a single index or value that represents the
entire data. Such measures also help in the comparison of datasets.

Keywords: Mean, median, mode.

Introduction
Central tendency
Classification and the frequency curve give an idea of the shape of a frequency
distribution. In most frequency distributions the class frequencies are not all
the same. The frequency is small at first, increases to a maximum in the middle part
of the data, and then falls again. In other words, the frequency curve is bell-shaped.
The observations are therefore not uniformly spread; most of them cluster in the
central part of the data. This property of the observations is described as central tendency.

Measures of Central Tendency


Measures of central tendency are numerical values used to represent the middle or
central value of a large collection of numerical data. Each such statistical measure identifies a single
value as representative of the collected data. These values are called
central or average values in Statistics. Such a value is of great significance because it depicts
the nature or characteristics of the entire data, which are otherwise very difficult to observe.
Some of the most commonly used measures of central tendency are:
● Mean or Arithmetic mean
● Median
● Mode

Mean or Arithmetic mean (X̄ )


It is the most commonly used measure of central tendency and a widely applicable average.
The arithmetic mean, also known as the average, is a statistical measure that represents the
central tendency of a set of numbers. It is calculated by summing up all the values in a dataset
and dividing the sum by the total number of values.
To calculate the arithmetic mean, follow these steps:
1. Add up all the numbers in the dataset.
2. Count the total number of values in the dataset.
3. Divide the sum by the total number of values.
Mathematically, the arithmetic mean can be represented as:
Arithmetic Mean = (Sum of all values) / (Total number of values)
For example, let's calculate the arithmetic mean of the following dataset: 5, 7, 9, 12, and 15.
Step 1: Add up all the numbers: 5 + 7 + 9 + 12 + 15 = 48
Step 2: Count the total number of values: There are 5 values in the dataset.
Step 3: Divide the sum by the total number of values: 48 / 5 = 9.6
Therefore, the arithmetic mean of the dataset is 9.6.
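The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch; the function name arithmetic_mean is an assumption made for this example and is not part of the original text.

```python
def arithmetic_mean(values):
    """Sum all values and divide by how many there are (steps 1 to 3)."""
    if not values:
        # The mean of an empty dataset is undefined.
        raise ValueError("cannot take the mean of an empty dataset")
    total = sum(values)   # Step 1: add up all the numbers
    count = len(values)   # Step 2: count the values
    return total / count  # Step 3: divide the sum by the count


data = [5, 7, 9, 12, 15]      # dataset from the worked example
print(arithmetic_mean(data))  # 9.6
```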

Example
Q. The marks obtained by students are given below. Find the arithmetic mean of the
frequency distribution.

Marks obtained (Xi)    Frequency (Fi)    FiXi
10                     5                 50
20                     10                200
30                     12                360
40                     21                840
Total                  48                1450

X̄ = ∑FiXi / ∑Fi, i = 1, 2, …, n

= 1450/48
≈ 30.21
The arithmetic mean is approximately 30.21.
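The frequency-table calculation can be checked with a short sketch. The function name frequency_mean is hypothetical, chosen for this example only; the marks and frequencies are the ones from the table above. Note that 1450/48 = 30.2083..., which is 30.21 when rounded to two decimal places.

```python
def frequency_mean(values, frequencies):
    """Mean of a frequency distribution: sum(Fi * Xi) / sum(Fi)."""
    if len(values) != len(frequencies):
        raise ValueError("values and frequencies must have the same length")
    total_frequency = sum(frequencies)
    if total_frequency == 0:
        raise ValueError("total frequency must be positive")
    weighted_sum = sum(x * f for x, f in zip(values, frequencies))
    return weighted_sum / total_frequency


marks = [10, 20, 30, 40]        # Xi
frequencies = [5, 10, 12, 21]   # Fi
print(round(frequency_mean(marks, frequencies), 2))  # 30.21
```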

Merits and Demerits of arithmetic mean


Merits of mean
1) It is rigidly defined by a formula, so every analyst obtains the same value from the same data.
2) It is simple to understand and easy to calculate.
3) It is based on all the observations in the dataset.
4) It is amenable to further algebraic treatment; for example, the mean of a combined
dataset can be obtained from the means and sizes of its parts.

Demerits of mean
1) It is unduly affected by extreme observations, so it can give a misleading picture when
there are large gaps between values.
2) It often does not correspond to any actual observation in the data.
3) It cannot be calculated for a frequency distribution with open-end classes.
4) It is applicable only to quantitative data.
5) It cannot be determined graphically.
6) It does not convey any information about the spread or trend of the data.
7) It is not a suitable measure of central value for a highly skewed distribution.

Real life situations of arithmetic mean


1. Exam Grades: Suppose a teacher wants to determine the average score of a class on a
math exam. They would add up the scores of all the students and divide the sum by the
total number of students to find the arithmetic mean.
2. Temperature: Meteorologists use the arithmetic mean to calculate average temperatures.
They sum up the recorded temperatures for a specific period, such as a month, and divide
the sum by the number of days in that period to obtain the average temperature.
3. Financial Analysis: When assessing investment returns, the arithmetic mean is often
used. For example, an investor might calculate the average annual return of a portfolio
over several years to evaluate its performance.
4. Sports Statistics: In sports, the arithmetic mean is used to determine averages for various
statistics. For instance, the average points scored per game by a basketball player or the
batting average of a baseball player can be calculated using the arithmetic mean.
5. Population Studies: In social sciences, researchers often use the arithmetic mean to
describe characteristics of a population. For instance, the average income of a particular
region or the average age of a group of people can be determined using the arithmetic
mean.
6. Quality Control: In manufacturing, the arithmetic mean is used to monitor and control
product quality. Measurements of product dimensions or weights are taken, and the
average is calculated to ensure it falls within acceptable limits.

Some additional points about the arithmetic mean:
1. Representation of Central Tendency: The arithmetic mean is one of the most common
measures of central tendency. It aims to capture the typical or average value of a dataset.
By summing up all the values and dividing by the total count, it provides a single
representative value.
2. Sensitivity to Outliers: One important characteristic of the arithmetic mean is that it is
sensitive to extreme values or outliers in the dataset. A single outlier can significantly
affect the value of the mean. Therefore, when dealing with datasets that may contain
outliers, it's important to consider alternative measures, such as the median or trimmed
mean.
3. Suitable for Numerical Data: The arithmetic mean is primarily used for datasets
consisting of numerical values. It is not applicable to categorical or ordinal data, as the
calculation relies on the numerical values of the dataset.
4. Limitations with Skewed Distributions: The arithmetic mean may not accurately
represent the typical value in datasets with skewed distributions. Skewness refers to the
asymmetry of the data. In such cases, the median, which represents the middle value
when the data is sorted, can be a more appropriate measure of central tendency.
5. Sample Mean vs. Population Mean: The arithmetic mean can be calculated for both a
sample and a population. When calculating the mean for a sample, it represents the
average value of the observed sample data. The population mean, on the other hand,
represents the average value of an entire population. Depending on the context and
purpose, the sample mean or the population mean may be used.
6. Continuous and Discrete Data: The arithmetic mean can be calculated for both
continuous and discrete data. Continuous data refers to measurements that can take on
any value within a range (e.g., height, weight), while discrete data refers to specific
values (e.g., number of siblings, number of goals scored). The arithmetic mean can
handle both types of data.
7. Additivity Property: The arithmetic mean has an additivity property: if
you have two separate datasets and calculate the mean of each, the mean of the
combined dataset equals the weighted average of the individual means, with each mean
weighted by the number of observations in its dataset. This property
is useful in various statistical calculations.
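The additivity property in point 7 can be illustrated with a small sketch. The function name combined_mean and the way the dataset is split into two groups are assumptions made for this example; the weights are the group sizes.

```python
def combined_mean(mean_a, n_a, mean_b, n_b):
    """Mean of two datasets pooled together, from their means and sizes."""
    if n_a + n_b == 0:
        raise ValueError("at least one dataset must be non-empty")
    return (n_a * mean_a + n_b * mean_b) / (n_a + n_b)


group_a = [5, 7, 9]   # hypothetical first part, mean 7.0
group_b = [12, 15]    # hypothetical second part, mean 13.5

mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)

pooled = combined_mean(mean_a, len(group_a), mean_b, len(group_b))
direct = sum(group_a + group_b) / len(group_a + group_b)

print(pooled, direct)  # both are 9.6
```

The pooled value agrees with the mean computed directly on the five values, whereas a plain average of the two group means (10.25) would not.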

Overall, the arithmetic mean is a widely used and intuitive measure of central tendency. It
provides a useful summary statistic for understanding and analyzing datasets, but it is important
to consider its limitations and potential alternatives depending on the characteristics of the data.

Median
The median is another measure of central tendency, similar to the arithmetic mean. While the
mean represents the average value of a dataset, the median represents the middle value when the
dataset is arranged in ascending or descending order.
To calculate the median, follow these steps:
1. Arrange the dataset in ascending or descending order.
2. If the dataset has an odd number of values, the median is the middle value.
3. If the dataset has an even number of values, the median is the average of the two middle
values.
4. In positional terms: if the number of observations n is odd, the median is the
((n + 1)/2)th observation in the ordered set. If n is even, the median is the mean of the
(n/2)th and (n/2 + 1)th observations.

For example, let's calculate the median of the following dataset: 3, 5, 7, 12, 15.
Step 1: Arrange the dataset in ascending order: 3, 5, 7, 12, 15.
Step 2: Since the dataset has an odd number of values (5), the median is the middle value, which
is 7.
Therefore, the median of the dataset is 7.
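The steps for finding the median can be sketched as follows. The function name median_value is an assumption for this example. Positions in the text are counted from 1, while Python lists are indexed from 0, so the ((n + 1)/2)th observation is at index n // 2 when n is odd.

```python
def median_value(values):
    """Middle value of the sorted data, or mean of the two middle values."""
    if not values:
        raise ValueError("cannot take the median of an empty dataset")
    ordered = sorted(values)  # Step 1: arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1)/2)th observation, which is index n // 2.
        return ordered[mid]
    # Even n: mean of the (n/2)th and (n/2 + 1)th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_value([3, 5, 7, 12, 15]))  # 7
print(median_value([2, 4, 6, 8]))       # 5.0
```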
The median is particularly useful when dealing with skewed distributions or when the dataset
contains outliers. Unlike the mean, the median is largely unaffected by extreme values, because it
depends only on the order of the observations and not on their magnitude. It therefore provides a
more robust estimate of the central tendency.

Merits and demerits of median


Merits:
1) The median is not influenced by extreme values because it is a positional average.
2) The median can be calculated for distributions with open-end intervals.
3) The median can be located even if the data are incomplete.
4) The median can be located even for qualitative factors such as ability or honesty,
provided the observations can be ranked.
5) The median can be located visually in the case of a discrete series.

Demerits:
1) A slight change in the series may bring a drastic change in the median value.

Real life situations of median


1. Income Distribution: The median is frequently used to analyze income distribution within
a population. It provides a measure of the middle income value, indicating the income
level that separates the higher-earning and lower-earning individuals or households. The
median income is often used as an indicator of the overall economic well-being and
inequality within a society.
2. Housing Prices: When analyzing housing prices in a particular area, the median price is
commonly used to determine the typical or central price point. It helps understand the
price range that most buyers or sellers fall within, providing insights into the affordability
of housing options.
3. Test Scores: In educational settings, the median is often used to interpret test scores. By
calculating the median score, educators and researchers can identify the middle
performance level in a group of students. This information helps gauge the overall
academic performance, evaluate the effectiveness of teaching methods, or compare
different schools or districts.
4. Health Data: The median is frequently employed in healthcare and medical research to
analyze various health-related measurements. For instance, it can be used to determine
the median age of patients diagnosed with a specific disease, the median duration of a
certain medical condition, or the median response time for a particular treatment.
5. Travel Time: When estimating travel time or commute duration, the median is often used
to represent the typical or average time it takes to reach a destination. By considering the
median travel time, transportation planners and commuters can gain a better
understanding of the expected duration and plan accordingly.
6. Population Age: The median age is commonly used to describe the age distribution
within a population. It represents the age at which half of the population is younger and
the other half is older. The median age is useful for studying demographic trends, making
policy decisions related to healthcare, retirement, and social services, and understanding
the overall age structure of a society.
7. Waiting Times: In various service industries, such as restaurants, hospitals, or customer
support centers, the median waiting time is often used to evaluate the efficiency of
service delivery. By estimating the median wait time, businesses can assess customer
satisfaction, identify potential bottlenecks, and make improvements in their operations.

Here are a few key points about the median:

1. Robust to Outliers: The median is far less sensitive to outliers than the arithmetic
mean. Even if the dataset contains a few extreme values, the median changes little or not at all.
2. Suitable for Skewed Distributions: Unlike the mean, the median is a suitable measure of
central tendency for datasets with skewed distributions. It provides a better representation
of the typical value when the data is not symmetrically distributed.
3. Applicable to Ordinal Data: In addition to numerical data, the median can also be
calculated for ordinal data. Ordinal data represents categories with a specific order but
does not necessarily have fixed numerical values. The median provides a meaningful
measure of central tendency for such data.
4. Easy to Understand: The median is a relatively easy concept to understand and interpret.
It represents the middle value in the dataset, which can be useful for understanding the
distribution and characteristics of the data.

It's important to note that the choice between using the mean or the median depends on the
nature of the data and the research question at hand. Both measures have their own strengths and
limitations, and the appropriate measure should be selected based on the specific context and
goals of the analysis.

Some additional points about the median:


1. Handling Skewed Distributions: The median is particularly useful when dealing with
datasets that have skewed distributions. Skewness refers to the asymmetry of the data. In
such cases, the mean may be influenced by extreme values, while the median provides a
more robust estimate of the central tendency. For example, in a dataset with a long tail on
one side, the median will be less affected by the extreme values in the tail.
2. Unequal Distribution of Values: The median is especially valuable when the values in the
dataset are unevenly spread. It depends on the relative position of the
values rather than on their magnitude. Therefore, the median can give a representative
measure even when the dataset has significant variations in values.
3. Ordinal Data: The median can be applied to ordinal data, which represents categories
with a specific order but does not necessarily have fixed numerical values. For example,
if you have a survey with responses like "strongly disagree," "disagree," "neutral,"
"agree," and "strongly agree," you can calculate the median to determine the central
tendency of the responses.
4. Data with Outliers: The median is resistant to outliers, which are extreme values that
deviate significantly from the other values in the dataset. Since the median is based on the
middle value(s), it is not affected by outliers. This makes it a useful measure when you
want to estimate the central tendency without the influence of extreme values.
5. Calculating the Median: If the dataset has an odd number of values, the median is simply
the middle value. If the dataset has an even number of values, the median is the average
of the two middle values. For example, in the dataset 2, 4, 6, 8, the median is the average
of 4 and 6, which is 5.
6. Visualization: The median can be used to divide a dataset into two equal halves. In a box
plot, the median is represented by a line dividing the box into two parts. This
visualization provides insights into the distribution of the data and helps identify any
skewness.
While the median is a useful measure of central tendency, it does have limitations. It may not
provide a complete picture of the dataset, as it only considers the middle value(s) and not the
range or variability of the data. Therefore, it is often used in conjunction with other descriptive
statistics and measures to gain a comprehensive understanding of the dataset.
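The different sensitivity of the mean and the median to outliers, discussed in the points above, can be seen in a small sketch using Python's standard statistics module. The extreme value 150 is a hypothetical outlier added for illustration; it is not from the original text.

```python
from statistics import mean, median

data = [5, 7, 9, 12, 15]      # dataset used earlier for the mean
with_outlier = data + [150]   # 150 is a hypothetical extreme value

# Without the outlier: mean 9.6, median 9
print(mean(data), median(data))

# With the outlier: mean 33, median 10.5
print(mean(with_outlier), median(with_outlier))
```

One extreme value more than triples the mean, while the median moves only from 9 to 10.5.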

Mode
The mode is another measure of central tendency that represents the most frequently occurring
value(s) in a dataset or the observation with maximum frequency. Unlike the mean and median,
which focus on the average or middle values, the mode identifies the value(s) that appear most
frequently.

Example:
1. The mode of {4, 2, 4, 3, 2, 2} is 2 because it occurs three times, which is more than any other
number.
2. Find the mode of the following observations.

X    10    12    14    16    18
F    2     12    23    16    8

The mode is 14, because it has the highest frequency (23).
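Both examples can be reproduced with a short sketch. The function names mode_of_values and mode_from_table are assumptions for this illustration, and both return only the first value found with the highest frequency, so they are not suitable as written for data with several modes.

```python
from collections import Counter


def mode_of_values(values):
    """Most frequent value in raw data (first one found if there are ties)."""
    value, _count = Counter(values).most_common(1)[0]
    return value


def mode_from_table(values, frequencies):
    """Value with the highest frequency in a frequency table."""
    highest = max(frequencies)
    return values[frequencies.index(highest)]


print(mode_of_values([4, 2, 4, 3, 2, 2]))  # 2

x = [10, 12, 14, 16, 18]
f = [2, 12, 23, 16, 8]
print(mode_from_table(x, f))  # 14
```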

Real life situations of mode


1. Clothing Sizes: In the fashion industry, the mode is used to identify the most commonly
worn clothing sizes. By determining the mode size for a particular garment,
manufacturers can ensure they produce enough inventory to meet customer demand and
minimize stock outs.
2. Product Preferences: Businesses often use the mode to analyze customer preferences and
identify the most popular products or features. For example, an e-commerce company
may determine the mode color, size, or style of a product to inform their inventory
management and marketing strategies.
3. Voting and Elections: The mode is relevant in analyzing election results. The winning
candidate or option is the mode of the votes cast, that is, the choice with the highest
frequency. If multiple candidates or options share the same highest frequency of votes, it
indicates a tie or the need for further resolution methods.
4. Traffic Patterns: The mode is useful in studying traffic patterns and identifying the most
common or peak times for traffic congestion. By analyzing traffic data and identifying
the mode time periods, urban planners and transportation authorities can make informed
decisions about infrastructure improvements and traffic management strategies.
5. Internet Usage: Internet service providers and website administrators often analyze
internet usage patterns to optimize their services. The mode can be used to identify peak
usage periods, such as the mode time of day or day of the week when internet traffic is
highest. This information helps in capacity planning and resource allocation.
6. Medical Diagnosis: In medical research and healthcare settings, the mode is employed to
identify the most prevalent disease or medical condition within a population. It helps
healthcare professionals and researchers understand the most commonly occurring
ailments, allocate resources accordingly, and develop appropriate preventive measures.
7. Survey Responses: When analyzing survey data, the mode can be used to identify the
most frequent response to a particular question. This information helps researchers and
businesses understand the prevailing opinions, preferences, or attitudes of the survey
respondents.
8. Stock Market Analysis: The mode is sometimes used in stock market analysis to identify
the most frequently occurring stock price within a given period. It can provide insights
into the price levels at which a particular stock is heavily traded, indicating potential
support or resistance levels.

Here are some key points about the mode:


1. Multiple Modes: A dataset can have one mode, known as unimodal, or multiple modes,
known as bimodal, trimodal, or multimodal. If there are several values that occur with the
same highest frequency, all of them are considered modes.
2. Categorical and Numerical Data: The mode can be calculated for both categorical and
numerical data. In categorical data, the mode represents the most common category or
class. For numerical data, it represents the value(s) with the highest frequency.
3. Useful with Nominal Data: The mode is particularly useful when dealing with nominal
data, which consists of categories that have no inherent order. For example, if you have a
dataset of eye colors (blue, brown, green), the mode would indicate the most frequently
occurring eye color.
4. Handling Skewed Distributions: The mode can be helpful in identifying the central
tendency of a dataset with skewed distributions. In skewed datasets, the mode can
provide information about the peak or cluster of values that occur most frequently, even
if the distribution is not symmetric.
5. Missing Mode: It is possible for a dataset to have no mode if all values occur with the
same frequency. In other words, if no value appears more frequently than others, the
dataset is considered to have no mode.
6. Visual Representation: The mode can be visually represented in a histogram or bar chart
by identifying the bar or category with the highest frequency. This helps visualize the
most common values in the dataset.
It's worth noting that the mode may not always be the most appropriate measure of central
tendency, especially when working with continuous or interval data. In such cases, the mean or
median may be more suitable. Additionally, the mode may not provide a comprehensive
summary of the dataset, as it focuses solely on the most frequent value(s) and does not consider
the overall distribution or variability of the data.
Overall, the mode is a valuable measure for identifying the most frequently occurring value(s) in
a dataset. It is particularly useful with nominal or categorical data and can provide insights into
the central tendency of skewed distributions.

Some additional points about the mode:


1. Applicability to Different Data Types: The mode can be calculated for various types of
data, including categorical, nominal, ordinal, and discrete numerical data. It is not limited
to a specific type of data and can be used to identify the most common category or value
within a dataset.
2. Data with No Mode or Multiple Modes: There are cases where a dataset may have no
mode, meaning that no value appears more frequently than others. For example, in a
dataset with the values [2, 4, 6, 8], there is no mode because all values occur with the
same frequency. On the other hand, a dataset can also have multiple modes, indicating
that two or more values occur with the same highest frequency. For example, in a dataset
with the values [2, 4, 4, 6, 6, 8], both 4 and 6 are modes.
3. Usefulness in Descriptive Statistics: The mode is a fundamental measure used in
descriptive statistics. Alongside the mean and median, it provides a comprehensive
summary of a dataset's central tendency. By considering the mode, you gain insights into
the most prevalent values, which can be valuable for understanding the characteristics of
the data.
4. Handling Skewed Distributions: The mode is particularly useful when dealing with
skewed distributions. Skewness refers to the asymmetry of the data. In such cases, the
mode can help identify the peak or cluster of values that occur most frequently, providing
information about the dominant features of the distribution.
5. Frequency Distribution: The mode is closely related to the concept of frequency
distribution. A frequency distribution table or graph displays the values in a dataset along
with their corresponding frequencies (how often they occur). The mode, in this context,
represents the value(s) with the highest frequency in the distribution.
6. Limitations of the Mode: While the mode is a valuable measure, it does have some
limitations. For instance, in datasets with continuous or interval data, the mode may not
accurately represent the central tendency since it focuses on specific values rather than
considering the entire range of values. In such cases, the mean or median might be more
appropriate measures to use.
Remember that the choice of the mode, mean, or median as a measure of central tendency
depends on the type of data, the distribution of values, and the specific objectives of the analysis.
It's often beneficial to consider multiple measures together to obtain a more complete
understanding of the dataset.
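The cases of no mode and multiple modes described in point 2 above can be sketched as follows. The function name find_modes is an assumption for this example, and returning an empty list when every value is equally frequent follows the convention used in this document rather than a universal rule.

```python
from collections import Counter


def find_modes(values):
    """Return every value that has the highest frequency, in sorted order.

    An empty list means "no mode": the data is empty, or two or more
    distinct values all occur equally often.
    """
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []  # no value appears more often than the others
    return sorted(v for v, c in counts.items() if c == highest)


print(find_modes([2, 4, 6, 8]))        # []
print(find_modes([2, 4, 4, 6, 6, 8]))  # [4, 6]
print(find_modes([4, 2, 4, 3, 2, 2]))  # [2]
```

Other conventions exist: Python's statistics.multimode, for instance, returns all four values for [2, 4, 6, 8] instead of reporting no mode.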

