
Chapter Three

This chapter discusses measures of central tendency, which provide a single value to represent a data set. It describes the three main measures: mean, median, and mode. The mean is the sum of all values divided by the number of values and can be used for both discrete and continuous data. The chapter provides formulas for calculating the mean of raw data, frequency distributions, and grouped continuous data. It also outlines properties a good measure of central tendency should have and how each type of measure is suited for certain data.

Uploaded by

Yohannis Reta
Copyright
© © All Rights Reserved

Chapter three: Measures of central Tendency

Contents

3.1 Introduction
3.2 The Summation Notation
3.3 Properties of Measures of Central Tendency
3.4 Types of Measures of Central Tendency
3.4.1 The Arithmetic Mean (simple and weighted)
3.4.2 The Median
3.4.3 The Mode
3.5 Measures of Non-central Locations

Measures of Central Tendency


“The way to make sense out of raw data is to compare and contrast, to understand difference.”

G. BATESON

Objectives
At the end of this chapter students will be able to:
• Identify measures of central tendency.
• Understand the properties of the arithmetic mean.
• Summarize an aggregate of statistical data by using a single measure.
• Define and calculate the mean, mode and median.
• Measure the position of data using quartiles, deciles and percentiles, and interpret them.

3.1 Introduction
Researchers are often interested in defining a value that best describes some attribute of the
population. The best way to reduce a set of data and still retain part of the information is to
summarize the set with a single value. Measures of central tendency are therefore one kind of
descriptive statistic.


When we want to compare groups of numbers, it is good to have a single value that is
considered to be a good representative of each group. This single value is called the average of the group.
- Averages are also called measures of central tendency.
- An average which is representative is called a typical average, and an average which is not
representative and has only a theoretical value is called a descriptive average.
3.2 The Summation Notation ()
Statistical Symbols: Let a data set consists of a number of observations, represents by 𝑥1 , 𝑥2
, … , 𝑥𝑛 where n (the last subscript) denotes the number of observations in the data and 𝑥𝑖 is the ith
observation. Then the sum of all numbers (𝑥𝑖 ′𝑠) where i goes from 1 up to n is symbolically
given by ∑𝑛𝑖=1 𝑥𝑖 𝑜𝑟 ∑ 𝑥𝑖 𝑜𝑟 ∑ 𝑥 that is,
∑ 𝑥𝑖 = 𝑥1 + 𝑥2 + … + 𝑥𝑛
x - whole set of numbers
𝑥𝑖 - specific score in a set of numbers
n - total number of observations
For instance a data set consisting of six measurements 2, 3, 9, 10, 8 and -2 is represented by 𝑥1 ,
𝑥2 , … , 𝑥6 where 𝑥1 = 2, 𝑥2 =3, 𝑥3 =9, 𝑥4 = 10, 𝑥5 = 8 and 𝑥6 =-2 Their sum becomes ∑6𝑖=1 𝑥𝑖
= 𝑥1 + 𝑥2 + … + 𝑥6 = 2+3+9+10+8+ (-2) = 30
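Some Properties of the Summation Notation
1. $\sum_{i=1}^{n} c = nc$, where c is a constant number.

2. $\sum_{i=1}^{n} b x_i = b \sum_{i=1}^{n} x_i$, where b is a constant number.

3. $\sum_{i=1}^{n} (a + b x_i) = na + b \sum_{i=1}^{n} x_i$, where a and b are constant numbers.

4. $\sum_{i=1}^{n} (x_i \pm y_i) = \sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i$

5. $\sum_{i=1}^{n} x_i y_i \neq \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i$ in general.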


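Example 3.1: Given $\sum_{i=1}^{7} x_i = 20$, $\sum_{i=1}^{7} y_i = 30$, $\sum_{i=1}^{7} x_i^2 = 420$ and $\sum_{i=1}^{7} y_i^2 = 280$, find:

i/ $\sum_{i=1}^{7} (6x_i + 4y_i) = 6\sum_{i=1}^{7} x_i + 4\sum_{i=1}^{7} y_i = 6(20) + 4(30) = 240$

ii/ $3\sum_{i=1}^{7} x_i^2 - 2\sum_{i=1}^{7} y_i^2 = 3(420) - 2(280) = 700$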
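
The summation properties can also be checked numerically. The sketch below uses the six measurements from the text as x; the second series y and the constants a, b and c are hypothetical values chosen only for illustration.

```python
# Six measurements from the text.
x = [2, 3, 9, 10, 8, -2]
# Hypothetical second series and constants (not from the text).
y = [1, 4, 0, 5, -3, 2]
a, b, c = 5, 3, 7
n = len(x)

total_x = sum(x)  # 2 + 3 + 9 + 10 + 8 + (-2)

# Property 1: summing a constant n times gives n * c.
prop1 = sum(c for _ in range(n)) == n * c
# Property 2: a constant factor can be taken outside the sum.
prop2 = sum(b * xi for xi in x) == b * sum(x)
# Property 3: sum of (a + b * x_i) equals n * a + b * sum(x).
prop3 = sum(a + b * xi for xi in x) == n * a + b * sum(x)
# Property 4: the sum of term-by-term sums is the sum of the sums.
prop4 = sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)

# Property 5: the sum of products generally differs from the product of sums.
sum_of_products = sum(xi * yi for xi, yi in zip(x, y))
product_of_sums = sum(x) * sum(y)

print(total_x)                           # 30
print(prop1, prop2, prop3, prop4)        # True True True True
print(sum_of_products, product_of_sums)  # 36 270
```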

3.3 Properties of measures of central tendency


A good average should be:
1. Rigidly defined (unique).
2. Based on all observations under investigation.
3. Easily understood.
4. Simple to compute.
5. Suitable for further mathematical treatment.
6. Little affected by fluctuations of sampling.
7. Not highly affected by extreme values.
3.4 Types of Measures of Central Tendency
Measures of central tendency give us information about the location of the center of the
distribution of data values. A single value that describes the characteristics of the entire mass of
data is called a measure of central tendency. In this unit we briefly discuss the three measures of
central tendency: mean, median and mode.
Each of the following measures of central tendency is suitable for a particular type of data.
These are
• Arithmetic Mean
- Weighted Arithmetic Mean
- Combined mean
• Median
• Mode or modal value
3.4.1 Arithmetic Mean: The arithmetic mean is defined as the sum of the measurements of the
items divided by the total number of items. It is usually denoted by $\bar{x}$.
Arithmetic Mean for individual series


Suppose $x_1, x_2, \ldots, x_n$ are observed values in a sample of size n from a population of size N, n < N. Then the arithmetic mean of the sample, denoted by $\bar{x}$, is given by

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$

If we take an entire population, the mean is denoted by μ and is given by:

$$\mu = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N}$$

where N stands for the total number of observations in the population.


Example 3.2: Consider the samples given below:
i. 46 54 21 35
ii. 10.5 2.4 3.6 5.9 8.7
Find the arithmetic mean of each sample.
Solution:
i. The sample values are: 46 54 21 35

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{46 + 54 + 21 + 35}{4} = \frac{156}{4} = 39$$

The arithmetic mean of the sample values is 39.

ii. The sample values are: 10.5 2.4 3.6 5.9 8.7

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{10.5 + 2.4 + 3.6 + 5.9 + 8.7}{5} = \frac{31.1}{5} = 6.22$$

The arithmetic mean of the sample values is 6.22.
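
The calculation in Example 3.2 can be sketched in a few lines of Python. The function name `arithmetic_mean` is an illustrative choice, not something defined in the text.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

mean_i = arithmetic_mean([46, 54, 21, 35])
mean_ii = arithmetic_mean([10.5, 2.4, 3.6, 5.9, 8.7])

print(mean_i)             # 39.0
print(round(mean_ii, 2))  # 6.22
```

The second result is rounded before printing because decimal fractions such as 2.4 are stored only approximately in floating point.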


Arithmetic mean for discrete data arranged in frequency distribution
When the numbers $x_1, x_2, \ldots, x_k$ occur with frequencies $f_1, f_2, \ldots, f_k$, respectively, then the
mean can be expressed in a more compact form as:

$$\bar{x} = \frac{x_1 f_1 + x_2 f_2 + \cdots + x_k f_k}{f_1 + f_2 + \cdots + f_k} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$$

Example 3.3: Calculate the arithmetic mean of the sample of numbers of students in 10 classes:
50 42 48 60 58 54 50 42 50 42

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{50 + 42 + 48 + 60 + 58 + 54 + 50 + 42 + 50 + 42}{10} = \frac{496}{10} = 49.6 \approx 50$$

In this case there are three 42’s, one 48, three 50’s, one 54, one 58 and one 60. The number of
times each number occurs is called its frequency and the frequency is usually denoted by f. The
information in the sentence above can be written in a table, as follows.


Value, x_i         42    48    50    54    58    60
Frequency, f_i      3     1     3     1     1     1
x_i f_i           126    48   150    54    58    60

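The formula for the arithmetic mean for data of this type is

$$\bar{x} = \frac{x_1 f_1 + x_2 f_2 + \cdots + x_k f_k}{f_1 + f_2 + \cdots + f_k} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$$

In this case we have:

$$\bar{x} = \frac{42 \times 3 + 48 \times 1 + 50 \times 3 + 54 \times 1 + 58 \times 1 + 60 \times 1}{3 + 1 + 3 + 1 + 1 + 1} = \frac{126 + 48 + 150 + 54 + 58 + 60}{10} = \frac{496}{10} = 49.6 \approx 50$$

The mean number of students in the ten classes is approximately 50.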
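
As a rough illustration, the frequency table of Example 3.3 can be built from the raw data and then used in the frequency formula. `Counter` comes from the Python standard library; the function name `frequency_mean` is a hypothetical helper.

```python
from collections import Counter

class_sizes = [50, 42, 48, 60, 58, 54, 50, 42, 50, 42]

# Build the frequency table: value -> number of times it occurs.
freq = Counter(class_sizes)

def frequency_mean(freq_table):
    """Mean of discrete data given as {value: frequency}."""
    total = sum(x * f for x, f in freq_table.items())  # sum of x_i * f_i
    n = sum(freq_table.values())                       # sum of f_i
    return total / n

mean_from_table = frequency_mean(freq)

print(sorted(freq.items()))
# [(42, 3), (48, 1), (50, 3), (54, 1), (58, 1), (60, 1)]
print(round(mean_from_table, 1))  # 49.6
```

The result agrees with the mean computed directly from the ten raw values, as the two formulas are algebraically the same.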


Arithmetic Mean for Grouped Continuous Frequency Distribution
If data are given in the form of a continuous frequency distribution, the sample mean can be
computed as

$$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} = \frac{x_1 f_1 + x_2 f_2 + \cdots + x_k f_k}{f_1 + f_2 + \cdots + f_k}$$

where $x_i$ is the class mark of the ith class, i = 1, 2, . . . , k; $f_i$ is the frequency of the ith class; and k is the number of classes.

Note that $\sum_{i=1}^{k} f_i = n$ = the total number of observations.
Example 3.4: The following frequency table gives the heights (in inches) of 100 students in a
college.

Class Interval (CI)   60-62   62-64   64-66   66-68   68-70   70-72   Total
Frequency (f)             5      18      42      20       8       7     100

Calculate the mean.

Solution:
The formula to be used for the mean is as follows:

$$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$$

Let us calculate these values and arrange them in a table for the sake of convenience.

Class Interval (CI)   60-62   62-64   64-66   66-68   68-70   70-72   Total
Frequency (f_i)           5      18      42      20       8       7     100
Mid-Point (x_i)          61      63      65      67      69      71
f_i x_i                 305    1134    2730    1340     552     497    6558

Substituting these values, with $\sum_{i=1}^{6} f_i = 100$, we get

$$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} = \frac{6558}{100} = 65.58$$

The mean height of the students is 65.58 inches.
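
The same computation can be sketched in Python. The representation of each class as a `(lower, upper, frequency)` tuple and the name `grouped_mean` are assumptions made for this illustration.

```python
# Each class of Example 3.4 as (lower limit, upper limit, frequency).
height_classes = [
    (60, 62, 5), (62, 64, 18), (64, 66, 42),
    (66, 68, 20), (68, 70, 8), (70, 72, 7),
]

def grouped_mean(classes):
    """Mean of grouped data, using the class mark (mid-point) of each class."""
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f

mean_height = grouped_mean(height_classes)
print(round(mean_height, 2))  # 65.58
```

Because every observation in a class is replaced by the class mark, the result is an approximation of the mean of the original ungrouped heights.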

Properties of the Arithmetic Mean


• The algebraic sum of the deviations of a set of numbers $x_1, x_2, \ldots, x_n$ from their mean $\bar{x}$ is always zero, i.e.

$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$$

• The sum of squares of deviations from the mean is the least. That is, $\sum_{i=1}^{n} (x_i - A)^2$ is minimum when $A = \bar{x}$.

• If the mean of $x_1, x_2, \ldots, x_n$ is $\bar{x}$, then

a) The mean of $x_1 \pm k, x_2 \pm k, \ldots, x_n \pm k$ will be $\bar{x} \pm k$.
b) The mean of $kx_1, kx_2, \ldots, kx_n$ will be $k\bar{x}$.
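
The first two properties follow directly from the definition $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. A short derivation, added here as a supporting sketch, with A standing for any constant:

```latex
% Property 1: deviations from the mean sum to zero.
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0

% Property 2: write x_i - A = (x_i - \bar{x}) + (\bar{x} - A) and expand.
\sum_{i=1}^{n} (x_i - A)^2
  = \sum_{i=1}^{n} (x_i - \bar{x})^2
    + 2(\bar{x} - A)\sum_{i=1}^{n} (x_i - \bar{x})
    + n(\bar{x} - A)^2
  = \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - A)^2
```

The middle term vanishes by the first property, and the remaining term $n(\bar{x} - A)^2$ is never negative and equals zero only when $A = \bar{x}$, so the sum of squares is smallest there.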
Merits of Arithmetic Mean
• Arithmetic mean has a rigidly defined mathematical formula so that its value is always definite
or unique. It can be calculated for any set of numerical data.
• It is calculated based on all observations.
• Arithmetic mean is simple to calculate and easy to understand.
• It doesn’t need arrangement of data in increasing or decreasing order.
• Arithmetic mean of many samples from the same population does not fluctuate considerably.
• It affords a good standard of comparison.
Demerits of Arithmetic Mean
• It can’t be calculated for data which are not quantifiable.
• It is highly affected by extreme (abnormal) values in the series.
• It can be a number which does not exist in the series.
• It can’t be calculated for grouped continuous data with open-ended classes, because the class mark of an open-ended class cannot be determined.
Weighted Arithmetic Mean


In calculating the simple arithmetic mean, all items were assumed to be of equal importance (each
value in the data set has equal weight). When the observations have different weights, we use the weighted
average. Weights are assigned to each item in proportion to its relative importance.
If $x_1, x_2, \ldots, x_n$ represent the values of the items and $w_1, w_2, \ldots, w_n$ are the corresponding weights, then the
weighted mean, $\bar{x}_w$, is given by

$$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

Example 3.5: A student’s final marks in Mathematics, Physics, Chemistry and Biology are respectively A,
B, D and C. If the respective credits received for these courses are 4, 4, 3 and 2, determine the
approximate average grade the student has got for the courses.
Solution
We use a weighted arithmetic mean, with the weight associated with each course taken as the number of
credits received for that course. The grades are converted to grade points as A = 4, B = 3, C = 2 and D = 1.

x_i         4     3     1     2    Total
w_i         4     4     3     2     13
x_i w_i    16    12     3     4     35

$$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} = \frac{16 + 12 + 3 + 4}{13} = \frac{35}{13} \approx 2.69$$

The average grade of the student is approximately 2.69.
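
A minimal sketch of the weighted mean, applied to Example 3.5. The grade-point scale is the one implied by the example's table; the names `GRADE_POINTS` and `weighted_mean` are illustrative.

```python
# Grade points as used in the example's table (A = 4, B = 3, C = 2, D = 1).
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1}

def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

grades = ["A", "B", "D", "C"]   # Mathematics, Physics, Chemistry, Biology
credit_hours = [4, 4, 3, 2]

average_grade = weighted_mean([GRADE_POINTS[g] for g in grades], credit_hours)
print(round(average_grade, 2))  # 2.69
```

With equal weights the function reduces to the simple arithmetic mean, which for these four grade points would be 2.5 rather than 2.69.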

Combined mean: When a set of observations is divided into k groups and $\bar{x}_1$ is the mean of the $n_1$
observations of group 1, $\bar{x}_2$ is the mean of the $n_2$ observations of group 2, …, $\bar{x}_k$ is the mean of the $n_k$
observations of group k, then the combined mean, denoted by $\bar{x}_c$, of all observations taken together is
given by

$$\bar{x}_c = \frac{\bar{x}_1 n_1 + \bar{x}_2 n_2 + \cdots + \bar{x}_k n_k}{n_1 + n_2 + \cdots + n_k}$$

This is a special case of the weighted mean. In this case the sample sizes are the weights.


Example 3.6: In the previous year there were two sections taking the Statistics course. At the end of the
semester, the two sections got average marks of 70 and 78. There were 45 and 50 students in the two sections,
respectively. Find the mean mark for all the students.
Solution:

$$\bar{x}_c = \frac{\bar{x}_1 n_1 + \bar{x}_2 n_2 + \cdots + \bar{x}_k n_k}{n_1 + n_2 + \cdots + n_k} = \frac{\bar{x}_1 n_1 + \bar{x}_2 n_2}{n_1 + n_2} = \frac{70 \times 45 + 78 \times 50}{45 + 50} = \frac{7050}{95} \approx 74.21$$

The combined mean of all the students is 74.21.
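
The combined mean can be sketched as follows; `combined_mean` is a hypothetical helper name. The comparison with the plain average of the two section means is added only to show why the group sizes matter.

```python
def combined_mean(means, sizes):
    """Mean of all observations, given each group's mean and size."""
    if len(means) != len(sizes):
        raise ValueError("means and sizes must have the same length")
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)

section_means = [70, 78]
section_sizes = [45, 50]

overall = combined_mean(section_means, section_sizes)
unweighted = sum(section_means) / len(section_means)

print(round(overall, 2))  # 74.21
print(unweighted)         # 74.0
```

The combined mean lies slightly above 74 because the section with the higher average has more students.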


3.4.2 Median
The median is, as its name indicates, the middlemost value in the arrangement, which divides the data into
two equal parts. It is obtained by arranging the data in increasing or decreasing order of magnitude and is
denoted by $\tilde{x}$.
Median for individual series
We arrange the sample in ascending order of the variable of interest. Then the median is the
middle value (if the sample size n is odd) or the average of the two middle values (if the sample
size n is even).
For individual series the median is obtained by

a/ $\tilde{x} = \left(\frac{n+1}{2}\right)^{\text{th}}$ value, if n is odd, and

b/ $\tilde{x} = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{value} + \left(\frac{n}{2} + 1\right)^{\text{th}} \text{value}}{2}$, if n is even.
Example 3.10: Find the median for the following data.
a/ -5 15 10 5 0 2 1 4 6 and 8
b/ 5 2 2 3 1 8 4

Solution:
a/ The data in ascending order is given by:

-5 0 1 2 4 5 6 8 10 15
n = 10, so n is even. The two middle values are the 5th and 6th observations. So the median
is

$$\tilde{x} = \frac{\left(\frac{10}{2}\right)^{\text{th}} \text{value} + \left(\frac{10}{2} + 1\right)^{\text{th}} \text{value}}{2} = \frac{5^{\text{th}} \text{value} + 6^{\text{th}} \text{value}}{2} = \frac{4 + 5}{2} = 4.5$$

b/ The data in ascending order is given by:

1 2 2 3 4 5 8
n = 7, so n is odd. The middle value is the 4th observation. So the median is 3.
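
The two cases of the median rule can be sketched in Python as below. The function is written out to mirror formulas a/ and b/; Python's standard library also provides `statistics.median`, which gives the same results.

```python
import statistics

def median(values):
    """Median of raw data: middle value, or average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)th value, which is index mid when counting from 0.
        return ordered[mid]
    # n even: average of the (n / 2)th and (n / 2 + 1)th values.
    return (ordered[mid - 1] + ordered[mid]) / 2

data_a = [-5, 15, 10, 5, 0, 2, 1, 4, 6, 8]
data_b = [5, 2, 2, 3, 1, 8, 4]

print(median(data_a))  # 4.5
print(median(data_b))  # 3
```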


Note: The median is easy to calculate for small samples and is not affected by an outlier.
Median for discrete data arranged in a frequency distribution: In this case also, the position of the
median is obtained from formulas a/ and b/ above. After arranging the values in increasing order, find the
smallest cumulative frequency (CF) greater than or equal to that position; the
corresponding value is the median.
Median for grouped continuous data: For continuous data, the median is obtained by the
following formula.

$$\tilde{x} = \text{Median} = L + \frac{w}{f_{med}}\left(\frac{n}{2} - CF\right)$$

where L = the lower class boundary of the median class; w = the class width of the median
class; $f_{med}$ = the frequency of the median class; and CF = the cumulative frequency corresponding to the class
preceding the median class, that is, the sum of the frequencies of all classes lower than the
median class. The median class is the class which contains the (n/2)th observation, whether n
is odd or even, since the items lose their individual identity once they are grouped into
continuous classes.
Example 3.11: Calculate the median for the following frequency distribution.

C.I      1-5   6-10   11-15   16-20   21-25   26-30   31-35   Total
Freq.      4      8      12       6       3       4       3      40

Solution: Construct the less than cumulative frequency distribution:

C.I            1-5   6-10   11-15   16-20   21-25   26-30   31-35   Total
Freq.            4      8      12       6       3       4       3      40
Cum. Freq.       4     12      24      30      33      37      40

Since n = 40, n/2 = 40/2 = 20, and the smallest CF greater than or equal to 20 is 24; thus, the median class
is the third class. For this class, L = 10.5, w = 5, $f_{med}$ = 12 and CF = 12. Applying the formula,
we get:

$$\tilde{x} = 10.5 + \frac{5}{12}(20 - 12) \approx 13.8$$
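
The steps of Example 3.11 (accumulate frequencies, locate the median class, interpolate) can be sketched as follows. The class boundaries are written out explicitly as 0.5 below and above the class limits; the function name `grouped_median` is an illustrative choice.

```python
def grouped_median(boundaries, freqs):
    """Median of grouped data: L + (w / f_med) * (n / 2 - CF)."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("the distribution has no observations")
    target = n / 2
    cum = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(boundaries, freqs):
        if cum + f >= target:
            # First class whose cumulative frequency reaches n / 2.
            return lower + (upper - lower) / f * (target - cum)
        cum += f

class_boundaries = [
    (0.5, 5.5), (5.5, 10.5), (10.5, 15.5), (15.5, 20.5),
    (20.5, 25.5), (25.5, 30.5), (30.5, 35.5),
]
class_freqs = [4, 8, 12, 6, 3, 4, 3]

grouped_med = grouped_median(class_boundaries, class_freqs)
print(round(grouped_med, 1))  # 13.8
```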
Merits of median
• It is less affected by extreme values.


• Median can be calculated even in case of open-ended intervals.


• It can be computed for ratio, interval, and ordinal level of data.
Demerits of median
• Its value is not determined by each and every observation.
• It is not a good representative of the data if the number of items (data) is small.
• Arranging the items in order of magnitude can be a very tedious process if the
number of items is very large.
3.4.3 The Mode or modal value
The mode or the modal value is the value with the highest frequency and is denoted by $\hat{x}$. A data
set may not have a mode or may have more than one mode. A distribution is called a bimodal
distribution if it has two data values that appear with the greatest frequency. If a distribution has
more than two modes, then the distribution is multimodal. If a distribution has no mode, then
the distribution is non-modal.
Mode of individual series:- The mode or the modal value of individual series (raw data) is simply
obtained by locating the observation with the maximum frequency.
Example 3.12: Consider the following data:
a. 30 45 69 70 32 18 32. The mode (𝑥̂ ) = 32.
b. 10 20 30 10 40 30. The mode (𝑥̂ ) = 10 and 30.
c. 10 40 30 20 50 60. No mode.
Note that in some samples there may be more than one mode or there may not be a mode. The
mode is not a suitable measure of central tendency in these cases. We use the mode as a measure
of central tendency if we require a measure that takes on one of the sample values. The mode can
be used for variables that are measured on a categorical (nominal) scale, e.g. the most popular
computer type.
Mode for discrete data arranged in a frequency distribution: In the case of discrete grouped
data, the mode is determined by inspection: it is the value (or values) having the highest frequency.
Mode for Grouped Continuous Frequency Distribution
For grouped continuous data, only the modal class, the class with the highest frequency, can be identified directly.
After locating this class, the mode is interpolated using:

$$\text{Mode} = \hat{x} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times w$$

where L = the lower class boundary of the modal class; $\Delta_1 = f_{mod} - f_1$; $\Delta_2 = f_{mod} - f_2$; w = the common class width; $f_1$ = frequency of the class
immediately preceding the modal class; $f_2$ = frequency of the class immediately succeeding the
modal class; and $f_{mod}$ = frequency of the modal class.
Example 3.13: Calculate the mode for the frequency distribution of data of example 3.11.
Solution: By inspection, the mode lies in the third class, where L = 10.5, $f_{mod}$ = 12, $f_1$ = 8, $f_2$ = 6 and w = 5.
Using the formula, the mode is:

$$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times w = 10.5 + \frac{12 - 8}{(12 - 8) + (12 - 6)} \times 5 = 10.5 + \frac{4}{10} \times 5 = 12.5$$
Merits of mode
• Mode is not affected by extreme values.
• We can change the size of the observations without changing the mode.
• It can be computed for all levels of data, i.e. ratio, interval, ordinal or nominal.
Demerits of mode
• It may not exist in the series.
• It does not take every value into consideration.
• If it exists, it may not be unique.
3.4.4 The Relationship of the Mean, Median and Mode
Comparing the Mean, Median, and the Mode
• If the data are skewed, avoid the mean.
• If there is a large gap around the middle of the data, avoid the median.
• A measure is a resistant measure if its value is not affected by an outlier or an extreme
data value.
• The mean is not a resistant measure of central tendency because it is not resistant to the
influence of the extreme data values or outliers.
• The median is resistant to the influence of extreme data values or outliers and its value
does not respond strongly to the changes of a few extreme data values regardless of how
large the change may be.
• The mode has an advantage over both the mean and the median when the data are
categorical, since it is generally not possible to calculate the mean or median for this type of data.
Also, the mode usually indicates the location within a large distribution where the data
values are concentrated. However, the mode cannot always be calculated: if all the data values in a
distribution are different, the distribution is non-modal.
• In the case of a symmetrical distribution, the mean, median and mode coincide. That is,
mean = median = mode. However, for a moderately asymmetrical (non-symmetrical)
distribution, the mean and mode lie at the two ends and the median lies between them, and they
satisfy the following important empirical relationship:
Mean - Mode = 3(Mean - Median)


Example 3.14: In a moderately asymmetrical distribution, the mean and the mode are 30 and 42
respectively. What is the median of the distribution?
Solution:
Solving Mean - Mode = 3(Mean - Median) for the median gives
Median = (2 × Mean + Mode)/3 = (2 × 30 + 42)/3 = 102/3 = 34
Hence the median of the distribution is 34.
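
The empirical relationship can be rearranged to give any one of the three measures from the other two. The helper names below are hypothetical, and the results are only approximations that apply to moderately asymmetrical distributions.

```python
def median_from_mean_and_mode(mean, mode):
    """Rearranged from Mean - Mode = 3 * (Mean - Median)."""
    return (2 * mean + mode) / 3

def mode_from_mean_and_median(mean, median):
    """The same relationship solved for the mode."""
    return 3 * median - 2 * mean

estimated_median = median_from_mean_and_mode(30, 42)
recovered_mode = mode_from_mean_and_median(30, estimated_median)

print(estimated_median)  # 34.0
print(recovered_mode)    # 42.0
```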
Which of the Three Measures is the Best?
At this stage, one may ask which of these three measures of central tendency is the best.
There is no simple answer to this question, because the three measures are based upon
different concepts. The arithmetic mean is the sum of the values divided by the total number of
observations in the series. The median is the value of the middle observation, and the mode is the
value around which the observations tend to concentrate. As such, the use of a particular measure
will largely depend on the purpose of the study and the nature of the data. For example, when we are
interested in knowing consumers’ preferences for different brands of television sets or kinds of
advertising, the choice should go in favor of the mode. The use of the mean or median would not be
proper. However, the median can sometimes be used in the case of qualitative data when such data can
be arranged in ascending or descending order. Let us take another example. Suppose we invite
applications for a certain vacancy in our company and a large number of candidates apply for that
post. We are now interested in knowing which age or age group has the largest concentration of
applicants. Here, obviously, the mode will be the most appropriate choice. The arithmetic mean may
not be appropriate as it may be influenced by some extreme values.
3.5 Measures of Non-central Locations
The median is the value of the middle item, which divides the data into two equal parts and is found by
arranging the data in increasing or decreasing order of magnitude, whereas quantiles are
measures which divide a given set of data into approximately equal subdivisions and are
obtained by the same procedure as the median. They are averages of position (measures of non-central
location). Some of these are quartiles, deciles and percentiles.
Quartiles: are values which divide the data set into approximately four equal parts, denoted by
$Q_1$, $Q_2$ and $Q_3$. The first quartile ($Q_1$) is also called the lower quartile and the third quartile
($Q_3$) is the upper quartile. The second quartile ($Q_2$) is the median.
• Quartiles for Individual series:
Let $x_1, x_2, \ldots, x_n$ be n ordered observations. The ith quartile ($Q_i$) is the value of the item
corresponding to the [i(n+1)/4]th position, i = 1, 2, 3.
That is, after arranging the data in ascending order, $Q_1$, $Q_2$ and $Q_3$ are obtained by:

$$Q_1 = \left(\frac{1(n+1)}{4}\right)^{\text{th}} \text{value}, \quad Q_2 = \left(\frac{2(n+1)}{4}\right)^{\text{th}} \text{value} \quad \text{and} \quad Q_3 = \left(\frac{3(n+1)}{4}\right)^{\text{th}} \text{value}.$$

• Quartiles for discrete data arranged in a frequency distribution: In this case also, we follow
the same procedure as for the median. That is, we construct
the less than cumulative frequency distribution and apply the formula of quartiles for individual
series.
• Quartiles in continuous data: For continuous data, use the following formula:

$$Q_i = L + \frac{w}{f_{Q_i}}\left(\frac{in}{4} - CF\right), \quad i = 1, 2, 3$$

where L, w, $f_{Q_i}$ and CF are defined in the same way as for the median, but with respect to the class containing the ith quartile. That is,

$$Q_1 = L + \frac{w}{f_{Q_1}}\left(\frac{n}{4} - CF\right), \quad Q_2 = L + \frac{w}{f_{Q_2}}\left(\frac{2n}{4} - CF\right) \quad \text{and} \quad Q_3 = L + \frac{w}{f_{Q_3}}\left(\frac{3n}{4} - CF\right)$$

The class in question is the one containing the (i × n/4)th value. That is, the class with the smallest
cumulative frequency greater than or equal to i × n/4 is the class of the ith quartile.
Deciles: are values dividing the data into approximately ten equal parts, denoted by $D_1, D_2, \ldots, D_9$.
• Deciles for Individual Series:
Let $x_1, x_2, \ldots, x_n$ be n ordered observations. The ith decile ($D_i$) is the value of the item
corresponding to the [i(n+1)/10]th position, i = 1, 2, . . . , 9.
That is, after arranging the data in ascending order, $D_1, D_2, \ldots, D_9$ are obtained by:

$$D_1 = \left(\frac{1(n+1)}{10}\right)^{\text{th}} \text{value}, \quad D_2 = \left(\frac{2(n+1)}{10}\right)^{\text{th}} \text{value}, \; \ldots \; \text{and} \quad D_9 = \left(\frac{9(n+1)}{10}\right)^{\text{th}} \text{value}.$$

• Deciles for discrete data arranged in a frequency distribution: In this case also, we follow
the same procedure as for the median. That is, we construct
the less than cumulative frequency distribution and apply the formula of deciles for individual
series.
• Deciles for continuous data: Apply the following formula and follow the procedure for quartiles
of continuous data.

$$D_i = L + \frac{w}{f_{D_i}}\left(\frac{in}{10} - CF\right), \quad i = 1, 2, \ldots, 9$$

The symbols are defined in the same way as for quartiles of continuous data.
Percentiles: are values which divide the data into approximately one hundred equal parts, and are
denoted by $P_1, P_2, \ldots, P_{99}$.
• Percentiles for Individual Series:
Let $x_1, x_2, \ldots, x_n$ be n ordered observations. The ith percentile ($P_i$) is the value of the item
corresponding to the [i(n+1)/100]th position, i = 1, 2, . . . , 99.
That is, after arranging the data in ascending order, $P_1, P_2, \ldots, P_{99}$ are obtained by:

$$P_1 = \left(\frac{1(n+1)}{100}\right)^{\text{th}} \text{value}, \quad P_2 = \left(\frac{2(n+1)}{100}\right)^{\text{th}} \text{value}, \; \ldots \; \text{and} \quad P_{99} = \left(\frac{99(n+1)}{100}\right)^{\text{th}} \text{value}.$$

• Percentiles for discrete data arranged in a frequency distribution: In this case also, we follow
the same procedure as for the median. That is, we construct
the less than cumulative frequency distribution and apply the formula of percentiles for individual
series.
• Percentiles for continuous data: Apply the following formula

$$P_i = L + \frac{w}{f_{P_i}}\left(\frac{in}{100} - CF\right), \quad i = 1, 2, \ldots, 99$$

The symbols are defined in the same way as for quartiles or deciles of continuous data.
Interpretations
1. 𝑄𝑖 is the value below which ( i × 25) percent of the observations in the series are found
(where i = 1, 2,3). For instance 𝑄3 means the value below which 75 percent of observations in
the given series are found.
2. 𝐷𝑖 is the value below which ( i ×10) percent of the observations in the series are found (where
i = 1, 2,...,9 ). For instance 𝐷4 is the value below which 40 percent of the values are found in the
series.
3. 𝑃𝑖 is the value below which i percent of the total observations are found (where i = 1, 2,3,...,99
). For example 60 percent of the observations in a given series are below 𝑃60 .

Example 3.15: Calculate $Q_1$, $Q_2$, $Q_3$, $D_4$, $D_9$, $P_{40}$ and $P_{90}$ for the data given in the table
below.

x     10   11   12   13   14   15   16   17   18
f      2    8   25   48   65   40   20    9    2

Solution: The data is arranged in increasing order. So we need to construct only the
cumulative frequency table before calculating the required values.

x            10   11   12   13    14    15    16    17    18
f             2    8   25   48    65    40    20     9     2
Cum. Freq.    2   10   35   83   148   188   208   217   219

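The total number of observations is 219, which is odd. Clearly then the median is 14, i.e.

$$\tilde{x} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{value} = \left(\frac{219+1}{2}\right)^{\text{th}} \text{value} = 110^{\text{th}} \text{ value} = 14$$

$$Q_1 = \left(\frac{1(n+1)}{4}\right)^{\text{th}} \text{value} = \left(\frac{1(219+1)}{4}\right)^{\text{th}} \text{value} = 55^{\text{th}} \text{ value} = 13$$

$$Q_2 = \left(\frac{2(n+1)}{4}\right)^{\text{th}} \text{value} = \left(\frac{2(219+1)}{4}\right)^{\text{th}} \text{value} = 110^{\text{th}} \text{ value} = 14 = \tilde{x}$$

$$Q_3 = \left(\frac{3(n+1)}{4}\right)^{\text{th}} \text{value} = \left(\frac{3(219+1)}{4}\right)^{\text{th}} \text{value} = 165^{\text{th}} \text{ value} = 15$$

$$D_4 = \left(\frac{4(n+1)}{10}\right)^{\text{th}} \text{value} = \left(\frac{4(219+1)}{10}\right)^{\text{th}} \text{value} = 88^{\text{th}} \text{ value} = 14$$

$$D_9 = \left(\frac{9(n+1)}{10}\right)^{\text{th}} \text{value} = \left(\frac{9(219+1)}{10}\right)^{\text{th}} \text{value} = 198^{\text{th}} \text{ value} = 16$$

$$P_{40} = \left(\frac{40(n+1)}{100}\right)^{\text{th}} \text{value} = \left(\frac{40(219+1)}{100}\right)^{\text{th}} \text{value} = 88^{\text{th}} \text{ value} = 14$$

$$P_{90} = \left(\frac{90(n+1)}{100}\right)^{\text{th}} \text{value} = \left(\frac{90(219+1)}{100}\right)^{\text{th}} \text{value} = 198^{\text{th}} \text{ value} = 16$$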
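
The lookups in Example 3.15 all follow one pattern: compute the position i(n+1)/parts, then take the first value whose cumulative frequency reaches that position. A sketch under that reading of the text; the names `value_at_position` and `quantile` are illustrative.

```python
from itertools import accumulate

xs = [10, 11, 12, 13, 14, 15, 16, 17, 18]
fs = [2, 8, 25, 48, 65, 40, 20, 9, 2]

def value_at_position(values, freqs, position):
    """First value whose cumulative frequency is >= the given position."""
    for x, cf in zip(values, accumulate(freqs)):
        if cf >= position:
            return x
    raise ValueError("position exceeds the number of observations")

def quantile(values, freqs, i, parts):
    """ith quantile when the data are split into `parts` parts (4, 10 or 100)."""
    n = sum(freqs)
    position = i * (n + 1) / parts
    return value_at_position(values, freqs, position)

q1, q2, q3 = (quantile(xs, fs, i, 4) for i in (1, 2, 3))
d4, d9 = quantile(xs, fs, 4, 10), quantile(xs, fs, 9, 10)
p40, p90 = quantile(xs, fs, 40, 100), quantile(xs, fs, 90, 100)

print(q1, q2, q3)  # 13 14 15
print(d4, d9)      # 14 16
print(p40, p90)    # 14 16
```

As expected, $D_4$ equals $P_{40}$ and $D_9$ equals $P_{90}$, since they refer to the same positions in the ordered data.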

Example 3.16: The marks of 50 students, out of 85, are given below. Based on the data, find $Q_1$,
$D_4$ and $P_7$.

Marks   46-50   51-55   56-60   61-65   66-70   71-75   76-80
f_i         4       8      15       5       9       5       4

Solution: First find the class boundaries and the cumulative frequency distribution.

Marks              46-50       51-55       56-60       61-65       66-70       71-75       76-80
Class boundary   45.5-50.5   50.5-55.5   55.5-60.5   60.5-65.5   65.5-70.5   70.5-75.5   75.5-80.5
f_i                  4           8          15           5           9           5           4
Cum. frequency       4          12          27          32          41          46          50

$Q_1$: the (n/4)th value = 12.5th value, which lies in the class 55.5 - 60.5.

$$Q_1 = L + \frac{w}{f_{Q_1}}\left(\frac{n}{4} - CF\right) = 55.5 + \frac{5}{15}(12.5 - 12) \approx 55.7$$

$D_4$: the (4n/10)th value = 20th value, which lies in the class 55.5 - 60.5.

$$D_4 = L + \frac{w}{f_{D_4}}\left(\frac{4n}{10} - CF\right) = 55.5 + \frac{5}{15}(20 - 12) \approx 58.2$$

$P_7$: the (7n/100)th value = 3.5th value, which lies in the class 45.5 - 50.5.

$$P_7 = L + \frac{w}{f_{P_7}}\left(\frac{7n}{100} - CF\right) = 45.5 + \frac{5}{4}(3.5 - 0) = 49.875$$
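
The quartile, decile and percentile formulas for continuous data differ only in the divisor (4, 10 or 100), so one function can cover all three. A sketch using the data of Example 3.16; the name `grouped_quantile` and the tuple layout of the class boundaries are assumptions of this illustration.

```python
def grouped_quantile(boundaries, freqs, i, parts):
    """L + (w / f) * (i * n / parts - CF) for the class containing the quantile."""
    n = sum(freqs)
    target = i * n / parts
    cum = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cum + f >= target:
            # First class whose cumulative frequency reaches the target position.
            return lower + (upper - lower) / f * (target - cum)
        cum += f
    raise ValueError("the quantile position lies outside the distribution")

mark_boundaries = [
    (45.5, 50.5), (50.5, 55.5), (55.5, 60.5), (60.5, 65.5),
    (65.5, 70.5), (70.5, 75.5), (75.5, 80.5),
]
mark_freqs = [4, 8, 15, 5, 9, 5, 4]

q1_marks = grouped_quantile(mark_boundaries, mark_freqs, 1, 4)
d4_marks = grouped_quantile(mark_boundaries, mark_freqs, 4, 10)
p7_marks = grouped_quantile(mark_boundaries, mark_freqs, 7, 100)

print(round(q1_marks, 1))  # 55.7
print(round(d4_marks, 1))  # 58.2
print(p7_marks)            # 49.875
```

Calling the function with i = 2 and parts = 4 gives the grouped median, since $Q_2$ uses the same position n/2.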
