Lec 3
Associate Professor
Department of Statistics, DU
Describing data: Measures of Central Tendency
• Goals:
✓ Understanding and finding the arithmetic mean, weighted mean, median, mode, and
geometric mean.
• Observations in a data set tend to concentrate around a central value. The numerical measure
of this tendency is variously known as the measure of central tendency, the measure of
location, or the measure of average.
✓ Arithmetic mean
✓ Median
✓ Mode
Mean
• The most popular and best-understood measure of central tendency for a quantitative data
set is the arithmetic mean (or simply the mean).
$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x}{n}$$
• Example: Let us consider hypothetical data from a sample of 9 subjects, consisting of scores
on the Peabody Picture Vocabulary Test-Revised (PPVT-R). The data are as follows:
✓ The sum of the deviations of each value from the mean is zero. Symbolically, $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$.
✓ If a set consists of $n_1$ observations of the form $x_{11}, x_{12}, \ldots, x_{1n_1}$ with mean $\bar{x}_1$ and a second set consists
of $n_2$ observations of the form $x_{21}, x_{22}, \ldots, x_{2n_2}$ with mean $\bar{x}_2$, then the mean of all the $n_1 + n_2$
observations, called the combined mean or pooled mean, is given by
$$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$
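The two properties of the mean stated above can be checked numerically. The following is a minimal sketch in Python; the two groups of scores are hypothetical values chosen only for illustration (they are not the PPVT-R data), and the function name `arithmetic_mean` is likewise an assumption.

```python
def arithmetic_mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


# Hypothetical scores, chosen only for illustration (not from the lecture data).
group1 = [4, 8, 6]
group2 = [10, 12, 14, 16]

mean1 = arithmetic_mean(group1)  # 6.0
mean2 = arithmetic_mean(group2)  # 13.0

# Property 1: the deviations from the mean sum to zero.
deviation_sum = sum(x - mean1 for x in group1)

# Property 2: the combined (pooled) mean weights each group mean by its size.
n1, n2 = len(group1), len(group2)
combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)

# The combined mean agrees with the mean of all observations taken together.
direct_mean = arithmetic_mean(group1 + group2)

print(deviation_sum, combined_mean, direct_mean)
```

With real-valued data the deviation sum may differ from zero by a tiny rounding error, so in practice it is safer to compare against a small tolerance than to test for exact equality.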
Median
• The median is the middlemost value when the observations are arranged in
ascending (or descending) order of magnitude.
✓ the number of observations below the position corresponding to median should be equal to the
number of observations above the position.
• Median for raw data: Let us consider 𝑛 observations on a variable. First arrange
the observations in ascending (or descending) order of magnitude, and then check
whether 𝑛 is even or odd.
✓ If $n$ is odd: Median = $\left(\frac{n+1}{2}\right)^{th}$ observation.
✓ If $n$ is even: Median = mean of the $\left(\frac{n}{2}\right)^{th}$ observation and the $\left(\frac{n}{2}+1\right)^{th}$ observation.
• Find median for the data: 12, 7, 2, 34, 17, 21 and 19
✓ Arrange the values in ascending order: 2, 7, 12, 17, 19, 21, 34
✓ Median = $\left(\frac{n+1}{2}\right)^{th}$ observation = $4^{th}$ observation = 17
• Find median for the data: 12, 7, 2, 34, 17, 40, 21 and 18
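✓ Arrange the values in ascending order: 2, 7, 12, 17, 18, 21, 34, 40
✓ Median = $\frac{\left(\frac{n}{2}\right)^{th}\text{ value} + \left(\frac{n}{2}+1\right)^{th}\text{ value}}{2} = \frac{4^{th}\text{ value} + 5^{th}\text{ value}}{2} = \frac{17+18}{2} = 17.5$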
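The odd/even rule for raw data can be sketched in a few lines of Python. The function name `median_raw` is an assumption made for this illustration; the two data sets are the ones used in the worked examples above.

```python
def median_raw(values):
    """Median of raw (ungrouped) data using the odd/even rule."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)  # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)-th observation (index mid, counting from 0)
        return ordered[mid]
    # n even: mean of the (n / 2)-th and (n / 2 + 1)-th observations
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_case = median_raw([12, 7, 2, 34, 17, 21, 19])       # 7 observations
even_case = median_raw([12, 7, 2, 34, 17, 40, 21, 18])  # 8 observations
print(odd_case, even_case)  # 17 17.5
```

Python's standard library offers `statistics.median`, which applies the same rule; the explicit version is shown here only to mirror the steps on the slide.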
• Median for grouped data: To calculate the median for a grouped frequency distribution, we assume
that the frequencies are evenly spread over each class interval and that the class intervals are formed
so that there are no gaps between them. For example, the classes 1-5, 6-10, 11-15 should be
replaced by the class boundaries 0.5-5.5, 5.5-10.5, 10.5-15.5. The median is then calculated as follows:
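$$\text{Median} = L + \frac{n/2 - F}{f_m} \times h$$
where $L$ is the lower boundary of the median class, $n$ is the total number of observations, $F$ is the cumulative frequency of the classes before the median class, $f_m$ is the frequency of the median class, and $h$ is the class width.
✓ The median class is the class that contains the $\left(\frac{n}{2}\right)^{th}$ observation of the given data.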
• Calculate the median age (in years) of workers from the following data
Age Frequency
11-20 5
21-30 15
31-40 50
41-50 45
51-60 35
• To compute the median age of workers, we have constructed the following table
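As a cross-check on the hand calculation, the grouped-median formula can be sketched in Python and applied to the age data above. The function name `grouped_median` and its argument layout are assumptions made for this illustration; the class limits 11-20, 21-30, ... are converted to the boundaries 10.5-20.5, 20.5-30.5, ... so that the intervals have no gaps.

```python
def grouped_median(boundaries, frequencies):
    """Median of a grouped frequency distribution.

    boundaries  : list of (lower, upper) class boundaries in ascending order
    frequencies : list of class frequencies, in the same order
    """
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    cumulative = 0  # F: cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(boundaries, frequencies):
        if cumulative + f >= n / 2:
            # This class contains the (n/2)-th observation: it is the median class.
            h = upper - lower
            return lower + (n / 2 - cumulative) / f * h
        cumulative += f
    raise ValueError("median class not found")


age_boundaries = [(10.5, 20.5), (20.5, 30.5), (30.5, 40.5), (40.5, 50.5), (50.5, 60.5)]
age_frequencies = [5, 15, 50, 45, 35]

median_age = grouped_median(age_boundaries, age_frequencies)
print(round(median_age, 2))
```

Here n = 150, so n/2 = 75; the cumulative frequencies are 5, 20, 70, 115, 150, which makes 40.5-50.5 the median class with F = 70 and f_m = 45, giving a median age of about 41.61 years.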
• The scores obtained by 5 students in a statistics test are 10, 7, 7, 7, and 0. The value “7”
has the highest frequency, therefore the mode is “7”
• Find the measure of central tendency from the following frequency distribution showing the
opinion of DU students regarding their curriculum load.
Laboratory service is excellent Frequency
Strongly agree 16
Agree 22
Undecided 33
Disagree 178
Strongly disagree 118
The category “disagree” has the highest frequency, therefore the mode is “disagree”
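Finding the mode amounts to counting frequencies and picking the value (or category) with the highest count. A minimal sketch in Python, using the two examples above; the function names `modes` and `mode_from_frequencies` are assumptions made for this illustration.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency in raw data."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


def mode_from_frequencies(frequency_table):
    """Return the category with the highest frequency in a frequency table."""
    return max(frequency_table, key=frequency_table.get)


# Test scores of the 5 students.
score_modes = modes([10, 7, 7, 7, 0])

# Opinion frequency distribution from the table above.
opinions = {
    "Strongly agree": 16,
    "Agree": 22,
    "Undecided": 33,
    "Disagree": 178,
    "Strongly disagree": 118,
}
opinion_mode = mode_from_frequencies(opinions)

print(score_modes, opinion_mode)  # [7] Disagree
```

`modes` returns a list because a data set can have more than one mode; `mode_from_frequencies` returns only the first category reached if several categories tie for the highest frequency.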
Mode for grouped data
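• For grouped data, the mode is obtained by using the following formula:
$$M_o = L + \frac{f_0 - f_{-1}}{(f_0 - f_{-1}) + (f_0 - f_1)} \times h$$
where $L$ is the lower boundary of the modal class, $f_0$ is the frequency of the modal class, $f_{-1}$ and $f_1$ are the frequencies of the classes immediately before and after the modal class, and $h$ is the class width.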
• For the age data, the class boundaries of the modal class are 30.5-40.5, since the highest frequency (50) belongs to this class.
• Mode = $30.5 + \frac{50-15}{(50-15)+(50-45)} \times 10 = 30.5 + 8.75 = 39.25$
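The grouped-mode calculation can be reproduced with a short Python sketch. The function name `grouped_mode` is an assumption made for this illustration, as is the convention of treating a missing neighbouring class (when the modal class is the first or last one) as having frequency 0.

```python
def grouped_mode(boundaries, frequencies):
    """Mode of a grouped frequency distribution with equal class widths.

    boundaries  : list of (lower, upper) class boundaries in ascending order
    frequencies : list of class frequencies, in the same order
    """
    # Modal class: the class with the highest frequency (first one if tied).
    i = max(range(len(frequencies)), key=lambda k: frequencies[k])
    f0 = frequencies[i]
    # Assumption: a missing neighbouring class is treated as frequency 0.
    f_prev = frequencies[i - 1] if i > 0 else 0
    f_next = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    lower, upper = boundaries[i]
    h = upper - lower
    return lower + (f0 - f_prev) / ((f0 - f_prev) + (f0 - f_next)) * h


age_boundaries = [(10.5, 20.5), (20.5, 30.5), (30.5, 40.5), (40.5, 50.5), (50.5, 60.5)]
age_frequencies = [5, 15, 50, 45, 35]

modal_age = grouped_mode(age_boundaries, age_frequencies)
print(modal_age)  # 39.25
```

The sketch does not handle the degenerate case where the modal class and both neighbours have the same frequency, in which the denominator of the formula is zero.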
Choosing measures of central tendency
• The mean is suitable only for ratio or interval data. For this type of data, the median is
used instead as the measure of central tendency if some unusual (extreme) values arise.
• The mode may be the only measure available when arithmetic operations cannot be
performed on the data, as in the case of qualitative (nominal/ordinal) variables.
✓ When there are very large or very small observations, the median can be used.
✓ When the distribution is unevenly spread, with concentrations that are small or large at
irregular points (see Figure-2), the median can be used.
I: 7, 5, 4, 4
II: 6, 8, 10, 4, 9, 5, 8, 6
III: 3, 7, 9, 5, 6
IV: 9, 10, 8
V: 2, 6, 5, 4
Which group has the least memory performance? Which group has the best memory
performance?
Exercise (cont…)
• Problem#04: Let us consider hypothetical data from a sample of 30 subjects, consisting of
scores on the Peabody Picture Vocabulary Test-Revised (PPVT-R). The following frequency
distribution was constructed from the PPVT-R scores of the sampled data.
Calculate the mean and the median. Which one is the appropriate measure of central tendency for this
data set? Why?
Thank You