Lesson 7 - Measure of Central Tendency
Semi-finals
Chapter 3 – Statistics
INTRODUCTION
TERMS TO REMEMBER
Classification of Variables
Qualitative Variables – words or codes that represent a class or category. (Ex. Gender and Religion)
Quantitative Variables – numbers that represent an amount or a count. (Ex. Height, Weight, and Size)
Discrete Variables – data that can be counted. (Ex. Number of days)
Continuous Variables – data that can assume any value between two specific values. (Ex. Temperature)
Levels of Measurement
Nominal Level – characterized by data that consist of names, labels, or categories only. (Ex. Status)
Ordinal Level – involves data that can be arranged in some order. (Ex. Highest educational attainment)
Interval Level – the same as the ordinal level, with the additional property that meaningful amounts of
difference between data values can be determined. (Ex. IQ)
Ratio Level – the interval level modified to include an inherent zero point. (Ex. Number of siblings)
Finding a measure of central tendency for numerical data is one of the most basic concepts in
statistics. It is often helpful to find numerical values that locate, in some sense, the center of a set of data.
Suppose Pedro is looking for a farm lot for a large business he plans to build. After
looking at the land registry, he found 5 lots of the same size with different prices. After evaluating the lots, he
wants to identify the average price of the land for his ideal farm.
The arithmetic mean is the most commonly used measure of central tendency and is often referred to simply as the
“mean”. To find the mean of a set of data, find the sum of all the data values and divide by the number of data values. In
this case, Pedro would add the 5 lot prices and divide the sum by 5.
In statistics it is often necessary to find the sum of a set of numbers. The traditional symbol used to
indicate a summation is the Greek letter sigma, ∑. Thus the notation ∑x, called summation notation,
denotes the sum of all the numbers in a given set. We can define the mean using summation notation.
Mean
The mean of n numbers is the sum of the numbers divided by n.
Mean = ∑x / n
Statisticians often collect data from small portions of a large group in order to determine
information about the group. In such situations the entire group under consideration is known as the
population, and any subset of the population is called a sample. It is traditional to denote the mean of a
sample by x̄ (which is read as “x bar”) and to denote the mean of a population by the Greek letter μ (lowercase
mu).
Solution:
The 6 friends are a sample of the population of 20 students. Use x̄ to represent the mean.
x̄ = ∑x / n = (92 + 84 + 65 + 76 + 88 + 90) / 6 = 495 / 6 = 82.5
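The same computation can be sketched in Python. This is only an illustration of the formula Mean = ∑x / n; the function name mean and the variable names are chosen here and do not come from the lesson.

```python
# Scores of the 6 friends, taken from the worked solution above.
scores = [92, 84, 65, 76, 88, 90]


def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


sample_mean = mean(scores)
print(sample_mean)  # 82.5
```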
A doctor ordered 4 separate blood tests to measure a patient’s total blood cholesterol levels. The test results
were
The Median
Another type of average is the median. Essentially, the median is the middle number, or the mean of
the two middle numbers, in a list of numbers that has been arranged in numerical order from smallest to
largest or largest to smallest. Any list of numbers arranged in numerical order in this way is called a
ranked list.
Solution:
a. The list 4, 8, 1, 14, 9, 21, 12 contains 7 numbers. The median of a list of data with an odd number of
entries is found by ranking the numbers and finding the middle number.
1, 4, 8, 9, 12, 14, 21
The middle number is 9. Thus 9 is the median of the data.
b. The list 46, 23, 92, 89, 77, 108 contains 6 numbers. The median of a list of data with an even number of
entries is found by ranking the numbers and computing the mean of the two middle numbers.
23, 46, 77, 89, 92, 108
(77 + 89) / 2 = 83
The mean of the two middle numbers is 83. Thus 83 is the median of the data.
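The ranking procedure above can be sketched in Python as follows. The function name median is an illustrative choice; the two lists are the ones from parts a and b.

```python
def median(values):
    """Rank the values, then take the middle one (odd count)
    or the mean of the two middle ones (even count)."""
    ranked = sorted(values)
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2


print(median([4, 8, 1, 14, 9, 21, 12]))    # 9
print(median([46, 23, 92, 89, 77, 108]))   # 83.0
```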
The mode of a list of numbers is the number that occurs most frequently.
Some lists of numbers do not have a mode. For instance, in the list 1, 6, 8, 10, 32, 15, 49, each
number occurs exactly once. Because no number appears more often than the other numbers, there is no
mode.
A list of numerical data can have more than one mode. For instance, in the list 4, 2, 6, 2, 7, 9, 2, 4, 9, 8, 9, 7, the
number 2 occurs three times and the number 9 occurs three times. Each of the other numbers occurs fewer
than three times. Thus 2 and 9 are both modes for the data.
Solution:
a. In the list 18, 15, 21, 16, 15, 14, 15, 21, the number 15 occurs more often than the other numbers.
Thus 15 is the mode.
b. Each number in the list 2, 5, 8, 9, 11, 4, 7, 23 occurs only once. Thus there is no mode in the data.
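A minimal Python sketch of the mode, following the convention used in this lesson: a list has no mode when no number occurs more often than the others, and it can have several modes. The function name modes and the choice to return a list (empty when there is no mode) are assumptions made for this illustration.

```python
from collections import Counter


def modes(values):
    """Return the list of modes, or an empty list when there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # If there are several distinct values and all occur equally often,
    # no value occurs more often than the others, so there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([18, 15, 21, 16, 15, 14, 15, 21]))          # [15]
print(modes([2, 5, 8, 9, 11, 4, 7, 23]))                # []
print(modes([4, 2, 6, 2, 7, 9, 2, 4, 9, 8, 9, 7]))      # [2, 9]
```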
ACTIVITY 3
The mean, the median, and the mode are all averages; however, they are generally not equal. The mean of a set of
data is the most sensitive of the averages. A change in any of the numbers changes the mean, and the mean
can be changed drastically by changing an extreme value.
In contrast, the median and the mode of a set of data are usually not changed by changing an extreme
value.
When a data set has one or more extreme values that are very different from the majority of data
values, the mean will not necessarily be a good indicator of an average value. In the following example, we
compare the mean, median, and mode for the salaries of 5 employees of a small company.
Mean = 506,000 / 5 = 101,200
The median is the middle number, $36,000. Because the $20,000 salary occurs most often, the mode is
$20,000. The data contain one extreme value that is much larger than the other values. This extreme value
makes the mean, $101,200, much larger than the median and the mode.
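The sensitivity of the mean to an extreme value can be checked with a short Python sketch. The first list is the ranked list from the median example; the second list is hypothetical, made up here by replacing the largest value with a much larger one. It is not the salary data of the example.

```python
from statistics import mean, median

original = [1, 4, 8, 9, 12, 14, 21]
# Hypothetical change: the largest value, 21, is replaced by the extreme value 210.
changed = [1, 4, 8, 9, 12, 14, 210]

print(mean(original), median(original))  # mean is about 9.86, median is 9
print(mean(changed), median(changed))    # mean is about 36.86, median is still 9
```

The median stays at 9 because the middle position of the ranked list is unaffected, while the mean moves by (210 − 21) / 7 = 27.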
A value called the weighted mean is often used when some data values are more important than others.
The weighted mean of the n numbers x₁, x₂, x₃, …, xₙ with the respective assigned weights
w₁, w₂, w₃, …, wₙ is
Weighted mean = ∑(x · w) / ∑w
where ∑(x · w) is the sum of the products formed by multiplying each number by its assigned weight, and
∑w is the sum of all the weights.
A student’s grade point average (GPA) is calculated as a weighted mean, where the student’s grade in each course is
given a weight equal to the number of units (or credits) that course is worth. Use the following 4-point grading system
for Example 4 and Check Your Progress 4.
A = 4, B = 3, C = 2, D = 1, F = 0
EXAMPLE 4
The table shows Dillon’s fall semester course grades. Use the weighted mean formula to find Dillon’s GPA for the
fall semester.
Solution:
The B is worth 3 points, with a weight of 4; the A is worth 4 points, with a weight of 3; the D is worth 1 point,
with a weight of 3; and the C is worth 2 points, with a weight of 4. The sum of all the weights is 4 + 3 + 3 + 4,
or 14.
Weighted mean = [(3 × 4) + (4 × 3) + (1 × 3) + (2 × 4)] / 14
Weighted mean = 35 / 14 = 2.5
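The weighted mean formula can be sketched in Python as follows. The function name weighted_mean and the variable names are illustrative; the grade points and units are the ones stated in the solution above.

```python
def weighted_mean(values, weights):
    """Return sum(x * w) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the sum of the weights must not be zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


grade_points = [3, 4, 1, 2]  # B, A, D, C on the 4-point system
units = [4, 3, 3, 4]         # units (weights) of the corresponding courses

gpa = weighted_mean(grade_points, units)
print(gpa)  # 2.5
```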
The table below shows Janet’s spring semester course grades. Use the weighted mean formula to find Janet’s
GPA for the spring semester.
Raw data – data that have not been organized or manipulated in any manner.
Frequency distribution – a table that lists observed events and the frequency of occurrence of each observed
event. It is often used to organize raw data.
For instance, consider the following table, which lists the number of laptop computers owned by families in
each of 40 homes in a subdivision.
Table-1
Number of Laptop Computers per household
2 0 3 1 2 1 0 4
2 1 1 7 2 0 1 1
0 2 2 1 3 2 2 1
1 4 2 5 2 3 1 2
2 1 2 1 5 0 2 5
The frequency distribution in Table-2 was constructed using the data from Table-1. The first column of the
frequency distribution consists of the numbers 0, 1, 2, 3, 4, 5, 6, and 7. The corresponding frequency of occurrence, f,
of each of the numbers in the first column is listed in the second column.
Table-2
Frequency Distribution for Table-1
Number of laptop computers, x    Frequency, f
0    5
1    12
2    14
3    3
4    2
5    3
6    0
7    1
Total    40
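The tallying that turns the raw data of Table-1 into a frequency distribution can be sketched in Python with collections.Counter. The variable names are illustrative; the 40 values are copied row by row from Table-1.

```python
from collections import Counter

# Raw data from Table-1: number of laptop computers in each of the 40 homes.
laptops = [
    2, 0, 3, 1, 2, 1, 0, 4,
    2, 1, 1, 7, 2, 0, 1, 1,
    0, 2, 2, 1, 3, 2, 2, 1,
    1, 4, 2, 5, 2, 3, 1, 2,
    2, 1, 2, 1, 5, 0, 2, 5,
]

counts = Counter(laptops)
# List every value from 0 through 7, including values that never occur (frequency 0).
frequency = {x: counts.get(x, 0) for x in range(0, 8)}

for x, f in frequency.items():
    print(x, f)
```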
The formula for a weighted mean can be used to find the mean of the data in a frequency distribution. The
only change is that the weights w₁, w₂, w₃, …, wₙ are replaced with the frequencies f₁, f₂, f₃, …, fₙ. This
procedure is illustrated in the next example.
Solution:
The numbers in the right-hand column of Table-2 are the frequencies f for the numbers in the first column.
The sum of all the frequencies is 40.
Mean = ∑(x · f) / ∑f
= [(0 · 5) + (1 · 12) + (2 · 14) + (3 · 3) + (4 · 2) + (5 · 3) + (6 · 0) + (7 · 1)] / 40
= 79 / 40 = 1.975
The mean number of laptop computers per household for the homes in the subdivision is 1.975.
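As a quick check of this arithmetic, the mean of a frequency distribution can be computed in Python. The dictionary below restates the frequencies used in the solution; its name and layout are choices made for this sketch.

```python
# Frequencies from the solution: value x -> frequency f.
frequency = {0: 5, 1: 12, 2: 14, 3: 3, 4: 2, 5: 3, 6: 0, 7: 1}

sum_xf = sum(x * f for x, f in frequency.items())  # sum of x * f
sum_f = sum(frequency.values())                    # sum of f
mean = sum_xf / sum_f

print(sum_xf, sum_f, mean)  # 79 40 1.975
```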
ACTIVITY 5
A housing division consists of 45 homes. The following frequency distribution shows the number of homes in
the subdivision that are two-bedroom homes, the number that are three-bedroom homes, the number that
are four-bedroom homes, and the number that are five-bedroom homes. Find the mean number of
bedrooms for the 45 homes.
Range – the range of a set of data values is the difference between the greatest data value and the least
data value
Solution:
Machine 1    Machine 2
9.52         8.01
6.41         7.99
10.07        7.95
5.85         8.03
8.15         8.02
x̄ = 8.0      x̄ = 8.0
The greatest number of ounces Machine 1 dispensed is 10.07 and the least is 5.85.
Range = 10.07 − 5.85 = 4.22 oz
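The range can be sketched in Python for both machines. The Machine 2 result is computed here from the table values and is not quoted from the solution text; the function name data_range is an illustrative choice. Results are rounded to two decimal places because subtracting decimal fractions in floating point can leave tiny rounding errors.

```python
machine_1 = [9.52, 6.41, 10.07, 5.85, 8.15]
machine_2 = [8.01, 7.99, 7.95, 8.03, 8.02]


def data_range(values):
    """Return the difference between the greatest and the least value."""
    return max(values) - min(values)


print(round(data_range(machine_1), 2))  # 4.22
print(round(data_range(machine_2), 2))  # 0.08
```

Both machines have the same mean, 8.0 oz, yet the ranges differ greatly, which is the kind of spread the mean alone does not show.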
ACTIVITY 6
The range of a set is easy to compute, but it can be deceiving. The range is a measure that depends only on
the two most extreme values, and as such it is very sensitive. A measure of dispersion that is less sensitive to
extreme values is the standard deviation.
If x₁, x₂, x₃, …, xₙ is a sample of n numbers with mean x̄, then the standard deviation of the sample is
s = √( ∑(x − x̄)² / (n − 1) )    (2)
TAKE NOTE: You may question why a denominator of 𝑛 − 1 is used instead of n when we compute a sample
standard deviation. The reason is that a sample standard deviation is often used to estimate the population
standard deviation, and it can be shown mathematically that the use of 𝑛 − 1 tends to yield better
estimates.
Solution:
Step 1: The mean of the numbers is
x̄ = (2 + 4 + 7 + 12 + 15) / 5 = 40 / 5 = 8
Step 2: For each number, calculate the deviation between the number and the mean.
x    x − x̄
2 2 − 8 = −6
4 4 − 8 = −4
7 7 − 8 = −1
12 12 − 8 = 4
15 15 − 8 = 7
Step 3: Calculate the square of each deviation in Step 2, and find the sum of these squared deviations.
x    x − x̄    (x − x̄)²
2    2 − 8 = −6    (−6)² = 36
4    4 − 8 = −4    (−4)² = 16
7    7 − 8 = −1    (−1)² = 1
12    12 − 8 = 4    (4)² = 16
15    15 − 8 = 7    (7)² = 49
Sum of the squared deviations: 118
Step 4: Because we have a sample of n = 5 values, divide the sum 118 by n − 1, which is 4.
118 / 4 = 29.5
Step 5: The standard deviation of the sample is s = √29.5. To the nearest hundredth, the standard
deviation is s = 5.43.
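Steps 1 through 5 can be sketched in Python as follows. The function name sample_standard_deviation is an illustrative choice; the data are the five numbers used in the solution.

```python
import math


def sample_standard_deviation(values):
    """Sample standard deviation with an n - 1 denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / n                                   # Step 1
    deviations = [x - mean for x in values]                  # Step 2
    sum_of_squares = sum(d ** 2 for d in deviations)         # Step 3
    return math.sqrt(sum_of_squares / (n - 1))               # Steps 4 and 5


s = sample_standard_deviation([2, 4, 7, 12, 15])
print(round(s, 2))  # 5.43
```

Python's standard library function statistics.stdev uses the same n − 1 denominator, so it can serve as a cross-check.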
ACTIVITY 7
A student has the following quiz scores: 5, 8, 16, 17, 18, 20. Find the standard deviation for this population of quiz
scores.
In the next example we use standard deviations to determine which company produces batteries
that are most consistent with regard to their life expectancy.
EXAMPLE 8 - Use Standard Deviations
A consumer group has tested a sample of 8 size-D batteries from each of 3 companies. The results of the tests
are shown in the following table. According to these tests, which company produces batteries for which the values
representing hours of constant use have the smallest standard deviation?
Solution:
The mean for each sample of batteries is 7 h.
The batteries from EverSoBright have a standard deviation of
The batteries from Dependable have the smallest standard deviation. According to these results, the
Dependable company produces the most consistent batteries with regard to life expectancy under constant use.
ACTIVITY 8
A consumer testing agency has tested the strengths of 3 brands of 1/8-inch rope. The results of the tests are
shown in the following table. According to the sample test results, which company produces 1/8-inch rope for
which the breaking point has the smallest standard deviation?
Company        Breaking point of 1/8-inch rope in pounds
Trustworthy    122, 141, 151, 114, 108, 149, 125
Brand X        128, 127, 148, 164, 97, 109, 137
NeverSnap      112, 121, 138, 131, 134, 139, 135
The Variance
A statistic known as the variance is also used as a measure of dispersion. The variance for a given set of data
is the square of the standard deviation of the data.
Notation for Standard Deviation and Variance
Solution:
In Example 7, we found s = √29.5. The variance is the square of the standard deviation. Thus the variance is
s² = (√29.5)² = 29.5.
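The relationship between the two measures can be checked in Python with the standard library statistics module, whose variance and stdev functions both use the sample (n − 1) form. The list below is the sample 2, 4, 7, 12, 15 from the standard deviation solution, assumed here to be the data of Example 7.

```python
import math
from statistics import stdev, variance

data = [2, 4, 7, 12, 15]

sample_variance = variance(data)
print(sample_variance)  # 29.5

# The variance is the square of the standard deviation (up to floating-point rounding).
print(math.isclose(stdev(data) ** 2, sample_variance))  # True
```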
ACTIVITY 9
References
Book – Mathematics in the Modern World by R. Aufmann, J. Lockwood, R. Nation, D. Clegg, and S. S. Epp.