Module 4 - Data Management
Bislig Campus
Maharlika, Bislig City
Mathematics in the Modern World
MODULE 4
Adam C. Macapili
Instructor
Mathematics in the Modern World, Surigao del Sur State University
Module 4
Module Overview
Data Management
In this Module
Objectives:
• Identify the essential parts of a table and describe the
different kinds of graphs for data presentation; and
• Analyze and interpret the data presented in a graph/table.
Introduction
When conducting a statistical analysis, investigation, or report, the
analyst must collect data for the specific variable under investigation. To
explain circumstances, draw conclusions, and make inferences about events,
the researcher must arrange the collected data in some meaningful way. The
simplest and most commonly used way to arrange data is to create a frequency
distribution. A frequency distribution is a grouping of the data into categories
showing the number of observations in each of the non-overlapping classes.
After organizing data, the next move of the researcher is to present the
data so they can be understood easily by those who will benefit from reading
the study. The most useful way to present data is to construct graphs
and charts. There are a number of ways to plot graphs and charts, and each one
has a specific purpose.
ABSTRACTION
A. Organization of Data
Before constructing a frequency distribution, we must define some
terms that are essential to understanding the nature of the data
displayed in a frequency distribution.
A grouped frequency distribution is used when the range of the data set
is large; the data must be grouped into classes whether it is categorical data or
interval data. For interval data the class is more than one unit in width. The
procedure for constructing the frequency distribution is discussed in the
succeeding sections.
Example 1:
Twenty applicants were given a performance evaluation appraisal. The
data set is
High High High Low Average
Average Low Average Average Average
Low Average Average High High
Low Low Average High High
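The resulting table for Example 1 is not shown at this point in the text. As a minimal sketch, the twenty ratings can be tallied into a categorical frequency distribution with Python's `collections.Counter`; the variable names here are illustrative only.

```python
from collections import Counter

# Performance ratings of the twenty applicants in Example 1
ratings = (
    "High High High Low Average "
    "Average Low Average Average Average "
    "Low Average Average High High "
    "Low Low Average High High"
).split()

# Count how many times each category occurs
frequency = Counter(ratings)

for category in ("High", "Average", "Low"):
    print(f"{category:<8}{frequency[category]:>3}")
print(f"{'Total':<8}{sum(frequency.values()):>3}")
```

The tally gives 7 High, 8 Average, and 5 Low ratings, which add up to the 20 applicants.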
• When using a frequency distribution, we may not know the exact
smallest and largest values unless we refer back to the
original ungrouped data.
• A simple frequency distribution essentially has two columns: one for the
classes, which are also referred to as class intervals, and another for the
class frequencies, which indicate the number of observations falling within
the different class intervals.
The following table shows the class limits, class boundaries, and class
marks of the frequency distribution of table 1.
1. Calculate the range of the data by subtracting the lowest value from the
highest value.
2. Decide on the number of class intervals. The use of 5 to 20 class
intervals is often justified depending on the nature of the data.
3. Divide the range by the desired number of class intervals. The result may
now be employed as the interval size 𝑖.
4. Choose an appropriate lower limit for the first class interval. This number
should be less than, or equal to, the lowest value in the data. The general
practice, however, is to use, if possible, a lower limit that is divisible by
the interval size. The upper limit of the class interval is obtained by
adding 𝑖 – 1 to the lower limit.
5. Determine the rest of the class interval.
6. Count the number of observations or measurements falling within each
class interval and enter the results in the frequency column. This is
facilitated by providing a tally column to the right of the class intervals.
Example 2:
A sample of 40 companies belonging to a certain industry reported the
following numbers of employees.
43 58 21 24 31 49 40 51 55 28
50 33 62 30 25 39 59 29 36 42
38 46 42 61 50 41 37 35 40 52
47 35 57 55 36 45 32 45 42 36
Solution:
Step 1: Calculate the range of the data by subtracting the lowest value
from the highest value.
𝑅𝑎𝑛𝑔𝑒 = 𝐻𝑉 − 𝐿𝑉 = 62 − 21 = 41
Step 2: Decide on the number of class intervals. The use of 5 to 20 class
intervals is often justified depending on the nature of the data. In this
example, we use 5 class intervals.
Step 3: Divide the range by the desired number of class intervals. The
result may now be employed as the interval size 𝑖.
\[ i = \frac{\text{Range}}{5} = \frac{41}{5} = 8.2 \approx 9 \]
Step 4: Choose an appropriate lower limit for the first class interval. This
number should be less than, or equal to, the lowest value in the data.
The general practice, however, is to use, if possible, a lower limit that is
divisible by the interval size. The upper limit of the class interval is
obtained by adding 𝑖 – 1 to the lower limit. Here the lower limit is 18, the
largest multiple of 9 that does not exceed 21, so the upper limit is
18 + 9 – 1 = 26.
Class Interval (𝑥)
18 − 26

Class Interval (𝑥)
18 − 26
27 − 35
36 − 44
45 − 53
54 − 62
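The construction steps can be sketched in Python using the Example 2 data. This is only an illustration under the choices made in this example (5 classes, interval size rounded up, lower limit taken as a multiple of the interval size); the variable names are assumptions and not part of the module.

```python
import math

# Numbers of employees reported by the 40 companies in Example 2
data = [
    43, 58, 21, 24, 31, 49, 40, 51, 55, 28,
    50, 33, 62, 30, 25, 39, 59, 29, 36, 42,
    38, 46, 42, 61, 50, 41, 37, 35, 40, 52,
    47, 35, 57, 55, 36, 45, 32, 45, 42, 36,
]

num_classes = 5                                  # Step 2
data_range = max(data) - min(data)               # Step 1: 62 - 21 = 41
width = math.ceil(data_range / num_classes)      # Step 3: 8.2 rounded up to 9
lower = (min(data) // width) * width             # Step 4: multiple of i not above 21

# Steps 5 and 6: build each class interval and count the values inside it
table = []
for k in range(num_classes):
    lo = lower + k * width
    hi = lo + width - 1
    count = sum(1 for value in data if lo <= value <= hi)
    table.append((lo, hi, count))

for lo, hi, count in table:
    print(f"{lo} - {hi}: {count}")
```

The counts produced are 3, 8, 13, 9, and 7, which total the 40 companies in the sample.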
Histogram
A histogram is a graph in which the classes are marked on the horizontal
axis (𝑥-axis) and the class frequencies on the vertical axis (𝑦-axis). The
heights of the bars represent the class frequencies, and the bars are drawn
adjacent to each other. However, the histogram focuses on the frequency
of each class and sacrifices whatever information is contained in the
individual observations.
Frequency Polygon
A frequency polygon is a graph that displays the data using points which
are connected by lines. The frequencies are represented by the heights of the
points at the midpoints of the classes. The vertical axis represents the
frequency of the distribution while the horizontal axis represents the midpoints
of the frequency distribution.
Example 3:
Shown below is the frequency distribution in Example 2.
[Figure: Histogram. x-axis: Class Marks or Midpoints (22, 31, 40, 49, 58); y-axis: Frequency]
[Figure: Frequency Polygon. x-axis: Class Marks or Midpoints (22, 31, 40, 49, 58); y-axis: Frequency]
[Figure: Cumulative Frequency. x-axis: Upper Class Boundaries (26.5, 35.5, 44.5, 53.5, 62.5); y-axis: cumulative frequency, 0 to 40]
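The axis values used in these graphs can be reproduced from the class intervals. The sketch below assumes the class limits of Example 2; the frequencies are a tally of the Example 2 data against those limits rather than values copied from a table in the text.

```python
from itertools import accumulate

# Class limits from Example 2; frequencies tallied from the Example 2 data
classes = [(18, 26), (27, 35), (36, 44), (45, 53), (54, 62)]
frequencies = [3, 8, 13, 9, 7]

# Class mark (midpoint): average of the lower and upper class limits
class_marks = [(lo + hi) / 2 for lo, hi in classes]

# Upper class boundary: upper limit plus half a unit (the data are whole numbers)
upper_boundaries = [hi + 0.5 for _, hi in classes]

# "Less than" cumulative frequency: running total of the class frequencies
cumulative = list(accumulate(frequencies))

print(class_marks)        # [22.0, 31.0, 40.0, 49.0, 58.0]
print(upper_boundaries)   # [26.5, 35.5, 44.5, 53.5, 62.5]
print(cumulative)         # [3, 11, 24, 33, 40]
```

The class marks are plotted on the histogram and frequency polygon, while the upper class boundaries and running totals are plotted on the cumulative frequency graph.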
2. Bar Chart (Bar Graph). A bar chart is similar to a histogram. The
bases of the rectangles are arbitrary intervals whose centers are the
codes. The height of each rectangle represents the frequency of that
category. It is also applicable to categorical (nominal-level) data.
3. Pie Chart (Circle Graph). A pie chart is a circle divided into portions that
represent the relative frequencies (or percentages) of the data belonging
to different categories. The data in a pie chart should be categorical or
nominal-level.
4. Time Series Graph. A time series graph represents data that occur over
a specific period of time under observation. In addition, it shows a trend or
pattern of increase or decrease over the period of time.
Solution:
a. Constructing a Pareto Chart
Step 1: Arrange the data from highest to lowest according to frequency.
Products Sales
Candy 150
Chocolate 130
Ice Cream 105
Junk Foods 75
Others 40
[Figure: Pareto chart "Favorite Snacks". x-axis: Products (Candy, Chocolate, Ice Cream, Junk Foods, Others); y-axis: Sales]
It can easily be seen in the Pareto chart that candy is the most preferred
snack, followed by chocolate, while other kinds of snacks are the least
preferred by the students from the given population.
b. Constructing a Bar Chart
Step 1: Draw and label x-axis (Products) and y-axis (Sales).
Step 2: Draw bars of equal width, with heights corresponding to the
frequencies.
[Figure: Bar chart "Favorite Snacks". x-axis: Products (Junk Foods, Candy, Ice Cream, Chocolate, Others); y-axis: Sales]
The same observation can be made from the bar chart: candy is
the most preferred snack, followed by chocolate, while other kinds of snacks
are the least preferred by the students from the given population.
Step 3: Using a protractor, graph each section and write its name and
appropriate percentage
[Figure: Pie chart "Favorite Snacks". Candy, 30%; Chocolate, 26%; Ice Cream, 21%; Junk Foods, 15%; Others, 8%]
Since candy has the biggest slice in the pie chart, it is the most
preferred snack, followed by chocolate, while other kinds of snacks are the
least preferred by the students from the given population.
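The ordering used for the Pareto chart and the percentages used for the pie chart can be checked with a short sketch. It uses the favorite-snacks sales figures from the table above; the central angle of each slice is computed with the usual share-of-360-degrees rule, which is an assumption about the omitted steps, and the variable names are illustrative.

```python
# Favorite-snacks sales from the table in the Pareto chart solution
sales = {
    "Junk Foods": 75,
    "Candy": 150,
    "Ice Cream": 105,
    "Chocolate": 130,
    "Others": 40,
}

total = sum(sales.values())

# Pareto order: categories sorted from highest to lowest frequency
pareto = sorted(sales.items(), key=lambda item: item[1], reverse=True)

for product, amount in pareto:
    percent = amount / total * 100   # share of the whole, for the pie label
    angle = amount / total * 360     # central angle to mark with a protractor
    print(f"{product:<11}{amount:>4}{percent:>6.0f}%{angle:>7.1f} degrees")
```

With a total of 500, the shares come out to 30%, 26%, 21%, 15%, and 8%, matching the labels on the pie chart.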
Example 5:
Using the information in the table below about the US dollar and
Philippine peso exchange rate from January to December of 2017, construct a
time series graph.
Solution:
Step 1: Draw and label the x-axis and y-axis.
Step 2: Label the x-axis for months and y-axis for Peso per US Dollar.
Step 3: Plot each point according to the table.
Step 4: Draw the segments connecting adjacent points.
[Figure: Time series graph. x-axis: Months (Jan to Dec); y-axis: Peso per US Dollar (38 to 45)]
It can be seen that the exchange rate of the US dollar to the Philippine
peso was highest in April and lowest in the months of January,
February, and August.
Example 6:
The information in the table below shows the number of male students of
a certain college in Bislig City from 2016 to 2020. Construct a pictograph.
Year Male
2016 300
2017 375
2018 525
2019 600
2020 300
Solution:
Step 1: Draw and label the x-axis and y-axis.
Step 2: Label the x-axis for Students and y-axis for Years.
[Figure: Pictograph. Vertical axis: Years (2016 to 2020)]
The pictograph shows that there were more male students in 2018 and
2019 and fewer male students in 2016 and 2020 at a certain college in Bislig
City.
Example 7:
The owner of a chain of halo-halo stores would like to study the effect of
atmospheric temperature on sales during the summer season. A random
sample of 10 days is selected with the results given as follows:
Day 1 2 3 4 5 6 7 8 9 10
Temperature (℉) 79 76 78 84 90 83 93 94 97 85
Total Sales 147 143 147 168 206 155 192 211 209 187
[Figure: Plot of total sales against temperature. x-axis: Temperature (℉)]
APPLICATION
43 58 21 24 31 49 40 51 55 28
50 33 62 30 25 39 59 29 36 42
38 46 42 61 50 41 37 35 40 52
47 35 57 55 36 45 32 45 42 36
Sketch the pareto chart, bar chart, and pie chart and interpret the data.
Student 1 2 3 4 5 6 7 8 9 10 11 12 13 14
General Math 90 80 75 78 79 84 86 93 95 76 84 81 84 87
Stat & Prob 88 84 76 77 76 83 88 95 85 78 89 84 87 89
4. The number of postpaid cellular phone subscribers for each of the last
12 years is listed below. Use the time series graph to represent these
figures. Interpret the result.
Year 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015
No. of Subscribers 3.12 4.10 4.23 3.96 3.87 3.50 4.67 4.99 4.86 4.96 5.01 5.18
Well done! You have just finished Lesson 1 in this module. Now if you
are ready, please proceed to Lesson 2 of this module, which discusses
Measures of Central Tendency.
Objectives:
• Analyze the data using mean, median, and mode; and
• Solve mathematical problems involving measures of
central tendency.
Introduction
One of the most basic statistical concepts involves finding measures of
central tendency of a set of numerical data. But what are the measures of
central tendency? You will find the answer to this question in this lesson.
ABSTRACTION
A. Mean
The arithmetic mean, often simply called the mean, is the most frequently
used measure of central tendency. The mean is the only common measure in
which all values play an equal role; that is, determining its value requires
every value in the data set. The mean is appropriate
for determining the central tendency of interval or ratio data.
The symbol 𝑥̅ , called "𝑥 𝑏𝑎𝑟", is used to represent the mean of a sample
and the symbol 𝜇, called "𝑚𝑢", is used to denote the mean of a population.
Properties of Mean
1. A set of data has only one mean.
2. Mean can be applied for interval and ratio data.
3. All values in the data set are included in computing the mean.
4. The mean is very useful in comparing two or more data sets.
5. The mean is affected by extremely small or large values in a data set.
6. The mean is most appropriate for symmetrical data.
\[ \text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}} \]
Sample mean: \( \bar{x} = \frac{\sum x}{n} \)        Population mean: \( \mu = \frac{\sum x}{N} \)
where:
𝑥̅ = sample mean
𝜇 = population mean
𝑥 = the value of any particular observation or
measurement
∑ 𝑥 = sum of all 𝑥′𝑠
𝑛 = total number of values in the sample
𝑁 = total number of values in the population
Example 1:
The daily salaries of a sample of eight employees are 550, 420, 560,
500, 700, 670, 860, 480. Find the mean daily rate of employees.
Solution:
\[ \bar{x} = \frac{\sum x}{n} = \frac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8}{n} \]
\[ \bar{x} = \frac{550 + 420 + 560 + 500 + 700 + 670 + 860 + 480}{8} = \frac{4740}{8} = 592.50 \]
Example 2:
Find the population mean of the ages of 9 middle-management
employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58,
and 55.
Solution:
\[ \mu = \frac{\sum x}{N} = \frac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 + x_9}{N} \]
\[ \mu = \frac{53 + 45 + 59 + 48 + 54 + 46 + 51 + 58 + 55}{9} = 52.11 \]
Example 3:
Six friends in a biology class of 20 students received test grades of 92,
84, 65, 76, 88, and 90. Find the mean of these test grades.
Solution:
The 6 friends are a sample of the population of 20 students.
So, use 𝑥̅ ("𝑥 𝑏𝑎𝑟") to represent the mean:
\[ \bar{x} = \frac{\sum x}{n} = \frac{92 + 84 + 65 + 76 + 88 + 90}{6} = 82.5 \]
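The three mean computations above can be checked with Python's standard `statistics` module. This is only a verification sketch; the list names are illustrative. Note that the same arithmetic is used for a sample mean and a population mean, and only the symbol differs.

```python
from statistics import mean

salaries = [550, 420, 560, 500, 700, 670, 860, 480]   # Example 1 (sample)
ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]           # Example 2 (population)
grades = [92, 84, 65, 76, 88, 90]                     # Example 3 (sample)

print(mean(salaries))           # 592.5
print(round(mean(ages), 2))     # 52.11
print(mean(grades))             # 82.5
```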
B. Median
The median is the midpoint of the data array. When the data set is
ordered, whether ascending or descending, it is called a data array. Median is
an appropriate measure of central tendency for data that are ordinal or above,
but it is more valuable in an ordinal type of data.
Properties of Median
1. The median is unique; there is only one median for a set of data.
2. The median is found by arranging the set of data from lowest to highest
(or highest to lowest) and getting the value of the middle observation.
3. The median is not affected by extremely small or large values.
4. The median can be applied to ordinal, interval, and ratio data.
5. The median is most appropriate for skewed data.
To determine the median of ungrouped data, we need to consider two rules:
1. If 𝑛 is odd, the median is the middle-ranked value.
2. If 𝑛 is even, the median is the average of the two middle-ranked
values.
\[ \text{Median (rank value)} = \frac{n + 1}{2} \]
Note that 𝑛 is the population or sample size.
Example 4:
Find the median of the ages of 9 middle-management employees of a
certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58, and 55.
Solution:
Step 1: Arrange the data in order.
45, 46, 48, 51, 53, 54, 55, 58, 59
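Step 2: Select the middle rank value.
\[ \text{Median (rank value)} = \frac{n + 1}{2} = \frac{9 + 1}{2} = \frac{10}{2} = 5 \]
The median is the 5th value in the ordered data, which is 53.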
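The two rules for the median can be sketched as a small function. The function name is an assumption for illustration; the second call reuses the eight daily salaries from Example 1 of this lesson to show the even case.

```python
def median(values):
    """Return the median of ungrouped data using the odd/even rules."""
    ordered = sorted(values)              # Step 1: arrange the data in order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]            # odd n: the middle-ranked value
    # even n: average of the two middle-ranked values
    return (ordered[middle - 1] + ordered[middle]) / 2


ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]             # Example 4, n = 9
print(median(ages))                                      # 53

salaries = [550, 420, 560, 500, 700, 670, 860, 480]     # n = 8
print(median(salaries))                                  # 555.0
```

For the nine ages the rank is (9 + 1) / 2 = 5, so the function returns the 5th ordered value. For the eight salaries the rank is 4.5, so it averages the 4th and 5th ordered values, 550 and 560.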
C. Mode
The mode is the value in a data set that appears most frequently. Like
the median and unlike the mean, the mode is not affected by extreme values
in a data set. A data set may have no mode if none of the values is "most typical".
A data set in which only one value occurs with the greatest frequency is said to
be unimodal. If two values share the same greatest frequency, both
values are considered modes and the data set is bimodal. If a data set has
more than two modes, it is said to be multimodal. In some
cases all values in the data set occur with the same frequency. When this
occurs, the data set is said to have no mode.
Properties of Mode
1. The mode is found by locating the most frequently occurring value.
2. The mode is the easiest average to compute.
3. There can be more than one mode or even no mode in any given data
set.
4. The mode is not affected by extremely small or large values.
5. Mode can be applied for nominal, ordinal, interval and ratio data.
Example 7:
Find the mode of the data in the following lists.
(a) 18, 15, 21, 16, 15, 14, 15, 21
(b) 2, 5, 2, 3, 3, 2, 4, 5, 3
Solution:
a. In the list 18, 15, 21, 16, 15, 14, 15, 21, the number 15 occurs more
often than the other numbers. Thus, 15 is the mode.
b. In the list 2, 5, 2, 3, 3, 2, 4, 5, 3, the numbers 2 and 3 have the same
frequency. Thus, 2 and 3 are the modes.
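Both lists can be checked with `statistics.multimode` from the Python standard library (available in Python 3.8 and later). The helper `modes` below is a hypothetical wrapper, not a library function: it returns an empty list when every distinct value occurs equally often, which is the "no mode" case described in this lesson.

```python
from statistics import multimode


def modes(values):
    """Return the list of modes, or [] when all values are equally frequent."""
    found = multimode(values)
    # Every distinct value tied for the top count: treated as "no mode"
    if len(found) > 1 and len(found) == len(set(values)):
        return []
    return found


list_a = [18, 15, 21, 16, 15, 14, 15, 21]
list_b = [2, 5, 2, 3, 3, 2, 4, 5, 3]

print(modes(list_a))      # [15]
print(modes(list_b))      # [2, 3]
print(modes([1, 2, 3]))   # []
```

In list (b), the values 2 and 3 each occur three times, more often than 4 or 5, so the data set is bimodal.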
Weighted Mean
Example 8:
At a certain company there are 18 construction workers, 12 painters, 7
supervisors, and 3 engineers. Their monthly salaries are 30,500, 33,700,
38,600, and 45,000, respectively. What is the weighted mean salary?
Solution:
Let 𝑤1 = 18, 𝑤2 = 12, 𝑤3 = 7, 𝑤4 = 3
𝑥1 = 30,500, 𝑥2 = 33,700, 𝑥3 = 38,600, 𝑥4 = 45,000
\[ \bar{x}_w = \frac{30{,}500(18) + 33{,}700(12) + 38{,}600(7) + 45{,}000(3)}{18 + 12 + 7 + 3} = 33{,}965 \]
Example 9:
A certain subdivision consists of 50 homes. The table shows the
frequency distribution of homes with respect to the number of bedrooms it has.
Find the mean number of bedrooms for the 50 homes.
No. of Bedrooms 2 3 4 5 6
No. of Homes 13 21 10 4 2
Solution:
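Let 𝑥1 = 2, 𝑥2 = 3, 𝑥3 = 4, 𝑥4 = 5, 𝑥5 = 6 (numbers of bedrooms)
𝑤1 = 13, 𝑤2 = 21, 𝑤3 = 10, 𝑤4 = 4, 𝑤5 = 2 (numbers of homes, used as the weights)
\[ \bar{x}_w = \frac{\sum (x \cdot w)}{\sum w} = \frac{x_1 w_1 + x_2 w_2 + x_3 w_3 + x_4 w_4 + x_5 w_5}{w_1 + w_2 + w_3 + w_4 + w_5} \]
\[ \bar{x}_w = \frac{2(13) + 3(21) + 4(10) + 5(4) + 6(2)}{13 + 21 + 10 + 4 + 2} = \frac{161}{50} = 3.22 \]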
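Both weighted mean examples follow the same pattern: multiply each value by its weight, add the products, and divide by the sum of the weights. A minimal sketch, with a function name chosen only for illustration:

```python
def weighted_mean(values, weights):
    """Return sum(x * w) / sum(w) for paired values and weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


# Example 8: salaries weighted by the number of employees in each job
print(weighted_mean([30_500, 33_700, 38_600, 45_000], [18, 12, 7, 3]))   # 33965.0

# Example 9: numbers of bedrooms weighted by the number of homes
print(weighted_mean([2, 3, 4, 5, 6], [13, 21, 10, 4, 2]))               # 3.22
```

In Example 9 the frequencies play the role of the weights, so the mean of a frequency distribution is a weighted mean.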
APPLICATION
1. Find the mean, median, and mode/s, if any, for the given data. Round
noninteger means to the nearest tenth.
a. 8, 3, 3, 17, 9, 22, 19
b. 11, 8, 2, 5, 17, 39, 52, 42
c. 118, 105, 110, 118, 134, 155, 166, 166, 118
d. -12, -8, -5, -5, -3, 0, 4, 9, 21
e. -8.5, -2.2, 4.1, 4.1, 6.4, 8.3, 9.7
2. A college professor administered a unit exam to one of his classes and
found that the majority of the items were too easy. The scores are 45,39,
40, 48, 35, 37, 36, 37, 40, 44, 41, 49, 29, 28, 32, 36, 37, 41, 40, 36, 39,
30, 25, 43, and 50. Calculate the mean and median.
3. A professor grades students on 4 quizzes, a project, and a final
examination. Each quiz counts as 12% of the course grade. The project
counts as 22% of the course grade. The final examination counts as 30%
of the course grade. Student A has quiz scores of 75, 80, 85, and 90.
His project score is 95 and his final examination score is 92. Use the
weighted mean formula to find his average for the course.
4. Find the mean, median, and all modes for the data in the given frequency
distribution.
Points scored in a basketball game    Frequency
2     6
4     5
5     6
9     3
10    1
14    2
19    1
5. Find the mean, median, and all modes for the data in the given frequency
distribution.
Scores on a MMW quiz    Frequency
2     1
4     2
6     7
7     12
8     10
9     4
10    3
Well done! You have just finished Lesson 2 in this module. Now if you
are ready, please proceed to Lesson 3 of this module, which discusses
Measures of Dispersion.
Objectives:
• Analyze the data using range, variance, and standard
deviation; and
• Solve mathematical problems involving measures of
dispersion.
Introduction
In the previous lesson, the three types of average values of a data set were
discussed. In this lesson, we consider another important characteristic of a
data set: how it is distributed, or how far each element is from some measure
of central tendency. To measure the spread or dispersion of data, statistical
values known as the range and the standard deviation will be introduced.
ABSTRACTION
A. Range
Probably the simplest and easiest way to determine measure of
dispersion is the range. The range is the difference of the highest value and the
lowest value in the data set.
Advantages of the range
1. It is easy to compute; and
2. It is easy to understand.
Disadvantages of the range
1. It can be distorted by a single extreme value (or outlier); and
2. Only two values are used in the calculation.
Example 1:
The daily salaries of a sample of eight employees are 550, 420, 560,
500, 700, 670, 860, 480. Find the range.
Solution:
Step 1: Determine the highest value and the lowest value in the data set.
Highest value (HV) is 860 and the lowest value (LV) is 420
Step 2: Solve for the range
𝑅𝑎𝑛𝑔𝑒 = 𝐻𝑉 − 𝐿𝑉 = 860 − 420 = 440
Hence, the range in daily rate salary is 440.
Example 2:
Find the range of the numbers of ounces dispensed by Machine 1 in the
table below:
Machine 1 Machine 2
9.52 8.01
6.41 7.99
10.07 7.95
5.85 8.03
8.15 8.02
𝑥̅ = 8.0 𝑥̅ = 8.0
Solution:
The greatest number of ounces dispensed is 10.07 and the least is 5.85.
The range of the numbers of ounces dispensed is 10.07 − 5.85 = 4.22 𝑜𝑧.
B. Standard Deviation
One of the most widely used measures of dispersion is the standard
deviation. The more spread apart the data, the higher the deviation. Standard
deviation is calculated as the square root of variance.
Example 3:
The following numbers were obtained by sampling a population: 2, 4, 7,
12, 15. Find the standard deviation of the sample.
Solution:
Step 1: The mean of the numbers is
\[ \bar{x} = \frac{2 + 4 + 7 + 12 + 15}{5} = 8 \]
Step 2: For each number, calculate the deviation between the number
and the mean.
𝑥 𝑥 − 𝑥̅
2 2 − 8 = −6
4 4 − 8 = −4
7 7 − 8 = −1
12 12 − 8 = 4
15 15 − 8 = 7
Step 3: Calculate the square of each deviation in Step 2, and find the
sum of these squared deviations.
𝑥 𝑥 − 𝑥̅ (𝑥 − 𝑥̅)²
2 2 − 8 = −6 (−6)² = 36
4 4 − 8 = −4 (−4)² = 16
7 7 − 8 = −1 (−1)² = 1
12 12 − 8 = 4 4² = 16
15 15 − 8 = 7 7² = 49
Sum 118
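The steps that follow the sum of squared deviations are not shown at this point in the text. Example 5 later states that 𝑠 = √29.5 for this sample, which corresponds to dividing 118 by 𝑛 − 1 = 4 and taking the square root. Under that assumption, the whole computation can be sketched as:

```python
import math

sample = [2, 4, 7, 12, 15]   # Example 3

n = len(sample)
mean = sum(sample) / n                               # Step 1: 8.0
deviations = [x - mean for x in sample]              # Step 2: x minus the mean
sum_of_squares = sum(d ** 2 for d in deviations)     # Step 3: 118.0

# Remaining steps (assumed from Example 5): divide by n - 1, then take the root
variance = sum_of_squares / (n - 1)                  # 29.5
std_dev = math.sqrt(variance)

print(sum_of_squares, variance, round(std_dev, 2))   # 118.0 29.5 5.43
```

Dividing by 𝑛 − 1 rather than 𝑛 is the usual convention for a sample; the battery computations later in this lesson also divide by 7, which is consistent with samples of eight values.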
Solution:
The mean for each sample of batteries is 7 ℎ.
The batteries from Company A have a standard deviation of
\[ s_1 = \sqrt{\frac{(6.2 - 7)^2 + (6.4 - 7)^2 + \cdots + (9.3 - 7)^2}{7}} = 1.328\ \text{h} \]
The batteries from Company B have a standard deviation of
\[ s_2 = \sqrt{\frac{(6.8 - 7)^2 + (6.2 - 7)^2 + \cdots + (8.2 - 7)^2}{7}} = 0.719\ \text{h} \]
The batteries from Company C have a standard deviation of
\[ s_3 = \sqrt{\frac{(6.1 - 7)^2 + (6.6 - 7)^2 + \cdots + (8.5 - 7)^2}{7}} = 0.877\ \text{h} \]
The batteries from Company B have the smallest standard deviation.
According to these results, Company B produces the most consistent
batteries with regard to life expectancy under constant use.
C. Variance
A statistic known as the variance is also used as a measure of
dispersion. The variance for a given set of data is the square of the standard
deviation of data.
Example 5:
Find the variance for the sample given in Example 3.
Solution:
In Example 3, we found 𝑠 = √29.5. The variance is the square of the
standard deviation. Thus, the variance is \( s^2 = (\sqrt{29.5})^2 = 29.5 \).
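The relationship between the two measures can be checked with the standard `statistics` module, whose `variance` and `stdev` functions both use the sample (𝑛 − 1) convention. This is a verification sketch only.

```python
import math
from statistics import stdev, variance

sample = [2, 4, 7, 12, 15]       # sample from Example 3

s = stdev(sample)                # sample standard deviation
s_squared = variance(sample)     # sample variance

print(s_squared)                          # 29.5
print(math.isclose(s ** 2, s_squared))    # True
```

Because the variance is expressed in squared units, the standard deviation is usually the easier of the two to interpret alongside the original data.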
1. Find the range, the standard deviation, and the variance for the given
samples. Round noninteger results to the nearest tenth.
a. 1, 2, 5, 7, 8, 19, 22
b. 3, 4, 7, 11, 12, 12, 15, 16
c. 2.1, 3.0, 1.9, 1.5, 4.8
d. 48, 91, 87, 93, 59, 68, 92, 100, 81
e. −8, −5, −12, −1, 4, 7, 11
Find the mean and sample standard deviation of these data. Round to
the nearest hundredth.
3.5 4.9 4.5 5.0 2.8 3.5 2.2 3.9 5.3 2.9
Find the mean and sample standard deviation of these data. Round to
the nearest hundredth.
4. A survey of 16 energy drinks noted the caffeine concentration of each
drink in milligrams per ounce. The results are given in the table below.
Concentration of caffeine (mg/oz)
9.1 7.5 7.8 8.9 9.0 8.2 9.1 8.7
Find the mean and sample standard deviation of these data. Round to
the nearest hundredth.
Find the mean and sample standard deviation of these data. Round to
the nearest hundredth.
Well done! You have just finished Lesson 3 in this module. Now if you
are ready, please proceed to Lesson 4 of this module, which discusses
Measures of Relative Position.
Objectives:
• Analyze the data using z-scores, percentiles, and
quartiles; and
• Solve mathematical problems involving measures of
relative position.
Introduction
ABSTRACTION
A. Percentiles
Most standardized examinations provide scores in terms of percentiles,
which are defined as follows:
▪ 𝑝𝑡ℎ Percentile
A value 𝑥 is called the 𝑝𝑡ℎ percentile of a data set provided 𝑝% of the
data values are less than 𝑥.
Example 1:
In a recent year, the median annual salary for a physical therapist was
74,480. If the 90th percentile for the annual salary of a physical therapist was
105,900, find the percent of physical therapists whose annual salary was
a. more than 74,480
b. less than 105,900
c. between 74,480 and 105,900
Solution:
a. By definition, the median is the 50th percentile. Therefore, 50% of the
physical therapists earned more than 74,480 per year.
b. Because 105,900 is the 90th percentile, 90% of all physical therapists
made less than 105,900.
c. From parts a and b, 90% − 50% = 40% of the physical therapists
earned between 74,480 and 105,900.
Example 2:
On a reading examination given to 900 students, Elaine’s score of 602
was higher than the scores of 576 of the students who took the examination.
What is the percentile for Elaine’s score?
Solution:
\[ \text{Percentile of score } x = \frac{\text{number of data values less than } x}{\text{total number of data values}} \cdot 100 \]
\[ \text{Percentile of score } 602 = \frac{576}{900} \cdot 100 = 64 \]
Elaine's score of 602 is at the 64th percentile.
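The percentile formula is a single ratio, so a sketch of it is short. The function name is illustrative and not part of the module.

```python
def percentile_of_score(number_below, total):
    """Percentile = (number of data values less than x) / (total number) * 100."""
    return number_below / total * 100


# Example 2: Elaine scored higher than 576 of the 900 students
print(round(percentile_of_score(576, 900)))   # 64
```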
B. Quartiles
\[ Q_k = \frac{k(N + 1)}{4} \]
where: 𝑄𝑘 = position of the 𝑘th quartile in the ordered data
𝑁 = number of values in the data set
𝑘 = quartile number (1, 2, or 3)
Step 3: Identify the first, second, and third quartiles values in the data
set.
45, 46, 48, 51, 53, 54, 55, 58, 59
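As a sketch of how the position formula 𝑘(𝑁 + 1)/4 might be applied to the ordered ages listed above: with 𝑁 = 9 the positions are 2.5, 5, and 7.5. The text gives no rule here for a fractional position, so the function below assumes linear interpolation between the two neighboring values; the function name is also an assumption.

```python
def quartile(sorted_values, k):
    """Return the k-th quartile using the position k(N + 1) / 4.

    A fractional position is resolved by linear interpolation between the
    two neighboring values (an assumption; the text states no rule).
    Intended for ordered data sets with at least 3 values.
    """
    position = k * (len(sorted_values) + 1) / 4   # 1-based rank
    lower_rank = int(position)                    # whole part of the rank
    fraction = position - lower_rank
    lower_value = sorted_values[lower_rank - 1]
    if fraction == 0:
        return lower_value
    upper_value = sorted_values[lower_rank]
    return lower_value + fraction * (upper_value - lower_value)


ages = [45, 46, 48, 51, 53, 54, 55, 58, 59]
print([quartile(ages, k) for k in (1, 2, 3)])   # [47.0, 53, 56.5]
```

Under this assumption, the first quartile lies halfway between 46 and 48, the second quartile is the median 53, and the third quartile lies halfway between 55 and 58.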
C. z-Score
A z-score is used to find the position of one observation relative to the others
in a set of data. Say we want to compare one student's score of 42, on a quiz
with a total of 50 points, with the scores of the other students in the class.
The mean and the standard deviation of the scores
can be used to compute the z-score, which measures the relative standing
of a measurement in a data set.
A z-score measures the distance between an observation and the mean,
measured in units of standard deviation. The following formulas show how to
compute the z-score for a data value 𝑥 in a population and in a sample.
\( z = \frac{x - \mu}{\sigma} \) (for population)        \( z = \frac{x - \bar{x}}{s} \) (for sample)
Example 4:
The monthly expenditures of a large group of households are normally
distributed with a mean of 48,700 and a standard deviation of 10,400. What are
the 𝑧-values of monthly expenditures of 59,400 and 38,300?
Solution:
Using the z-score formula for a population, the 𝑧-values for the two 𝑥 values
(59,400 and 38,300) are computed as follows:
For 𝑥 = 59,400: \( z = \frac{x - \mu}{\sigma} = \frac{59{,}400 - 48{,}700}{10{,}400} \approx 1.03 \)
For 𝑥 = 38,300: \( z = \frac{x - \mu}{\sigma} = \frac{38{,}300 - 48{,}700}{10{,}400} = -1.00 \)
Example 5:
Raul has taken two tests in his chemistry class. He scored 72 on the first
test, for which the mean of all scores was 65 and the standard deviation was 8.
He received a 60 on a second test, for which the mean of all scores was 45 and
the standard deviation was 12. In comparison to the other students, did Raul
do better on the first test or the second test?
Solution:
Find the z-score for each test.
\[ z_{72} = \frac{72 - 65}{8} = 0.875 \qquad z_{60} = \frac{60 - 45}{12} = 1.25 \]
Raul scored 0.875 standard deviations above the mean on the first test
and 1.25 standard deviations above the mean on the second test. These
z-scores indicate that, in comparison to his classmates, Raul scored better on
the second test than he did on the first test.
Example 6:
A consumer group tested a sample of 100 light bulbs. It found that the
mean life expectancy of the bulbs was 842 h, with a standard deviation of 90.
One particular light bulb from a company had a z-score of 1.2. What was the
life span of this light bulb?
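Solution:
Substitute the given values into the z-score formula for a sample and solve for 𝑥.
\[ z = \frac{x - \bar{x}}{s} \]
\[ 1.2 = \frac{x - 842}{90} \]
\[ 108 = x - 842 \]
\[ 950 = x \]
The life span of the light bulb was 950 h.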
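The z-score computations in Examples 5 and 6 can be sketched as follows. The function name is illustrative; the last lines rearrange the same formula to recover a data value from a known z-score.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


# Example 5: Raul's two chemistry tests
print(z_score(72, 65, 8))     # 0.875
print(z_score(60, 45, 12))    # 1.25

# Example 6: solve z = (x - mean) / s for x, giving x = mean + z * s
life_span = 842 + 1.2 * 90
print(round(life_span))       # 950
```

A larger z-score means a higher standing relative to the rest of the group, which is why Raul's second test counts as the better result even though the raw score is lower.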
APPLICATION
3. The following table lists the calories per 100 milliliters of 25 popular sodas.
Find the quartiles for the data.
6. A data set has a mean of 𝑥̅ = 6.8 and a standard deviation of 1.9. Find
the z-score for each of the following. Round to the nearest hundredth.
a. 𝑥 = 6.2
b. 𝑥 = 7.2
c. 𝑥 = 9.0
d. 𝑥 = 5.0
7. A data set has a mean of 𝑥̅ = 4010 and a standard deviation of 115. Find
the z-score for each of the following. Round to the nearest hundredth.
a. 𝑥 = 3840
b. 𝑥 = 4200
c. 𝑥 = 4300
d. 𝑥 = 4030
Well done! You have just finished Lesson 4 in this module. Now if you
are ready, please proceed to Lesson 5 of this module, which discusses the
Normal Distribution.
REFERENCES
Sirug, W. (2018). Mathematics in the Modern World, CHED Curriculum
Compliant. Mindshapers Co. Inc.
Mathematics in the Modern World, Philippine Edition.