Module 4 - Data Management

This document provides an overview of Module 4 on data management for a mathematics course. It introduces key concepts in descriptive statistics such as frequency distributions, measures of central tendency, dispersion, and relative position. The module aims to teach students how to organize, summarize, present, and interpret data through graphical and tabular methods. Specific lessons will cover frequency distributions, measures of central tendency and dispersion, probabilities and the normal distribution, and linear regression and correlation.


Surigao Del Sur State University

Bislig Campus
Maharlika, Bislig City

Mathematics in
the Modern
World

MODULE 4
Adam C. Macapili
Instructor
Mathematics in the Modern World, Surigao del Sur State University
Page | 1
Module 4

Module Overview

Data Management

In this Module

• Introduction to Data Management


• Measures of Central Tendency
• Measures of Dispersion
• Measures of Relative Position
• Probabilities and Normal Distribution
• Linear Regression and Correlation

Statistics involves the collection, organization, summarization,
presentation, and interpretation of data. The branch of statistics that involves
the collection, organization, summarization, and presentation of data is called
descriptive statistics. The branch that interprets and draws conclusions from
the data is called inferential statistics.
At the completion of this module, you should be able to:
• Draw the graph/table to present the data;
• Discuss the properties of mean, median, and mode;
• Compute the different measures of dispersion;
• Analyze and interpret the data;
• Perform operations on mathematical expressions correctly.

Are you ready? Then start the lesson now!

Lesson 1
Introduction to Data Management

Objectives:
• Identify the essential parts of a table and describe the
different kinds of graphs for data presentation; and
• Analyze and interpret the data presented in a graph/table.

Introduction
When conducting a statistical analysis, investigation, or report, the
analyst must collect data for the specific variable under investigation. In order
to explain circumstances and draw conclusions and inferences about events,
the researcher must arrange the collected data in some meaningful way. The
simplest and most commonly used way to arrange data is to create a frequency
distribution. A frequency distribution is a grouping of the data into
non-overlapping classes, showing the number of observations in each class.
After organizing the data, the researcher's next step is to present them
so that they can be easily understood by those who will benefit from reading
the study. The most useful method of presenting data is by constructing graphs
and charts. There are a number of ways to plot graphs and charts, and each one
has a specific purpose.

ABSTRACTION

A. Organization of Data
Before constructing a frequency
distribution, we must define some terms that are essential to
a deeper understanding of the data displayed in a
frequency distribution.

▪ Raw data is the data collected in original form.


▪ Range is the difference between the highest value and the lowest value in a
distribution.
▪ Frequency distribution is the organization of data in tabular form, using
mutually exclusive classes and showing the number of observations in each.
▪ Class Limits (or Apparent Limits) are the highest and lowest values describing
a class.
▪ Class Boundaries (or Real Limits) are the upper and lower values of a class
in a grouped frequency distribution. They carry one more decimal place
than the class limits and end with the digit 5; for example, the class
20 – 24 has the boundaries 19.5 and 24.5.
▪ Interval (or width) is the distance between the lower class boundary and the
upper class boundary; it is denoted by the symbol 𝑖.
▪ Frequency (𝑓) is the number of values in a specific class of a frequency
distribution.
▪ Percentage is obtained by multiplying the relative frequency (𝑓 ÷ 𝑛) by 100%.
▪ Cumulative Frequency (𝑐𝑓) is the sum of the frequencies accumulated up to
the upper boundary of a class in a frequency distribution.
▪ Midpoint (or class mark) is the point halfway between the class limits of
each class and is representative of the data within that class.

A grouped frequency distribution is used when the range of the data set
is large; the data must be grouped into classes whether it is categorical data or
interval data. For interval data the class is more than one unit in width. The
procedure for constructing the frequency distribution is discussed in the
succeeding sections.

Categorical Frequency Distribution


The categorical frequency distribution is used to organize nominal-level
or ordinal-level data. Examples of where this distribution applies include
gender, business type, and political affiliation.

Example 1:
Twenty applicants were given a performance evaluation appraisal. The
data set is
High High High Low Average
Average Low Average Average Average
Low Average Average High High
Low Low Average High High

Construct a frequency distribution for the data.


Solution:
Step 1: Construct a table as shown below.

Class      Tally       Frequency   Percentage
High
Average
Low

Step 2: Tally the raw data.

Class      Tally       Frequency   Percentage
High       IIII-II
Average    IIII-III
Low        IIII

Step 3: Convert the tallied data into numerical frequencies.

Class      Tally       Frequency   Percentage
High       IIII-II         7
Average    IIII-III        8
Low        IIII            5

Step 4: Determine the percentage. The percentage is computed using
the formula:

$$\text{Percentage} = \frac{f}{n} \times 100\%$$

where: 𝑓 = frequency of the class and 𝑛 = total number of values.

Class      Tally       Frequency   Percentage   Found by
High       IIII-II         7           35       (7 ÷ 20) × 100
Average    IIII-III        8           40       (8 ÷ 20) × 100
Low        IIII            5           25       (5 ÷ 20) × 100
Total                     20          100

In this sample, more applicants received an average performance rating
than a high or a low rating.
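
The four steps of Example 1 can also be checked with a short script. The following is only a minimal sketch in Python; the function name `categorical_distribution` and the variable names are illustrative and are not part of the module.

```python
from collections import Counter

# Ratings of the twenty applicants from Example 1, in the order listed.
ratings = [
    "High", "High", "High", "Low", "Average",
    "Average", "Low", "Average", "Average", "Average",
    "Low", "Average", "Average", "High", "High",
    "Low", "Low", "Average", "High", "High",
]


def categorical_distribution(values):
    """Return {class: (frequency, percentage)} for categorical data."""
    counts = Counter(values)          # Steps 2-3: tally and count
    n = len(values)                   # total number of values
    # Step 4: percentage = (f / n) * 100
    return {label: (f, f * 100 / n) for label, f in counts.items()}


table = categorical_distribution(ratings)
for label in ("High", "Average", "Low"):
    f, pct = table[label]
    print(f"{label:<8} {f:>3} {pct:>6.1f}%")
```

The printed frequencies and percentages should match the table in Step 4.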

Determining Class Interval


Generally, the number of classes for a frequency distribution table varies
from 5 to 20, depending primarily on the number of observations in the data set.
It is preferred to have more classes as the size of a data set increases. The
decision about the number of classes depends on the method used by the
researcher.
1. Rule 1. To determine the number of classes, use the smallest positive
integer 𝑘 such that $2^k \geq n$, where 𝑛 is the total number of observations.

$$i = \frac{\text{Range}}{\text{Number of Classes}} = \frac{HV - LV}{k}$$

Where: 𝐻𝑉 = Highest value in a data set
𝐿𝑉 = Lowest value in a data set
𝑘 = number of classes
𝑖 = suggested class interval
2. Rule 2. Another way to determine the class interval is to apply the
formula below, where 𝑛 is the total number of observations (the sum of the
frequencies) and the logarithm is taken to base 10:

$$i = \frac{\text{Range}}{1 + 3.322 \log_{10} n}$$
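
To see how the two rules behave, here is a small Python sketch. The function names and the list `scores` are hypothetical and chosen only for illustration, and Rule 2 is assumed to use the base-10 logarithm.

```python
import math


def classes_by_power_of_two(n):
    """Rule 1: smallest positive integer k such that 2**k >= n."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


def interval_rule_1(values):
    """Suggested class interval: range divided by the Rule 1 class count."""
    k = classes_by_power_of_two(len(values))
    return (max(values) - min(values)) / k


def interval_rule_2(values):
    """Suggested class interval: range / (1 + 3.322 * log10(n))."""
    n = len(values)
    return (max(values) - min(values)) / (1 + 3.322 * math.log10(n))


# Hypothetical data set of 10 observations, used only to exercise the rules.
scores = [12, 15, 21, 24, 30, 33, 38, 41, 47, 52]
print(classes_by_power_of_two(len(scores)))   # number of classes by Rule 1
print(interval_rule_1(scores))                # suggested interval by Rule 1
print(round(interval_rule_2(scores), 2))      # suggested interval by Rule 2
```

In practice the suggested interval is usually rounded to a convenient whole number before the classes are built.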

Grouped Frequency Distribution


A Frequency Distribution, or Grouped Frequency Distribution, is a
method of organizing data. When the data include a large number of
observations, it is convenient to group the values into mutually exclusive
classes and to show the number of observations occurring in each class in
tabular form.

• When using a frequency distribution, we may not know the exact
smallest and highest values unless we refer back to the
original ungrouped data.

• A simple frequency distribution essentially has two columns: one for the
classes, which are also referred to as class intervals, and another for the
class frequencies, which indicate the number of observations falling within the
different class intervals.

Let us consider the following frequency distribution:

Table 1. Frequency Distribution of the Midterm Exam Scores of 50 Students

Class Interval (𝑥)    Class Frequency (𝑓)
20 – 24                 2
25 – 29                 6
30 – 34                 9
35 – 39                10
40 – 44                12
45 – 49                 7
50 – 54                 4
𝑖 = 5                  𝑛 = 50

where: 𝑥 = class intervals
𝑓 = class frequencies
𝑖 = class size
𝑛 = total number of observations (sample size)

The following table shows the class limits, class boundaries, and class
marks of the frequency distribution of table 1.

Table 2. Frequency Distribution of the Midterm Exam Scores of 50 Students
Showing the Class Limits, Class Boundaries, and Class Marks

Class Interval (𝑥)   Class Limits   Class Boundaries   Class Mark/Midpoint   Class Frequency (𝑓)
20 – 24               20 and 24      19.5 and 24.5      22                     2
25 – 29               25 and 29      24.5 and 29.5      27                     6
30 – 34               30 and 34      29.5 and 34.5      32                     9
35 – 39               35 and 39      34.5 and 39.5      37                    10
40 – 44               40 and 44      39.5 and 44.5      42                    12
45 – 49               45 and 49      44.5 and 49.5      47                     7
50 – 54               50 and 54      49.5 and 54.5      52                     4
𝑖 = 5                                                                         𝑛 = 50

Steps in Constructing a Grouped Frequency Distribution

1. Calculate the range of the data by subtracting the lowest value from the
highest value.
2. Decide on the number of class intervals. The use of 5 to 20 class
intervals is often justified depending on the nature of the data.
3. Divide the range by the desired number of class intervals. The result may
now be employed as the interval size 𝑖.
4. Choose an appropriate lower limit for the first class interval. This number
should be less than, or equal to, the lowest value in the data. The general
practice, however, is to use, if possible, a lower limit that is divisible by
the interval size. The upper limit of the class interval is obtained by
adding 𝑖 – 1 to the lower limit.
5. Determine the rest of the class interval.
6. Count the number of observations or measurements falling within each
class interval and enter the results in the frequency column. This is
facilitated by providing a tally column to the right of the class intervals.

Example 2:
A sample of 40 companies belonging to a certain industry reported the
following numbers of employees.

43 58 21 24 31 49 40 51 55 28
50 33 62 30 25 39 59 29 36 42
38 46 42 61 50 41 37 35 40 52
47 35 57 55 36 45 32 45 42 36

Construct a frequency distribution using five class intervals for the
above data.

Solution:
Step 1: Calculate the range of the data by subtracting the lowest value
from the highest value.
𝑅𝑎𝑛𝑔𝑒 = 𝐻𝑉 − 𝐿𝑉 = 62 − 21 = 41
Step 2: Decide on the number of class intervals. The use of 5 to 20 class
intervals is often justified depending on the nature of the data. In this
example, the problem specifies 5.
Step 3: Divide the range by the desired number of class intervals. The
result may now be employed as the interval size 𝑖. Here the quotient is
rounded up to the next whole number so that the five classes cover the
entire range.

$$i = \frac{\text{Range}}{5} = \frac{41}{5} = 8.2 \approx 9$$
Step 4: Choose an appropriate lower limit for the first class interval. This
number should be less than, or equal to, the lowest value in the data.
The general practice, however, is to use, if possible, a lower limit that is
divisible by the interval size. The upper limit of the class interval is
obtained by adding 𝑖 – 1 to the lower limit.

In this case, let us choose a lower limit of 18, since it is divisible
by the interval size 𝑖 = 9 and does not exceed the lowest value, 21.
𝐿𝑜𝑤𝑒𝑟 𝑙𝑖𝑚𝑖𝑡 = 18
𝑈𝑝𝑝𝑒𝑟 𝑙𝑖𝑚𝑖𝑡 = 18 + (9 − 1) = 26

Class Interval
(𝑥)
18 − 26

Step 5: Determine the rest of the class interval.

Class Interval
(𝑥)
18 − 26
27 − 35
36 − 44
45 − 53
54 − 62

Step 6: Count the number of observations or measurements falling within
each class interval and enter the results in the frequency column. This is
facilitated by providing a tally column to the right of the class intervals.

Class Interval (𝑥)    Tally             Frequency
18 − 26                III                3
27 − 35                IIII-III           8
36 − 44                IIII-IIII-III     13
45 − 53                IIII-IIII          9
54 − 62                IIII-II            7
𝑖 = 9                 Total             𝑛 = 40
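
The six steps can be traced in code. The sketch below is one possible Python rendering under stated assumptions: the data are whole numbers, the interval size is rounded up, and the first lower limit is taken as the largest multiple of the interval size that does not exceed the lowest value. The function name `grouped_distribution` is illustrative only.

```python
import math

# Number of employees reported by the 40 companies in Example 2.
employees = [
    43, 58, 21, 24, 31, 49, 40, 51, 55, 28,
    50, 33, 62, 30, 25, 39, 59, 29, 36, 42,
    38, 46, 42, 61, 50, 41, 37, 35, 40, 52,
    47, 35, 57, 55, 36, 45, 32, 45, 42, 36,
]


def grouped_distribution(values, num_classes):
    """Return (lower limit, upper limit, frequency) rows for integer data."""
    data_range = max(values) - min(values)            # Step 1
    width = math.ceil(data_range / num_classes)       # Steps 2-3, rounded up
    # Step 4: largest multiple of the width not exceeding the lowest value.
    lower = (min(values) // width) * width
    rows = []
    while lower <= max(values):                       # Step 5
        upper = lower + width - 1
        freq = sum(lower <= v <= upper for v in values)   # Step 6
        rows.append((lower, upper, freq))
        lower += width
    return rows


for lower, upper, freq in grouped_distribution(employees, 5):
    print(f"{lower} - {upper}: {freq}")
```

Because the lower limit is pushed down to a multiple of the interval size, this approach can occasionally produce one class more than requested; that is a consequence of the assumption made in Step 4, not of the method in the module.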
B. Graphing Statistical Data
When the data set contains a large number of values, drawing conclusions
from an ordered array or stem-and-leaf plot is often difficult. We need
graphs or charts in such situations. There are a number of graphs and charts
that show numerical data visually. These include the histogram, the frequency
polygon, and the cumulative frequency polygon (ogive).

Histogram
A histogram is a graph in which the classes are marked on the horizontal
axis (𝑥 − 𝑎𝑥𝑖𝑠) and the class frequencies on the vertical axis (𝑦 − 𝑎𝑥𝑖𝑠). The
height of the bars represents the class frequencies, and the bars are drawn
adjacent to each other. Nevertheless, the histogram focusses on the frequency
of each class and sacrifices whatever information is contained in the actual
observation.

Frequency Polygon
A frequency polygon is a graph that displays the data using points which
are connected by lines. The frequencies are represented by the heights of the
points at the midpoints of the classes. The vertical axis represents the
frequency of the distribution while the horizontal axis represents the midpoints
of the frequency distribution.

Cumulative Frequency Polygon (Ogive)


A cumulative frequency polygon or ogive (reads as oh’-jive) is a graph
that displays the cumulative frequencies for the classes in a frequency
distribution. The vertical axis represents the cumulative frequency of the
distribution while the horizontal axis represents the upper class boundaries (real
upper limits) of the frequency distribution.

Example 3:
Shown below is the frequency distribution in Example 2.

Class Interval (𝑥)    Frequency
18 − 26                 3
27 − 35                 8
36 − 44                13
45 − 53                 9
54 − 62                 7
𝑖 = 9                  𝑛 = 40

Construct a histogram, frequency polygon, and cumulative frequency
polygon.
Solution:
a. Constructing a Histogram
Step 1: Find the class marks or midpoints of each class.

Class Interval (𝑥)    Frequency    Class Mark/Midpoint
18 − 26                 3           (18 + 26) ÷ 2 = 22
27 − 35                 8           31
36 − 44                13           40
45 − 53                 9           49
54 − 62                 7           58
𝑖 = 9                  𝑛 = 40

Step 2: Draw and label the x-axis and y-axis.


Step 3: Represent the frequency on the y-axis and the midpoints on the
x-axis.
Step 4: Use the frequency to represent the height and draw the vertical
bars.

[Figure: Histogram. Horizontal axis: Class Marks or Midpoints (22, 31, 40, 49, 58); vertical axis: Frequency.]

b. Constructing a Frequency Polygon


Step 1: Find the midpoints of each class
Step 2: Draw and label the x-axis and y-axis.
Step 3: Represent the frequency on the y-axis and the midpoints on the
x-axis.

Step 4: Connect adjacent points with line segments. Draw a line back to
the x-axis at the beginning and end of the graph.

[Figure: Frequency Polygon. Horizontal axis: Class Marks or Midpoints (22, 31, 40, 49, 58); vertical axis: Frequency.]

c. Constructing a Cumulative Frequency Polygon (Ogive)


Step 1: Find the cumulative distribution of the data set.

Class Interval (𝑥)   Class Boundaries   Frequency   Cumulative Frequency (𝑐𝑓)   Found by
18 − 26               17.5 − 26.5          3           3                            3
27 − 35               26.5 − 35.5          8          11                            3 + 8 = 11
36 − 44               35.5 − 44.5         13          24                            11 + 13 = 24
45 − 53               44.5 − 53.5          9          33                            24 + 9 = 33
54 − 62               53.5 − 62.5          7          40                            33 + 7 = 40
𝑖 = 9                                    𝑛 = 40

Step 2: Draw and label the x-axis and y-axis.


Step 3: Represent the cumulative frequency on the y-axis and the upper
class boundaries on the x-axis.
Step 4: Connect adjacent points with line segments.

[Figure: Ogive. Horizontal axis: Upper Class Boundaries (26.5, 35.5, 44.5, 53.5, 62.5); vertical axis: Cumulative Frequency.]
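
The coordinates plotted in the three graphs of Example 3 can be computed as follows. This is a minimal Python sketch; the variable names are illustrative, and the 0.5 offset for the boundaries assumes whole-number class limits as in the example.

```python
from itertools import accumulate

# Class limits and frequencies from Example 3.
classes = [(18, 26), (27, 35), (36, 44), (45, 53), (54, 62)]
frequencies = [3, 8, 13, 9, 7]

# Class marks: halfway between the lower and upper class limits.
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Upper class boundaries: half a unit above each upper limit.
upper_boundaries = [hi + 0.5 for _, hi in classes]

# Cumulative frequencies: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Points for the histogram or frequency polygon: (midpoint, frequency).
polygon_points = list(zip(midpoints, frequencies))

# Points for the ogive: (upper class boundary, cumulative frequency).
ogive_points = list(zip(upper_boundaries, cumulative))

print(polygon_points)
print(ogive_points)
```

The last cumulative frequency should equal 𝑛, which gives a quick check that no class was missed.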

Other Useful Graphs and Charts


1. Pareto Chart. A Pareto chart is a graph used to represent a frequency
distribution for categorical (nominal-level) data. Frequencies are
displayed by the heights of vertical bars, which are arranged in order
from highest to lowest.

2. Bar Chart (Bar Graph). A bar chart is similar to a histogram. The
bases of the rectangles are arbitrary intervals whose centers are the
codes. The height of each rectangle represents the frequency of that
category. It is also applicable to categorical (nominal-level) data.

3. Pie Chart (Circle Graph). A pie chart is a circle divided into portions that
represent the relative frequencies (or percentages) of the data belonging
to different categories. The data in a pie chart should be categorical or
nominal-level.

4. Time Series Graph. A time series graph represents data that occur over
a specific period of time under observation. It also shows the trend or
pattern of increase or decrease over that period.

5. Pictograph (Pictogram). A pictograph immediately suggests the nature
of the data being shown. It combines the attention-getting
quality of pictures with the accuracy of the bar chart. Appropriate pictures
arranged in a row (sometimes in a column) present the quantities for comparison.

6. Scatter Plot. A scatter plot is used to examine possible relationships
between two numerical variables. The two variables are plotted on the x-axis
and y-axis.

Example 4:
Using the information in the table below about the favorite snacks of
500 students, construct a Pareto chart, bar chart, and pie chart.
Products      Sales
Junk Foods      75
Candy          150
Ice Cream      105
Chocolate      130
Others          40

Solution:
a. Constructing a Pareto Chart
Step 1: Arrange the data from highest to lowest according to frequency.
Products Sales
Candy 150
Chocolate 130
Ice Cream 105
Junk Foods 75
Others 40

Step 2: Draw and label x-axis (Products) and y-axis (Sales).


Step 3: Construct the chart by arranging the frequency from highest to
lowest and from left to right. Make a bar with the same width and draw
the height corresponding to the frequencies.

[Figure: Pareto chart, "Favorite Snacks". Horizontal axis: Products (Candy, Chocolate, Ice Cream, Junk Foods, Others); vertical axis: Sales.]

The Pareto chart makes it easy to see that candy is the most preferred
snack, followed by chocolate, while other kinds of snacks are the least
preferred by the students in the given population.
b. Constructing a Bar Chart
Step 1: Draw and label x-axis (Products) and y-axis (Sales).
Step 2: Make a bar with the same width and draw the height
corresponding to the frequencies.

[Figure: Bar chart, "Favorite Snacks". Horizontal axis: Products (Junk Foods, Candy, Ice Cream, Chocolate, Others); vertical axis: Sales.]

The bar chart shows the same pattern: candy is
the most preferred snack, followed by chocolate, while other kinds of snacks
are the least preferred by the students in the given population.

c. Constructing a Pie Chart


Step 1: Since there are 360° in a circle, the frequency of each class must be
converted into a proportional part of the circle. This conversion is done
by applying the formula

$$\text{Degrees} = \frac{f}{n} \times 360^\circ$$

where: 𝑓 = frequency of each class, and 𝑛 = sum of frequencies

Hence, the following conversions are obtained. The degrees should total
360°.
Junk Foods: (75 ÷ 500) × 360° = 54°
Candy: (150 ÷ 500) × 360° = 108°
Ice Cream: (105 ÷ 500) × 360° = 75.6°
Chocolate: (130 ÷ 500) × 360° = 93.6°
Others: (40 ÷ 500) × 360° = 28.8°
Step 2: Each frequency must also be converted to a percentage, and the
percentages must total 100%. This conversion is
done by applying the formula

$$\text{Percentage} = \frac{f}{n} \times 100\%$$

where: 𝑓 = frequency of each class, and 𝑛 = sum of frequencies

Junk Foods: (75 ÷ 500) × 100% = 15%
Candy: (150 ÷ 500) × 100% = 30%
Ice Cream: (105 ÷ 500) × 100% = 21%
Chocolate: (130 ÷ 500) × 100% = 26%
Others: (40 ÷ 500) × 100% = 8%

Step 3: Using a protractor, graph each section and write its name and
appropriate percentage.

[Figure: Pie chart, "Favorite Snacks". Slices: Candy, 30%; Chocolate, 26%; Ice Cream, 21%; Junk Foods, 15%; Others, 8%.]

Since candy has the biggest slice in the pie chart, it is the most
preferred snack, followed by chocolate, while other kinds of snacks are the
least preferred by the students in the given population.
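
The arithmetic behind the Pareto ordering and the pie chart conversions in Example 4 can be reproduced with a few lines of Python. This is an illustrative sketch only; `pie_slices` and the other names are not from the module.

```python
# Frequencies from the table in Example 4.
snacks = {
    "Junk Foods": 75,
    "Candy": 150,
    "Ice Cream": 105,
    "Chocolate": 130,
    "Others": 40,
}


def pie_slices(freqs):
    """Return {category: (degrees, percentage)} for a pie chart."""
    n = sum(freqs.values())                       # sum of frequencies
    return {
        name: (f * 360 / n, f * 100 / n)          # (f/n)(360) and (f/n)(100)
        for name, f in freqs.items()
    }


# Pareto chart order: categories sorted from highest to lowest frequency.
pareto_order = sorted(snacks, key=snacks.get, reverse=True)

slices = pie_slices(snacks)
for name in pareto_order:
    degrees, percent = slices[name]
    print(f"{name:<11} {degrees:>6.1f} degrees {percent:>5.1f}%")
```

Summing the degrees and the percentages is a useful check: they should come to 360 and 100, up to rounding.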

Example 5:
Using the information in the table below about the US dollar and
Philippine peso exchange rate from January to December of 2017, construct a
time series graph.

Month   Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Rate     41   42   43   46   44   45   43   42   45   44   45   43

Solution:
Step 1: Draw and label the x-axis and y-axis.
Step 2: Label the x-axis for months and y-axis for Peso per US Dollar.
Step 3: Plot each point according to the table.
Step 4: Draw the segments connecting adjacent points.

[Figure: Time series graph, "Peso-US Dollar Exchange Rate". Horizontal axis: Months (Jan to Dec); vertical axis: Peso per US Dollar (38 to 47).]

It can be seen in the table that April has the highest exchange rate of the US
dollar to the Philippine peso (46), while January has the lowest (41),
followed by February and August (42 each).

Example 6:
The information in the table below shows the number of male students of
a certain college in Bislig City from 2016 to 2020. Construct a pictograph.
Year    Male
2016    300
2017    375
2018    525
2019    600
2020    300

Solution:
Step 1: Draw and label the x-axis and y-axis.
Step 2: Label the x-axis for Students and y-axis for Years.

Step 3: Draw male icons to represent the number of
students.

[Figure: Pictograph. Horizontal axis: Male Students (0 to 700); vertical axis: Years (2016 to 2020).]

Legend: 1 male icon = 50 male students

The pictograph shows that there were more male students in 2018 and
2019, and fewer in 2016 and 2020, in a certain college in Bislig
City.

Example 7:
The owner of a chain of halo-halo stores would like to study the effect of
atmospheric temperature on sales during the summer season. A random
sample of 10 days is selected with the results given as follows:
Day                 1    2    3    4    5    6    7    8    9   10
Temperature (℉)    79   76   78   84   90   83   93   94   97   85
Total Sales       147  143  147  168  206  155  192  211  209  187

Put the data on a scatter diagram.


Solution:
Step 1: Draw and label the x-axis and y-axis.
Step 2: Label the x-axis for Temperature (℉) and y-axis for Sales.
Step 3: Plot the points of each ordered pair in the Cartesian coordinate
system.

[Figure: Scatter plot. Horizontal axis: Temperature (℉), 0 to 120; vertical axis: Sales (Y), 0 to 250.]

We can deduce from the graph that there is a positive relationship between
temperature and the sales of halo-halo: as the
temperature increases, sales also increase.

APPLICATION

Task: Problem Solving

Directions: Solve the following problems. Write your answers
on a separate sheet of paper.

1. A sample of 40 companies belonging to a certain industry reported the
following numbers of employees.

43 58 21 24 31 49 40 51 55 28
50 33 62 30 25 39 59 29 36 42
38 46 42 61 50 41 37 35 40 52
47 35 57 55 36 45 32 45 42 36

Construct a frequency distribution using seven class intervals for the
given data, and also construct a histogram, frequency polygon, and
cumulative frequency polygon.

2. A certain agency is interested in the number of brand-new cars imported
to the Philippines in 2015. The data are as follows:

Country           No. of Cars Imported
Japan             225,000
South Korea        78,300
USA               120,250
United Kingdom     19,200
Italy              16,750
China              40,500
Total             500,000

Sketch the pareto chart, bar chart, and pie chart and interpret the data.

3. In a college program where the General Mathematics course is a
prerequisite for the Statistics and Probability course, a sample of 14
students was drawn. The grades in General Mathematics and
Statistics and Probability were recorded for each student. The data are
listed below. Sketch the graph of these data using a scatter plot. Interpret
the result.

Student         1   2   3   4   5   6   7   8   9  10  11  12  13  14
General Math   90  80  75  78  79  84  86  93  95  76  84  81  84  87
Stat & Prob    88  84  76  77  76  83  88  95  85  78  89  84  87  89

4. The number of postpaid cellular phone subscribers for each of the last
12 years is listed below. Use a time series graph to represent these
figures. Interpret the result.
Year                 2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015
No. of Subscribers   3.12  4.10  4.23  3.96  3.87  3.50  4.67  4.99  4.86  4.96  5.01  5.18

5. A real estate developer builds houses in a province. The data in the table
below show the number of houses constructed from 2013 to
2017. Construct a pictograph.

Year    No. of Houses
2013    400
2014    250
2015    600
2016    550
2017    700

Well done! You have just finished Lesson 1 of this module. If you
are ready, please proceed to Lesson 2, which discusses
Measures of Central Tendency.

Lesson 2
Measures of Central Tendency

Objectives:
• Analyze the data using mean, median, and mode; and
• Solve mathematical problems involving measures of
central tendency.

Introduction
One of the most basic statistical concepts involves finding measures of
central tendency of a set of numerical data. But what are the measures of
central tendency? You will find the answer to this question in this lesson.

ABSTRACTION

A measure of central tendency, commonly referred to
as an average, is a single value that represents a data set. Its
purpose is to locate the center of a data set. There are three
measures of central tendency: the mean, the median, and the
mode.

A. Mean
The arithmetic mean, often called simply the mean, is the most frequently
used measure of central tendency. The mean is the only common measure in
which all values play an equal role; that is, to determine its value you
need to consider all the values of the given data set. The mean is appropriate
for determining the central tendency of interval or ratio data.
The symbol 𝑥̅, called "𝑥 bar", is used to represent the mean of a sample,
and the symbol 𝜇, called "mu", is used to denote the mean of a population.

Properties of Mean
1. A set of data has only one mean.
2. Mean can be applied for interval and ratio data.
3. All values in the data set are included in computing the mean.
4. The mean is very useful in comparing two or more data sets.
5. The mean is affected by extremely small or large values in a data set.
6. The mean is most appropriate for symmetrical data.
$$\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$$

Sample mean: $$\bar{x} = \frac{\sum x}{n}$$
Population mean: $$\mu = \frac{\sum x}{N}$$

where:
𝑥̅ = sample mean
𝜇 = population mean
𝑥 = the value of any particular observation or
measurement
∑ 𝑥 = sum of all 𝑥′𝑠
𝑛 = total number of values in the sample
𝑁 = total number of values in the population

Example 1:
The daily salaries of a sample of eight employees are 550, 420, 560,
500, 700, 670, 860, 480. Find the mean daily rate of employees.
Solution:
$$\bar{x} = \frac{\sum x}{n} = \frac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8}{n}$$

$$\bar{x} = \frac{550+420+560+500+700+670+860+480}{8} = \frac{4740}{8} = 592.50$$

The sample mean daily salary of the employees is 592.50.

Example 2:
Find the population mean of the ages of 9 middle-management
employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58,
and 55.
Solution:
$$\mu = \frac{\sum x}{N} = \frac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 + x_9}{N}$$

$$\mu = \frac{53+45+59+48+54+46+51+58+55}{9} = \frac{469}{9} \approx 52.11$$

The population mean age of the middle-management employees is 52.11.

Example 3:
Six friends in a biology class of 20 students received test grades of 92,
84, 65, 76, 88, and 90. Find the mean of these test grades.
Solution:
The 6 friends are a sample of the population of 20 students,
so use 𝑥̅ ("𝑥 bar") to represent the mean.

$$\bar{x} = \frac{\sum x}{n} = \frac{92+84+65+76+88+90}{6} = \frac{495}{6} = 82.5$$
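
The three mean computations above follow the same pattern, which can be sketched in Python as shown below. The function name `mean` is illustrative; the same formula serves for a sample (divide by 𝑛) and a population (divide by 𝑁), since only the interpretation of the data differs.

```python
def mean(values):
    """Arithmetic mean: sum of all values divided by the number of values."""
    return sum(values) / len(values)


salaries = [550, 420, 560, 500, 700, 670, 860, 480]   # Example 1 (sample)
ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]           # Example 2 (population)
grades = [92, 84, 65, 76, 88, 90]                     # Example 3 (sample)

print(mean(salaries))          # sample mean daily salary
print(round(mean(ages), 2))    # population mean age, rounded to 2 places
print(mean(grades))            # sample mean test grade
```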

B. Median
The median is the midpoint of the data array. When the data set is
ordered, whether ascending or descending, it is called a data array. Median is
an appropriate measure of central tendency for data that are ordinal or above,
but it is more valuable in an ordinal type of data.

Properties of Median
1. The median is unique; there is only one median for a set of data.
2. The median is found by arranging the set of data from lowest to highest
(or highest to lowest) and getting the value of the middle observation.
3. The median is not affected by extremely small or large values.
4. The median can be applied to ordinal, interval, and ratio data.
5. The median is most appropriate for skewed data.

To determine the median of ungrouped data, we need to consider two rules:
1. If 𝑛 is odd, the median is the middle-ranked value.
2. If 𝑛 is even, the median is the average of the two middle-ranked
values.

$$\text{Median (rank value)} = \frac{n + 1}{2}$$

Note that 𝑛 is the population/sample size.

Example 4:
Find the median of the ages of 9 middle-management employees of a
certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58, and 55.
Solution:
Step 1: Arrange the data in order.
45, 46, 48, 51, 53, 54, 55, 58, 59
Step 2: Select the middle rank value.

$$\text{Median (rank value)} = \frac{n+1}{2} = \frac{9+1}{2} = \frac{10}{2} = 5$$

Step 3: Identify the median in the data set. The 5th value in the ordered
data is 53.
45, 46, 48, 51, 53, 54, 55, 58, 59
Hence, the median age is 53 years.
Example 5:
The daily salaries of a sample of eight employees are 550, 420, 560,
500, 700, 670, 860, 480. Find the median daily rate of employees.
Solution:
Step 1: Arrange the data in order.
420, 480, 500, 550, 560, 670, 700, 860
Step 2: Select the middle rank value.

$$\text{Median (rank value)} = \frac{n+1}{2} = \frac{8+1}{2} = \frac{9}{2} = 4.5$$

Step 3: Identify the median in the data set. Rank 4.5 lies between the 4th
and 5th values.
420, 480, 500, 550, 560, 670, 700, 860
Since the middle point falls between 550 and 560, we can determine the
median of the data set by getting the average of the two values.

$$\text{Median} = \frac{550+560}{2} = 555$$

Therefore, the median daily rate is 555.
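
The two rules for the median of ungrouped data can be expressed compactly in code. The following Python sketch is illustrative; the function name `median` and the variable names are assumptions made for this example.

```python
def median(values):
    """Median of ungrouped data using the rank value (n + 1) / 2."""
    ordered = sorted(values)              # Step 1: arrange the data in order
    n = len(ordered)
    rank = (n + 1) / 2                    # Step 2: middle rank value
    if n % 2 == 1:
        # Odd n: the rank is a whole number; take that ranked value.
        return ordered[int(rank) - 1]
    # Even n: the rank ends in .5; average the two middle-ranked values.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]           # Example 4
salaries = [550, 420, 560, 500, 700, 670, 860, 480]   # Example 5

print(median(ages))
print(median(salaries))
```

Note that list positions in Python start at 0, so the value of rank 𝑟 sits at index 𝑟 − 1.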

C. Mode
The mode is the value in a data set that appears most frequently. Like
the median and unlike the mean, the mode is not affected by extreme values
in a data set. A data set may have no mode if none of the values is "most typical".
A data set in which only one value occurs with the greatest frequency is said to
be unimodal. If two values share the same greatest frequency, both
values are considered modes and the data set is bimodal. If a data set has
more than two modes, it is said to be multimodal. In some
cases all the values in the data set have the same frequency. When this
occurs, the data set is said to have no mode.

Properties of Mode
1. The mode is found by locating the most frequently occurring value.
2. The mode is the easiest average to compute.
3. There can be more than one mode or even no mode in any given data
set.
4. The mode is not affected by extremely small or large values.
5. Mode can be applied for nominal, ordinal, interval and ratio data.

Example 6:
Find the mode of the ages of 9 middle-management employees of a
certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58, and 55.
Solution:
The ordered array for these data is 45, 46, 48, 51, 53, 54, 55, 58, 59.
There is no mode since every value in the data set has the same frequency.

Example 7:
Find the mode of the data in the following lists.
(a) 18, 15, 21, 16, 15, 14, 15, 21
(b) 2, 5, 2, 3, 3, 2, 4, 5, 3
Solution:
a. In the list 18, 15, 21, 16, 15, 14, 15, 21, the number 15 occurs more
often than the other numbers. Thus, 15 is the mode.
b. In the list 2, 5, 2, 3, 3, 2, 4, 5, 3, the numbers 2 and 3 each occur
three times, more often than any other number. Thus, 2 and 3 are the modes.
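The rules for the mode can be sketched in Python as follows. This is an illustration only: the function name `modes` is hypothetical, and it assumes the convention stated above that a data set has no mode when all of its distinct values occur equally often.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of modes; an empty list means 'no mode'."""
    counts = Counter(values)                     # frequency of each distinct value
    frequencies = set(counts.values())
    # Convention from the lesson: several distinct values, all equally frequent -> no mode.
    if len(counts) > 1 and len(frequencies) == 1:
        return []
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([53, 45, 59, 48, 54, 46, 51, 58, 55]))  # []      (Example 6: no mode)
print(modes([18, 15, 21, 16, 15, 14, 15, 21]))      # [15]    (Example 7a: unimodal)
print(modes([2, 5, 2, 3, 3, 2, 4, 5, 3]))           # [2, 3]  (Example 7b: bimodal)
```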

Weighted Mean

The weighted mean is particularly useful when various classes or groups
contribute differently to the total. The weighted mean is found by multiplying
each value by its corresponding weight, adding these products, and dividing by
the sum of the weights.

$$\bar{x}_w = \frac{\sum (x \cdot w)}{\sum w} = \frac{x_1 w_1 + x_2 w_2 + x_3 w_3 + \cdots + x_n w_n}{w_1 + w_2 + w_3 + \cdots + w_n}$$
where:
$\bar{x}_w$ = weighted mean
$w_i$ = weight corresponding to $x_i$
$x_i$ = value of the $i$th observation or measurement

Example 8:
At a certain company there are 18 construction workers, 12 painters, 7
supervisors, and 3 engineers. Their monthly salaries are 30,500, 33,700,
38,600, and 45,000, respectively. What is the weighted mean salary?
Solution:
Let 𝑤1 = 18 𝑤2 = 12 𝑤3 = 7 𝑤4 = 3
𝑥1 = 30,500 𝑥2 = 33,700 𝑥3 = 38,600 𝑥4 = 45,000

$$\bar{x}_w = \frac{\sum (x \cdot w)}{\sum w} = \frac{x_1 w_1 + x_2 w_2 + x_3 w_3 + x_4 w_4}{w_1 + w_2 + w_3 + w_4}$$

$$\bar{x}_w = \frac{30{,}500(18) + 33{,}700(12) + 38{,}600(7) + 45{,}000(3)}{18 + 12 + 7 + 3} = \frac{1{,}358{,}600}{40} = 33{,}965$$
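The weighted mean formula translates directly into code. The sketch below is a minimal illustration using the figures from Example 8; the function name `weighted_mean` is hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


salaries = [30_500, 33_700, 38_600, 45_000]   # x: monthly salary of each group
workers = [18, 12, 7, 3]                      # w: number of employees in each group
print(weighted_mean(salaries, workers))       # 33965.0
```

Note that the plain (unweighted) mean of the four salaries would ignore how many employees earn each amount, which is why the weights matter here.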

Example 9:
A certain subdivision consists of 50 homes. The table shows the
frequency distribution of the homes with respect to the number of bedrooms
each has. Find the mean number of bedrooms for the 50 homes.
No. of Bedrooms 2 3 4 5 6
No. of Homes 13 21 10 4 2

Solution:
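Here the values $x_i$ are the numbers of bedrooms and the weights $w_i$ are
the numbers of homes (the frequencies).
Let $x_1 = 2$, $x_2 = 3$, $x_3 = 4$, $x_4 = 5$, $x_5 = 6$
$w_1 = 13$, $w_2 = 21$, $w_3 = 10$, $w_4 = 4$, $w_5 = 2$
$$\bar{x}_w = \frac{\sum (x \cdot w)}{\sum w} = \frac{x_1 w_1 + x_2 w_2 + x_3 w_3 + x_4 w_4 + x_5 w_5}{w_1 + w_2 + w_3 + w_4 + w_5}$$

$$\bar{x}_w = \frac{2(13) + 3(21) + 4(10) + 5(4) + 6(2)}{13 + 21 + 10 + 4 + 2} = \frac{161}{50} = 3.22$$

The mean number of bedrooms per home is 3.22.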
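As a cross-check on Example 9, a frequency table can be expanded back into the 50 individual observations, and the ordinary mean of that list agrees with the weighted mean. The sketch below is illustrative only; the variable names are assumptions.

```python
import statistics

bedrooms = [2, 3, 4, 5, 6]        # values
homes = [13, 21, 10, 4, 2]        # frequencies, used as weights

# Repeat each value as many times as its frequency: 13 twos, 21 threes, and so on.
expanded = [b for b, count in zip(bedrooms, homes) for _ in range(count)]

weighted = sum(b * count for b, count in zip(bedrooms, homes)) / sum(homes)

print(len(expanded))               # 50
print(statistics.mean(expanded))   # 3.22
print(weighted)                    # 3.22
```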

APPLICATION

Task: Problem Solving

Directions: Solve the following problems and write your answers on a
separate sheet of paper.

1. Find the mean, median, and mode/s, if any, for the given data. Round
noninteger means to the nearest tenth.
a. 8, 3, 3, 17, 9, 22, 19
b. 11, 8, 2, 5, 17, 39, 52, 42
c. 118, 105, 110, 118, 134, 155, 166, 166, 118
d. -12, -8, -5, -5, -3, 0, 4, 9, 21
e. -8.5, -2.2, 4.1, 4.1, 6.4, 8.3, 9.7
2. A college professor administered a unit exam to one of his classes and
found that the majority of the items were too easy. The scores are 45, 39,
40, 48, 35, 37, 36, 37, 40, 44, 41, 49, 29, 28, 32, 36, 37, 41, 40, 36, 39,
30, 25, 43, and 50. Calculate the mean and median.
3. A professor grades students on 4 quizzes, a project, and a final
examination. Each quiz counts as 12% of the course grade. The project
counts as 22% of the course grade. The final examination counts as 30%
of the course grade. Student A has quiz scores of 75, 80, 85, and 90.
His project score is 95 and his final examination score is 92. Use the
weighted mean formula to find his average for the course.
4. Find the mean, median, and all modes for the data in the given frequency
distribution.

Points scored in a basketball game    Frequency
 2                                    6
 4                                    5
 5                                    6
 9                                    3
10                                    1
14                                    2
19                                    1

5. Find the mean, median, and all modes for the data in the given frequency
distribution.
Scores on an MMW quiz    Frequency
 2                        1
 4                        2
 6                        7
 7                       12
 8                       10
 9                        4
10                        3

Well done! You have just finished Lesson 2 of this module. If you are
ready, please proceed to Lesson 3, which discusses Measures of Dispersion.

Lesson 3
Measures of Dispersion

Objectives:
• Analyze the data using range, variance, and standard
deviation; and
• Solve mathematical problems involving measures of
dispersion.

Introduction

In the previous lesson, the three types of average values of a data set
were discussed. Another important characteristic of a data set is how it is
distributed, or how far each element is from some measure of central
tendency. In this lesson, statistical values that measure the spread or
dispersion of data, known as the range and the standard deviation, will be
introduced.

ABSTRACTION

There are several ways to measure the variability of the data. The most
common and most important is the standard deviation, which gives a typical
distance of each element from the mean. Several other measures are also
important and are discussed here.
Standard deviation is a statistical term that provides a good indication of
volatility. It measures how widely values are dispersed from the average.
Dispersion refers to how far the actual values lie from the average value.

A. Range
Probably the simplest and easiest measure of dispersion to determine is
the range. The range is the difference between the highest value and the
lowest value in the data set.
Advantages of the range
1. It is easy to compute; and
2. It is easy to understand.
Disadvantages of the range
1. It can be distorted by a single extreme value (or outlier); and
2. Only two values are used in the calculation.

Example 1:
The daily salaries of a sample of eight employees are 550, 420, 560,
500, 700, 670, 860, 480. Find the range.

Solution:
Step 1: Determine the highest value and the lowest value in the data set.
Highest value (HV) is 860 and the lowest value (LV) is 420
Step 2: Solve for the range
𝑅𝑎𝑛𝑔𝑒 = 𝐻𝑉 − 𝐿𝑉 = 860 − 420 = 440
Hence, the range of the daily salaries is 440.

Example 2:
Find the range of the numbers of ounces dispensed by Machine 1 in the
table below:
Machine 1 Machine 2
9.52 8.01
6.41 7.99
10.07 7.95
5.85 8.03
8.15 8.02
𝑥̅ = 8.0 𝑥̅ = 8.0

Solution:
The greatest number of ounces dispensed is 10.07 and the least is 5.85.
The range of the numbers of ounces dispensed is 10.07 − 5.85 = 4.22 𝑜𝑧.
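The range calculation is short enough to sketch in a few lines of Python. The example below uses the table from Example 2; the function name `value_range` is hypothetical, and the results are rounded only to hide floating-point noise.

```python
def value_range(values):
    """Range = highest value (HV) minus lowest value (LV)."""
    return max(values) - min(values)


machine_1 = [9.52, 6.41, 10.07, 5.85, 8.15]
machine_2 = [8.01, 7.99, 7.95, 8.03, 8.02]

print(round(value_range(machine_1), 2))  # 4.22
print(round(value_range(machine_2), 2))  # 0.08
```

Both machines have the same mean (8.0 oz), yet their ranges differ greatly, which is exactly the kind of difference a measure of dispersion is meant to reveal.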

B. Standard Deviation
One of the most widely used measures of dispersion is the standard
deviation. The more spread apart the data, the larger the standard deviation.
The standard deviation is calculated as the square root of the variance.

Procedure for computing a Standard Deviation


1. Determine the mean of the 𝑛 numbers.
2. For each number, calculate the deviation (difference) between the
number and the mean of the numbers.
3. Calculate the square of each deviation and find the sum of these
squared deviations.
4. If the data is a population, then divide the sum by 𝑛. If the data is a
sample, then divide the sum by 𝑛 − 1.
5. Find the square root of the quotient in step 4.

Example 3:
The following numbers were obtained by sampling a population: 2, 4, 7,
12, 15. Find the standard deviation of the sample.
Solution:
Step 1: The mean of the numbers is
$$\bar{x} = \frac{2+4+7+12+15}{5} = \frac{40}{5} = 8$$

Step 2: For each number, calculate the deviation between the number
and the mean.

𝑥 𝑥 − 𝑥̅
2 2 − 8 = −6
4 4 − 8 = −4
7 7 − 8 = −1
12 12 − 8 = 4
15 15 − 8 = 7

Step 3: Calculate the square of each deviation in Step 2, and find the
sum of these squared deviations.

$x$     $x - \bar{x}$       $(x - \bar{x})^2$
2       2 − 8 = −6          $(-6)^2 = 36$
4       4 − 8 = −4          $(-4)^2 = 16$
7       7 − 8 = −1          $(-1)^2 = 1$
12      12 − 8 = 4          $4^2 = 16$
15      15 − 8 = 7          $7^2 = 49$
                            Sum = 118

Step 4: Because we have a sample of $n = 5$ values, divide the sum 118
by $n - 1$, which is 4.
$$\frac{118}{4} = 29.5$$
Step 5: The standard deviation of the sample is 𝑠 = √29.5. To the
nearest hundredth, the standard deviation is 𝑠 = 5.43.
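The five-step procedure can be sketched in Python as shown below. This is a minimal illustration of the steps above, not a replacement for a statistics library; the function name `standard_deviation` and its `sample` flag are assumptions made for this example.

```python
import math


def standard_deviation(values, sample=True):
    """Follow the five-step procedure for the standard deviation."""
    n = len(values)
    mean = sum(values) / n                          # Step 1: mean of the n numbers
    deviations = [x - mean for x in values]         # Step 2: deviation of each number
    sum_of_squares = sum(d ** 2 for d in deviations)  # Step 3: sum of squared deviations
    divisor = n - 1 if sample else n                # Step 4: n - 1 for a sample, n for a population
    return math.sqrt(sum_of_squares / divisor)      # Step 5: square root of the quotient


data = [2, 4, 7, 12, 15]
print(round(standard_deviation(data), 2))  # 5.43
```

Passing `sample=False` divides by n instead of n − 1, which is the population case described in step 4 of the procedure.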

Example 4:
A consumer group has tested a sample of 8 size-D batteries from each
of 3 companies. The results of the tests are shown in the following table.
According to these tests, which company produces batteries for which the
values representing hours of constant use have the smallest standard
deviation?

Company       Hours of constant use per battery
Company A     6.2, 6.4, 7.1, 5.9, 8.3, 5.3, 7.5, 9.3
Company B     6.8, 6.2, 7.2, 5.9, 7.0, 7.4, 7.3, 8.2
Company C     6.1, 6.6, 7.3, 5.7, 7.1, 7.6, 7.1, 8.5

Solution:
The mean for each sample of batteries is 7 ℎ.
The batteries from Company A have a standard deviation of
$$s_1 = \sqrt{\frac{(6.2-7)^2 + (6.4-7)^2 + \cdots + (9.3-7)^2}{7}} \approx 1.328 \text{ h}$$
The batteries from Company B have a standard deviation of
$$s_2 = \sqrt{\frac{(6.8-7)^2 + (6.2-7)^2 + \cdots + (8.2-7)^2}{7}} \approx 0.719 \text{ h}$$
The batteries from Company C have a standard deviation of
$$s_3 = \sqrt{\frac{(6.1-7)^2 + (6.6-7)^2 + \cdots + (8.5-7)^2}{7}} \approx 0.877 \text{ h}$$
The batteries from Company B have the smallest standard deviation.
According to these results, Company B produces the most consistent
batteries with regard to life expectancy under constant use.
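The comparison in Example 4 can be reproduced with Python's standard `statistics` module, whose `stdev` function computes the sample standard deviation (dividing by n − 1). The dictionary layout and variable names below are illustrative assumptions.

```python
import statistics

batteries = {
    "Company A": [6.2, 6.4, 7.1, 5.9, 8.3, 5.3, 7.5, 9.3],
    "Company B": [6.8, 6.2, 7.2, 5.9, 7.0, 7.4, 7.3, 8.2],
    "Company C": [6.1, 6.6, 7.3, 5.7, 7.1, 7.6, 7.1, 8.5],
}

# Sample standard deviation of each company's battery life, in hours.
spread = {name: statistics.stdev(hours) for name, hours in batteries.items()}

for name, s in spread.items():
    print(name, round(s, 3))

# The company with the smallest standard deviation is the most consistent.
most_consistent = min(spread, key=spread.get)
print(most_consistent)  # Company B
```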

C. Variance
A statistic known as the variance is also used as a measure of
dispersion. The variance for a given set of data is the square of the standard
deviation of the data.

Example 5:
Find the variance for the sample given in Example 3.
Solution:
In Example 3, we found $s = \sqrt{29.5}$. The variance is the square of the
standard deviation. Thus, the variance is $s^2 = (\sqrt{29.5})^2 = 29.5$.
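The relationship between the variance and the standard deviation can be checked with the standard `statistics` module, where `variance` and `stdev` both treat the data as a sample. This is a small sketch using the data of Example 3; the variable names are assumptions.

```python
import math
import statistics

sample = [2, 4, 7, 12, 15]

s = statistics.stdev(sample)            # sample standard deviation
variance = statistics.variance(sample)  # sample variance

print(variance)                         # 29.5
print(math.isclose(s ** 2, variance))   # True
```

`math.isclose` is used for the comparison because squaring a rounded square root may differ from 29.5 in the last decimal places.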

APPLICATION

Task: Problem Solving

Directions: Solve the following problems and write your answers on a
separate sheet of paper.

1. Find the range, the standard deviation, and the variance for the given
samples. Round noninteger results to the nearest tenth.
a. 1, 2, 5, 7, 8, 19, 22
b. 3, 4, 7, 11, 12, 12, 15, 16
c. 2.1, 3.0, 1.9, 1.5, 4.8
d. 48, 91, 87, 93, 59, 68, 92, 100, 81
e. −8, −5, −12, −1, 4, 7, 11

2. A survey of 10 fast-food restaurants noted the number of calories in a
mid-sized hamburger. The results are given in the table below.
Calories in a mid-sized hamburger
514 507 502 498 496 506 458 478 463 514

Find the mean and sample standard deviation of these data. Round to
the nearest hundredth.

3. A customer at a specialty coffee shop observed the amount of time, in
minutes, that each of 20 customers spent waiting to receive an order.
The results are recorded in the table below.
Time (min) to receive order
3.2 4.0 3.8 2.4 4.7 5.1 4.6 3.5 3.5 6.2

3.5 4.9 4.5 5.0 2.8 3.5 2.2 3.9 5.3 2.9

Find the mean and sample standard deviation of these data. Round to
the nearest hundredth.
4. A survey of 16 energy drinks noted the caffeine concentration of each
drink in milligrams per ounce. The results are given in the table below.
Concentration of caffeine (mg/oz)
9.1 7.5 7.8 8.9 9.0 8.2 9.1 8.7

9.0 7.7 8.8 8.9 9.0 9.1 8.2 8.9

Find the mean and sample standard deviation of these data. Round to
the nearest hundredth.

5. A survey of 15 large cities noted the average weekly commute times, in
hours, of the residents of each city. The results are recorded in the table
below.
Weekly commute time (h)
4.5 4.0 5.8 5.4 4.7

4.0 3.6 3.9 4.7 3.7

4.6 3.4 3.5 3.9 4.4

Find the mean and sample standard deviation of these data. Round to
the nearest hundredth.

Well done! You have just finished Lesson 3 of this module. If you are
ready, please proceed to Lesson 4, which discusses Measures of Relative
Position.

Lesson 4
Measures of Relative Position

Objectives:
• Analyze the data using z-scores, percentiles, and
quartiles; and
• Solve mathematical problems involving measures of
relative position.

Introduction

In everyday language, relative positions are words that describe where
objects are in an environment, for example: top, behind, or next to. In
statistics, measures of relative position describe where a data value stands
in relation to the other values in a data set. The most common measures of
position are percentiles, quartiles, and standard scores (also known as
z-scores). In this lesson, you will work through sample problems to
understand what these measures are about.

ABSTRACTION

When presenting or analyzing a data set, it is sometimes helpful to group
subjects into several equal groups. For example, to create four equal groups
we need the values that split the data such that 25% of the observations are
in each group. The cut-off points are called quartiles, and there are three (3)
of them (the middle one is also called the median). The general term for such
cut-off points is quantiles. Other quantiles likely to be encountered are
deciles, which split the data into 10 parts, and percentiles (also called
centiles), which split the data into 100 parts. Values such as quartiles can
also be expressed as percentiles; for example, the lowest quartile is also the
25th percentile, and the median is the 50th percentile or the 5th decile.

A. Percentiles
Most standardized examinations provide scores in terms of percentiles,
which are defined as follows:

▪ 𝑝𝑡ℎ Percentile
A value 𝑥 is called the 𝑝𝑡ℎ percentile of a data set provided 𝑝% of the
data values are less than 𝑥.
Example 1:
In a recent year, the median annual salary for a physical therapist was
74,480. If the 90th percentile for the annual salary of a physical therapist was
105,900, find the percent of physical therapists whose annual salary was
a. more than 74,480
b. less than 105,900
c. between 74,480 and 105,900
Solution:
a. By definition, the median is the 50th percentile. Therefore, 50% of the
physical therapists earned more than 74,480 per year.
b. Because 105,900 is the 90th percentile, 90% of all physical therapists
made less than 105,900.
c. From parts a and b, 90% − 50% = 40% of the physical therapists
earned between 74,480 and 105,900.

▪ Percentile for a Given Value
Given a set of data and a data value $x$,
$$\text{Percentile of score } x = \frac{\text{number of data values less than } x}{\text{total number of data values}} \cdot 100$$

Example 2:
On a reading examination given to 900 students, Elaine’s score of 602
was higher than the scores of 576 of the students who took the examination.
What is the percentile for Elaine’s score?
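Solution:
$$\text{Percentile of score } x = \frac{\text{number of data values less than } x}{\text{total number of data values}} \cdot 100$$
$$\text{Percentile of score } 602 = \frac{\text{number of data values less than } 602}{\text{total number of data values}} \cdot 100 = \frac{576}{900} \cdot 100 = 64$$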

Elaine’s score of 602 places her at the 64th percentile.
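The percentile-for-a-given-value formula can be sketched as a small Python function. This is only an illustration using the counts from Example 2; the function name `percentile_of_score` and its parameters are hypothetical.

```python
def percentile_of_score(count_below, total):
    """Percent of data values that are less than the given score."""
    if total <= 0 or not 0 <= count_below <= total:
        raise ValueError("need 0 <= count_below <= total and total > 0")
    return 100 * count_below / total


# Elaine's score was higher than 576 of the 900 scores.
print(percentile_of_score(576, 900))  # 64.0
```

The multiplication by 100 is done before the division so that whole-number results such as 64 come out exactly.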

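B. Quartiles
The position of the $k$th quartile in the ordered data set is given by
$$Q_k = \frac{k(N+1)}{4}$$
where: $Q_k$ = position of the $k$th quartile
$N$ = number of values in the data set
$k$ = quartile number (1, 2, or 3)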

Example 3:
Find the first, second, and third quartiles of the ages of 9 middle-
management employees of a certain company. The ages are 53, 45, 59, 48,
54, 46, 51, 58, and 55.
Solution:
Step 1: Arrange the data in order.
45, 46, 48, 51, 53, 54, 55, 58, 59
Step 2: Find the positions of the first, second, and third quartiles.
$$Q_1 = \frac{1(N+1)}{4} = \frac{1(9+1)}{4} = \frac{10}{4} = 2.5$$
$$Q_2 = \frac{2(N+1)}{4} = \frac{2(9+1)}{4} = \frac{20}{4} = 5$$
$$Q_3 = \frac{3(N+1)}{4} = \frac{3(9+1)}{4} = \frac{30}{4} = 7.5$$

Step 3: Identify the first, second, and third quartile values in the data
set.
45, 46, 48, 51, 53, 54, 55, 58, 59
(2.5th position: between 46 and 48; 5th position: 53; 7.5th position:
between 55 and 58)

Since the 2.5th position falls between 46 and 48, and the 7.5th position
falls between 55 and 58, we can determine the first and third quartiles of the
data set by getting the average of the two values.
$$Q_1 = \frac{46+48}{2} = \frac{94}{2} = 47$$
$$Q_3 = \frac{55+58}{2} = \frac{113}{2} = 56.5$$

Therefore, 𝑄1 = 47, 𝑄2 = 53, and 𝑄3 = 56.5
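The quartile procedure of Example 3 can be sketched in Python. The function name `quartile` is hypothetical. One assumption goes beyond the text: when the position is not a whole number, the sketch interpolates linearly between the two neighboring values, which equals the average used above whenever the position ends in .5.

```python
def quartile(values, k):
    """Return the k-th quartile (k = 1, 2, 3) using the position k(N + 1) / 4."""
    ordered = sorted(values)                  # Step 1: arrange the data in order
    n = len(ordered)
    position = k * (n + 1) / 4                # Step 2: 1-based position of the quartile
    if position < 1 or position > n:
        raise ValueError("data set too small for this quartile rule")
    whole = int(position)                     # whole-number part of the position
    fraction = position - whole               # fractional part (0.5 means halfway)
    lower = ordered[whole - 1]
    if fraction == 0:
        return lower                          # the quartile is an actual data value
    upper = ordered[whole]
    # Assumption: linear interpolation; for fraction 0.5 this is the average of the two values.
    return lower + fraction * (upper - lower)


ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]
print(quartile(ages, 1), quartile(ages, 2), quartile(ages, 3))  # 47.0 53 56.5
```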

C. z-Score
A z-score is used to find the position of one observation relative to the
others in a set of data. Let us say we want to know how a student's score of
42 compares with the scores of the other students in the class on a quiz with
a total of 50 points. The mean and the standard deviation of the scores
can be used to compute the z-score, which measures the relative standing
of a measurement in a data set.
A z-score measures the distance between an observation and the mean,
measured in units of standard deviation. The following formulas show how to
compute the z-score for a data value $x$ in a population and in a sample.
$$z = \frac{x-\mu}{\sigma} \text{ (for a population)} \qquad z = \frac{x-\bar{x}}{s} \text{ (for a sample)}$$

Example 4:
The monthly expenditures of a large group of households are normally
distributed with a mean of 48,700 and a standard deviation of 10,400. What are
the $z$-values of monthly expenditures of 59,400 and 38,300?
Solution:
Using the formula for $z$, the $z$-values for the two $x$ values
(59,400 and 38,300) are computed as follows:
For $x = 59{,}400$: $$z = \frac{x-\mu}{\sigma} = \frac{59{,}400 - 48{,}700}{10{,}400} = \frac{10{,}700}{10{,}400} \approx 1.03$$
For $x = 38{,}300$: $$z = \frac{x-\mu}{\sigma} = \frac{38{,}300 - 48{,}700}{10{,}400} = \frac{-10{,}400}{10{,}400} = -1.00$$

The $z$ of about 1.03 indicates that a monthly expenditure of 59,400 is
slightly more than one standard deviation above the mean, and a $z$ of −1.00
shows that a monthly expenditure of 38,300 is exactly one standard deviation
below the mean. Note that the two household monthly expenditures lie at
nearly the same distance from the mean (10,700 above and 10,400 below), on
opposite sides of it.

Example 5:
Raul has taken two tests in his chemistry class. He scored 72 on the first
test, for which the mean of all scores was 65 and the standard deviation was 8.
He received a 60 on a second test, for which the mean of all scores was 45 and
the standard deviation was 12. In comparison to the other students, did Raul
do better on the first test or the second test?
Solution:
Find the z-score for each test.
$$z_{72} = \frac{72-65}{8} = 0.875 \qquad z_{60} = \frac{60-45}{12} = 1.25$$

Raul scored 0.875 standard deviations above the mean on the first test
and 1.25 standard deviations above the mean on the second test. These
z-scores indicate that, in comparison to his classmates, Raul scored better on
the second test than he did on the first test.
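The comparison in Example 5 can be sketched in Python as follows. The function name `z_score` is a hypothetical helper that simply applies the z-score formula; it is not taken from any library.

```python
def z_score(x, mean, std_dev):
    """Distance of x from the mean, in units of standard deviation."""
    if std_dev <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / std_dev


first_test = z_score(72, mean=65, std_dev=8)
second_test = z_score(60, mean=45, std_dev=12)

print(first_test)   # 0.875
print(second_test)  # 1.25

# The larger z-score marks the test on which Raul did better relative to the class.
better = "second" if second_test > first_test else "first"
print(better)       # second
```

Although the raw score on the first test (72) is higher, the z-scores put both results on a common scale, which is why the second test comes out ahead.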

Example 6:
A consumer group tested a sample of 100 light bulbs. It found that the
mean life expectancy of the bulbs was 842 h, with a standard deviation of 90 h.
One particular light bulb from a company had a z-score of 1.2. What was the
life span of this light bulb?

Solution:
Substitute the given values into the z-score equation and solve for $x$.
$$z_x = \frac{x-\bar{x}}{s}$$
$$1.2 = \frac{x-842}{90}$$
$$108 = x - 842$$
$$950 = x$$
Therefore, the life span of this light bulb was 950 h.
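Solving the z-score equation for x gives x = mean + z × (standard deviation). The short sketch below applies this rearrangement to Example 6; the function name `value_from_z` is hypothetical.

```python
def value_from_z(z, mean, std_dev):
    """Recover the data value x from its z-score: x = mean + z * std_dev."""
    return mean + z * std_dev


life_span = value_from_z(1.2, mean=842, std_dev=90)
print(round(life_span, 2))  # 950.0
```

The result is rounded before printing only to guard against floating-point noise in the multiplication.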

APPLICATION

Task: Problem Solving

Directions: Solve the following problems and write your answers on a
separate sheet of paper.

1. The median annual salary for an employee is 44,528. If the 25th
percentile for the annual salary of an employee is 32,761, find the
percent of employees whose annual salaries are
a. less than 44,528
b. more than 32,761
c. between 32,761 and 44,528

2. On an examination given to 8600 students, John’s score of 405 was
higher than the scores of 3952 of the students who took the examination.
What is the percentile for John’s score?

3. The following table lists the calories per 100 milliliters of 25 popular
sodas. Find the quartiles for the data.

Calories, per 100 milliliters, of Selected Sodas


43 37 42 40 53 62 36 32 50 49
26 53 73 48 45 39 45 48 40 56
41 36 58 42 39

4. A data set has a mean of $\bar{x} = 75$ and a standard deviation of 11.5. Find
the z-score for each of the following. Round to the nearest hundredth.
a. 𝑥 = 85
b. 𝑥 = 95
c. 𝑥 = 50
d. 𝑥 = 75
5. A data set has a mean of 𝑥̅ = 212 and a standard deviation of 40. Find
the z-score for each of the following. Round to the nearest hundredth.
a. 𝑥 = 200
b. 𝑥 = 224
c. 𝑥 = 300
d. 𝑥 = 100

6. A data set has a mean of 𝑥̅ = 6.8 and a standard deviation of 1.9. Find
the z-score for each of the following. Round to the nearest hundredth.
a. 𝑥 = 6.2
b. 𝑥 = 7.2
c. 𝑥 = 9.0
d. 𝑥 = 5.0
7. A data set has a mean of 𝑥̅ = 4010 and a standard deviation of 115. Find
the z-score for each of the following. Round to the nearest hundredth.
a. 𝑥 = 3840
b. 𝑥 = 4200
c. 𝑥 = 4300
d. 𝑥 = 4030

Well done! You have just finished Lesson 4 of this module. If you are
ready, please proceed to Lesson 5, which discusses the Normal Distribution.

