Module 5

This document provides an overview of statistical tools used for data management. It discusses: 1) Descriptive statistics which describe data through symbolic forms and inferential statistics which allow generalization from samples to populations. 2) Different scales of measurement for quantifying observations including nominal, ordinal, interval, and ratio scales. 3) Key statistical concepts such as population which is defined as a group sharing a common trait, and variables which can be measured and observed to vary.

Uploaded by

Joseph Alianic

SAINT JOSEPH COLLEGE OF SINDANGAN INCORPORATED

Poblacion, Sindangan, Zamboanga del Norte

Criminal Justice Education

Learning Module in Mathematics in the Modern World

MODULE 5

DATA MANAGEMENT

CORE IDEA

Statistical tools derived from mathematics are useful in processing and managing numerical data to
describe a phenomenon and predict values.

Learning Outcomes:
5. Use a variety of statistical tools to process and manage numerical data.
6. Use the methods of linear regression and correlation to predict the value of a variable given
certain conditions.
7. Advocate the use of statistical data in making important decisions.

LESSON 1 – The Data

At the end of this lesson, the student should be able to:

1. Understand the nature of statistics.

2. Gain deeper insights into the different levels of measurement.

3. Clarify the meaning of some important key concepts.

4. Explore the strengths and limitations of graphical representation.

Introduction

It is written in the Holy Book that “the truth shall set us free;” therefore, understanding statistics paves
the way towards intellectual freedom. For without sufficient knowledge about it, we may be doomed
to a life of half-truth. Statistics will provide deeper insights to critically evaluate information and to
bring us to the well-lit arena of practicality.

General Fields of Statistics: Descriptive Statistics and Inferential Statistics

Descriptive Statistics. If statistics, in general, deals with the analysis of data, then descriptive
statistics is the part of the general field that is about “describing” data in symbolic forms and
abbreviated fashions. Sometimes we deal with so much data that it is impossible to describe it as it
is, item by item.

To explore the characteristics of descriptive statistics, let us create a fictitious situation. What does it
mean if someone tells you that the majority of workers earn approximately P20,000.00 a month? Were
you able to dissect the idea behind the plain statement? Does it prompt you to question further?

This statement is a piece of information that describes a particular trait or characteristic of a group of
workers. Starting from this single piece of information, and armed with statistical inquisitiveness, we
can use descriptive statistics to describe it further, to the extent of its depth and breadth.

Inferential Statistics. We could probably argue that descriptive statistics, with its characteristic to
describe, is sufficient to depict any given information. While it is effective to describe a manageable
size of data, it can hardly engulf a sizeable amount of data. Thus, for this kind of situation, inferential
statistics is the alternative technique that can be used. Inferential statistics has the ability to “infer”
and to generalize and it offers the right tool to predict values that are not really known.

Let us consider the fictitious situation we made under descriptive statistics, but this time, instead of
reporting the approximate monthly earnings of some workers, we want to estimate the monthly
earnings of all the workers in a certain region. With descriptive statistics alone, this would be
impossible, because we cannot ask all the workers in the entire region about their monthly income.
Using inferential statistics, we would instead select just a small number of workers, ask them about
their monthly income, and use their answers to predict or approximate, in a more-or-less fashion, the
monthly income of all workers in the entire region.

Of course, inference or generalization is a risky process. That is why we need to ensure that the small
group of workers we select is approximately representative of the workers in the entire region.
Nevertheless, this inference or prediction is better than chance accuracy.

Measurement

Measurement essentially means quantifying an observation according to a certain rule. For instance,
the presence of fever can be quantified by using a thermometer. Body weight can be determined by
using a weighing scale. Mental ability can be quantified by using a written examination that generates
scores. Sometimes the quantification can be done by simply counting. In quantifying an observation,
there are two types of quantitative information: variables and constants. A variable is something that
can be measured and observed to vary, while a constant is something that does not vary and
maintains a single value.

Scales of Measurement

- Nominal Scale : Categorical Data

- Ordinal Scale : Ranked Data

- Interval/Ratio Scale : Measurement Data

To quantify an observation, it is necessary to identify its scale of measurement, also known as its level
of measurement. The scale of measurement is the gateway to the fascinating world of statistics.
Without sufficient knowledge of it, all our statistical learning leads nowhere.

Nominal Scale. It is concerned with categorical data. It simply means using numbers to label
categories. This is done by counting the frequency of occurrence within categories. One condition is
that the categories must be independent or mutually exclusive. This implies that once something is
identified under a certain category, it cannot be assigned at the same time to another category.

For example, suppose we want to measure a group of people according to marital status. We can
categorize marital status by simply assigning a number, for instance “1” for single and “2” for married.

Obviously, those numbers only serve as labels and they do not contain any numerical weight. Thus,
we cannot say that married people (having been labelled 2) have more marital status than single
people (having been labelled 1).

Ordinal Scale: It is concerned with ranked data. There are instances wherein comparison is necessary
and cannot be avoided. The ordinal scale ranks the observations in order to generate information to
the extent of “greater than” or “less than”. But the ranked data generated is also limited to the extent
of “greater than” or “less than”. It cannot tell us how much greater or how much less.

The ordinal scale is best illustrated in sports activities like a fun run. Finding the order of finish among
the participants in a fun run always produces a ranking. However, ranked data cannot provide
information about the difference in time between the 1st placer and the 2nd placer. Relative to this,
reading reports with ordinal information is also tricky. For example, a TV commercial extols a certain
brand for being the number one product in the country. This may seem acceptable, but if you learn
that there is no other product, then the message of the commercial will definitely be received with a
smirk.

Interval Scale: It deals with measurement data. In the nominal scale, we use numbers to label
categories while in the ordinal scale we use numbers to merely provide information regarding greater
than or less than. However, in interval scale we assign numbers in such a way that there is meaning
and weight on the value of points between intervals. This scale of measurement provides more
information about the data. Consider the comparative illustration below:

Academic performance of five students in a certain class

                Student A   Student B   Student C   Student D   Student E
Interval Data   99          74          73          70          70
Ordinal Data    1st         2nd         3rd         4th         5th
Nominal Data    Passed      Failed      Failed      Failed      Failed

As you may have noticed, the interval scale provides substantial information about the grades of the
students. Student A earned a grade of 99, and so on and so forth. Now look at the information given
by the ordinal data. It is simply about ranking. With this kind of information, Student B can proudly
and rightfully claim 2nd place in the ranking. The ordinal scale is a trusted friend that keeps a secret:
the grade of Student B, though claiming 2nd place, is actually 74. Let us analyze the nominal data in
our example. With this scale, the school can only, and sadly, announce that one student passed and
four students failed. Nominal data cannot provide more information, and in particular cannot put a
brighter limelight on Student A. The audience may assume that Student A got a grade just a little
higher than the passing mark, and Student A's grade of 99 will remain hidden forever.

Ratio Scale. This is an extension of the interval scale. It also pertains to measurement data, but the
ratio scale additionally assumes an absolute (true) zero point. Because of this, we oftentimes cannot
utilize the ratio scale in the social sciences. We cannot justify an absolute zero with which to gauge
intelligence. We cannot say that Student A, with a grade of 99, has an intelligence several points
superior to that of Student E, who barely but successfully achieved a grade of 70.

Key Concepts in Statistics

Population. A population can be defined as an entire group of people, things, or events having at least
one trait in common (Sprinthall, 1994). A common trait is the binding factor that lets us group a
cluster and call it a population. Merely having a cluster of people, things, or events does not make a
population. At least one common trait must be established to make a population. On the other hand,
adding too many common traits can also limit the size of the population. In the illustration below,
notice how each added trait can severely reduce the size or membership of the population.

A group of students (this is a population, since the common trait is “students”)

A group of male students

A group of male students attending the Statistics class

A group of male students attending the Statistics class with iPhone

A group of male students attending the Statistics class with iPhone and Earphone

As we read the list, we can mentally visualize that the size of the population is dramatically
becoming smaller and as we add more traits we may wonder if anyone still qualifies. The more
common traits we add, the more we reduce the designated population.

Parameter. In gauging the entire population, any measure obtained is called a parameter. If someone
asks you what the parameter of a study is, bear in mind that the question refers to a measure that
describes the entire population (for example, its mean), not a sample of it. In situations where the
entire population is difficult to measure, the parameters take the form of estimates or inferences.

Sample. The small number of observations taken from the total number making up a population is
called a sample. As long as the observations or data are not the totality of the entire population, they
are always considered a sample. For instance, in a population of 100, 1 is considered a sample. 30 is
clearly a sample. It may seem absurd, but 99 taken from 100 is still considered a sample. Not until we
include that last number (making it 100) could we claim that it is already a population and no longer
a sample.

Statistic. In gauging the sample, any measure obtained from the sample is called a statistic.
Whenever we describe the sample, the measure we use is a statistic. Since a sample is easier to
observe or gather than the population, statistics are simpler to gather than parameters.

Graphs. A graph is another way to visually show the behavior of data. To create a graph, the
distribution of scores must first be organized. For instance, presenting the scores below in an
unorganized manner provides confusing information or no information at all. Reporting raw scores
can even keep some significant scores from being noticed.

120, 65, 110, 75, 105, 80, 105,

85, 100, 85, 100, 90, 95, 90, 90

But when we arrange the scores from highest to lowest, which is a form of score distribution, some
pieces of information are gradually brought forth and exposed.

Distribution of Scores

120, 110, 105, 105, 100, 100, 95, 90, 90, 90, 85, 85, 80, 75, 65

The score distribution can still be organized in a form of a frequency distribution. Frequency
distribution provides information about raw scores, and the frequency of occurrences. Frequency
distribution provides clearer insights about the behavior of scores.
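
The step from raw scores to a frequency distribution can be sketched in a few lines of Python. This is only an illustration, not part of the module's method: it uses the 15 scores listed above, and the variable names `scores`, `frequency`, and `frequency_table` are chosen here for the example.

```python
from collections import Counter

# Raw scores from the example above.
scores = [120, 65, 110, 75, 105, 80, 105, 85, 100, 85, 100, 90, 95, 90, 90]

# Count how many times each raw score occurs.
frequency = Counter(scores)

# List the distribution from the highest score to the lowest.
frequency_table = sorted(frequency.items(), reverse=True)

print("Score  Frequency")
for score, count in frequency_table:
    print(f"{score:>5}  {count:>9}")
```

In the printed table, the score 90 appears with a frequency of 3, which is harder to see in the unorganized list.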

Another alternative way of presenting data in frequency distribution is to present them in a tabular
form. A tabular form has the advantage of showing the visual representation of the data. This kind of
presentation is more appealing to the general audience.

Another way of showing the data in graphical form is by using Microsoft Excel, as also illustrated in
the graphs below. It is the frequency polygon of the scores in our cited example above.

Notice in the illustration of the frequency polygon that the two graphs may appear different, but they
are actually the same and they disclose the same information. This illustration will allow you to
realize that unless you see things with a critical eye, a graph can create a false impression of what the
data really reveal. This is an obvious situation showing how graphs can be used to distort reality if
you are not equipped with a critical statistical mind. This type of deceitful cleverness in distorting
graphs is common in some corporations, which use it to camouflage weak figures and to portray
gigantic leaps in sales in order to attract more clients or buyers.

Learning Activity 1

Indicate which scale of measurement (nominal, ordinal, or interval) is being used.

1. The Globe and Smart phone number prefixes 0917 and 0923 served 1 million
and 2.5 million subscribers, respectively.

2. The Philippine Statistics Office announces that the average height of
Filipino males is 156.41 cm.

3. The Postal Office shows that 4,231 individuals have a zip code of 4231.

4. The Sportsfest committee posted the names of individuals with their order
of finish for the first 50 runners to reach the finish line.

5. The University Admission Office posted the names and scores of student
applicants who took the entrance examination.

MEASURES OF CENTRAL TENDENCY

Discussion

As we venture into the realm of descriptive statistics, let us now focus on describing the nature of
quantitative data. By using an appropriate descriptive technique, we can organize and neatly
summarize both small and large data distributions. The procedure, utilizing measures of central
tendency, allows us to precisely describe the centrality of a data distribution.

Measures of central tendency are methods that can be used to determine information regarding the
average, ranking, and category of any data distribution. The mean, median, and mode are the three
tools for obtaining the measures of central tendency. But only by knowing and using the appropriate
tool can the most accurate estimate of centrality be achieved. The objective of the measures of
central tendency is to describe the centrality of the distribution in a single numerical unit. This single
numerical unit must provide a clear description of the common trait being observed in the
distribution of scores.
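
As a quick illustration of the three tools, the sketch below computes the mean, median, and mode of the 15 scores used in the earlier lesson on graphs. It relies on Python's standard `statistics` module, which is an assumption of this example and not something the module itself prescribes.

```python
import statistics

# The 15 scores used in the lesson on graphs.
scores = [120, 65, 110, 75, 105, 80, 105, 85, 100, 85, 100, 90, 95, 90, 90]

mean_score = statistics.mean(scores)      # sum of the scores divided by their count
median_score = statistics.median(scores)  # middle value of the sorted scores
mode_score = statistics.mode(scores)      # most frequently occurring score

print(mean_score, median_score, mode_score)  # 93 90 90
```

Here the three measures are close to one another, which is what we expect from a fairly balanced distribution.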

In this example, the mean is an appropriate measure of central tendency because the distribution is
fairly well-balanced. This means that there are no extremely high or extremely low scores in either
direction that can unusually influence the average of the scores. Thus, the mean value of 190,083.00
represents the total picture of the distribution (i.e. annual incomes). This means that in a “more or
less” or approximate fashion it describes the entire distribution.

Mean of Skewed Distribution. There are situations wherein the mean cannot be trusted to provide a
measure of central tendency because it portrays an extremely distorted picture of the average value of
a distribution of scores. For instance, let us still consider our example of annual incomes, but this time
with an adjustment. Let us introduce another score: the annual income of an affluent new neighbor
who moved to this town just recently. This new neighbor has an annual income extremely far above
the others.

As you may have noticed, the mean income of Php 367,769.00 this time provides a highly misleading
picture of great prosperity for this neighborhood. The distribution was unbalanced by the extreme
score of the new affluent neighbor. This is what we call a skewed distribution.

Here are some graphic illustrations of a skewed distribution:

When the tail goes to the right, the curve is positively skewed; when it goes to the left, it is negatively
skewed. The skew is in the direction of the tail-off of scores, not of the majority of scores. The mean
is always pulled toward the extreme score in a skewed distribution. When the extreme score is at the
low end, then the mean is too low to reflect centrality. When the extreme score is at the high end, the
mean is too high.
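
The pull of an extreme score on the mean can be sketched numerically. The incomes below are hypothetical values invented for this illustration (they are not the module's neighborhood data); the point is only to compare how the mean and the median react to one extreme score.

```python
import statistics

# Hypothetical annual incomes (in pesos) for a fairly balanced neighborhood.
incomes = [180_000, 185_000, 190_000, 193_000, 202_000]
balanced_mean = statistics.mean(incomes)
balanced_median = statistics.median(incomes)

# Add one hypothetical affluent neighbor with an extremely high income.
skewed = incomes + [2_500_000]
skewed_mean = statistics.mean(skewed)
skewed_median = statistics.median(skewed)

print(balanced_mean, balanced_median)  # 190000 190000
print(skewed_mean, skewed_median)      # 575000 191500.0
```

One extreme score triples the mean, while the median moves only slightly. This is why the mean is too high to reflect centrality in a positively skewed distribution.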

The Median

The median is the point that separates the upper half from the lower half of the distribution. It is the
middle point or midpoint of any distribution. If the distribution is made up of an even number of
scores, the median can be found by determining the point that lies halfway between the two
middlemost scores.

193,000.00
190,000.00
185,000.00 MEDIAN
180,000.00

The Mode

Another measure of central tendency is called the mode. It is the most frequently occurring score in a
distribution. In a histogram, the mode is always located beneath the tallest bar.

The mode provides an extremely fast way of knowing the centrality of the distribution. You can
immediately spot the mode by simply looking at the data and finding the dominant constant, that is,
the most frequently occurring score.

Appropriate Use of the Mean, Median and Mode

The best way to illustrate the comparative applicability of the mean, median and mode is to look
again at the skewed distribution.

Distribution of monthly income per household in a certain municipality.

Income distributions are almost always skewed to the right because the low end has a fixed limit of
zero while the high end has no limit. If we consider that the area of the curve is 100 percent, then the
median is the exact midpoint of the distribution. The areas below and above the median are each
equal to 50 percent. Thus, if the median income is P20,000.00, this means that 50% of the households
have an income below P20,000.00 and 50% of the households have an income above P20,000.00. On
the other hand, the mean in our figure above indicates a much higher income of P100,000.00, because
it is pulled toward the few very high incomes in this positively skewed curve. The value of the mean
gives a distorted picture.

Effects of the Scale of Measurement Used

The scale of measurement on which the data are based oftentimes dictates the measure of central
tendency to be used. Interval data can entertain the calculation of all three measures of central
tendency. Nominal and ordinal data cannot be used to calculate the mean. A mean computed from
ordinal data can give an extremely confusing and wrong result. Since the median is about ranking,
with half of the scores falling above it and half falling below it, an ordinal arrangement is necessary
in finding the median. For nominal data, however, neither the mean nor the median can be used.
Nominal data are restricted to using a number as a label for a category, and the only measure of
central tendency permissible for nominal data is the mode.

In summary, if the interval data distribution is fairly well balanced, it is appropriate to use the mean to
measure the central tendency. If the distribution of the interval data is skewed, you may either remove
the outlier or adopt the median. If the interval data distribution shows a significant clustering of
scores, then visually analyze the scores and look for the dominant constant, which is the mode.

Learning Activity 2

1. A class of 13 students takes a 20-item quiz on Science 101. Their scores were as follows: 11,
11, 13, 14, 15, 18, 19, 9, 6, 4, 1, 2, 2.

a. Find the mean. b. Find the median c. Find the mode.

2. A day after, the class of 13 students mentioned in problem 1 takes the same test a second time. This
time their scores were: 10, 10, 10, 10, 11, 13, 19, 9, 9, 8, 1, 7, 8.

a. Find the mean. b. Find the median c. Find the mode.

d. Was there a difference in their performance when taking the test a second time?

3. For the set of scores: 1000, 50, 120, 170, 120, 90, 30, 120.

a. Find the mean. b. Find the median c. Find the mode.

d. Which measure of central tendency is the most appropriate, and why?

MEASURES OF POSITION

Learning Objectives:

1. Illustrate the measures of position.

2. Determine the different measures of position.

3. Illustrate quartiles, deciles, and percentiles.

4. Give an example of each measure of position.

5. Identify the given values in a set of data.

6. Determine the formula for the different measures of position.

7. Find the measures of position of the data using the formula.

8. Take note of the definitions of the different terms.

9. Solve all the problems with accuracy.

10. Perform the specific activities or tasks and complete the exercises and
assessments provided.

QUARTILE

Quartiles are values that divide a set of data into four equal parts. Each part contains a quarter of the
data. Quartiles are calculated only after the data have been sorted. Values are said to be sorted if they
are arranged in ascending order.
There are three quartiles, denoted by Q1, Q2, and Q3:

Q1 (First Quartile): This value is the median of the first half of the data. It separates the bottom
$\frac{1}{4}$ of the sorted values from the upper $\frac{3}{4}$ of the values. If there are $N$ sorted
values, about 25% of the $N$ values are less than or equal to $Q_1$ and about 75% are greater than
or equal to $Q_1$.

Q2 (Second Quartile): This value is the median of the entire data. It separates the bottom
$\frac{1}{2}$ of the sorted values from the upper $\frac{1}{2}$ of the values. If there are $N$ sorted
values, about 50% of the $N$ values are less than or equal to $Q_2$ and about 50% are greater than
or equal to $Q_2$.

Q3 (Third Quartile): This value is the median of the second half of the data. It separates the bottom
$\frac{3}{4}$ of the sorted values from the upper $\frac{1}{4}$ of the values. If there are $N$ sorted
values, about 75% of the $N$ values are less than or equal to $Q_3$ and about 25% are greater than
or equal to $Q_3$.

25% 25% 25% 25%

minimum Q1 Q2 Q3 maximum

QUARTILES OF UNGROUPED DATA

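For ungrouped data, if there are $n$ observations in a sorted set of data, each formula below gives the
position of the quartile in the sorted list:
a. lower quartile or first quartile (Q1)
$$Q_1 = \frac{n+1}{4}\text{th observation}$$
b. middle quartile or second quartile (Q2)
$$Q_2 = \frac{2(n+1)}{4}\text{th observation}$$
c. upper quartile or third quartile (Q3)
$$Q_3 = \frac{3(n+1)}{4}\text{th observation}$$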
Example 1
The coach of a basketball team wants to find out how long it takes an individual player to complete
one lap around the basketball court. He tested 15 players and recorded the times given below
(in seconds).
6, 9, 7, 5, 15, 10, 9, 7, 4, 9, 10, 9, 8, 6, 7
Determine the median time, as well as the lower and upper quartiles.
Solution
Step 1: Determine the number of observations
➢ the number of observations is $n = 15$

Step 2: Arrange the data

➢ 4, 5, 6, 6, 7, 7, 7, 8, 9, 9, 9, 9, 10, 10, 15
Step 3: Compute using the formula
➢ For $Q_1$
$$Q_1 = \frac{n+1}{4} = \frac{15+1}{4} = \frac{16}{4} = 4$$

$Q_1$ = 4th observation

$Q_1 = 6$

➢ For $Q_2$
$$Q_2 = \frac{2(n+1)}{4} = \frac{2(15+1)}{4} = \frac{32}{4} = 8$$

$Q_2$ = 8th observation

$Q_2 = 8$

➢ For $Q_3$
$$Q_3 = \frac{3(n+1)}{4} = \frac{3(15+1)}{4} = \frac{48}{4} = 12$$

$Q_3$ = 12th observation

$Q_3 = 9$

Thus, $Q_1 = 6$ seconds, $Q_2 = 8$ seconds and $Q_3 = 9$ seconds.

Example 2
The following are the test scores of 20 students for the first quarterly exam in mathematics.
Find the three quartiles.

72 84 70 80 62 78 44 74 72 82
72 56 74 58 75 64 78 64 79 82
Solution
Step 1: Determine the number of observations
➢ the number of observations is $n = 20$
Step 2: Arrange the data
➢ 44, 56, 58, 62, 64, 64, 70, 72, 72, 72, 74, 74, 75, 78, 78, 79, 80, 82, 82, 84

Step 3: Compute using the formula

➢ For $Q_1$
$$Q_1 = \frac{n+1}{4} = \frac{20+1}{4} = \frac{21}{4} = 5.25 \approx 6$$

$Q_1$ = 6th observation

$Q_1 = 64$

➢ For $Q_2$
$$Q_2 = \frac{2(n+1)}{4} = \frac{2(20+1)}{4} = \frac{42}{4} = 10.5 \approx 11$$

$Q_2$ = 11th observation

$Q_2 = 74$

➢ For $Q_3$
$$Q_3 = \frac{3(n+1)}{4} = \frac{3(20+1)}{4} = \frac{63}{4} = 15.75 \approx 16$$

$Q_3$ = 16th observation

$Q_3 = 79$

Thus, in the result of the first quarterly exam in Math, $Q_1 = 64$, $Q_2 = 74$ and $Q_3 = 79$.
This means that:
1. 64 represents the first quartile. 25% of the scores are below 64.
2. 74 represents the second quartile. 50% of the scores are below 74.
3. 79 represents the third quartile. 75% of the scores are below 79.
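
The two worked examples can be checked with a short Python sketch. It assumes the convention used in Example 2, where a fractional position such as 5.25 is rounded up to the next whole observation. Other textbooks interpolate between neighboring values instead, so this function is an illustration of this module's convention only. The function name `quartile_ungrouped` is made up for the example.

```python
import math

def quartile_ungrouped(data, k):
    """Return the k-th quartile (k = 1, 2, 3) using the position k(n + 1)/4.

    Assumption: a fractional position is rounded up to the next whole
    observation, as in Example 2 (5.25 -> 6th observation).
    """
    ordered = sorted(data)
    n = len(ordered)
    position = k * (n + 1) / 4
    index = min(math.ceil(position), n)  # 1-based position, kept inside the list
    return ordered[index - 1]

lap_times = [6, 9, 7, 5, 15, 10, 9, 7, 4, 9, 10, 9, 8, 6, 7]
test_scores = [72, 84, 70, 80, 62, 78, 44, 74, 72, 82,
               72, 56, 74, 58, 75, 64, 78, 64, 79, 82]

print([quartile_ungrouped(lap_times, k) for k in (1, 2, 3)])    # [6, 8, 9]
print([quartile_ungrouped(test_scores, k) for k in (1, 2, 3)])  # [64, 74, 79]
```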

QUARTILES OF GROUPED DATA
Let us find out from the following example how we can compute 𝑄1 , 𝑄2 and 𝑄3 for grouped data.
Example 3
A survey was conducted among 500 families to find out the length of time they spent watching
TV per day. The following data were obtained:

Hours       No. of Families
0 – 1       55
2 – 3       87
4 – 5       145
6 – 7       90
8 – 9       73
10 – 11     35
12 – 13     15

To solve for the median or $Q_2$, the following steps are recommended:

1. Start by constructing a cumulative frequency (cf) column. This is done by successively adding the
frequencies starting from the lowest class interval.

2. Identify the median class. The median class is the interval containing $\frac{N}{2}$ in the
cumulative frequency column.

3. Solve for the median by substituting the corresponding values of the variables in the formula.

Interval    Midpoint (x)   Frequency (f)   Cumulative Frequency
12 – 13     12.5           15              500
10 – 11     10.5           35              485
8 – 9       8.5            73              450
6 – 7       6.5            90              377
4 – 5       4.5            145             287
2 – 3       2.5            87              142
0 – 1       0.5            55              55
Total                      500

To get the cumulative frequency, add the frequency from this row to the cumulative frequency of the
previous row, e.g. 87 + 55 = 142.

Next, identify the class interval to which $\frac{N}{2}$ belongs.
$$\frac{N}{2} = \frac{500}{2} = 250$$
The class interval to which 250 belongs is 4 – 5 because its cumulative frequency, 287, is the first to
reach one half of the total frequency, which is 250. The lower class boundary of the median class is
3.5, so

$\frac{N}{2} = 250$

$L = 3.5$

$<cf = 142$

$f_m = 145$

$i = 2$

The lower class boundary is formed by subtracting 0.5 units from the lower class limit. The upper
class boundary is formed by adding 0.5 units to the upper class limit.

The median is
$$\text{median} = 3.5 + \left(\frac{250 - 142}{145}\right) 2 \approx 4.99$$
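
The three recommended steps can be traced in Python as a rough sketch. The class limits and frequencies come from the survey table. The variable names, and the rule of taking the first class whose cumulative frequency reaches N/2 as the median class, are assumptions made for this illustration.

```python
from itertools import accumulate

# (lower limit, upper limit, number of families), lowest class interval first.
classes = [(0, 1, 55), (2, 3, 87), (4, 5, 145), (6, 7, 90),
           (8, 9, 73), (10, 11, 35), (12, 13, 15)]

# Step 1: build the cumulative frequency column.
frequencies = [f for _, _, f in classes]
cumulative = list(accumulate(frequencies))  # [55, 142, 287, 377, 450, 485, 500]
n = cumulative[-1]                          # total frequency, 500
half = n / 2                                # 250.0

# Step 2: the median class is the first class whose cf reaches N/2.
m = next(i for i, cf in enumerate(cumulative) if cf >= half)
lower, upper, f_m = classes[m]
lower_boundary = lower - 0.5                    # 3.5
width = upper - lower + 1                       # 2
cf_before = cumulative[m - 1] if m > 0 else 0   # 142

# Step 3: substitute into the formula.
median = lower_boundary + ((half - cf_before) / f_m) * width
print(round(median, 2))  # 4.99
```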

The following are the formulas for computing Q1 and Q3 for grouped data:

$$Q_1 = X_{LB} + \left(\frac{\frac{N}{4} - cf}{f_{Q_1}}\right) i$$

where
$X_{LB}$ = lower boundary of the $Q_1$ class
$N$ = total frequency
$cf$ = cumulative frequency before the $Q_1$ class
$f_{Q_1}$ = frequency of the $Q_1$ class
$i$ = size of the class interval

$$Q_3 = X_{LB} + \left(\frac{\frac{3N}{4} - cf}{f_{Q_3}}\right) i$$

where
$X_{LB}$ = lower boundary of the $Q_3$ class
$N$ = total frequency
$cf$ = cumulative frequency before the $Q_3$ class
$f_{Q_3}$ = frequency of the $Q_3$ class
$i$ = size of the class interval

Compute for the three quartiles of the height of 50 Filipino children, 7 to 12 years of age.

Class Interval
Height (cm)     f     cf
134 – 139       10    50
128 – 133       9     40
122 – 127       8     31
116 – 121       1     23
110 – 115       5     22
104 – 109       2     17
98 – 103        9     15
92 – 97         5     6
86 – 91         1     1
                N = 50

For the 1st Quartile

Class: $\frac{N}{4} = \frac{50}{4} = 12.5$

Class interval: 98 – 103

$X_{LB} = 97.5$, $cf_b = 6$, $f_{Q_1} = 9$, $i = 6$

$$Q_1 = X_{LB} + \left(\frac{\frac{N}{4} - cf_b}{f_{Q_1}}\right) i = 97.5 + \left(\frac{12.5 - 6}{9}\right) 6 \approx 101.83$$

For the 2nd Quartile

Class: $\frac{2N}{4} = \frac{2(50)}{4} = 25$

Class interval: 122 – 127

$X_{LB} = 121.5$, $cf_b = 23$, $f_{Q_2} = 8$, $i = 6$

$$Q_2 = X_{LB} + \left(\frac{\frac{2N}{4} - cf_b}{f_{Q_2}}\right) i = 121.5 + \left(\frac{25 - 23}{8}\right) 6 = 123$$

For the 3rd Quartile

Class: $\frac{3N}{4} = \frac{3(50)}{4} = 37.5$

Class interval: 128 – 133

$X_{LB} = 127.5$, $cf_b = 31$, $f_{Q_3} = 9$, $i = 6$

$$Q_3 = X_{LB} + \left(\frac{\frac{3N}{4} - cf_b}{f_{Q_3}}\right) i = 127.5 + \left(\frac{37.5 - 31}{9}\right) 6 \approx 131.83$$

This means that 25% of the data is below $Q_1 = 101.83$, 50% of the data is below $Q_2 = 123$ and
75% of the data is below $Q_3 = 131.83$.

In terms of the 50 children, about 12 to 13 children (25%) have a height below 101.83 cm; 25 children
(50%) have a height of 123 cm and below; and about 37 to 38 children (75%) have a height of
131.83 cm and below.
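
The substitutions above can be double-checked with a small Python sketch. The function `grouped_quartile` is a hypothetical helper written for this illustration. It applies the grouped-data formula to values that have already been read off the table, and it does not locate the quartile class by itself.

```python
def grouped_quartile(lower_boundary, target, cf_before, class_frequency, class_width):
    """Apply X_LB + ((target - cf_b) / f) * i, where target is kN/4."""
    return lower_boundary + ((target - cf_before) / class_frequency) * class_width

n = 50  # total frequency of the height table

q1 = grouped_quartile(97.5, 1 * n / 4, 6, 9, 6)    # class 98 - 103
q2 = grouped_quartile(121.5, 2 * n / 4, 23, 8, 6)  # class 122 - 127
q3 = grouped_quartile(127.5, 3 * n / 4, 31, 9, 6)  # class 128 - 133

print(round(q1, 2), round(q2, 2), round(q3, 2))  # 101.83 123.0 131.83
```

Rounded to two decimal places, the first quartile evaluates to 101.83 cm.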
DECILE
Just like the quartiles, the deciles are values which divide a collection of data into equal
parts to analyze relationships. In statistics, even the most meager relationships can be significant in
tracing the existence of a pattern, or in finding solutions to problems.

What are deciles?

Deciles are the score-points that divide a distribution into 10 equal parts. The deciles are computed in
the same way the quartiles are computed.

Thus, for ungrouped data, we use the formula

$$D_k = \frac{k(N+1)}{10}\text{th observation}$$

and for grouped data, we use

$$D_k = X_{LB} + \left(\frac{\frac{kN}{10} - cf_b}{f_{D_k}}\right) i$$

where $1 \le k \le 9$.

Example 1. A garment factory owner keeps track of the quality of all merchandise produced before
it is sold in the market. She patiently checks the garments and records all those with defects.
For the past 24 months, the owner noted the following number of defective items produced each
month:

45, 30, 36, 16, 21, 33, 40, 32, 14, 10, 29, 23, 39, 17, 11, 18, 34, 19, 24, 21, 26, 35, 42, 37

Find the 5th decile.

Solution:

Step 1. Find the number of observations

➢ Number of observations = 24

Step 2. Arrange the data

➢ 10, 11, 14, 16, 17, 18, 19, 21, 21, 23, 24, 26, 29, 30, 32, 33, 34, 35, 36, 37, 39, 40, 42, 45

Step 3. Apply the formula

$$D_k = \frac{k(N+1)}{10}$$

$$D_5 = \frac{5(24+1)}{10} = 12.5\text{th observation}$$

The 12.5th observation lies between the 12th and 13th values in the list, which are 26 and 29.

$$D_5 = 26 + \frac{29 - 26}{2} = 26 + 1.5 = 27.5$$
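
The computation above takes the point halfway between the 12th and 13th values. One way to sketch this in Python is with linear interpolation, which gives the same halfway point when the position ends in .5. Treating every fractional position this way is an assumption of this illustration, not something the module states, and the function name `decile_ungrouped` is invented here.

```python
import math

def decile_ungrouped(data, k):
    """Return the k-th decile (1 <= k <= 9) at position k(n + 1)/10.

    Assumption: a fractional position is resolved by linear interpolation
    between the two neighboring observations.
    """
    ordered = sorted(data)
    n = len(ordered)
    position = k * (n + 1) / 10        # 1-based, may be fractional
    lower_pos = math.floor(position)
    if lower_pos < 1 or lower_pos > n:
        raise ValueError("position falls outside the data")
    fraction = position - lower_pos
    lower_value = ordered[lower_pos - 1]
    if fraction == 0 or lower_pos == n:
        return lower_value
    upper_value = ordered[lower_pos]
    return lower_value + fraction * (upper_value - lower_value)

# Sorted monthly counts of defective items from Step 2.
defects = [10, 11, 14, 16, 17, 18, 19, 21, 21, 23, 24, 26,
           29, 30, 32, 33, 34, 35, 36, 37, 39, 40, 42, 45]

print(decile_ungrouped(defects, 5))  # 27.5
```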

Example 2. Compute for the $D_3$, $D_6$, and $D_9$ of the height of 50 Filipino children, with ages
7 – 12 years old.

Use the grouped data below.

Class Interval
Height (cm)     f     cf
134 – 139       10    50
128 – 133       9     40
122 – 127       8     31
116 – 121       1     23
110 – 115       5     22
104 – 109       2     17
98 – 103        9     15
92 – 97         5     6
86 – 91         1     1
                N = 50

For the 3rd Decile

Class: $\frac{3N}{10} = \frac{3(50)}{10} = 15$

Class interval: 98 – 103

$X_{LB} = 97.5$, $cf_b = 6$, $f_{D_3} = 9$, $i = 6$

$$D_3 = X_{LB} + \left(\frac{\frac{3N}{10} - cf_b}{f_{D_3}}\right) i = 97.5 + \left(\frac{15 - 6}{9}\right) 6 = 103.5$$

For the 6th Decile

Class: $\frac{6N}{10} = \frac{6(50)}{10} = 30$

Class interval: 122 – 127

$X_{LB} = 121.5$, $cf_b = 23$, $f_{D_6} = 8$, $i = 6$

$$D_6 = X_{LB} + \left(\frac{\frac{6N}{10} - cf_b}{f_{D_6}}\right) i = 121.5 + \left(\frac{30 - 23}{8}\right) 6 = 126.75$$

For the 9th Decile

Class: $\frac{9N}{10} = \frac{9(50)}{10} = 45$

Class interval: 134 – 139

$X_{LB} = 133.5$, $cf_b = 40$, $f_{D_9} = 10$, $i = 6$

$$D_9 = X_{LB} + \left(\frac{\frac{9N}{10} - cf_b}{f_{D_9}}\right) i = 133.5 + \left(\frac{45 - 40}{10}\right) 6 = 136.5$$

This means that 30% of the children have a height of 103.5 cm and below, 60% of the children have a
height of 126.75 cm and below, and 90% of the children have a height of 136.5 cm and below.
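
As a cross-check of the three deciles, here is a Python sketch that also locates the decile class automatically. It assumes the decile class is the first class whose cumulative frequency reaches kN/10, that class boundaries lie 0.5 below the lower limits, and that the limits are whole numbers. The function name `grouped_decile` is made up for this example.

```python
from bisect import bisect_left
from itertools import accumulate

# (lower limit, upper limit, frequency) for the height table, lowest class first.
height_classes = [(86, 91, 1), (92, 97, 5), (98, 103, 9), (104, 109, 2),
                  (110, 115, 5), (116, 121, 1), (122, 127, 8),
                  (128, 133, 9), (134, 139, 10)]

def grouped_decile(classes, k):
    """Return D_k = X_LB + ((kN/10 - cf_b) / f) * i for grouped data."""
    cumulative = list(accumulate(f for _, _, f in classes))
    n = cumulative[-1]
    target = k * n / 10
    # Index of the first class whose cumulative frequency reaches the target.
    idx = bisect_left(cumulative, target)
    lower, upper, f = classes[idx]
    cf_before = cumulative[idx - 1] if idx > 0 else 0
    lower_boundary = lower - 0.5
    width = upper - lower + 1
    return lower_boundary + ((target - cf_before) / f) * width

print([grouped_decile(height_classes, k) for k in (3, 6, 9)])  # [103.5, 126.75, 136.5]
```

With these inputs the sixth decile evaluates to 126.75 cm.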

PERCENTILE
Percentiles are the values of arranged data that divide the whole data set into one hundred equal parts.
Schools very often use percentiles to rank students based on their academic performance.

In common use, a percentile indicates that a certain percentage of the data falls below the given
percentile rank. For example, if you are ranked in the 25th percentile, then 25% of the test scores are below
your score. Expressed another way, when you rank in the 25th percentile, the remaining 75% of the test
scores are at or above your score.
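
To make the idea of a percentile rank concrete, the percentage of scores below a given score can be counted directly. This is only an illustrative sketch: the list `scores` is hypothetical data made up for the example, and `percent_below` is a hypothetical helper name.

```python
def percent_below(scores, score):
    """Return the percentage of scores strictly below the given score."""
    below = sum(1 for s in scores if s < score)   # count the lower scores
    return 100 * below / len(scores)


# Hypothetical test scores (not taken from the module)
scores = [55, 60, 62, 68, 70, 74, 75, 80]

print(percent_below(scores, 62))  # 25.0
```

In this made-up set, two of the eight scores are below 62, so a score of 62 sits at the 25th percentile.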

Thus, the formula for computing the percentile of ungrouped data is

$$l = n\left(\frac{p}{100}\right)$$

where
$l$ = location of the data value
$p$ = percentile as a whole number
$n$ = sample size

and for grouped data, we have the formula

$$P_k = L + \left(\frac{\frac{kn}{100} - cf}{f_k}\right) i$$

where
$P_k$ = the kth percentile
$L$ = the lower class boundary of the kth percentile class
$f_k$ = the frequency of the class containing $P_k$
$n$ = the total number of observations
$i$ = the width of the class interval containing the percentile point
$cf$ = the cumulative frequency of the class interval immediately below the class interval containing the
percentile point

Example 1. Given the data 11, 11, 14, 15, 16, 16, 17, 19, 22, 25, 26, 27, 31, 34, 36, what data value lies at
the 30th percentile?

Solution

Step 1. Find the number of observations

➢ Number of observations = 15

Step 2. Arrange the data

➢ 11, 11, 14, 15, 16, 16, 17, 19, 22, 25, 26, 27, 31, 34, 36

Step 3. Apply the formula

$$l = 15\left(\frac{30}{100}\right) = \frac{450}{100} = 4.5$$

Since the location 4.5 is not a whole number, round it up to the 5th position. Thus, the 30th percentile is
the 5th value in the ordered list, which is 16.
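
The location rule for ungrouped data can be sketched in Python. This is a minimal sketch, assuming $p$ is a whole number and that the location is always rounded up to the next whole position, as in this example. The module does not say what to do when the location is already a whole number; taking that position as is, which is what happens here, is an assumption. The function name `percentile_ungrouped` is hypothetical.

```python
def percentile_ungrouped(data, p):
    """Return the data value at the p-th percentile using l = n(p/100)."""
    values = sorted(data)                 # arrange the data
    n = len(values)                       # number of observations
    # l = n * p / 100 rounded up to a whole position; integer arithmetic
    # avoids floating-point rounding surprises (assumes p is a whole number)
    position = (n * p + 99) // 100
    position = min(max(position, 1), n)   # keep the position inside the list
    return values[position - 1]           # 1-based position -> 0-based index


data = [11, 11, 14, 15, 16, 16, 17, 19, 22, 25, 26, 27, 31, 34, 36]

print(percentile_ungrouped(data, 30))  # 16
```

For this data, the location is 15(30/100) = 4.5, which rounds up to the 5th position, and the 5th ordered value is 16.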

Example 2

Using the table of the tabulated heights of 50 students, compute $P_{13}$, $P_{32}$, and $P_{84}$.

Height (cm)    f     cf
134 – 139      10    50
128 – 133      9     40
122 – 127      8     31
116 – 121      1     23
110 – 115      5     22
104 – 109      2     17
98 – 103       9     15
92 – 97        5     6
86 – 91        1     1
N = 50

For the 13th percentile:

$$\frac{13N}{100} = \frac{13(50)}{100} = 6.5$$

Class interval: 98 – 103

$L = 97.5$, $cf = 6$, $f_{P_{13}} = 9$, $i = 6$

$$P_{13} = L + \left(\frac{\frac{13n}{100} - cf}{f_{P_{13}}}\right) i = 97.5 + \left(\frac{6.5 - 6}{9}\right) 6 \approx 97.83$$

For the 32nd percentile, using the same frequency table:

$$\frac{32N}{100} = \frac{32(50)}{100} = 16$$

Class interval: 104 – 109

$L = 103.5$, $cf = 15$, $f_{P_{32}} = 2$, $i = 6$

$$P_{32} = L + \left(\frac{\frac{32n}{100} - cf}{f_{P_{32}}}\right) i = 103.5 + \left(\frac{16 - 15}{2}\right) 6 = 106.5$$

For the 84th percentile, using the same frequency table:

$$\frac{84N}{100} = \frac{84(50)}{100} = 42$$

Class interval: 134 – 139

$L = 133.5$, $cf = 40$, $f_{P_{84}} = 10$, $i = 6$

$$P_{84} = L + \left(\frac{\frac{84n}{100} - cf}{f_{P_{84}}}\right) i = 133.5 + \left(\frac{42 - 40}{10}\right) 6 = 134.7$$
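
The whole grouped-data procedure, including locating the percentile class from the cumulative frequencies, can be sketched as follows. This is a sketch under stated assumptions: class limits are whole numbers, so each lower class boundary is taken as the lower limit minus 0.5, and the percentile class is the first class whose cumulative frequency reaches $kn/100$. The function name `grouped_percentile` and the tuple layout of the table are hypothetical choices for illustration.

```python
from bisect import bisect_left
from itertools import accumulate


def grouped_percentile(classes, k):
    """Return P_k for classes given as (lower_limit, upper_limit, frequency),
    listed from the lowest class to the highest."""
    cum = list(accumulate(f for _, _, f in classes))  # cumulative frequencies
    n = cum[-1]                                       # total number of observations
    target = k * n / 100                              # kn/100
    idx = bisect_left(cum, target)                    # first class with cf >= kn/100
    lower_limit, upper_limit, freq = classes[idx]
    lower_boundary = lower_limit - 0.5                # assumes whole-number limits
    width = upper_limit - lower_limit + 1             # class width i
    cf_below = cum[idx - 1] if idx > 0 else 0         # cf of the class just below
    return lower_boundary + (target - cf_below) / freq * width


# Heights of 50 students, lowest class first
heights = [(86, 91, 1), (92, 97, 5), (98, 103, 9), (104, 109, 2), (110, 115, 5),
           (116, 121, 1), (122, 127, 8), (128, 133, 9), (134, 139, 10)]

for k in (13, 32, 84):
    print(k, round(grouped_percentile(heights, k), 2))
```

With this table, the sketch selects the classes 98 – 103, 104 – 109, and 134 – 139 for $P_{13}$, $P_{32}$, and $P_{84}$. For $P_{13}$ the interpolation term is (0.5/9)(6) ≈ 0.33, giving approximately 97.83.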

Learning Activity 3

A. Compute the three quartiles of each of the following sets of data.

1. 88, 85, 83, 83, 80, 93, 81, 83, 98, 90

2. 66, 65, 71, 72, 74, 66, 74, 63, 81, 67

B. Find 𝐷1 up to 𝐷9 of the following ungrouped data

1. 15, 6, 17, 84, 13, 57, 89, 19, 53, 85, 14, 23, 27, 66, 21, 1, 32, 62, 66, 38

C. Compute the three quartiles of the following sets of data

1. The following are test scores of the first quarterly exam in Statistics and Probability. Find the three
quartiles

Score      Frequency (f)
1 – 5      4
6 – 10     6
11 – 15    10
16 – 20    8
21 – 25    7
26 – 30    5
N = 40

D. Find the 25th percentile.

1. 14, 20, 23, 28, 17, 16, 13, 18, 14, 15

2. 122, 125, 133, 122, 132, 122, 123, 125, 122, 123

E. Complete the entries in the following table of the scores of the Oral Exam in Math and compute
and interpret the 9 deciles.

Score       Frequency (f)    Cumulative Frequency (cf)
91 – 100    2
81 – 90     6
71 – 80     9
61 – 70     11
51 – 60     8
41 – 50     7
31 – 40     5
21 – 30     4
11 – 20     3
1 – 10      1
Total

F. Refer to the set of data from the scores of 50 students in an English Proficiency Test and identify
the following information.

1. Class interval of P_21 class

2. Frequency of P_37 class

3. cf of P_53 class

4. Class interval of P_87 class

5. cf above P_38 class

6. Lower boundary of P_23 class

7. Upper boundary of P_67 class

8. Frequency below P_4 class

9. Class interval above P_16 class

10. P_43

Class Interval    f     cf
98 – 100          3     50
95 – 97           4     47
92 – 94           7     43
89 – 91           12    36
86 – 88           9     24
83 – 85           5     15
80 – 82           2     10
77 – 79           1     8
74 – 76           4     7
71 – 73           3     3
N = 50
