3 Organizing Data
You have learned the different ways to gather data and the sampling techniques
from which you can choose the one to employ in your research. Now it is time to
learn what to do with the data that you have gathered. It is essential to organize
your data so that you can interpret them easily.
Data may be ungrouped or grouped. Ungrouped data are unsorted or raw data.
This means that the data have not been grouped or classified according to any
characteristic. On the other hand, grouped data are data that have been organized
or grouped.
The three ways of organizing a set of ungrouped data are shown using the
example below.
42, 51, 44, 28, 32, 24, 30, 25, 24, 35, 43, 37, 28,
28, 22, 45, 29, 28, 36, 35, 50, 25, 25, 46, 44
You can organize the data in the following ways:
1. By forming an array
An array is an arrangement of numbers in increasing or decreasing
order.
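Forming an array from the 25 ages listed above amounts to sorting them. The following is a minimal Python sketch; the variable names (ages, ascending, descending) are illustrative and not part of the lesson.

```python
# Ages of the 25 employees, exactly as listed in the example.
ages = [42, 51, 44, 28, 32, 24, 30, 25, 24, 35, 43, 37, 28,
        28, 22, 45, 29, 28, 36, 35, 50, 25, 25, 46, 44]

ascending = sorted(ages)                  # array in increasing order
descending = sorted(ages, reverse=True)   # array in decreasing order

print(ascending)
print(descending)
```

Either ordering is an array; the increasing one makes the lowest and highest values easy to read from the two ends.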
Stem-leaf Plot: (Assuming that we did not form an array, let us refer to the original
data.)
42, 51, 44, 28, 32, 24, 30, 25, 24, 35, 43, 37, 28,
28, 22, 45, 29, 28, 36, 35, 50, 25, 25, 46, 44
In the first value, 42, the digit 4 is the stem, and the digit 2 is the leaf. In the
second value, 51, the digit 5 is the stem, and the digit 1 is the leaf. Continue plotting
all the ages of the employees. After all the ages have been plotted, make another table
and arrange the leaves in increasing order.
Draft: (as the data is given)
Final: (after arranging the leaves from the lowest to the highest)
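The draft and final stem-leaf plots described above can be sketched in Python as follows. This is only an illustration under the assumption that every value has two digits, so the tens digit is the stem and the units digit is the leaf; the function name stem_leaf is hypothetical.

```python
from collections import defaultdict

ages = [42, 51, 44, 28, 32, 24, 30, 25, 24, 35, 43, 37, 28,
        28, 22, 45, 29, 28, 36, 35, 50, 25, 25, 46, 44]

def stem_leaf(values, sort_leaves):
    """Group two-digit values by tens digit (stem); the units digit is the leaf."""
    plot = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    if sort_leaves:
        for leaves in plot.values():
            leaves.sort()
    # Return the stems in increasing order.
    return dict(sorted(plot.items()))

draft = stem_leaf(ages, sort_leaves=False)  # leaves in the order the data is given
final = stem_leaf(ages, sort_leaves=True)   # leaves arranged from lowest to highest

for stem, leaves in final.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

The draft keeps the leaves in the order the ages appear in the data, while the final version sorts each row of leaves.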
Table No. 1
Ages of 25 Employees in a Supermarket
Ages     Frequency
22       1
24       2
25       3
28       4
29       1
30       1
32       1
35       2
36       1
37       1
42       1
43       1
44       2
45       1
46       1
50       1
51       1
Total    25
Note: You may first form an array or a stem-leaf plot so that it would be easier to
construct the frequency distribution table.
Interpretation of the data may be made in this way.
The table shows that of the twenty-five employees of the supermarket, the
youngest is twenty-two years old while the oldest is fifty-one years old. The largest
group, eleven employees, are in their twenties; six are in their thirties, six are in
their forties, and only two are in their fifties.
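The frequency table and the counts used in the interpretation can be checked with a short Python sketch. Grouping the ages by decade is an assumption made here only to mirror the wording of the interpretation; the names frequency and by_decade are illustrative.

```python
from collections import Counter

ages = [42, 51, 44, 28, 32, 24, 30, 25, 24, 35, 43, 37, 28,
        28, 22, 45, 29, 28, 36, 35, 50, 25, 25, 46, 44]

# Frequency of each distinct age.
frequency = Counter(ages)
for age in sorted(frequency):
    print(age, frequency[age])

# Number of employees in their twenties, thirties, forties and fifties.
by_decade = Counter(age // 10 * 10 for age in ages)
for decade in sorted(by_decade):
    print(f"{decade}s:", by_decade[decade])
```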
Table No 2
Scores of a Sample of 40 Students in a Biology Test
You have to remember the definitions of the following terms that are found in
the frequency distribution table:
Class interval refers to the grouping bounded by the lower limit (LL) and the upper
limit (UL).
Class size (c) is the length or width of the class.
Class frequency (f) is the number of observations falling within a class interval.
Class boundaries refer to the true boundaries (true limits) of a class interval.
In this example, 17-21 is the first class interval, where 17 is the lower limit and 21
is the upper limit. The lower limit of the first class interval is usually the lowest
value in the data. The upper limit 21 was obtained by counting 5 units (since c = 5)
starting from the lower limit 17 (17, 18, 19, 20, 21). To get the succeeding lower
limits, just add 5, which is the class size. Do the same for the upper limits.
Class Intervals (Scores)
Lower Limit     Class Interval     Upper Limit
                17-21
17 + 5          22-26              21 + 5
22 + 5          27-31              26 + 5
27 + 5          32-36              31 + 5
32 + 5          37-41              36 + 5
37 + 5          42-46              41 + 5
42 + 5          47-51              46 + 5
c = 5
*21 + 0.5 = 21.5
Class mark or class midpoint refers to the representative of the class interval.
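The way the class intervals 17-21 through 47-51 are built from the first lower limit and the class size can be sketched as follows. The function name class_intervals and its parameters are hypothetical, chosen only for this illustration.

```python
def class_intervals(lowest, class_size, count):
    """Return (lower limit, upper limit) pairs for `count` consecutive classes."""
    intervals = []
    for i in range(count):
        lower = lowest + i * class_size
        # Counting class_size units starting at the lower limit
        # ends at lower + class_size - 1.
        upper = lower + class_size - 1
        intervals.append((lower, upper))
    return intervals

intervals = class_intervals(lowest=17, class_size=5, count=7)
for lower, upper in intervals:
    print(f"{lower}-{upper}")
```

Each lower limit and each upper limit is 5 more than the one before it, which matches adding the class size repeatedly.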
You have to follow these steps to construct a frequency distribution table. To show
you the steps, let us consider the test scores of 50 students in Statistics, recorded as
follows:
Table 1
Test Scores of 50 Students in Statistics
48 39 55 65 51
79 63 89 29 54
65 58 64 76 90
30 84 50 55 59
69 43 79 44 40
49 50 24 78 71
63 64 73 35 65
58 36 47 86 46
85 74 64 72 54
38 52 33 53 42
Step 1: Determine the Range (R) of the distribution. The range is equal to the
highest score minus the lowest score.
R = 90 - 24
R = 66
Step 2: Determine the class size (c) by dividing the range by the desired number of
classes. (The number of classes must be neither too few nor too many. Too many
class intervals may result in classes with zero frequencies.) Let us have ten
classes in this problem, so c = 66 ÷ 10 = 6.6, which is rounded to 7. In some
cases, the class size is already given.
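Steps 1 and 2 can be sketched in Python using the 50 test scores. Rounding the quotient up to the next whole number is an assumption made here; it agrees with the class size c = 7 used in the tables of this lesson. The variable names are illustrative.

```python
import math

# The 50 test scores in Statistics, row by row as in the table of raw data.
scores = [48, 39, 55, 65, 51, 79, 63, 89, 29, 54,
          65, 58, 64, 76, 90, 30, 84, 50, 55, 59,
          69, 43, 79, 44, 40, 49, 50, 24, 78, 71,
          63, 64, 73, 35, 65, 58, 36, 47, 86, 46,
          85, 74, 64, 72, 54, 38, 52, 33, 53, 42]

# Step 1: range = highest score - lowest score.
value_range = max(scores) - min(scores)

# Step 2: class size = range / desired number of classes,
# rounded up to a whole number (assumption, see the note above).
desired_classes = 10
class_size = math.ceil(value_range / desired_classes)

print("R =", value_range)
print("c =", class_size)
```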
Step 3: Unless otherwise specified, always start the lowest class limit at the lowest
value of the given data (raw data). For the second lower limit, just add the
class size to the first lower limit, and then continue to add the class size to
get the rest of the lower limits. To get the first upper limit, subtract one (1)
from the second lower limit. For the second upper limit, just add the class size
to the first upper limit, and then continue to add the class size to get the rest
of the upper limits.
Note: The last class interval should contain the highest value.
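The procedure in Step 3 can be sketched as below, assuming the lowest score is 24, the highest score is 90, and the class size is 7, as in this example. The variable names are illustrative.

```python
lowest, highest, class_size = 24, 90, 7

# Lower limits: start at the lowest value and keep adding the class size
# until the last class can contain the highest value.
lower_limits = []
lower = lowest
while lower <= highest:
    lower_limits.append(lower)
    lower += class_size

# First upper limit: one less than the second lower limit.
first_upper = lower_limits[1] - 1

# Remaining upper limits: keep adding the class size.
upper_limits = [first_upper + i * class_size for i in range(len(lower_limits))]

for ll, ul in zip(lower_limits, upper_limits):
    print(f"{ll} - {ul}")
```

The loop stops once the next lower limit would exceed the highest score, so the last interval (87 - 93) contains 90.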
Step 4: Determine the class boundaries by subtracting 0.5 from each of the lower
class limits and adding 0.5 to each of the upper class limits.
Lower Limit - 0.5   Lower Boundary (LB)   Upper Limit + 0.5   Upper Boundary (UB)   Class Boundaries (LB - UB)
24 - 0.5            23.5                  30 + 0.5            30.5                  23.5 - 30.5
31 - 0.5            30.5                  37 + 0.5            37.5                  30.5 - 37.5
38 - 0.5            37.5                  44 + 0.5            44.5                  37.5 - 44.5
45 - 0.5            44.5                  51 + 0.5            51.5                  44.5 - 51.5
52 - 0.5            51.5                  58 + 0.5            58.5                  51.5 - 58.5
59 - 0.5            58.5                  65 + 0.5            65.5                  58.5 - 65.5
66 - 0.5            65.5                  72 + 0.5            72.5                  65.5 - 72.5
73 - 0.5            72.5                  79 + 0.5            79.5                  72.5 - 79.5
80 - 0.5            79.5                  86 + 0.5            86.5                  79.5 - 86.5
87 - 0.5            86.5                  93 + 0.5            93.5                  86.5 - 93.5
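The class boundaries in Step 4 follow directly from the class limits. A minimal sketch, with illustrative variable names:

```python
# Class limits of the ten class intervals (24 - 30, 31 - 37, ..., 87 - 93).
limits = [(24 + 7 * i, 30 + 7 * i) for i in range(10)]

# Lower boundary = lower limit - 0.5; upper boundary = upper limit + 0.5.
boundaries = [(ll - 0.5, ul + 0.5) for ll, ul in limits]

for (ll, ul), (lb, ub) in zip(limits, boundaries):
    print(f"{ll} - {ul}: {lb} - {ub}")
```

Note that the upper boundary of each class equals the lower boundary of the next class, so the boundaries leave no gaps between classes.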
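Step 5: Calculate the class marks or class midpoints. The class mark is the numerical
location of the center of the class and is computed as follows:
Class mark (Xi) = (LL + UL) / 2
For example, the class mark of the first class interval is (24 + 30) / 2 = 27.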
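The class marks can be computed with the midpoint rule (lower limit plus upper limit, divided by 2), which reproduces the class marks listed in the frequency distribution table of the 50 test scores. A minimal sketch with illustrative names:

```python
# Class limits of the ten class intervals (24 - 30, 31 - 37, ..., 87 - 93).
limits = [(24 + 7 * i, 30 + 7 * i) for i in range(10)]

# Class mark = (lower limit + upper limit) / 2.
class_marks = [(ll + ul) / 2 for ll, ul in limits]

print(class_marks)
```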
Applying the steps, Table No. 2 shows what the frequency distribution table looks like.
Table No. 2
Frequency Distribution of the 50 Test Scores in Statistics
Class Intervals   Class Boundaries   Class Marks   Frequency
(LL - UL)         (LB - UB)          (Xi)          (f)
24 - 30           23.5 - 30.5        27            3
31 - 37           30.5 - 37.5        34            3
38 - 44           37.5 - 44.5        41            6
45 - 51           44.5 - 51.5        48            7
52 - 58           51.5 - 58.5        55            8
59 - 65           58.5 - 65.5        62            9
66 - 72           65.5 - 72.5        69            3
73 - 79           72.5 - 79.5        76            6
80 - 86           79.5 - 86.5        83            3
87 - 93           86.5 - 93.5        90            2
c = 7                                              n = 50
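The class frequencies in the table can be reproduced by tallying the 50 raw scores. The sketch below assumes equal-width classes starting at 24 with class size 7, as in the table; the variable names are illustrative.

```python
scores = [48, 39, 55, 65, 51, 79, 63, 89, 29, 54,
          65, 58, 64, 76, 90, 30, 84, 50, 55, 59,
          69, 43, 79, 44, 40, 49, 50, 24, 78, 71,
          63, 64, 73, 35, 65, 58, 36, 47, 86, 46,
          85, 74, 64, 72, 54, 38, 52, 33, 53, 42]

lowest, class_size, number_of_classes = 24, 7, 10

# Tally each score into its class: a score s belongs to class
# number (s - lowest) // class_size, counting from 0.
frequencies = [0] * number_of_classes
for score in scores:
    frequencies[(score - lowest) // class_size] += 1

for i, f in enumerate(frequencies):
    ll = lowest + i * class_size
    ul = ll + class_size - 1
    print(f"{ll} - {ul}: {f}")
print("n =", sum(frequencies))
```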
Now let us continue with the cumulative frequency distribution.
Illustration 1
(Data from Test Scores of 50 Students in Statistics)
Less than cumulative frequency (<cf): successive addition of frequencies from top to bottom.
Greater than cumulative frequency (>cf): successive addition of frequencies from bottom to top.
The <cf and >cf columns below are the resulting "less than" and "greater than"
cumulative frequencies.
Frequency   Addition            <cf     Addition            >cf
(f)         (top to bottom)             (bottom to top)
3           3                   3       47 + 3              50
3           3 + 3               6       44 + 3              47
6           6 + 6               12      38 + 6              44
7           12 + 7              19      31 + 7              38
8           19 + 8              27      23 + 8              31
9           27 + 9              36      14 + 9              23
3           36 + 3              39      11 + 3              14
6           39 + 6              45      5 + 6               11
3           45 + 3              48      2 + 3               5
2           48 + 2              50      2                   2
n = 50
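The two running totals can be sketched with itertools.accumulate from the Python standard library. The list of frequencies is taken from the frequency distribution of the 50 test scores; the variable names are illustrative.

```python
from itertools import accumulate

frequencies = [3, 3, 6, 7, 8, 9, 3, 6, 3, 2]

# "Less than" cumulative frequency: add the frequencies from top to bottom.
less_than_cf = list(accumulate(frequencies))

# "Greater than" cumulative frequency: add the frequencies from bottom to top,
# then restore the original top-to-bottom order.
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]

for f, lcf, gcf in zip(frequencies, less_than_cf, greater_than_cf):
    print(f, lcf, gcf)
```

The last <cf and the first >cf both equal n = 50, which is a quick check that the additions were done correctly.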
Then let us proceed to make the table on cumulative percentage frequency.
Formula:
Cumulative Percentage Frequency (cpf) = (cf / n) × 100
Frequency   Cumulative Frequency   Cumulative Percentage Frequency
(f)         <cf      >cf           <cpf                      >cpf
3           3        50            (3 / 50) × 100 = 6        (50 / 50) × 100 = 100
3           6        47            (6 / 50) × 100 = 12       (47 / 50) × 100 = 94
Now our table looks like this with the addition of the column on cumulative
percentage frequency.
Table 3
Cumulative Percentage Distribution of 50 Test Scores in Statistics
Class Intervals   Class Boundaries   Frequency   Cumulative Frequency   Cumulative Percentage Frequency
(LL - UL)         (LB - UB)          (f)         <cf      >cf           <cpf     >cpf
24 - 30           23.5 - 30.5        3           3        50            6        100
31 - 37           30.5 - 37.5        3           6        47            12       94
38 - 44           37.5 - 44.5        6           12       44            24       88
45 - 51           44.5 - 51.5        7           19       38            38       76
52 - 58           51.5 - 58.5        8           27       31            54       62
59 - 65           58.5 - 65.5        9           36       23            72       46
66 - 72           65.5 - 72.5        3           39       14            78       28
73 - 79           72.5 - 79.5        6           45       11            90       22
80 - 86           79.5 - 86.5        3           48       5             96       10
87 - 93           86.5 - 93.5        2           50       2             100      4
c = 7                                n = 50
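The cumulative percentage frequencies can be checked by dividing each cumulative frequency by n = 50 and multiplying by 100. A minimal sketch with illustrative names:

```python
n = 50
less_than_cf = [3, 6, 12, 19, 27, 36, 39, 45, 48, 50]
greater_than_cf = [50, 47, 44, 38, 31, 23, 14, 11, 5, 2]

# cpf = (cf / n) x 100, written as cf * 100 / n to keep the arithmetic exact here.
less_than_cpf = [cf * 100 / n for cf in less_than_cf]
greater_than_cpf = [cf * 100 / n for cf in greater_than_cf]

for lcpf, gcpf in zip(less_than_cpf, greater_than_cpf):
    print(lcpf, gcpf)
```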
This is how to interpret the cumulative frequency and the cumulative
percentage frequency.
Remember: Use the upper class boundaries in interpreting the <cf and
the <cpf. (lower than the upper class boundaries)
Use the lower class boundaries in interpreting the >cf and the
>cpf. (higher than the lower class boundaries)
The following is an example of an interpretation for the scores of the 50 students in
Statistics.
(For less than cumulative frequency and less than cumulative percentage frequency,
use the numbers colored yellow in the table.)
As seen in Table 3, of the 50 students who took the test in Statistics, three
(3) students scored lower than 30.5 while six (6) students scored lower than 37.5.
Twenty-seven (27), or more than half of them, scored lower than 58.5. Moreover,
24% of the students scored lower than 44.5, while 78% scored lower than 72.5.
(For greater than cumulative frequency and greater than cumulative percentage
frequency, use the numbers colored green in the table.)
Formula:
Relative Frequency (rf) = (f / n) × 100%
Illustration No. 3
Resulting Relative Frequency Distribution of 50 Test Scores in Statistics
Frequency (f)   Computation          Relative Frequency (rf)
3               (3 / 50) × 100%      6
3               (3 / 50) × 100%      6
6               (6 / 50) × 100%      12
7                                    14
8                                    16
9                                    18
3                                    6
6                                    12
3                                    6
2                                    4
n = 50                               100%
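The relative frequencies, including the rows whose computation is not written out, can be checked with the sketch below. The variable names are illustrative.

```python
frequencies = [3, 3, 6, 7, 8, 9, 3, 6, 3, 2]
n = sum(frequencies)

# rf = (f / n) x 100%, written as f * 100 / n to keep the arithmetic exact here.
relative_frequencies = [f * 100 / n for f in frequencies]

for f, rf in zip(frequencies, relative_frequencies):
    print(f, rf)
print("total:", sum(relative_frequencies))
```

The relative frequencies add up to 100%, which serves as a check on the column.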
Table 4 shows the addition of the column on Relative Frequency (RF).
Table 4
Cumulative Percentage and Relative Frequency Distribution of 50 Test Scores
in Statistics
Class Intervals   Class Boundaries   Frequency   Cumulative Frequency   Cumulative Percentage Frequency   Relative Frequency
(LL - UL)         (LB - UB)          (f)         <cf      >cf           <cpf     >cpf                     (RF)
24 - 30           23.5 - 30.5        3           3        50            6        100                      6
31 - 37           30.5 - 37.5        3           6        47            12       94                       6
38 - 44           37.5 - 44.5        6           12       44            24       88                       12
45 - 51           44.5 - 51.5        7           19       38            38       76                       14
52 - 58           51.5 - 58.5        8           27       31            54       62                       16
59 - 65           58.5 - 65.5        9           36       23            72       46                       18
66 - 72           65.5 - 72.5        3           39       14            78       28                       6
73 - 79           72.5 - 79.5        6           45       11            90       22                       12
80 - 86           79.5 - 86.5        3           48       5             96       10                       6
87 - 93           86.5 - 93.5        2           50       2             100      4                        4
c = 7                                n = 50                                                               100%
1. Twenty-five Grade 3 pupils were asked about their favorite Marvel superhero.
The top two Marvel superheroes from the poll will be drawn on the wall of the
classroom. The responses of the pupils are recorded below.
Iron Man          Spiderman         Captain America   Thor              Thor
Hulk              Hulk              Spiderman         Thor              Spiderman
Captain America   Iron Man          Spiderman         Captain America   Captain America
Iron Man          Thor              Iron Man          Spiderman         Spiderman
Iron Man          Captain America   Captain America   Captain America   Hulk
20 21 16 19 21 21 24 23 22 29
15 25 18 16 27 26 22 23 25 32
17 15 21 18 24 35 19 25 24 36
28 20 31 25 32 20 27 28 24 23
23 24 24 26 26 27 26 25 24 28
c=
Lesson Check-Up Test
Organizing Data
Time Allotment: 1.5 hours Time Started: ________ Time Finished: _________
Direction: Write the letter of the correct answer on the space provided before each
number.
______ 10. It refers to the distance of the highest score from the lowest
score.
a. Range b. Upper Class Boundaries
c. Upper Limit d. Lower Class Boundaries
______ 11. Which of the following shows the right way of computing the
class marks?
a. Add the lower limit and upper limit of each class
interval.
b. Subtract each lower limit from its respective upper
limit.
c. Choose any number within each class interval.
d. Add each lower limit to its respective upper limit,
then divide the sum by 2.
______ 12. Which of the following is NOT true when constructing class
intervals?
a. There must be one and only one size to be used for
every class interval.
b. It should be bounded by the upper limit and lower
limit.
c. The class size to be used must be determined first
before constructing class intervals.
d. Class size can be calculated by subtracting the
lower limit from its upper limit.
______ 14. What is the range of the ages of these ten people?
a. 8 b. 7
c. 9 d. 10
______ 15. How many were diagnosed with diabetes below the age of
15?
a. 5 b. 4
c. 9 d. 6
______ 16. What percentage of the ten people were diagnosed at
the age of 15 and above?
a. 20% b. 50%
c. 40% d. 10%
Gadget          No. of Responses
Cellphone       8
Tablet/iPad     12
Laptop          24
Desktop         6
______ 23. Which of the following contains 40% of the total sample?
a. 56-60 b. 46-50
c. 41-45 d. 51-55
______ 24. What is the class mark of the third class interval?
a. 38 b. 43
c. 48 d. 53
______ 26. What is the lower limit of the last class interval?
a. 56 b. 51
c. 46 d. 41
______ 28. If the minimum number of hours for work is 45, what part of
the employees worked overtime?
a. 91% b. 73%
c. 33% d. 13%
______ 29. What is the upper class boundary of the second class
interval?
a. 55.5 b. 50.5
c. 45.5 d. 40.5
______ 30. Which among the following best describes the highlighted
cell?
a. 67% of the employees worked at most 50 hours.
b. 67% of the employees worked more than 50.5 hours.
c. 67% of the employees worked for less than 46 hours.
d. 67% of the employees worked from 46 to 50 hours.
31-40 Table II. The frequency table shows the weight loss in
pounds of people who took the detoxifying program for 60
days.
______ 31. What is the total number of people who took the detoxifying
program? Which among these statements is true?
a. The lowest amount of hours existing in the table is 40
hours.
b. The highest amount of hours existing in the table is
56 hours.
c. The class interval with the highest frequency is 46 -
50 hours.
d. The class interval with the lowest frequency is 41 -
45 hours.