Topic 3
Data Presentation Methods
Once the data are gathered from the sample or population, the next step is to present them in
an understandable form. As mentioned before, data remain just data unless something is done
to make them useful. There are three general methods of data presentation: textual presentation,
tabular presentation, and graphical presentation.
❖ Tabular presentation uses tables to present your data. A table consists of a title describing
the data being presented, and headings for particular entries such as the frequencies and
relative frequencies or percentages.
• A one-way table presents the categories of one variable with its corresponding
frequencies (Freq) and percentages (or relative frequencies).
Table 1 and Table 2 show that 46% of the survivors have Type O blood, followed
by Type B blood (24%), Type AB blood (16%) and Type A blood (14%).
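The percentages quoted above can be checked with a short sketch. It assumes the blood-type frequencies are the row totals reported in Table 3 (7, 12, 8, and 23, with n = 50); the function name `one_way_table` is made up for illustration.

```python
# Blood-type frequencies, assumed to be the row totals of Table 3.
counts = {"A": 7, "B": 12, "AB": 8, "O": 23}

def one_way_table(counts):
    """Return (category, frequency, percentage) rows, largest frequency first."""
    n = sum(counts.values())
    rows = [(cat, freq, 100 * freq / n) for cat, freq in counts.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)

table = one_way_table(counts)
for cat, freq, pct in table:
    print(f"{cat:<4}{freq:>5}{pct:>8.1f}%")
```

The rows come out in the same order as the sentence above: O, B, AB, then A.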
• A two-way table (or 𝒓 × 𝒄 table) presents the categories of each of two variables (a row
variable and a column variable) with the corresponding frequencies. It is also called a cross
tabulation (crosstab) or 𝒓 × 𝒄 contingency table.
hvvvalle
Table 3. Distribution of COVID19 survivors by blood type and sex
                      Sex
Blood type      Male    Female    Total
A                 4        3         7
B                 5        7        12
AB                3        5         8
O                15        8        23
Total            27       23        50
Table 3 shows that almost half of the survivors have Type O blood. Of the 27 male
survivors, the majority have Type O blood.
This table is a 𝟒 × 𝟐 table since there are 4 categories for the row variable (Blood type,
𝑟 = 4) and 2 categories for the column variable (Sex, 𝑐 = 2). You can add row
percentages or column percentages if you like.
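As a rough illustration of row and column percentages, the sketch below uses the Table 3 frequencies. The function names are hypothetical, and the dictionary layout (blood type mapped to a Male, Female pair) is only one possible way to store the table.

```python
# Frequencies from Table 3: each blood type maps to (Male, Female).
table3 = {"A": (4, 3), "B": (5, 7), "AB": (3, 5), "O": (15, 8)}

def row_percentages(table):
    """Percentage of each row total that falls in each column."""
    return {row: tuple(100 * x / sum(cells) for x in cells)
            for row, cells in table.items()}

def column_percentages(table):
    """Percentage of each column total that falls in each row."""
    col_totals = [sum(col) for col in zip(*table.values())]
    return {row: tuple(100 * x / t for x, t in zip(cells, col_totals))
            for row, cells in table.items()}

row_pct = row_percentages(table3)
col_pct = column_percentages(table3)
print(f"Type O among males: {col_pct['O'][0]:.1f}%")
print(f"Males among Type O: {row_pct['O'][0]:.1f}%")
```

The column percentage for Type O males (15 out of 27) is what supports the statement that the majority of male survivors have Type O blood.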
The variables in the two tables are qualitative, so it is easy to construct tables of these
types: all you have to do is identify the categories and count how many entities belong to
each category. Now what if the data are quantitative? Example:
Number of siblings that a student has: 5, 6, 0, 2, 3, 3, 4, 13, 4, 4, 5, 6, 6, 7, 3, 3, 7,
10, 2, 3. A simple one-way table (Table 4) may still be constructed.
Table 4 shows that 5 or 25% of the respondents have 3 siblings. Only 1 respondent is
an only child.
This is fairly easy since there are only 20 data values and they are discrete. What if the
number of data values is large? Solution: Construct a frequency distribution table or FDT.
The table above is an FDT for ungrouped data (raw data).
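A minimal sketch of an FDT for ungrouped data, using the number-of-siblings data listed above. The function name `ungrouped_fdt` is illustrative only.

```python
from collections import Counter

siblings = [5, 6, 0, 2, 3, 3, 4, 13, 4, 4, 5, 6, 6, 7, 3, 3, 7, 10, 2, 3]

def ungrouped_fdt(data):
    """Return (value, frequency, percentage) rows sorted by value."""
    n = len(data)
    counts = Counter(data)
    return [(value, counts[value], 100 * counts[value] / n)
            for value in sorted(counts)]

fdt = ungrouped_fdt(siblings)
for value, freq, pct in fdt:
    print(f"{value:>3}{freq:>6}{pct:>8.1f}%")
```

The row for 3 siblings shows a frequency of 5 (25%), and the row for 0 siblings shows a frequency of 1, matching the reading of Table 4.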
• A frequency distribution table for grouped data is a table where frequencies are
determined from each class/interval. Classes may look like these:
17 − 20 2.5 − 4.6
21 − 24 or 4.7 − 6.8
25 − 28 6.9 − 9.0
Steps:
1. Solve for the 𝑅𝑎𝑛𝑔𝑒, where 𝑅𝑎𝑛𝑔𝑒 = 𝑀𝑎𝑥𝑖𝑚𝑢𝑚 𝑉𝑎𝑙𝑢𝑒 − 𝑀𝑖𝑛𝑖𝑚𝑢𝑚 𝑉𝑎𝑙𝑢𝑒. (Do
not round off your answer.)
2. Solve for 𝑘, the approximate number of classes/ intervals that can be constructed,
where 𝑘 = √𝑛 and 𝑛 is the total number of observations (or data values). This 𝑘
should be rounded off to the next higher integer (not the nearest integer) to
accommodate all observations. 𝑘 is an approximate value so when you construct your
FDT, the actual number of classes/intervals may or may not be equal to the computed
𝑘.
3. Solve for the class width 𝑐, where 𝑐 = 𝑅𝑎𝑛𝑔𝑒/𝑘. This 𝑐 is rounded off to the same
number of decimal places as the observations in the data set (rounding off to the
nearest value).
Example:
Data Set    Data values                     # of decimal places    Computed c
A           10, 9, 1, 2, 5, 9, 3, …         0                      c = 3.6 ≈ 4
B           25, 22, 3, 10, …                0                      c = 6.1 ≈ 6
C           2.5, 2.3, 1.5, 8.5, …           1                      c = 2.6822 ≈ 2.7
D           12.4, 31.2, 18.8, …             1                      c = 5.4346 ≈ 5.4
E           1.76, 2.54, 1.98, …             2                      c = 0.2342 ≈ 0.23
F           11.25, 13.26, 8.31, 2.20, …     2                      c = 4.127112 ≈ 4.13
G           5.456, 3.145, 10.333, …         3                      c = 2.123056 ≈ 2.123
H           3.1, 2, 8.24, 6.235, 2.25, …    ?                      c = 3.82118 ≈ 3.821
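Steps 1 to 3 can be sketched as follows. This is only an illustration: the function name `class_width` and its `decimals` parameter are hypothetical, and rounding ties upward (half-up) is an assumption, since the steps only say "nearest value".

```python
import math
from decimal import Decimal, ROUND_HALF_UP

def class_width(data, decimals):
    """Return (range, k, c) following Steps 1 to 3.

    `decimals` is the number of decimal places of the observations.
    """
    values = [Decimal(str(x)) for x in data]
    data_range = max(values) - min(values)      # Step 1: left unrounded
    k = math.ceil(math.sqrt(len(values)))       # Step 2: next higher integer
    unit = Decimal(10) ** -decimals             # 1, 0.1, 0.01, ...
    # Step 3: round c to the data's precision (ties go up: an assumption).
    c = (data_range / k).quantize(unit, rounding=ROUND_HALF_UP)
    return data_range, k, c

siblings = [5, 6, 0, 2, 3, 3, 4, 13, 4, 4, 5, 6, 6, 7, 3, 3, 7, 10, 2, 3]
data_range, k, c = class_width(siblings, decimals=0)
print(data_range, k, c)
```

`Decimal` is used instead of floats so that values such as 2.6 or 5.1 are rounded exactly as they would be by hand.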
You are now ready to construct your FDT. The FDT includes three basic columns:
Classes/Intervals, Frequencies, Relative Frequencies (%). Columns for the True Class
Boundaries (𝑇𝐶𝐵), Class Marks (𝐶𝑀), Less than Cumulative Frequencies (< 𝐶𝐹), and
Greater than Cumulative Frequencies (> 𝐶𝐹) can also be added.
𝑳𝑳𝟏 17 − 20 𝑼𝑳𝟏
𝑳𝑳𝟐 21 − 24 𝑼𝑳𝟐
𝑳𝑳𝟑 25 − 28 𝑼𝑳𝟑
The 𝑳𝑳𝟏 is usually the minimum in the data set; 𝑼𝑳𝟏 = 𝑳𝑳𝟏 + 𝒄 − 𝒑. What is 𝒑?
It is the precision of the data, which has to do with the number of decimal places
of the observations in the data set.
Example:
Observations                   # of decimal places of the observations    𝑝
3, 4, 0, 5, 11, …              0                                          1
5.1, 2.9, 8.0, …               1                                          0.1
2.34, 7.00, 4.06, …            2                                          0.01
6.231, 8.992, 0.008            3                                          0.001
11.4534, 5.6672, 4.2105, …     4                                          ?
To get 𝑳𝑳𝟐 , 𝒄 is added to 𝑳𝑳𝟏 . To get the 𝑼𝑳𝟐 , add 𝒄 to the 𝑼𝑳𝟏 .
To get 𝑳𝑳𝟑 , 𝒄 is added to the 𝑳𝑳𝟐 . To get 𝑼𝑳𝟑 , add 𝒄 to the 𝑼𝑳𝟐 .
This process continues until the maximum observation is included in the
last class/interval.
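The construction of the class limits can be sketched as below. The function name `class_limits` is hypothetical, and the minimum and maximum passed in the usage lines are assumed values chosen to reproduce the two sets of sample classes shown earlier (17 − 20, … and 2.5 − 4.6, …).

```python
from decimal import Decimal

def class_limits(minimum, maximum, c, p):
    """Build (LL, UL) pairs: UL1 = LL1 + c - p, then add c to both limits."""
    lower = Decimal(str(minimum))
    maximum, c, p = Decimal(str(maximum)), Decimal(str(c)), Decimal(str(p))
    classes = []
    while True:
        upper = lower + c - p
        classes.append((lower, upper))
        if upper >= maximum:      # the maximum now falls in the last class
            return classes
        lower += c

whole = class_limits(17, 28, c=4, p=1)
tenths = class_limits(2.5, 9.0, c=2.2, p=0.1)
for ll, ul in whole + tenths:
    print(f"{ll} - {ul}")
```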
• Freq is the number of observations within each class/interval. Tally the values if
you do not want to strain your eyes looking for them in the data set.
• 𝐑𝐞𝐥𝐚𝐭𝐢𝐯𝐞 𝐅𝐫𝐞𝐪𝐮𝐞𝐧𝐜𝐲 (RF, %) = (Freq / n) ∗ 100%
• Less than Cumulative Frequency (< 𝑪𝑭) is the number of observations that are
less than or equal to a specified upper limit 𝑼𝑳.
• Greater than Cumulative Frequency (> 𝑪𝑭) is the number of observations that
are greater than or equal to a specified lower limit 𝑳𝑳.
• If data are continuous, then the True Class Boundaries (𝑻𝑪𝑩) may be computed.
A 𝑻𝑪𝑩 is composed of the Lower True Class Boundary (𝑳𝑻𝑪𝑩) and the Upper
True Class Boundary (𝑼𝑻𝑪𝑩).
𝑳𝑻𝑪𝑩 = 𝑳𝑳 − 𝟎. 𝟓 ∗ 𝒑 𝑼𝑻𝑪𝑩 = 𝑼𝑳 + 𝟎. 𝟓 ∗ 𝒑
• Class Mark (𝑪𝑴) is the mean of the 𝑳𝑳 and 𝑼𝑳; 𝑪𝑴 = (𝑳𝑳 + 𝑼𝑳)/𝟐.
Do not round off!!
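The column definitions above can be put together in one sketch. The function name `fdt_columns` and the dictionary keys are made up for illustration; the classes and frequencies in the usage lines are those of the number-of-siblings data (c = 3, p = 1).

```python
from decimal import Decimal

def fdt_columns(classes, freqs, p):
    """Add RF, <CF, >CF, TCB and CM to classes given as (LL, UL) pairs."""
    n = sum(freqs)
    half_p = Decimal(str(p)) / 2
    rows, less_cf = [], 0
    for (ll, ul), freq in zip(classes, freqs):
        ll, ul = Decimal(str(ll)), Decimal(str(ul))
        greater_cf = n - less_cf      # observations >= LL of this class
        less_cf += freq               # observations <= UL of this class
        rows.append({
            "class": (ll, ul),
            "freq": freq,
            "rf": 100 * freq / n,
            "less_cf": less_cf,
            "greater_cf": greater_cf,
            "tcb": (ll - half_p, ul + half_p),
            "cm": (ll + ul) / 2,      # left unrounded
        })
    return rows

classes = [(0, 2), (3, 5), (6, 8), (9, 11), (12, 14)]
rows = fdt_columns(classes, freqs=[3, 10, 5, 1, 1], p=1)
for r in rows:
    print(r["class"], r["freq"], r["rf"], r["less_cf"],
          r["greater_cf"], r["tcb"], r["cm"])
```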
Numerical Example 1
Using the number-of-siblings data above (𝒏 = 𝟐𝟎, 𝒑 = 𝟏):
1. 𝑹𝒂𝒏𝒈𝒆 = 𝟏𝟑 − 𝟎 = 𝟏𝟑
2. 𝒌 = √𝟐𝟎 = 𝟒.𝟒𝟕𝟐𝟏 … ≈ 𝟓
3. 𝒄 = 𝟏𝟑/𝟓 = 𝟐.𝟔 ≈ 𝟑
Freq. For each interval, count how many observations from the data set are included.
For the interval 𝟎 − 𝟐, there are 3 observations (𝟎, 𝟐, 𝟐). For the interval 𝟑 − 𝟓, there
are 10 observations (𝟓, 𝟑, 𝟑, 𝟒, 𝟒, 𝟒, 𝟓, 𝟑, 𝟑, 𝟑), and so on. You know you are on the right
track if the largest value (𝑴𝒂𝒙𝒊𝒎𝒖𝒎 = 𝟏𝟑) is contained in the last interval and
the total frequency is equal to 𝒏 = 𝟐𝟎.
Relative Frequency (𝑹𝑭) for each interval. For the first interval, the 𝑹𝑭 is 15%.
𝑅𝐹 (%) = (𝐹𝑟𝑒𝑞/𝑛) ∗ 100% = (3/20) ∗ 100% = 15% … and so on …
< 𝑪𝑭𝟏 = the number of observations that are less than or equal to 𝑼𝑳𝟏
< 𝑪𝑭𝟐 = the number of observations that are less than or equal to 𝑼𝑳𝟐
and so on…
> 𝑪𝑭𝟏 = the number of observations that are greater than or equal to 𝑳𝑳𝟏
> 𝑪𝑭𝟐 = the number of observations that are greater than or equal to 𝑳𝑳𝟐
and so on…
𝑪𝑴𝟏 = (𝟎 + 𝟐)/𝟐 = 𝟏
𝑪𝑴𝟐 = 𝑪𝑴𝟏 + 𝒄 = 𝟏 + 𝟑 = 𝟒
and so on…
Shortcut in getting the < 𝑪𝑭: Let 𝐹𝑟𝑒𝑞1 be the < 𝑪𝑭𝟏. Then add each succeeding frequency to the preceding < 𝑪𝑭 to get the next one, so that < 𝑪𝑭𝟐 = < 𝑪𝑭𝟏 + 𝐹𝑟𝑒𝑞2, and so on.
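The shortcut amounts to a running total of the frequencies. A minimal sketch with the frequencies of Numerical Example 1 is given here; running the same total from the bottom up to get the > 𝑪𝑭 is a natural extension and is an assumption, not something stated in the notes.

```python
from itertools import accumulate

freqs = [3, 10, 5, 1, 1]  # frequencies of the classes 0-2, 3-5, ..., 12-14

# <CF: running total from the first class downward.
less_cf = list(accumulate(freqs))
# >CF: running total from the last class upward, then put back in order.
greater_cf = list(accumulate(reversed(freqs)))[::-1]

print(less_cf)
print(greater_cf)
```

The last < 𝑪𝑭 and the first > 𝑪𝑭 should both equal 𝒏, which is a quick way to catch tallying mistakes.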
Questions:
https://www.youtube.com/watch?v=DW1lsZnaP8Q&t=1313s
Numerical Example 2
Construct the FDT of the weights (in kg) of a sample of female COVID19 survivors given
below.
𝒏 = 𝟑𝟓, 𝒑 = 𝟎.𝟏
Steps:
2. 𝒌 = √𝟑𝟓 = 𝟓.𝟗𝟏𝟔𝟎 … ≈ 𝟔
3. 𝒄 = 𝟑𝟎.𝟕/𝟔 = 𝟓.𝟏𝟏𝟔𝟔 ≈ 𝟓.𝟏
𝑳𝑻𝑪𝑩 = 𝑳𝑳 − 𝟎. 𝟓 ∗ 𝒑 𝑼𝑻𝑪𝑩 = 𝑼𝑳 + 𝟎. 𝟓 ∗ 𝒑
Practice:
• Graphical presentation makes use of graphs and charts or any means to visually display the
data. Graphs and charts catch the viewer's attention more easily than textual and
tabular presentations.
Types of Graphs
❖ Bar chart – Each bar represents a category of a variable. If the bars are placed along the
x-axis, the frequencies (counts) or relative frequencies are placed along the y-axis, and vice
versa. Bars can also be clustered or stacked. Referring to Table 1 on page 16, the variable
blood type is qualitative (categorical) with four categories (𝐀, 𝐁, 𝐀𝐁, 𝐎). In the said table,
𝑛 = 50 and the highest frequency is 23, pertaining to blood type O. To construct its bar chart,
do not calibrate your y-axis from 0 to 50; instead, calibrate it from 0 up to about the highest
frequency, or in multiples of 5, whichever is applicable. In the graph below,
23 is between 20 and 25. Point to ponder: What will your graph look like if you calibrate
it from 1 to 50? Take note also that the variable under consideration has 4 distinct
categories, so the bars must be separated by spaces. You may use Excel, statistical
software, or an online tool to construct this. If none of these is available, you can do it
manually. Provide an appropriate title below the graph, not above it.
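The advice on calibrating the y-axis can be sketched as a small calculation, followed by a rough text-only bar chart. The blood-type counts are assumed to be the row totals of Table 3, and the helper name `axis_upper_limit` is hypothetical.

```python
import math

# Blood-type frequencies, assumed to be the row totals of Table 3.
counts = {"A": 7, "B": 12, "AB": 8, "O": 23}

def axis_upper_limit(highest, step=5):
    """Smallest multiple of `step` that is at least the highest frequency."""
    return step * math.ceil(highest / step)

upper = axis_upper_limit(max(counts.values()))
ticks = list(range(0, upper + 1, 5))
print("Axis ticks:", ticks)

# One row per category; each '#' stands for one observation.
for category, freq in counts.items():
    print(f"{category:<3}| {'#' * freq} {freq}")
```

With a highest frequency of 23, the axis stops at 25 instead of 50, so the bars fill most of the plotting area.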
❖ Pie chart – As the name suggests, it looks like a pie (round, not the pan-pizza type). A slice
represents a category of the variable under consideration. One advantage of the pie chart
is that it shows each portion of the data (slice) in relation to the whole.
Each slice has a corresponding angle measurement (a circle has 360 degrees). Referring
to Table 1 on page 16, the corresponding angle measurements are computed as follows:
Using a compass and a protractor, draw a circle and measure the angle
corresponding to each category. You may use Excel, statistical software,
or an online tool to construct this. Provide an appropriate title below the graph.
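The angle of each slice is its share of the 360 degrees of the circle, that is, (Freq / n) ∗ 360. A minimal sketch follows, assuming the blood-type counts are the row totals of Table 3; the function name `pie_angles` is illustrative.

```python
# Blood-type frequencies, assumed to be the row totals of Table 3.
counts = {"A": 7, "B": 12, "AB": 8, "O": 23}

def pie_angles(counts):
    """Angle in degrees of each slice: (freq / n) * 360."""
    n = sum(counts.values())
    return {category: 360 * freq / n for category, freq in counts.items()}

angles = pie_angles(counts)
for category, angle in angles.items():
    print(f"{category:<3}{angle:>7.1f} degrees")
```

The angles should add up to 360 degrees, which is a useful check before drawing the slices with a protractor.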
❖ Stem-and-Leaf Plot – It presents numerical data in terms of “stems” and “leaves”. For
example, you have the number 29. Its stem is 2 and its leaf is 9. For the number 290, its
stem is 29 and its leaf is 0.
Example 1
In ascending order: 8, 9, 10, 20, 22, 23, 31, 34, 40, 45, 60, 63
Write the stems first. The data show that the stems are 0, 1, 2, 3, 4, and 6 (the
single-digit values 8 and 9 take the stem 0, since 8 is the same as 08).
So, for the stem 0, there are two observations (8 and 9) thus the leaves are 8 and 9
respectively.
For the stem 1, there is only one observation (10) so the leaf is 0.
For the stem 2, there are three observations (20, 22, 23) so the leaves are 0, 2, and 3
respectively.
For the stem 3, there are two observations (31 and 34) so the leaves are 1 and 4
respectively. And so on…
0 | 8 9
1 | 0
2 | 0 2 3
3 | 1 4
4 | 0 5
5 |
6 | 0 3
Key: 2|1=21
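The procedure in Example 1 can be sketched for whole numbers, where the leaf is the last digit and the stem is everything before it. The function name `stem_and_leaf` is hypothetical; keeping stems that have no leaves (such as 5 here) follows the plot above.

```python
def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its leaves (last digits)."""
    values = sorted(values)
    stems = range(values[0] // 10, values[-1] // 10 + 1)
    plot = {stem: [] for stem in stems}   # keeps empty stems such as 5
    for v in values:
        plot[v // 10].append(v % 10)
    return plot

data = [8, 9, 10, 20, 22, 23, 31, 34, 40, 45, 60, 63]
plot = stem_and_leaf(data)
for stem, leaves in plot.items():
    print(f"{stem:>2} | {' '.join(map(str, leaves))}")
```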
Example 2
Construct the stem-and-leaf plot of the systolic blood pressures of a sample of baseball
athletes before a big game. The systolic blood pressure is the pressure exerted when
blood is ejected into the arteries and is the upper number in blood pressure
measurements (e.g., 110/70).
110, 120, 118, 110, 109, 90, 115, 110, 113, 108, 100
In ascending order: 90, 100, 108, 109, 110, 110, 110, 113, 115, 118, 120
Now the stems are 9, 10, 11, and 12. Write the corresponding leaves. Do
not forget to write the key.
 9 | 0
10 | 0 8 9
11 | 0 0 0 3 5 8
12 | 0
Key: 12|5=125
Fig. 4. Systolic blood pressures of a sample of baseball athletes prior to a big game
Example 3
Given below are average ratings (1.0 to 5.0) given to an instructor by some of his students.
1.5, 2.4, 4.5, 5.0, 3.5, 4.3, 3.4, 2.1, 4.0, 4.3, 3.7, 5.0, 3.2, 1.9, 3.0
In ascending order:
1.5, 1.9, 2.1, 2.4, 3.0, 3.2, 3.4, 3.5, 3.7, 4.0, 4.3, 4.3, 4.5, 5.0, 5.0
Key: 2|3=2.3
Example 4
1.00 1.25 1.25 2.25 2.50 2.75 3.00 4.00 3.00 2.25 2.50 1.50
1.75 2.00 2.25 2.75 2.50 1.25 1.50 1.75 3.00 4.00 5.00 5.00
In ascending order:
1.00 1.25 1.25 1.25 1.50 1.50 1.75 1.75 2.00 2.25 2.25 2.25
2.50 2.50 2.50 2.75 2.75 3.00 3.00 3.00 4.00 4.00 5.00 5.00
Key: 1.2|5=1.25
Example 5
Suppose your data consist of the scores of 2BSND students in the 1st statistics quiz and
their sexes. Construct the stem-and-leaf plot of the data.
Scores 15 45 33 23 32 54 50 48 19 36 41 40 26 18 38 44 10 28 25 10 20
Sex M M F M F F F M M F M F M M F F F F M M F
In ascending order:
Scores 10 10 15 18 19 20 23 25 26 28 32 33 36 38 40 41 44 45 48 50 54
Sex F M M M M F M M M F F F F F F M F M M F F
For stem 1, the only leaf for females is 0 (female leaves are written from the stem going
to the left). For males, the leaves are 0, 5, 8, and 9.
For stem 2, the leaves for females are 0 and 8. For males, the leaves are 3, 5, and 6.
And so on….
  FEMALE | Stem | MALE
       0 |  1   | 0 5 8 9
     8 0 |  2   | 3 5 6
 8 6 3 2 |  3   |
     4 0 |  4   | 1 5 8
     4 0 |  5   |
Key: 1|3=31          Key: 1|3=13
Fig. 7. Scores of 2BSND students in the 1st statistics quiz
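A back-to-back plot like the one in Example 5 can be sketched by collecting the leaves of each group separately under a shared stem. The function name `back_to_back` and its parameters are hypothetical; the scores and sexes are those listed in Example 5.

```python
def back_to_back(values, groups, left, right):
    """Return {stem: {left: leaves, right: leaves}} with leaves ascending."""
    stems = range(min(values) // 10, max(values) // 10 + 1)
    plot = {stem: {left: [], right: []} for stem in stems}
    for value, group in sorted(zip(values, groups)):
        plot[value // 10][group].append(value % 10)
    return plot

scores = [15, 45, 33, 23, 32, 54, 50, 48, 19, 36, 41,
          40, 26, 18, 38, 44, 10, 28, 25, 10, 20]
sexes = list("MMFMFFFMMFMFMMFFFFMMF")

plot = back_to_back(scores, sexes, left="F", right="M")
print(f"{'FEMALE':>10} |   | MALE")
for stem, sides in plot.items():
    # Female leaves are printed in reverse so they grow away from the stem.
    left_leaves = " ".join(str(d) for d in reversed(sides["F"]))
    right_leaves = " ".join(str(d) for d in sides["M"])
    print(f"{left_leaves:>10} | {stem} | {right_leaves}")
```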
❖ Dot Plot – It is also known as a dot chart. It uses dots to present the frequency
of each category or value. One dot is equivalent to one observation, so just count how
many observations belong to each category.
Example 6
Suppose these are the favorite dog breeds of some BS Biology students:
Fig. 8. Favorite dog breeds of BS Biology students Fig. 9. Favorite dog breeds of BS Biology students
Example 7
The scores of students in a 10-item quiz are shown below. Construct its dot plot.
1, 1, 1, 0, 5, 6, 8, 9, 3, 4, 5, 10, 7, 6, 5, 7, 4, 5, 8, 9, 4, 6, 7, 3
Fig. 10. Scores of students in 10-item quiz Fig. 11. Scores of students in 10-item quiz
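A text-only version of the dot plot in Example 7 can be sketched as below. The function name `dot_plot` is hypothetical, and the letter "o" stands in for a dot; listing every possible score from 0 to 10, including scores nobody obtained, is a presentation choice rather than something the notes require.

```python
from collections import Counter

scores = [1, 1, 1, 0, 5, 6, 8, 9, 3, 4, 5, 10, 7, 6, 5, 7, 4, 5, 8, 9, 4, 6, 7, 3]

def dot_plot(data, low, high):
    """One 'o' per observation for every value from low to high."""
    counts = Counter(data)
    return {value: "o" * counts[value] for value in range(low, high + 1)}

plot = dot_plot(scores, low=0, high=10)
for value, dots in plot.items():
    print(f"{value:>2} | {dots}")
```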
❖ Pictograph – This graph presents the frequencies of your data using pictures or symbols. For
example, if you want to present the population of a certain country, then a human figure
is a natural image to use, with each figure standing for a fixed number of
people. For penguin sightings, a pictograph is shown below:
https://www.subjectcoach.com/imagecdn/prep-k/xpictograph1.png.pagespeed.ic.G3efOd_9Cv.png
There are graphs that can be made from the FDT. Two of these are the bar graph and the
histogram.
• Bar graph – The classes/intervals are plotted on the x-axis and the frequencies (or
relative frequencies) are plotted on the y-axis. It is best used when data are discrete.
• Histogram – The true class boundaries (𝑻𝑪𝑩) are plotted on the x-axis and the
frequencies (or relative frequencies) are plotted on the y-axis. It is best used when
data are continuous.
• Frequency polygon – The class marks (𝑪𝑴) are plotted on the x-axis and the frequencies
(or relative frequencies) are plotted on the y-axis. For a figure to be called a polygon,
it should be a closed figure. To close it, subtract 𝒄 from the first class mark and add 𝒄
to the last class mark, then give each of these two new points a frequency of 0.
Subtracting 𝒄 = 𝟑 from the first class mark and adding 𝒄 to the last class mark, both with
0 frequencies, we get these values:
𝑪𝑴 𝑭𝒓𝒆𝒒
−2 0
1 3
4 10
7 5
10 1
13 1
16 0
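The points of the closed polygon can be generated with a short sketch. The function name `polygon_points` is hypothetical; the class marks, frequencies, and 𝒄 = 3 are those of the table above.

```python
def polygon_points(class_marks, freqs, c):
    """Pad the class marks with one zero-frequency point on each end."""
    xs = [class_marks[0] - c, *class_marks, class_marks[-1] + c]
    ys = [0, *freqs, 0]
    return list(zip(xs, ys))

points = polygon_points([1, 4, 7, 10, 13], [3, 10, 5, 1, 1], c=3)
for cm, freq in points:
    print(f"{cm:>3}{freq:>5}")
```

Connecting these points in order with straight lines, starting and ending on the x-axis, gives the closed figure.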