1 Frequency Distribution
ANALYSIS
Frequency Distributions and Graphs
Engr. Stephanie Y. Cañete
Department of Civil Engineering
Frequency Distributions
A frequency distribution is the organization of raw data in table form, using classes and frequencies.
FREQUENCY DISTRIBUTIONS
The categorical frequency distribution is used for data that can be placed in specific categories.
Sample 1
25 people were given a blood test to determine their blood type.
Construct a categorical frequency distribution for the data given.
Total 25 100
For the sample, more people have type O blood than any other type.
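A categorical frequency distribution can also be tallied programmatically. The sketch below is a minimal illustration only: `blood_types` is made-up data (not the sample from the slide), and `categorical_distribution` is a hypothetical helper name. It lists each class with its frequency and percent.

```python
from collections import Counter

def categorical_distribution(values):
    """Return (category, frequency, percent) rows, sorted by category."""
    counts = Counter(values)
    n = len(values)
    return [(cat, freq, 100 * freq / n) for cat, freq in sorted(counts.items())]

# Hypothetical data for illustration only (not the sample from the slide).
blood_types = ["A"] * 6 + ["B"] * 7 + ["O"] * 9 + ["AB"] * 3

rows = categorical_distribution(blood_types)
for cat, freq, pct in rows:
    print(f"{cat:<6}{freq:>4}{pct:>6.0f}")
# The total row should show the sample size and 100 percent.
print(f"{'Total':<6}{sum(f for _, f, _ in rows):>4}{sum(p for _, _, p in rows):>6.0f}")
```

The percent column is each frequency divided by the sample size and multiplied by 100, so it should always sum to 100 (up to rounding).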
Grouped Frequency Distribution
■ When the range of the data is large, the data must be grouped into classes that are more than one unit in width, in what is called a grouped frequency distribution. For example, a distribution of the number of hours that boat batteries lasted is the following.
Class limits   Class boundaries   Tally         Frequency
24–30          23.5–30.5          |||           3
31–37          30.5–37.5          |             1
38–44          37.5–44.5          |||||         5
45–51          44.5–51.5          ||||| ||||    9
52–58          51.5–58.5          ||||| |       6
59–65          58.5–65.5          |             1
Total                                           25
Lower limit − 0.5 = 31 − 0.5 = 30.5 = lower boundary
Upper limit + 0.5 = 37 + 0.5 = 37.5 = upper boundary
If the data are in tenths, such as 6.2, 7.8, and 12.6, the limits for a class hypothetically might be 7.8–8.8, and the boundaries for that class would be 7.75–8.85. Find these values by subtracting 0.05 from 7.8 and adding 0.05 to 8.8.
Finally, the class width for a class in a frequency distribution is found by subtracting the lower (or upper) class limit of one class from the lower (or upper) class limit of the next class.
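The boundary and class-width rules above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the function names and the `unit` parameter (the precision of the data, 1 for whole numbers and 0.1 for tenths) are assumptions introduced here.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Boundaries lie half a unit below the lower limit and above the upper limit."""
    half = unit / 2
    # Round to suppress floating-point noise such as 7.749999999.
    return round(lower_limit - half, 6), round(upper_limit + half, 6)

def class_width(limits):
    """Lower limit of the second class minus lower limit of the first class."""
    return limits[1][0] - limits[0][0]

# Class limits from the boat battery distribution.
battery_limits = [(24, 30), (31, 37), (38, 44), (45, 51), (52, 58), (59, 65)]

print(class_boundaries(31, 37))              # whole-number data
print(class_boundaries(7.8, 8.8, unit=0.1))  # data in tenths
print(class_width(battery_limits))
```

Note that the width is not the upper limit minus the lower limit of the same class: 30 − 24 is 6, while the class 24–30 actually spans seven whole-number values.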
■ The researcher must decide how many classes to use and the width of each class. To
construct a frequency distribution, follow these rules:
1. There should be between 5 and 20 classes. Although there is no hard-and-fast rule for the
number of classes contained in a frequency distribution, it is of the utmost importance to
have enough classes to present a clear description of the collected data.
2. It is preferable but not absolutely necessary that the class width be an odd number. This
ensures that the midpoint of each class has the same place value as the data. The class
midpoint Xm is obtained by adding the lower and upper boundaries and dividing by 2, or
adding the lower and upper limits and dividing by 2.
For example, for a class with boundaries 5.5 and 11.5 (an even class width of 6), the midpoint falls in tenths:
Xm = (5.5 + 11.5) / 2 = 17 / 2 = 8.5
Rule 2 is only a suggestion, and it is not rigorously followed, especially when a computer is used to group data.
3. The classes must be mutually exclusive (not overlapping). Mutually exclusive classes have nonoverlapping class limits so that data cannot be placed into two classes. Many times, frequency distributions such as
Age
10–20
20–30
30–40
40–50
are found in the literature or in surveys. If a person is 40 years old, into which class should she or he be placed? A better way to construct a frequency distribution is to use classes such as
Age
10–20
21–31
32–42
43–53
4. The classes must be continuous (no gaps). Even if there are no values in a class, the class must be included in the frequency distribution. There should be no gaps in a frequency distribution. The only exception occurs when the class with a zero frequency is the first or last class. A class with a zero frequency at either end can be omitted without affecting the distribution.
5. The classes must be exhaustive. There should be enough classes to accommodate all the data.
6. The classes must be equal in width. This avoids a distorted view of the data.
One exception occurs when a distribution has a class that is open-ended. That is, the class has no specific beginning value or no specific ending value. A frequency distribution with an open-ended class is called an open-ended distribution. Here are two examples of distributions with open-ended classes.
Age Frequency Minutes Frequency
10–20 3 Below 110 16
21–31 6 110–114 24
32–42 4 115–119 38
43–53 10 120–124 14
54 and above 8 125–129 5
The frequency distribution for age is open-ended for the last class, which means
that anybody who is 54 years or older will be tallied in the last class. The
distribution for minutes is open-ended for the first class, meaning that any minute
values below 110 will be tallied in that class.
Example 2–2 shows the procedure for constructing a grouped frequency distribution,
i.e., when the classes contain more than one data value.
Example:
120 113 120 117 105 110 118 112 114 114
Source: The World Almanac and Book of Facts.
Solution
The procedure for constructing a grouped frequency distribution for numerical data
follows.
Step 1 Determine the classes.
Find the highest value and lowest value: H = 134 and L = 100.
Find the range: R = highest value − lowest value = H − L, so
R = 134 − 100 = 34
Select the number of classes desired (usually between 5 and 20). In this case, 7 is arbitrarily chosen.
Find the class width by dividing the range by the number of classes.
Width = R / number of classes = 34 / 7 ≈ 4.9
Round up to a class width of 5, which gives the class limits 100–104, 105–109, and so on through 130–134.
Find the class boundaries by subtracting 0.5 from each lower class limit and adding 0.5 to each upper class limit:
99.5–104.5, 104.5–109.5, etc.
Step 2 Tally the data.
Step 3 Find the numerical frequencies from the tallies.
The completed frequency distribution is
Class limits   Class boundaries   Tally                     Frequency
100–104        99.5–104.5         ||                        2
105–109        104.5–109.5        ||||| |||                 8
110–114        109.5–114.5        ||||| ||||| ||||| |||     18
115–119        114.5–119.5        ||||| ||||| |||           13
120–124        119.5–124.5        ||||| ||                  7
125–129        124.5–129.5        |                         1
130–134        129.5–134.5        |                         1
                                                            n = Σf = 50
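Steps 1 through 3 can be sketched as follows. This is a minimal illustration under stated assumptions: `build_classes` and `tally` are hypothetical helper names, the data are whole numbers, and the range does not divide evenly by the number of classes (if it did, an extra class might be needed to hold the highest value, which this sketch does not handle). Only the ten values listed in the example's data row are tallied here, not the full set of 50.

```python
import math

def build_classes(lowest, highest, num_classes):
    """Return (width, class limits) for whole-number data."""
    # Width = range / number of classes, rounded up to a whole number.
    width = math.ceil((highest - lowest) / num_classes)
    limits = [(lowest + i * width, lowest + (i + 1) * width - 1)
              for i in range(num_classes)]
    return width, limits

def tally(data, limits):
    """Count how many data values fall within each pair of class limits."""
    return [sum(lo <= x <= hi for x in data) for lo, hi in limits]

width, limits = build_classes(100, 134, 7)
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in limits]

# The ten values listed in the example; the full data set has 50 values.
row = [120, 113, 120, 117, 105, 110, 118, 112, 114, 114]

print(width)
for (lo, hi), (blo, bhi), f in zip(limits, boundaries, tally(row, limits)):
    print(f"{lo}-{hi}  {blo}-{bhi}  {f}")
```

Because the classes are mutually exclusive and exhaustive, the tallied frequencies always sum to the number of data values.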
Histograms, Frequency Polygons, and Ogives
■ After you have organized the data into a frequency distribution, you can present
them in graphical form. The purpose of graphs in statistics is to convey the data to
the viewers in pictorial form. It is easier for most people to comprehend the meaning
of data presented graphically than data presented numerically in tables or
frequency distributions. This is especially true if the users have little or no statistical
knowledge.
■ Statistical graphs can be used to describe the data set or to analyze it. Graphs are
also useful in getting the audience’s attention in a publication or a speaking
presentation. They can be used to discuss an issue, reinforce a critical point, or
summarize a data set. They can also be used to discover a trend or pattern in a
situation over a period of time.
The Histogram
■ The histogram is a graph that displays the data by using contiguous vertical bars
(unless the frequency of a class is 0) of various heights to represent the frequencies
of the classes.
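As a rough illustration of the idea, the sketch below prints a text version of a histogram for the record high temperature distribution (frequencies 2, 8, 18, 13, 7, 1, 1). It is only an approximation: the bars run horizontally rather than vertically, and `text_histogram` is a hypothetical helper name, not part of any library.

```python
def text_histogram(boundaries, frequencies, symbol="#"):
    """Return one line per class: its boundaries, a bar, and the frequency."""
    lines = []
    for (lo, hi), f in zip(boundaries, frequencies):
        # The bar length equals the class frequency.
        lines.append(f"{lo:>5.1f}-{hi:<5.1f} | {symbol * f} {f}")
    return lines

# Class boundaries 99.5-104.5, 104.5-109.5, ..., 129.5-134.5.
boundaries = [(99.5 + 5 * i, 104.5 + 5 * i) for i in range(7)]
frequencies = [2, 8, 18, 13, 7, 1, 1]

print("\n".join(text_histogram(boundaries, frequencies)))
```

Each class ends at the boundary where the next begins, which is why the bars of a histogram touch.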
■ Example
[Figure: Histogram. x axis: Temperature (°F), class boundaries 99.5° to 134.5°; y axis: Frequency]
The Frequency Polygon
■ The frequency polygon is a graph that displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes. The frequencies are represented by the heights of the points.
Solution
Step 1 Find the midpoints of each class. Recall that midpoints are found by adding the upper and lower boundaries and dividing by 2:
(99.5 + 104.5) / 2 = 102      (104.5 + 109.5) / 2 = 107
and so on. The midpoints are
Class boundaries Midpoints Frequency
99.5–104.5 102 2
104.5–109.5 107 8
109.5–114.5 112 18
114.5–119.5 117 13
119.5–124.5 122 7
124.5–129.5 127 1
129.5–134.5 132 1
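The midpoint column can be reproduced with a short calculation. This is a small sketch; `midpoints` is a hypothetical helper name, and the boundaries and frequencies are taken from the table above.

```python
def midpoints(boundaries):
    """Midpoint of each class: (lower boundary + upper boundary) / 2."""
    return [(lo + hi) / 2 for lo, hi in boundaries]

boundaries = [(99.5 + 5 * i, 104.5 + 5 * i) for i in range(7)]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# The frequency polygon connects these (midpoint, frequency) points.
points = list(zip(midpoints(boundaries), frequencies))
print(points)
```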
Step 2 Draw the x and y axes. Label the x axis with the midpoint of each class, and use a suitable scale on the y axis for the frequencies.
[Figure: Frequency polygon. x axis: Temperature (°F), class midpoints 102° to 132°; y axis: Frequency]
The Ogive
[Figure: Ogive for Example 2–6. x axis: Temperature (°F), class boundaries 99.5° to 134.5°; y axis: Cumulative frequency, 0 to 50]
[Figure: Finding a specific cumulative frequency on the same ogive, with the value 28 marked on the y axis]
The steps for drawing these three types of graphs are shown in the following
Procedure Table.
In summary….
Procedure Table
Constructing Statistical Graphs
Step 1 Draw and label the x and y axes.
Step 2 Choose a suitable scale for the frequencies or cumulative frequencies, and label it
on the y axis.
Step 3 Represent the class boundaries for the histogram or ogive, or the midpoint for the
frequency polygon, on the x axis.
Step 4 Plot the points and then draw the bars or lines.
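The cumulative frequencies needed for an ogive are running totals of the class frequencies. The sketch below uses the record high temperature distribution and assumes, as is conventional, that the ogive starts at a cumulative frequency of 0 at the first lower boundary; the variable names are illustrative.

```python
from itertools import accumulate

upper_boundaries = [104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Running total: number of values less than each upper class boundary.
cumulative = list(accumulate(frequencies))

# Assumed convention: the ogive starts at zero at the first lower boundary.
ogive_points = [(99.5, 0)] + list(zip(upper_boundaries, cumulative))

for boundary, cf in ogive_points:
    print(f"Less than {boundary}: {cf}")
```

The last cumulative frequency always equals the total number of data values, here 50.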
Relative Frequency
Step 1 Convert each frequency to a relative frequency (a proportion) by dividing it by the total frequency, 20. For example, the relative frequency for the class 15.5–20.5 is 3/20 = 0.15; and so on. Place these values in the column labeled Relative frequency.
Class boundaries   Midpoints   Frequency   Relative frequency
5.5–10.5           8           1           0.05
10.5–15.5          13          2           0.10
15.5–20.5          18          3           0.15
20.5–25.5          23          5           0.25
25.5–30.5          28          4           0.20
30.5–35.5          33          3           0.15
35.5–40.5          38          2           0.10
Total                          20          1.00
Step 2 Find the cumulative relative frequencies. To do this, add the relative frequency of each class to the cumulative total of the preceding classes. In this case, 0 + 0.05 = 0.05, 0.05 + 0.10 = 0.15, 0.15 + 0.15 = 0.30, 0.30 + 0.25 = 0.55, etc. Place these values in the column labeled Cumulative relative frequency.
An alternative method would be to find the cumulative frequencies and then convert each one to a relative frequency.
                   Cumulative frequency   Cumulative relative frequency
Less than 5.5      0                      0.00
Less than 10.5     1                      0.05
Less than 15.5     3                      0.15
Less than 20.5     6                      0.30
Less than 25.5     11                     0.55
Less than 30.5     15                     0.75
Less than 35.5     18                     0.90
Less than 40.5     20                     1.00
Step 3 Draw each graph as shown in Figure 2–7. For the histogram and ogive, use the class boundaries along the x axis. For the frequency polygon, use the midpoints on the x axis. The scale on the y axis uses proportions.
[Figure 2–7: (a) Histogram for Runners' Miles, (b) Frequency Polygon for Runners' Miles, (c) Ogive for Runners' Miles. x axis: Miles; y axis: Relative frequency for (a) and (b), Cumulative relative frequency for (c)]
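The relative and cumulative relative frequencies for the runners' miles data can be checked with a short calculation. This sketch uses the alternative method (find the cumulative frequencies first, then divide each by the total), which avoids accumulating floating-point rounding error; the variable names are illustrative.

```python
from itertools import accumulate

frequencies = [1, 2, 3, 5, 4, 3, 2]
n = sum(frequencies)  # total frequency

# Relative frequency: each class frequency as a proportion of the total.
relative = [round(f / n, 2) for f in frequencies]

# Cumulative frequencies first, then convert each one to a proportion.
cumulative = list(accumulate(frequencies))
cumulative_relative = [round(c / n, 2) for c in cumulative]

print(n)
print(relative)
print(cumulative)
print(cumulative_relative)
```

The relative frequencies sum to 1.00, and the last cumulative relative frequency is always 1.00.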
ACTIVITY #1
Counties, Divisions, or Parishes for 50 States
■ The number of counties, divisions, or parishes for each of the 50 states is given below. Use the data to construct a grouped frequency distribution with 6 classes, a histogram, a frequency polygon, and an ogive. Analyze the distribution.
67 27 15 75 58 64 8 67 159 5
102 44 92 99 105 120 64 16 23 14
83 87 82 114 56 93 16 10 21 33
62 100 53 88 77 36 67 5 46 66
95 254 29 14 95 39 55 72 23 3
Source: World Almanac and Book of Facts.