
1 Frequency Distribution

This document provides information about frequency distributions, which organize raw data into a table with classes and frequencies. It discusses categorical frequency distributions for categorical data and grouped frequency distributions for data with a large range. Guidelines are given for determining the number of classes, class limits and boundaries to construct a clear frequency distribution. The class width is found by subtracting the lower limit of one class from the next.

ENGINEERING DATA ANALYSIS
Frequency Distributions and Graphs
Engr. Stephanie Y. Cañete
Department of Civil Engineering
Frequency Distributions

■ After collecting data, the first task for a researcher is to organize and simplify the data so that it is possible to get a general overview of the results.
■ This is the goal of descriptive statistical techniques.
■ One method for simplifying and organizing data is to construct a frequency distribution.

A frequency distribution is the organization of raw data in table form, using classes and frequencies.
FREQUENCY DISTRIBUTIONS
The categorical frequency distribution is used for data that can be placed in specific categories.
Sample 1
25 people were given a blood test to determine their blood type.
Construct a categorical frequency distribution for the data given.
Total 25 100
For the sample, more people have type O blood than any other type.
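A categorical frequency distribution can be tallied in a few lines of Python. The sketch below is only an illustration: the list of 25 blood types is made up (the data table for Sample 1 is not reproduced in these slides), and the helper name `categorical_distribution` is hypothetical.

```python
from collections import Counter

def categorical_distribution(values):
    """Return {category: (frequency, percent)} for a list of categorical values."""
    counts = Counter(values)
    n = len(values)
    return {cat: (f, 100 * f / n) for cat, f in counts.items()}

# Illustrative data only (NOT the original Sample 1 data): 25 blood types.
blood_types = ["A"] * 5 + ["B"] * 7 + ["O"] * 9 + ["AB"] * 4

table = categorical_distribution(blood_types)
for cat, (f, pct) in table.items():
    print(f"{cat:<4}{f:>4}{pct:>6.0f}%")
print(f"Total {sum(f for f, _ in table.values())}")
```

The percent column is simply each frequency divided by the total number of observations, times 100, so the percents always add up to 100.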
Grouped Frequency Distribution
■ When the range of the data is large, the data must be grouped into classes that are more than one unit in width, in what is called a grouped frequency distribution. For example, a distribution of the number of hours that boat batteries lasted is the following.

Class limits   Class boundaries   Tally         Frequency
24–30          23.5–30.5          |||           3
31–37          30.5–37.5          |             1
38–44          37.5–44.5          |||||         5
45–51          44.5–51.5          ||||| ||||    9
52–58          51.5–58.5          ||||| |       6
59–65          58.5–65.5          |             1
Total                                           25

The procedure for constructing the preceding frequency distribution is given in Example 2–2. In this distribution, the values 24 and 30 of the first class are called class limits. The lower class limit is 24; it represents the smallest data value that can be included in the class. The upper class limit is 30; it represents the largest data value that can be included in the class. The numbers in the second column are called class boundaries. These numbers are used to separate the classes so that there are no gaps in the frequency distribution. The gaps are due to the limits; for example, there is a gap between 30 and 31.
■ Students sometimes have difficulty finding class boundaries when given the class limits.
■ The basic rule of thumb is that the class limits should have the same decimal place value as the data, but the class boundaries should have one additional place value and end in a 5.
■ For example, if the values in the data set are whole numbers, such as 24, 32, and 18, the limits for a class might be 31–37, and the boundaries are 30.5–37.5. Find the boundaries by subtracting 0.5 from 31 (the lower class limit) and adding 0.5 to 37 (the upper class limit).

Lower limit − 0.5 = 31 − 0.5 = 30.5 = lower boundary
Upper limit + 0.5 = 37 + 0.5 = 37.5 = upper boundary

■ If the data are in tenths, such as 6.2, 7.8, and 12.6, the limits for a class hypothetically might be 7.8–8.8, and the boundaries for that class would be 7.75–8.85. Find these values by subtracting 0.05 from 7.8 and adding 0.05 to 8.8.
■ Finally, the class width for a class in a frequency distribution is found by subtracting the lower (or upper) class limit of one class from the lower (or upper) class limit of the next class.
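The boundary and width rules can be written as two small functions. This is a minimal sketch, not a standard library routine: the names `class_boundaries` and `class_width` and the `unit` parameter (the precision of the data, 1 for whole numbers and 0.1 for tenths) are assumptions made for illustration.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Boundaries sit half a unit below the lower limit and half a unit above the upper limit.

    `unit` is the precision of the data: 1 for whole numbers, 0.1 for tenths.
    """
    half = unit / 2
    # Round to strip floating-point noise such as 8.850000000000001.
    return round(lower_limit - half, 6), round(upper_limit + half, 6)

def class_width(lower_limit, next_lower_limit):
    """Width = lower limit of the next class minus the lower limit of this class."""
    return next_lower_limit - lower_limit

print(class_boundaries(31, 37))         # whole-number data
print(class_boundaries(7.8, 8.8, 0.1))  # data in tenths
print(class_width(24, 31))              # first two classes of the battery table
```

Note that the width is taken between the lower limits of consecutive classes, not between the lower and upper limit of one class: for 24–30 the width is 31 − 24 = 7, not 30 − 24 = 6.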
■ The researcher must decide how many classes to use and the width of each class. To
construct a frequency distribution, follow these rules:

1. There should be between 5 and 20 classes. Although there is no hard-and-fast rule for the
number of classes contained in a frequency distribution, it is of the utmost importance to
have enough classes to present a clear description of the collected data.

2. It is preferable but not absolutely necessary that the class width be an odd number. This
ensures that the midpoint of each class has the same place value as the data. The class
midpoint Xm is obtained by adding the lower and upper boundaries and dividing by 2, or
adding the lower and upper limits and dividing by 2.
Xm = (5.5 + 11.5) / 2 = 17 / 2 = 8.5
Rule 2 is only a suggestion, and it is not rigorously followed, especially when a computer is used to group data.
3. The classes must be mutually exclusive (not overlapping). Mutually exclusive classes have nonoverlapping class limits so that data cannot be placed into two classes. Many times, frequency distributions such as
Age
10–20
20–30
30–40
40–50
are found in the literature or in surveys. If a person is 40 years old, into which class should she or he be placed? A better way to construct a frequency distribution is to use classes such as
Age
10–20
21–31
32–42
43–53
4. The classes must be continuous (no gaps). Even if there are no values in a class, the class must be included in the frequency distribution. There should be no gaps in a frequency distribution. The only exception occurs when the class with a zero frequency is the first or last class. A class with a zero frequency at either end can be omitted without affecting the distribution.
5. The classes must be exhaustive. There should be enough classes to accommodate all the data.
6. The classes must be equal in width. This avoids a distorted view of the data. One exception occurs when a distribution has a class that is open-ended. That is, the class has no specific beginning value or no specific ending value. A frequency distribution with an open-ended class is called an open-ended distribution. Here are two examples of distributions with open-ended classes.
Age            Frequency        Minutes      Frequency
10–20          3                Below 110    16
21–31          6                110–114      24
32–42          4                115–119      38
43–53          10               120–124      14
54 and above   8                125–129      5
The frequency distribution for age is open-ended for the last class, which means
that anybody who is 54 years or older will be tallied in the last class. The
distribution for minutes is open-ended for the first class, meaning that any minute
values below 110 will be tallied in that class.
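Rules 3, 4, and 6 (mutually exclusive, continuous, equal width) lend themselves to a quick automated check. The sketch below is one possible way to do it; the function name `check_classes` is hypothetical, and it assumes whole-number data, closed classes given as (lower limit, upper limit) pairs, and classes listed in ascending order.

```python
def check_classes(limits, unit=1):
    """Return a list of rule violations for classes given as (lower, upper) limit pairs.

    Assumes ascending order and whole-number data (`unit` = 1 by default).
    """
    problems = []
    widths = set()
    for (lo, hi), (next_lo, _) in zip(limits, limits[1:]):
        widths.add(next_lo - lo)
        if next_lo <= hi:                 # rule 3: classes must not overlap
            problems.append(f"classes overlap at {next_lo}")
        elif next_lo - hi > unit:         # rule 4: no gaps between classes
            problems.append(f"gap between {hi} and {next_lo}")
    if len(widths) > 1:                   # rule 6: equal class widths
        problems.append("classes are not equal in width")
    return problems

overlapping = [(10, 20), (20, 30), (30, 40), (40, 50)]
better = [(10, 20), (21, 31), (32, 42), (43, 53)]

print(check_classes(overlapping))
print(check_classes(better))
```

The first set of age classes is flagged because each upper limit is repeated as the next lower limit, which is exactly why a 40-year-old cannot be placed unambiguously. Open-ended classes are outside the scope of this check.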
Example 2–2 shows the procedure for constructing a grouped frequency distribution,
i.e., when the classes contain more than one data value.
Example 2–2 Record High Temperatures


These data represent the record high temperatures in degrees Fahrenheit (°F) for each of the 50 states. Construct a grouped frequency distribution for the data using 7 classes.
112 100 127 120 134 118 105 110 109 112
110 118 117 116 118 122 114 114 105 109
107 112 114 115 118 117 118 122 106 110
116 108 110 121 113 120 119 111 104 111
120 113 120 117 105 110 118 112 114 114
Source: The World Almanac and Book of Facts.

Solution
The procedure for constructing a grouped frequency distribution for numerical data
follows.
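Step 1 Determine the classes.
Find the highest value and lowest value: H = 134 and L = 100.
Find the range: R = highest value − lowest value = H − L, so
R = 134 − 100 = 34
Select the number of classes desired (usually between 5 and 20). In this case, 7 is arbitrarily chosen.
Find the class width by dividing the range by the number of classes.
Width = R / number of classes = 34 / 7 = 4.9
Round the answer up to the nearest whole number, so the class width is 5.
Select a starting point for the lowest class limit; here the lowest value, 100, is used. Add the width to get the lower limits of the remaining classes: 100, 105, 110, 115, 120, 125, 130. The upper class limits are then 104, 109, 114, 119, 124, 129, 134.
Find the class boundaries by subtracting 0.5 from each lower class limit and adding 0.5 to each upper class limit:
99.5–104.5, 104.5–109.5, etc.
Step 2 Tally the data.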
Step 3 Find the numerical frequencies from the tallies.
The completed frequency distribution is

Class limits   Class boundaries   Tally                     Frequency
100–104        99.5–104.5         ||                        2
105–109        104.5–109.5        ||||| |||                 8
110–114        109.5–114.5        ||||| ||||| ||||| |||     18
115–119        114.5–119.5        ||||| ||||| |||           13
120–124        119.5–124.5        ||||| ||                  7
125–129        124.5–129.5        |                         1
130–134        129.5–134.5        |                         1
                                                            n = Σf = 50

The frequency distribution shows that the class 109.5–114.5 contains the largest number of temperatures (18) followed by the class 114.5–119.5 with 13 temperatures. Hence, most of the temperatures (31) fall between 109.5 and 119.5°F.
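The three steps of Example 2–2 can be reproduced programmatically. The following is a minimal sketch for whole-number data, with a hypothetical function name `grouped_distribution`; it assumes the range divided by the number of classes is not already a whole number, so that rounding up leaves room for the highest value.

```python
import math

temps = [
    112, 100, 127, 120, 134, 118, 105, 110, 109, 112,
    110, 118, 117, 116, 118, 122, 114, 114, 105, 109,
    107, 112, 114, 115, 118, 117, 118, 122, 106, 110,
    116, 108, 110, 121, 113, 120, 119, 111, 104, 111,
    120, 113, 120, 117, 105, 110, 118, 112, 114, 114,
]

def grouped_distribution(data, num_classes):
    """Build (lower limit, upper limit, frequency) rows for whole-number data."""
    low, high = min(data), max(data)
    # Width = range / number of classes, rounded up (34 / 7 = 4.9 becomes 5).
    width = math.ceil((high - low) / num_classes)
    rows = []
    for k in range(num_classes):
        lower = low + k * width
        upper = lower + width - 1   # whole-number data, so limits are 1 apart
        freq = sum(lower <= x <= upper for x in data)
        rows.append((lower, upper, freq))
    return rows

rows = grouped_distribution(temps, 7)
for lower, upper, freq in rows:
    print(f"{lower}-{upper}   {lower - 0.5}-{upper + 0.5}   {freq}")
```

Counting with the class limits rather than tallying by hand gives the same frequencies as the completed table, and the frequencies sum to n = 50.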
■ A cumulative frequency distribution is a distribution that shows the number of data values less than or equal to a specific value (usually an upper boundary). The values are found by adding the frequencies of the classes less than or equal to the upper class boundary of a specific class. This gives an ascending cumulative frequency. For example, the cumulative frequency for the number of data values less than 114.5 can be found by adding 10 + 18 = 28.
■ The cumulative frequency distribution for the data in this example is as follows:
                  Cumulative frequency
Less than 99.5    0
Less than 104.5   2
Less than 109.5   10
Less than 114.5   28
Less than 119.5   41
Less than 124.5   48
Less than 129.5   49
Less than 134.5   50
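Cumulative frequencies are running totals of the class frequencies. As a small sketch (variable names are illustrative), the standard-library `itertools.accumulate` produces them directly from the frequencies of Example 2–2:

```python
from itertools import accumulate

boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Running totals; the leading 0 is the count below the first boundary (99.5).
cumulative = [0] + list(accumulate(frequencies))

for b, cf in zip(boundaries, cumulative):
    print(f"Less than {b}: {cf}")
```

The last cumulative frequency must equal the total number of observations, which is a convenient check on the arithmetic.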
■ Cumulative frequencies are used to show how many data values are accumulated up to and including a specific class. In Example 2–2, 28 of the total record high temperatures are less than or equal to 114°F. Forty-eight of the total record high temperatures are less than or equal to 124°F.
■ After the raw data have been organized into a frequency distribution, it will be analyzed by looking for peaks and extreme values.
In summary….
Procedure Table
Constructing a Grouped Frequency Distribution
Step 1 Determine the classes.
Find the highest and lowest values.
Find the range.
Select the number of classes desired.
Find the width by dividing the range by the number of classes and rounding up.
Select a starting point (usually the lowest value or any convenient number less than the lowest value); add the width to get the lower limits.
Find the upper class limits.
Find the boundaries.
Step 2 Tally the data.
Step 3 Find the numerical frequencies from the tallies, and find the cumulative frequencies.
Histograms, Frequency Polygons, and Ogives
■ After you have organized the data into a frequency distribution, you can present
them in graphical form. The purpose of graphs in statistics is to convey the data to
the viewers in pictorial form. It is easier for most people to comprehend the meaning
of data presented graphically than data presented numerically in tables or
frequency distributions. This is especially true if the users have little or no statistical
knowledge.
■ Statistical graphs can be used to describe the data set or to analyze it. Graphs are
also useful in getting the audience’s attention in a publication or a speaking
presentation. They can be used to discuss an issue, reinforce a critical point, or
summarize a data set. They can also be used to discover a trend or pattern in a
situation over a period of time.
The Histogram
■ The histogram is a graph that displays the data by using contiguous vertical bars
(unless the frequency of a class is 0) of various heights to represent the frequencies
of the classes.

■ Here are the steps:

1. Determine the frequency and relative frequency for each class.


2. Mark the class boundaries on a horizontal measurement axis.
3. Above each class interval, draw a rectangle whose height is the corresponding relative
frequency (or frequency).
Example 2–4 Record High Temperatures


Construct a histogram to represent the data shown for the record high temperatures for
each of the 50 states (see Example 2–2).

Class boundaries Frequency


99.5–104.5 2
104.5–109.5 8
109.5–114.5 18
114.5–119.5 13
119.5–124.5 7
124.5–129.5 1
129.5–134.5 1

Solution
Figure 2–2 Histogram for Example 2–4: Record High Temperatures (x axis: temperature in °F, marked at the class boundaries 99.5 to 134.5; y axis: frequency, 0 to 18)
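A quick way to preview a histogram without plotting software is to print one bar per class. The sketch below uses only the Python standard library and a hypothetical helper `text_histogram`; unlike the histogram described above, the bars run horizontally, but their lengths still represent the class frequencies.

```python
def text_histogram(boundaries, frequencies, symbol="#"):
    """Return one line per class: its boundaries followed by a bar of `symbol`s."""
    lines = []
    for (lo, hi), f in zip(zip(boundaries, boundaries[1:]), frequencies):
        lines.append(f"{lo:>5}-{hi:<5} | {symbol * f} {f}")
    return lines

boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

lines = text_histogram(boundaries, frequencies)
print("\n".join(lines))
```

Because consecutive classes share a boundary, the bars are contiguous, and the peak at 109.5–114.5 is visible at a glance.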
The Frequency Polygon
■ The frequency polygon is a graph that displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes. The frequencies are represented by the heights of the points.
Solution
Step 1 Find the midpoints of each class. Recall that midpoints are found by adding the upper and lower boundaries and dividing by 2:
(99.5 + 104.5) / 2 = 102     (104.5 + 109.5) / 2 = 107
and so on. The midpoints are
Class boundaries   Midpoints   Frequency
99.5–104.5         102         2
104.5–109.5        107         8
109.5–114.5        112         18
114.5–119.5        117         13
119.5–124.5        122         7
124.5–129.5        127         1
129.5–134.5        132         1
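The midpoints, and therefore the points of the frequency polygon, follow directly from the class boundaries. A minimal sketch with illustrative variable names:

```python
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Midpoint of each class = (lower boundary + upper boundary) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

# Each (midpoint, frequency) pair is one point of the frequency polygon.
points = list(zip(midpoints, frequencies))
for x, y in points:
    print(f"midpoint {x:g}: frequency {y}")
```

Connecting these points in order with straight line segments gives the polygon.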
Step 2 Draw the x and y axes. Label the x axis with the midpoint of each class, and use a suitable scale on the y axis for the frequencies.
Figure 2–3 Frequency Polygon for Example 2–5: Record High Temperatures (x axis: temperature in °F, marked at the class midpoints 102 to 132; y axis: frequency, 0 to 18)

The Ogive
■ The ogive is a graph that represents the cumulative frequencies for the classes in a frequency distribution.
Example 2–6 shows the procedure for constructing an ogive.
Example 2–6 Record High Temperatures
Construct an ogive for the frequency distribution described in Example 2–4.
Solution
Step 1 Find the cumulative frequency for each class.
                  Cumulative frequency
Less than 99.5    0
Less than 104.5   2
Less than 109.5   10
Less than 114.5   28
Less than 119.5   41
Less than 124.5   48
Less than 129.5   49
Less than 134.5   50

Figure 2–5 Ogive for Example 2–6: Record High Temperatures (x axis: temperature in °F, marked at the class boundaries 99.5 to 134.5; y axis: cumulative frequency, 0 to 50)
Figure 2–6 Ogive for Record High Temperatures with a specific cumulative frequency (28) marked (same axes as Figure 2–5)
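Reading a cumulative frequency off the ogive amounts to following the line segments between the plotted points. The sketch below does this with linear interpolation. The function name `ogive_value` is hypothetical, and values read between class boundaries are only estimates, since they assume the data are spread evenly within each class.

```python
def ogive_value(x, boundaries, cumulative):
    """Read the cumulative frequency at x off the ogive by linear interpolation."""
    if x <= boundaries[0]:
        return cumulative[0]
    if x >= boundaries[-1]:
        return cumulative[-1]
    segments = zip(zip(boundaries, cumulative),
                   zip(boundaries[1:], cumulative[1:]))
    for (b0, c0), (b1, c1) in segments:
        if b0 <= x <= b1:
            # Point on the straight line joining (b0, c0) and (b1, c1).
            return c0 + (c1 - c0) * (x - b0) / (b1 - b0)

boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
cumulative = [0, 2, 10, 28, 41, 48, 49, 50]

print(ogive_value(114.5, boundaries, cumulative))  # exact: a class boundary
print(ogive_value(112, boundaries, cumulative))    # estimate inside a class
```

At a class boundary the result is exact (114.5 gives 28, the value marked in the figure); anywhere else it is an approximation.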

The steps for drawing these three types of graphs are shown in the following Procedure Table.

In summary….
Procedure Table
Constructing Statistical Graphs
Step 1 Draw and label the x and y axes.
Step 2 Choose a suitable scale for the frequencies or cumulative frequencies, and label it on the y axis.
Step 3 Represent the class boundaries for the histogram or ogive, or the midpoint for the frequency polygon, on the x axis.
Step 4 Plot the points and then draw the bars or lines.

Relative Frequency Graphs
Construct a histogram, frequency polygon, and ogive using relative frequencies for the distribution (shown here) of the miles that 20 randomly selected runners ran during a given week.

Class boundaries   Frequency
5.5–10.5           1
10.5–15.5          2
15.5–20.5          3
20.5–25.5          5
25.5–30.5          4
30.5–35.5          3
35.5–40.5          2
Total              20

Solution
Step 1 Convert each frequency to a proportion or relative frequency by dividing the frequency for each class by the total number of observations.
For class 5.5–10.5, the relative frequency is 1/20 = 0.05; for class 10.5–15.5, the relative frequency is 2/20 = 0.10; for class 15.5–20.5, the relative frequency is 3/20 = 0.15; and so on.
Place these values in the column labeled Relative frequency.

Class boundaries   Midpoints   Relative frequency
5.5–10.5           8           0.05
10.5–15.5          13          0.10
15.5–20.5          18          0.15
20.5–25.5          23          0.25
25.5–30.5          28          0.20
30.5–35.5          33          0.15
35.5–40.5          38          0.10
Total                          1.00

Step 2 Find the cumulative relative frequencies. To do this, add the relative frequency in each class to the total relative frequency of the preceding classes. In this case, the values are 0.05, 0.05 + 0.10 = 0.15, 0.15 + 0.15 = 0.30, 0.30 + 0.25 = 0.55, and so on. Place these values in the column labeled Cumulative relative frequency.
An alternative method would be to find the cumulative frequencies and then convert each one to a relative frequency.

                  Cumulative frequency   Cumulative relative frequency
Less than 5.5     0                      0.00
Less than 10.5    1                      0.05
Less than 15.5    3                      0.15
Less than 20.5    6                      0.30
Less than 25.5    11                     0.55
Less than 30.5    15                     0.75
Less than 35.5    18                     0.90
Less than 40.5    20                     1.00

Step 3 Draw each graph as shown in Figure 2–7. For the histogram and ogive, use the class boundaries along the x axis. For the frequency polygon, use the midpoints on the x axis. The scale on the y axis uses proportions.

Figure 2–7 Graphs for Runners’ Miles: (a) Histogram (x axis: miles, marked at the class boundaries 5.5 to 40.5; y axis: relative frequency); (b) Frequency polygon (x axis: miles, marked at the midpoints 8 to 38; y axis: relative frequency); (c) Ogive (x axis: miles, marked at the class boundaries 5.5 to 40.5; y axis: cumulative relative frequency)
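Both relative-frequency columns can be computed from the raw frequencies. This sketch (illustrative variable names) uses the alternative method mentioned in the solution: accumulate the counts first, then divide by the total. Rounding to two decimals is an assumption made here to keep floating-point noise out of the printed values.

```python
from itertools import accumulate

# Runners' miles: classes 5.5-10.5, 10.5-15.5, ..., 35.5-40.5
frequencies = [1, 2, 3, 5, 4, 3, 2]
n = sum(frequencies)  # 20 runners

# Relative frequency = class frequency / total number of observations.
relative = [round(f / n, 2) for f in frequencies]

# Alternative method: cumulative frequencies first, then convert to proportions.
cumulative = list(accumulate(frequencies))
cumulative_relative = [round(c / n, 2) for c in cumulative]

for f, r, c, cr in zip(frequencies, relative, cumulative, cumulative_relative):
    print(f"{f:>2}  {r:.2f}  {c:>2}  {cr:.2f}")
```

Accumulating integer counts before dividing avoids adding many small decimals together, so the final cumulative relative frequency comes out as exactly 1.00.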
ACTIVITY #1
■ The number of counties or divisions for each of the 50 states is given below. Use the data to construct a grouped frequency distribution with 6 classes, a histogram, a frequency polygon, and an ogive.
67 27 15 75 58 64 8 67 159 5
102 44 92 99 105 120 64 16 23 14
83 87 82 114 56 93 16 10 21 33
62 100 53 88 77 36 67 5 46 66
95 254 29 14 95 39 55 72 23 3
Source: World Almanac and Book of Facts.
