Lecture Notes 2
Frequency distributions
A frequency distribution is the organization of raw data in table form, using classes and frequencies.
The categorical frequency distribution is used for data that can be placed in specific categories, such as nominal-
or ordinal-level data. For example, data such as political affiliation, religious affiliation, or major field of study
would use categorical frequency distributions.
Example 2–1: Twenty-five army inductees were given a blood test to determine their blood type. The data set is
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
Construct a frequency distribution for the data.
Solution
Since the data are categorical, discrete classes can be used. There are four blood types: A, B, O, and AB. These
types will be used as the classes for the distribution.
The frequency distribution for the data is:
A       B           C           D
Class   Tally       Frequency   Percent
A       ////        5           20
B       //// //     7           28
O       //// ////   9           36
AB      ////        4           16
Total               25          100
For the sample, more people have type O blood than any other type.
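The tally in Example 2–1 can be checked with a few lines of Python. This is a minimal sketch added for illustration, not part of the original solution; the variable names (blood_types, counts, percents) are assumptions, and the data are transcribed row by row from the example.

```python
from collections import Counter

# Blood types of the 25 inductees, transcribed from Example 2-1.
blood_types = (
    "A B B AB O "
    "O O B AB B "
    "B B O A O "
    "A O O O AB "
    "AB A O B A"
).split()

counts = Counter(blood_types)  # frequency of each class
n = len(blood_types)           # total number of observations

# Percent for a class = (class frequency / n) * 100
percents = {cls: 100 * f / n for cls, f in counts.items()}

# Print the classes in the same order as the table.
for cls in ["A", "B", "O", "AB"]:
    print(f"{cls:<6}{counts[cls]:>4}{percents[cls]:>6.0f}")
```

Counter does the tallying; the percent column is then just each frequency divided by the sample size.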
When the range of the data is large, the data must be grouped into classes that are more than one unit in width, in
what is called a grouped frequency distribution.
Example 2–2: These data represent the record high temperatures in degrees Fahrenheit (°F) for each of the 50
states. Construct a grouped frequency distribution for the data using 7 classes.
112 100 127 120 134 118 105 110 109 112
110 118 117 116 118 122 114 114 105 109
107 112 114 115 118 117 118 122 106 110
116 108 110 121 113 120 119 111 104 111
120 113 120 117 105 110 118 112 114 114
Solution
The grouped frequency distribution for the data is:
Class limits   Class boundaries   Tally                 Frequency
100–104        99.5–104.5         //                    2
105–109        104.5–109.5        //// ///              8
110–114        109.5–114.5        //// //// //// ///    18
115–119        114.5–119.5        //// //// ///         13
120–124        119.5–124.5        //// //               7
125–129        124.5–129.5        /                     1
130–134        129.5–134.5        /                     1
n = Σf = 50
The frequency distribution shows that the class 109.5–114.5 contains the largest number of temperatures (18)
followed by the class 114.5–119.5 with 13 temperatures. Hence, most of the temperatures (31) fall between 109.5
and 119.5°F.
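The grouping in Example 2–2 can be reproduced as follows. This is a sketch under stated assumptions: the starting limit of 100 and the class width of 5 are read off the table above rather than derived, and the names (temps, limits, freqs) are illustrative only.

```python
# Record high temperatures (degrees F) for the 50 states, from Example 2-2.
temps = [
    112, 100, 127, 120, 134, 118, 105, 110, 109, 112,
    110, 118, 117, 116, 118, 122, 114, 114, 105, 109,
    107, 112, 114, 115, 118, 117, 118, 122, 106, 110,
    116, 108, 110, 121, 113, 120, 119, 111, 104, 111,
    120, 113, 120, 117, 105, 110, 118, 112, 114, 114,
]

lowest = 100     # lower limit of the first class (assumed from the table)
width = 5        # class width: 105 - 100
num_classes = 7  # given in the example

# Class limits: (100, 104), (105, 109), ..., (130, 134)
limits = [
    (lowest + i * width, lowest + (i + 1) * width - 1)
    for i in range(num_classes)
]

# Frequency of a class = number of values between its limits, inclusive.
freqs = [sum(lo <= t <= hi for t in temps) for lo, hi in limits]

for (lo, hi), f in zip(limits, freqs):
    # Class boundaries lie half a unit outside the class limits.
    print(f"{lo}-{hi}   {lo - 0.5}-{hi + 0.5}   {f}")
```

Because the data are whole numbers, counting between the limits gives the same result as counting between the boundaries.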
Cumulative frequencies are used to show how many data values are accumulated up to and including a specific
class. In Example 2–2, 28 of the total record high temperatures are less than or equal to 114°F. Forty-eight of the
total record high temperatures are less than or equal to 124°F.
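Cumulative frequencies are running totals of the class frequencies. The sketch below illustrates this with the frequencies from Example 2–2; it is an added illustration, and the names (upper_boundaries, cumulative) are assumptions.

```python
from itertools import accumulate

# Upper class boundaries and frequencies from Example 2-2.
upper_boundaries = [104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
freqs = [2, 8, 18, 13, 7, 1, 1]

# Running total: each entry counts the values up to and including that class.
cumulative = list(accumulate(freqs))

for ub, cf in zip(upper_boundaries, cumulative):
    print(f"Less than {ub}: {cf}")
```

The third and fifth entries are the 28 and 48 quoted in the text.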
PHM111s - Probability and Statistics
After organizing the data into a frequency distribution, they can be presented in graphical form.
1. The histogram is a graph that displays the data by using contiguous
vertical bars of various heights to represent the frequencies of the classes.
2. The frequency polygon is a graph that displays the data by using lines
that connect points plotted for the frequencies at the midpoints of the
classes. The frequencies are represented by the heights of the points.
3. The ogive is a graph that represents the cumulative frequencies for the
classes in a frequency distribution.
Example 2–3: Construct a histogram to represent the data shown for the record high temperatures for each of the
50 states in Example 2–2.
Solution
Class boundaries Frequency
99.5–104.5 2
104.5–109.5 8
109.5–114.5 18
114.5–119.5 13
119.5–124.5 7
124.5–129.5 1
129.5–134.5 1
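As a rough stand-in for the plotted histogram, the bars can be drawn as text. This is only a sketch: the bars run horizontally rather than vertically, one "#" stands for one temperature, and the names (boundaries, bars) are assumptions.

```python
# Class boundaries and frequencies from the table in Example 2-3.
boundaries = [
    (99.5, 104.5), (104.5, 109.5), (109.5, 114.5), (114.5, 119.5),
    (119.5, 124.5), (124.5, 129.5), (129.5, 134.5),
]
freqs = [2, 8, 18, 13, 7, 1, 1]

# The length of each bar equals the frequency of its class.
bars = ["#" * f for f in freqs]

for (lo, hi), bar, f in zip(boundaries, bars, freqs):
    label = f"{lo}-{hi}"
    print(f"{label:<12}{bar} ({f})")
```

In a drawn histogram, the class boundaries go on the horizontal axis and the frequencies on the vertical axis, with no gaps between adjacent bars.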
Example 2–4: Using the frequency distribution given in Example 2–3, construct a frequency polygon.
Solution
Find the midpoints of each class. Recall that midpoints are found by adding the upper and lower boundaries and
dividing by 2:
(99.5 + 104.5) / 2 = 102     (104.5 + 109.5) / 2 = 107     and so on.
Class boundaries   Midpoints   Frequency
99.5–104.5         102         2
104.5–109.5        107         8
109.5–114.5        112         18
114.5–119.5        117         13
119.5–124.5        122         7
124.5–129.5        127         1
129.5–134.5        132         1
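The midpoint rule above can be applied to every class at once. This is a minimal sketch for checking the table; the names (boundaries, midpoints, points) are illustrative assumptions.

```python
# Class boundaries and frequencies from Example 2-3.
boundaries = [
    (99.5, 104.5), (104.5, 109.5), (109.5, 114.5), (114.5, 119.5),
    (119.5, 124.5), (124.5, 129.5), (129.5, 134.5),
]
freqs = [2, 8, 18, 13, 7, 1, 1]

# Midpoint = (lower boundary + upper boundary) / 2
midpoints = [(lo + hi) / 2 for lo, hi in boundaries]

# The frequency polygon connects the points (midpoint, frequency).
points = list(zip(midpoints, freqs))

for x, y in points:
    print(f"midpoint {x:g}: frequency {y}")
```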
Example 2–5: Construct an ogive for the frequency distribution described in Example 2–3.
Solution
Find the cumulative frequency for each class.
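                    Cumulative frequency
Less than 99.5      0
Less than 104.5     2
Less than 109.5     10
Less than 114.5     28
Less than 119.5     41
Less than 124.5     48
Less than 129.5     49
Less than 134.5     50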
Solution
0 | 2
1 | 34
2 | 035
3 | 1222236
4 | 3445
5 | 127
The figure shows that:
• The distribution peaks in the center.
• There are no gaps in the data.
• For 7 of the 20 days, the number of patients receiving cardiograms was between 31 and 36.
• The testing center treated from a minimum of 2 patients to a maximum of 57 patients in any one day.
If there are no data values in a class, you should write the stem number and leave the leaf row blank. Do not put a
zero in the leaf row.
Example 2–7: An insurance company researcher conducted a survey on the number of car thefts in a large city for
a period of 30 days last summer. The raw data are shown. Construct a stem and leaf plot by using
classes 50–54, 55–59, 60–64, 65–69, 70–74, and 75–79.
52 62 51 50 69
58 77 66 53 57
75 56 55 67 73
79 59 68 65 72
57 51 63 69 75
65 53 78 66 55
Solution
5 | 011233
5 | 5567789
6 | 23
6 | 55667899
7 | 23
7 | 55789
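The split-stem plot for Example 2–7 can be built programmatically. The sketch below is an added illustration under the assumption that each stem is split into leaves 0–4 and leaves 5–9, matching the classes given in the example; the names (thefts, rows, lines) are hypothetical.

```python
# Daily car thefts over 30 days, transcribed from Example 2-7.
thefts = [
    52, 62, 51, 50, 69,
    58, 77, 66, 53, 57,
    75, 56, 55, 67, 73,
    79, 59, 68, 65, 72,
    57, 51, 63, 69, 75,
    65, 53, 78, 66, 55,
]

# One row per class: (stem, 0) holds leaves 0-4, (stem, 1) holds leaves 5-9.
# Rows are created up front so that an empty class still gets a (blank) row.
rows = {(stem, half): [] for stem in range(5, 8) for half in (0, 1)}

for value in sorted(thefts):
    stem, leaf = divmod(value, 10)  # tens digit is the stem, units digit the leaf
    rows[(stem, leaf // 5)].append(leaf)

lines = [
    f"{stem} | {''.join(str(leaf) for leaf in leaves)}"
    for (stem, _half), leaves in rows.items()
]
print("\n".join(lines))
```

Sorting the data first keeps the leaves in each row in ascending order, and an empty class would print as a stem with a blank leaf row rather than a zero.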