CH - IV
Frequency Distribution
The distribution of a random variable is much easier to study when the data are presented
in a suitable form. Considerable information can be obtained by grouping the data into
classes and determining the number of observations in each of the classes. Such an
arrangement is called a frequency distribution.
A frequency distribution is one of the ways in which we can organize a large amount of
data. Data that are presented in the form of a frequency distribution are called grouped data.
The number of observations falling in a particular class is called the class frequency and is
denoted 𝑓.
The basic way to build a frequency distribution is to divide the range of data values into
classes (each represented by its class center) and count the number of data values within
each class.
Example 1.
The following data represent the grades of 32 students in physics in the ministerial
exam for the preparatory stage
22, 47, 88, 71, 34, 54, 62, 41, 36, 87, 76, 69, 48, 29, 33, 66, 42, 52, 58, 99, 53, 57, 59, 74,
39, 45, 42, 58, 63, 84, 55, 58.
The following table represents the frequency distribution of these grades.
Classes    Frequency
20-30      2
30-40      4
40-50      6
50-60      9
60-70      4
70-80      3
80-90      3
90-100     1
Total      32
From the above example, it is shown that the frequency distribution is a table consisting
of classes, namely the values of observations or measurements, and the frequencies
corresponding to these classes.
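As a minimal sketch of how the table above can be produced, the following Python snippet counts the grades of Example 1 into classes of width 10. It assumes that each class contains its lower limit but not its upper limit (so a grade of 30 would be placed in 30-40); no grade in this example falls on a class limit, so the convention does not affect the result. The names `grades`, `lower_limits` and `frequencies` are illustrative.

```python
# Grades of the 32 students from Example 1.
grades = [22, 47, 88, 71, 34, 54, 62, 41, 36, 87, 76, 69, 48, 29, 33, 66,
          42, 52, 58, 99, 53, 57, 59, 74, 39, 45, 42, 58, 63, 84, 55, 58]

width = 10
lower_limits = list(range(20, 100, width))  # 20, 30, ..., 90

# Assumption: a class holds the values v with lower <= v < lower + width.
frequencies = [sum(1 for g in grades if lo <= g < lo + width)
               for lo in lower_limits]

for lo, f in zip(lower_limits, frequencies):
    print(f"{lo}-{lo + width}: {f}")
print("Total:", sum(frequencies))
```

The printed total equals the number of grades, which is the check that every value has been placed in exactly one class.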
When building a frequency distribution, we have to keep the following points in mind:
1. Classes must not overlap, so that no value can belong to two classes.
2. The classes should be of equal length.
3. The classes must be sufficient to hold the data. This means that any value in the data
can be put in one class. This allows us to enter all the data in the frequency distribution
classes, and the sum of the frequencies is then equal to the number of data values, i.e. if
the number of data values is n we have $\sum_{i=1}^{k} f_i = n$ (where k represents the
number of classes).
Example.2
To illustrate the construction of a frequency distribution, consider the following data,
which represent the lives of 40 similar car batteries recorded to the nearest tenth of a year.
Let us choose 7 class intervals. To determine the approximate class width, we divide the
range by the number of intervals. The range is 4.7 - 1.6 = 3.1, so the class width can be
no less than $3.1 / 7 \approx 0.443$; we choose 0.5.
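The choice of class width can be sketched in Python as follows. Only the smallest and largest battery lives (1.6 and 4.7) quoted in the text are used. Rounding the minimum width up to the next multiple of 0.1, the recording precision of the data, is an assumption made here for illustration; it happens to reproduce the value 0.5 chosen above.

```python
import math

smallest, largest = 1.6, 4.7   # extreme battery lives quoted in Example 2
k = 7                          # chosen number of class intervals

data_range = largest - smallest
min_width = data_range / k     # the class width can be no less than this

# Assumption: round up to the next multiple of the recording precision (0.1).
step = 0.1
rounded_up = math.ceil(min_width / step) * step

print(f"range = {data_range:.1f}, minimum width = {min_width:.3f}")
print(f"width rounded up = {rounded_up:.1f}")
```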
The frequency distribution for the above data is given in the following table:
Class interval   Class boundaries   Class mark   Frequency
1.5 – 1.9        1.45 – 1.95        1.7          2
2.0 – 2.4        1.95 – 2.45        2.2          1
2.5 – 2.9        2.45 – 2.95        2.7          4
3.0 – 3.4        2.95 – 3.45        3.2          15
3.5 – 3.9        3.45 – 3.95        3.7          10
4.0 – 4.4        3.95 – 4.45        4.2          5
4.5 – 4.9        4.45 – 4.95        4.7          3
Total                                            40
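The class boundaries and class marks in the table can be derived from the class limits. The sketch below assumes the usual convention that, for data recorded to the nearest 0.1, each boundary lies half a unit (0.05) beyond the corresponding class limit, and that the class mark is the midpoint of the class limits. The variable names are illustrative.

```python
# Class limits from the table of Example 2.
limits = [(1.5, 1.9), (2.0, 2.4), (2.5, 2.9), (3.0, 3.4),
          (3.5, 3.9), (4.0, 4.4), (4.5, 4.9)]

# Data are recorded to the nearest 0.1, so each boundary is assumed to lie
# half a unit (0.05) beyond the corresponding class limit.
half_unit = 0.05

boundaries = [(round(lo - half_unit, 2), round(hi + half_unit, 2))
              for lo, hi in limits]
class_marks = [round((lo + hi) / 2, 2) for lo, hi in limits]

for (lo, hi), (bl, bu), mark in zip(limits, boundaries, class_marks):
    print(f"{lo} - {hi}   {bl} - {bu}   {mark}")
```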
Graphic Representation
The information provided by a frequency distribution in tabular form is easier to grasp
if presented graphically.
The most widely used forms of graphic presentation of numerical data are bar charts,
histograms and polygons.
In this chapter, we will learn:
how to define a histogram
how to make and interpret histograms
the differences between histograms and bar graphs
The following diagram shows the differences between a histogram and a bar chart.
Figure 1: Histogram and Bar Chart
Histograms are used to show distributions of variables whereas bar charts are used to
compare variables. Histograms plot quantitative data with ranges of the data grouped into
intervals while bar charts plot categorical data.
Note that there are no spaces between the bars of a histogram since there are no gaps
between the intervals. On the other hand, there are spaces between the bars of a bar
chart.
Bar Charts
A bar chart represents the data as horizontal or vertical bars. The length of each bar is
proportional to the amount that it represents.
The Bar Chart of the table for the frequency distribution in example.2 is shown in the
following figure.
[Figure: Bar Chart of Example 2. Vertical axis: frequency (0 to 16); horizontal axis: class
intervals 1.5 – 1.9 to 4.5 – 4.9.]
Histograms
How to define a histogram, interpret a histogram and create a histogram from data?
A histogram is a bar graph that represents a frequency distribution. The width of each bar
represents the class interval and its height represents the corresponding frequency. There
are no spaces between the bars.
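As a rough illustration only, the frequencies of Example 2 can be rendered as a text histogram in Python. The bars are drawn horizontally here, so the length of each bar plays the role of the height in a drawn histogram; the choice of the `#` character and the names `classes` and `bars` are arbitrary.

```python
# Class intervals and frequencies from the table of Example 2.
classes = ["1.5-1.9", "2.0-2.4", "2.5-2.9", "3.0-3.4",
           "3.5-3.9", "4.0-4.4", "4.5-4.9"]
frequencies = [2, 1, 4, 15, 10, 5, 3]

# One '#' per observation: the bar length is proportional to the frequency.
bars = ["#" * f for f in frequencies]

for label, bar, f in zip(classes, bars, frequencies):
    print(f"{label} | {bar} {f}")
```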
Polygons
The frequency polygon is obtained by plotting the class mark (mid class) of each class
against the frequency of that class and then connecting these points by straight lines. The
two end points of the polygon are the class mark of the class preceding the first class, on
the left, and the class mark of the class following the last class, on the right, both with
frequency zero. The polygon is closed by joining it to these two points, as in the following
figure:
Figure 3: Polygon graph of example.1
The frequency polygon can also be obtained from the histogram by marking the midpoints
of the upper sides of the rectangles in the histogram and then connecting these points
with each other, as in the following figure for Example 1:
The following graph represents the frequency polygon of Example 2 (Battery Lives).
[Figure: Frequency polygon of Example 2 (Battery Lives). Vertical axis: frequency;
horizontal axis: class intervals 1.5 – 1.9 to 4.5 – 4.9.]
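The vertices of the frequency polygon for Example 2 can be listed with a short Python sketch. It assumes, as is customary, that the two extra end points have frequency zero and lie one class width before the first class mark and one class width after the last; the names `vertices`, `left` and `right` are illustrative.

```python
# Class marks and frequencies from the table of Example 2.
class_marks = [1.7, 2.2, 2.7, 3.2, 3.7, 4.2, 4.7]
frequencies = [2, 1, 4, 15, 10, 5, 3]
width = 0.5  # distance between consecutive class marks

# Assumed end points with zero frequency: one class mark before the first
# class and one class mark after the last class.
left = (round(class_marks[0] - width, 2), 0)
right = (round(class_marks[-1] + width, 2), 0)

# Connecting these points in order by straight lines gives the polygon.
vertices = [left] + list(zip(class_marks, frequencies)) + [right]

for x, f in vertices:
    print(x, f)
```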
Relative Frequencies:
The relative frequency of each class is the ratio of the frequency of that class to the total
frequency. If the total frequency is $n$ and the frequency of class $i$ is $f_i$, then its
relative frequency is $p_i = \frac{f_i}{n}$. If we multiply the relative frequency $p_i$ by
100 ($p_i \times 100$), then we get the percentage frequency, as in the following example:
Example.3
The following table shows the relative frequency and the percentage frequency of the
frequency distribution table for example (1):
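The relative and percentage frequencies for the distribution of Example 1 can be recomputed with the following Python sketch, which applies $p_i = f_i / n$ directly to the class frequencies given in that example. The variable names are illustrative.

```python
# Classes and frequencies from the table of Example 1.
classes = ["20-30", "30-40", "40-50", "50-60",
           "60-70", "70-80", "80-90", "90-100"]
frequencies = [2, 4, 6, 9, 4, 3, 3, 1]

n = sum(frequencies)                       # total frequency
relative = [f / n for f in frequencies]    # p_i = f_i / n
percentage = [100 * p for p in relative]   # p_i x 100

for c, f, p, pct in zip(classes, frequencies, relative, percentage):
    print(f"{c:7} {f:3} {p:.5f} {pct:7.3f}%")
```

The relative frequencies sum to 1 and the percentage frequencies to 100, which is a quick check on the arithmetic.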
Cumulative frequency distribution:
Often our interest is in the number of observations that are equal to or smaller than a
given value.
The sum of the frequencies of all values that are equal to or smaller than a value is the
cumulated frequency of that value.
Example.4
Below is the cumulated frequency distribution table for Example.1:
Classes    Freq. 𝒇𝒊    Cumulated freq.
20-30      2           2
30-40      4           6
40-50      6           12
50-60      9           21
60-70      4           25
70-80      3           28
80-90      3           31
90-100     1           32
Total      32
The number of grades that fall in the class 50-60 or less is 21.
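The cumulated frequencies are running sums of the class frequencies. A minimal Python sketch using the standard-library function `itertools.accumulate` on the frequencies of Example 1:

```python
from itertools import accumulate

# Class frequencies from the table of Example 1.
frequencies = [2, 4, 6, 9, 4, 3, 3, 1]

# Running sums: each entry is the number of grades in that class or below.
cumulative = list(accumulate(frequencies))

print(cumulative)
```

The last cumulated frequency always equals the total number of observations.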
The computation of the mean for the data of (Battery Lives) is illustrated by the
following table
Hence, the mean is $\mu = \frac{136.5}{40} = 3.4125$ years.
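The sum 136.5 is the total of the products of each class mark and its frequency. A short Python sketch, using the class marks and frequencies from the table of Example 2, reproduces the computation; the variable names are illustrative.

```python
# Class marks and frequencies from the table of Example 2.
class_marks = [1.7, 2.2, 2.7, 3.2, 3.7, 4.2, 4.7]
frequencies = [2, 1, 4, 15, 10, 5, 3]

n = sum(frequencies)                                          # 40 batteries
total = sum(f * x for f, x in zip(frequencies, class_marks))  # sum of f_i * x_i
mean = total / n

print(f"sum of f*x = {total:.1f}, mean = {mean:.4f} years")
```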
Definition.2 For grouped data (frequency distribution data), to find the median we first
specify the median class. The median class is defined as the first class whose cumulated
frequency is greater than or equal to N / 2. After determining the median class, the
median is calculated from the following formula:
$$M_e = X_L + \frac{\frac{N}{2} - f_c}{f} \cdot c$$
where:
$l$ is the number of classes.
$\sum_{i=1}^{l} f_i = N$ is the total number of frequencies.
$X_L$ is the lower bound of the median class.
$f_c$ is the cumulated frequency before the median class.
$f$ is the frequency of the median class.
$c$ is the length of the class interval.
Notice that the calculated median does not depend on all the values and is not affected
by the extreme values.
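The definition can be sketched as a small Python function. The example call uses the battery data of Example 2 and assumes that the lower class boundaries (1.45, 1.95, ...) and the class width 0.5 are the values meant by $X_L$ and $c$; using the class limit 3.0 and length 0.4 instead would give a slightly different result. The function name `grouped_median` and its parameters are hypothetical.

```python
from itertools import accumulate

def grouped_median(lower_bounds, frequencies, c):
    """Median of grouped data; lower_bounds[i] is the lower bound of class i."""
    N = sum(frequencies)
    cumulative = list(accumulate(frequencies))
    # Median class: the first class whose cumulated frequency reaches N / 2.
    m = next(i for i, cf in enumerate(cumulative) if cf >= N / 2)
    f_c = cumulative[m - 1] if m > 0 else 0   # cumulated frequency before it
    f = frequencies[m]                        # frequency of the median class
    return lower_bounds[m] + (N / 2 - f_c) / f * c

# Example 2, assuming lower class boundaries and class width 0.5.
lower_bounds = [1.45, 1.95, 2.45, 2.95, 3.45, 3.95, 4.45]
frequencies = [2, 1, 4, 15, 10, 5, 3]
median = grouped_median(lower_bounds, frequencies, 0.5)

print(f"median = {median:.4f} years")
```

Under these assumptions the median class is 3.0 – 3.4, whose cumulated frequency 22 is the first to reach N / 2 = 20.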
Class interval   Class mark   Frequency   Cumulated freq.
3.0 – 3.4        3.2          15          22                (median class and mode class)
3.5 – 3.9        3.7          10          32
4.0 – 4.4        4.2          5           37
4.5 – 4.9        4.7          3           40
Total                         40
Definition.3 The mode $M_o$ for ungrouped data is defined as the value with the highest
frequency. When the data are given as a frequency distribution, the mode class must first
be determined. The mode class is defined as the class with the highest frequency. After
finding the mode class we find the mode from the following formula:
$$M_o = X_L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \cdot c$$
where:
$X_L$ is the lower limit of the mode class.
$\Delta_1$ is the difference between the frequency of the mode class and the frequency of
the previous class.
$\Delta_2$ is the difference between the frequency of the mode class and the frequency of
the subsequent class.
$c$ is the length of the mode class.
For Example 2: $X_L = 3$, $\Delta_1 = 15 - 4 = 11$, $\Delta_2 = 15 - 10 = 5$, $c = 0.4$.
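As a sketch, the mode formula can be evaluated in Python with the values quoted for Example 2 ($X_L = 3$, mode class frequency 15, neighbouring frequencies 4 and 10, $c = 0.4$). The function name `grouped_mode` and its parameter names are hypothetical.

```python
def grouped_mode(x_l, f_mode, f_prev, f_next, c):
    """Mode of grouped data from the mode class and its two neighbours."""
    delta1 = f_mode - f_prev   # difference with the previous class
    delta2 = f_mode - f_next   # difference with the subsequent class
    return x_l + delta1 / (delta1 + delta2) * c

# Values quoted in the text for Example 2.
mode = grouped_mode(x_l=3.0, f_mode=15, f_prev=4, f_next=10, c=0.4)

print(f"mode = {mode:.3f} years")
```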
Measurements of dispersion
Range
The range is defined as the difference between the largest value and the smallest value
of the data. If the range is small, the data are confined to a narrow interval; if the
range is large, the data are spread over a wide interval.
For a frequency distribution, the range is defined as the difference between the upper
limit of the highest class and the lower limit of the lowest class.
Variance
In the case of a frequency distribution with class marks (mid classes) $X_1, X_2, \dots, X_l$
and their corresponding frequencies $f_1, f_2, \dots, f_l$, the variance is:
$$S^2 = \frac{1}{n-1} \sum_{i=1}^{l} f_i (X_i - \bar{X})^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{l} f_i X_i^2 - n \bar{X}^2 \right]$$
where $\sum_{i=1}^{l} f_i = n$.
Standard Deviation
The standard deviation is defined as the positive square root of the variance, i.e. $S = \sqrt{S^2}$.
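The two forms of the variance formula can be checked numerically. The sketch below applies both to the battery data of Example 2 (class marks and frequencies from its table) and confirms that they agree; the numerical value of the variance is the output of this sketch, not a figure taken from the text.

```python
import math

# Class marks and frequencies from the table of Example 2.
class_marks = [1.7, 2.2, 2.7, 3.2, 3.7, 4.2, 4.7]
frequencies = [2, 1, 4, 15, 10, 5, 3]

n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n

# Definition form: sum of f_i * (X_i - mean)^2, divided by n - 1.
var_def = sum(f * (x - mean) ** 2
              for f, x in zip(frequencies, class_marks)) / (n - 1)

# Computational form: (sum of f_i * X_i^2 - n * mean^2), divided by n - 1.
var_short = (sum(f * x ** 2 for f, x in zip(frequencies, class_marks))
             - n * mean ** 2) / (n - 1)

sd = math.sqrt(var_def)
print(f"variance = {var_def:.4f}, standard deviation = {sd:.4f}")
```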
Example.5
The table below shows the weights (kg) of the members of a sports club. Calculate the
mean, median, mode, mean deviation and standard deviation of the distribution.
Solution:
Total 50 3395
1. Mean: $\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{3395}{50} = 67.9$
2. To find the median we first specify the median class, which is the first class whose
cumulative frequency is greater than or equal to 50 / 2 = 25.
$$M_e = X_L + \frac{\frac{N}{2} - f_c}{f} \cdot c$$
where $c = 10$, $X_L = \frac{59 + 60}{2} = 59.5$, $f_c = 14$, $f = 12$.
$$M_e = 59.5 + \frac{25 - 14}{12} \cdot 10 = 68.67$$
3. We find the mode class, which contains the highest frequency, then find the mode
from the following formula:
$$M_o = X_L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \cdot c$$
The lower limit of the mode class is $X_L = \frac{69 + 70}{2} = 69.5$, and
$\Delta_1 = 14 - 12 = 2$, $\Delta_2 = 14 - 7 = 7$. Then, with $c = 10$:
$$M_o = 69.5 + \frac{2}{2 + 7} \cdot 10 = 71.72$$
4. Mean deviation $= \frac{\sum_{i=1}^{l} f_i |x_i - \bar{X}|}{\sum_{i=1}^{l} f_i} = \frac{140.4 + 107.2 + 40.8 + 92.4 + 116.2 + 79.8}{50} = 11.536$
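5. Variance: $S^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{l} f_i X_i^2 - n \bar{X}^2 \right]$
$$S^2 = \frac{1}{49} \left\{ \left[ 6(44.5)^2 + 8(54.5)^2 + 12(64.5)^2 + 14(74.5)^2 + 7(84.5)^2 + 3(94.5)^2 \right] - 50(67.9)^2 \right\} = \frac{9522}{49}$$
Then the standard deviation is $S = \sqrt{S^2} = 13.94$.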
∎
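The whole solution can be cross-checked with a short Python sketch. It takes the class marks and frequencies as they appear in the variance computation above; the lower class boundaries are reconstructed from the class marks and the class length $c = 10$, which is an assumption of this sketch, as are all variable names. The mode is evaluated with the factor $c$ exactly as it appears in the mode formula.

```python
import math
from itertools import accumulate

# Class marks and frequencies as used in the variance computation.
class_marks = [44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
frequencies = [6, 8, 12, 14, 7, 3]
c = 10
# Assumption: lower boundaries lie half a class length below the class marks.
lower_bounds = [x - c / 2 for x in class_marks]   # 39.5, 49.5, ...

n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n

# Median: first class whose cumulative frequency reaches n / 2.
cumulative = list(accumulate(frequencies))
m = next(i for i, cf in enumerate(cumulative) if cf >= n / 2)
f_c = cumulative[m - 1] if m > 0 else 0
median = lower_bounds[m] + (n / 2 - f_c) / frequencies[m] * c

# Mode: class with the highest frequency (assumed not first or last here).
k = frequencies.index(max(frequencies))
d1 = frequencies[k] - frequencies[k - 1]
d2 = frequencies[k] - frequencies[k + 1]
mode = lower_bounds[k] + d1 / (d1 + d2) * c

mean_dev = sum(f * abs(x - mean) for f, x in zip(frequencies, class_marks)) / n
variance = (sum(f * x ** 2 for f, x in zip(frequencies, class_marks))
            - n * mean ** 2) / (n - 1)
sd = math.sqrt(variance)

print(f"mean = {mean:.1f}, median = {median:.2f}, mode = {mode:.2f}")
print(f"mean deviation = {mean_dev:.3f}, variance = {variance:.2f}, S = {sd:.2f}")
```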
EXERCISES
1. The following scores represent the final examination grades for an elementary statistics
course:
23 60 79 32 57 74 52 70 82 36
80 77 81 95 41 65 92 85 55 76
52 10 64 75 78 25 80 98 81 67
41 71 83 54 64 72 88 62 74 43
60 78 89 76 84 48 84 90 15 79
34 67 17 82 69 74 63 80 85 61
7 – 13 5
42 – 48 1
49 – 55 2
56 – 62 1
4. The following data represent the spending in dollars on extracurricular activities for a
random sample of college students during the first week of the first semester
6 6 9 22 12 7 18 13 11 12 8 2 10 6
9 4 9 14 13 8 10 12 20 29 9 5 11 3
5 6 5 24 15 4 11 22 13 19 6 4 10 5