Lecture 5-Statistics-New
Frequency Distribution
In the last lecture we studied: Numeric Frequency (𝑓), Relative Frequency (𝑓𝑟), Percent
Frequency (𝑓𝑝), and Cumulative Frequency (𝑓𝐶).
Note:
Solution:
Example 2:
Take a look at the table below that shows the income of rich people.
Solution :
Rewrite the table and fill in all the information:
Income (in millions of dollars) | Frequency | Cumulative frequency (𝒇𝑪) | Cumulative relative frequency (𝒇𝑪𝒓)
(9-18)  | 12 |  |
(19-28) | 9  |  |
(29-38) | 7  |  |
(39-48) | 8  |  |
(49-58) | 7  |  |
(59-68) | 7  |  |
Total   |    |  |
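As a way to check the filled-in table, the two cumulative columns can be computed with a short Python sketch. The variable names (`classes`, `freq`, `cum_freq`, `cum_rel_freq`) are illustrative and not part of the lecture; the cumulative relative frequency is taken here as the cumulative frequency divided by the total.

```python
from itertools import accumulate

# Class intervals and frequencies copied from the income table.
classes = ["9-18", "19-28", "29-38", "39-48", "49-58", "59-68"]
freq = [12, 9, 7, 8, 7, 7]

total = sum(freq)                              # total number of observations
cum_freq = list(accumulate(freq))              # running total of the frequencies
cum_rel_freq = [c / total for c in cum_freq]   # cumulative frequency / total

# Print one row per class, then the total row.
for label, f, fc, fcr in zip(classes, freq, cum_freq, cum_rel_freq):
    print(f"{label:>6}  {f:>3}  {fc:>3}  {fcr:.2f}")
print(f"{'Total':>6}  {total:>3}")
```

The last cumulative frequency should equal the total, and the last cumulative relative frequency should equal 1, which is a quick consistency check on the table.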
Measure of Dispersion
You have learnt various measures of central tendency. Measures of central tendency
help us to represent the entire mass of the data by a single value. Can the central
tendency describe the data fully and adequately? In order to understand it, let us
consider an example. The daily incomes of the workers in two factories are:
Factory A: 35 45 50 65 70 90 100
Factory B: 60 65 65 65 65 65 70
Here we observe that in both factories the mean of the data is the same, namely, 65.
(1) In factory A, the observations are much more scattered about the mean.
(2) In factory B, almost all the observations are concentrated around the mean.
Certainly, the two factories differ even though they have the same mean. Therefore, we
need some additional statistical information to determine how the values are spread
in the data. For this, we shall discuss Measures of Dispersion.
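The comparison of the two factories can be reproduced with a minimal Python sketch. It uses the range (largest value minus smallest value) as a first, rough indicator of spread; the names `factory_a`, `factory_b`, and `value_range` are illustrative choices, not part of the lecture.

```python
from statistics import mean

# Daily incomes of the workers in the two factories (from the example).
factory_a = [35, 45, 50, 65, 70, 90, 100]
factory_b = [60, 65, 65, 65, 65, 65, 70]

def value_range(data):
    """Range of a data set: largest value minus smallest value."""
    return max(data) - min(data)

mean_a, mean_b = mean(factory_a), mean(factory_b)
range_a, range_b = value_range(factory_a), value_range(factory_b)

print("Factory A: mean =", mean_a, " range =", range_a)
print("Factory B: mean =", mean_b, " range =", range_b)
```

Both means come out as 65, while the ranges differ (65 for factory A against 10 for factory B), which is exactly the information the mean alone does not show.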
Dispersion is a measure that gives an idea of how scattered the values are.
Measures of variation (or dispersion) of a data set indicate how the
observations are spread out or scattered throughout the data.
Measure of Dispersion
To explain the meaning of dispersion, let us consider an example. Consider the runs scored
by two players, A and B, in their last ten matches as follows:
Players B: 53, 46, 48, 50, 53, 53, 58, 60, 57, 52
Clearly, the mean of the runs scored by players A and B is the same, i.e. 53. Can we
say that the performance of the two players is the same? Clearly not, because the
scores of player A vary from 0 to 117, whereas the scores of player B vary only
from 46 to 60.
Let us now plot the above scores as dots on a number line. We find the following diagrams:
Clearly, the extent of spread or dispersion of the data for player A differs from that for player B.
A measurement of the scatter of the given data about the average or mean is called a
measure of dispersion or scatter.
Types of Measures of Dispersion
In previous lectures we learned how to find the range and the mean in different ways for grouped
and ungrouped data. Now we will continue by studying the variance and the standard deviation for
ungrouped and grouped data.
Although the sample mean is useful, it does not convey all the information about a sample
of data. The variability or scatter in the data may be described by the sample variance.
If the n observations in a sample are denoted by x1, x2, . . . , xn, then the sample
variance is
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
where $\bar{x}$ is the sample mean.
The sample standard deviation, s, is the positive square root of the sample variance.
The units of measurement for the sample variance are the square of the original units
of the variable. Thus, if x is measured in Pa, the units for the sample variance are
(Pa)^2.
Therefore, the units of standard deviation (SD) should be -------------.
Example 3:
Seven oxide thickness measurements of wafers are studied to assess quality in a
semiconductor manufacturing process. The data (in angstroms) are 1264, 1280, 1301, 1300,
1292, 1307, and 1275. Calculate the sample average, sample variance, and sample standard deviation.
Solution:
i     | $x_i$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$
1     | 1264  |  |
2     | 1280  |  |
3     | 1301  |  |
4     | 1300  |  |
5     | 1292  |  |
6     | 1307  |  |
7     | 1275  |  |
Total |       |  |
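The columns of the table for Example 3 can be checked with a minimal Python sketch that follows the definition directly: subtract the sample mean from each observation, square the deviations, sum them, and divide by n - 1. The variable names are illustrative and not part of the lecture.

```python
import math

# Oxide thickness measurements in angstroms (Example 3).
thickness = [1264, 1280, 1301, 1300, 1292, 1307, 1275]

n = len(thickness)
x_bar = sum(thickness) / n                    # sample average
deviations = [x - x_bar for x in thickness]   # column x_i - x_bar
squared = [d ** 2 for d in deviations]        # column (x_i - x_bar)^2

s2 = sum(squared) / (n - 1)                   # sample variance, divides by n - 1
s = math.sqrt(s2)                             # sample standard deviation

# Print the table rows, then the summary values.
for i, (x, d, d2) in enumerate(zip(thickness, deviations, squared), start=1):
    print(f"{i}  {x}  {d:9.4f}  {d2:9.4f}")
print(f"mean = {x_bar:.2f}, s^2 = {s2:.2f}, s = {s:.2f}")
```

With these data the sketch gives a sample average of about 1288.43 angstroms, a sample variance of about 249.62 angstroms squared, and a sample standard deviation of about 15.80 angstroms. The deviations should sum to zero up to rounding, which is a useful check on hand calculations.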
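The sample variance can also be computed from
$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n - 1}$
Note that the above equation requires squaring each individual $x_i$, then squaring the
sum of the $x_i$, subtracting $\left(\sum_{i=1}^{n} x_i\right)^2 / n$ from $\sum_{i=1}^{n} x_i^2$, and finally dividing by n - 1.
Sometimes this computational formula is called the shortcut method for
calculating $s^2$ (or s).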
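The shortcut method gives the same value as the definition of the sample variance. A short derivation, written here as a sketch using $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, expands the sum of squared deviations:

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n},
\end{align*}
\qquad\text{so}\qquad
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
    = \frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n - 1}.
```

The second line uses $\sum_{i=1}^{n} x_i = n\bar{x}$, and the last line substitutes $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ back in.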
Example 4:
The results of a set of measurements (in cm) are as follows: 20.1, 20.5, 20.3, 20.5, 20.6, 20.1,
20.2, and 20.4. Calculate the sample variance and the sample standard deviation using the shortcut
method.
Solution:
i | $x_i$ | $x_i^2$
1 | 20.1  |
2 | 20.5  |
3 | 20.3  |
4 | 20.5  |
5 | 20.6  |
6 | 20.1  |
7 | 20.2  |
8 | 20.4  |
$\left(\sum_{i=1}^{n} x_i\right)^2 =$
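The shortcut method for Example 4 can be checked with a minimal Python sketch. It needs only two running totals, the sum of the observations and the sum of their squares; the variable names (`sum_x`, `sum_x2`) are illustrative and not part of the lecture.

```python
import math

# Measurements in cm (Example 4).
x = [20.1, 20.5, 20.3, 20.5, 20.6, 20.1, 20.2, 20.4]

n = len(x)
sum_x = sum(x)                      # sum of the x_i
sum_x2 = sum(v ** 2 for v in x)     # sum of the x_i squared

# Shortcut method: subtract (sum of x_i)^2 / n from the sum of squares,
# then divide by n - 1.
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
s = math.sqrt(s2)

print(f"sum x = {sum_x:.1f}, sum x^2 = {sum_x2:.2f}")
print(f"s^2 = {s2:.4f} cm^2, s = {s:.4f} cm")
```

For these data the sum of the observations is 162.7 and the sum of squares is 3309.17, giving a sample variance of about 0.0370 cm^2 and a sample standard deviation of about 0.192 cm. When doing this by hand, keep all the decimal places of the two totals: they are nearly equal, so rounding them early can change the result noticeably.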
Population Variance
Analogous to the sample variance $s^2$, there is a measure of variability in the population
called the population variance. We will use the Greek letter $\sigma^2$ (sigma squared) to denote
the population variance.
The positive square root of $\sigma^2$, or $\sigma$, will denote the population standard deviation (SD).
When the population is finite and consists of N values, we may define the population
variance as:
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
where $\mu$ is the population mean.
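The difference between the population and sample formulas is only the divisor (N instead of n - 1). The Python sketch below illustrates this under an assumption made purely for illustration: the seven Factory B incomes from earlier in the lecture are treated as a complete finite population. The function name `population_variance` is hypothetical; `statistics.pvariance` from the standard library is used only as a cross-check.

```python
import math
from statistics import pvariance

def population_variance(values):
    """Population variance of a finite population: divide by N, not N - 1."""
    N = len(values)
    mu = sum(values) / N                          # population mean
    return sum((x - mu) ** 2 for x in values) / N

# Assumption for illustration: these seven values are the whole population.
population = [60, 65, 65, 65, 65, 65, 70]

sigma2 = population_variance(population)
sigma = math.sqrt(sigma2)                         # population standard deviation

print(f"sigma^2 = {sigma2:.4f}, sigma = {sigma:.4f}")
print("matches statistics.pvariance:",
      math.isclose(sigma2, pvariance(population)))
```

Here the squared deviations sum to 50, so the population variance is 50/7, whereas treating the same values as a sample would give 50/6.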
Example 5 and H.W
An important quality characteristic of water is the concentration of suspended solid material
in mg/l. Twelve measurements on suspended solids from a certain lake are as follows: 42.4,
65.7, 29.8, 58.7, 52.1, 55.8, 57.0, 68.7, 67.3, 67.3, 54.3, and 54.0. Calculate the sample
average and the sample standard deviation using the normal method and the shortcut method.