BS Lect 05
Chapter 2 – Part 2
Measures of Dispersion
Measures of dispersion
Mean deviation
Variance
Standard deviation
Example:
[Figure: data plotted on a number line marked from 0 to 14]
Range = 14 - 1 = 13
Disadvantages of the Range
Ignores the way in which data are distributed
[Figure: two data sets plotted on number lines from 7 to 12, with the same endpoints but different distributions]
Range = 12 - 7 = 5 for both data sets
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
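The effect of the single outlier can be checked with a minimal Python sketch. The function name `value_range` is hypothetical; the two lists are the data sets shown above.

```python
def value_range(data):
    """Range = largest value minus smallest value."""
    return max(data) - min(data)

# The two data sets from the slide; they differ only in the last value.
without_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]

print(value_range(without_outlier))  # 4
print(value_range(with_outlier))     # 119
```

Changing one observation out of 25 moves the range from 4 to 119, which is why the range alone is a fragile measure of spread.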
Interquartile Range
Example:
Minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, Maximum = 70
Interquartile range = Q3 - Q1 = 57 - 30 = 27
Mean deviation
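For ungrouped data:
$$M.D = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$$
For grouped data:
$$M.D = \frac{\sum_{i=1}^{h} |x_i - \bar{x}| \, f_i}{\sum_{i=1}^{h} f_i}$$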
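As a rough illustration, both mean deviation formulas can be sketched in Python. The function names are hypothetical. The data are the Group A test scores that appear later in these slides, written once as raw observations and once as a frequency table, so the two formulas should agree.

```python
def mean_deviation(data):
    """Mean deviation of raw (ungrouped) observations."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n


def mean_deviation_grouped(values, freqs):
    """Mean deviation from distinct values x_i and their frequencies f_i."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    return sum(abs(x - mean) * f for x, f in zip(values, freqs)) / n


group_a = [1, 2, 3, 4, 4, 4, 4, 10]   # raw scores
values = [1, 2, 3, 4, 10]             # the same scores as a frequency table
freqs = [1, 1, 1, 4, 1]

print(mean_deviation(group_a))                # 1.5
print(mean_deviation_grouped(values, freqs))  # 1.5
```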
Variance and standard deviation
(Ungrouped data)
Sample variance
Average (approximately) of squared deviations of
values from the mean
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n - 1)}$$
Sample standard deviation
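$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n - 1)}}$$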
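The definitional form and the computational (shortcut) form of the sample variance should give the same answer. A minimal sketch follows, with hypothetical function names and using the Group B test scores from later in these slides as sample data.

```python
import math


def sample_variance(data):
    """Definitional form: sum of squared deviations divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_variance_shortcut(data):
    """Computational form: (n * sum(x^2) - (sum(x))^2) / (n * (n - 1))."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))


group_b = [4, 4, 4, 4, 4, 4, 6, 2]
s2 = sample_variance(group_b)

print(round(s2, 2))                                 # 1.14
print(round(sample_variance_shortcut(group_b), 2))  # 1.14
print(round(math.sqrt(s2), 2))                      # 1.07 (standard deviation)
```

The shortcut form needs only the running totals of x and x squared, which is convenient for hand calculation.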
Variance and standard deviation
(Grouped data)
Sample variance
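$$s^2 = \frac{\sum_{i=1}^{h} (x_i - \bar{x})^2 f_i}{\left(\sum_{i=1}^{h} f_i\right) - 1} = \frac{n\sum_{i=1}^{h} x_i^2 f_i - \left(\sum_{i=1}^{h} x_i f_i\right)^2}{n(n - 1)}, \qquad n = \sum_{i=1}^{h} f_i$$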
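Sample standard deviation
$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{h} (x_i - \bar{x})^2 f_i}{n - 1}} = \sqrt{\frac{n\sum_{i=1}^{h} x_i^2 f_i - \left(\sum_{i=1}^{h} x_i f_i\right)^2}{n(n - 1)}}$$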
Class   x_i   f_i         x_i f_i     F_i   x_i^2 f_i     |x_i - x̄| f_i     (x_i - x̄)^2 f_i
1       x_1   f_1         x_1 f_1     F_1   x_1^2 f_1     |x_1 - x̄| f_1     (x_1 - x̄)^2 f_1
2       x_2   f_2         x_2 f_2     F_2   x_2^2 f_2     |x_2 - x̄| f_2     (x_2 - x̄)^2 f_2
.       .     .           .           .     .             .                 .
.       .     .           .           .     .             .                 .
h       x_h   f_h         x_h f_h     F_h   x_h^2 f_h     |x_h - x̄| f_h     (x_h - x̄)^2 f_h
Total         Σ f_i = n   Σ x_i f_i         Σ x_i^2 f_i   Σ |x_i - x̄| f_i   Σ (x_i - x̄)^2 f_i
(all sums run over i = 1, ..., h)
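The worksheet columns can be built mechanically from the values and frequencies. The sketch below assumes that F_i denotes the cumulative frequency, which the slides do not state. The function name and dictionary keys are hypothetical, and the data are the Group A test scores from later in these slides written as a frequency table.

```python
import math


def grouped_worksheet(values, freqs):
    """Return the worksheet rows and column totals for grouped data."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    rows = []
    cumulative = 0
    for x, f in zip(values, freqs):
        cumulative += f  # assumption: F_i is the cumulative frequency
        rows.append({
            "x": x,
            "f": f,
            "xf": x * f,
            "F": cumulative,
            "x2f": x * x * f,
            "abs_dev_f": abs(x - mean) * f,
            "sq_dev_f": (x - mean) ** 2 * f,
        })
    totals = {key: sum(row[key] for row in rows)
              for key in ("f", "xf", "x2f", "abs_dev_f", "sq_dev_f")}
    return rows, totals


values = [1, 2, 3, 4, 10]
freqs = [1, 1, 1, 4, 1]
rows, totals = grouped_worksheet(values, freqs)

n = totals["f"]
mean_dev = totals["abs_dev_f"] / n
variance = totals["sq_dev_f"] / (n - 1)
shortcut = (n * totals["x2f"] - totals["xf"] ** 2) / (n * (n - 1))
std_dev = math.sqrt(variance)

print(mean_dev)            # 1.5
print(round(variance, 2))  # 7.14
print(round(shortcut, 2))  # 7.14
print(round(std_dev, 2))   # 2.67
```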
Example 2.5.1
Compute mean deviation, variance and standard deviation
for the following random sample of 8 observations:
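Mean deviation:
$$M.D = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n} = \frac{18.5}{8} \approx 2.3$$
Variance:
$$s^2 = \frac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n - 1)} = \frac{8(832) - (78)^2}{(8)(7)} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{71.5}{7} \approx 10.21$$
Standard deviation:
$$s = \sqrt{10.21} \approx 3.20$$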
Example 2.5.2
Compute the mean deviation, variance and standard deviation from the following frequency
distribution:
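Mean deviation:
$$M.D = \frac{\sum_{i=1}^{h} |x_i - \bar{x}| \, f_i}{\sum_{i=1}^{h} f_i} = \frac{178.36}{55} = 3.24$$
Variance:
$$s^2 = \frac{\sum_{i=1}^{h} (x_i - \bar{x})^2 f_i}{\left(\sum_{i=1}^{h} f_i\right) - 1} = \frac{861.71}{55 - 1} = 15.96$$
or
$$s^2 = \frac{n\sum_{i=1}^{h} x_i^2 f_i - \left(\sum_{i=1}^{h} x_i f_i\right)^2}{n(n - 1)} = \frac{(55)(19741) - 1019^2}{55(55 - 1)} = 15.96$$
Standard deviation:
$$s = \sqrt{15.96} \approx 4.0$$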
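The arithmetic in both worked examples can be re-checked from the quoted sums alone, since the raw data are not reproduced here. This is a small sketch with a hypothetical helper, `variance_from_sums`; for grouped data the sums passed in are the frequency-weighted ones.

```python
import math


def variance_from_sums(n, sum_x, sum_x2):
    """Shortcut sample variance: (n * sum(x^2) - (sum(x))^2) / (n * (n - 1))."""
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))


# Example 2.5.1: n = 8, sum of x = 78, sum of x^2 = 832
var_1 = variance_from_sums(8, 78, 832)

# Frequency-distribution example: n = 55, sum of x*f = 1019, sum of x^2*f = 19741
var_2 = variance_from_sums(55, 1019, 19741)

print(round(var_1, 2), round(math.sqrt(var_1), 2))  # 10.21 3.2
print(round(var_2, 2), round(math.sqrt(var_2), 2))  # 15.96 3.99
print(round(861.71 / (55 - 1), 2))                  # 15.96 (definitional form)
```

Both routes to the grouped variance agree only when the divisor is n - 1 = 54, which confirms the denominator used above.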
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.567
[Figure: three dot plots on a common axis from 11 to 21, one for each data set]
Advantages of Variance and Standard Deviation
Coefficient of variation:
$$CV = \frac{S}{\bar{X}} \times 100\%$$
Example
Example 2.5.3
Solution
Standard deviation, $s = 4$; mean, $\bar{x} = 18.53$. Therefore
$$c.v = \frac{4}{18.53} \times 100 = 21.6\%$$
Another example: Comparing Coefficient of Variation
Stock A:
Average price last year = K50
Standard deviation = K5
$$CV_A = \frac{S}{\bar{X}} \times 100\% = \frac{K5}{K50} \times 100\% = 10\%$$
Stock B:
Average price last year = K100
Standard deviation = K5
$$CV_B = \frac{S}{\bar{X}} \times 100\% = \frac{K5}{K100} \times 100\% = 5\%$$
Both stocks have the same standard deviation, but stock B is less variable relative to its price.
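The stock comparison can be reproduced with a short Python sketch. The function name `coefficient_of_variation` is hypothetical, and the last line re-checks the figures from Example 2.5.3.

```python
def coefficient_of_variation(std_dev, mean):
    """CV as a percentage: standard deviation relative to the mean."""
    return 100.0 * std_dev / mean


cv_a = coefficient_of_variation(5, 50)    # Stock A: S = K5, mean = K50
cv_b = coefficient_of_variation(5, 100)   # Stock B: S = K5, mean = K100

print(cv_a, cv_b)                                    # 10.0 5.0
print(round(coefficient_of_variation(4, 18.53), 1))  # 21.6 (Example 2.5.3)
```

Because the CV is a ratio, the currency unit cancels, so it can compare variability across series measured on different scales.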
Shape of a Distribution
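$$s.k = \frac{3(\bar{x} - m)}{s}$$
where $\bar{x}$ is the mean, $m$ is the median and $s$ is the standard deviation.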
Example 2.5.4
Refer to the frequency distribution in example 2.5.2 to compute the measure of
skewness to determine the type of the distribution.
Solution
$$s.k = \frac{3(18.53 - 18.92)}{4} = -0.3$$
Since s.k is negative, the distribution is negatively skewed (skewed to the left).
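A minimal sketch of the skewness calculation, assuming m in the formula is the median (18.92 here). The function names are hypothetical.

```python
def pearson_skewness(mean, median, std_dev):
    """Skewness measure s.k = 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / std_dev


def describe_shape(sk):
    """Classify a distribution by the sign of its skewness measure."""
    if sk < 0:
        return "negatively skewed"
    if sk > 0:
        return "positively skewed"
    return "symmetrical"


sk = pearson_skewness(18.53, 18.92, 4)
print(round(sk, 2))        # -0.29
print(describe_shape(sk))  # negatively skewed
```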
Compare the following data sets that represent the test scores of two groups of
students for a special IQ test.
Group A Group B
1 4
2 4
3 4
4 4
4 4
4 4
4 6
10 2
Solution
Summary statistics of the test scores of the students
(deviations are shown for Group A, whose mean is 4)

Student   Group A   Group B   Deviation from mean   Absolute deviation from mean   Squared deviation from mean
                              (x_i - x̄)             |x_i - x̄|                      (x_i - x̄)^2
1         1         4         -3                    3                              9
2         2         4         -2                    2                              4
3         3         4         -1                    1                              1
4         4         4         0                     0                              0
5         4         4         0                     0                              0
6         4         4         0                     0                              0
7         4         6         0                     0                              0
8         10        2         6                     6                              36
Total     32        32        0                     12                             50

Measure              Group A   Group B
Mean                 4         4
Mode                 4         4
Median               4         4
Range                9         4
Variance             7.14      1.14
Mean deviation       1.5       0.5
Standard deviation   2.67      1.07
c.v                  66.8%     26.7%
s.k                  0         0
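The summary figures for both groups can be reproduced with Python's standard `statistics` module. The `summarize` helper and its dictionary keys are hypothetical, and the skewness line assumes the measure 3(mean - median)/s used earlier in these slides.

```python
import statistics


def summarize(scores):
    """Return the summary measures used in the comparison table."""
    mean = statistics.mean(scores)
    median = statistics.median(scores)
    std_dev = statistics.stdev(scores)  # sample standard deviation (n - 1)
    return {
        "mean": mean,
        "mode": statistics.mode(scores),
        "median": median,
        "range": max(scores) - min(scores),
        "variance": statistics.variance(scores),  # sample variance (n - 1)
        "mean_deviation": sum(abs(x - mean) for x in scores) / len(scores),
        "std_dev": std_dev,
        "cv_percent": 100 * std_dev / mean,
        "skewness": 3 * (mean - median) / std_dev,
    }


group_a = [1, 2, 3, 4, 4, 4, 4, 10]
group_b = [4, 4, 4, 4, 4, 4, 6, 2]

for name, scores in (("Group A", group_a), ("Group B", group_b)):
    rounded = {key: round(value, 2) for key, value in summarize(scores).items()}
    print(name, rounded)
```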
Comparison Comments
The measures of central tendency (mean, mode and median) are equal
for the two groups, so the two data sets cannot be distinguished by
their measures of central tendency.
The range for group A is greater than that for group B, which means
the scores for group A are more scattered than those for group B. In
other words, the scores for group B are more uniform, that is, much
closer to each other.
The distributions of the scores of the two groups are
symmetrical, because the mean and the median
coincide.
The coefficient of variation for A is greater than that for B
because there is more variation in A.
Theorems regarding mean and
variance
Theorem 2.5.1
Theorem 2.5.2
Theorem 2.5.3
Variance is not affected by a change of origin, but it is affected by a change of
scale. For example, if a fixed number $a$ is added to or subtracted from every
observation in a data set, the variance of the new observations is the same as the
original variance. But if every observation in a data set is multiplied or divided
by a fixed number $b$, the variance of the new observations is $b^2 \times$ (the
original variance) or (the original variance) $/ b^2$, respectively.
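The theorem can be checked numerically. The sketch below uses the Group A test scores from earlier in these slides and the hypothetical constants a = 3 and b = 3; any other constants would serve equally well.

```python
import statistics

data = [1, 2, 3, 4, 4, 4, 4, 10]
a = 3  # hypothetical shift (change of origin)
b = 3  # hypothetical factor (change of scale)

original = statistics.variance(data)
shifted = statistics.variance([x + a for x in data])
scaled = statistics.variance([b * x for x in data])
divided = statistics.variance([x / b for x in data])

print(round(original, 2))  # 7.14
print(round(shifted, 2))   # 7.14  (unchanged by adding a)
print(round(scaled, 2))    # 64.29 (b^2 times the original)
print(round(divided, 2))   # 0.79  (the original divided by b^2)
```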
Example 2.5.6
a) Suppose each observation in example 1 is increased by a factor of 3, what is the
new mean, and the new variance?
b) Suppose each observation in example 1 is increased by 3, what is the new
mean, and the new variance?
c) Suppose each observation in example 1 is decreased by 4, what is the new
mean?
Solution
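A worked sketch of the solution, under the assumption that "example 1" refers to Example 2.5.1, where $\sum x_i = 78$ and $n = 8$, so that $\bar{x} = 78/8 = 9.75$ and $s^2 = 572/56 \approx 10.21$. It also uses the standard fact that the mean shifts and scales together with the observations.

$$\text{a) } \bar{x}_{new} = 3(9.75) = 29.25, \qquad s^2_{new} = 3^2 \times \frac{572}{56} \approx 91.93$$
$$\text{b) } \bar{x}_{new} = 9.75 + 3 = 12.75, \qquad s^2_{new} = s^2 \approx 10.21$$
$$\text{c) } \bar{x}_{new} = 9.75 - 4 = 5.75$$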
Excel output
Microsoft Excel descriptive statistics output, using the house price data:
House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000
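The Excel output itself is not reproduced here. As a rough stand-in, the main descriptive statistics for the same five house prices can be computed with Python's standard `statistics` module. This is a sketch of the kind of figures such a summary reports, not a copy of the Excel output, and the labels are hypothetical.

```python
import statistics

house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

summary = {
    "mean": statistics.mean(house_prices),
    "median": statistics.median(house_prices),
    "mode": statistics.mode(house_prices),
    "standard deviation": statistics.stdev(house_prices),  # sample (n - 1)
    "sample variance": statistics.variance(house_prices),
    "range": max(house_prices) - min(house_prices),
    "minimum": min(house_prices),
    "maximum": max(house_prices),
    "sum": sum(house_prices),
    "count": len(house_prices),
}

for label, value in summary.items():
    print(f"{label:>20}: {value:,.0f}")
```

The mean (600,000) sits well above the median (300,000) because the single $2,000,000 house pulls it upward.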
Chapter Summary