03 - Measures - of - Center - Variation
Central Tendency
Measures of Central Tendency
Population Mean
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$
Sample Mean
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
Properties of Arithmetic mean
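1. The mean of a constant is equal to that constant.
2. The sum of the deviations of the observations from their mean is
equal to zero, i.e.,
$$\sum_{i=1}^{n} (X_i - \bar{X}) = 0$$
For example, for a data set with $\sum X = 548$ and $n = 8$,
$$\bar{X} = \frac{\sum X}{n} = \frac{548}{8} = 68.5$$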
4. If $X_1, X_2, \ldots, X_n$ have mean $\bar{X}$, then the mean after multiplying each
observation by a constant 'a' is the original mean multiplied by that constant.
$$\bar{X}^{*} = \frac{\sum_{i=1}^{n} a X_i}{n} = a \times \bar{X}$$
5. If a constant 'a' is added to each of the observations $X_1, X_2, \ldots, X_n$ having
mean $\bar{X}$, then the mean increases by that constant.
$$\bar{X}^{*} = \frac{\sum_{i=1}^{n} (a + X_i)}{n} = \bar{X} + a$$
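The properties above can be checked numerically. The following is a minimal sketch, assuming the eight wheat yields used elsewhere in these notes as the data and an arbitrary constant `a = 3` chosen only for illustration:

```python
from statistics import mean

# Wheat yields (kg) from the notes; any numeric data set would work here.
x = [65, 71, 67, 75, 63, 69, 75, 63]
a = 3  # arbitrary constant, an assumption for this illustration

x_bar = mean(x)

# Property 1: the mean of a constant is that constant.
constant_mean = mean([a] * len(x))

# Property 2: deviations from the mean sum to zero.
dev_sum = sum(xi - x_bar for xi in x)

# Property 4: multiplying every observation by a multiplies the mean by a.
scaled_mean = mean([a * xi for xi in x])

# Property 5: adding a to every observation adds a to the mean.
shifted_mean = mean([a + xi for xi in x])

print(x_bar, constant_mean, dev_sum, scaled_mean, shifted_mean)
```

With real-valued data the deviation sum may come out as a tiny non-zero number because of floating-point rounding, so it is usually compared against zero with a tolerance.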
Combined Mean
Geometric Mean (G.M.)
$$G.M. = \text{Antilog}\left(\frac{\sum_{i=1}^{n} \log X_i}{n}\right)$$
• The Harmonic Mean (H.M.) of a set of n values $X_1, X_2, \ldots, X_n$ is defined as the reciprocal of
the arithmetic mean of the reciprocals of the values.
$$H.M. = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}}$$
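Example
Find the Geometric Mean and the Harmonic Mean of the following data.
  X     log X      1/X
  3     0.477     0.333
  5     0.699     0.200
  6     0.778     0.167
  6     0.778     0.167
  7     0.845     0.143
 10     1.000     0.100
 12     1.079     0.083
 49    5.6567    1.1929   (totals)
$$G.M. = \text{Antilog}\left(\frac{5.6567}{7}\right) = \text{Antilog}(0.8081) = 6.43$$
$$H.M. = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}} = \frac{7}{1.1929} = 5.87$$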
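The same calculation can be sketched in code. This assumes the seven values 3, 5, 6, 6, 7, 10, 12; the value 3 is inferred from the column totals (49, 5.6567, 1.1929) rather than listed explicitly in the extracted table, so treat it as an assumption.

```python
import math

# Data assumed from the example; the first value (3) is inferred from the totals.
x = [3, 5, 6, 6, 7, 10, 12]
n = len(x)

# Geometric mean: antilog of the mean of the base-10 logarithms.
log_sum = sum(math.log10(v) for v in x)
gm = 10 ** (log_sum / n)

# Harmonic mean: n divided by the sum of the reciprocals.
recip_sum = sum(1 / v for v in x)
hm = n / recip_sum

print(round(log_sum, 4), round(recip_sum, 4))
print(round(gm, 2), round(hm, 2))
```

Python's standard library also offers `statistics.geometric_mean` and `statistics.harmonic_mean`, which should agree with these hand-rolled versions for positive data.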
Mode
[Figure: two dot plots, one on an axis from 0 to 14 and one on an axis from 0 to 6; one panel is labelled "Mode = 9" and the other "No Mode".]
Median
$$\text{Median} = \text{Size of } \left(\frac{n+1}{2}\right)\text{th observation}$$
Quartiles
▪ Divide an array into four equal parts, each part having
25% of the distribution of the data values, denoted by $Q_j$.
▪ 25% of the observations are below the 1st quartile.
▪ The 1st quartile is the 25th percentile; the 2nd quartile is the
50th percentile, which is also the median; and the 3rd quartile is
the 75th percentile.
$$Q_j = \text{Size of } j\left(\frac{n+1}{4}\right)\text{th observation}$$
where j = 1, 2, 3
Deciles
▪ Divide an array into ten equal parts, each part having ten
percent of the distribution of the data values, denoted by $D_j$.
▪ 10 percent of the total observations fall below $D_1$ and the
remaining 90% are above it.
▪ The 5th decile is equal to $Q_2$ and the median.
$$D_j = \text{Size of } j\left(\frac{n+1}{10}\right)\text{th observation}$$
where j = 1, 2, 3, ..., 9
Percentiles
▪ Divide an array (raw data arranged in increasing or
decreasing order of magnitude) into 100 equal parts.
▪ The jth percentile, denoted by $P_j$, is the data value in the data
set that separates the bottom j% of the data from the top
(100-j)%.
$$P_j = \text{Size of } j\left(\frac{n+1}{100}\right)\text{th observation}$$
where j = 1, 2, 3, ..., 99
Example
▪ Suppose Ali was told that, relative to the other scores on an NTS
test, his score was at the 95th percentile, i.e., his percentile score
is 95. How do we interpret this?
➔ This means that 95% of those who took the test had scores
less than or equal to Ali's score, while 5% had scores higher than
Ali's.
Exercise
Sr. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
18 20 28 29 30 36 37 39 42 53 54 55 58 61 68 70 74 82 93 94
Median & Quartiles
• $\text{Median} = \text{Size of } \left(\frac{n+1}{2}\right)\text{th observation}$
= Size of 10.5th Observation
= 10th Observation + 0.5 (11th Observation – 10th Observation)
= 53 + 0.5 (54 – 53)
= 53.5
• $Q_3 = \text{Size of } 3\left(\frac{n+1}{4}\right)\text{th observation}$
= Size of 15.75th observation
= 15th Observation + 0.75 (16th Observation – 15th Observation)
= 68 + 0.75 (70 – 68)
=69.5
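The position rule used for the median, quartiles, deciles and percentiles can be sketched as one helper. The function name `quantile_by_position` and the clamping of positions that fall outside 1..n are assumptions made for this illustration, not part of the notes.

```python
def quantile_by_position(values, j, parts):
    """Return the value at position j(n+1)/parts, interpolating linearly.

    parts = 2 gives the median (j = 1), 4 quartiles, 10 deciles,
    100 percentiles.
    """
    data = sorted(values)
    n = len(data)
    position = j * (n + 1) / parts   # 1-based, possibly fractional
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part of the position
    # Assumption: positions outside 1..n are clamped to the extremes.
    if lower < 1:
        return data[0]
    if lower >= n:
        return data[-1]
    below, above = data[lower - 1], data[lower]
    return below + fraction * (above - below)


# The 20 exercise values from the notes.
exercise = [18, 20, 28, 29, 30, 36, 37, 39, 42, 53,
            54, 55, 58, 61, 68, 70, 74, 82, 93, 94]

median = quantile_by_position(exercise, 1, 2)   # 10.5th observation
q3 = quantile_by_position(exercise, 3, 4)       # 15.75th observation
print(median, q3)
```

Other conventions for locating quantiles exist, so statistical software may report slightly different values for the same data.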
Example
Minimum = 20
Q1 = 36.25
Median = 54.5
Q3 = 73
Maximum = 94
Measures of Variation
Variation / Dispersion / Spread
• Although the arithmetic mean is a concise summary of a data set, it is
inadequate on its own for several reasons; for example, it gives no
indication of its reliability.
• Two data sets can have the same average and still differ considerably
in the variation among the values within each data set.
• Variance
• Standard Deviation
• Range
• Inter Quartile Range
• Semi Inter Quartile Range
• Mean Deviation
Variance
Sample Variance
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$
Standard Deviation
Population Standard Deviation
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
Sample Standard Deviation
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$
Example 1
• Consider the following data on the height (cm) of 5 plants.
2, 4, 6, 8, 10
• Find the average, variance and standard deviation of the height.
  X    $(X - \bar{X})$    $(X - \bar{X})^2$
  2         -4                  16
  4         -2                   4
  6          0                   0
  8          2                   4
 10          4                  16
 30          0                  40   (totals)
$$\bar{X} = \frac{\sum X}{n} = \frac{30}{5} = 6$$
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{40}{5-1} = 10$$
$$S = \sqrt{10} = 3.16$$
Example 2
• Consider the following data on the yield of wheat (in kg) from 8 experimental
plots.
65, 71, 67, 75, 63, 69, 75, 63
• Find the average, variance and standard deviation of the yield.
  X    $(X - 68.5)$    $(X - 68.5)^2$
 65        -3.5             12.25
 71         2.5              6.25
 67        -1.5              2.25
 75         6.5             42.25
 63        -5.5             30.25
 69         0.5              0.25
 75         6.5             42.25
 63        -5.5             30.25
548         0              166      (totals)
$$\bar{X} = \frac{\sum X}{n} = \frac{548}{8} = 68.5$$
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{166}{8-1} = 23.71$$
$$S = \sqrt{23.71} = 4.87$$
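Both worked examples follow the same steps: compute the mean, sum the squared deviations, divide by n - 1, and take the square root. A minimal sketch follows; the function names are chosen here for illustration.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((v - x_bar) ** 2 for v in values) / (n - 1)


def sample_std(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


heights_cm = [2, 4, 6, 8, 10]                  # Example 1
yields_kg = [65, 71, 67, 75, 63, 69, 75, 63]   # Example 2

for label, data in (("heights", heights_cm), ("yields", yields_kg)):
    print(label, round(sample_variance(data), 2), round(sample_std(data), 2))
```

The standard library functions `statistics.variance` and `statistics.stdev` use the same n - 1 divisor and can serve as a cross-check.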
The Range & Coefficient of Range
• The Range R is defined as the difference between the largest and the
smallest observations in a dataset, i.e.,
$$R = X_{max} - X_{min}$$
$$\text{Coeff. of Range} = \frac{X_{max} - X_{min}}{X_{max} + X_{min}}$$
Example
[Figure: two dot plots on an axis from 7 to 12; in both, Range = 12 - 7 = 5.]
• Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
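The effect of a single outlier on the range can be reproduced directly. This sketch uses the two 25-value lists from the example; the function names are illustrative.

```python
def value_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """(X_max - X_min) / (X_max + X_min), a unit-free version of the range."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


# Eleven 1s, eight 2s, four 3s, one 4, then either 5 or the outlier 120.
common = [1] * 11 + [2] * 8 + [3] * 4 + [4]
without_outlier = common + [5]
with_outlier = common + [120]

print(value_range(without_outlier), value_range(with_outlier))
print(round(coefficient_of_range(without_outlier), 3),
      round(coefficient_of_range(with_outlier), 3))
```

Only the two extreme values enter the calculation, which is why changing one observation from 5 to 120 moves the range from 4 to 119 while the other 24 values are unchanged.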
The Mean Deviation OR Average Deviation
Fish 1 2 3 4 5 6 7 8 9 10
Weight 1.8 1.9 2.1 2.4 2.5 2.6 2.7 2.8 3.1 3.2
Length 11 12 12 13 15 15 16 17 18 18
Coefficient of Kurtosis
$$K = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^4}{n S^4} - 3$$
Interpretation
K = 0: mesokurtic
K > 0: leptokurtic
K < 0: platykurtic
Example
Consider the following data:
25, 27, 36, 31, 33, 35, 37
Find the Mean, Variance, Coefficient of Skewness and Coefficient of Kurtosis and
interpret the results.
Mean                  32
Standard Error        1.73
Median                33
Standard Deviation    4.58
Sample Variance       21
Kurtosis              -1.65
Skewness              -0.39
Range                 12
Minimum               25
Maximum               37
Sum                   224
Count                 7
How to do it…
  X    $X - \bar{X}$    $(X - \bar{X})^2$    $(X - \bar{X})^3$    $(X - \bar{X})^4$
 25        -7                 49                -343                2401
 27        -5                 25                -125                 625
 36         4                 16                  64                 256
 31        -1                  1                  -1                   1
 33         1                  1                   1                   1
 35         3                  9                  27                  81
 37         5                 25                 125                 625
224         0                126                -252                3990   (totals)
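The column totals in the table can be reproduced with a few lines of code. This is a sketch only: the kurtosis line applies the formula for K as written in these notes, and software summaries often use bias-corrected variants, so their reported figures can differ from this value.

```python
import math

x = [25, 27, 36, 31, 33, 35, 37]
n = len(x)
x_bar = sum(x) / n

# Deviations from the mean and the sums of their 1st to 4th powers.
deviations = [v - x_bar for v in x]
power_sums = {p: sum(d ** p for d in deviations) for p in (1, 2, 3, 4)}

sample_var = power_sums[2] / (n - 1)      # S^2
sample_sd = math.sqrt(sample_var)         # S
standard_error = sample_sd / math.sqrt(n)

# K = sum((X - mean)^4) / (n * S^4) - 3, as stated in the notes.
k = power_sums[4] / (n * sample_var ** 2) - 3

print(power_sums)
print(sample_var, round(sample_sd, 2), round(standard_error, 2))
print("platykurtic" if k < 0 else "leptokurtic" if k > 0 else "mesokurtic")
```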
Five Number Summary
53 74 82 42 39 28 20 81 68 58
54 93 70 30 61 55 36 37 29 94
Construct Boxplot of the data and interpret it.
Minimum = 20
Q1 = 36.25
Median = 54.5
Q3 = 73
Maximum = 94
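The five-number summary can be cross-checked with Python's standard library. This sketch relies on `statistics.quantiles` with `method="exclusive"`, which places the j-th quartile at position j(n+1)/4 in the sorted data and interpolates linearly, the same rule used in these notes. The inter-quartile range and semi inter-quartile range lines apply the usual definitions (Q3 - Q1 and half of that), which are assumed here since the notes list these measures without formulas.

```python
from statistics import quantiles

scores = [53, 74, 82, 42, 39, 28, 20, 81, 68, 58,
          54, 93, 70, 30, 61, 55, 36, 37, 29, 94]

# quantiles() sorts the data itself; n=4 returns the three quartile cut points.
q1, median, q3 = quantiles(scores, n=4, method="exclusive")

five_number_summary = (min(scores), q1, median, q3, max(scores))
iqr = q3 - q1        # inter-quartile range (assumed definition)
semi_iqr = iqr / 2   # semi inter-quartile range (assumed definition)

print(five_number_summary)
print(iqr, semi_iqr)
```

These five values are what a boxplot draws: the box spans Q1 to Q3 with a line at the median, and the whiskers extend toward the minimum and maximum.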