CH No 3 Statistics
Central Tendency
Measures of Central Tendency
• A Measure of Location summarizes a data set by giving a single quantitative value,
within the range of the data values, that describes its location relative to the entire data set.
• Some Common Measures are:
• Arithmetic Mean / Average
• Geometric Mean
• Harmonic Mean
• Median
• Mode
Arithmetic mean / mean
• Most common measure of the center
• Obtained by dividing the SUM of all the observations by the total
number of observations
Population Mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$
Sample Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
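The definition above can be sketched in a few lines of Python. The function name `arithmetic_mean` and the marks used here are illustrative assumptions, not part of the lecture material.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their count."""
    values = list(values)
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical marks of five students (illustrative data only).
marks = [60, 70, 80, 90, 100]
mean_marks = arithmetic_mean(marks)
print(mean_marks)  # 400 / 5 = 80.0
```

Python's standard library offers the same calculation as `statistics.mean`.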
Properties of Arithmetic mean
1. Mean of the constant is equal to that constant
2. The sum of the deviations of the observations from their mean is
equal to zero, i.e., $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$
Example: $\bar{X} = \frac{\sum X}{n} = \frac{548}{8} = 68.5$
4. If X1, X2, ..., Xn have mean $\bar{X}$, then the mean after multiplying each observation by a
constant 'a' is the original mean multiplied by that constant:
$\bar{X}^{*} = \frac{\sum_{i=1}^{n} a X_i}{n} = a \times \bar{X}$
5. If a constant 'a' is added to each of the observations X1, X2, ..., Xn having mean $\bar{X}$,
then the mean increases by that constant:
$\bar{X}^{*} = \frac{\sum_{i=1}^{n} (a + X_i)}{n} = \bar{X} + a$
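The properties of the arithmetic mean can be checked numerically. The sketch below uses a small hypothetical data set and a hypothetical constant a = 3; it checks the constant, zero-sum-of-deviations, scaling and shifting properties only.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    return sum(values) / len(values)


x = [2, 4, 6, 8, 10]  # hypothetical observations (illustrative only)
a = 3                 # hypothetical constant
x_bar = mean(x)       # 30 / 5 = 6.0

# The mean of a constant is that constant.
mean_of_constant = mean([a] * len(x))

# Deviations from the mean sum to zero.
sum_of_deviations = sum(xi - x_bar for xi in x)

# Multiplying every observation by a multiplies the mean by a.
scaled_mean = mean([a * xi for xi in x])

# Adding a to every observation adds a to the mean.
shifted_mean = mean([a + xi for xi in x])

print(mean_of_constant, sum_of_deviations, scaled_mean, shifted_mean)
```

With other data the sum of deviations may differ from zero by a tiny floating-point error, so a tolerance check such as `math.isclose` is safer than `== 0`.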
Weighted arithmetic mean
Computations:
For n observations $X_1, X_2, \ldots, X_n$ of a data set with corresponding
weights $W_1, W_2, \ldots, W_n$, the weighted arithmetic mean is defined as:
$\bar{X}_w = \frac{\sum_{i=1}^{n} W_i X_i}{\sum_{i=1}^{n} W_i} = \frac{\sum WX}{\sum W}$
Weighted A.M - Calculations
Subjects       Marks (Xi)   Weights (Wi)   WiXi
Statistics     80           20             1600
Mathematics    75           10             750
Chemistry      50           40             2000
English        60           30             1800
Total          265          100            6150
Unweighted mean: $\bar{X} = \frac{\sum X}{n} = \frac{265}{4} = 66.25$
Weighted mean: $\bar{X}_w = \frac{\sum WX}{\sum W} = \frac{6150}{100} = 61.5$
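The calculation in this table can be reproduced with a short Python sketch. The helper name `weighted_mean` is an assumption made for illustration; the marks and weights are the ones from the table.

```python
def weighted_mean(values, weights):
    """Return sum(W * X) / sum(W) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Statistics, Mathematics, Chemistry, English (from the table).
marks = [80, 75, 50, 60]
weights = [20, 10, 40, 30]

simple_mean = sum(marks) / len(marks)             # 265 / 4
weighted_mean_marks = weighted_mean(marks, weights)  # 6150 / 100

print(simple_mean, weighted_mean_marks)  # 66.25 61.5
```

The weighted mean is lower than the simple mean because the heaviest weight (40) falls on the lowest mark (50).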
Example
• An examination was held to decide the award of a scholarship in an
institution. The weights of the various subjects were different. The marks obtained by 3
candidates (A, B, C) out of 100 are given below. If the candidate with the highest
average score is to be awarded the scholarship, who should get it?
Subject       Weights (%)   XA    XB    XC    WXA    WXB    WXC
Statistics    40            70    80    91    2800   3200   3640
Mathematics   30            70    75    80    2100   2250   2400
Economics     20            54    57    40    1080   1140   800
English       10            70    40    45    700    400    450
TOTAL         100           264   252   256   6680   6990   7290
If there are no weights involved:
$\bar{X}(\text{Student A}) = \frac{\sum X}{n} = \frac{264}{4} = 66$
$\bar{X}(\text{Student B}) = 63$
$\bar{X}(\text{Student C}) = 64$
With weights:
$\bar{X}_w(\text{Student A}) = \frac{\sum WX}{\sum W} = \frac{6680}{100} = 66.8$
$\bar{X}_w(\text{Student B}) = 69.9$
$\bar{X}_w(\text{Student C}) = 72.9$
Without weights Student A has the highest average (66), but on the weighted scores Student C has the highest average (72.9) and should get the scholarship.
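The comparison of the three candidates can be sketched in Python as follows. The dictionary layout and the names `weights`, `marks` and `winner` are illustrative choices; the numbers are taken from the table above.

```python
# Subject weights in percent, from the table.
weights = {"Statistics": 40, "Mathematics": 30, "Economics": 20, "English": 10}

# Marks out of 100 for each candidate, from the table.
marks = {
    "A": {"Statistics": 70, "Mathematics": 70, "Economics": 54, "English": 70},
    "B": {"Statistics": 80, "Mathematics": 75, "Economics": 57, "English": 40},
    "C": {"Statistics": 91, "Mathematics": 80, "Economics": 40, "English": 45},
}


def weighted_mean(scores, weights):
    """Return sum(W * X) / sum(W) over the subjects."""
    total = sum(weights[subject] * scores[subject] for subject in weights)
    return total / sum(weights.values())


simple = {c: sum(s.values()) / len(s) for c, s in marks.items()}
weighted = {c: weighted_mean(s, weights) for c, s in marks.items()}

# Candidate with the highest weighted average.
winner = max(weighted, key=weighted.get)

for candidate in marks:
    print(candidate, simple[candidate], round(weighted[candidate], 2))
print("Scholarship goes to candidate", winner)
```

The ranking changes once the weights are applied: candidate A leads on the simple average, while candidate C leads on the weighted average.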
Combined Mean
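• The Geometric Mean (G.M) of a set of n positive values $X_1, X_2, \ldots, X_n$ is the nth root of
their product, computed with logarithms as:
$G.M = \text{Antilog}\left(\frac{\sum_{i=1}^{n} \log X_i}{n}\right)$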
• The Harmonic Mean (H) of a set of n values 𝑥1 , 𝑥2 , … , 𝑥𝑛 is defined as the reciprocal of
the arithmetic mean of the reciprocals of the values.
$H.M = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}}$
Example
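Find the Geometric Mean and Harmonic Mean of the following data.
X       Log (X)   1/X
3       0.477     0.333
5       0.699     0.200
6       0.778     0.167
6       0.778     0.167
7       0.845     0.143
10      1.000     0.100
12      1.079     0.083
Total   49        5.6567    1.1929
$G.M = \text{Antilog}\left(\frac{\sum_{i=1}^{n} \log X_i}{n}\right) = \text{Antilog}\left(\frac{5.6567}{7}\right) = 6.43$
$H.M = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}} = \frac{7}{1.1929} = 5.87$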
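The two results in this example can be checked with a minimal Python sketch. The function names are illustrative; the logarithm is taken to base 10, which matches the Log (X) column (log 10 = 1.000).

```python
import math


def geometric_mean(values):
    """Antilog (base 10) of the mean of the base-10 logarithms."""
    n = len(values)
    return 10 ** (sum(math.log10(x) for x in values) / n)


def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    return len(values) / sum(1 / x for x in values)


data = [3, 5, 6, 6, 7, 10, 12]  # the X column of the example

gm = geometric_mean(data)
hm = harmonic_mean(data)
print(round(gm, 2), round(hm, 2))  # 6.43 5.87
```

The standard library also provides `statistics.geometric_mean` and `statistics.harmonic_mean`, which should give the same values up to rounding.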
Mode
[Figure: two dot plots. The data plotted on the 0 to 14 axis have Mode = 9; the data plotted on the 0 to 6 axis have no mode.]
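A minimal Python sketch of finding the mode is given below. It assumes the convention that a data set has no mode when every distinct value occurs equally often; the function name `modes` and both data sets are hypothetical, chosen only to mirror the two cases "Mode = 9" and "No Mode".

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] when there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    # Assumed convention: if all distinct values are equally frequent,
    # the data set has no mode.
    if len(counts) > 1 and len(set(counts.values())) == 1:
        return []
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical data sets (illustrative only).
with_mode = [1, 3, 5, 7, 9, 9, 9, 12, 14]
without_mode = [0, 1, 2, 3, 4, 5, 6]

print(modes(with_mode))     # [9]
print(modes(without_mode))  # []
```

Returning a list lets the same function report more than one mode when two or more values tie for the highest frequency.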
Median
$\text{Median} = \text{Size of } \left(\frac{n+1}{2}\right)\text{th observation}$
Quartiles
▪ Divide an array into four equal parts, each part having
25% of the distribution of the data values, denoted by Qj
▪ 25% of the observations are below the 1st quartile.
▪ 1st quartile is the 25th percentile; the 2nd quartile is the
50th percentile, also the median and the 3rd quartile is
the 75th percentile.
$Q_j = \text{Size of } j\left(\frac{n+1}{4}\right)\text{th observation}$
Where j = 1, 2, 3
Deciles
▪ Divide an array into ten equal parts, each part having ten
percent of the distribution of the data values, denoted by Dj
▪ 10 percent of the total observations fall below D1 and the
rest 90% are above it.
▪ The 5th decile is equal to Q2 and the median.
$D_j = \text{Size of } j\left(\frac{n+1}{10}\right)\text{th observation}$
Where j = 1, 2, 3, …,9
Percentiles
▪ Divide an array (raw data arranged in increasing or
decreasing order of magnitude) into 100 equal parts.
▪ The jth percentile, denoted as Pj, is the data value in the data
set that separates the bottom j% of the data from the top
(100-j)%.
$P_j = \text{Size of } j\left(\frac{n+1}{100}\right)\text{th observation}$
Where j = 1, 2, 3, …,99
Example
▪ Suppose Ali was told that, relative to the other scores on an NTS
test, his score was at the 95th percentile, i.e., his percentile score
is 95. How do we interpret it?
➔ This means that 95% of those who took the test had scores
less than or equal to Ali’s score, while 5% had scores higher than
Ali’s.
Exercise
• For the following marks obtained by 20 students, find the Median, Q1, Q2, Q3,
D1, D5, P10 and P50. Also show that Median = Q2 = D5 = P50, and
interpret the results.
Raw data:
53 74 82 42 39 28 20 18 68 58 54 93 70 30 61 55 36 37 29 94
Data arranged in ascending order:
Sr. No.   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Marks    18 20 28 29 30 36 37 39 42 53 54 55 58 61 68 70 74 82 93 94
Median & Quartiles
• $\text{Median} = \text{Size of } \left(\frac{n+1}{2}\right)\text{th observation}$
= Size of 10.5th observation
= 10th observation + 0.5 (11th observation – 10th observation)
= 53 + 0.5 (54 – 53)
= 53.5
• $Q_3 = \text{Size of } 3\left(\frac{n+1}{4}\right)\text{th observation}$
= Size of 15.75th observation
= 15th observation + 0.75 (16th observation – 15th observation)
= 68 + 0.75 (70 – 68)
= 69.5
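The positional method used in these calculations can be sketched in Python for any quartile, decile or percentile. The function names `value_at_position` and `quantile` are illustrative assumptions; the data are the 20 marks from the exercise, and fractional positions are resolved by linear interpolation as in the worked steps.

```python
def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    n = len(ordered)
    if not 1 <= position <= n:
        raise ValueError("position falls outside the data")
    lower = int(position)          # whole part of the position
    fraction = position - lower    # fractional part of the position
    if fraction == 0:
        return ordered[lower - 1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


def quantile(data, j, parts):
    """j-th quantile when the array is split into `parts` equal parts."""
    ordered = sorted(data)
    position = j * (len(ordered) + 1) / parts
    return value_at_position(ordered, position)


marks = [53, 74, 82, 42, 39, 28, 20, 18, 68, 58,
         54, 93, 70, 30, 61, 55, 36, 37, 29, 94]

median = quantile(marks, 1, 2)    # 10.5th observation
q1 = quantile(marks, 1, 4)        # 5.25th observation
q2 = quantile(marks, 2, 4)        # 10.5th observation
q3 = quantile(marks, 3, 4)        # 15.75th observation
d1 = quantile(marks, 1, 10)       # 2.1th observation
d5 = quantile(marks, 5, 10)       # 10.5th observation
p10 = quantile(marks, 10, 100)    # 2.1th observation
p50 = quantile(marks, 50, 100)    # 10.5th observation

print(median, q1, q3, round(d1, 2), round(p10, 2))
print(median == q2 == d5 == p50)
```

For small data sets some positions fall outside the data (for example P1 with n = 20 gives position 0.21); this sketch raises an error in that case rather than guessing a value.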
Boxplot / Box & Whisker plot
Minimum = 20
Q1 = 36.25
Median = 54.5
Q3 = 73
Maximum = 94