Unit 3
Chapter 3
Central Tendency
• Any set of data follows a distribution – for example, the ages of
students in a class.
• However, there always exists a Central Tendency.
• For example, the ages of students in Class 12 may vary between 18
and 20, but the central point may be 19.
• Mean, Median and Mode are the three measures of central
tendency.
• Mean is the arithmetic average of a data set. This is found by
adding the numbers in a data set and dividing by the number of
observations in the data set.
• The Mode is the value that occurs most often in a data set.
• Consider the numbers 13, 18, 13, 14, 13, 16, 14, 21, 13.
The Mean = (13+18+13+14+13+16+14+21+13)/9 = 135/9 = 15
• Let us sort these numbers in ascending order:
• 13, 13, 13, 13, 14, 14, 16, 18, 21
• The Median is the middle value of the sorted list. With 9 values
it is the 5th value, which is 14.
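The Mean and Median of the nine numbers above can be checked with a short Python sketch. The variable names are illustrative only, and the standard-library `statistics` module is used purely as a cross-check.

```python
import statistics

# The nine numbers from the example above
values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

# Mean: sum of the observations divided by the number of observations
mean_value = sum(values) / len(values)

# Median: middle value of the sorted list (the count is odd here)
sorted_values = sorted(values)
median_value = sorted_values[len(sorted_values) // 2]

print(sorted_values)   # [13, 13, 13, 13, 14, 14, 16, 18, 21]
print(mean_value)      # 15.0
print(median_value)    # 14

# Cross-check against the standard library
assert statistics.mean(values) == mean_value
assert statistics.median(values) == median_value
```

For a list with an even number of values, the median is usually taken as the average of the two middle values, which `statistics.median` handles automatically.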
Class no.   Class interval   Frequency (f)   Cumulative frequency (cf)
1           Less than 2      4               4
2           2 to 4           7               11
3           4 to 6           10              21
4           6 to 8           12              33
5           8 to 10          14              47
6           10 to 12         6               53
7           12 to 14         15              68
8           14 to 16         13              81
9           16 to 18         1               82
Total                        82
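Each row of the table belongs to a class. With n = 82 observations, the Median is the
(n/2)th = 41st observation. It falls in class 5 (8 to 10), the first class whose
cumulative frequency (47) reaches 41.
For this median class: lower limit l = 8; class width h = 2; n = 82; cumulative
frequency of the preceding classes cf = 33; frequency f = 14
Therefore, using the formula Median = l + h*((n/2 - cf)/f), the Median
= 8 + 2*((41-33)/14) = 8 + 16/14 = 9.1 (approx.)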
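The grouped-data calculation above can be sketched in Python as follows. This is a minimal illustration, not a general-purpose routine: the function name `grouped_median` is hypothetical, and the first class ("Less than 2") is assumed to start at 0, which does not affect the result here.

```python
# Each class as (lower limit, upper limit, frequency).
# Assumption: the "Less than 2" class is taken to start at 0.
classes = [
    (0, 2, 4), (2, 4, 7), (4, 6, 10), (6, 8, 12), (8, 10, 14),
    (10, 12, 6), (12, 14, 15), (14, 16, 13), (16, 18, 1),
]

def grouped_median(classes):
    """Median = l + h * ((n/2 - cf) / f) for the median class."""
    n = sum(f for _, _, f in classes)   # total number of observations
    half = n / 2
    cf = 0                              # cumulative frequency before the current class
    for lower, upper, f in classes:
        if cf + f >= half:              # this class contains the (n/2)th observation
            h = upper - lower           # class width
            return lower + h * (half - cf) / f
        cf += f
    raise ValueError("frequency table is empty")

median = grouped_median(classes)
print(round(median, 2))  # 9.14
```

The loop finds the median class by accumulating frequencies, so the values l = 8, h = 2, cf = 33 and f = 14 are picked up from the table rather than entered by hand.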
Mode
• The number 13 occurs 4 times, and since this is the maximum, the
Mode is 13.
• The Mode is the most frequently occurring number in a set.
• It is also used to depict the central tendency of the distribution.
• At times the Mode is not located centrally, and hence may not
represent the central point. This is, however, not very frequent.
• In the previous example, it is close to both the Median and the Mean
and thus represents the Central Tendency to a large extent.
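Counting how often each value occurs gives the Mode directly. A minimal Python sketch using the standard library, with illustrative variable names:

```python
from collections import Counter
import statistics

# The nine numbers from the earlier example
values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

# Count occurrences of each value and take the most frequent one
counts = Counter(values)
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)  # 13 4

# multimode returns every value tied for the highest count
all_modes = statistics.multimode(values)
print(all_modes)  # [13]
```

`statistics.multimode` is useful when two or more values share the highest count, since a data set can have more than one Mode.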
Weighted Mean
• The weighted mean is a type of mean that is calculated by
multiplying the weight (or probability) associated with a
particular event or outcome with its associated quantitative
outcome and then summing all the products together.
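The definition above can be illustrated with a small Python sketch. The outcomes and weights below are hypothetical numbers chosen only for illustration, and the function name `weighted_mean` is likewise an assumption. The function divides by the total weight so that it also works when the weights do not sum to 1; when the weights are probabilities, that divisor is simply 1.

```python
def weighted_mean(outcomes, weights):
    """Sum of weight * outcome, divided by the total weight."""
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(outcomes, weights)) / total_weight

# Hypothetical example: three outcomes with probabilities 0.2, 0.5 and 0.3
outcomes = [10, 20, 30]
weights = [0.2, 0.5, 0.3]

result = weighted_mean(outcomes, weights)
print(round(result, 2))  # 21.0
```

Here the products are 2, 10 and 9, which sum to 21.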
Covariance
The sample covariance of two data sets X and Y is calculated as
Cov(X, Y) = Σ (Xi – X̄)(Yi – Ȳ) / (n – 1)
Where:
n = Number of scores in each set of data
X̄ = Mean of the n scores in the first data set
Xi = ith raw score in the first set of scores
Ȳ = Mean of the n scores in the second data set
Yi = ith raw score in the second set of scores
Covariance between Economic growth and stock market indices
xi     yi     xi – x̄     yi – ȳ
2.1    8      -0.95      -3
2.5    12     -0.55      1
4.0    14     0.95       3
3.6    10     0.55       -1
(x̄ = 3.05, ȳ = 11)
Now summing the products of the 3rd and 4th columns of the table above (total 4.6) and
dividing by n-1, which is equal to 3, we arrive at a covariance of about 1.5. The
covariance is positive, i.e., the two variables move together in the same
direction.
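The covariance calculation can be reproduced with a short Python sketch. The function and variable names are illustrative; the data are the four pairs from the table.

```python
# Data from the table: economic growth (x) and stock market index (y)
growth = [2.1, 2.5, 4.0, 3.6]
index_values = [8, 12, 14, 10]

def sample_covariance(xs, ys):
    """Sum of (x - mean_x) * (y - mean_y), divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long data sets with at least 2 values")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    total = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return total / (n - 1)

cov = sample_covariance(growth, index_values)
print(round(cov, 2))  # 1.53
```

The sum of the products is 4.6, so the result is 4.6/3, which rounds to the value of about 1.5 quoted above.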
Coefficient of Correlation
The coefficient of correlation provides a measure of the relative strength of the
linear relationship between two numerical variables.
The sample’s coefficient of correlation is represented by the symbol ‘r’, which
ranges from -1 for perfectly negative correlation to +1 for perfectly positive
correlation.
It is calculated as r = Cov(X, Y) / (Sx * Sy),
i.e., the covariance of the two variables divided by the product of their
individual standard deviations Sx and Sy.
In the previous example, the sample standard deviations of the 2 variables (Economic
growth and Stock Market indices) are calculated as 0.90 and 2.58 respectively.
Hence, with the covariance of 4.6/3 = 1.53, r = 1.53/(0.90*2.58) = 0.66, which
indicates a fairly strong, if not very strong, positive correlation.
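The full calculation of r for this example can be sketched in Python as below. The helper names are illustrative assumptions, and the standard deviations use the n - 1 divisor to match the covariance.

```python
import math

# Data from the covariance example
growth = [2.1, 2.5, 4.0, 3.6]
index_values = [8, 12, 14, 10]

def mean(xs):
    return sum(xs) / len(xs)

def sample_std(xs):
    """Sample standard deviation (n - 1 divisor)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def sample_covariance(xs, ys):
    """Sample covariance (n - 1 divisor)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

s_x = sample_std(growth)
s_y = sample_std(index_values)
r = sample_covariance(growth, index_values) / (s_x * s_y)

print(round(s_x, 2), round(s_y, 2))  # 0.9 2.58
print(round(r, 2))                   # 0.66
```

Because r is the covariance scaled by both standard deviations, it has no units and always lies between -1 and +1, which makes it easier to interpret than the covariance itself.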