Unit 7 PDF
Structure
7.1 Introduction
Objectives
7.2 Concept of Rank Correlation
7.3 Derivation of Rank Correlation Coefficient Formula
7.4 Tied or Repeated Ranks
7.5 Concurrent Deviation
7.6 Summary
7.7 Solutions / Answers
7.1 INTRODUCTION
In the second unit of this block, we discussed correlation, its properties and
the calculation of the correlation coefficient. The correlation coefficient, or
product moment correlation coefficient, assumes that both characteristics are
measurable. Sometimes the characteristics are not measurable, but ranks may be
given to individuals according to their qualities. In such situations, rank
correlation is used to measure the association between the two characteristics.
In this unit, we will discuss rank correlation and the calculation of the rank
correlation coefficient, with its merits and demerits. We will also study the
method of concurrent deviation.
In Section 7.2, you will learn the concept of rank correlation, while Section 7.3
gives the derivation of Spearman’s rank correlation coefficient formula. Merits
and demerits of the rank correlation coefficient are discussed in Sub-section
7.3.1. Sometimes two items get the same rank. This situation is called tied or
repeated ranks and is described in Section 7.4. You will learn the method of
concurrent deviation in Section 7.5.
Objectives
After reading this unit, you will be able to:
explain the concept of rank correlation;
derive Spearman’s rank correlation coefficient formula;
describe the merits and demerits of the rank correlation coefficient;
calculate the rank correlation coefficient in case of tied or repeated ranks; and
describe the method of concurrent deviation.
$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n$$
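$$
\begin{aligned}
\sigma_x^2 &= \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2 \\
&= \frac{1}{n}\sum_{i=1}^{n}\left(x_i^2+\bar{x}^2-2x_i\bar{x}\right) \\
&= \frac{1}{n}\left(\sum_{i=1}^{n}x_i^2+\sum_{i=1}^{n}\bar{x}^2-2\bar{x}\sum_{i=1}^{n}x_i\right) \\
&= \frac{1}{n}\left(\sum_{i=1}^{n}x_i^2+n\bar{x}^2-2n\bar{x}^2\right) \\
&= \frac{1}{n}\left(\sum_{i=1}^{n}x_i^2-n\bar{x}^2\right) \\
&= \frac{1}{n}\sum_{i=1}^{n}x_i^2-\bar{x}^2=\frac{1}{n}\left(x_1^2+x_2^2+\dots+x_n^2\right)-\bar{x}^2 \qquad \dots (2)
\end{aligned}
$$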
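Substituting the value of $\bar{x}$ in equation (2), we have
$$\sigma_x^2 = \frac{1}{n}\left(1^2+2^2+\dots+n^2\right)-\left(\frac{n+1}{2}\right)^2$$
(Since X is taking values 1, 2, …, n)
$$\sigma_x^2 = \frac{1}{n}\cdot\frac{n(n+1)(2n+1)}{6}-\left(\frac{n+1}{2}\right)^2$$
(From the formula of sum of squares of n natural numbers)
$$
\begin{aligned}
\sigma_x^2 &= \frac{(n+1)(2n+1)}{6}-\frac{(n+1)^2}{4} \\
&= (n+1)\left[\frac{2n+1}{6}-\frac{n+1}{4}\right] \\
&= (n+1)\left[\frac{2(2n+1)-3(n+1)}{12}\right] \\
&= (n+1)\left[\frac{4n+2-3n-3}{12}\right] \\
&= (n+1)\,\frac{n-1}{12}
\end{aligned}
$$
$$\sigma_x^2 = \frac{n^2-1}{12}$$
(from the formula $(a-b)(a+b) = a^2-b^2$)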
Since both variables X and Y are taking same values, they will have same
variance, thus
$$\sigma_y^2 = \sigma_x^2 = \frac{n^2-1}{12}$$
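As a quick numerical check of this result, the following sketch computes the variance of the ranks 1, 2, ..., n directly (with divisor n, as in the derivation) and compares it with (n² − 1)/12. The function name `rank_variance` and the chosen values of n are illustrative and not part of the text.

```python
from fractions import Fraction

def rank_variance(n):
    """Variance (divisor n) of the ranks 1, 2, ..., n, computed exactly."""
    ranks = range(1, n + 1)
    mean = Fraction(sum(ranks), n)  # equals (n + 1) / 2
    return sum((x - mean) ** 2 for x in ranks) / n

for n in (2, 5, 8, 10):
    # Exact comparison with the closed form (n^2 - 1) / 12
    assert rank_variance(n) == Fraction(n * n - 1, 12)
    print(n, rank_variance(n))
```

Exact fractions are used so that the comparison does not depend on floating-point rounding.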
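Let $d_i$ be the difference of the ranks of the $i$th individual in two characteristics,
then
$$d_i = x_i - y_i$$
$$d_i = x_i - y_i - \bar{x} + \bar{y} \qquad (\text{Since } \bar{x} = \bar{y})$$
$$d_i = (x_i-\bar{x}) - (y_i-\bar{y})$$
$$
\begin{aligned}
\sum_{i=1}^{n} d_i^2 &= \sum_{i=1}^{n}\left[(x_i-\bar{x})-(y_i-\bar{y})\right]^2 \\
&= \sum_{i=1}^{n}\left[(x_i-\bar{x})^2+(y_i-\bar{y})^2-2(x_i-\bar{x})(y_i-\bar{y})\right] \\
&= \sum_{i=1}^{n}(x_i-\bar{x})^2+\sum_{i=1}^{n}(y_i-\bar{y})^2-2\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}) \qquad \dots (3)
\end{aligned}
$$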
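We know that $r = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y}$, which implies that $\mathrm{Cov}(x, y) = r\,\sigma_x \sigma_y$.
Dividing equation (3) by n, we get
$$\frac{1}{n}\sum_{i=1}^{n} d_i^2 = \sigma_x^2+\sigma_y^2-2r\,\sigma_x\sigma_y$$
Since $\sigma_x^2 = \sigma_y^2$, then
$$
\begin{aligned}
\frac{1}{n}\sum_{i=1}^{n} d_i^2 &= \sigma_x^2+\sigma_x^2-2r\,\sigma_x\sigma_x \\
&= 2\sigma_x^2-2r\,\sigma_x^2 \\
&= 2\sigma_x^2(1-r)
\end{aligned}
$$
$$\frac{1}{2n\sigma_x^2}\sum_{i=1}^{n} d_i^2 = (1-r)$$
$$r = 1-\frac{\sum_{i=1}^{n} d_i^2}{2n\sigma_x^2}$$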
$$r = 1-\frac{6\sum_{i=1}^{n} d_i^2}{n(n^2-1)} \qquad \left(\text{Since } \sigma_x^2 = \frac{n^2-1}{12}\right)$$
We denote the rank correlation coefficient by $r_s$, and hence
$$r_s = 1-\frac{6\sum_{i=1}^{n} d_i^2}{n(n^2-1)} \qquad \dots (5)$$
This formula was given by Spearman and hence it is known as Spearman’s
rank correlation coefficient formula.
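A minimal sketch of formula (5) in Python, assuming untied ranks 1 to n in both characteristics. The function names and the sample ranks are illustrative and do not come from the text. The sketch also checks that formula (5) agrees with the product moment correlation coefficient computed directly on the ranks, which is what the derivation above establishes.

```python
from math import sqrt

def spearman_rs(rx, ry):
    """Formula (5): r_s = 1 - 6 * sum(d_i^2) / (n (n^2 - 1)), for untied ranks."""
    n = len(rx)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

def pearson_r(x, y):
    """Product moment correlation coefficient r = Cov(x, y) / (sigma_x sigma_y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

# Hypothetical ranks of five individuals (illustrative data only)
rx = [1, 2, 3, 4, 5]
ry = [2, 1, 4, 5, 3]
print(spearman_rs(rx, ry), pearson_r(rx, ry))
```

The two values coincide only when there are no repeated ranks, since the derivation uses the variance of the ranks 1, 2, ..., n.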
Let us discuss some problems on rank correlation coefficient.
Example 1: Suppose we have the ranks of 8 B.Sc. students in Statistics and
Mathematics. On the basis of these ranks, we would like to know to what extent
the students' knowledge of Statistics and Mathematics is related.
Rank in Statistics 1 2 3 4 5 6 7 8
Rank in Mathematics 2 4 1 5 3 8 7 6
Rank in Statistics   Rank in Mathematics   Difference of Ranks   d_i^2
(R_x)                (R_y)                 d_i = R_x − R_y
1                    2                     −1                    1
2                    4                     −2                    4
3                    1                     2                     4
4                    5                     −1                    1
5                    3                     2                     4
6                    8                     −2                    4
7                    7                     0                     0
8                    6                     2                     4
                                                                 $\sum d_i^2 = 22$
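The total of the squared differences can be checked, and formula (5) applied to it, with a short Python sketch. The variable names are ours, and the value of r_s printed is computed here from formula (5) rather than quoted from the text.

```python
# Ranks of the 8 students from Example 1
stat_rank = [1, 2, 3, 4, 5, 6, 7, 8]
math_rank = [2, 4, 1, 5, 3, 8, 7, 6]

d = [a - b for a, b in zip(stat_rank, math_rank)]   # d_i = R_x - R_y
sum_d2 = sum(v * v for v in d)                      # sum of d_i^2
n = len(d)

# Formula (5), valid here because no rank is repeated
rs = 1 - 6 * sum_d2 / (n * (n**2 - 1))
print(sum_d2, round(rs, 3))  # 22 0.738
```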
Solution: In this problem, we want to see which pairs of subjects have the same
trend, i.e. which pairs of subjects have a positive rank correlation coefficient.
Here we have to calculate three rank correlation coefficients:
$r_{12s}$ = Rank correlation coefficient between the ranks of Computer and Physics
$r_{23s}$ = Rank correlation coefficient between the ranks of Physics and Statistics
$r_{13s}$ = Rank correlation coefficient between the ranks of Computer and Statistics
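$$r_{23s} = 1-\frac{6\sum d_{23}^2}{n(n^2-1)} = 1-\frac{6\times 32}{5\times 24} = 1-\frac{8}{5} = -\frac{3}{5} = -0.6$$
$$r_{13s} = 1-\frac{6\sum d_{13}^2}{n(n^2-1)} = 1-\frac{6\times 14}{5\times 24} = 1-\frac{7}{10} = \frac{3}{10} = 0.3$$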
$r_{12s}$ is negative, which indicates that Computer and Physics have opposite
trends. Similarly, the negative rank correlation coefficient $r_{23s}$ shows the
opposite trend in Physics and Statistics. $r_{13s} = 0.3$ indicates that Computer
and Statistics have the same trend.
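The arithmetic for the two coefficients above can be reproduced exactly with a small Python sketch. It uses only the totals of squared differences (32 and 14) and n = 5 quoted in the solution; the helper name `rs_from_sum_d2` is illustrative.

```python
from fractions import Fraction

def rs_from_sum_d2(sum_d2, n):
    """Spearman's formula from the total of squared rank differences (no ties)."""
    return 1 - Fraction(6 * sum_d2, n * (n * n - 1))

n = 5
r23 = rs_from_sum_d2(32, n)   # Physics and Statistics
r13 = rs_from_sum_d2(14, n)   # Computer and Statistics
print(r23, float(r23))  # -3/5 -0.6
print(r13, float(r13))  # 3/10 0.3
```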
Sometimes we do not have ranks, but the actual values of the variables are
available. If we are interested in the rank correlation coefficient, we find the
ranks from the given values. Let us solve a problem of this kind.
Example 3: Calculate rank correlation coefficient from the following data:
x 78 89 97 69 59 79 68
y 125 137 156 112 107 136 124
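One way to work through this example is sketched below in Python. It assumes that rank 1 is given to the largest value (the convention used in the solution tables of this unit) and that no value is repeated, which holds for this data. The helper `ranks_desc` is a hypothetical name, and the printed result is computed by the sketch, not quoted from the text.

```python
def ranks_desc(values):
    """Rank 1 for the largest value; assumes no repeated values."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

x = [78, 89, 97, 69, 59, 79, 68]
y = [125, 137, 156, 112, 107, 136, 124]

rx, ry = ranks_desc(x), ranks_desc(y)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
n = len(x)

# Spearman's formula for untied ranks
rs = 1 - 6 * sum_d2 / (n * (n**2 - 1))
print(rx, ry, sum_d2, round(rs, 3))
```

Ranking both series from smallest to largest instead would give the same coefficient, because every difference d_i only changes sign.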
Series 50 70 80 80 85 90
In the above example, 80 was repeated twice. It may also happen that two or
more values are each repeated twice or more.
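The assignment of ranks to repeated values can be sketched in Python as follows. The sketch assumes that tied values share the average of the ranks they would otherwise occupy (consistent with the ranks such as 4.5 and 6.5 used in the solutions of this unit) and that rank 1 goes to the largest value. The function name `average_ranks` is illustrative.

```python
def average_ranks(values):
    """Rank 1 for the largest value; tied values share the mean of their ranks."""
    order = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = order.index(v) + 1   # first rank position occupied by v
        count = order.count(v)       # number of times v is repeated
        ranks.append(first + (count - 1) / 2)
    return ranks

series = [50, 70, 80, 80, 85, 90]
print(average_ranks(series))  # [6.0, 5.0, 3.5, 3.5, 2.0, 1.0]
```

The two values of 80 would occupy ranks 3 and 4, so each receives (3 + 4)/2 = 3.5.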
For example, in the following series there is a repetition of 80 and 110.
Observe the values, assign ranks and check them against the following.
When there is a repetition of ranks, a correction factor $\dfrac{m(m^2-1)}{12}$ is added to
$\sum d^2$ in Spearman’s rank correlation coefficient formula, where m is the
number of times a rank is repeated. It is very important to know that this
correction factor is added for every repetition of rank in both characters.
In the first example, the correction factor is added once, which is 2(4 − 1)/12 = 0.5,
while in the second example the correction factors are 2(4 − 1)/12 = 0.5 and
3(9 − 1)/12 = 2, which are added to $\sum d^2$.
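These correction factors follow directly from m(m² − 1)/12, as the short Python sketch below shows. The function name `correction_factor` is illustrative.

```python
from fractions import Fraction

def correction_factor(m):
    """m (m^2 - 1) / 12 for a rank that is repeated m times."""
    return Fraction(m * (m * m - 1), 12)

print(correction_factor(2))  # 1/2
print(correction_factor(3))  # 2
```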
$$r_s = 1-\frac{6\left[\sum d^2+\dfrac{m(m^2-1)}{12}+\dots\right]}{n(n^2-1)}$$
Here rank 6 is repeated three times in rank of x and rank 2.5 is repeated twice
in rank of y, so the correction factor is
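$$\frac{3(3^2-1)}{12}+\frac{2(2^2-1)}{12}$$
Hence rank correlation coefficient is
$$r_s = 1-\frac{6\left[83.50+\dfrac{3(3^2-1)}{12}+\dfrac{2(2^2-1)}{12}\right]}{8(64-1)}$$
$$r_s = 1-\frac{6\left[83.50+\dfrac{3\times 8}{12}+\dfrac{2\times 3}{12}\right]}{8\times 63}$$
$$r_s = 1-\frac{6(83.50+2.50)}{504}$$
$$r_s = 1-\frac{516}{504}$$
$$r_s = 1-1.024 = -0.024$$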
There is a negative association between expenditure on advertisement and
profit.
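The calculation above can be reproduced with a short Python sketch that takes the total of squared differences, the number of pairs, and the size m of every group of repeated ranks. The function name `rs_with_ties` and its parameters are illustrative; the numbers 83.50, 8, 3 and 2 are those used in the solution.

```python
def rs_with_ties(sum_d2, n, tie_sizes):
    """Spearman's coefficient with the correction for repeated ranks.

    tie_sizes lists m for every group of repeated ranks, in x and in y.
    """
    correction = sum(m * (m * m - 1) / 12 for m in tie_sizes)
    return 1 - 6 * (sum_d2 + correction) / (n * (n * n - 1))

rs = rs_with_ties(83.50, 8, [3, 2])
print(round(rs, 3))  # -0.024
```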
Now, let us solve the following exercises.
E2) Calculate rank correlation coefficient from the following data:
x 10 20 30 30 40 45 50
y 15 20 25 30 40 40 40
3. Now the second value is taken as the base and it is compared with the third
value of the series. If the third value is less than the second, ‘−’ is assigned
against the third value. If the third value is greater than the second value, ‘+’
is assigned. If the second and third values are equal, then the ‘=’ sign is assigned.
4. This procedure is repeated up to the last value of the series.
5. Similarly, we obtain column $D_y$ for series y.
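The sign-assignment steps above can be sketched in Python as follows. The function name `deviation_signs` and the sample series are hypothetical. The sketch assumes that the first value of a series receives no sign, since each value is compared with the one before it.

```python
def deviation_signs(series):
    """Compare each value with the one before it and record '+', '-' or '='."""
    signs = []
    for prev, curr in zip(series, series[1:]):
        if curr > prev:
            signs.append("+")
        elif curr < prev:
            signs.append("-")
        else:
            signs.append("=")
    return signs

# Hypothetical series (illustrative values only)
x = [10, 12, 12, 9, 15]
print(deviation_signs(x))  # ['+', '=', '-', '+']
```

Applying the same function to series y gives the second column of signs.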
7.6 SUMMARY
In this unit, we have discussed:
1. The rank correlation which is used to see the association between two
qualitative characteristics;
2. Derivation of the Spearman’s rank correlation coefficient formula;
3. Calculation of the rank correlation coefficient in different situations: (i) when
values of variables are given, (ii) when ranks of individuals in different
characteristics are given and (iii) when repeated ranks are given;
4. Properties of rank correlation coefficient; and
5. Concurrent deviation which provides the direction of correlation.
$$r_s = 1-\frac{6\left[\sum d^2+\dfrac{m(m^2-1)}{12}+\dots\right]}{n(n^2-1)}$$
Here, rank 4.5 is repeated twice in rank of x and rank 2 is repeated
thrice in rank of y so the correction factor is
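$$\frac{2(2^2-1)}{12}+\frac{3(3^2-1)}{12}$$
and therefore, rank correlation coefficient is
$$r_s = 1-\frac{6\left[2.5+\dfrac{2(2^2-1)}{12}+\dfrac{3(3^2-1)}{12}\right]}{7(49-1)}$$
$$r_s = 1-\frac{6\left[2.5+\dfrac{2\times 3}{12}+\dfrac{3\times 8}{12}\right]}{7\times 48}$$
$$r_s = 1-\frac{6(2.5+2.5)}{336}$$
$$r_s = 1-\frac{30}{336} = \frac{306}{336}$$
$$r_s = 0.91$$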
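The whole solution of E2 can be checked from the raw data with the Python sketch below. It assumes average ranks for repeated values with rank 1 for the largest value, and adds one correction term per group of repeated ranks in each series. The function names are illustrative and are not taken from the text.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 for the largest value; tied values share the mean of their ranks."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 + (order.count(v) - 1) / 2 for v in values]

def rs_tied(x, y):
    """Return (sum of d^2, total correction, tie-corrected r_s)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    # One term m(m^2 - 1)/12 for every repeated rank, in x and in y
    sizes = [m for r in (rx, ry) for m in Counter(r).values() if m > 1]
    correction = sum(m * (m * m - 1) / 12 for m in sizes)
    return sum_d2, correction, 1 - 6 * (sum_d2 + correction) / (n * (n * n - 1))

x = [10, 20, 30, 30, 40, 45, 50]
y = [15, 20, 25, 30, 40, 40, 40]
sum_d2, correction, rs = rs_tied(x, y)
print(sum_d2, correction, round(rs, 2))  # 2.5 2.5 0.91
```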
E3) We have some calculations in the following table:
x      Rank of x (Rx)   y     Rank of y (Ry)   d = Rx − Ry   d²
70     6.5              90    2                4.5           20.25
70     6.5              90    2                4.5           20.25
80     4                90    2                2             4
80     4                80    4                0             0
80     4                70    5                −1            1
90     2                60    6                −4            16
100    1                50    7                −6            36
$\sum d^2 = 97.5$
$$r = \sqrt{\frac{2\times 3-5}{5}} = \sqrt{\frac{1}{5}}$$