12 Correlation and Rank Correlation 05-02-2024

Correlation and Rank Correlation

by
Dr. Rajesh Moharana

Department of Mathematics
School of Advanced Sciences
Vellore Institute of Technology
Vellore, India

MAT 2001 (Statistics for Engineers)


Correlation

Correlation is a statistical measure of the degree of association
between two or more variables.

Definition
A relationship between two variables in which a change in one
variable is accompanied by a positive or negative change in the other
is called correlation. The term also covers the case where a larger
change in one variable is accompanied by a correspondingly larger or
smaller change in the other.


Types of correlation: Correlation is classified into several types.

Positive or negative: When two variables tend to move together in the
same direction, the correlation is called positive; when they tend to
move in opposite directions, it is called negative.
Simple and multiple: When we study only two variables, say X and Y,
the relationship is described as simple correlation. If we study more
than two variables simultaneously, it is called multiple correlation.
Partial and total: When more than two variables are involved and the
relationship between two of them is studied with the influence of the
remaining variables excluded, it is called partial correlation. If all
the variables are taken into account, it is called total correlation.


Coefficient of correlation: The degree of linear relationship between
two variables, expressed as a single number, is called the coefficient
of correlation. It is denoted by r or ρ, or by r_XY or ρ_XY when the
variables need to be named.
Let X and Y denote two variables and r the coefficient of correlation
between them. Depending on the value of r, correlation is classified
as follows.
If r = 1, X and Y increase or decrease together in exact linear
proportion. In this case we say that there is perfect positive
correlation.
If r = −1, one variable decreases in exact linear proportion as the
other increases. In this case we say that there is perfect negative
correlation.
If r = 0, we say that there is no linear relationship between X and Y
(the variables are uncorrelated).


If 0 < r < 1, there is partial (imperfect) positive correlation
between X and Y, stronger the closer r is to 1.
If −1 < r < 0, there is partial (imperfect) negative correlation
between X and Y, stronger the closer r is to −1.

Properties of the coefficient of correlation:

1. The coefficient of correlation measures the strength of the linear
   relationship between two variables.
2. Its value lies between −1 and 1, i.e., −1 ≤ r ≤ 1.
3. If r = 0, the variables are said to be uncorrelated. Independent
   variables always have r = 0, but r = 0 alone does not imply
   independence.
4. If r = ±1, the correlation is perfect.
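
A caution about the case r = 0 can be illustrated with a minimal Python sketch. The example is hypothetical and not taken from the slides: X is assumed uniform on {−1, 0, 1} and Y = X². The covariance, and hence r, is zero even though Y is completely determined by X.

```python
from fractions import Fraction

# Hypothetical example (not from the slides): X uniform on {-1, 0, 1}, Y = X**2.
xs = [-1, 0, 1]
p = Fraction(1, 3)  # probability of each value of X

e_x = sum(p * x for x in xs)           # E(X)
e_y = sum(p * x**2 for x in xs)        # E(Y) = E(X^2)
e_xy = sum(p * x * x**2 for x in xs)   # E(XY) = E(X^3)

cov_xy = e_xy - e_x * e_y              # numerator of r
print(cov_xy)                          # 0, so r = 0

# Dependence check: P(X = 0, Y = 0) differs from P(X = 0) * P(Y = 0).
p_joint = p                            # X = 0 forces Y = 0
p_product = p * p                      # P(X = 0) = 1/3 and P(Y = 0) = 1/3
print(p_joint == p_product)            # False, so X and Y are not independent
```

Exact fractions are used so that the zero covariance is not blurred by floating-point rounding.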


Methods of finding the coefficient of correlation:

The coefficient of correlation may be computed by any one or more of
the following methods.
Scatter diagram
Direct method (Karl Pearson’s method)
Two-way frequency table method
Concurrent deviation method



Karl Pearson’s Correlation Coefficient Formula

In this method the coefficient of correlation between two variables X
and Y is given by

$$
r = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}
  = \frac{E\big((X - \mu_X)(Y - \mu_Y)\big)}{\sigma_X \sigma_Y}
  = \frac{E(XY) - E(X)E(Y)}{\sigma_X \sigma_Y},
$$

where $\sigma_X$ and $\sigma_Y$ are the standard deviations, and
$\mu_X$ and $\mu_Y$ are the means of X and Y, respectively.


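If $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ are n paired
observations, then

$$
r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n\,\sigma_x \sigma_y}
  = \frac{\frac{1}{n}\sum x_i y_i - \frac{1}{n^2}\sum x_i \sum y_i}
         {\sqrt{\left[\frac{1}{n}\sum x_i^2 - \left(\frac{1}{n}\sum x_i\right)^2\right]
                \left[\frac{1}{n}\sum y_i^2 - \left(\frac{1}{n}\sum y_i\right)^2\right]}},
$$

where $\bar{x} = \frac{1}{n}\sum x_i$ and $\bar{y} = \frac{1}{n}\sum y_i$
are the means of the variables x and y, respectively.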



Problems

Problem 1: The joint probability function of X and Y is given by

$$
f(x, y) = \frac{x + y}{21}; \quad x = 1, 2, 3 \ \text{and} \ y = 1, 2.
$$

Find the correlation coefficient between X and Y.
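
The slides give no solution to Problem 1. One way to check an answer is the Python sketch below, which applies r = (E(XY) − E(X)E(Y)) / (σX σY) directly to the joint probability function. The helper names `f`, `expect` and `support` are illustrative choices, not part of the original material.

```python
from fractions import Fraction
from math import sqrt

# Joint pmf from Problem 1: f(x, y) = (x + y) / 21 for x in {1, 2, 3}, y in {1, 2}.
support = [(x, y) for x in (1, 2, 3) for y in (1, 2)]

def f(x, y):
    return Fraction(x + y, 21)

def expect(g):
    """Expected value of g(X, Y) under the joint pmf f."""
    return sum(g(x, y) * f(x, y) for x, y in support)

assert expect(lambda x, y: 1) == 1  # sanity check: the pmf sums to one

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
cov_xy = expect(lambda x, y: x * y) - e_x * e_y
var_x = expect(lambda x, y: x * x) - e_x**2
var_y = expect(lambda x, y: y * y) - e_y**2

r = float(cov_xy) / sqrt(var_x * var_y)
print(cov_xy, round(r, 4))
```

Under these definitions the covariance comes out as −2/147 and r is roughly −0.035, a very weak negative correlation.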

Problem 2: If the joint pdf of (X, Y) is f(x, y) = 1/4 for 0 ≤ x ≤ 2
and 0 ≤ y ≤ 2, find the correlation coefficient.
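
The slides state no solution to Problem 2. A possible approach is sketched here, assuming the density equals 1/4 on the square 0 ≤ x ≤ 2, 0 ≤ y ≤ 2 and is zero elsewhere:

$$
\begin{aligned}
E(X) &= \int_0^2\!\!\int_0^2 \frac{x}{4}\,dy\,dx = 1, \qquad E(Y) = 1,\\
E(XY) &= \int_0^2\!\!\int_0^2 \frac{xy}{4}\,dy\,dx = \frac{1}{4}\cdot 2\cdot 2 = 1,\\
\operatorname{Cov}(X, Y) &= E(XY) - E(X)E(Y) = 1 - 1 = 0 \quad\Longrightarrow\quad r = 0.
\end{aligned}
$$

The same conclusion follows from noting that f(x, y) = (1/2)(1/2) factorises into the two marginal densities, so X and Y are independent.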

Problem 3: Find the coefficient of correlation for the following data
and comment.

Fertilisers used:  15  18  20   24   30   35   40   50
Productivity:      85  93  95  105  120  130  150  160


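Solution 3: Let X denote the fertiliser used and Y denote the
productivity. Computation of the coefficient of correlation:

$$
\bar{x} = \frac{1}{8}\sum x_i = \frac{232}{8} = 29
\quad\text{and}\quad
\bar{y} = \frac{1}{8}\sum y_i = \frac{938}{8} = 117.25
$$

$$
\therefore\; r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n\,\sigma_x \sigma_y}
  = \frac{2317}{\sqrt{1022 \times 5343.5}} \approx 0.99
$$

There is a high degree of positive correlation between fertiliser used
and productivity.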
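
The arithmetic in Solution 3 can be checked with a short Python sketch. The function name `pearson_r` is an illustrative choice. It uses sums of deviations from the means, so the 1/n factors in the covariance and in the two standard deviations cancel.

```python
from math import sqrt

# Data from Problem 3.
fertiliser = [15, 18, 20, 24, 30, 35, 40, 50]
productivity = [85, 93, 95, 105, 120, 130, 150, 160]

def pearson_r(xs, ys):
    """Karl Pearson's coefficient for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # n * sigma_x * sigma_y equals sqrt(sxx * syy) when sigma uses 1/n.
    return sxy / sqrt(sxx * syy)

r = pearson_r(fertiliser, productivity)
print(round(r, 2))
```

The result rounds to 0.99, in agreement with the value quoted in the solution.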



Spearman’s Rank Correlation Coefficient

This method is based on ranks and is useful for measuring qualitative
characteristics such as beauty, intelligence, character, etc. It is
applicable only to individual observations. It is defined as

$$
r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)},
$$

where n is the number of paired observations and $\sum d^2$ is the sum
of the squared differences between the two ranks of each pair.


Problem 4: A random sample of five college students is selected and
their grades in mathematics and statistics are found to be

Mathematics:  85  60  73  40  90
Statistics:   93  75  65  50  80

Calculate the rank correlation coefficient.


Solution 4: Calculation of the rank correlation coefficient

Marks in Math (X)  Rank of X  Marks in Stat (Y)  Rank of Y  Rank difference d  d²
       85              2             93              1              1          1
       60              4             75              3              1          1
       73              3             65              4             −1          1
       40              5             50              5              0          0
       90              1             80              2             −1          1

Here $\sum d^2 = 4$ and n = 5, so

$$
r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 4}{5(5^2 - 1)} = \frac{4}{5} = 0.8.
$$
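
The ranking and the formula in Solution 4 can be reproduced with a minimal Python sketch. The helper names are illustrative, and the ranking function assumes there are no tied values, which holds for this data set.

```python
def ranks_desc(values):
    """Rank 1 for the largest value; assumes all values are distinct."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

def spearman_no_ties(xs, ys):
    """Spearman's r = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) for untied data."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_desc(xs), ranks_desc(ys)))
    return 1 - 6 * d2 / (n * (n**2 - 1))

# Data from Problem 4.
maths = [85, 60, 73, 40, 90]
stats = [93, 75, 65, 50, 80]

print(ranks_desc(maths))   # ranks of X: [2, 4, 3, 5, 1]
print(ranks_desc(stats))   # ranks of Y: [1, 3, 4, 5, 2]
print(spearman_no_ties(maths, stats))
```

The squared rank differences sum to 4, giving r = 1 − 24/120 = 0.8.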


Rank correlation coefficient when the ranks are tied:

When two or more values are equal, it is customary to give each of
them the average of the ranks they would otherwise have received. In
this case the formula for the rank correlation coefficient takes the
form

$$
r = 1 - \frac{6\left(\sum d^2 + T_x + T_y\right)}{n(n^2 - 1)},
$$

where

$$
T_x = \sum_i \frac{t_i^3 - t_i}{12}, \qquad T_y = \sum_j \frac{t_j^3 - t_j}{12}.
$$

Here i indexes the ties in the variable X and $t_i$ is the number of
ranks tied in the ith tie; j indexes the ties in the variable Y and
$t_j$ is the number of ranks tied in the jth tie.


Problem 5: Compute the rank correlation coefficient for the following
data

x:  68  64  75  50  64  80  75  40  55  64
y:  62  58  68  45  81  60  68  48  50  70


Solution 5: In the first ranking there are two sets of ties (75 occurs
twice and 64 occurs three times), and in the second ranking there is
one set of ties (68 occurs twice).

 X   Rank of X   Y   Rank of Y    d    d²
68      4       62      5        −1     1
64      6       58      7        −1     1
75      2.5     68      3.5      −1     1
50      9       45     10        −1     1
64      6       81      1         5    25
80      1       60      6        −5    25
75      2.5     68      3.5      −1     1
40     10       48      9         1     1
55      8       50      8         0     0
64      6       70      2         4    16

Hence $\sum d^2 = 72$.


Therefore,

$$
T_x = \frac{1}{12}\left\{(2^3 - 2) + (3^3 - 3)\right\} = 2.5,
$$

$$
T_y = \frac{1}{12}\left\{(2^3 - 2)\right\} = 0.5,
$$

$$
r = 1 - \frac{6(72 + 2.5 + 0.5)}{990} \approx 0.545.
$$
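
The tied-rank computation can be sketched in Python as follows. The function names are illustrative. The code follows the tie-corrected formula given in these slides, so its result may differ slightly from library routines that instead compute Pearson's coefficient on the average ranks.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 for the largest value; tied values share the mean of their ranks."""
    order = sorted(values, reverse=True)
    positions = {}
    for pos, v in enumerate(order, start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (t^3 - t) / 12 over every group of t tied values."""
    return sum((t**3 - t) / 12 for t in Counter(values).values() if t > 1)

def spearman_tied(xs, ys):
    """r = 1 - 6 * (sum(d^2) + Tx + Ty) / (n * (n^2 - 1))."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2
             for rx, ry in zip(average_ranks(xs), average_ranks(ys)))
    adjusted = d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * adjusted / (n * (n**2 - 1))

# Data from Problem 5.
x = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]

print(tie_correction(x), tie_correction(y))
print(round(spearman_tied(x, y), 3))
```

For this data the corrections are 2.5 and 0.5, and r rounds to 0.545.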



