CORRELATION
Correlation is the relationship between two or more variables. Two variables are said to be
correlated if a change in one variable is accompanied by a corresponding change in the other variable.
For example, when the price of a commodity rises, the supply of that commodity also rises.
Positive correlation:
If two variables move in the same direction, the correlation is said to be positive.
That is, an increase in the value of one variable is accompanied by an increase in the value of the
other, or a decrease in the value of one variable is accompanied by a decrease in the value of the other.
Negative correlation:
If two variables move in opposite directions, the correlation is said to be negative.
That is, an increase in the value of one variable is accompanied by a decrease in the value of the
other, or a decrease in the value of one variable is accompanied by an increase in the value of the other.
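Linear correlation:
When a change in one variable bears a constant ratio to the corresponding change in the other
variable, the correlation is said to be linear.
x     5   10   15   20
y    10   20   30   40
Here the ratio of x to y is 1:2 throughout.
In the following data the changes in y bear no constant ratio to the changes in x, so the
correlation is not linear.
x     5   10   15   20
y    12   23    8   25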
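The constant-ratio test for linear correlation can be checked mechanically. The following is a minimal Python sketch using the two data sets above; the function names change_ratios and is_linear are illustrative and not part of the original notes.

```python
from fractions import Fraction

def change_ratios(x, y):
    """Ratio (change in y) / (change in x) for each consecutive pair of points."""
    return [Fraction(y[i + 1] - y[i], x[i + 1] - x[i]) for i in range(len(x) - 1)]

def is_linear(x, y):
    """True when every consecutive change ratio is identical."""
    return len(set(change_ratios(x, y))) == 1

x = [5, 10, 15, 20]
y_first = [10, 20, 30, 40]   # first data set
y_second = [12, 23, 8, 25]   # second data set

print(change_ratios(x, y_first), is_linear(x, y_first))
print(change_ratios(x, y_second), is_linear(x, y_second))
```

Exact fractions are used so that the comparison of ratios is not affected by floating-point rounding.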
Simple correlation
When the relationship between only two variables is studied, the correlation is said to be simple.
Partial correlation
In partial correlation we study the relationship between two variables while the other variable is
held constant. With three variables, the third variable is kept constant.
Multiple correlation
In multiple correlation we study the relationship between one variable on one side and all the
remaining variables, taken together, on the other side.
1. Scatter diagram
This is a graphical method of studying correlation between two variables. One of the variables is
shown on the X-axis and the other on the Y-axis. Each pair of values is plotted on the graph as a
dot. After all the items are plotted, we get as many dots on the graph paper as there are pairs of
values. If these points show an upward trend (bottom left to top right), the correlation is said
to be positive. If these points show a downward trend (top left to bottom right), the correlation
is said to be negative. If the plotted points do not show any trend, there is no correlation.
[Figure: three scatter diagrams showing positive correlation, no correlation and negative correlation]
2. Coefficient of correlation
The correlation coefficient is denoted by ‘r’. The value of the correlation coefficient lies between -1 and +1.
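$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} $$
In terms of the deviations dx and dy:
$$ r = \frac{n\sum dx\,dy - \sum dx \sum dy}{\sqrt{n\sum dx^2 - (\sum dx)^2}\,\sqrt{n\sum dy^2 - (\sum dy)^2}} $$
Karl Pearson’s correlation coefficient:
$$ r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} $$
In case of tied ranks, Spearman’s rank correlation coefficient:
$$ r = 1 - \frac{6\left[\sum D^2 + \sum \frac{m^3 - m}{12}\right]}{n(n^2 - 1)} $$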
Example: calculate the correlation coefficient for the following data.
X    2    3    4    5    6    7    8
Y    4    5    6   12    9    5    4
Solution:
    X      Y     XY     X²     Y²
    2      4      8      4     16
    3      5     15      9     25
    4      6     24     16     36
    5     12     60     25    144
    6      9     54     36     81
    7      5     35     49     25
    8      4     32     64     16
Total
   35     45    228    203    343
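With n = 7:
$$ r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}} $$
$$ = \frac{1596 - 1575}{\sqrt{1421 - 1225}\,\sqrt{2401 - 2025}} $$
$$ = \frac{21}{\sqrt{196}\,\sqrt{376}} = \frac{21}{14 \times 19.39} \approx 0.077 $$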
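The arithmetic of this example can be verified with a short Python sketch of the product-moment formula. The function name pearson_r is illustrative; only the data and the formula come from the notes.

```python
import math

def pearson_r(x, y):
    """Correlation coefficient from the raw sums of x, y, xy, x^2 and y^2."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

X = [2, 3, 4, 5, 6, 7, 8]
Y = [4, 5, 6, 12, 9, 5, 4]

r = pearson_r(X, Y)
print(round(r, 4))  # about 0.0774, a very weak positive correlation
```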
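Spearman’s rank correlation coefficient:
$$ r = 1 - \frac{6\sum D^2}{n(n^2 - 1)} $$
where D is the difference between the two ranks of an item and n is the number of items.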
Ex.1. The rankings of 10 individuals at the start and at the finish of a course of training are as
follows. Calculate the rank correlation coefficient.
Individuals :   A   B   C   D   E   F   G   H   I   J
Rank before :   1   6   3   9   5   2   7  10   8   4
Rank after  :   6   8   3   2   7  10   5   9   4   1
Solution:
Rank before   Rank after   |D|    D²
     1             6        5     25
     6             8        2      4
     3             3        0      0
     9             2        7     49
     5             7        2      4
     2            10        8     64
     7             5        2      4
    10             9        1      1
     8             4        4     16
     4             1        3      9
Sum of D² = 176
Rank correlation coefficient:
$$ r = 1 - \frac{6\sum D^2}{n(n^2 - 1)} = 1 - \frac{6 \times 176}{10(100 - 1)} = 1 - 1.067 = -0.067 $$
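The result of Ex.1 can be reproduced with a few lines of Python. This is a sketch that assumes the ranks are already given and contain no ties; the function name spearman_from_ranks is illustrative.

```python
def spearman_from_ranks(rank_a, rank_b):
    """Spearman's coefficient 1 - 6*sum(D^2) / (n(n^2 - 1)) for untied ranks."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

rank_before = [1, 6, 3, 9, 5, 2, 7, 10, 8, 4]
rank_after = [6, 8, 3, 2, 7, 10, 5, 9, 4, 1]

r = spearman_from_ranks(rank_before, rank_after)
print(round(r, 3))  # about -0.067
```

A value this close to zero suggests that rank before the course says almost nothing about rank after it.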
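Ex.2. Ten competitors in a beauty contest are ranked by three judges in the following order. Use
the rank correlation coefficient to determine which pair of judges has the nearest approach to
common taste in beauty.
First judge  :   1   6   5  10   3   2   4   9   7   8
Second judge :   3   5   8   4   7  10   2   1   6   9
Third judge  :   6   4   9   8   1   2   3  10   5   7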
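First and second judges:
First judge   Second judge   |D|    D²
     1              3         2      4
     6              5         1      1
     5              8         3      9
    10              4         6     36
     3              7         4     16
     2             10         8     64
     4              2         2      4
     9              1         8     64
     7              6         1      1
     8              9         1      1
TOTAL                               200
Rank correlation coefficient:
$$ r = 1 - \frac{6\sum D^2}{n(n^2 - 1)} = 1 - \frac{6 \times 200}{10(100 - 1)} = 1 - 1.212 = -0.212 $$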
First and third judges:
First judge   Third judge   |D|    D²
     1             6         5     25
     6             4         2      4
     5             9         4     16
    10             8         2      4
     3             1         2      4
     2             2         0      0
     4             3         1      1
     9            10         1      1
     7             5         2      4
     8             7         1      1
TOTAL                               60
Rank correlation coefficient:
$$ r = 1 - \frac{6\sum D^2}{n(n^2 - 1)} = 1 - \frac{6 \times 60}{10(100 - 1)} = 1 - 0.364 $$
Rank correlation coefficient, r = 0.636
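Second and third judges:
Second judge   Third judge   |D|    D²
     3              6         3      9
     5              4         1      1
     8              9         1      1
     4              8         4     16
     7              1         6     36
    10              2         8     64
     2              3         1      1
     1             10         9     81
     6              5         1      1
     9              7         2      4
TOTAL                               214
Rank correlation coefficient:
$$ r = 1 - \frac{6\sum D^2}{n(n^2 - 1)} = 1 - \frac{6 \times 214}{10(100 - 1)} = 1 - 1.297 = -0.297 $$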
The rank correlation coefficient for the first and third judges is greater than that of the other
two pairs. Therefore the first and third judges think most alike and have the nearest approach to
common taste in beauty.
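The three pairwise comparisons of Ex.2 can be run in one pass. The sketch below assumes untied ranks and uses illustrative names (spearman_from_ranks, judges, results); it picks the pair of judges with the largest coefficient.

```python
from itertools import combinations

def spearman_from_ranks(rank_a, rank_b):
    """Spearman's coefficient 1 - 6*sum(D^2) / (n(n^2 - 1)) for untied ranks."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

judges = {
    "first":  [1, 6, 5, 10, 3, 2, 4, 9, 7, 8],
    "second": [3, 5, 8, 4, 7, 10, 2, 1, 6, 9],
    "third":  [6, 4, 9, 8, 1, 2, 3, 10, 5, 7],
}

# Coefficient for every pair of judges.
results = {
    (a, b): spearman_from_ranks(judges[a], judges[b])
    for a, b in combinations(judges, 2)
}
for pair, r in results.items():
    print(pair, round(r, 3))

# The pair with the highest coefficient has the most similar taste.
closest_pair = max(results, key=results.get)
print("closest pair:", closest_pair)
```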
REPEATED RANK
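In case of tied ranks, Spearman’s rank correlation coefficient:
$$ r = 1 - \frac{6\left[\sum D^2 + \sum \frac{m^3 - m}{12}\right]}{n(n^2 - 1)} $$
where m is the number of times a repeated value occurs.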
Obtain the rank correlation coefficient for the following data
X 68 64 75 50 64 80 75 40 55 64
Y 62 58 68 45 81 60 68 48 50 70
Solution:
    X    Rank of X     Y    Rank of Y     D²
   68       4         62       5           1
   64       6         58       7           1
   75       2.5       68       3.5         1
   50       9         45      10           1
   64       6         81       1          25
   80       1         60       6          25
   75       2.5       68       3.5         1
   40      10         48       9           1
   55       8         50       8           0
   64       6         70       2          16
TOTAL                                     72
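In X, 75 occurs 2 times: m = 2, m³ - m = 6
In X, 64 occurs 3 times: m = 3, m³ - m = 24
In Y, 68 occurs 2 times: m = 2, m³ - m = 6
Hence the sum of (m³ - m) is 36.
Spearman’s rank correlation coefficient:
$$ r = 1 - \frac{6\left[\sum D^2 + \sum \frac{m^3 - m}{12}\right]}{n(n^2 - 1)} = 1 - \frac{6\left[72 + \frac{36}{12}\right]}{10(100 - 1)} $$
$$ = 1 - \frac{6 \times 75}{990} = 1 - 0.455 = 0.545 $$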
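The tied-rank procedure can also be sketched in Python, starting from the raw X and Y values. The helper names average_ranks, tie_correction and spearman_tied are illustrative. The sketch assumes, as in the table above, that rank 1 goes to the largest value and that tied values share the mean of the positions they occupy.

```python
from collections import Counter

def average_ranks(values):
    """Rank with 1 = largest value; tied values share the mean of their positions."""
    positions = {}
    for pos, v in enumerate(sorted(values, reverse=True), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value that occurs m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_tied(x, y):
    """Spearman's coefficient with the correction term for repeated ranks."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    correction = tie_correction(x) + tie_correction(y)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))

X = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
Y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]

print(average_ranks(X))
print(average_ranks(Y))
r = spearman_tied(X, Y)
print(round(r, 3))  # about 0.545
```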
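Coefficient of concurrent deviation:
$$ r = \pm\sqrt{\pm\frac{2C - N}{N}} $$
where C is the number of concurrent deviations (positive signs in the Dx Dy column) and N is the
number of pairs of deviations. The sign of (2C - N)/N is used both inside and outside the square root.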
Ex. Calculate the coefficient of concurrent deviation from the following data.
X: 20 25 30 15 28 32 35 17 29
Y: 30 18 25 10 30 25 15 30 27
    X     Y    Direction of change   Direction of change   Dx Dy
               of X (Dx)             of Y (Dy)
   20    30        ...                   ...                ...
   25    18         +                     -                  -
   30    25         +                     +                  +
   15    10         -                     -                  +
   28    30         +                     +                  +
   32    25         +                     -                  -
   35    15         +                     -                  -
   17    30         -                     +                  -
   29    27         +                     -                  -
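Here C = 3 (the number of + signs in the Dx Dy column) and N = 8.
$$ r = \pm\sqrt{\pm\frac{2C - N}{N}} = -\sqrt{-\frac{2(3) - 8}{8}} = -\sqrt{0.25} = -0.5 $$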
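The counting of signs in this example can be automated. The following Python sketch uses the illustrative names sign and concurrent_deviation. It assumes that a pair in which either variable does not change is counted as non-concurrent, a case the example above does not contain.

```python
import math

def sign(value):
    """Return +1, -1 or 0 according to the sign of value."""
    return (value > 0) - (value < 0)

def concurrent_deviation(x, y):
    """Coefficient of concurrent deviation, with the sign of (2C - N)/N carried outside the root."""
    dx = [sign(b - a) for a, b in zip(x, x[1:])]   # direction of change of X
    dy = [sign(b - a) for a, b in zip(y, y[1:])]   # direction of change of Y
    n = len(dx)                                    # N: number of pairs of deviations
    c = sum(1 for a, b in zip(dx, dy) if a * b > 0)  # C: concurrent deviations
    ratio = (2 * c - n) / n
    return math.copysign(math.sqrt(abs(ratio)), ratio)

X = [20, 25, 30, 15, 28, 32, 35, 17, 29]
Y = [30, 18, 25, 10, 30, 25, 15, 30, 27]

r = concurrent_deviation(X, Y)
print(r)  # -0.5
```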
Note:
4) The correlation coefficient does not change with a change of origin or a change of scale.
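This property can be illustrated numerically. The sketch below reuses the X and Y data of the earlier product-moment example and computes r as cov(x, y) / (σx σy). The shift and scale constants (5, 2, 6, 3) are arbitrary choices for illustration, with both scale factors taken as positive.

```python
import math

def pearson_r(x, y):
    """Correlation coefficient as cov(x, y) / (sigma_x * sigma_y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (sd_x * sd_y)

X = [2, 3, 4, 5, 6, 7, 8]
Y = [4, 5, 6, 12, 9, 5, 4]

# Change of origin (subtract a constant) and change of scale (divide by a positive constant).
U = [(a - 5) / 2 for a in X]
V = [(b - 6) / 3 for b in Y]

print(round(pearson_r(X, Y), 6))
print(round(pearson_r(U, V), 6))  # same value as the line above
```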
Probable error
Probable error is used to measure the reliability and dependability of the value of the correlation
coefficient. If the probable error is added to and subtracted from the value of the correlation
coefficient, we get two limits within which the value of the correlation coefficient may be expected to lie.
$$ \text{P.E.} = \text{Probable error} = \frac{0.6745\,(1 - r^2)}{\sqrt{n}} $$
$$ \text{S.E.} = \text{Standard error} = \frac{1 - r^2}{\sqrt{n}} $$
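As an illustration, the limits r - P.E. and r + P.E. can be computed for the value r = 0.636 with n = 10 obtained for the first and third judges in Ex.2. This is a sketch with illustrative function names, and it assumes the square root of n in the denominator as in the standard formula.

```python
import math

def standard_error(r, n):
    """S.E. = (1 - r^2) / sqrt(n)."""
    return (1 - r ** 2) / math.sqrt(n)

def probable_error(r, n):
    """P.E. = 0.6745 * S.E."""
    return 0.6745 * standard_error(r, n)

r, n = 0.636, 10
pe = probable_error(r, n)
lower, upper = r - pe, r + pe
print(round(pe, 3), round(lower, 3), round(upper, 3))
```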