Correlation

Correlation is a statistical method used to assess the relationship between two variables, indicating whether they move together (positive correlation) or in opposite directions (negative correlation). The correlation coefficient (r) quantifies this relationship, ranging from -1 (perfect negative correlation) to +1 (perfect positive correlation), with values indicating strength and direction of the association. Different methods, such as Pearson's and Spearman's correlation coefficients, are used based on the type of data and relationship being analyzed.

Uploaded by

Rashedul Hasan
Correlation

Correlation is a statistical technique used to determine the degree to which two variables are related.
Example
Consider the variables family income and family expenditure. It is well known that income and expenditure increase or decrease together. Thus they are related in the sense that a change in one variable is accompanied by a change in the other.
Similarly, the price of and demand for a commodity are related variables: when price increases, demand tends to decrease, and vice versa.
If a change in one variable is accompanied by a change in the other, the variables are said to be correlated.
We can therefore say that family income and family expenditure are correlated, and that price and demand are correlated.
Correlation
Variables are positively related if they move
in the same direction.
Variables are inversely related if they move in
opposite directions.
[Figure: scatterplots illustrating a positive relationship, a negative relationship, and no relation; the axis labels that survive are "Reliability" and "Age of Car".]
Example:

A sample of 6 children was selected, and data about their age in years and weight in kilograms was recorded as shown in the following table. It is required to find the correlation between age and weight. (Scatterplot)

Serial No.   Age (years)   Weight (kg)
1            7             12
2            6             8
3            8             12
4            5             10
5            6             11
6            9             13
Correlation Coefficient
Statistic showing the degree of relation between two
variables.
Simple Correlation coefficient (r)
It is also called Pearson's correlation or the product-moment correlation coefficient.

It measures the nature and strength of the relationship between two variables of the quantitative type.
Simple Correlation coefficient (r)

\[
r = \frac{\mathrm{COV}(x, y)}{\sqrt{V(x)\,V(y)}}
  = \frac{SP_{xy}}{\sqrt{SS_x\,SS_y}}
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]
How to compute the simple correlation coefficient (r)

\[
r = \frac{\sum x_i y_i - \dfrac{\sum x_i \sum y_i}{n}}{\sqrt{\left(\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}\right)\left(\sum y_i^2 - \dfrac{\left(\sum y_i\right)^2}{n}\right)}}
\]
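The computing formula follows from the definition by expanding the sums of products and squares about the means. A short derivation, using only the symbols already defined (with \bar{x} = \sum x_i / n and \bar{y} = \sum y_i / n), might read:

\[
SP_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
        = \sum x_i y_i - \bar{x} \sum y_i - \bar{y} \sum x_i + n \bar{x} \bar{y}
        = \sum x_i y_i - \frac{\sum x_i \sum y_i}{n}
\]

\[
SS_x = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n},
\qquad
SS_y = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}
\]

Dividing SP_{xy} by \sqrt{SS_x \, SS_y} then gives the same r from the raw sums alone, which is why only the columns x, y, xy, x² and y² are needed in a hand calculation.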
Correlation

The sign of r denotes the nature of association, while the value of r denotes the strength of association.
Correlation

If the sign is positive, the relation is direct: an increase in one variable is associated with an increase in the other variable, and a decrease in one variable is associated with a decrease in the other variable.

If the sign is negative, the relationship is inverse (indirect): an increase in one variable is associated with a decrease in the other.
Correlation
The value of r ranges between -1 and +1.
The value of r denotes the strength of the association, as illustrated by the following scale:

r = -1             perfect correlation
-1 to -0.75        strong
-0.75 to -0.25     intermediate
-0.25 to 0         weak
r = 0              no relation
0 to 0.25          weak
0.25 to 0.75       intermediate
0.75 to 1          strong
r = +1             perfect correlation
Correlation

If r = 0, there is no association or correlation between the two variables.

If 0 < |r| < 0.25: weak correlation.

If 0.25 ≤ |r| < 0.75: intermediate correlation.

If 0.75 ≤ |r| < 1: strong correlation.

If |r| = 1: perfect correlation.
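The cut-offs above can be written as a small lookup. The following Python sketch is illustrative only: the function name `describe_strength` is hypothetical, and it assumes the cut-offs apply to the magnitude of r (as the symmetric scale suggests), with the sign reported separately as direct or inverse.

```python
def describe_strength(r: float) -> str:
    """Label r using the 0.25 and 0.75 cut-offs applied to |r| (an assumption)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 0:
        return "no correlation"
    if size == 1:
        return "perfect correlation"
    if size < 0.25:
        strength = "weak"
    elif size < 0.75:
        strength = "intermediate"
    else:
        strength = "strong"
    # The sign gives the nature of the association.
    direction = "direct" if r > 0 else "inverse"
    return f"{strength} {direction} correlation"


print(describe_strength(0.759))  # strong direct correlation
print(describe_strength(-0.1))   # weak inverse correlation
```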
Example:

A sample of 6 children was selected, and data about their age in years and weight in kilograms was recorded as shown in the following table. It is required to find the correlation between age and weight.

Serial No.   Age (years)   Weight (kg)
1            7             12
2            6             8
3            8             12
4            5             10
5            6             11
6            9             13
These two variables are both of the quantitative type. Age is treated as the independent variable and denoted X; weight is treated as the dependent variable and denoted Y. To find the relation between age and weight, compute the simple correlation coefficient using the following formula:

\[
r = \frac{\sum x_i y_i - \dfrac{\sum x_i \sum y_i}{n}}{\sqrt{\left(\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}\right)\left(\sum y_i^2 - \dfrac{\left(\sum y_i\right)^2}{n}\right)}}
\]
Serial No.   Age in years (x)   Weight in kg (y)   xy     x²     y²
1            7                  12                 84     49     144
2            6                  8                  48     36     64
3            8                  12                 96     64     144
4            5                  10                 50     25     100
5            6                  11                 66     36     121
6            9                  13                 117    81     169
Total        ∑x = 41            ∑y = 66            ∑xy = 461   ∑x² = 291   ∑y² = 742
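\[
r = \frac{461 - \dfrac{41 \times 66}{6}}{\sqrt{\left(291 - \dfrac{(41)^2}{6}\right)\left(742 - \dfrac{(66)^2}{6}\right)}}
  = \frac{10}{\sqrt{10.833 \times 16}}
  \approx 0.76
\]

r ≈ 0.76
strong direct correlation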
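The hand calculation above can be checked with a few lines of Python. This is a minimal sketch of the computing formula, not a library routine; the function name `pearson_r` is chosen here for illustration, and the data are the six age and weight pairs from the table.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from raw sums (the computing formula)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    sp_xy = sum_xy - sum_x * sum_y / n   # sum of products about the means
    ss_x = sum_x2 - sum_x ** 2 / n       # sum of squares of x about its mean
    ss_y = sum_y2 - sum_y ** 2 / n       # sum of squares of y about its mean
    return sp_xy / sqrt(ss_x * ss_y)     # undefined if either variable is constant


age = [7, 6, 8, 5, 6, 9]           # x, years
weight = [12, 8, 12, 10, 11, 13]   # y, kg
r = pearson_r(age, weight)
print(round(r, 2))  # 0.76
```

The result agrees with the hand calculation up to rounding.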
EXAMPLE: Relationship between Anxiety and Test Scores

Anxiety (X)   Test score (Y)   X²          Y²          XY
10            2                100         4           20
8             3                64          9           24
2             9                4           81          18
1             7                1           49          7
5             6                25          36          30
6             5                36          25          30
∑X = 32       ∑Y = 32          ∑X² = 230   ∑Y² = 204   ∑XY = 129
Calculating Correlation Coefficient

\[
r = \frac{(6)(129) - (32)(32)}{\sqrt{\left(6(230) - 32^2\right)\left(6(204) - 32^2\right)}}
  = \frac{774 - 1024}{\sqrt{(356)(200)}}
  = -0.94
\]

r = -0.94

Indirect strong correlation
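This example uses a slightly different arrangement of the computing formula from the one in the age and weight example. The two are equivalent: multiplying the numerator and the denominator of the earlier form by n gives

\[
r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}
  = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left(n \sum x^2 - (\sum x)^2\right)\left(n \sum y^2 - (\sum y)^2\right)}}
\]

As a check on the anxiety data, the first form gives the same value:

\[
r = \frac{129 - \dfrac{(32)(32)}{6}}{\sqrt{\left(230 - \dfrac{32^2}{6}\right)\left(204 - \dfrac{32^2}{6}\right)}}
  = \frac{-41.667}{\sqrt{(59.333)(33.333)}}
  \approx -0.94
\]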


Correlation
Can you use any type of variable for Pearson's
correlation coefficient?
No, the two variables have to be measured on either an interval or ratio scale. However, the two variables do not need to be measured on the same scale (e.g., one variable can be ratio and one can be interval). If you have ordinal data, you will want to use Spearman's rank-order correlation or Kendall's tau correlation instead of the Pearson product-moment correlation.
Correlation
Do the two variables have to be measured in the
same units?
No, the two variables can be measured in entirely different units. For example, you could correlate a person's age with their blood sugar level. Here, the units are completely different: age is measured in years and blood sugar level is measured in mmol/L (a measure of concentration). Indeed, the calculations for Pearson's correlation coefficient were designed such that the units of measurement do not affect the calculation. This allows the correlation coefficient to be comparable and not influenced by the units of the variables used.
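The independence from units is easy to check numerically. The sketch below is an illustration rather than a proof: it reuses the children's age and weight data from the worked example, converts years to months and kilograms to grams (conversions chosen here only for demonstration), and compares the two values of r. The helper name `pearson_r` is hypothetical.

```python
from math import isclose, sqrt


def pearson_r(x, y):
    """Pearson's r from deviations about the means (the definition formula)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / sqrt(ss_x * ss_y)


age_years = [7, 6, 8, 5, 6, 9]
weight_kg = [12, 8, 12, 10, 11, 13]

# Same measurements expressed in different units.
age_months = [12 * a for a in age_years]
weight_g = [1000 * w for w in weight_kg]

r_original = pearson_r(age_years, weight_kg)
r_rescaled = pearson_r(age_months, weight_g)
print(isclose(r_original, r_rescaled))  # True
```

Rescaling multiplies the numerator and the denominator of r by the same positive factor, so the factor cancels.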
Correlation
Can you establish cause-and-effect?
No, the Pearson correlation cannot determine a cause-and-effect relationship. It can only establish the strength of the association between two variables. It does not even distinguish between independent and dependent variables: swapping x and y gives the same value of r.
Spearman Rank Correlation Coefficient (rs)

It is a non-parametric measure of correlation.

This procedure makes use of the two sets of ranks that may be assigned to the sample values of X and Y.
The Spearman rank correlation coefficient can be computed in the following cases:
Both variables are quantitative.
Both variables are qualitative ordinal.
One variable is quantitative and the other is qualitative ordinal.
Procedure:

1. Rank the values of X from 1 to n, where n is the number of pairs of values of X and Y in the sample.
2. Rank the values of Y from 1 to n.
3. Compute the value of di for each pair of observations by subtracting the rank of Yi from the rank of Xi.
4. Square each di and compute ∑di², which is the sum of the squared values.
5. Apply the following formula:

\[
r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}
\]

The value of rs denotes the magnitude and nature of association, giving the same interpretation as simple r.
Example
Calculate rs from the following data.

Student No.     1   2   3   4   5   6   7   8    9    10
Rank in Maths   1   3   7   5   4   6   2   10   9    8
Rank in Stats   3   1   4   5   6   9   7   8    10   2
Solution

Student No.   Rank in Maths (R1)   Rank in Stats (R2)   d = R1 - R2   d² = (R1 - R2)²
1             1                    3                    -2            4
2             3                    1                    2             4
3             7                    4                    3             9
4             5                    5                    0             0
5             4                    6                    -2            4
6             6                    9                    -3            9
7             2                    7                    -5            25
8             10                   8                    2             4
9             9                    10                   -1            1
10            8                    2                    6             36
n = 10                                                  ∑d = 0        ∑d² = 96
Solution
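The closing step of this solution, substituting ∑d² = 96 and n = 10 into the formula from the procedure, can be sketched in Python. The function name `spearman_rs` is hypothetical, and the sketch assumes the inputs are already ranks with no ties.

```python
def spearman_rs(rank_x, rank_y):
    """Spearman's rs from two lists of ranks: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(rank_x)
    if n != len(rank_y) or n < 2:
        raise ValueError("both rank lists must have the same length (at least 2)")
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


maths = [1, 3, 7, 5, 4, 6, 2, 10, 9, 8]   # R1, students 1 to 10
stats = [3, 1, 4, 5, 6, 9, 7, 8, 10, 2]   # R2, students 1 to 10
rs = spearman_rs(maths, stats)
print(round(rs, 3))  # 0.418
```

By hand this is 1 - (6 × 96) / (10 × 99) = 1 - 576/990, which on the scale given earlier is a direct, intermediate correlation.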
Example
In a study of the relationship between level of education and income, the following data was obtained. Find the relationship between them and comment.

Sample   Level of education (X)   Income (Y)
A        Preparatory              25
B        Primary                  10
C        University               8
D        Secondary                10
E        Secondary                15
F        Illiterate               50
G        University               60
Answer:

Sample   Level of education (X)   Income (Y)   Rank X   Rank Y   di     di²
A        Preparatory              25           5        3        2      4
B        Primary                  10           6        5.5      0.5    0.25
C        University               8            1.5      7        -5.5   30.25
D        Secondary                10           3.5      5.5      -2     4
E        Secondary                15           3.5      4        -0.5   0.25
F        Illiterate               50           7        2        5      25
G        University               60           1.5      1        0.5    0.25

∑di² = 64

\[
r_s = 1 - \frac{6 \times 64}{7(48)} = 1 - \frac{384}{336} \approx -0.14
\]

That is, rs is about -0.1.

Comment:
There is an indirect weak correlation between level of education and income.
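The tied ranks in this answer (1.5, 3.5 and 5.5) are the averages of the positions the tied values would otherwise occupy. The sketch below reproduces the ranking and the value of rs under stated assumptions: the numeric codes for the education levels (0 for illiterate up to 4 for university) are introduced here only to order the categories, the helper name `average_ranks` is hypothetical, and rank 1 is given to the highest value, as in the table.

```python
def average_ranks(values):
    """Rank from largest (rank 1) to smallest; tied values share their mean rank."""
    ordered = sorted(values, reverse=True)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1      # first 1-based position held by v
        count = ordered.count(v)          # how many observations tie at v
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]


# Ordinal codes are an assumption made for illustration only.
level_code = {"illiterate": 0, "primary": 1, "preparatory": 2,
              "secondary": 3, "university": 4}
education = ["preparatory", "primary", "university", "secondary",
             "secondary", "illiterate", "university"]   # samples A to G
income = [25, 10, 8, 10, 15, 50, 60]                    # samples A to G

rank_x = average_ranks([level_code[e] for e in education])
rank_y = average_ranks(income)
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
n = len(income)
rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(rank_x)        # [5.0, 6.0, 1.5, 3.5, 3.5, 7.0, 1.5]
print(sum_d2)        # 64.0
print(round(rs, 2))  # -0.14
```

With ties present, the ∑d² formula is only an approximation to the rank correlation; the sketch follows the worked answer and applies it as given.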
Exercise 2
The following are the age (in years) and systolic blood pressure of 20 apparently healthy adults.

Age (x)   B.P. (y)     Age (x)   B.P. (y)
20        120          46        128
43        128          53        136
63        141          60        146
26        126          20        124
53        134          63        143
31        128          43        130
58        136          26        124
46        132          19        121
58        140          31        126
70        144          23        123

Find the correlation between age and blood pressure using simple and Spearman's correlation coefficients, and comment.
Serial   x     y      xy       x²
1        20    120    2400     400
2        43    128    5504     1849
3        63    141    8883     3969
4        26    126    3276     676
5        53    134    7102     2809
6        31    128    3968     961
7        58    136    7888     3364
8        46    132    6072     2116
9        58    140    8120     3364
10       70    144    10080    4900
11       46    128    5888     2116
12       53    136    7208     2809
13       60    146    8760     3600
14       20    124    2480     400
15       63    143    9009     3969
16       43    130    5590     1849
17       26    124    3224     676
18       19    121    2299     361
19       31    126    3906     961
20       23    123    2829     529
Total    852   2630   114486   41678
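The column totals in this table can be reproduced, and the remaining sum needed for r (∑y², which the table does not list) obtained, with a short script. This is a sketch for checking the arithmetic of the exercise, assuming the computing formula for r given earlier; the variable names are chosen here for illustration.

```python
from math import sqrt

# Age (x) and systolic blood pressure (y) for serials 1 to 20.
age = [20, 43, 63, 26, 53, 31, 58, 46, 58, 70,
       46, 53, 60, 20, 63, 43, 26, 19, 31, 23]
bp = [120, 128, 141, 126, 134, 128, 136, 132, 140, 144,
      128, 136, 146, 124, 143, 130, 124, 121, 126, 123]

n = len(age)
sum_x, sum_y = sum(age), sum(bp)
sum_xy = sum(x * y for x, y in zip(age, bp))
sum_x2 = sum(x * x for x in age)
sum_y2 = sum(y * y for y in bp)   # not shown in the table above

print(sum_x, sum_y, sum_xy, sum_x2)  # 852 2630 114486 41678

# Simple (Pearson) correlation from the raw sums.
r = (sum_xy - sum_x * sum_y / n) / sqrt(
    (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
)
print(r)
```

The Spearman part of the exercise would additionally need average ranks for the tied ages and pressures, following the procedure described earlier.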
