
Correlation

- Correlation analysis measures the strength of the linear relationship between two or more variables.
- There can be positive or negative correlation depending on whether the variables vary in the same or opposite directions.
- A scatter plot is used to visualize the relationship between two variables and identify if the correlation is linear or non-linear.
- The Pearson correlation coefficient, r, quantifies the strength and direction of the linear relationship between two variables, with values ranging from -1 to 1.

Uploaded by

Shimul Hossain

Correlation Analysis

Correlation
• Suppose we are interested in the relationship between the volume of
sales and the years of experience of the salespeople at a department store

• When variables are found to be related, we often want
to know how close the relationship is

• The primary objective of correlation analysis is to
measure the strength of the linear relationship between two or
more variables
February 15, 2024 Correlation 2
Types of Correlation
• Positive & Negative Correlation
  - If both variables vary in the same direction, there is said to be
    positive correlation between them
  - If the variables vary in opposite directions, there is said to be
    negative correlation between them
• Simple, Partial and Multiple Correlation
  - When only two variables are studied, it is a problem of simple correlation
  - When more than two variables are studied, it is a problem of either partial or multiple
    correlation
• Linear & Non-linear (Curvilinear) Correlation
  - If the amount of change in one variable tends to bear a constant ratio to the amount
    of change in the other variable, the correlation is said to be linear; if not, it is
    curvilinear (non-linear)



Scatter Plot
• The simplest device for studying the correlation between two
variables is a special type of dot chart called a scatter
diagram
• Scatter diagrams are useful for displaying
information on two quantitative variables that are
believed to be interrelated
• A scatter diagram is an essential first step
in studying the association between two variables



Different types of correlation using scatter diagram



Simple Correlation Coefficient
• A scatter plot is an essential first step in
studying the association between two variables
• It is often useful to quantify the strength of the
association by calculating a summary index
• One commonly used measure is Pearson's
correlation coefficient, denoted by r
• The correlation coefficient is a quantitative measure of
the direction and strength of the linear relationship
between two numerically measured variables



Simple Correlation Coefficient
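• The formula for r is

$$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}} = \frac{\sum X_i Y_i - \frac{\sum X_i \sum Y_i}{n}}{\sqrt{\left[\sum X_i^2 - \frac{(\sum X_i)^2}{n}\right]\left[\sum Y_i^2 - \frac{(\sum Y_i)^2}{n}\right]}}$$
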
• Assumptions of r
1. Both variables are measured on an interval or ratio scale
2. The two variables follow a bivariate normal distribution
3. The relationship between the variables is linear
4. The sample is of adequate size to assume normality
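
As a minimal sketch of the formula for r, the Python below computes the coefficient in both forms: from deviations about the means and from raw sums. The function names and the five sample points are illustrative assumptions and do not come from the slides.

```python
from math import sqrt

def pearson_r_deviations(xs, ys):
    """Pearson's r from deviations about the means (definitional form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def pearson_r_sums(xs, ys):
    """Pearson's r from raw sums (computational form)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator

# Illustrative data only (an assumption, not taken from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_dev = pearson_r_deviations(xs, ys)
r_sum = pearson_r_sums(xs, ys)
print(round(r_dev, 4), round(r_sum, 4))
```

Both functions divide by zero if either variable is constant, since r is undefined in that case; a production version would need to handle that explicitly.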



Properties of r
• The correlation coefficient is a symmetric measure: the correlation
of X with Y equals the correlation of Y with X
• It is positive or negative depending on whether
the sign of the numerator of the formula is positive or
negative
• The correlation coefficient lies between -1 and +1
• It is a dimensionless quantity
• The coefficient of correlation is independent of the origin
and scale of measurement
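
The properties above can be checked numerically. The sketch below uses the experience and sales figures from the worked example in this deck; the helper `pearson_r` and the particular shifts and scale factors are assumptions chosen for illustration. It also shows one caveat: rescaling a single variable by a negative factor flips the sign of r while leaving its magnitude unchanged.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 3, 4, 4, 6, 8, 10, 10, 11, 13]                   # years of experience
y = [80, 97, 92, 102, 103, 111, 119, 123, 117, 136]      # annual sales

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)               # symmetry: swapping the variables

# Change of origin and (positive) scale: u = (x - 5) / 2, v = (y - 100) / 10
u = [(xi - 5) / 2 for xi in x]
v = [(yi - 100) / 10 for yi in y]
r_uv = pearson_r(u, v)

# A negative scale factor on one variable reverses the sign of r.
w = [-3 * xi for xi in x]
r_wy = pearson_r(w, y)

print(round(r_xy, 2), round(r_yx, 2), round(r_uv, 2), round(r_wy, 2))
```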



Interpretation of r
Scale of r, from -1 (negative correlation) to +1 (positive correlation):
  r = -1             Perfect negative correlation
  -1 < r < -0.50     Strong negative correlation
  r = -0.50          Moderate negative correlation
  -0.50 < r < 0      Weak negative correlation
  r = 0              No correlation
  0 < r < +0.50      Weak positive correlation
  r = +0.50          Moderate positive correlation
  +0.50 < r < +1     Strong positive correlation
  r = +1             Perfect positive correlation
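
The scale above can be expressed as a small lookup. This is only a sketch of the rule of thumb on this slide: the function name `describe_r` and the tolerance used to compare floating-point values are assumptions, and other texts draw the strong/weak boundaries elsewhere.

```python
def describe_r(r, tol=1e-9):
    """Label r using the -1, -0.5, 0, +0.5, +1 landmarks from the slide."""
    if not (-1 - tol <= r <= 1 + tol):
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    size = abs(r)
    if size <= tol:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    if abs(size - 1) <= tol:
        strength = "perfect"
    elif abs(size - 0.5) <= tol:
        strength = "moderate"
    elif size > 0.5:
        strength = "strong"
    else:
        strength = "weak"
    return f"{strength} {direction} correlation"

for value in (-1, -0.3, 0, 0.5, 0.96):
    print(value, "->", describe_r(value))
```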



Calculation of Correlation Coefficient
• A department store has the following sales statistics for the
last year for its salespeople, who have varying years
of experience

Sales person            1    2    3    4    5    6    7    8    9   10
Years of experience     1    3    4    4    6    8   10   10   11   13
Annual sales           80   97   92  102  103  111  119  123  117  136

• Find the coefficient of correlation between years of
experience and annual sales
Scatter Diagram
[Figure: scatter diagram of years of experience and annual sales]
Calculation of Correlation Coefficient
Necessary table for calculation
Sales person      X      Y     XY     X²      Y²
1                 1     80     80      1    6400
2                 3     97    291      9    9409
3                 4     92    368     16    8464
4                 4    102    408     16   10404
5                 6    103    618     36   10609
6                 8    111    888     64   12321
7                10    119   1190    100   14161
8                10    123   1230    100   15129
9                11    117   1287    121   13689
10               13    136   1768    169   18496
Totals           70   1080   8128    632  119082
Calculation of Correlation Coefficient

X Y 
 X Y i i
i i
r n

  X i  2



  Y i  
2

  
2 2
X i

n   Yi  n 
   

70  1080
8128 
 10  0.96
 2
  2

 632  70  119082  1080 
 10   10 
 
• From the above calculation, the correlation coefficient
between sales and experience is 0.96, which indicates a
high degree of positive correlation between sales and experience
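
The arithmetic of this worked example can be reproduced with a short script. The sketch below rebuilds the X, Y, XY, X² and Y² columns, totals them, and substitutes the totals into the formula; the variable names are assumptions, while the data are the ten salespeople from the example.

```python
from math import sqrt

experience = [1, 3, 4, 4, 6, 8, 10, 10, 11, 13]                 # X
sales = [80, 97, 92, 102, 103, 111, 119, 123, 117, 136]         # Y

# One row per salesperson: (X, Y, XY, X^2, Y^2)
rows = [(x, y, x * y, x * x, y * y) for x, y in zip(experience, sales)]
totals = [sum(column) for column in zip(*rows)]
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = totals
n = len(rows)

numerator = sum_xy - sum_x * sum_y / n                          # 8128 - 7560
denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
r = numerator / denominator

print(totals)        # [70, 1080, 8128, 632, 119082]
print(round(r, 2))   # 0.96
```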



Rank Correlation
• To compute the Pearson correlation coefficient, the
data must be measured on at least an interval scale
• Furthermore, it was assumed that the two variables have a
joint normal distribution
• In situations where the truth of these assumptions is
doubtful, we may use another technique, generally
known as rank correlation
• The measure based on this method is known as the rank
correlation coefficient



Rank Correlation
• The rank correlation method is recommended when
  - The values of the variables are available in
    rank-ordered form
  - The data are qualitative in nature and can be
    ranked in some order
  - The data were originally quantitative in nature
    but, because of the small sample size or
    for convenience in fitting the requirements of
    analytical techniques, were converted into ranks
Computing Rank Correlation
• Spearman rank correlation is just the ordinary sample correlation
coefficient r applied to ranked data
• The method calls for computing the sum of squared differences between
each pair of ranks
• If no ties in ranks exist, we can apply the following formula for computing r_s

$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

where d_i is the difference between the ranks of the ith pair and n is the number of pairs
included
• A convenient and simple formula for computing r_s is as follows

$$r_s = \frac{\sum x_i y_i - C}{\frac{n(n^2 - 1)}{12}}, \quad \text{where } C = \frac{n(n + 1)^2}{4}$$

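
Both rank formulas can be sketched in a few lines of Python. The function names and the five-point sample are assumptions made for illustration; the ranking helper gives rank 1 to the largest value and deliberately refuses tied data, because these two formulas apply only when no ties exist.

```python
def ranks_descending(values):
    """Rank 1 for the largest value; raises if any values are tied."""
    if len(set(values)) != len(values):
        raise ValueError("these formulas assume no tied values")
    ordered = sorted(values, reverse=True)
    position = {value: index + 1 for index, value in enumerate(ordered)}
    return [position[value] for value in values]

def spearman_from_differences(x_ranks, y_ranks):
    """r_s = 1 - 6 * sum(d_i^2) / (n (n^2 - 1))."""
    n = len(x_ranks)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

def spearman_from_products(x_ranks, y_ranks):
    """r_s = (sum(x_i y_i) - C) / (n (n^2 - 1) / 12), with C = n (n + 1)^2 / 4."""
    n = len(x_ranks)
    c = n * (n + 1) ** 2 / 4
    sum_xy = sum(x * y for x, y in zip(x_ranks, y_ranks))
    return (sum_xy - c) / (n * (n ** 2 - 1) / 12)

# Illustrative data only (an assumption, not taken from the slides).
a = [10, 30, 20, 50, 40]
b = [1, 2, 3, 5, 4]
ra, rb = ranks_descending(a), ranks_descending(b)

rs_diff = spearman_from_differences(ra, rb)
rs_prod = spearman_from_products(ra, rb)
print(ra, rb, rs_diff, rs_prod)
```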
Computing Rank Correlation
• Suppose we wish to determine whether the
marks given by two independent examiners are
correlated. The table below shows the marks

Student                 1    2    3    4    5    6    7    8    9   10
1st examiner marks     65   70   76   75   80   78   83   84   85   90
2nd examiner marks     30   25   35   40   38   42   48   50   55   45

• Calculate the Spearman rank correlation
coefficient
Computing Rank Correlation
Student  1st examiner marks  Rank x_i  2nd examiner marks  Rank y_i  d_i = x_i - y_i  d_i²  x_i y_i
1                        65        10                  30         9               +1     1       90
2                        70         9                  25        10               -1     1       90
3                        76         7                  35         8               -1     1       56
4                        75         8                  40         6               +2     4       48
5                        80         5                  38         7               -2     4       35
6                        78         6                  42         5               +1     1       30
7                        83         4                  48         3               +1     1       12
8                        84         3                  50         2               +1     1        6
9                        85         2                  55         1               +1     1        2
10                       90         1                  45         4               -3     9        4
Total                                                                                   24      373
Computing Rank Correlation
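• Calculation

$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6(24)}{10(100 - 1)} = 1 - 0.15 = 0.85$$

$$C = \frac{n(n + 1)^2}{4} = \frac{10(10 + 1)^2}{4} = 302.5$$

$$r_s = \frac{\sum x_i y_i - C}{\frac{n(n^2 - 1)}{12}} = \frac{373 - 302.5}{\frac{10(100 - 1)}{12}} = 0.85$$
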

• The value of the correlation coefficient indicates that the
two examiners strongly agree in their ranking of
the candidates on the basis of their performance
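
As a check on this example, the sketch below ranks the two examiners' marks (rank 1 for the highest mark, as in the table), applies the difference formula, and also computes the ordinary Pearson coefficient on the ranks, which should agree when there are no ties. The helper names are assumptions; the marks are those given in the example.

```python
from math import sqrt

first_examiner = [65, 70, 76, 75, 80, 78, 83, 84, 85, 90]
second_examiner = [30, 25, 35, 40, 38, 42, 48, 50, 55, 45]

def ranks_descending(values):
    """Rank 1 goes to the highest mark; assumes no tied marks."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(value) + 1 for value in values]

def pearson_r(xs, ys):
    """Ordinary sample correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x_ranks = ranks_descending(first_examiner)
y_ranks = ranks_descending(second_examiner)
n = len(x_ranks)

sum_d2 = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))
rs_formula = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
rs_pearson = pearson_r(x_ranks, y_ranks)   # Pearson's r applied to the ranks

print(sum_d2)                                        # 24
print(round(rs_formula, 2), round(rs_pearson, 2))    # 0.85 0.85
```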
Computing Rank Correlation
When ties occur

• A firm held an examination for eight applicants for a clerical
post. From the marks obtained by the
applicants in the Accountancy and Statistics papers,
compute the rank correlation coefficient

Applicant                A    B    C    D    E    F    G    H
Marks in Accountancy    15   20   28   12   40   60   20   80
Marks in Statistics     40   30   50   30   20   10   30   60



Computing Rank Correlation
Marks in Accountancy  Marks in Statistics  Rank x_i  Rank y_i  d_i = x_i - y_i   d_i²
                  15                   40         2         6               -4     16
                  20                   30       3.5         4             -0.5   0.25
                  28                   50         5         7               -2      4
                  12                   30         1         4               -3      9
                  40                   20         6         2                4     16
                  60                   10         7         1                6     36
                  20                   30       3.5         4             -0.5   0.25
                  80                   60         8         8                0      0
Total                                                                            81.5



Computing Rank Correlation

$$r_s = 1 - \frac{6\left[\sum d_i^2 + \frac{1}{12} m_1(m_1^2 - 1) + \frac{1}{12} m_2(m_2^2 - 1)\right]}{n(n^2 - 1)}$$

The item 20 is repeated 2 times in the X series and the item 30 occurs 3 times in the Y
series, hence m_1 = 2 and m_2 = 3. Substituting these values in the above formula,
we get

$$r_s = 1 - \frac{6\left[81.5 + \frac{1}{12} \cdot 2(2^2 - 1) + \frac{1}{12} \cdot 3(3^2 - 1)\right]}{8(8^2 - 1)} = 1 - \frac{6(81.5 + 0.5 + 2)}{504} = 0$$



Properties of Spearman Correlation
Coefficient

• Like the simple correlation coefficient, the Spearman correlation
coefficient ranges from -1 to +1, with an
interpretation similar to that for the simple correlation
coefficient r
• It is a measure of the monotonicity of a relationship
• It is considered a measure of an increasing or
decreasing relationship between two variables
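
To illustrate the monotonicity property, the sketch below compares the two coefficients on a relationship that is strictly increasing but not linear. The data (y = x³ for x = 1..6) and the helper names are assumptions chosen for illustration: the ranks agree perfectly, so the Spearman coefficient is 1, while Pearson's r falls short of 1.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Ordinary sample correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks_ascending(values):
    """Rank 1 for the smallest value; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(value) + 1 for value in values]

def spearman_rs(xs, ys):
    """Spearman's coefficient via the no-ties difference formula."""
    rx, ry = ranks_ascending(xs), ranks_ascending(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

x = [1, 2, 3, 4, 5, 6]
y_up = [value ** 3 for value in x]       # strictly increasing, curvilinear
y_down = [-value ** 3 for value in x]    # strictly decreasing, curvilinear

print(round(pearson_r(x, y_up), 2), spearman_rs(x, y_up))
print(round(pearson_r(x, y_down), 2), spearman_rs(x, y_down))
```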



Advantages of Spearman’s Rank
Correlation over Pearson’s
• When the data possess a curvilinear relationship, the
rank correlation coefficient is likely to be more
reliable than the conventional correlation coefficient
• No assumption is made concerning the
distribution of the variables
• It can be calculated when no numerical measurement of
the variables is possible



Correlation & Cause
• If there is a strong relationship between two variables, we are
tempted to assume that an increase or decrease in one variable
causes a change in the other variable
• For example, it can be shown that the consumption of peanuts
and the consumption of aspirin have a strong correlation
• However, this does not indicate that an increase in the
consumption of peanuts caused the consumption of aspirin to
increase
• Relations such as these are called "spurious correlations"
• What we can conclude when we find two variables with a strong
correlation is that there is an association between the two variables,
not that a change in one causes a change in the other