
Lesson 9. Correlation Coefficient

The document discusses correlation and the correlation coefficient. It defines correlation as a measure of the relationship between two variables. The correlation coefficient indicates both the direction and strength of the relationship. A positive correlation coefficient represents a direct relationship, while a negative value shows an inverse relationship. The strength of the correlation is also given by the coefficient, with values from 0 to 1 indicating increasing correlation. Examples are provided to illustrate direct, inverse, linear and non-linear relationships. Different types of correlation are defined, including simple, multiple, and partial correlation. Formulas for calculating the Pearson product-moment correlation coefficient and point-biserial correlation are also provided.

Uploaded by

Daniela Caguioa

Lesson 9.

Correlation Coefficient

Correlation is a measure of the relationship between two variables of
interest; many commonly measured quantities tend to be related to one another.

The relationship between two variables can be determined through the
use of a correlation coefficient.

The correlation coefficient indicates two things:

1. The direction of the relationship

2. The strength or degree of the relationship

Direction of the Relationship

• If the correlation coefficient is positive, the relationship is said to be direct.

- This suggests that a high value in one variable X corresponds to a
high value in the other variable Y.

- Example: students who perform well in Math also tend to perform well in
English.

• If the correlation coefficient is negative, the relationship is said to be inverse.

- This suggests that a high value in one variable X corresponds to a
low value in the other variable Y.

- Example: students who perform well in Math tend to perform poorly in English.

• If the correlation coefficient is zero, no (linear) relationship exists.

- A high value in one variable does not necessarily correspond to
a high or low value in the other variable.

• Perfect positive (direct) relationship

- This happens when the computed correlation coefficient is exactly +1.
[Figure: A Graphical Summary]

Strength or Degree of Correlation

An r of 0 denotes no relationship
An r from ± 0.01 to ± 0.20 denotes negligible correlation
An r from ± 0.21 to ± 0.40 denotes low or slight correlation
An r from ± 0.41 to ± 0.70 denotes marked or moderate correlation
An r from ± 0.71 to ± 0.90 denotes high relationship
An r from ± 0.91 to ± 0.99 denotes very high relationship
An r of ± 1.00 denotes perfect relationship
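
The scale above can be turned into a small lookup. The following is a minimal sketch, not part of the original lesson: the function name `describe_strength` is hypothetical, and it assumes r is rounded to two decimal places before it is compared with the cut-offs, since the scale itself is stated to two decimals.

```python
def describe_strength(r: float) -> str:
    """Return the verbal label from the lesson's scale for a coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    # Assumption: the scale is read at two-decimal precision, sign ignored.
    size = round(abs(r), 2)
    if size == 0:
        return "no relationship"
    if size <= 0.20:
        return "negligible correlation"
    if size <= 0.40:
        return "low or slight correlation"
    if size <= 0.70:
        return "marked or moderate correlation"
    if size <= 0.90:
        return "high relationship"
    if size < 1.00:
        return "very high relationship"
    return "perfect relationship"


print(describe_strength(0.35))
print(describe_strength(-0.95))
```

The sign of r is deliberately dropped here because the scale describes strength only; direction is read separately from the sign.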

A. Pearson Product-Moment Correlation Coefficient

Pearson r correlation is the most widely used correlation statistic to measure
the degree of the relationship between linearly related variables. For example,
in the stock market, if we want to measure how two stocks are related to each
other, Pearson r correlation is used to measure the degree of relationship
between the two. The point-biserial correlation is computed with the
Pearson correlation formula, except that one of the variables is dichotomous.

The following formula is used to calculate the Pearson r correlation from raw
scores:

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

where n is the number of paired observations.

Types of Correlation

1. Direct and Inverse

Direct correlation is obtained when the changes in two variables
are in the same direction; as one variable increases (or decreases),
the other also increases (or decreases).
Example: employment increases when business activity increases; it
decreases when business declines.

Inverse correlation is obtained when the changes in the two
variables are in opposite directions.
Example: business failures decrease in frequency when business
improves, but they increase when business declines.

2. Linear and Non-Linear

Linear correlation is obtained when the amount of change in one
variable tends to bear a constant ratio to the amount of change in
the other variable.
Example: if income is doubled, wealth is also doubled.

Non-linear or curvilinear correlation is obtained when the amount of
change in one variable does not bear a constant ratio to the amount of
change in the other.
Example: if the amount of rainfall is doubled, the amount of palay
harvest is not necessarily doubled.
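
The distinction matters in practice because Pearson r only measures the linear part of a relationship. The sketch below uses made-up illustrative numbers (not data from this lesson) and a hypothetical helper `pearson_r` that applies the raw-score formula r = (n∑xy − ∑x∑y) / √([n∑x² − (∑x)²][n∑y² − (∑y)²]). It compares a perfectly linear pattern with a curved one.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson r from raw scores (hypothetical helper for this illustration)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Illustrative values only: x is symmetric around zero.
x = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * v + 1 for v in x]   # change in y is a constant multiple of change in x
curved = [v * v for v in x]       # y depends on x, but not in a constant ratio

r_linear = pearson_r(x, linear)
r_curved = pearson_r(x, curved)
print(r_linear, r_curved)
```

In the curved case y is completely determined by x, yet r comes out as zero, so a small r should be read as "no linear relationship" and not as proof that the variables are unrelated.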

3. Simple, Multiple, Partial

Simple correlation is obtained when only two attributes or
characteristics are considered.
Multiple correlation is obtained when more than two variables
are considered.
Partial correlation is obtained when more than two variables are
recognized but only two are considered to be influencing each
other, with the effect of the remaining variables held constant.
Example.

The College Entrance Examination scores of 7 students and their
corresponding grade point averages (GPA) are shown below:

Entrance Exam (x)    GPA (y)       xy        x²        y²
       80               75        6000      6400      5625
       72               82        5904      5184      6724
       80               76        6080      6400      5776
       75               74        5550      5625      5476
       70               78        5460      4900      6084
       77               72        5544      5929      5184
       75               75        5625      5625      5625
   ∑x = 529         ∑y = 532     40163     40063     40494

n = 7

Solution:

$$r = \frac{7(40163) - (529)(532)}{\sqrt{\left[7(40063) - (529)^2\right]\left[7(40494) - (532)^2\right]}}$$

$$r = \frac{281{,}141 - 281{,}428}{\sqrt{(280{,}441 - 279{,}841)(283{,}458 - 283{,}024)}}$$

$$r = \frac{-287}{\sqrt{(600)(434)}} = \frac{-287}{\sqrt{260{,}400}} = \frac{-287}{510.29}$$

$$r = -0.56$$

The value of r = -0.56 means that there is a moderate negative relationship
between the College Entrance Examination scores and the GPA. This means that
students who perform well in the college entrance exam tend to have lower GPAs.
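
As a cross-check, the computation can be reproduced in a few lines of Python. This is a minimal sketch using the seven score pairs from the table; the helper name `pearson_r` is hypothetical. Because the numerator is 281,141 − 281,428 = −287, the coefficient carries a negative sign.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson r from raw scores, following the sums used in the worked solution."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


entrance_exam = [80, 72, 80, 75, 70, 77, 75]
gpa = [75, 82, 76, 74, 78, 72, 75]

r = pearson_r(entrance_exam, gpa)
print(round(r, 2))  # -0.56
```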

Exercise 9A
Solve the following:

1. A random sample of 15 married couples, both of whom earned
income (in thousand pesos), showed the following:

Income of Husband (x)    Income of Wife (y)
       13,000                  8,000
       18,000                 12,000
       24,000                 16,000
       76,000                 48,000
       25,000                 30,000
       19,000                 16,000
       40,000                 32,000
       52,000                 60,000
        8,000                 42,000
       19,000                 17,000
       26,000                 25,000
       18,000                 19,000
       33,000                 45,000
       37,000                 32,000
       12,000                 28,000

What is the correlation coefficient between the incomes of husband and
wife? What is the degree of correlation? Interpret.

B. Point Biserial Correlation (r_pb)

This is used when a variable that is interval in nature is correlated
with another variable that is a true dichotomy. Examples:

• Relationship between performance rating (y) and sex (x),
categorized as male (1) and female (0)

• Relationship between production (y) and the presence of modern
equipment (x), coded as with (1) and without (0)

Formula:

$$r_{pb} = \frac{\bar{y}_1 - \bar{y}_0}{S_y}\sqrt{\frac{n_1 n_0}{n(n-1)}}$$

Where:
$\bar{y}_1$ = mean of the y values for the cases labeled 1

$\bar{y}_0$ = mean of the y values for the cases labeled 0

$n_1$ = number of cases labeled 1 in x

$n_0$ = number of cases labeled 0 in x

$n$ = total number of cases = $n_1 + n_0$

$S_y$ = standard deviation of all the y values (computed with n - 1 in the denominator)

Example
1. A researcher would like to find out whether the use of modern
equipment (x) is related to an increase in the production of rice in cavans (y).

Farmer    Use of Equipment (x)            Production in cavans (y)
          (0) not using, (1) using
   1               1                               120
   2               0                               100
   3               1                               160
   4               0                               110
   5               1                               150
   6               1                               180
   7               0                               180
   8               0                                60
   9               1                                70

Solution:
n1 = 5    n0 = 4    n = 5 + 4 = 9

$$\bar{y}_1 = \frac{120 + 160 + 150 + 180 + 70}{5} = 136$$

$$\bar{y}_0 = \frac{100 + 110 + 180 + 60}{4} = 112.5$$

$$S_y = 44.75$$

$$r_{pb} = \frac{136 - 112.5}{44.75}\sqrt{\frac{5(4)}{9(8)}} = \frac{23.5}{44.75}(0.527)$$

$$r_{pb} = 0.277$$
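
The same steps can be checked with a short script. This is a minimal sketch using the nine farmers' data above; the function name `point_biserial` is hypothetical, and it assumes Sy is the sample standard deviation (divisor n − 1), which is what reproduces Sy = 44.75.

```python
from math import sqrt
from statistics import mean, stdev


def point_biserial(x, y):
    """Point biserial r for a 0/1 variable x and an interval variable y."""
    ones = [yi for xi, yi in zip(x, y) if xi == 1]
    zeros = [yi for xi, yi in zip(x, y) if xi == 0]
    n1, n0 = len(ones), len(zeros)
    n = n1 + n0
    s_y = stdev(y)  # sample standard deviation of all y values (divisor n - 1)
    return (mean(ones) - mean(zeros)) / s_y * sqrt(n1 * n0 / (n * (n - 1)))


uses_equipment = [1, 0, 1, 0, 1, 1, 0, 0, 1]
cavans = [120, 100, 160, 110, 150, 180, 180, 60, 70]

r_pb = point_biserial(uses_equipment, cavans)
print(round(stdev(cavans), 2))  # 44.75
print(round(r_pb, 3))           # 0.277
```

On the strength scale given earlier, 0.277 falls in the low or slight range.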

• Relationship between a continuous variable and a dichotomous
response (like true or false)

Formula:

$$r_{pb} = \frac{\sum f \left(\sum f_1 Y\right) - \sum f_1 \left(\sum f Y\right)}{\sqrt{\sum f_1 \left(\sum f_2\right)\left[\sum f \left(\sum f Y^2\right) - \left(\sum f Y\right)^2\right]}}$$

Where:
$Y$ = continuous variable

$f_1$ = frequency of the larger (major) group at each value of Y

$f_2$ = frequency of the smaller (minor) group at each value of Y

$f$ = total frequency at each value of Y = $f_1 + f_2$

Example:
A study was conducted to determine the relation of sex to emotional
intelligence. A group of 90 students was selected at random and the
following results were obtained.

EQ Score    No. of Males (f2)    No. of Females (f1)    Total No. of Students (f)
   80              3                     4                        7
   85              6                     8                       14
   92              1                     4                        5
   76              7                     2                        9
   79              5                     4                        9
   94              3                     6                        9
   97              4                     7                       11
  102              2                     5                        7
   87              5                     6                       11
   82              4                     4                        8
 Total            40                    50                       90

Using the point biserial correlation, determine the relationship between
emotional intelligence and sex.

Solution:

  Y     f1    f2    f = f1 + f2     Y²       fY       fY²      f1Y
  80     4     3         7         6400      560     44800     320
  85     8     6        14         7225     1190    101150     680
  92     4     1         5         8464      460     42320     368
  76     2     7         9         5776      684     51984     152
  79     4     5         9         6241      711     56169     316
  94     6     3         9         8836      846     79524     564
  97     7     4        11         9409     1067    103499     679
 102     5     2         7        10404      714     72828     510
  87     6     5        11         7569      957     83259     522
  82     4     4         8         6724      656     53792     328
Total   50    40        90                  7845    689325    4439

$$r_{pb} = \frac{90(4439) - (50)(7845)}{\sqrt{(50)(40)\left[90(689325) - (7845)^2\right]}}$$

$$= \frac{399{,}510 - 392{,}250}{\sqrt{2{,}000\,(62{,}039{,}250 - 61{,}544{,}025)}}$$

$$= \frac{7{,}260}{\sqrt{2{,}000\,(495{,}225)}} = \frac{7{,}260}{31{,}471.42}$$

$$= 0.23$$

The result gives a low positive point biserial correlation.
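
The frequency-table version of the formula can also be checked by script. The sketch below is an illustration only: the function name `point_biserial_from_frequencies` is hypothetical, and the inputs are the EQ scores with the female (f1) and male (f2) frequencies from the example.

```python
from math import sqrt


def point_biserial_from_frequencies(scores, f1, f2):
    """Point biserial r from a frequency table (f1 = larger group, f2 = smaller group)."""
    f = [a + b for a, b in zip(f1, f2)]  # total frequency at each score
    sum_f, sum_f1, sum_f2 = sum(f), sum(f1), sum(f2)
    sum_fy = sum(fi * y for fi, y in zip(f, scores))
    sum_fy2 = sum(fi * y * y for fi, y in zip(f, scores))
    sum_f1y = sum(fi * y for fi, y in zip(f1, scores))
    numerator = sum_f * sum_f1y - sum_f1 * sum_fy
    denominator = sqrt(sum_f1 * sum_f2 * (sum_f * sum_fy2 - sum_fy ** 2))
    return numerator / denominator


eq_scores = [80, 85, 92, 76, 79, 94, 97, 102, 87, 82]
females_f1 = [4, 8, 4, 2, 4, 6, 7, 5, 6, 4]
males_f2 = [3, 6, 1, 7, 5, 3, 4, 2, 5, 4]

r_pb = point_biserial_from_frequencies(eq_scores, females_f1, males_f2)
print(round(r_pb, 2))  # 0.23
```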

Exercise 9B

1. Using the following data, compute the point biserial correlation.
Interpret the results as to the degree of correlation.

  Y     f1    f2
  20     5     3
  25     6     2
  30    15     9
  15     7     2
  10    12     4
  40     7     3

2. In an achievement test administered to senior students, a study was made
to determine the relationship between test scores and type of school. The
following results were obtained. Compute the point biserial correlation and interpret.

Test Scores    fA    fB
     95        10     4
     72         3     5
     65         6     2
     75        13     7
     84        14     9
     90        20    18
     87        15    10
     89         8     5
     98        12     7
     82        11     4

C. Spearman's Rank Correlation Coefficient (rho)

Spearman rank correlation is a non-parametric test that is used to
measure the degree of association between two variables. The Spearman
rank correlation test does not carry any assumptions about the distribution of
the data and is the appropriate correlation analysis when the variables are
measured on a scale that is at least ordinal.

The following formula is used to calculate the Spearman rank correlation:

$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$

Where:

$\rho$ = Spearman rank correlation
$d_i$ = the difference between the ranks of corresponding variables
$n$ = number of observations

Example
Two judges ranked 10 contestants in a beauty pageant in order of their
preference. The results were as follows:

Contestants 1 2 3 4 5 6 7 8 9 10
Judge A 7 1 5 8 2 9 6 3 10 4
Judge B 9 3 7 6 1 10 4 2 8 5
Did the judges tend to agree in their choice? Use α = 0.05

Solution:

Judge A    Judge B    Difference |d|    Square d²
   7          9             2               4
   1          3             2               4
   5          7             2               4
   8          6             2               4
   2          1             1               1
   9         10             1               1
   6          4             2               4
   3          2             1               1
  10          8             2               4
   4          5             1               1

∑d² = 28

n = number of contestants = 10

$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6(28)}{10(10^2 - 1)} = 1 - \frac{168}{990} = 1 - 0.17$$

$$\rho = 0.83$$

This means that there is a high degree of agreement between the two judges in
their ranking of the contestants.
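
The rank computation is easy to verify in code. This is a minimal sketch using the two judges' rankings; the function name `spearman_rho` is hypothetical, and the shortcut formula used here assumes there are no tied ranks, which holds for this example.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rho from two lists of ranks, assuming no tied ranks."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


judge_a = [7, 1, 5, 8, 2, 9, 6, 3, 10, 4]
judge_b = [9, 3, 7, 6, 1, 10, 4, 2, 8, 5]

rho = spearman_rho(judge_a, judge_b)
print(round(rho, 2))  # 0.83
```

Squaring the differences removes their sign, which is why the table can list them as absolute values without changing the result.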

Exercise 9C

1. Ten applicants for the position of analyst engineer at an international
firm were ranked on their performance in the Engineering board examination.
They were also ranked on their actual job performance. The data gathered are
tabulated below, where the highest rank is 1 and the lowest is 10.

Applicant    Rank in Board Exam    Rank in Job Performance
    1                6                       5
    2                3                       2
    3                1                       3
    4                8                       6
    5                4                       7
    6                2                       1
    7               10                       9
    8                9                       8
    9                5                       4
   10                7                      10

Find the correlation coefficient and interpret the result.

2. Given the following data (in millions of pesos), do capital and profit
tend to agree? Interpret the correlation coefficient.

Capital    Profit
   6.5       1
  10.8       1.6
  21.4       5.3
  18.7       3.4
   9.3       2
  15.1       4.5

Check-up Test

Choose which of the correlation tests above should be used on the
given data.

A group of administrators attended a training program on Total Quality Management
(TQM). Their administrative competence, measured through a questionnaire, and data
about their profile were determined before and after attending the program.
The data are tabulated as follows:

Admin    Sex            Age    Admin Competence    Admin Competence    Admin Experience
         (1 = male,            Before              After               (Years)
         0 = female)
   1         0           42          4                   5                   3
   2         1           52          4                   5                   5
   3         1           39          3                   4                   1
   4         0           47          3                   4                   5
   5         0           40          4                   4                   1
   6         1           50          5                   5                   8
   7         1           52          5                   4                   7
   8         0           48          4                   4                   4
   9         1           46          5                   4                   6
  10         0           54          4                   5                   8

Determine the degree of relationship between the following pairs of variables:

A. Sex and change in Admin Competence
B. Age and change in Admin Competence
C. Admin Experience and Admin Competence Before

