Correlation Coefficients: Find Pearson's Correlation Coefficient

This document provides instructions for calculating Pearson's correlation coefficient from a dataset. It describes 6 steps: 1) Make a chart with the original data and add columns for the product of corresponding x and y values (xy), x squared (x2), and y squared (y2). 2) Fill the xy column. 3) Fill the x2 column. 4) Fill the y2 column. 5) Sum all the columns. 6) Use the formula to calculate r, the correlation coefficient, which in this example is 0.5298, indicating a moderate positive correlation. It then discusses interpreting the strength of correlations based on the r value.
Correlation Coefficients: Find Pearson’s Correlation Coefficient

How to Find Pearson’s Correlation Coefficients


Correlation coefficients are used in statistics to measure how strong the relationship is between two
variables. There are several types of correlation coefficient; Pearson’s correlation (also written
Pearson correlation) is the one commonly used in linear regression.
Sample question: Find the value of the correlation coefficient from the following table:

SUBJECT   AGE (X)   GLUCOSE LEVEL (Y)
1         43        99
2         21        65
3         25        79
4         42        75
5         57        87
6         59        81

Step 1: Make a chart. Use the given data, and add three more columns: xy, x², and y².

SUBJECT   AGE (X)   GLUCOSE LEVEL (Y)   XY      X²      Y²
1         43        99
2         21        65
3         25        79
4         42        75
5         57        87
6         59        81

Step 2: Multiply x and y together to fill the xy column. For example, row 1 would be
43 × 99 = 4,257.

SUBJECT   AGE (X)   GLUCOSE LEVEL (Y)   XY      X²      Y²
1         43        99                  4257
2         21        65                  1365
3         25        79                  1975
4         42        75                  3150
5         57        87                  4959
6         59        81                  4779

Step 3: Take the square of the numbers in the x column, and put the result in the
x² column.

SUBJECT   AGE (X)   GLUCOSE LEVEL (Y)   XY      X²      Y²
1         43        99                  4257    1849
2         21        65                  1365    441
3         25        79                  1975    625
4         42        75                  3150    1764
5         57        87                  4959    3249
6         59        81                  4779    3481

Step 4: Take the square of the numbers in the y column, and put the result in the
y² column.

SUBJECT   AGE (X)   GLUCOSE LEVEL (Y)   XY      X²      Y²
1         43        99                  4257    1849    9801
2         21        65                  1365    441     4225
3         25        79                  1975    625     6241
4         42        75                  3150    1764    5625
5         57        87                  4959    3249    7569
6         59        81                  4779    3481    6561

Step 5: Add up all of the numbers in each column and put the result at the
bottom of that column. The Greek letter sigma (Σ) is a short way of saying “sum of.”

SUBJECT   AGE (X)   GLUCOSE LEVEL (Y)   XY      X²      Y²
1         43        99                  4257    1849    9801
2         21        65                  1365    441     4225
3         25        79                  1975    625     6241
4         42        75                  3150    1764    5625
5         57        87                  4959    3249    7569
6         59        81                  4779    3481    6561
Σ         247       486                 20485   11409   40022
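
Steps 2 through 5 can also be checked with a few lines of code. The following is a minimal Python sketch, not part of the original worked example; the variable names (ages, glucose, xy, x_sq, y_sq, sums) are illustrative choices.

```python
# Ages (x) and glucose levels (y) copied from the sample table.
ages = [43, 21, 25, 42, 57, 59]
glucose = [99, 65, 79, 75, 87, 81]

# Steps 2-4: build the xy, x squared, and y squared columns row by row.
xy = [x * y for x, y in zip(ages, glucose)]
x_sq = [x * x for x in ages]
y_sq = [y * y for y in glucose]

# Step 5: total each column (the sigma row at the bottom of the chart).
sums = {
    "x": sum(ages),
    "y": sum(glucose),
    "xy": sum(xy),
    "x2": sum(x_sq),
    "y2": sum(y_sq),
}
print(sums)
# {'x': 247, 'y': 486, 'xy': 20485, 'x2': 11409, 'y2': 40022}
```

The printed totals should match the Σ row of the chart.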

Step 6: Use the following correlation coefficient formula.

r = [n(Σxy) – (Σx)(Σy)] / √([n(Σx²) – (Σx)²] × [n(Σy²) – (Σy)²])

The answer is: 2868 / 5413.27 = 0.529809

From our table:

- Σx = 247
- Σy = 486
- Σxy = 20,485
- Σx² = 11,409
- Σy² = 40,022
- n is the sample size, in our case = 6

The correlation coefficient =

[6(20,485) – (247 × 486)] / √([6(11,409) – 247²] × [6(40,022) – 486²])

= 2,868 / √(7,445 × 3,936)

= 0.5298

The range of the correlation coefficient is from -1 to 1. Our result is 0.5298, which means the variables
have a moderate positive correlation. (The coefficient is a unitless value, not a percentage.)
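
As a cross-check on Step 6, the formula can be evaluated directly from the column totals. This is a small Python sketch under the assumption that the sums above are correct; the function name pearson_from_sums is hypothetical and only used for illustration.

```python
from math import sqrt


def pearson_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Evaluate the Step 6 formula from the column totals."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Totals taken from the sigma row of the chart, with n = 6 subjects.
r = pearson_from_sums(6, 247, 486, 20485, 11409, 40022)
print(round(r, 4))  # 0.5298
```

Working from the sums mirrors the hand calculation, so each intermediate value (2,868 in the numerator, about 5,413.27 in the denominator) can be compared with the figures in the text.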

How to Test Correlation Coefficients

If you can read a table, you can test a correlation coefficient for significance.

Sample problem: Test the significance of the correlation coefficient r = 0.565 using the table of
critical values for the PPMC (Pearson product-moment correlation). Test at α = 0.01 for a sample size of 9.
Step 1: Subtract two from the sample size to get df, the degrees of freedom.
9 – 2 = 7
Step 2: Look the values up in the PPMC table. With df = 7 and α = 0.01, the table value is
0.798.
Step 3: Draw a graph, so you can more easily see the relationship.
r = 0.565 does not fall into the “reject” region (above 0.798), so there isn’t enough evidence to
state that a significant linear relationship exists in the data.
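
The table lookup can be sketched in Python as follows. The critical value 0.798 is taken from the text rather than computed, and the function names are illustrative. The t statistic at the end is not part of the original procedure; it is the standard equivalent form t = r√(df / (1 – r²)), included only as a cross-check.

```python
from math import sqrt


def degrees_of_freedom(n):
    """Step 1: df is the sample size minus two."""
    return n - 2


def exceeds_critical_value(r, critical_r):
    """Step 3: r is significant only if |r| is above the table value."""
    return abs(r) > critical_r


def t_statistic(r, n):
    """Equivalent t statistic for r (an addition, not in the original text)."""
    df = degrees_of_freedom(n)
    return r * sqrt(df / (1 - r ** 2))


r, n = 0.565, 9
df = degrees_of_freedom(n)
significant = exceeds_critical_value(r, critical_r=0.798)  # table value for df = 7, alpha = 0.01
t = t_statistic(r, n)

print(df)           # 7
print(significant)  # False
print(round(t, 2))  # 1.81
```

Because 0.565 is below 0.798, the check returns False, which matches the conclusion reached from the table.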

What Does the Correlation Coefficient Mean?


Pearson’s Correlation Coefficient returns a value between -1 and +1. A value of -1 means there is a perfect
negative correlation and +1 means that there is a perfect positive correlation. This can initially be a little
hard to wrap your head around (who likes to deal with negative numbers?). The Political Science Department at
Quinnipiac University posted this useful list of the meaning of Pearson’s Correlation coefficients. They
note that these are “crude estimates” for interpreting strengths of correlations using Pearson’s
Correlation:

r value           Meaning
+.70 or higher    Very strong positive relationship
+.40 to +.69      Strong positive relationship
+.30 to +.39      Moderate positive relationship
+.20 to +.29      Weak positive relationship
+.01 to +.19      No or negligible relationship
0                 No relationship
-.01 to -.19      No or negligible relationship
-.20 to -.29      Weak negative relationship
-.30 to -.39      Moderate negative relationship
-.40 to -.69      Strong negative relationship
-.70 or lower     Very strong negative relationship
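
The list can be turned into a small lookup function. This Python sketch uses a hypothetical name, describe_strength, and makes one assumption the list does not state: values that fall between two bands (for example 0.695) are assigned to the lower band by comparing against each band's lower edge.

```python
def describe_strength(r):
    """Map an r value to the rough labels in the list above."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "No relationship"
    size = abs(r)
    direction = "positive" if r > 0 else "negative"
    # Check the bands from strongest to weakest, using each lower edge.
    if size >= 0.70:
        return f"Very strong {direction} relationship"
    if size >= 0.40:
        return f"Strong {direction} relationship"
    if size >= 0.30:
        return f"Moderate {direction} relationship"
    if size >= 0.20:
        return f"Weak {direction} relationship"
    return "No or negligible relationship"


print(describe_strength(0.35))   # Moderate positive relationship
print(describe_strength(-0.75))  # Very strong negative relationship
```

As the source of the list notes, these labels are crude estimates, so the function is a convenience rather than a statistical test.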

It may be helpful to see graphically what these correlations look like:


Graphs showing a correlation of -1 (a negative correlation), 0 and +1 (a positive correlation)

The images show that a strong negative correlation means that the graph has a downward slope
from left to right: as the x-values increase, the y-values get smaller. A strong positive correlation
means that the graph has an upward slope from left to right: as the x-values increase, the y-values
get larger.
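
The three cases in the graphs can be reproduced with made-up data. The Python sketch below uses invented points (they are not from the original document) and an illustrative helper, pearson_r, that applies the same sum-based formula as the worked example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from raw data, using the sum-based formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


xs = [-2, -1, 0, 1, 2]
rising = [2 * x + 1 for x in xs]    # points on an upward-sloping line
falling = [-2 * x + 1 for x in xs]  # points on a downward-sloping line
curved = [x * x for x in xs]        # symmetric curve with no linear trend

print(pearson_r(xs, rising))   # 1.0
print(pearson_r(xs, falling))  # -1.0
print(pearson_r(xs, curved))   # 0.0
```

The curved case shows that r = 0 means no linear relationship; the y-values still depend on x, just not along a straight line.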

Where did the Correlation Coefficient Come From?

A correlation coefficient gives you an idea of how well data fits a line or curve. Pearson wasn’t the
original inventor of the term correlation but his use of it became one of the most popular ways to measure
correlation.

Francis Galton (who was also involved with the development of the interquartile range) was the first
person to measure correlation, originally termed “co-relation,” which actually makes sense considering
you’re studying the relationship between a couple of different variables. In Co-Relations and Their
Measurement, he said “The statures of kinsmen are co-related variables; thus, the stature of the father is
correlated to that of the adult son and so on; but the index of co-relation … is different in the different
cases.” It’s worth noting though that Galton mentioned in his paper that he had borrowed the term from
biology, where “Co-relation and correlation of structure” was being used but until the time of his paper it
hadn’t been properly defined.

In 1892, British statistician Francis Ysidro Edgeworth published a paper called “Correlated Averages,”
Philosophical Magazine, 5th Series, 34, 190-204 where he used the term “Coefficient of Correlation.” It
wasn’t until 1896 that British mathematician Karl Pearson used “Coefficient of Correlation” in two
papers: Contributions to the Mathematical Theory of Evolution and Mathematical Contributions to the
Theory of Evolution. III. Regression, Heredity and Panmixia. It was the second paper that introduced the
Pearson product-moment correlation formula for estimating correlation.
