
06 Correlation

The document discusses correlation as a measure of the association between two variables, detailing both Pearson and Spearman correlation coefficients. It highlights the differences between linear and monotonic relationships and introduces other correlation measures like Point-Biserial and Intraclass. The document emphasizes that correlation does not imply causation.


Biostatistics I: Descriptive Statistics

Correlation

Eleni-Rosalina Andrinopoulou
Department of Biostatistics, Erasmus Medical Center

@erandrinopoulou
In this Section

▶ Correlation coefficients
▶ Examples

Correlation

Correlation is a measure that describes the strength of the association between two variables. For two continuous variables, the following relationships are possible:

[Figure: three scatter plots of Variable 2 against Variable 1, titled "Positive correlation", "Negative correlation" and "No correlation".]
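
The three patterns can be imitated with simulated data. The following is a minimal R sketch, assuming normally distributed toy data; the variable names, the sample size and the seed are illustrative choices and are not taken from the slides:

```r
# Simulated toy data (illustrative assumption, not the data behind the slides)
set.seed(1)
n  <- 200
v1 <- rnorm(n)

v2_pos  <- v1 + rnorm(n, sd = 0.5)    # tends to increase with v1
v2_neg  <- -v1 + rnorm(n, sd = 0.5)   # tends to decrease with v1
v2_none <- rnorm(n)                   # generated independently of v1

r_pos  <- cor(v1, v2_pos)
r_neg  <- cor(v1, v2_neg)
r_none <- cor(v1, v2_none)

print(round(c(positive = r_pos, negative = r_neg, none = r_none), 2))
```

Calling plot(v1, v2_pos) and its counterparts should give scatter plots similar in spirit to the three panels.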

Pearson Correlation

▶ magnitude of association
▶ linear association
▶ direction of the relationship

A relationship is linear when a change in one variable is associated with a proportional change in the other variable.

Pearson correlation:
$$\mathrm{corr}(X, Y) = \frac{\mathrm{cov}(X, Y)}{\mathrm{sd}(X)\,\mathrm{sd}(Y)},$$
where cov(X, Y) is the covariance and sd(X), sd(Y) are the standard deviations.
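
As a quick check of the definition, the ratio of the covariance to the product of the standard deviations can be compared with R's built-in cor(). This is a minimal sketch; the data values are made up for illustration:

```r
# Small made-up sample (hypothetical values, for illustration only)
x <- c(1.2, 2.5, 3.1, 4.8, 5.0, 6.7)
y <- c(2.0, 2.9, 3.8, 5.1, 4.9, 7.2)

# Pearson correlation from its definition: cov / (sd * sd)
r_manual <- cov(x, y) / (sd(x) * sd(y))

# Built-in equivalent; method = "pearson" is the default of cor()
r_builtin <- cor(x, y, method = "pearson")

print(c(manual = r_manual, builtin = r_builtin))
```

Both cov() and sd() use the n - 1 divisor, so the divisors cancel and the two results agree.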

Spearman Correlation
▶ direction of the relationship
▶ monotonic relationship

In a monotonic relationship, the variables tend to change together, but not always at a constant rate (as in the linear case).

The Spearman correlation coefficient is based on the ranked values:
$$\mathrm{corr}_R(X, Y) = \frac{\mathrm{cov}(R_X, R_Y)}{\mathrm{sd}(R_X)\,\mathrm{sd}(R_Y)}$$

What is rank?
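A rank is the position of a value when the sample is sorted in increasing order (1 for the smallest value; tied values share the average of their positions). E.g. the ranks of 3, 10, 16, 6, 2 are 2, 4, 5, 3, 1: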

rank(c(3, 10, 16, 6, 2))

[1] 2 4 5 3 1
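
Since the Spearman coefficient is the Pearson formula applied to ranks, it can be reproduced by hand. The sketch below reuses the five values from the rank example; the paired y values are hypothetical and only serve the illustration:

```r
# Five values from the rank example, paired with made-up y values
x <- c(3, 10, 16, 6, 2)
y <- c(1.5, 40, 35, 8, 2.2)   # hypothetical, for illustration only

rx <- rank(x)   # 2 4 5 3 1
ry <- rank(y)   # 1 5 4 3 2

# Pearson formula applied to the ranks
rho_manual <- cov(rx, ry) / (sd(rx) * sd(ry))

# Built-in Spearman correlation
rho_builtin <- cor(x, y, method = "spearman")

print(c(manual = rho_manual, builtin = rho_builtin))   # both equal 0.8
```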
Difference between Pearson and Spearman
Weak positive correlation: Pearson = 0.5, Spearman = 0.51
Strong positive correlation: Pearson = 0.9, Spearman = 0.8

[Figure: two scatter plots of Variable 2 against Variable 1, one for each case.]

Difference between Pearson and Spearman
Linear vs. monotonic relationship: Pearson = −0.85, Spearman = −0.99
What if there is a correlation but this is not linear? Pearson = 0.06, Spearman = 0.04

[Figure: two scatter plots of Variable 2 against Variable 1, one for each case.]
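
Both situations can be imitated with noise-free toy curves. This is a sketch under assumed functional forms (an exponential decay and a symmetric U-shape); these are not the data behind the plots, so the coefficients differ from the values quoted on the slide:

```r
# Monotonic but non-linear: exponential decay (illustrative choice)
x1 <- 1:20
y1 <- exp(-x1 / 3)
p1 <- cor(x1, y1)                         # Pearson: negative, clearly weaker than -1
s1 <- cor(x1, y1, method = "spearman")    # Spearman: -1, because y1 strictly decreases

# Non-monotonic: symmetric U-shape (illustrative choice)
x2 <- -10:10
y2 <- x2^2
p2 <- cor(x2, y2)                         # essentially 0
s2 <- cor(x2, y2, method = "spearman")    # essentially 0

print(round(c(pearson = p1, spearman = s1), 2))
print(round(c(pearson = p2, spearman = s2), 2))
```

In the second case y2 is completely determined by x2, yet both coefficients are close to zero: neither measure detects a relationship that is not monotonic.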
Other Correlation Measures

▶ Point-Biserial: It evaluates the association between a continuous variable and a dichotomous (two-category) variable
▶ Intraclass: It evaluates the association between continuous measurements that are structured in groups
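
For the point-biserial case, the coefficient coincides with the Pearson correlation once the dichotomous variable is coded as 0/1. The sketch below checks this against the usual group-means form; the data and variable names are hypothetical:

```r
# Hypothetical data: a continuous measurement and a 0/1 group indicator
x <- c(4.1, 5.3, 6.0, 5.5, 7.2, 8.1, 6.9, 7.7)
g <- c(0,   0,   0,   0,   1,   1,   1,   1)

# Point-biserial correlation as Pearson correlation with the 0/1 coding
r_pb <- cor(x, g)

# Equivalent form based on the difference between the two group means
n   <- length(x)
n1  <- sum(g == 1)
n0  <- sum(g == 0)
s_n <- sqrt(mean((x - mean(x))^2))   # standard deviation with divisor n
r_pb_means <- (mean(x[g == 1]) - mean(x[g == 0])) / s_n * sqrt(n1 * n0 / n^2)

print(c(pearson = r_pb, group_means = r_pb_means))
```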

Note
▶ Correlation must not be confused with causality
▶ If two variables are correlated, it does not follow that one variable causes the changes in the other
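
One way correlation can arise without causation is a shared cause. The simulation below is a minimal sketch of that situation, assuming a hypothetical third variable z that drives both x and y; names, sample size and seed are illustrative:

```r
# Simulated example of a common cause (all values are made up)
set.seed(42)
n <- 500
z <- rnorm(n)                  # hypothetical common cause
x <- z + rnorm(n, sd = 0.5)    # depends on z only
y <- z + rnorm(n, sd = 0.5)    # depends on z only, not on x

r_xy    <- cor(x, y)           # clearly positive
r_resid <- cor(x - z, y - z)   # close to 0 once the part due to z is subtracted

print(round(c(raw = r_xy, after_removing_z = r_resid), 2))
```

Here x and y are strongly correlated although neither influences the other; the association disappears once the contribution of z is removed.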
