DSUR I Chapter 06 (Correlation)

The document discusses measuring relationships between variables using correlation. It defines correlation as a way to measure how two variables change together, and introduces several correlation coefficients including Pearson's r, Spearman's rho, and Kendall's tau. Examples are provided calculating these coefficients in R and interpreting the results. Bootstrapping methods are also described as a way to estimate confidence intervals for correlation values.


Correlation

Prof. Andy Field


Aims
• Measuring relationships
– Scatterplots
– Covariance
– Pearson’s correlation coefficient
• Nonparametric measures
– Spearman’s rho
– Kendall’s tau
• Interpreting correlations
– Causality
• Partial correlations
What is a Correlation?
• It is a way of measuring the extent to which
two variables are related.
• It measures the pattern of responses across
variables.
[Slide 4: scatterplot of Appreciation of Dimmu Borgir against Age]

[Slide 5: scatterplot of Appreciation of Dimmu Borgir against Age]

[Slide 6: scatterplot of Appreciation of Dimmu Borgir against Age]
Measuring Relationships
• We need to see whether as one variable
increases, the other increases, decreases or
stays the same.
• This can be done by calculating the
covariance.
– We look at how much each score deviates from
the mean.
– If both variables deviate from the mean by the
same amount, they are likely to be related.
Revision of Variance
• The variance tells us by how much scores
deviate from the mean for a single variable.
• It is closely linked to the sum of squares.
• Covariance is similar – it tells us by how
much scores on two variables differ from
their respective means.
Variance
• The variance tells us by how much scores
deviate from the mean for a single variable.
• It is closely linked to the sum of squares.
$$\text{variance}\,(s^2) = \frac{\sum (x_i - \bar{x})^2}{N - 1} = \frac{\sum (x_i - \bar{x})(x_i - \bar{x})}{N - 1}$$
Covariance
• Calculate the error between the mean and
each subject’s score for the first variable (x).
• Calculate the error between the mean and
their score for the second variable (y).
• Multiply these error values.
• Add these values and you get the
cross-product deviations.
• The covariance is the average of the
cross-product deviations:
$$\text{cov}(x, y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{N - 1}$$

$$= \frac{(-0.4)(-3) + (-1.4)(-2) + (-1.4)(-1) + (0.6)(2) + (2.6)(4)}{4} = \frac{1.2 + 2.8 + 1.4 + 1.2 + 10.4}{4} = \frac{17}{4} = 4.25$$
Problems with Covariance
• It depends upon the units of measurement.
– E.g. the covariance of two variables measured in
miles might be 4.25, but if the same scores are
converted to kilometres, the covariance is 11.
• One solution: standardize it!
– Divide by the standard deviations of both
variables.
• The standardized version of covariance is
known as the correlation coefficient.
– It is relatively unaffected by units of measurement.
The Correlation Coefficient

$$r = \frac{\text{cov}_{xy}}{s_x s_y} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x s_y}$$
The Correlation Coefficient

$$r = \frac{\text{cov}_{xy}}{s_x s_y} = \frac{4.25}{1.67 \times 2.92} = .87$$
Correlation: Example
• Anxiety and exam performance
• Participants:
– 103 students
• Measures
– Time spent revising (hours)
– Exam performance (%)
– Exam Anxiety (the EAQ, score out of 100)
– Gender
Doing a Correlation with R Commander
General Procedure for Correlations
Using R
• To compute basic correlation coefficients
there are three main functions that can be
used:
cor(), cor.test() and rcorr().
Correlations using R
• Pearson correlations:
– cor(examData, use = "complete.obs", method
= "pearson")
– rcorr(examData, type = "pearson")
– cor.test(examData$Exam, examData$Anxiety,
method = "pearson")
• If we predicted a negative correlation:
– cor.test(examData$Exam, examData$Anxiety,
alternative = "less", method = "pearson")
Pearson Correlation Output

Exam Anxiety Revise


Exam 1.0000000 -0.4409934 0.3967207
Anxiety -0.4409934 1.0000000 -0.7092493
Revise 0.3967207 -0.7092493 1.0000000
Reporting the Results
• Exam performance was significantly
correlated with exam anxiety, r = -.44, and
time spent revising, r = .40; the time spent
revising was also correlated with exam
anxiety, r = -.71 (all ps < .001).
Things to Know about the Correlation
• It varies between -1 and +1
– 0 = no relationship
• It is an effect size
– ±.1 = small effect
– ±.3 = medium effect
– ±.5 = large effect
• Coefficient of determination, r2
– By squaring the value of r you get the proportion
of variance in one variable shared by the other.
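Using the Exam–Anxiety value from the correlation output shown earlier, the coefficient of determination is a one-liner:

```r
# r for Exam and Anxiety, taken from the cor() output shown earlier
r <- -0.4409934
r^2   # approx 0.194: anxiety shares about 19.4% of the variance in exam scores
```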
Correlation and Causality
• The third-variable problem:
– In any correlation, causality between two
variables cannot be assumed because there
may be other measured or unmeasured
variables affecting the results.
• Direction of causality:
– Correlation coefficients say nothing about
which variable causes the other to change.
Non-parametric Correlation
• Spearman’s rho
– Pearson’s correlation on the ranked data
• Kendall’s tau
– Better than Spearman’s for small samples
• World’s Biggest Liar competition
– 68 contestants
– Measures
• Where they were placed in the competition (first,
second, third, etc.)
• Creativity questionnaire (maximum score 60)
Spearman’s Rho
cor(liarData$Position, liarData$Creativity, method =
"spearman")
• The output of this command will be:
[1] -0.3732184
• To get the significance value use rcorr() (NB:
first convert the dataframe to a matrix):
liarMatrix<-as.matrix(liarData[, c("Position",
"Creativity")])
rcorr(liarMatrix)
• Or:
cor.test(liarData$Position, liarData$Creativity,
alternative = "less", method = "spearman")
Spearman's Rho Output
Spearman's rank correlation rho
data: liarData$Position and liarData$Creativity
S = 71948.4, p-value = 0.0008602
alternative hypothesis: true rho is less than 0
sample estimates:
rho
-0.3732184
Kendall’s Tau (Non-parametric)
• To carry out Kendall’s correlation on the
World’s Biggest Liar data simply follow the
same steps as for Pearson and Spearman
correlations but use method = “kendall”:
cor(liarData$Position, liarData$Creativity,
method = "kendall")
cor.test(liarData$Position, liarData$Creativity,
alternative = "less", method = "kendall")
Kendall’s Tau (Non-parametric)

• The output is much the same as for
Spearman's correlation.
Kendall's rank correlation tau
data: liarData$Position and liarData$Creativity
z = -3.2252, p-value = 0.0006294
alternative hypothesis: true tau is less than 0
sample estimates:
tau
-0.3002413
Bootstrapping Correlations
• If we stick with our World’s Biggest Liar data and
want to bootstrap Kendall’s tau, then our function
will be:
bootTau <- function(liarData, i)
  cor(liarData$Position[i], liarData$Creativity[i],
      use = "complete.obs", method = "kendall")

• To bootstrap a Pearson or Spearman correlation
you do it in exactly the same way except that you
specify method = "pearson" or method =
"spearman" when you define the function.
Bootstrapping Correlations Output
• To create the bootstrap object, we execute:
library(boot)
boot_kendall<-boot(liarData, bootTau, 2000)
boot_kendall
• To get the 95% confidence interval for the
boot_kendall object:
boot.ci(boot_kendall)
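The pieces above can be combined into one self-contained sketch. The data frame here is a simulated stand-in for liarData (the real data are the 68 contestants' competition positions and creativity scores):

```r
library(boot)

set.seed(111)
# Simulated stand-in for liarData: 68 contestants
liarData <- data.frame(Position   = sample(1:68),
                       Creativity = sample(1:60, 68, replace = TRUE))

# Statistic function: Kendall's tau on the bootstrap sample selected by index i
bootTau <- function(liarData, i)
  cor(liarData$Position[i], liarData$Creativity[i],
      use = "complete.obs", method = "kendall")

boot_kendall <- boot(liarData, bootTau, 2000)
boot.ci(boot_kendall, type = c("norm", "basic", "perc", "bca"))  # 95% CIs
```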
Bootstrapping Correlations Output
• The output below shows the contents of boot_kendall:

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = liarData, statistic = bootTau, R = 2000)

Bootstrap Statistics :
original bias std. error
t1* -0.3002413 0.001058191 0.097663
Bootstrapping Correlations Output
• The output below shows the contents of the boot.ci() function:

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 2000 bootstrap replicates

CALL :
boot.ci(boot.out = boot_kendall)

Intervals :
Level Normal Basic
95% (-0.4927, -0.1099 ) (-0.4956, -0.1126 )

Level     Percentile            BCa
95%   (-0.4879, -0.1049 )   (-0.4777, -0.0941 )
Partial and Semi-partial Correlations

• Partial correlation:
– Measures the relationship between two
variables, controlling for the effect that a third
variable has on them both.
• Semi-partial correlation:
– Measures the relationship between two
variables controlling for the effect that a third
variable has on only one of the others.

[Slides 37-38: Venn diagrams of Exam Performance, Exam Anxiety and Revision Time. Exam Anxiety accounts for 19.4% of the variance in Exam Performance and Revision Time for 15.7%, with some variance accounted for by both. The final panel contrasts the unique variance used by the partial correlation with that used by the semi-partial correlation.]
Doing Partial Correlation using R
• The general form of pcor() is:
pcor(c("var1", "var2", "control1", "control2", ...),
var(dataframe))
• We can then see the partial correlation and
the value of R² in the console by executing:
pc
pc^2
Doing Partial Correlation using R
• The general form of pcor.test() is:
pcor.test(pcor object, number of control variables,
sample size)
• Basically, you enter an object that you have
created with pcor() (or you can put the
pcor() command directly into the function):
pcor.test(pc, 1, 103)
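The slides assume that pc has already been created with pcor() from the ggm package. A minimal runnable sketch, using simulated stand-in data (the chapter's real example uses the Exam, Anxiety and Revise variables from the 103-student dataset):

```r
library(ggm)  # provides pcor() and pcor.test()

set.seed(10)
# Simulated stand-in for the exam data: 103 students
examData2 <- data.frame(Exam    = rnorm(103),
                        Anxiety = rnorm(103),
                        Revise  = rnorm(103))

# Partial correlation between Exam and Anxiety, controlling for Revise
pc <- pcor(c("Exam", "Anxiety", "Revise"), var(examData2))
pc^2                   # proportion of variance shared, controlling for Revise
pcor.test(pc, 1, 103)  # significance test: 1 control variable, n = 103
```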
Partial Correlation Output
> pc
[1] -0.2466658

> pc^2
[1] 0.06084403
> pcor.test(pc, 1, 103)
$tval
[1] -2.545307

$df
[1] 100

$pvalue
[1] 0.01244581
