
Correlation: Khairil Anuar Md. Isa Bbiomedicalsc. (Hons), Ukm Msc. (Medical Stat), Usm

This document discusses correlation, which measures the strength and direction of a linear relationship between two random variables. Correlation is measured by the coefficient of correlation r, which ranges from -1 to 1. A correlation close to 0 indicates a weak linear relationship, while values closer to 1 or -1 indicate a strong positive or negative linear relationship. The document outlines how to interpret correlation coefficients and cautions about assumptions and limitations, such as correlation not implying causation. Steps for computing correlation are provided, including generating scatter plots and running correlation tests in software like SPSS.

Uploaded by

Misx Muna
Copyright
© Attribution Non-Commercial (BY-NC)

CORRELATION

Khairil Anuar Md. Isa BBiomedicalSc.(hons), UKM MSc. (Medical Stat), USM

WHAT IS CORRELATION?
Measures the strength and direction of a linear relationship between a pair of random variables.
Measured by the coefficient of correlation, ρ (rho) (population).
ρ has a value between -1 and 1.
The sample estimate of ρ is denoted by r (sample coefficient of correlation).

WHAT IS CORRELATION? CONT

The type of coefficient suitable for a pair of variables is determined by the measurement scales (ratio/interval or ordinal) and the distributions of the variables (normal or not normal).
It does not indicate a cause-and-effect relationship. E.g.: a high correlation does not necessarily mean that one variable caused the other.

CORRELATION ANSWERS:

What is the direction of the linear relationship (negative or positive)?
How strong is the linear relationship?

Regression adds the answer to the question of prediction (more detail on the linear relationship).

EXAMPLES OF RESEARCH QUESTIONS


What factors are related to road accident fatalities?
Is the satisfaction level of students towards their lectures related to grade expectation?
What factors affect customer satisfaction?
Is the shelf life of bread related to the humidity of the storage area?

CORRELATION VERSUS REGRESSION

Correlation
Emphasis on the degree of linear relationship
How strong is the relationship?
Does NOT matter which is X and which is Y

Regression
Emphasis on prediction
Directional ~ one variable is the predictor used to predict the other
MUST identify which is the predictor variable (X) and which is the variable to be predicted (Y)

RELATIONSHIP BETWEEN 2 QUANTITATIVE VARIABLES

Measured by Pearson's coefficient of correlation (parametric).
Both variables must be quantitative and normally distributed.
The nature, direction and strength of the relationship can be determined from a scatter plot.

[Figure: scatter plot of BP against Weight. Positive correlation: if weight increases, blood pressure increases.]

[Figure: scatter plot, x-axis labelled X5. Negative correlation: if age increases, PEFR decreases.]

[Figure: scatter plot of Weight against Age. Almost no correlation: when age increases, weight does not change.]

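The longer the ellipse formed by the points, the larger the magnitude of r.
The more scattered the distribution, the lower the correlation; the cloud of points approaches the shape of a circle.

[Figure: four scatter plots, with r = -0.94, r = -0.54, r = 0.42 and r = 0.17.]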

GUIDE TO INTERPRET THE COEFF. (R)


r = 0 to 0.25 : poor / no correlation
r = 0.26 to 0.50 : fair correlation
r = 0.51 to 0.75 : good correlation
r = 0.76 to 1 : excellent / perfect correlation
The same ranges apply to + or - values.
Positive correlation means the two variables move in the same direction, and vice versa for negative correlation.
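The guide above can be turned into a small lookup. This is only a sketch: the function name `describe_correlation` is made up for illustration, and treating each upper bound (0.25, 0.50, 0.75) as inclusive is an assumption, since the guide does not say how values between bands such as 0.255 should be classed.

```python
def describe_correlation(r: float) -> str:
    """Label a correlation coefficient using the guide's bands.

    Assumption: upper bounds of each band are inclusive.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")

    size = abs(r)  # the guide applies to + and - values alike
    if size <= 0.25:
        strength = "poor / no"
    elif size <= 0.50:
        strength = "fair"
    elif size <= 0.75:
        strength = "good"
    else:
        strength = "excellent / perfect"

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "no direction"

    return f"{direction}, {strength} correlation"


print(describe_correlation(-0.244))
print(describe_correlation(0.62))
```

The label describes strength only; it says nothing about statistical significance, which depends on the sample size as well.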

ASSUMPTIONS FOR CORRELATION TEST


At least one of the variables is normally distributed
There is a linear relationship between the variables
Random sample

H0: There is no correlation between X and Y
HA: There is a correlation between X and Y

SUMMARY..
To see whether there is any relationship between two numerical variables in terms of strength and direction.

Scatter plot

Linear association? Direction (+/-) association?

Correlation test = r

PEARSON'S CORRELATION COEFFICIENT


A measure of the degree of straight-line relationship between two numerical variables
At least one variable must have a normal distribution
Population correlation coefficient: ρ (rho)
Sample correlation coefficient: r

FORMULA..
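
The formula on this slide did not survive extraction. What follows is the standard textbook definition of the sample Pearson coefficient, which is presumably what the slide showed. The notation (n pairs of observations x_i, y_i with sample means x̄ and ȳ) is an assumption and is not taken from the original.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator measures how the two variables vary together; the denominator rescales it so that r always lies between -1 and 1.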

TESTING THE SIGNIFICANCE OF THE ASSOCIATION
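
The content of this slide was not preserved. A common way to test H0: ρ = 0 for Pearson's r is the t test below; it is offered as the usual textbook form, not necessarily the exact expression the slide used. Here n is the number of paired observations.

```latex
t = r \sqrt{\frac{n - 2}{1 - r^2}}, \qquad \text{df} = n - 2
```

Under H0 the statistic follows a t distribution with n - 2 degrees of freedom, so the p-value is read from that distribution.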

CAUTIONS ABOUT INTERPRETING CORRELATION COEFFICIENTS

Appropriate data type


Data MUST be drawn from a random sample
One variable should not be a component of the other
Data should not be the combined data from two identifiable groups

..CAUTIONS..

Effect of outliers
Look at the scatter plot!
Outliers affect the means, so they can affect the coefficient
The effect may be muted if the sample size is large
Check whether it is a true outlier or not, then handle the outlier

Correlation is not agreement
Do not use correlation to compare change with initial value

..CAUTION..LIMITATIONS.
A very weak relationship (r < 0.25) can still give a significant result with an adequately large sample size
High correlation does not imply a cause-and-effect relationship
Correlation has no temporal component, which causality requires
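
The first limitation can be seen numerically. The sketch below assumes the usual t test for H0: ρ = 0, with t = r·sqrt((n - 2) / (1 - r²)) on n - 2 degrees of freedom, and holds r fixed at a weak value while the sample size grows. The value r = 0.20 and the two sample sizes are made up for illustration.

```python
import math


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 pairs")
    if abs(r) >= 1.0:
        raise ValueError("t is undefined for |r| = 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


r = 0.20  # weak by the guide (below 0.25); illustrative value
for n in (20, 1000):
    print(f"n = {n:4d}: r = {r:.2f}, t = {t_statistic(r, n):.2f}, df = {n - 2}")
```

With n = 20 the statistic stays well below the two-sided 5% critical value for 18 degrees of freedom (2.101), while with n = 1000 it is far above the large-sample critical value of about 1.96. The coefficient is the same in both cases; only the sample size changed.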

SPEARMAN RANK CORRELATION


Distribution free
For data that are not normally distributed ~ nonparametric procedure
Replace the observations by their ranks in the calculation

d is the difference between the rank of x and the rank of y
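
The slide's formula was not preserved. The definition of d suggests the usual rank-difference form, r_s = 1 - 6·Σd² / (n(n² - 1)), which is what the sketch below assumes. That form is exact only when there are no tied values, so the helper assumes distinct observations. The function names and the data are made up for illustration.

```python
def ranks(values):
    """Rank observations from 1 (smallest) upward.

    Assumption: all values are distinct (no ties).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rs(x, y):
    """Spearman rank correlation via the rank-difference formula."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    rank_x, rank_y = ranks(x), ranks(y)
    # d is the difference between the rank of x and the rank of y
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


x = [10, 20, 30, 40, 50]   # illustrative data
y = [1, 3, 2, 5, 4]
print(round(spearman_rs(x, y), 3))  # prints 0.8 (sum of d^2 is 4, n is 5)
```

When ties are present, the usual approach is to assign average ranks and compute Pearson's coefficient on the ranks instead.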

COMPUTING CORRELATION COEFFICIENTS


Scatter plot Correlation matrix Coefficient Significance test

STEPS..

Check normality of the two variables: Pearson's or Spearman rank?

Do scatter plot: identify outliers (remove?); visual assessment of whether the relationship is LINEAR or not, and its direction

Correlation test: p-value and correlation coefficient
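
Outside SPSS, the coefficient in the last step can be computed directly. The sketch below uses the standard sample formula for Pearson's r in plain Python. The function name `pearson_r` and the weight and blood pressure values are made up for illustration, and it does not produce a p-value, which needs the t distribution from a statistics package.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of paired observations."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length, at least 3")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of products and squares of deviations from the means
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)

    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


weight = [50, 60, 70, 80, 90]       # illustrative values
bp = [110, 118, 125, 135, 140]      # illustrative values
print(f"r = {pearson_r(weight, bp):.3f}")
```

Swapping the two arguments gives the same value, which matches the point that for correlation it does not matter which variable is X and which is Y.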

Steps using SPSS..

2 Click the two variables

INTERPRETATION

There is a significant (p < 0.001) but poor negative correlation (r = -0.244) between age and PEFR.

Thank you..
