
16.. Correlation Analysis - Michael

The document provides an overview of biostatistics focusing on linear regression and correlation, explaining the concepts of correlation, causation, and types of correlation including positive, negative, simple, and multiple correlation. It introduces Pearson's Linear Correlation Coefficient as a measure of the strength of linear association between two quantitative variables and discusses its characteristics and calculation. Additionally, it covers the coefficient of determination and assumptions of Pearson’s correlation coefficient, along with its advantages.

Uploaded by

DrElias Davis
Copyright
© All Rights Reserved

BIOSTATISTICS

LINEAR REGRESSION & CORRELATION

PRESENTER: MICHAEL GILYA

FACILITATOR: PROF. SEMALI
Correlation
• Correlation is a statistical tool that measures and analyzes
the degree of relationship between two variables.

• Correlation analysis deals with the association between two
or more variables.
Correlation & Causation
Correlation denotes the interdependence between variables
when two phenomena are related.

For a correlation between two phenomena to be interpreted
meaningfully, there should be a plausible cause-effect
relationship between them; if no such relationship exists,
an observed correlation may be merely coincidental.
Cont..
• If two variables vary in such a way that movements in
one are accompanied by movements in the other, the
variables are correlated and may have a cause-and-effect
relationship.

• Causation always implies correlation, but correlation
does not necessarily imply causation.
Types of correlation

Type 1
A) POSITIVE CORRELATION

• The correlation is said to be positive if the values of
the two variables change in the same direction, e.g.
height & weight, water consumption & temperature,
study time & grades.
• Variable y increases as x increases, and vice versa.
Cont..
B) NEGATIVE CORRELATION

• The correlation is said to be negative when the values
of the variables change in opposite directions, e.g.
price & quantity demanded, alcohol consumption &
driving ability.

• Variable y increases as variable x decreases, and
vice versa.
Type 2
A) Simple correlation: only two variables are studied.
B) Multiple correlation: three or more variables are
studied. It is further categorized into:
• Partial correlation: the analysis recognizes more than
two variables but considers only two of them, keeping
the others constant.
• Total correlation: the analysis is based on all the
relevant variables, which is normally not feasible.
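As a rough illustration of partial correlation, the sketch below applies the standard first-order partial correlation formula, which adjusts the correlation between x and y for a single third variable z. The function name `partial_r` and the input values are assumptions made for illustration; they do not come from the slides.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, keeping z constant.

    The arguments are the three pairwise Pearson coefficients.
    """
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator

# Hypothetical coefficients: x and y each correlate 0.5 with z
# and 0.25 with each other.
adjusted = partial_r(0.25, 0.5, 0.5)
print(adjusted)  # 0.0: the x-y association disappears once z is held constant
```

With these made-up inputs, the simple correlation of 0.25 is fully accounted for by the shared dependence on z.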
Type 3
A) Linear correlation: correlation is said to be linear when the
amount of change in one variable tends to bear a constant
ratio to the amount of change in the other. The graph of
variables having a linear relationship forms a straight line.
B) Non-linear correlation: correlation is non-linear if the
amount of change in one variable does not bear a constant
ratio to the amount of change in the other variable.
Pearson’s Linear Correlation Coefficient:

• Linear regression provides us with a straight line which
summarizes the relationship between two variables.

• However, it does not tell us how closely the data lie along
that straight line.

• The closeness with which the points lie along the straight
line is measured by (Pearson's) correlation coefficient.
Pearson’s Linear Correlation Coefficient

• It measures the strength of the linear association between two
quantitative variables.

• It can be obtained by using the formula:

$$ r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}} $$

Where:
i. r = correlation coefficient
ii. x_i = values of the x-variable in a sample
iii. x̄ = mean of the values of the x-variable
iv. y_i = values of the y-variable in a sample
v. ȳ = mean of the values of the y-variable
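The formula can be translated almost term by term into code. The following is a minimal sketch in plain Python; the function name `pearson_r` and the small sample (study hours and test scores) are made up for illustration and are not part of the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: summed cross-products of deviations divided by the
    square root of the product of the two summed squared deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # Note: a constant sample gives s_xx or s_yy of zero, so r is undefined.
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: hours studied and test score for five students.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(round(r, 3))  # 0.775
```

Here the cross-products sum to 6 and the squared deviations sum to 10 and 6, so r = 6 / √60.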
Characteristics of r
• It is a bivariate correlation coefficient summarizing the magnitude
and direction of the relationship between two variables.

• r > 0 means the variables are positively correlated: as x
increases, y tends to increase, and vice versa.

• r < 0 means the variables are negatively correlated: as x increases,
y tends to decrease, while as x decreases, y tends to increase.
Cont..
• Ranges between -1 and +1
r = 0: no linear relationship (uncorrelated)
r = 1: perfect positive linear relationship
r = -1: perfect negative linear relationship
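The three special values can be checked numerically. The sketch below uses invented data and a helper named `pearson_r` (an illustrative name, not from the slides). It also shows that r = 0 rules out only a linear relationship: the points of y = x² are perfectly related, yet r is zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length samples."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

x = [-2, -1, 0, 1, 2]
r_rising = pearson_r(x, [2 * v + 1 for v in x])    # points on a rising line
r_falling = pearson_r(x, [-3 * v + 4 for v in x])  # points on a falling line
r_curve = pearson_r(x, [v ** 2 for v in x])        # points on a parabola

print(r_rising, r_falling, r_curve)  # 1.0 -1.0 0.0
```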
1. Calculate the correlation coefficient of age (years) against the
body mass index (BMI) of patients who attended the clinic at a
certain department.
N AGE (YEARS) BMI (kg/m²)
1 73 28
2 22 22
3 74 27
4 34 29
5 50 29
6 42 27
7 64 28
8 53 29
9 43 24
10 21 19
11 12 17
• Recall the formula for r:

$$ r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}} $$
X   Y   X - X̄           Y - Ȳ          (X - X̄)(Y - Ȳ)  (X - X̄)²  (Y - Ȳ)²
73  28  73 - 44 = 29    28 - 25 = 3    87              841       9
22  22  22 - 44 = -22   22 - 25 = -3   66              484       9
74  27  74 - 44 = 30    27 - 25 = 2    60              900       4
34  29  34 - 44 = -10   29 - 25 = 4    -40             100       16
50  29  50 - 44 = 6     29 - 25 = 4    24              36        16
42  27  42 - 44 = -2    27 - 25 = 2    -4              4         4
64  28  64 - 44 = 20    28 - 25 = 3    60              400       9
53  29  53 - 44 = 9     29 - 25 = 4    36              81        16
43  24  43 - 44 = -1    24 - 25 = -1   1               1         1
21  19  21 - 44 = -23   19 - 25 = -6   138             529       36
12  17  12 - 44 = -32   17 - 25 = -8   256             1024      64
X̄ = 44  Ȳ = 25  Σ = 4   Σ = 4          Σ = 684         Σ = 4400  Σ = 184
(Means are rounded to whole numbers: X̄ = 488/11 ≈ 44.36, Ȳ = 279/11 ≈ 25.36.)
r = 684 / √(4400 × 184) ≈ 0.760
(With unrounded means, r ≈ 0.762.)
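The hand calculation rounds both means to whole numbers. As a cross-check, the sketch below repeats the same steps on the clinic data with unrounded means. The variable names are illustrative, and the result can differ slightly from a value worked out by hand with rounded means.

```python
import math

# Age (years) and BMI (kg/m^2) for the 11 patients in the exercise.
age = [73, 22, 74, 34, 50, 42, 64, 53, 43, 21, 12]
bmi = [28, 22, 27, 29, 29, 27, 28, 29, 24, 19, 17]

n = len(age)
mean_age = sum(age) / n  # 488 / 11
mean_bmi = sum(bmi) / n  # 279 / 11

# Deviations from the means, as in the X - X̄ and Y - Ȳ columns.
dev_age = [x - mean_age for x in age]
dev_bmi = [y - mean_bmi for y in bmi]

sum_products = sum(dx * dy for dx, dy in zip(dev_age, dev_bmi))
sum_sq_age = sum(dx ** 2 for dx in dev_age)
sum_sq_bmi = sum(dy ** 2 for dy in dev_bmi)

r = sum_products / math.sqrt(sum_sq_age * sum_sq_bmi)
print(round(mean_age, 2), round(mean_bmi, 2))  # 44.36 25.36
print(round(r, 3))                             # 0.762
```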

What does it mean?


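How to use Excel to find r?
• Enter the two variables in adjacent columns, as below, and apply
Excel's built-in CORREL (or PEARSON) function to the two ranges.
N AGE (YEARS) BMI (kg/m²)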
1 73 28
2 22 22
3 74 27
4 34 29
5 50 29
6 42 27
7 64 28
8 53 29
9 43 24
10 21 19
11 12 17
Rule of thumb for r
Correlation   Strong          Weak
Positive      0.7 to 1.0      0.3 to <0.7
Negative      -1.0 to -0.7    >-0.7 to -0.3
Little or no correlation: -0.3 to 0.3
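The rule of thumb can be written as a small lookup. This is a sketch only: the function name `describe_r` is made up, and because the table lists ±0.3 under both "weak" and "little or no correlation", the code assumes that exactly ±0.3 counts as little or no correlation.

```python
def describe_r(r):
    """Label a correlation coefficient using the rule-of-thumb bands."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size >= 0.7:
        strength = "strong"
    elif size > 0.3:  # assumption: exactly 0.3 falls in the band below
        strength = "weak"
    else:
        return "little or no correlation"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"

print(describe_r(0.76))   # strong positive
print(describe_r(-0.45))  # weak negative
print(describe_r(0.12))   # little or no correlation
```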
Degree of Correlations
[Figure: six scatter plots. (a) Strong +ve r, (b) Weak +ve r, (c) Strong -ve r, (d) Weak -ve r, (e) Moderate -ve r, (f) No r]
Coefficient of Determination
• The coefficient of determination, r², is the square of
Pearson’s correlation coefficient r.

• It represents the proportion of the variation in y that is
explained by x, usually expressed as a percentage.
• It denotes the percentage of the variation in y that can be predicted from x.

• For example, an r² of 0.612 (= 61.2%) in a study of age (x) and SBP (y)
means that:
• Age explains 61.2% of the variation in SBP; the remaining
38.8% is attributable to other factors.
• In other words, only 61.2% of the variation in SBP (y)
is associated with variation in age (x).

• Moreover, we can say that using age (x) to predict SBP (y)
accounts for only 61.2% of its variation.
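One way to see why r² is read as a percentage of prediction: for a straight line fitted by least squares, r² equals the share of the total variation in y that the line accounts for. The sketch below checks this on invented data (not the age and SBP study, whose raw values are not given); all names are illustrative.

```python
import math

# Hypothetical sample; x is the predictor and y is the outcome.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
s_yy = sum((b - y_bar) ** 2 for b in y)  # total variation in y

r = s_xy / math.sqrt(s_xx * s_yy)

# Least-squares line y = intercept + slope * x.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Variation in y left unexplained by the fitted line.
unexplained = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
explained_share = 1 - unexplained / s_yy

print(round(r ** 2, 3), round(explained_share, 3))  # 0.6 0.6
```

In this made-up sample, x accounts for 60% of the variation in y and the remaining 40% is left to other factors.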
Assumptions of Pearson’s Correlation Coefficient

• There is a linear relationship between the two variables, i.e. when
the two variables are plotted on a scatter diagram, the points lie
approximately along a straight line.

• A cause-and-effect relationship exists between the forces affecting
the items of the two variable series.
Advantages of Pearson’s Coefficient
• It summarizes, in one value, both the degree and the
direction of correlation.
THANK YOU

FOR YOUR ATTENTION!
