
6.6 Correlation & Linear Regression

This document covers the concepts of correlation and linear regression in biostatistics, focusing on Pearson's correlation coefficient and its significance testing. It explains the use of scatter diagrams to visualize relationships between variables and the formulation of simple linear regression models to predict outcomes. Additionally, it discusses the importance of statistical significance and the determination coefficient (R²) in evaluating the predictive power of regression models.

Uploaded by

sergekouassi065

Biostatistics & Research Methodology

2021/22

Block 6 – Lecture 6

Correlation & Linear Regression
Aktham Osama Abdulazeez, MBChB
Part 01
Scatter Diagrams & Correlation
Pearson’s Correlation Coefficient (r)
o It measures the strength of the linear relationship between two interval/ratio scale (continuous) variables, under the assumption that both are normally distributed.
o The value of (r) lies between -1 and +1:
-1: perfect inverse linear correlation
+1: perfect direct linear correlation
0: no linear correlation
o The absolute value of (r) indicates the strength of the relationship (the sign gives only the direction):
<0.2: very weak
0.2 to <0.4: weak
0.4 to <0.7: moderate
0.7 to <0.9: strong
≥0.9: very strong
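As a rough illustration of the definition and cut-offs above, the sketch below computes r from paired values and labels its strength. The function names `pearson_r` and `correlation_strength` are hypothetical helpers introduced here, not part of the lecture, and the sketch assumes the cut-offs apply to the absolute value of r.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample Pearson correlation coefficient for paired values x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products of deviations from the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

def correlation_strength(r):
    """Label |r| with the cut-offs listed in the lecture."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    magnitude = abs(r)  # assumption: the sign only indicates direction
    if magnitude < 0.2:
        return "very weak"
    if magnitude < 0.4:
        return "weak"
    if magnitude < 0.7:
        return "moderate"
    if magnitude < 0.9:
        return "strong"
    return "very strong"
```

For example, `correlation_strength(-0.85)` gives "strong": the relationship is inverse, but its strength is judged from 0.85.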
Testing Significance of (r)
o The (r) value is the sample value of the correlation coefficient.
o The (ρ) (rho) value is the population value of the correlation coefficient.
o The statistical hypotheses:
H0 : ρ = 0 (there is NO linear relationship between X and Y in the population).
HA : ρ ≠ 0
o A t-value can then be calculated for (r) and tested against a t distribution with df = n-2.

A larger absolute (r) and a bigger sample size are associated with a higher calculated t-value (in absolute terms) and thus a higher probability of being statistically significant.
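The slides do not show the formula for the t-value. The standard one is t = r·√(n-2) / √(1-r²), and the sketch below uses it to show how the same r gives a larger t with a larger sample. The helper name `t_for_r` and the values of r and n are illustrative assumptions, not lecture data.

```python
from math import sqrt

def t_for_r(r, n):
    """Return (t, df) for testing H0: rho = 0 given sample r and sample size n."""
    if n < 3:
        raise ValueError("need at least 3 pairs so that df = n - 2 is positive")
    if abs(r) >= 1.0:
        raise ValueError("t is undefined for a perfect correlation (|r| = 1)")
    df = n - 2
    t = r * sqrt(df) / sqrt(1.0 - r * r)
    return t, df

# Illustrative values only: the same r tested in two sample sizes.
t_small, df_small = t_for_r(0.5, 10)
t_large, df_large = t_for_r(0.5, 25)
print(f"n=10: t={t_small:.3f}, df={df_small}")
print(f"n=25: t={t_large:.3f}, df={df_large}")
```

The calculated t is then compared with the t distribution at the stated df, by table or software, to obtain the P-value.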
Scatter Diagrams & Correlation
o A scatter diagram is a graphic device used to visually summarize the relationship between two interval/ratio variables.
o The X-axis represents the independent variable.
o The Y-axis represents the dependent variable.
o In a correlation model it is not important to know which variable is dependent and which is independent, while in a regression model this distinction is crucial.
o The closer the dots (each representing the pair of observations for one study subject) lie to the regression (best-fit) line, the stronger the linear correlation.
Example: Correlation
o Systolic blood pressure readings (mmHg) by two methods in 25 patients with essential hypertension.
o Assess for the presence of a relationship between the two methods of BP measurement using SPSS.
Part 02
Simple Linear Regression
Simple Linear Regression Model

o The independent variable (x) is pre-selected and is called a non-random or mathematical variable.
o For each value of x there is a set of normally distributed values of Y.
Simple Linear Regression Model

o The least-squares line equation (summarizes the relationship between x & Y):
Y = a + bx
o Where:
a = intercept: the point where the line crosses the vertical axis (i.e., the value of Y when x = 0)
b = slope: the amount by which Y changes for each one-unit change in x
x = independent variable
Y = dependent variable
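The intercept and slope defined above can be estimated with the usual least-squares formulas, b = Sxy / Sxx and a = mean(Y) - b·mean(x). This is a minimal sketch: `fit_line` and `predict` are hypothetical helper names, and the numbers in the usage lines are made up for illustration.

```python
def fit_line(x, y):
    """Return (a, b) of the least-squares line Y = a + b*x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must take at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx            # slope: change in Y per one-unit change in x
    a = mean_y - b * mean_x  # intercept: value of Y when x = 0
    return a, b

def predict(a, b, x_new):
    """Estimate Y for a given x from the fitted line."""
    return a + b * x_new

# Made-up points that lie exactly on Y = 1 + 2x
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b, predict(a, b, 10))
```

Predictions are most trustworthy within the range of x values used to fit the line.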
Simple Linear Regression Model

• Helpful in:
• Ascertaining the probable form of the relationship between variables.
• Predicting or estimating the value of one variable corresponding to a given value of another variable.
• Quantifying, in another way, the strength of association between 2 interval/ratio variables under the assumption of normal distribution: the larger the absolute value of b (the "regression coefficient"), the stronger the effect of x on the value of Y (dose-response relationship).
Simple Linear Regression Model
o b = regression coefficient of the sample
o β = regression coefficient of the population

o H0 : β = 0
o HA : β ≠ 0

o The calculated regression coefficient (b, the slope) is tested for statistical significance by a t-test against the null hypothesis of β = 0 at the population level.
o The overall regression equation is tested for statistical significance by ANOVA.
o The model should be statistically significant before we can generalize the results to the reference population.
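The two tests above can be sketched with the standard textbook formulas, which the slides do not spell out: t = b / SE(b) with SE(b) = √(MSE / Sxx) and df = n-2, and the ANOVA statistic F = MSR / MSE. With a single predictor, F equals t². The helper name `slope_tests` and the data are illustrative assumptions.

```python
from math import sqrt

def slope_tests(x, y):
    """Return (b, t, F, df) for a simple linear regression of y on x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    fitted = [a + b * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
    ssr = sum((fi - mean_y) ** 2 for fi in fitted)          # regression sum of squares
    df = n - 2
    mse = sse / df
    if mse == 0:
        raise ValueError("perfect fit: t and F are undefined")
    t = b / sqrt(mse / sxx)  # t-test of H0: beta = 0
    f = ssr / mse            # ANOVA F with (1, n - 2) degrees of freedom
    return b, t, f, df

# Made-up data for illustration
b, t, f, df = slope_tests([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(f"b={b:.3f}, t={t:.2f}, F={f:.1f}, df={df}")
```

Because F = t² here, the t-test of the slope and the ANOVA of the overall equation lead to the same conclusion in simple linear regression.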
Simple Linear Regression – Power of Prediction

o The overall predictive power of the model is measured by R² (the coefficient of determination), which equals the square of r (the linear correlation coefficient).
o It measures the proportion of the observed variation in the response variable that is explained by the regression model.
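Both statements above can be checked numerically. The sketch computes R² as the explained share of variation, 1 - SSE/SST, and compares it with the square of r. `r_squared` is a hypothetical helper name and the data are made up for illustration.

```python
from math import sqrt

def r_squared(x, y):
    """Return (R2 from the fitted line, square of Pearson r) for paired data."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)  # total variation in Y (SST)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained variation
    r2_model = 1.0 - sse / syy
    r = sxy / sqrt(sxx * syy)
    return r2_model, r * r

# Made-up data for illustration
r2_model, r_sq = r_squared([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(f"R2 from model = {r2_model:.4f}, r squared = {r_sq:.4f}")
```

The two values agree up to rounding, which holds only for a model with a single predictor.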
Example: Regression
To evaluate the performance of a new test, an experiment was
done on 11 patients with paired measurements of scores
obtained on the new test and a standardized test. The results
are shown in the table.

Use the new test scores as predictors in establishing a linear regression model between the 2 tests using SPSS.
Thank You
Questions? Ask in the group or note it down for our next
live session!
Hope to see you next time ☺