Porchelvan Correlation SMC

The document covers correlation and regression analysis, explaining the concepts, calculations, and significance of these statistical methods. It provides examples, including the relationship between blood loss during surgery and mean systolic blood pressure, as well as gestational age and abdominal circumference. The document also includes calculations for correlation coefficients, regression lines, and confidence intervals to assess statistical significance.

CORRELATION & REGRESSION

Dr.S.Porchelvan
Prof. of Biostatistics

06/04/25 Dept. of Biostatistics 1


Saveetha Medical College
CORRELATION
• A measure of association between two variables observed on the same subjects.
• Scatter diagram
• Pearson’s coefficient of correlation: strength of association
• The value of the correlation coefficient can vary from −1.0 to +1.0

$$ r = \frac{\frac{1}{N}\sum XY - \bar{X}\,\bar{Y}}{S_X\,S_Y} $$

r value            Degree of association

0.0                No association
±0.01 to ±0.2      Negligible
±0.2 to ±0.4       Weak
±0.4 to ±0.7       Moderate
±0.7 to ±1.0       Strong
±1.0               Perfect
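The verbal scale above can be written as a small lookup. This is only a sketch: the function name `degree_of_association` is made up for illustration, and because adjacent bands in the table share their end points (0.2, 0.4, 0.7), the code assumes a boundary value belongs to the stronger band.

```python
def degree_of_association(r: float) -> str:
    """Return the verbal label for a correlation coefficient r."""
    a = abs(r)  # the sign gives direction, the magnitude gives strength
    if a > 1.0:
        raise ValueError("r must lie between -1 and +1")
    if a == 0.0:
        return "No association"
    if a == 1.0:
        return "Perfect"
    # Assumption: a shared boundary value is placed in the stronger band.
    if a < 0.2:
        return "Negligible"
    if a < 0.4:
        return "Weak"
    if a < 0.7:
        return "Moderate"
    return "Strong"


for value in (0.0, -0.15, 0.35, -0.55, 0.84, 1.0):
    print(f"{value:+.2f}  {degree_of_association(value)}")
```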
Example :
The amount of blood loss (ml) and mean systolic Blood pressure
(mmHg ) during surgery were recorded for 15 patients. Prepare a
scatter diagram and compute “r”.

Mean SBP (mmHg), X:

93, 88, 123, 103, 108, 103, 88, 88, 78, 108, 88, 103, 88, 138, 108

Blood loss (ml), Y:

112, 98, 150, 115, 129, 148, 96, 93, 85, 116, 96, 112, 93, 156, 112

To see how the values are dispersed, we first plot the scatter (correlation) diagram with mean SBP on the X axis and blood loss on the Y axis.

[Figure: Scatter plot of blood loss (ml) during surgery (Y axis) against mean SBP (mmHg) (X axis)]

$$ r = \frac{\frac{1}{N}\sum XY - \bar{X}\,\bar{Y}}{S_X\,S_Y} $$
No      Mean SBP (X)   Blood loss (Y)   X*Y       (X − X̄)²    (Y − Ȳ)²
1       93             112              10416     53.73       4.28
2       88             98               8624      152.03      258.24
3       123            150              18450     513.93      1290.96
4       103            115              11845     7.13        0.86
5       108            129              13932     58.83       222.90
6       103            148              15244     7.13        1151.24
7       88             96               8448      152.03      326.52
8       88             93               8184      152.03      443.94
9       78             85               6630      498.63      845.06
10      108            116              12528     58.83       3.72
11      88             96               8448      152.03      326.52
12      103            112              11536     7.13        4.28
13      88             93               8184      152.03      443.94
14      138            156              21528     1419.03     1758.12
15      108            112              12096     58.83       4.28

Total   1505           1711             176093    3443.35     7084.86

$$ \bar{X} = \frac{\sum X_i}{N} = \frac{1505}{15} = 100.33 $$

$$ \bar{Y} = \frac{\sum Y_i}{N} = \frac{1711}{15} = 114.07 $$

The formula for r divides the sum of cross-products by N, so the standard deviations must use the same divisor N:

$$ S_X = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N}} = 15.15 $$

$$ S_Y = \sqrt{\frac{\sum (Y_i - \bar{Y})^2}{N}} = 21.73 $$

$$ r = \frac{\frac{1}{15}(176093) - (100.33)(114.07)}{(15.15)(21.73)} = \frac{294.89}{329.21} = +0.896 \approx +0.90 $$

The value of r shows a strong positive association between mean SBP during surgery and blood loss, with a magnitude of about 0.90.

( I ) Calculation of 95% CI

The 95% CI for r is r ± t × SE(r), where

$$ SE(r) = \sqrt{\frac{1 - r^2}{n - 2}} $$

$$ 0.896 \pm t_{0.05}(13)\,\sqrt{\frac{1 - 0.896^2}{15 - 2}} $$

$$ 0.896 \pm 2.16\,(0.1232) $$

0.630 to 1.162. Since r cannot exceed 1, the interval is reported as 0.63 to 1.00.

The interval does not include zero, so the correlation between systolic blood pressure during surgery and blood loss is statistically significant (has not occurred by chance / sampling error).

( II ) Student’s t-test

$$ t = \frac{r}{\sqrt{(1 - r^2)/(n - 2)}} = \frac{0.896}{\sqrt{(1 - 0.896^2)/(15 - 2)}} = \frac{0.896}{0.1232} = 7.27 $$

From the table, $t_{0.001}(13) = 4.221$.

The calculated value 7.27 is larger than the table value. The positive association between mean systolic blood pressure during surgery and blood loss is statistically significant (has not occurred by chance / sampling error).
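The worked example can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The helper name `pearson_r` is illustrative, and the critical value 2.16 is simply the table value quoted on the slide, not computed here. The covariance and both standard deviations must use the same divisor: mixing a 1/N covariance with n − 1 standard deviations shrinks r by the factor (n − 1)/n.

```python
import math

# Data from the example: mean SBP (mmHg) and blood loss (ml) for 15 patients
sbp = [93, 88, 123, 103, 108, 103, 88, 88, 78, 108, 88, 103, 88, 138, 108]
blood_loss = [112, 98, 150, 115, 129, 148, 96, 93, 85, 116, 96, 112, 93, 156, 112]


def pearson_r(x, y):
    """Pearson's r using divisor N for the covariance and both SDs."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (sd_x * sd_y)


n = len(sbp)
r = pearson_r(sbp, blood_loss)
se_r = math.sqrt((1 - r ** 2) / (n - 2))  # standard error of r
t_stat = r / se_r                         # Student's t with n - 2 df

T_CRIT_05 = 2.16  # two-sided 5% point of t with 13 df, taken from the slide
lower = r - T_CRIT_05 * se_r
upper = min(1.0, r + T_CRIT_05 * se_r)    # r cannot exceed 1

print(f"r = {r:.3f}, SE(r) = {se_r:.4f}, t = {t_stat:.2f}")
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```

Small differences from a hand calculation are expected, because the script carries the means and r at full precision instead of rounding them to two or three decimals.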

REGRESSION
• A variable depends on one or more other variables,
  e.g. diastolic blood pressure depends on age.
• Regression coefficients are used to measure the association.
• A regression coefficient measures the mean change to be expected in the dependent variable (Y) for a unit change in the value of the independent variable (X).
The regression line of Y on X is given by

$$ Y - \bar{Y} = r\,\frac{S_Y}{S_X}\,(X - \bar{X}) $$

where Y is regressed on X and the line passes through the point $(\bar{X}, \bar{Y})$. Rearranging, we arrive at the equation of a straight line of the form

$$ Y = \alpha + \beta X $$

where α is the constant (intercept), and

$$ \beta = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} $$

is the regression coefficient (slope).
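As an illustration of the regression coefficient of Y on X, the sketch below fits a line to a tiny made-up data set lying exactly on Y = 1 + 2X. Both the function name `fit_line` and the data are hypothetical and serve only to show that the sum of cross-products is divided by the sum of squares of X.

```python
def fit_line(x, y):
    """Least-squares line of Y on X: returns (alpha, beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares of X about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    beta = sxy / sxx                  # regression coefficient (slope)
    alpha = mean_y - beta * mean_x    # the line passes through the two means
    return alpha, beta


# Hypothetical data lying exactly on Y = 1 + 2X
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
alpha, beta = fit_line(x, y)
print(f"Y = {alpha:.2f} + {beta:.2f} X")
```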

Example:
The gestational age (weeks) and the abdominal circumference (cm) were recorded for 54 antenatal mothers. Prepare a scatter diagram, compute 'r', fit a regression line and test the regression coefficient.

GA = gestational age in weeks (X); AC = abdominal circumference in cm (Y)

No    GA (wks)   AC (cm)    GA (wks)   AC (cm)    GA (wks)   AC (cm)    GA (wks)   AC (cm)
1     12.22      56.00      18.28      72.00      25.89      96.00      32.78      112.00
2     12.28      59.00      19.21      80.00      26.78      99.00      33.44      111.00
3     12.42      54.00      19.56      81.00      26.45      93.00      33.58      123.00
4     13.52      60.00      20.29      80.00      27.58      95.00      34.85      104.00
5     13.45      58.00      20.45      81.00      27.55      99.00      35.71      99.00
6     14.71      62.00      21.36      82.00      28.66      92.00      35.57      122.00
7     14.58      62.00      21.56      81.00      28.65      99.00      36.42      120.00
8     15.85      64.00      22.34      84.00      29.32      86.00      36.14      128.00
9     15.36      65.00      22.85      86.00      29.58      89.00      37.56      131.00
10    16.27      65.00      23.65      88.00      30.78      102.00     37.98      135.00
11    16.65      66.00      23.29      88.00      30.88      98.00      38.29      138.00
12    17.16      68.00      24.54      88.00      31.45      106.00     38.44      142.00
13    17.24      70.00      24.36      89.00      31.54      110.00
14    18.81      67.00      25.89      86.00      32.85      107.00

Gestational age in weeks = number of days of gestation / 7

For this example,

$$ \bar{X} = 25.12 \text{ wks}, \quad S_X = 8.01, \quad \sum (X - \bar{X})^2 = 3464.64 $$

$$ \bar{Y} = 90.33 \text{ cm}, \quad S_Y = 22.81, \quad \sum (Y - \bar{Y})^2 = 28095.98 $$

Fig. Abdominal circumference (cm) and gestational age (weeks): scatter plot of abdominal circumference in cm (Y axis) against gestational age in weeks (X axis), r = 0.961

We shall frame a regression line of Y on X as follows:

$$ Y - \bar{Y} = r\,\frac{S_Y}{S_X}\,(X - \bar{X}) $$

$$ Y - 90.33 = (0.961)\,\frac{22.81}{8.01}\,(X - 25.12) $$

$$ Y - 90.33 = 2.7366\,(X - 25.12) $$

$$ Y - 90.33 = 2.7366\,X - 68.7433 $$

$$ Y = 2.7366\,X - 68.7433 + 90.33 $$

$$ Y = 2.74\,X + 21.5867 $$

$$ Y = 21.59 + 2.74\,X $$

This is of the form of a straight line $Y = \alpha + \beta X$, where α = 21.59 and β = 2.74.

And

$$ SE(\beta) = \sqrt{\frac{1}{n - 2}\left\{\frac{\sum (Y - \bar{Y})^2}{\sum (X - \bar{X})^2} - \beta^2\right\}} $$

$$ SE(\beta) = \sqrt{\frac{1}{54 - 2}\left\{\frac{28095.98}{3464.64} - (2.74)^2\right\}} $$

$$ SE(\beta) = \sqrt{\frac{1}{52}\,\{8.1093 - 7.5076\}} $$

$$ SE(\beta) = \sqrt{\frac{1}{52}\,\{0.6017\}} $$

$$ SE(\beta) = \sqrt{0.01157} \approx 0.1075 $$
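The arithmetic of this slide can be retraced from the summary statistics alone. The sketch below is not the author's code, and the variable names are illustrative. It follows the slide in rounding β to 2.74 before squaring. Because the two terms inside the braces are close in size, the standard error is somewhat sensitive to that rounding.

```python
import math

# Summary statistics quoted on the slides for the 54 mothers
n = 54
r = 0.961
mean_x, sd_x, sxx = 25.12, 8.01, 3464.64     # gestational age (weeks)
mean_y, sd_y, syy = 90.33, 22.81, 28095.98   # abdominal circumference (cm)

slope = r * sd_y / sd_x                 # regression coefficient of Y on X
intercept = mean_y - slope * mean_x     # line passes through (mean_x, mean_y)

# Assumption: beta is rounded to two decimals first, as on the slide.
beta = round(slope, 2)
se_beta = math.sqrt((syy / sxx - beta ** 2) / (n - 2))

print(f"slope = {slope:.4f}, intercept = {intercept:.2f}")
print(f"SE(beta) = {se_beta:.4f}")
```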
Fig. Abdominal circumference (cm) and gestational age (weeks) with regression line

Calculation of 95% CI
The 95% CI for β is β ± t × SE(β)

$$ 2.74 \pm t_{0.05}\,(0.1075) $$

$$ 2.74 \pm 2.00\,(0.1075) $$

$$ 2.74 \pm 0.2150 $$

(2.74 − 0.2150) to (2.74 + 0.2150)
i.e. the 95% CI for β is 2.5250 to 2.9550

$$ t = \frac{\beta}{SE(\beta)} = \frac{2.74}{0.1075} = 25.49 $$

From the table, $t_{0.05}(52) = 2.00$, $t_{0.01}(52) = 2.65$

β is significant at p < 0.00001

Abdominal Circumference = 21.59 + 2.74 × Gestational Age
For GA = 14 wks, AC = 21.59 + (2.74)(14) = 59.95 cm
For GA = 15 wks, AC = 21.59 + (2.74)(15) = 62.69 cm
For GA = 16 wks, AC = 21.59 + (2.74)(16) = 65.43 cm

Inference
The 95% confidence interval does not include zero (0), so the regression coefficient is statistically significant (has not occurred by chance / sampling error), i.e. for every one-week increase in gestational age, the abdominal circumference increases by 2.74 cm on average.
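The interval, the t statistic and the predictions can be reproduced with a short script. This is a sketch under the slide's own numbers. The constants below are copied from the slides (the critical value 2.00 is the quoted table value), and `predict_ac` is a hypothetical helper name.

```python
# Values taken from the slides
ALPHA = 21.59      # intercept (cm)
BETA = 2.74        # regression coefficient (cm per week)
SE_BETA = 0.1075   # standard error of the regression coefficient
T_CRIT_05 = 2.00   # two-sided 5% table value of t with 52 df

margin = T_CRIT_05 * SE_BETA
ci_lower, ci_upper = BETA - margin, BETA + margin
t_stat = BETA / SE_BETA


def predict_ac(ga_weeks: float) -> float:
    """Predicted abdominal circumference (cm) at a given gestational age."""
    return ALPHA + BETA * ga_weeks


print(f"95% CI for beta: {ci_lower:.4f} to {ci_upper:.4f}, t = {t_stat:.2f}")
for ga in (14, 15, 16):
    print(f"GA = {ga} wks -> AC = {predict_ac(ga):.2f} cm")
```

The fitted line describes the observed range of gestational ages (about 12 to 38 weeks), so predictions outside that range should be treated with caution.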
Models for Regression
• Multiple Regression
• Logistic Regression
What lifestyle characteristics are risk
factors for coronary heart disease (CHD)?
Given a sample of patients measured on
smoking status, diet, exercise, alcohol
use, and CHD status, you could build a
model using the four lifestyle variables
to predict the presence or absence of CHD.
The model can then be used to derive
estimates of the odds ratios for each factor
to tell you, for example, how much higher
the odds of developing CHD are for smokers
than for nonsmokers.
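To make the idea of an odds ratio concrete, the sketch below computes one from a 2 × 2 table. The counts are entirely hypothetical and are not taken from any study. For a logistic model with a single yes/no predictor, the fitted coefficient equals the natural log of this odds ratio, which is why exponentiating a logistic coefficient gives an odds ratio.

```python
import math

# Hypothetical counts, for illustration only
smokers_chd, smokers_no_chd = 30, 70
nonsmokers_chd, nonsmokers_no_chd = 10, 90

odds_smokers = smokers_chd / smokers_no_chd            # odds of CHD, smokers
odds_nonsmokers = nonsmokers_chd / nonsmokers_no_chd   # odds of CHD, nonsmokers
odds_ratio = odds_smokers / odds_nonsmokers

# Log odds ratio: the coefficient a single-predictor logistic model would fit
log_odds_ratio = math.log(odds_ratio)

print(f"Odds ratio = {odds_ratio:.2f}, log odds ratio = {log_odds_ratio:.2f}")
```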

Dr.S.Porchelvan
[email protected]
