Logistic Regression-1
Dependent Variable:
$$Y = \begin{cases} 1 & \text{if ``success'' (event)} \\ 0 & \text{if ``failure'' (no event)} \end{cases}$$
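$$P = \Pr(Y = 1 \mid X) = \frac{e^{a + bx}}{1 + e^{a + bx}}$$
$$1 - P = \Pr(Y = 0 \mid X) = 1 - \frac{e^{a + bx}}{1 + e^{a + bx}} = \frac{1}{1 + e^{a + bx}}$$
$$\frac{P}{1 - P} = e^{a + bx} \quad \text{(ODDS)}$$
$$\ln\frac{P}{1 - P} = a + bx \quad \text{(LOG ODDS = LOGIT)}$$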
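The chain from probability to odds to logit can be checked numerically. The sketch below is illustrative only: the function names and the values of a, b and x are hypothetical, not taken from the slides.

```python
import math

def logistic(a, b, x):
    """P = Pr(Y = 1 | X = x) = e^(a+bx) / (1 + e^(a+bx))."""
    z = a + b * x
    return math.exp(z) / (1.0 + math.exp(z))

def odds(p):
    """Odds = P / (1 - P)."""
    return p / (1.0 - p)

def logit(p):
    """Log odds = ln(P / (1 - P))."""
    return math.log(odds(p))

# Hypothetical parameter values, chosen only for illustration.
a, b, x = -2.0, 0.5, 3.0
p = logistic(a, b, x)

print("P      =", p)
print("odds   =", odds(p), "vs e^(a+bx) =", math.exp(a + b * x))
print("logit  =", logit(p), "vs a + bx   =", a + b * x)
```

The logit recovers the linear predictor a + bx, which is why the model is linear on the log-odds scale.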
[Figure: ln(P/(1-P)) and P plotted against X. b = "Slope", a = "Location".]
ODDS RATIOS: BINARY X
E.g.
$$X = \begin{cases} 1 & \text{Exposed} \\ 0 & \text{Unexposed} \end{cases}$$
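FOR EXPOSED:
$$\ln\frac{P_1}{1 - P_1} = a + b x_1 = a + b$$
FOR UNEXPOSED:
$$\ln\frac{P_0}{1 - P_0} = a + b x_0 = a$$
Therefore,
$$\ln\frac{P_1}{1 - P_1} - \ln\frac{P_0}{1 - P_0} = (a + b) - a = b$$
$$\ln\left[\frac{P_1 / (1 - P_1)}{P_0 / (1 - P_0)}\right] = b$$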
Therefore,
b = ln(Odds Ratio)
or
Odds Ratio = $e^{b}$
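A quick numeric check of this result for a binary exposure is sketched below. The values of a and b are hypothetical; the point is only that the odds ratio computed from the two probabilities equals e^b, whatever a is.

```python
import math

def logistic(a, b, x):
    """Pr(Y = 1 | X = x) under the model ln(P/(1-P)) = a + bx."""
    z = a + b * x
    return math.exp(z) / (1.0 + math.exp(z))

# Hypothetical coefficients, for illustration only.
a, b = -1.5, 0.8

p1 = logistic(a, b, 1)  # exposed,   x = 1
p0 = logistic(a, b, 0)  # unexposed, x = 0

# Odds ratio = odds among exposed / odds among unexposed.
odds_ratio = (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))

print("odds ratio      =", odds_ratio)
print("e^b             =", math.exp(b))
print("ln(odds ratio)  =", math.log(odds_ratio), "vs b =", b)
```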
CONTINUOUS X:
$$\Pr(Y = 1 \mid X_1, X_2, \ldots, X_k) = \frac{e^{a + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k}}{1 + e^{a + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k}}$$
$$\ln\frac{P}{1 - P} = a + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$$
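With k predictors the only change is that the linear predictor becomes a sum. A minimal sketch, assuming hypothetical coefficient and covariate values (the name predict_prob is also an illustration, not part of any library):

```python
import math

def predict_prob(a, coefs, xs):
    """Pr(Y = 1 | x_1..x_k) for intercept a and slopes b_1..b_k.

    Uses 1 / (1 + e^(-z)), which is algebraically the same as
    e^z / (1 + e^z) with z = a + b_1*x_1 + ... + b_k*x_k.
    """
    z = a + sum(b_j * x_j for b_j, x_j in zip(coefs, xs))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values: intercept, three slopes, one subject's covariates.
a = -3.0
coefs = [0.05, 0.9, -0.4]
xs = [40.0, 1.0, 2.0]

p = predict_prob(a, coefs, xs)
z = a + sum(b_j * x_j for b_j, x_j in zip(coefs, xs))
print("P =", p)
print("ln(P/(1-P)) =", math.log(p / (1.0 - p)), "vs linear predictor =", z)
```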
[Figure: Coronary Heart Disease (CHD), vertical axis from -0.2 to 1.0, plotted against AGE, horizontal axis from 10 to 70.]
EFFECT OF DATA GROUPING
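$$P(\text{CHD} = \text{yes} \mid \text{Age}) = \frac{e^{\text{constant} + \beta \cdot \text{age}}}{1 + e^{\text{constant} + \beta \cdot \text{age}}}$$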
Likelihood Function:
Probability of the observed data is expressed as a function
of unknown parameters. That is,
$$P(y = 1 \mid X = \text{Age}) = \pi(x) = \pi(\text{Age})$$
$$P(y = 0 \mid X = \text{Age}) = 1 - \pi(x) = 1 - \pi(\text{Age})$$
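For example, at Age = 40:
$$\pi(\text{Age} = 40) = \frac{e^{\text{constant} + \beta \cdot 40}}{1 + e^{\text{constant} + \beta \cdot 40}}$$
and, writing the constant as $\beta_0$ and the age coefficient as $\beta_1$,
$$1 - \pi(\text{Age}) = \frac{1}{1 + e^{\beta_0 + \beta_1 \cdot \text{age}}}$$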
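To make the likelihood idea concrete, the sketch below evaluates the log of the likelihood for a given pair of parameters. It assumes independent observations, so each subject contributes pi(Age) if y = 1 and 1 - pi(Age) if y = 0. The data and parameter values are made up for illustration and are not the CHD data from the slides.

```python
import math

def pi(age, b0, b1):
    """P(y = 1 | Age) = e^(b0 + b1*age) / (1 + e^(b0 + b1*age))."""
    z = b0 + b1 * age
    return math.exp(z) / (1.0 + math.exp(z))

def log_likelihood(ages, ys, b0, b1):
    """Sum over subjects of y*ln(pi) + (1 - y)*ln(1 - pi)."""
    total = 0.0
    for age, y in zip(ages, ys):
        p = pi(age, b0, b1)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

# Made-up data: ages and CHD status (1 = yes, 0 = no).
ages = [25, 40, 40, 55, 65]
chd = [0, 0, 1, 1, 1]

# Two candidate parameter settings (hypothetical values).
ll_null = log_likelihood(ages, chd, 0.0, 0.0)   # every pi = 0.5
ll_age = log_likelihood(ages, chd, -5.0, 0.1)   # risk rising with age

print("log-likelihood, no age effect:", ll_null)
print("log-likelihood, with age effect:", ll_age)
```

Maximum likelihood estimation picks the parameter values that make this quantity as large as possible; here the setting with a positive age effect fits the made-up data better than the setting with none.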
Design variable
Race     D1   D2
White     0    0
Black     1    0
Other     0    1
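The coding in the table can be sketched as a simple lookup, with White as the reference category (both design variables 0). The names DESIGN and encode_race are hypothetical helpers, and the sample list of subjects is made up.

```python
# Design (dummy) variables D1, D2 for a three-level factor,
# following the table: White is the reference level.
DESIGN = {
    "White": (0, 0),
    "Black": (1, 0),
    "Other": (0, 1),
}

def encode_race(race):
    """Return the (D1, D2) pair for one subject."""
    return DESIGN[race]

# Made-up sample of subjects.
races = ["White", "Black", "Other", "Black"]
rows = [encode_race(r) for r in races]

for race, (d1, d2) in zip(races, rows):
    print(f"{race:6s} D1={d1} D2={d2}")
```

A factor with three levels needs only two design variables; the coefficient of D1 then compares Black with White, and the coefficient of D2 compares Other with White.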
Testing for Significance
Significance of the Model:
Assessing the significance of the model means testing the
overall significance of the 4 variables in the model taken together.
However, one or more variables individually may not
be significant.
Log-likelihood = -222.583
Wald statistic = $\hat{\beta} / SE(\hat{\beta})$
We conclude that the variables LWT and possibly Race are
significant at P < 0.05.
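The Wald statistic (estimated coefficient divided by its standard error) can be sketched as follows. This assumes the statistic is referred to a standard normal distribution for a two-sided p-value, which is the usual large-sample convention but is an assumption here. The coefficient and standard error are hypothetical, not the fitted values behind the conclusion above.

```python
import math

def wald_test(beta_hat, se):
    """Return (z, two-sided p-value) for H0: beta = 0.

    z = beta_hat / SE(beta_hat); the p-value uses the standard normal
    distribution via the complementary error function.
    """
    z = beta_hat / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical estimate and standard error, for illustration only.
z, p_value = wald_test(0.5, 0.2)
print("Wald z =", z)
print("two-sided p =", p_value)
print("significant at 0.05:", p_value < 0.05)
```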
USE OF LR COEFFICIENTS FOR
GENERAL COMPARISONS OF RISK
$$OR = e^{\beta_1 (a - b)}$$
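Reading the formula as an odds ratio comparing X = a with X = b, that is e raised to the coefficient times (a - b) (an interpretation of the slide, stated here as an assumption), a small helper makes the calculation explicit. The function name odds_ratio is hypothetical; the coefficient 0.182 is the age coefficient quoted in the birth-defect example in this section.

```python
import math

def odds_ratio(beta1, a, b):
    """Odds ratio comparing x = a with x = b: e^(beta1 * (a - b))."""
    return math.exp(beta1 * (a - b))

beta1 = 0.182  # age coefficient (per year) from the example in this section

# One-year difference in age, e.g. 36 vs 35.
print(round(odds_ratio(beta1, 36, 35), 1))  # 1.2

# Any other difference scales the exponent, e.g. 45 vs 40.
print(odds_ratio(beta1, 45, 40))
```

Because the difference enters through the exponent, the odds ratio for a 5-year difference is the 1-year odds ratio raised to the fifth power, not five times it.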
Example:
Risk of birth defect to mothers, Ages 35 +
$\beta_1 = 0.182$ (Age in years)
$OR = e^{0.182} = 1.2$ (change of 1 year)
For change of 5 years (e.g. 40 vs 45)