Logistic Regression: Psy 524 Ainsworth
Questions
Can the categories be correctly predicted given a set of predictors?
Usually once this is established, the predictors are manipulated to see if the equation can be simplified. Can the solution generalize to predicting new cases? This is assessed by comparing the model with predictors plus intercept to a model with just the intercept, as in the sketch below.
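A minimal sketch of that intercept-only comparison, using statsmodels on simulated data (the variables and true coefficients here are illustrative, not from the lecture); the fitted model's llr and llr_pvalue attributes give the likelihood-ratio test against the intercept-only model:

```python
# Sketch: likelihood-ratio test of a predictor model against an
# intercept-only model, using statsmodels on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=200)
p = 1 / (1 + np.exp(-(-0.5 + 1.2 * x)))   # true logistic model
y = rng.binomial(1, p)

X = sm.add_constant(x)                    # intercept + predictor
fit = sm.Logit(y, X).fit(disp=0)

# llr is 2 * (LL_full - LL_intercept_only), a chi-square statistic
print(fit.llr, fit.llr_pvalue)
```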
Questions
What is the relative importance of each predictor?
How does each variable affect the outcome? Does a predictor make the solution better or worse or have no effect?
Questions
Are there interactions among predictors?
Does adding interactions among predictors (continuous or categorical) improve the model? Continuous predictors should be centered before interaction terms are created, in order to avoid multicollinearity (see the sketch below).
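A minimal sketch of that centering step, using pandas with two hypothetical continuous predictors x1 and x2:

```python
# Sketch: center continuous predictors before forming an interaction
# term, to reduce the correlation between main effects and the product.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(50, 10, 100),
                   "x2": rng.normal(5, 2, 100)})

df["x1_c"] = df["x1"] - df["x1"].mean()
df["x2_c"] = df["x2"] - df["x2"].mean()
df["x1_x2"] = df["x1_c"] * df["x2_c"]     # interaction of centered terms

# Correlation between a main effect and the interaction is far lower
# with centered predictors than with the raw product x1 * x2.
print(df[["x1_c", "x1_x2"]].corr())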
Can parameters be accurately estimated? How good is the model at classifying cases for which the outcome is known?
Questions
What is the prediction equation in the presence of covariates? Can prediction models be tested for relative fit to the data?
So-called goodness-of-fit statistics
What is the strength of association between the outcome variable and a set of predictors?
Often in model comparison you want non-significant differences, so strength of association is reported even for non-significant effects.
Assumptions
The only real limitation on logistic regression is that the outcome must be discrete.
Assumptions
If the distributional assumptions are met, then discriminant function analysis may be more powerful, although it has been shown to overestimate the association when using discrete predictors. If the outcome is continuous, then multiple regression is more powerful, given that its assumptions are met.
Assumptions
Ratio of cases to variables: using discrete variables requires that there are enough responses in every given category.
If there are too many cells with no responses, parameter estimates and standard errors will likely blow up. Empty cells can also make the groups perfectly separable (complete separation), which makes maximum likelihood estimation impossible; a quick screening check is sketched below.
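One way to screen for empty cells is a crosstab of each discrete predictor against the outcome; this sketch uses pandas with made-up column names (group, outcome):

```python
# Sketch: check every discrete-predictor-by-outcome cell for empty
# counts before fitting; a zero cell warns of inflated estimates and
# standard errors, and possibly complete separation.
import pandas as pd

df = pd.DataFrame({"group":   ["a", "a", "b", "b", "c", "c"],
                   "outcome": [0, 1, 0, 1, 1, 1]})

table = pd.crosstab(df["group"], df["outcome"])
print(table)
print((table == 0).any().any())   # True flags at least one empty cell
```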
Assumptions
Linearity in the logit: the regression equation should have a linear relationship with the logit form of the DV. There is no assumption that the predictors are linearly related to each other.
Assumptions
Absence of multicollinearity. No outliers. Independence of errors: this assumes a between-subjects design; other forms of the model exist for within-subjects designs.
Background
Odds are related to probability and are usually written in "X to Y" form: odds of 4 to 1 against an outcome mean 1 success in every five tries, equivalent to a .20 probability, a 20% chance, etc.
The problem with probabilities is that they are non-linear: going from .10 to .20 doubles the probability, but going from .80 to .90 barely increases it in relative terms.
Background
Odds ratio: the ratio of the probability over 1 minus the probability, P/(1 − P); the probability of winning over the probability of losing. A .20 probability of winning (4 to 1 odds against) equates to an odds ratio of .20/.80 = .25.
Background
Logit: this is the natural log of an odds ratio; it is often called a log odds even though it is really a log odds ratio. The logit scale is linear and functions much like a z-score scale.
Background
Logits are continuous, like z-scores: if p = .50, then logit = 0; if p = .70, then logit = 0.84; if p = .30, then logit = −0.84.
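A short sketch reproducing these conversions; the logit function below is written directly from the definition above (the slide's 0.84 is the same value, 0.847, truncated):

```python
# Sketch: probability -> logit, reproducing the slide's examples.
import math

def logit(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1 - p))

for p in (0.50, 0.70, 0.30):
    print(p, round(logit(p), 2))   # 0.0, 0.85, -0.85 (slide shows 0.84)
```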
Mean(Y) = P, the observed proportion of successes. Var(Y) = PQ, which is maximized when P = .50; the variance depends on the mean (P). Xj = any type of predictor: continuous, dichotomous, or polytomous.
Y|X = B0 + B1X1 + e
and it is assumed that errors are normally distributed, with mean=0 and constant variance (i.e., homogeneity of variance)
E(Y|X) = B0 + B1X1
an expected value is a mean, so
The predicted value equals the proportion of observations for which Y = 1 at a given X; P is the probability of Y = 1 (a success) given X, and Q = 1 − P is the probability of a failure given X.
E(Y|X) = P(Y = 1|X) = P
Ŷi = e^u / (1 + e^u)
where Ŷi is the estimated probability that the ith case is in a category and u is the regular linear regression equation:
u = A + B1X1 + B2X2 + … + BKXK
With a single predictor this is:
Ŷi = e^(b0 + b1X1) / (1 + e^(b0 + b1X1))
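A minimal sketch of this one-predictor logistic function; the b0 and b1 values match the middle curve (v2) in the figure that follows:

```python
# Sketch: the logistic function with one predictor. u is the linear
# predictor; the output is the estimated probability that Y = 1.
import math

def predicted_probability(b0, b1, x):
    u = b0 + b1 * x                          # linear regression part
    return math.exp(u) / (1 + math.exp(u))

print(predicted_probability(-4.0, 0.05, 80))   # u = 0, so p = 0.5
```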
Logistic Function
Constant regression constant, different slopes:
v2: b0 = −4.00, b1 = 0.05 (middle); v3: b0 = −4.00, b1 = 0.15 (top); v4: b0 = −4.00, b1 = 0.025 (bottom)
[Figure: three logistic curves with the same constant and different slopes, plotted against V1 from 30 to 100; probability axis from 0 to 1.]
Logistic Function
Constant slopes with different regression constants:
v2: b0 = −3.00, b1 = 0.05 (top); v3: b0 = −4.00, b1 = 0.05 (middle); v4: b0 = −5.00, b1 = 0.05 (bottom)
[Figure: three logistic curves with the same slope and different constants, plotted against V1 from 30 to 100; probability axis from 0 to 1.]
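Both figures can be redrawn from the listed parameters; this matplotlib sketch plots each curve as e^(b0 + b1·x) / (1 + e^(b0 + b1·x)) over the same 30 to 100 range:

```python
# Sketch: redraw the two "Logistic Function" figures from their
# listed b0/b1 parameters.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(30, 100, 200)

def curve(b0, b1):
    u = b0 + b1 * x
    return np.exp(u) / (1 + np.exp(u))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
for b1 in (0.05, 0.15, 0.025):          # constant b0, varying slope
    ax1.plot(x, curve(-4.0, b1), label=f"b1 = {b1}")
for b0 in (-3.0, -4.0, -5.0):           # varying b0, constant slope
    ax2.plot(x, curve(b0, 0.05), label=f"b0 = {b0}")
ax1.set_title("Same constant, different slopes")
ax2.set_title("Same slope, different constants")
ax1.legend(); ax2.legend()
plt.show()
```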
The Logit
By algebraic manipulation, the logistic regression equation can be written in terms of an odds ratio for success:
Ŷ / (1 − Ŷ) = e^u = e^(A + B1X1 + … + BKXK)
The Logit
Odds ratios range from 0 to positive infinity. P/Q is an odds ratio: a value less than 1 means less than a .50 probability, and a value greater than 1 means greater than a .50 probability.
The Logit
Finally, taking the natural log of both sides, we can write the equation in terms of logits (log-odds):
ln(Ŷ / (1 − Ŷ)) = A + B1X1 + … + BKXK
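A quick numeric check of this identity: pushing the linear predictor through the logistic function and then taking the log of the resulting odds recovers u exactly (b0 and b1 reuse the earlier figure values):

```python
# Sketch: log of the odds recovers the linear predictor u.
import math

b0, b1, x = -4.0, 0.05, 70
u = b0 + b1 * x                          # -0.5
p = math.exp(u) / (1 + math.exp(u))      # logistic function
print(math.log(p / (1 - p)))             # -0.5, the linear predictor back
```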
The Logit
Log-odds are a linear function of the predictors The regression coefficients go back to their old interpretation (kind of)
A: the expected value of the logit (log-odds) when X = 0. B: called a logit difference; the amount the logit (log-odds) changes with a one-unit change in X, i.e., the amount the logit changes in going from X to X + 1 (see the sketch below).
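A small sketch of the logit-difference interpretation: moving from X to X + 1 shifts the log-odds by exactly b1, which multiplies the odds by e^b1 (the b0/b1 values here are illustrative):

```python
# Sketch: one-unit change in X shifts the logit by b1 and
# multiplies the odds by e^b1.
import math

b0, b1 = -4.0, 0.05

def log_odds(x):
    return b0 + b1 * x

print(log_odds(71) - log_odds(70))         # 0.05 = b1, the logit difference
print(math.exp(log_odds(71)) / math.exp(log_odds(70)))   # = e^0.05
print(math.exp(b1))                        # same value: the odds multiplier
```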
Conversion
exp(logit) = odds ratio. Probability = odds ratio / (1 + odds ratio).
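A sketch of this conversion chain, logit to odds to probability, reusing the p = .70 logit from the earlier slide:

```python
# Sketch: convert a logit back to an odds ratio and a probability.
import math

logit = 0.84
odds = math.exp(logit)          # exp(logit) = odds ratio
p = odds / (1 + odds)           # probability = odds / (1 + odds)
print(odds, p)                  # about 2.32 and 0.70
```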