
Logistic Regression: Interaction Terms

1) β3 represents the interaction between two predictors in a logistic regression model. It quantifies how the effect of one predictor depends on the value of the other. 2) β3 can be interpreted as either the difference in the effect of one predictor between two levels of the other predictor, or vice versa. 3) Lookup tables can be used to determine the interpretation of each coefficient based on the values of the predictors in different observation groups.

Interactions in Logistic Regression

• For linear regression with predictors X1 and X2, we saw that an interaction model is one in which the interpretation of the effect of X1 depends on the value of X2, and vice versa.

• Exactly the same is true for logistic regression.

• The simplest interaction model includes a predictor variable formed by multiplying two ordinary predictors:

logit(P(Y = 1)) = β0 + β1 × X1 + β2 × X2 + β3 × X1 × X2

• The final term, β3 × X1 × X2, is the interaction term.

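To make the dependence concrete, here is a minimal Python sketch of the linear predictor logit(P(Y = 1)) = β0 + β1 × X1 + β2 × X2 + β3 × X1 × X2. The coefficient values and the helper names `logit_p` and `effect_of_x1` are made up for illustration; they do not come from any fitted model.

```python
def logit_p(x1, x2, b0, b1, b2, b3):
    """Log-odds from the interaction model b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


# Hypothetical coefficients, chosen only for illustration.
b0, b1, b2, b3 = -2.0, 0.5, 0.25, 0.125


def effect_of_x1(x2):
    """Change in log-odds when X1 goes from 0 to 1 with X2 held fixed."""
    return logit_p(1, x2, b0, b1, b2, b3) - logit_p(0, x2, b0, b1, b2, b3)


effect_at_x2_0 = effect_of_x1(0)  # equals b1
effect_at_x2_4 = effect_of_x1(4)  # equals b1 + 4 * b3
```

The effect of X1 is β1 + β3 × X2, so it changes with X2 whenever β3 is non-zero.
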
We will look at the interpretation of interactions in 3 cases:

1. Interaction between two dummy variables.
2. Interaction between a dummy and a continuous variable.
3. Interaction between two continuous variables.

Interaction Between 2 Dummy Variables

• Consider a logistic model for the risk of suffering a heart attack over a year in terms of gender and smoking status:

logit P(Y = 1) = β0 + β1 sex + β2 smoke + β3 (sex × smoke)

• sex indicates gender (male=1, female=0).

• smoke indicates smoking status (smokes=1, does not=0).

Interpreting the Intercept

logit P(Y = 1) = β0 + β1 sex + β2 smoke + β3 (sex × smoke)

• In order to interpret β0 we need to find a situation in which the final three terms in the equation vanish.

• This happens when an observation corresponds to a female non-smoker, for then sex=0 and smoke=0:

logit P(Y = 1) = β0 + β1 × 0 + β2 × 0 + β3 (0 × 0)
               = β0

• Consequently, β0 is the log-odds in favour of a female non-smoker suffering a heart attack.

Interpretations of Other Quantities Involving β0

We can also give interpretations on the odds scale and on the probability scale:

• exp(β0) is the odds in favour of a female non-smoker suffering a heart attack.
• exp(β0) / (1 + exp(β0)) is the probability of a female non-smoker suffering a heart attack.

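The three scales can be sketched in Python as follows. The value of `b0` and the function names are hypothetical, used only to illustrate the conversions exp(β0) and exp(β0) / (1 + exp(β0)).

```python
import math


def odds_from_log_odds(log_odds):
    """Odds in favour of the event: exp(log-odds)."""
    return math.exp(log_odds)


def prob_from_log_odds(log_odds):
    """Probability of the event: exp(log-odds) / (1 + exp(log-odds))."""
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


# Hypothetical intercept: log-odds for a female non-smoker.
b0 = -3.0
odds = odds_from_log_odds(b0)
prob = prob_from_log_odds(b0)
```

A log-odds of 0 corresponds to odds of 1 and a probability of 0.5; negative log-odds give probabilities below 0.5.
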
Interpreting β1 and β2

logit P(Y = 1) = β0 + β1 sex + β2 smoke + β3 (sex × smoke)

• We would know how to interpret β1 if the interaction term were not there, since in that case we would just have an ordinary multivariate logistic model.

• The interaction term vanishes when an observation corresponds to a non-smoker, for then smoke=0:

logit P(Y = 1) = β0 + β1 × sex + β2 × 0 + β3 (sex × 0)
               = β0 + β1 × sex

• Amongst non-smokers:

logit P(Y = 1) = β0 + β1 × sex

• We know how to interpret β1 in this case, as it is a univariate logistic model.

• β1 is the log-odds ratio comparing males and females amongst non-smokers.

• exp(β1) is the odds ratio comparing males and females amongst non-smokers.

logit P(Y = 1) = β0 + β1 sex + β2 smoke + β3 (sex × smoke)

• To interpret β2 we need to get rid of the interaction term without getting rid of the β2 smoke term.

• The same argument as before applies, but now we set sex=0 (female):

logit P(Y = 1) = β0 + β1 × 0 + β2 × smoke + β3 (0 × smoke)
               = β0 + β2 × smoke

• β2 is the log-odds ratio comparing smokers with non-smokers amongst females.

Interpreting β3

logit P(Y = 1) = β0 + β1 sex + β2 smoke + β3 (sex × smoke)

• To interpret β3, rewrite the regression equation:

logit P(Y = 1) = β0 + [β1 + β3 smoke] sex + β2 smoke

• This looks like a multivariate regression model with sex and smoke as predictors, where:
  • β1 + β3 smoke is the log-odds ratio for males vs. females at a given smoking status: β1 amongst non-smokers (smoke=0) and β1 + β3 amongst smokers (smoke=1);
  • β2 is the log-odds ratio for smokers vs. non-smokers amongst females (sex=0).

• β3 is the difference between the log-odds ratio comparing males vs. females in smokers and the log-odds ratio comparing males vs. females in non-smokers.

logit P(Y = 1) = β0 + β1 sex + β2 smoke + β3 (sex × smoke)

• We could just as well have rewritten the equation this way:

logit P(Y = 1) = β0 + β1 sex + [β2 + β3 sex] smoke

• β3 is the difference between the log-odds ratio comparing smokers vs. non-smokers in males and the log-odds ratio comparing smokers vs. non-smokers in females.

• So we have two ways of thinking about β3:
  1. either as the modification of the effect of smoke by sex,
  2. or as the modification of the effect of sex by smoke.

Quick Lookup Table

We can draw up a table for the 4 types of observation:

Row  sex     smoke  logit(P(Y = 1))
1    Male    Yes    β0 + β1 + β2 + β3
2    Male    No     β0 + β1
3    Female  Yes    β0 + β2
4    Female  No     β0

• This allows us to find the function of the parameters corresponding to a log-odds ratio and vice versa.
• e.g. row 1 - row 2 shows us that the log-odds ratio for smokers vs. non-smokers amongst males is β2 + β3.
• e.g. row 3 - row 4 shows us that the log-odds ratio for smokers vs. non-smokers amongst females is β2.

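The lookup table for sex and smoke can be reproduced with a short Python sketch. The coefficient values are hypothetical and the names (`logit_p`, `rows`) are illustrative only; the point is that subtracting rows recovers the stated functions of the parameters.

```python
# Hypothetical coefficients, chosen only for illustration.
b0, b1, b2, b3 = -3.0, 0.5, 1.0, 0.25


def logit_p(sex, smoke):
    """Log-odds of a heart attack; sex: male=1, female=0; smoke: yes=1, no=0."""
    return b0 + b1 * sex + b2 * smoke + b3 * sex * smoke


# Rows numbered as in the lookup table.
rows = {
    1: logit_p(sex=1, smoke=1),  # male smoker
    2: logit_p(sex=1, smoke=0),  # male non-smoker
    3: logit_p(sex=0, smoke=1),  # female smoker
    4: logit_p(sex=0, smoke=0),  # female non-smoker
}

smoke_lor_females = rows[3] - rows[4]    # b2
smoke_lor_males = rows[1] - rows[2]      # b2 + b3
sex_lor_nonsmokers = rows[2] - rows[4]   # b1
sex_lor_smokers = rows[1] - rows[3]      # b1 + b3

# Both ways of thinking about the interaction give b3.
interaction_via_smoke = smoke_lor_males - smoke_lor_females
interaction_via_sex = sex_lor_smokers - sex_lor_nonsmokers
```
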
Interaction Between a Dummy Variable and a
Continuous Variable

• Consider a logistic model where the main predictors are sex (a dummy coded as before) and age (in years):

logit P(Y = 1) = β0 + β1 sex + β2 age + β3 (sex × age)

• β0 is the log-odds in favour of a female aged 0 suffering a heart attack.

• β1 is the log-odds ratio for males vs. females amongst people aged 0.

• β2 is the log-odds ratio corresponding to an increase in age of 1 year amongst females.

• β3 is the difference between the log-odds ratio corresponding to an increase in age of 1 year amongst males and the log-odds ratio corresponding to an increase in age of 1 year amongst females.

• β3 is also the difference between the log-odds ratios for males vs. females in two age-homogeneous groups which differ in age by 1 year.

Quick Lookup Table

Again we can draw up a table, this time considering groups of individuals aged z and z + 1:

Row  sex     age    logit(P(Y = 1))
1    Male    z + 1  β0 + β1 + β2 (z + 1) + β3 (z + 1)
2    Male    z      β0 + β1 + β2 z + β3 z
3    Female  z + 1  β0 + β2 (z + 1)
4    Female  z      β0 + β2 z

• e.g. row 3 - row 4 shows us that the log-odds ratio corresponding to an increase in age of 1 year amongst females is β2.
• e.g. row 2 - row 4 shows us that the log-odds ratio for males vs. females amongst people aged z is β1 + β3 z.

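As a rough illustration of the row 2 - row 4 result, the sketch below evaluates the male vs. female odds ratio at several ages. All coefficient values are hypothetical and the function names are made up for this example.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
b0, b1, b2, b3 = -6.0, 0.75, 0.05, -0.01


def logit_p(sex, age):
    """Log-odds of a heart attack; sex: male=1, female=0; age in years."""
    return b0 + b1 * sex + b2 * age + b3 * sex * age


def male_vs_female_log_or(age):
    """Row 2 minus row 4 of the lookup table: b1 + b3 * age."""
    return logit_p(1, age) - logit_p(0, age)


def male_vs_female_or(age):
    """Odds ratio for males vs. females amongst people of the given age."""
    return math.exp(male_vs_female_log_or(age))


# The odds ratio is not a single number: it depends on age.
for age in (40, 50, 60):
    print(age, round(male_vs_female_or(age), 3))
```

With these made-up values β3 is negative, so the male vs. female odds ratio shrinks as age increases; a positive β3 would make it grow.
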
Interaction Between 2 Continuous Variables

• Consider a logistic model where the main predictors are BP (blood pressure in mmHg) and age (in years):

logit P(Y = 1) = β0 + β1 BP + β2 age + β3 (BP × age)

• β0 is the log-odds in favour of a person with a BP of 0 mmHg and aged 0 suffering a heart attack.

• This is a ridiculous interpretation: the model cannot apply when age or BP is close to 0, but we hope it is good for the ranges we are interested in.

• β1 is the log-odds ratio corresponding to an increase in BP of 1 mmHg amongst people aged 0.

• β2 is the log-odds ratio corresponding to an increase in age of 1 year amongst people with a BP of 0 mmHg.

• β3 is the difference between the log-odds ratios corresponding to an increase in age of 1 year for two BP-homogeneous groups which differ in BP by 1 mmHg.

• β3 is also the difference between the log-odds ratios corresponding to an increase in BP of 1 mmHg for two age-homogeneous groups which differ in age by 1 year.

Quick Lookup Table

Again we can draw up a table, this time considering individuals with BP w and w + 1 and aged z and z + 1:

Row  BP     age    logit(P(Y = 1))
1    w + 1  z + 1  β0 + β1 (w + 1) + β2 (z + 1) + β3 (w + 1)(z + 1)
2    w + 1  z      β0 + β1 (w + 1) + β2 z + β3 (w + 1) z
3    w      z + 1  β0 + β1 w + β2 (z + 1) + β3 w (z + 1)
4    w      z      β0 + β1 w + β2 z + β3 w z

• e.g. row 3 - row 4 shows us that the log-odds ratio corresponding to an increase in age of 1 year amongst those of BP w is β2 + β3 w.
• e.g. row 2 - row 4 shows us that the log-odds ratio corresponding to an increase in BP of 1 mmHg amongst those aged z is β1 + β3 z.

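The two readings of β3 for continuous predictors can be checked numerically. This is only a sketch: the coefficients and the values of w and z are hypothetical, and the function names are made up for illustration.

```python
# Hypothetical coefficients, chosen only for illustration.
b0, b1, b2, b3 = -8.0, 0.02, 0.04, 0.001


def logit_p(bp, age):
    """Log-odds of a heart attack; bp in mmHg, age in years."""
    return b0 + b1 * bp + b2 * age + b3 * bp * age


def age_log_or(bp, age):
    """Log-odds ratio for a 1-year increase in age at fixed BP: b2 + b3 * bp."""
    return logit_p(bp, age + 1) - logit_p(bp, age)


def bp_log_or(bp, age):
    """Log-odds ratio for a 1 mmHg increase in BP at fixed age: b1 + b3 * age."""
    return logit_p(bp + 1, age) - logit_p(bp, age)


w, z = 120, 50  # illustrative BP and age

# Difference in the age effect between groups whose BP differs by 1 mmHg.
diff_in_age_effect = age_log_or(w + 1, z) - age_log_or(w, z)
# Difference in the BP effect between groups whose age differs by 1 year.
diff_in_bp_effect = bp_log_or(w, z + 1) - bp_log_or(w, z)
```

Both differences equal β3 (up to floating-point rounding), whichever values of w and z are chosen.
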
Final Comment on Interpretation

• Remember that whenever you give an interpretation of a quantity γ in terms of a log-odds ratio, there is always an equivalent interpretation of exp(γ) as an odds ratio.
• Whenever you give an interpretation of a quantity γ as the log-odds in favour of an event, you can always give two equivalent interpretations:
  1. of exp(γ) as the odds in favour of the event,
  2. of exp(γ) / (1 + exp(γ)) as the probability of the event.
