Glogit
Glogit
Paper 446-2013
ABSTRACT
Logistic regression is most often used for modeling simple binary response data. Two modifications extend it
to ordinal responses that have more than two levels: using multiple response functions to model the ordered
behavior, and considering whether covariates have common slopes across response functions.
This paper describes how you can use the LOGISTIC procedure to model ordinal responses. Before
SAS/STAT® 12.1, you could use cumulative logit response functions with proportional odds. In SAS/STAT
12.1, you can fit partial proportional odds models to ordinal responses. This paper also discusses methods
of determining which covariates have proportional odds. The reader is assumed to be familiar with using
PROC LOGISTIC for binary logistic regression.
INTRODUCTION
Logistic regression modeling has applications in many areas, including clinical studies, epidemiology, data
mining, social sciences, marketing, and engineering. It has proved to be reliable for both prospective
analyses (such as designed experiments or clinical trials) and retrospective analyses (such as found data or
case-control studies).
If your response variable can take only two values (the “event” and the “non-event”), then the conditions for
linear regression are not met; in particular, your errors are binary and not normally distributed. Binary logistic
regression was developed to handle this case. Instead of modeling the response itself, you use logistic
regression to model the probabilities of events.
Suppose your response variable Yi has events coded as 0 and non-events coded as 1 for each observation
i D 1; : : : ; n, and your p covariates are xi D .xi1 ; xi 2 ; : : : ; xip /. Write the probability of an event as
i D Pr.Yi D 0/. Then the logistic model is written as
i
logit.i / D log D ˛ C xi ˇ D ˛ C xi1 ˇ1 C xi 2 ˇ2 C C xip ˇp
1 i
where the expected behavior of the responses is modeled by a linear combination of the covariates, the
linear predictor ˛ C Xˇ. The quotient 1 ii is called the odds, and the log of the odds is called the logit or
the log odds. This model is linear in the ˛ and ˇ, and the different combinations of the X covariates define
subpopulations within which the response is independently binomially distributed. Under simple random
sampling within the subpopulations, the data have a product binomial distribution from which you can derive
a likelihood and all other results. When using continuous covariates, you might have only one observation
in each combination, but this affects only certain statistical results. A general recommendation is that you
should have at least 10 events and 10 non-events for each parameter (Peduzzi et al. 1996).
Although binary logistic regression is most common, logistic regression is extensible to more than two
response levels. If your response variable takes values that have no inherent ordering (such as voting
Democratic, Green, Independent, Republican), then your response is nominal. If your response takes values
that have an intrinsic order (good, better, best), then your response is ordinal.
This paper first reviews how binary logistic regression extends to polytomous logistic regression—in particular,
to a special ordinal response model, the proportional odds model combined with a cumulative logit link.
Then the proportional odds model is relaxed by using the new UNEQUALSLOPES option in the LOGISTIC
procedure to fit the partial proportional odds model. Methods for determining which model applies to
your data are also described. The paper ends with suggestions for performing model selection while
simultaneously assessing the proportional odds of the individual parameters.
1
SAS Global Forum 2013 Statistics and Data Analysis
Link Functions
Generalized Logit Link
Define a link function so that each response function contrasts a lower level with the last level:
i1 i2 i3
log log log
i 4 i4 i4
This is the generalized logit link, and it ignores the order of the responses, beyond identifying the last
one as the reference response. You use this link when you have nominal data.
Cumulative Logit Link
The most popular ordinal link function uses every probability in every function by contrasting the lower
levels of Y with the higher levels of Y . Let the cumulative probability be denoted as ij D Pr.Yi j /:
i1 i1 C i 2 i1 C i 2 C i 3
logit.i1 / D log logit.i 2 / D log logit.i 3 / D log
i2 C i3 C i4 i 3 C i 4 i 4
This is the cumulative logit link. As you move from the first logit function to the second and from the
second logit function to the third, the numerator increases and the denominator decreases, so the
cumulative logits are increasing.
Linear Predictors
General Model
To model the variation of the probabilities, you set each of your J – 1 = 3 response functions equal to a
different linear predictor, ˛j C X0 ˇj ; j D 1; 2; 3. Choosing this model has several consequences:
1. There are .J 1/.1 C p/ D 3.1 C p/ parameters, which can be too many for a successful fit and
can violate the ideal of model parsimony.
2. If you take the difference of the jth response function between any two subpopulations, h and i, you
get
where g is the link function and g.ij / is the jth response function for the ith observation. The
subscript on ˇj means that this difference depends on the specific response function, so your
conclusions are drawn for each response function separately. If you use the generalized logit link
function, this difference is the log odds ratio.
3. There are no constraints on the ˛j and ˇj , so there is no guarantee that the response functions
are ordered.
Any of the links that are used with this model is an instance of a general (generalized) model (Agresti
2010). In PROC LOGISTIC, specifying the LINK=GLOGIT option in the MODEL statement fits this model
to the generalized logit.
Proportional Odds Model
The general model has unequal slopes for the predictors, and you need enough data to estimate a
different coefficient for each predictor in each response function. To simplify this model, you can force an
2
SAS Global Forum 2013 Statistics and Data Analysis
ordering on the linear predictors by using the same slope parameters for each response function and by
constraining the intercepts to increase (˛1 < ˛2 < ˛3 ) or decrease:
g.j / D ˛j C X0 ˇ j D 1; 2; 3
This model has J 1 C p parameters and thus requires less data for an adequate fit than the general
model requires. It also provides a more straightforward interpretation. Compute the difference of the jth
response function between two subpopulations h and i to see the impact of this model:
This difference is proportional to the distance between the explanatory variables, and the difference is
the same no matter which response function you consider. This is the equal slopes assumption, which
is also called the parallel lines assumption. You can apply the parallel lines assumption to any of the
link functions, but it is most commonly used with the cumulative logit link. When you use the cumulative
logit link, the assumption is the proportional odds assumption, the model is the proportional odds model,
and the difference of cumulative logits (g) is the log cumulative odds ratio. By default, PROC LOGISTIC
fits the proportional odds model combined with the cumulative logit link when you have more than two
response levels.
Partial Proportional Odds Model
The third linear predictor is a hybrid of the general model and the proportional odds model. Suppose one
set of effects X has p1 parameters that satisfy the parallel lines assumption (that is, they have equal
slopes), but the remaining set Z has p2 parameters that do not and instead require the general model
(that is, they have unequal slopes). This model is written as
g.j / D ˛j C X0 ˇ C Z0 j j D 1; 2; 3
It has .J 1/ C p1 C .J 1/p2 parameters and, when used with the cumulative logit link, is called the
partial proportional odds model (Peterson and Harrell 1990). Interpretation of the proportional odds
parameters is independent of the response function; interpretation of the general parameters depends
on the response function. When you fit the partial proportional odds model, you must be especially
careful to ensure that the cumulative logits remain ordered for the data being modeled.
The next sections fit proportional odds models and partial proportional odds models to several data sets.
data Asbestos;
input Task Ventilation Exposure Freq @@;
datalines;
0 0 1 29 0 0 2 1 0 0 3 1 0 1 1 3 0 1 2 1 0 1 3 2
1 0 1 10 1 0 2 1 1 0 3 7 1 1 1 3 1 1 2 3 1 1 3 22
;
The Task variable is 0 if the worker removed asbestos tile and 1 if the worker removed asbestos insulation.
The Ventilation variable is 0 if a negative pressure ventilation system was used and 1 if a general system
was in place. The response variable Exposure measures the extent of the worker’s exposure to asbestos; it
takes the values 1 for low exposure, 2 for exposure near the legal limit, or 3 for exposure above the legal
limit. The Freq variable counts the number of workers in each grouping.
The following code fits a proportional odds model to the Asbestos data set; the classification variables are
entered as main effects. The CLASS statement is optional for these data, but you specify the PARAM=REF
and REF=FIRST options in the CLASS statement to preserve the 0-1 coding of the two binary covariates
(PARAM=REF is the incremental effects parameterization). The EFFECTPLOT statement displays a plot of
the fit.
3
SAS Global Forum 2013 Statistics and Data Analysis
This model has only four combinations of the explanatory variables, as shown in Table 1.
Each subpopulation has two cumulative logits in the model. If you write the cumulative probabilities for the
subpopulations in each logit function as ij D Pr.Exposure j jsubpopulation D i /, the model is
2 3 2 3 2 3
logit.a1 / ˛1 1 0 0 0
6 logit.b1 / 7 6 ˛1
6 7 6 Cˇ1 7 6 1 0 1 0 7
7 6 7
6 logit.c1 / 7 6 ˛1 Cˇ 7 6 1 0 0 1 72 3
6 7 6 2 7 6 7 ˛1
6 logit.d1 / 7 6 ˛1 Cˇ1 Cˇ2 7 7 6 1 0 1 1 76
6 7
˛2 7
7D6 7D6
6 7 6
76 7
ˇ
6
6 logit.a2 / 7 6 ˛ 2 7 6 0 1 0 0 7 4 1 5
7 6 0 1 1 0 7 ˇ2
6 7 6 7 6 7
6 logit.b2 / 7 6
6 7 6 ˛2 Cˇ1 7 6 7
4 logit.c2 / 5 4 ˛2 Cˇ2 5 4 0 1 0 1 5
logit.d 2 / ˛2 Cˇ1 Cˇ2 0 1 1 1
The first row in the preceding equations shows you that the intercept for the first response function, ˛1 , is the
log odds of Exposure = 1 versus 2 or 3 for Task = 0 and Ventilation = 0. The fifth row shows you that the
intercept ˛2 is the log odds of Exposure = 1 or 2 versus 3 for Task = 0 and Ventilation = 0. The second (or
sixth) row, compared to the first (or fifth), shows you that ˇ1 is the increment in both response functions (or
both types of log cumulative odds) due to Task = 1. The third (or seventh) row, compared to the first (or fifth),
shows you that ˇ2 is the increment in both types of log cumulative odds due to Ventilation = 1.
Exponentiating the cumulative logits produces the cumulative odds, and you can solve for the cumulative
probabilities:
j
logit.j / D ˛j C X0 ˇ ! D exp.˛j C X0 ˇ/ ! j D exp.X0 ˇ/=.1 C exp.X0 ˇ//
.1 j /
Table 2 displays these equations for each of the subpopulations and cumulative logits.
4
SAS Global Forum 2013 Statistics and Data Analysis
Figure 1 displays the response profiles. The note means that the first logit function compares Exposure = 1
to Exposure = 2 or 3, and the second logit function compares Exposure = 1 or 2 to Exposure = 3; that
is, they compare lower exposure levels to higher exposure levels. You can compare higher values to lower
values by specifying the DESCENDING response option.
Response Profile
Ordered Total
Value Exposure Frequency
1 1 45
2 2 6
3 3 32
Figure 2 displays a score test for the proportional odds assumption; the test does not reject the null hypothesis
that the proportional odds assumption holds. The degrees of freedom are the difference between the number
of parameters in a general model and the number of parameters in a proportional odds model; in this case,
6 – 4 = 2. This score test actually tends to reject the null hypothesis more often than it should; Stokes,
Davis, and Koch (2012) say that this statistic needs approximately five observations (or frequencies) for each
outcome at each level of each main effect, because small samples might make the statistic artificially large.
This score test is a good confirmatory test if it does not reject the null; however, if it rejects the null, then you
need other means to justify the proportional odds assumption.
1.6130 2 0.4464
The AGGREGATE SCALE=NONE options produce the goodness-of-fit statistics that are displayed in Figure 3.
These tests compare the actual number of observations in each subpopulation to the number that is predicted
by the model, and both tests suggest an adequate fit. The degrees of freedom are the number of response
functions times the number of profiles minus the number of parameters, 4 2 – 4 = 4. The note in Figure 3,
“Number of unique profiles: 4,” refers to the four subpopulations that are identified earlier. Stokes, Davis, and
Koch (2012) suggest that you need at least five observations per subpopulation and response in order for
5
SAS Global Forum 2013 Statistics and Data Analysis
the test to be valid. If you have a continuous variable in your model, then you should not use these tests; you
should instead compare your model to a larger model and show that your model is just as good.
The fit statistics that are shown in Figure 4 are often used to compare nested models. The difference of the
–2 Log L statistics forms the likelihood ratio statistic that is shown in Figure 5.
Intercept
Intercept and
Criterion Only Covariates
The three global tests that are displayed in Figure 5 evaluate the significance of all the predictors combined.
They tell you only whether the model has some significance; they don’t say anything about the effect of
individual predictors. There are 4 parameters – 2 intercepts = 2 degrees of freedom for these tests. All
tests in Figure 5 reject the null hypothesis that the covariates are unrelated to Exposure; that is, the model
explains a significant amount of variation in the data.
The tests that are displayed in Figure 6 are Wald tests that the parameters in a given effect are jointly 0.
Because the classification covariates are binary and are coded in the design matrix as a single column, the
tests all have one degree of freedom.
6
SAS Global Forum 2013 Statistics and Data Analysis
Figure 6 Tests That the Parameters for a CLASS Effect Are All Zero Are Rejected
Wald
Effect DF Chi-Square Pr > ChiSq
The parameter estimates are displayed in Figure 7. There are two intercepts for the two logit functions
modeled, but only one slope is reported. The parameter labeled “Intercept 1” is the ˛1 parameter for
Exposure = 1 versus Exposure = 2 or 3, and “Intercept 2” is the ˛2 parameter for Exposure = 1 or 2 versus
Exposure = 3.
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Both cumulative logits (or log odds) decrease as the explanatory effects increase. Table 3 shows the fitted
cumulative logits for the subpopulations. Recall the interpretation of the parameter estimates in terms of
logits; this table shows you that ˛1 =2.4751 is the log odds of Exposure = 1 versus 2 or 3 for a worker with
Task = 0 and Ventilation = 0, ˛2 is the log odds of Exposure = 1 or 2 versus 3 for a worker with Task = 0
and Ventilation = 0, ˇ1 is the increment in both log odds for Task = 1, and ˇ2 is the increment in both log
odds for Ventilation = 1.
7
SAS Global Forum 2013 Statistics and Data Analysis
Note that the interpretation of the parameter estimates in Figure 7 depends on the PARAM= option, but the
odds ratios in Figure 8 are independent of the PARAM= option.
The interpretation of these odds ratios is that a person who performs Task = 1 has 0.102 times the odds
of a low exposure versus a higher exposure than a person who performs Task = 0 both for Exposure = 1
versus 2 or 3 and for Exposure = 1 or 2 versus 3. That is, a person who removes insulation is more likely to
have a higher exposure to asbestos than a person who removes tile. Similarly, a person who works in an
environment that has general ventilation is more likely to have a higher exposure to asbestos than a person
who works in an environment that has negative pressure ventilation. This summary holds for both response
functions, because of the proportional odds assumption.
Finally, you can generate plots of the fitted model. The EFFECTPLOT statement produces the plot in
Figure 9, where you can easily see that the predicted probability of a low exposure is the highest for workers
who remove tile in negative pressure ventilation, and the probability of exposure above the legal limit is
highest for workers who remove insulation in general ventilation.
data Coal;
input Severity $ @@;
do i=1 to 8; input Exposure freq @@; lnExposure=log(Exposure); output; end;
datalines;
Normal 5.8 98 15 51 21.5 34 27.5 35 33.5 32 39.5 23 46 12 51.5 4
Moderate 5.8 0 15 2 21.5 6 27.5 5 33.5 10 39.5 7 46 6 51.5 2
Severe 5.8 0 15 1 21.5 3 27.5 8 33.5 9 39.5 8 46 10 51.5 5
;
8
SAS Global Forum 2013 Statistics and Data Analysis
The following program fits a proportional odds model to these data and displays plots of the effect of
the lnExposure variable. By default, PROC LOGISTIC orders the response levels alphanumerically; the
ORDER=DATA and DESCENDING options are specified to override the default by fitting cumulative logits of
higher versus lower severity.
You should not use the standard goodness-of-fit tests, because the lnExposure variable is continuous. A
good alternative is to compare your main-effects model to a larger one, and test whether your smaller model
is sufficient. A reasonable extension of this model is to square the continuous effect. To get this test in PROC
LOGISTIC, you specify a forward selection and include your model. If the fit is good, no other effects are
added to your model, and the score test is reported in the “Residual Chi-Square Test” table.
The three EFFECTPLOT statements display the relationship of lnExposure to Severity on three different
scales: cumulative probabilities over a range of exposure levels extended by 50%, logit of cumulative
probabilities (because the link function here is the cumulative logit link), and individual probabilities.
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
9
SAS Global Forum 2013 Statistics and Data Analysis
The interpretation is that incrementing log(exposure) by 1 increases a miner’s odds of having more severe
pneumoconiosis by a factor of 13.
The EFFECTPLOT statements create several different views of the fitted model. First, Figure 12(A) shows
the three estimated cumulative probability functions. The plot is extended beyond the data to display the
S-shaped curve of the logistic function. Because the lnExposure values are fit toward the left half of the
logit functions, the predicted probabilities of higher severity are generally quite low. Second, Figure 12(B)
shows the fit on the logit scale, which has the parallel lines that you expect when making the proportional
odds assumption. Third, Figure 12(C) shows the model-predicted probabilities of a miner having each level
of Severity across the range of the lnExposure values. A miner who has little exposure has the highest
probability of being normal. As exposure increases, the higher severity levels become more likely and the
probability of a normal level decreases. When exposure exceeds 33 years, the probability of a moderate
severity levels off. Presumably miners are moving from normal to moderate severity and from moderate to
severe, so after this amount of exposure they mostly move into the severe range.
Figure 12 Displays of the Fitted Model
data Backache;
input Severity Month Age @@;
lnAge=log(Age);
if Severity=0 then Severity=1;
Trimester=(Month>6)+(Month>3)+1;
datalines;
1 0 23 3 0 36 1 0 21 1 0 30 1 0 42 1 0 34 3 3 26 1 7 18 3 6 39 1 0 25
;
The variables that are extracted from this data set are Severity, which takes the values 1 = none or very
little pain, 2 = troublesome pain, and 3 = severe pain; Trimester, which identifies in which trimester the pain
starts; and lnAge, which is the log of the mother’s age in years.
The following program fits a proportional odds model to the data. The third trimester is the reference level for
the Trimester parameterization. Specifying the DESCENDING option means that the probability of higher
severity is modeled.
10
SAS Global Forum 2013 Statistics and Data Analysis
22.2794 3 <.0001
As mentioned earlier, rejection tends to occur more often than it should. Now what should you do?
You can first try to determine whether the proportional odds assumption is actually valid. There are other
ways to assess the proportional odds structure besides relying on this test. Two graphical methods are
discussed in the next section.
• To confirm the ordinality of the response for a predictor, the means should be strictly increasing or
decreasing with respect to the Y variable.
• To assess the proportional odds assumption for a predictor, the model-expected value curve should
closely follow the mean curve.
The macro requires CLASS variables to be numeric; otherwise, you can use the coding from the design
matrix data set that is provided by the OUTDESIGN= option in the PROC LOGISTIC statement.
11
SAS Global Forum 2013 Statistics and Data Analysis
The following code applies these macros to the Backache data set and produces Figure 15:
%EmpiricalLogitPlot(class=Trimester,cont=lnAge,data=Backache,y=Severity);
%EXPlot(Trimester lnAge,data=Backache,y=Severity);
Figure 15 Assessment Plots for Backache Data
The plots for lnAge show no profound departure from proportionality (the mean curve in (B) is increasing),
but the plots for Trimester certainly do show a departure. But how do you generate a model in which
the proportional odds structure holds for one covariate but not for another? The answer is the partial
proportional odds model, a hybrid method in which the lnAge variable has a proportional odds structure and
the Treatment variable has a general structure. The partial proportional odds model for the Backache data
is
where the response functions share the lnAge parameter but each function has its own version of the
intercept, the Trimester = 1 parameter, and the Trimester = 2 parameter. To interpret these parameters and
their odds ratios, the lnAge parameter is treated as before—your interpretation applies to both response
functions. However, when you discuss the Trimester = 1 parameter, you have to distinguish between the
12
SAS Global Forum 2013 Statistics and Data Analysis
Severity = 3 versus 2 or 1 logit function and the Severity = 3 or 2 versus 1 logit function. The partial
proportional odds model is fit in the next section.
Wald
Effect DF Chi-Square Pr > ChiSq
As in the analysis of the Coal data set, you should not use the standard goodness-of-fit tests because the
lnAge variable is continuous. Instead compare your model to a reasonable extension: in this case, take
the interaction and square the continuous effect. You can produce the Rao score test in Figure 17 from a
forward selection step as shown in the following code:
13
SAS Global Forum 2013 Statistics and Data Analysis
8.8720 5 0.1143
The score test in Figure 17 suggests the main-effects model fits the data adequately.
Alternatively, you can use a likelihood ratio test to compare the two models, just as the likelihood ratio global
test compares your model to an intercept-only model. The following %LRTest macro call uses a likelihood
ratio test to compare these two models and does not reject the main-effects model (p = 0.357):
%LRTest(data=Backache,class=Trimester / param=ref,
model1=Severity(descending) = lnAge Trimester / unequalslopes=Trimester,
model2=Severity(descending) = lnAge Trimester lnAge*lnAge lnAge*Trimester
/ unequalslopes=Trimester);
Figure 18 shows that the Trimester effects for the first response function are not significantly nonzero.
Standard Wald
Parameter Severity DF Estimate Error Chi-Square Pr > ChiSq
Even though the Trimester effects for the first logit function are not significant, PROC LOGISTIC does not
set the estimates to 0. Peterson and Harrell (1990) introduce a constrained partial proportional odds model,
which enables you to multiply the Trimester parameters by a different specified constant for each response
function—multiplying the parameter in the first response function by 0 and in the second response function
by 1 removes the effect from the first response function. You can look for the constrained model in a future
release of PROC LOGISTIC.
14
SAS Global Forum 2013 Statistics and Data Analysis
You can interpret the intercepts and the lnAge parameter as before. The first intercept (–7.2266) is the log
odds of Severity = 3 versus 2 or 1 for lnAge = 0 and Trimester = 3. The second intercept (–5.5761) is the
log odds of Severity = 3 or 2 versus 1 for lnAge = 0 and Trimester = 3. The lnAge parameter (1.767) is the
increment in log odds of a higher Severity for a one-unit increase in lnAge.
The interpretation of the Trimester parameters is complicated because they depend on the response
function. In the first response function, the Trimester = 1 parameter (–0.3949) is the increment in log odds
for Severity = 3 versus 2 or 1 for pain onset in Trimester = 1, and the Trimester = 2 parameter (–0.3262) is
the increment in log odds for Severity = 3 versus 2 or 1 for pain onset in Trimester = 2. Neither of these
is significant. In the second response function, the Trimester = 1 parameter (–1.1475) is the increment in
log odds for Severity = 3 or 2 versus 1 for pain onset in Trimester = 1, and the Trimester = 2 parameter
(1.0891) is the increment in log odds for Severity = 3 or 2 versus 1 for pain onset in Trimester = 2.
Similarly, the Trimester odds ratios that are displayed in Figure 19 also depend on the response functions.
For the response function that compares Severity = 3 versus 2 or 1, every confidence interval contains 1, so
mothers who experience pain onset in any trimester are equally likely to have higher-severity pain; that is,
the Trimester effect does not describe any variation in the first response function. However, in the second
response function, which compares Severity = 3 or 2 versus 1, mothers who experience pain onset in the
first trimester are less likely to have higher-severity pain than mothers who experience pain onset in the later
trimesters, and mothers with pain onset in the second trimester are more likely to have higher-severity pain
than mothers with pain onset in the third trimester. A mother with a one-unit increase in lnAge has 5.853
times the odds of having (is more likely to have) a higher severity, for both response functions.
Up Start small with a proportional odds model and compare it to unequal slopes on each effect in turn
(Peterson and Harrell 1990).
Down Start big with a general model and compare it to equal slopes on each effect in turn (Koch, Amara,
and Singer 1985, p. 372).
OneUp Study proportionality for a series of one-effect models, for which Up and Down are equivalent
(based on Harrell 2001, p. 332).
None of these types of test are perfect: Up and OneUp are apt to signal nonproportional odds too often;
Down, which starts big, can require a lot of data and might be less able to detect proportional odds.
You can implement these tests in three ways:
15
SAS Global Forum 2013 Statistics and Data Analysis
• Wald tests can be computed directly by using the TEST statement in PROC LOGISTIC, but they are
usually not as good as either likelihood ratio tests or score tests.
• Likelihood ratio tests require only a little SAS programming to extract the values of the fitted log likelihood
from the appropriate PROC LOGISTIC table and compare the difference of their log likelihoods to a
chi-square distribution. The degrees of freedom are equal to the difference of the degrees of freedom
of the two models. The %LRTest macro facilitates this procedure.
• Score tests (Peterson and Harrell 1990) are traditionally computed in PROC LOGISTIC by using the
model selection machinery to produce them from the residual chi-square test for adding the effect in
question last.
You can run the likelihood ratio test version of the Up, Down, and OneUp tests for every variable in the model
by calling the %LRTestCycle macro as in the following code. The macro uses the UNEQUALSLOPES option
to generate the models in the Up, Down, and OneUp tests that involve nonproportional odds.
Table 4 reports that the Up tests reject proportional odds for both variables, but the OneUp and Down tests
disagree for lnAge. To break the tie, compare the two competing models. Both models have Trimester with
unequal slopes, one model has lnAge with proportional odds, and the other model has lnAge with unequal
slopes. This is just the Down test that you already ran, so you can conclude that the partial proportional
odds model fit in the previous section is appropriate.
16
SAS Global Forum 2013 Statistics and Data Analysis
of the selection algorithm. The following sections discuss these two approaches by using the Docvisit data
set (Cameron and Trivedi 1998, p. 68) from Example 54.17 of the PROC LOGISTIC documentation in the
SAS/STAT 12.1 User’s Guide.
data Docvisit;
input Sex Age Agesq Income Levyplus Freepoor Freerepa
Illness Actdays Hscore Chcond1 Chcond2 Dvisits;
if ( Dvisits > 2) then Dvisits = 2;
Dvisitsp1= Dvisits+1;
datalines;
1 0.19 0.0361 0.55 1 0 0 1 4 1 0 0 1
The empirical cumulative logits in Figure 20(A) suggest that Actdays and Income should have unequal
slopes, whereas Age, Agesq, and Hscore might have unequal slopes. The E.XjY O / plots in Figure 20(B)
show that the non-ordinality of the Levyplus and Chcond1 mean curves gives them unequal slopes. The
slight non-ordinality of the Age, Agesq, Freerepa, and Sex mean curves and their slight divergence from
the expected curves suggest they might have unequal slopes. Finally, choose effects that either plot is sure
17
SAS Global Forum 2013 Statistics and Data Analysis
of or that both sets of plots agree on: let Actdays, Income, Levyplus, Chcond1, Age, and Agesq have
unequal slopes. Table 5 contains these conclusions in the Graphical row.
Second, the results from the following Down likelihood ratio tests are displayed in Table 5, and tests with
p-values less than 0.1 are chosen to have unequal slopes.
Method Sex Age Agesq Income Levyplus Freepoor Freerepa Illness Actdays Hscore Chcond1 Chcond2
Graphical equal neq neq neq neq equal equal equal neq equal neq equal
LRTDown 0.298 0.113 0.053 0.003 0.899 0.598 0.158 0.354 0.011 0.827 0.405 0.804
Iterative equal neq neq neq equal equal neq equal neq equal equal equal
Besides Agesq, Income, and Actdays, Table 5 shows that the methods have little agreement on which vari-
ables should have nonproportional odds. However, you might not have enough data to make a determination
for nonsignificant effects, so the disagreement might ultimately be unimportant.
Perform a stepwise selection for the three competing models. The following program runs the model that is
given in the Graphical row of Table 5, and the other models differ only in the UNEQUALSLOPES specification.
The results are summarized in the section “Selection Results.”
18
SAS Global Forum 2013 Statistics and Data Analysis
Selection Results
The selection methods in the preceding sections all choose Sex, Freepoor, Illness, and Hscore as
proportional odds effects and Agesq and Actdays as general effects. Of these methods, copying the
variables and using the selection methods is the easiest to implement using PROC LOGISTIC. Finish the
Docvisit analysis by fitting the selected model as follows:
CONCLUSION
This paper shows you how to use the LOGISTIC procedure to fit proportional odds models and how to
interpret the results. The UNEQUALSLOPES option, available in SAS/STAT 12.1, is introduced to fit partial
proportional odds models and general models to cumulative logits. Two graphical methods for assessing
19
SAS Global Forum 2013 Statistics and Data Analysis
proportionality are discussed. The UNEQUALSLOPES option also enables you to determine which variables
satisfy the proportional odds assumptions and which do not.
Expanding the scope of the LOGISTIC procedure is an active area of SAS/STAT development. You can look
for the following features in a future release of PROC LOGISTIC: enabling effects to have both general and
proportional odds structures, fitting the unconstrained partial proportional odds model, fitting generalized
logit links with proportional odds and partial proportional odds models, and providing alternative ordinal links
such as the adjacent-category logit.
REFERENCES
Agresti, A. (2010), Analysis of Ordinal Categorical Data, 2nd Edition, New York: John Wiley & Sons.
Cameron, A. C. and Trivedi, P. K. (1998), Regression Analysis of Count Data, Cambridge: Cambridge
University Press.
Chatfield, C. (1995), Problem Solving: A Statistician’s Guide, 2nd Edition, Boca Raton, FL: Chapman &
Hall/CRC.
Harrell, F. E. (2001), Regression Modeling Strategies, New York: Springer-Verlag.
Koch, G. G., Amara, I. A., and Singer, J. M. (1985), “A Two-Stage Procedure for the Analysis of Ordinal
Categorical Data,” in P. K. Sen, ed., Biostatistics: Statistics in Biomedical, Public Health, and Environmental
Sciences, Amsterdam: Elsevier Science.
McCullagh, P. and Nelder, J. A. (1989), Generalized Linear Models, 2nd Edition, London: Chapman & Hall.
Peduzzi, P., Concato, J., Kemper, E., Holford, T. R., and Feinstein, A. R. (1996), “A Simulation Study of the
Number of Events per Variable in Logistic Regression Analysis,” Journal of Clinical Epidemiology, 49,
1373–1379.
Peterson, B. L. and Harrell, F. E. (1990), “Partial Proportional Odds Models for Ordinal Response Variables,”
Journal of the Royal Statistical Society, Series B, 39, 205–217.
SAS Institute Inc. (2012), “SAS Note 37944: Plots to Assess the Proportional Odds Assumption in an Ordinal
Logistic Model,” http://support.sas.com/kb/37944.
Simonoff, J. S. (2003), Analyzing Categorical Data, New York: Springer-Verlag.
Stokes, M. E., Davis, C. S., and Koch, G. G. (2012), Categorical Data Analysis Using SAS, 3rd Edition, Cary,
NC: SAS Institute Inc.
ACKNOWLEDGMENTS
The author is grateful to Maura Stokes, Randy Tobias, David Schlotzhauer, Ed Huddleston, Anne Baxter,
and Tim Arnold of the Advanced Analytics Division at SAS Institute Inc. for their valuable assistance in the
preparation of this paper.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author:
Bob Derr
SAS Institute Inc.
SAS Campus Drive
Cary, NC 27513
[email protected]
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of
SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
20