
Ordinal data tutorial

Modeling Ordinal Categorical Data

Alan Agresti

Prof. Emeritus, Dept. of Statistics, University of Florida

Visiting Prof., Statistics Dept., Harvard University

Presented for Harvard University Statistics Department

October 23, 2010
These notes: www.stat.ufl.edu/aa/ordinal/ord.html

Ordinal categorical responses

Patient recovery, quality of life (excellent, good, fair, poor)

Pain (none, little, considerable, severe)
Diagnostic evaluation (definitely normal, probably normal,
equivocal, probably abnormal, definitely abnormal)

Political philosophy (very liberal, slightly liberal, moderate,

slightly conservative, very conservative)

Government spending (too low, about right, too high)

Categorization of an inherently continuous variable, such as
body mass index, $\text{BMI} = \text{weight(kg)}/[\text{height(m)}]^2$,
measured as (< 18.5, 18.5-25, 25-30, > 30)
for (underweight, normal weight, overweight, obese)
For ordinal response variable y with c categories, our focus is on
modeling how

P (y = j), j = 1, 2, . . . , c,
depends on explanatory variables x (categorical and/or
quantitative).
The models treat observations on y at fixed x as multinomial.

Outline
Section 1: Logistic Regression Models Using Cumulative Logits
(Proportional odds and extensions)
Section 2: Other Ordinal Response Models
(adjacent-categories and continuation-ratio logits, stereotype
model, cumulative probit, log-log links, count data responses)

Section 3 on software summary and Section 4 summarizing

research work on ordinal modeling included for your reference but
not covered in these lectures

This is a shortened version of a 1-day short course for JSM 2010,

based on Analysis of Ordinal Categorical Data (2nd ed., Wiley,
2010), referred to in notes by OrdCDA.

Focus of tutorial
Survey of approaches to modeling ordinal categorical responses
Emphasis on concepts, examples of use, complicating issues,
rather than theory, derivations, or technical details
Examples of how to conduct methods using SAS, but output
provided to enhance interpretation of methods, not to teach
SAS. For R (and S-Plus) and Stata, we list functions and give
references for details in Section 3; e.g., detailed tutorial by
Laura Thompson shows how to use R for nearly all models in
this tutorial (link at www.stat.ufl.edu/aa/cda/software.html).

Joe Lang (Univ. of Iowa) R function mph.fit fits some

nonstandard models we consider (link at
www.stat.ufl.edu/aa/ordinal/ord.html).

We assume familiarity with basic categorical data methods

(e.g., logistic regression, likelihood-based inference).

But first, .... , why not just assign scores to the ordered categories
and use ordinary regression methods?

With categorical data, there is nonconstant variance, so ordinary least squares (OLS) is not optimal.

In iterative fitting process for ML or WLS assuming multinomial data, at some settings of explanatory variables, estimated mean may fall below lowest score or above highest score and fitting fails.

For binary response, this approach simplifies to linear probability model, $P(y = 1) = \alpha + \beta' x$ (i.e., response scores 1, 0), which rarely works with multiple explanatory variables.

With categorical data, we may want estimates of conditional probabilities rather than conditional means.

Regardless of fitting method or distributional assumption, ceiling effects and floor effects can cause bias in results.

Example: Floor effect (Sec. 1.3 of OrdCDA)

For underlying continuous variable $y^*$, suppose

$$y^* = 20.0 + 0.6x - 40z + \epsilon$$

with $x \sim \text{uniform}(0, 100)$, $P(z = 0) = P(z = 1) = 0.50$, $\epsilon \sim N(0, 10^2)$.

For a random sample of size $n = 100$, suppose

$y = 1$ if $y^* \le 20$, $y = 2$ if $20 < y^* \le 40$, $y = 3$ if $40 < y^* \le 60$,
$y = 4$ if $60 < y^* \le 80$, $y = 5$ if $y^* > 80$.

When $x < 50$ with $z = 1$, there is a very high probability that observations fall in the lowest category of $y$.

As a consequence, least squares line when $z = 1$ has only half the slope of least squares line when $z = 0$ (and interaction is statistically and practically significant).

[Figure: scatterplots against $x$ (horizontal axis running to 100), with points plotted as "o" for $z = 0$ and "1" for $z = 1$.]
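The floor effect can be reproduced with a short simulation. The sketch below is an illustration in Python (not part of the original notes, which use SAS); it assumes the latent-variable setup stated in the example, and the sample size of 20000 and the seed are arbitrary choices made here so that the slopes are stable.

```python
import random

def simulate(n, seed=1):
    """Draw (x, z, y) from the latent-variable setup of the floor-effect example."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 100)
        z = 1 if rng.random() < 0.5 else 0
        y_star = 20.0 + 0.6 * x - 40 * z + rng.gauss(0, 10)
        # Cut the latent value at 20, 40, 60, 80 to obtain y in 1..5.
        y = 1 + sum(y_star > cut for cut in (20, 40, 60, 80))
        rows.append((x, z, y))
    return rows

def ols_slope(pairs):
    """Least squares slope of the second coordinate on the first."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx

rows = simulate(20000)  # large n chosen here only for stable slopes
slope_z0 = ols_slope([(x, y) for x, z, y in rows if z == 0])
slope_z1 = ols_slope([(x, y) for x, z, y in rows if z == 1])
print("slope when z = 0:", round(slope_z0, 4))
print("slope when z = 1:", round(slope_z1, 4))
```

Although the latent model has the same slope for both groups, the fitted line on the category scores is much flatter when $z = 1$, because most of those observations pile up in the lowest category.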

1 Logistic Regression Models Using Cumulative Logits

Ordinal Associations in Contingency Tables
(Section 2.2 of OrdCDA)

Notation:

$n_{ij}$ = count in row $i$, column $j$ of $r \times c$ table cross classifying row variable $x$ and column variable $y$

$p_{ij} = n_{ij}/n$, where $n$ = total sample size (joint)

When $y$ response and $x$ explanatory, conditional $p_{j|i} = n_{ij}/n_{i+}$, where $n_{i+}$ = total count in row $i$. Then, $\sum_j p_{j|i} = 1$ for each $i$.

Sample conditional cumulative proportions,

$$F_{j|i} = p_{1|i} + \cdots + p_{j|i}, \quad j = 1, 2, \ldots, c,$$

recognize ordering of categories of $y$.

Denote population conditional probabilities in row $i$ by

$$\pi_{j|i} = P(y = j \mid x = i), \quad j = 1, 2, \ldots, c,$$

or $(\pi_1, \pi_2, \ldots, \pi_c)$ when suppress explanatory variables

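Ordinal odds ratios: (text Figure 2.2, p. 20, OrdCDA)

For $2 \times 2$ table, sample odds ratio is

$$\hat\theta = \frac{p_{1|1}/p_{2|1}}{p_{1|2}/p_{2|2}} = \frac{p_{11}\,p_{22}}{p_{12}\,p_{21}} = \frac{n_{11}\,n_{22}}{n_{12}\,n_{21}}$$

For $r \times c$ tables, $(r-1)(c-1)$ ordinal odds ratios include:

Local odds ratios

$$\hat\theta^L_{ij} = \frac{n_{ij}\,n_{i+1,j+1}}{n_{i,j+1}\,n_{i+1,j}}$$

Global odds ratios

$$\hat\theta^G_{ij} = \frac{\left(\sum_{a \le i}\sum_{b \le j} n_{ab}\right)\left(\sum_{a > i}\sum_{b > j} n_{ab}\right)}{\left(\sum_{a \le i}\sum_{b > j} n_{ab}\right)\left(\sum_{a > i}\sum_{b \le j} n_{ab}\right)}$$

Cumulative odds ratios

$$\hat\theta^C_{ij} = \frac{\left(\sum_{b \le j} n_{ib}\right)\left(\sum_{b > j} n_{i+1,b}\right)}{\left(\sum_{b > j} n_{ib}\right)\left(\sum_{b \le j} n_{i+1,b}\right)} = \frac{F_{j|i}/(1 - F_{j|i})}{F_{j|i+1}/(1 - F_{j|i+1})}$$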

Corresponding population ordinal odds ratios:

Local odds ratios

$$\theta^L_{ij} = \frac{P(x = i, y = j)\,P(x = i+1, y = j+1)}{P(x = i, y = j+1)\,P(x = i+1, y = j)}$$

Global odds ratios

$$\theta^G_{ij} = \frac{P(x \le i, y \le j)\,P(x > i, y > j)}{P(x \le i, y > j)\,P(x > i, y \le j)}$$

Cumulative odds ratios

$$\theta^C_{ij} = \frac{P(y \le j \mid x = i)/P(y > j \mid x = i)}{P(y \le j \mid x = i+1)/P(y > j \mid x = i+1)}$$

Another ordinal odds ratio, used for sequential processes such as survival, is the continuation odds ratio,

$$\theta^{CO}_{ij} = \frac{P(y = j \mid x = i)/P(y > j \mid x = i)}{P(y = j \mid x = i+1)/P(y > j \mid x = i+1)}$$

For a given ordinal odds ratio, association is called
positive when all log odds ratios are positive,
negative when all log odds ratios are negative.

If all $\log\theta^L_{ij} > 0$, then all $\log\theta^C_{ij} > 0$.
If all $\log\theta^C_{ij} > 0$, then all $\log\theta^G_{ij} > 0$.

Ordinal odds ratios are natural parameters for ordinal logit models (e.g., effects in the cumulative logit model presented next are summarized by cumulative odds ratios).
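The three sample ordinal odds ratios can be computed directly from a table of counts. The following is a minimal Python sketch (an illustration, not from the notes); the function names are hypothetical, indices are 0-based, and the counts are the trauma dose-response data that appear in the SAS data step later in these notes.

```python
# Trauma counts (rows: placebo, low, medium, high dose; columns: outcome 1..5).
counts = [[59, 25, 46, 48, 32],
          [48, 21, 44, 47, 30],
          [44, 14, 54, 64, 31],
          [43, 4, 49, 58, 41]]

def block(n, rows, cols):
    """Sum of counts over the given row and column index sets."""
    return sum(n[a][b] for a in rows for b in cols)

def local_or(n, i, j):
    """Uses only rows i, i+1 and columns j, j+1."""
    return (n[i][j] * n[i + 1][j + 1]) / (n[i][j + 1] * n[i + 1][j])

def cumulative_or(n, i, j):
    """Rows i, i+1; columns collapsed into (<= j) versus (> j)."""
    c = len(n[0])
    low, high = range(j + 1), range(j + 1, c)
    return (block(n, [i], low) * block(n, [i + 1], high)) / (
        block(n, [i], high) * block(n, [i + 1], low))

def global_or(n, i, j):
    """Rows collapsed into (<= i) vs (> i), columns into (<= j) vs (> j)."""
    r, c = len(n), len(n[0])
    up, down = range(i + 1), range(i + 1, r)
    low, high = range(j + 1), range(j + 1, c)
    return (block(n, up, low) * block(n, down, high)) / (
        block(n, up, high) * block(n, down, low))

for i in range(len(counts) - 1):
    print([round(cumulative_or(counts, i, j), 2) for j in range(4)])
```

For a 2 x 2 table the three definitions coincide with the ordinary odds ratio, which gives a quick check of the code.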

Alternative ways to summarize $r \times c$ tables include summary measures of association such as

(1) extensions of Kendall's tau that summarize relative numbers of concordant ($C$) and discordant ($D$) pairs:

gamma $= \hat\gamma = (C - D)/(C + D)$ (Sec. 7.1 of OrdCDA)

stochastic superiority measure for $2 \times c$ tables, $P(y_1 > y_2) + \frac{1}{2} P(y_1 = y_2)$ (Sec. 2.1 of OrdCDA)

(2) correlation measures for fixed or rank scores (Sec. 7.2)

Cumulative Logit Model with Proportional Odds

(Sec. 3.2-3.5 of OrdCDA)

$y$ an ordinal response ($c$ categories), $x$ an explanatory variable

Model $P(y \le j)$, $j = 1, 2, \ldots, c-1$, using logits

$$\text{logit}[P(y \le j)] = \log[P(y \le j)/P(y > j)] = \alpha_j + \beta x, \quad j = 1, \ldots, c-1$$

This is called a cumulative logit model

As in ordinary logistic regression, effects described by odds ratios (comparing odds of being below vs. above any point on the scale, so cumulative odds ratios are natural)

For fixed $j$, looks like ordinary logistic regression for binary response (below $j$, above $j$)

(Figure 3.1 from OrdCDA, p. 47)

Model satisfies

$$\log\left[\frac{P(y \le j \mid x_1)/P(y > j \mid x_1)}{P(y \le j \mid x_2)/P(y > j \mid x_2)}\right] = \beta(x_1 - x_2)$$

for all $j$ (Proportional odds property)

Model assumes effect $\beta$ identical for every cutpoint, $j = 1, \ldots, c-1$

$\beta$ = cumulative log odds ratio for 1-unit increase in predictor

For $r \times c$ table with scores $(1, 2, \ldots, r)$ for rows, $e^\beta$ is assumed uniform value for cumulative odds ratio.

Model extends to multiple explanatory variables,

$$\text{logit}[P(y \le j)] = \alpha_j + \beta_1 x_1 + \cdots + \beta_k x_k$$

that can be qualitative or quantitative (use indicator variables for qualitative explanatory vars)

For subject $i$, estimated conditional distribution function is

$$\hat P(y_i \le j) = \frac{\exp(\hat\alpha_j + \hat\beta' x_i)}{1 + \exp(\hat\alpha_j + \hat\beta' x_i)}$$

Estimated probability of outcome $j$ is

$$\hat P(y_i = j) = \hat P(y_i \le j) - \hat P(y_i \le j-1)$$

Logistic regression is special case $c = 2$

Uses ordinality of $y$ without assigning category scores

Can motivate proportional odds structure with regression model for underlying continuous latent variable (Anderson and Philips 1981, related probit model Aitchison and Silvey 1957, McKelvey and Zavoina 1975)
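The step from cumulative logits to category probabilities is easy to carry out by hand. The Python sketch below is an illustration only (the notes themselves use SAS); it plugs in the intercepts and dose coefficient reported in the PROC LOGISTIC output for the trauma example later in these notes, under the parameterization $\text{logit}[P(y \le j)] = \alpha_j + \beta x$. The function names are hypothetical.

```python
import math

def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))

def category_probs(alphas, beta, x):
    """P(y = j), j = 1..c, from logit[P(y <= j)] = alpha_j + beta * x."""
    cum = [expit(a + beta * x) for a in alphas] + [1.0]  # P(y <= c) = 1
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# Estimates copied from the trauma example's SAS output.
alphas = [-0.7192, -0.3186, 0.6916, 2.0570]
beta = -0.1755

for dose in (1, 2, 3, 4):
    print(dose, [round(p, 3) for p in category_probs(alphas, beta, dose)])
```

Because the intercepts are increasing, the cumulative probabilities are ordered and every category probability is positive.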

$y$ = observed ordinal response

$y^*$ = underlying continuous latent variable, cdf $G(y^* - \eta)$ with $\eta = \eta(x) = \beta' x$

thresholds (cutpoints) $-\infty = \alpha_0 < \alpha_1 < \ldots < \alpha_c = \infty$ such that

$$y = j \text{ if } \alpha_{j-1} < y^* \le \alpha_j$$

Ex. earlier in notes, p. 6. Then (Figure 3.4, p. 54 of OrdCDA)

$$P(y \le j \mid x) = P(y^* \le \alpha_j \mid x) = G(\alpha_j - \beta' x)$$

Model $G^{-1}[P(y \le j \mid x)] = \alpha_j - \beta' x$

Get cumulative logit model when $G$ = logistic cdf ($G^{-1}$ = logit).

So, cumulative logit model fits well when regression model holds for underlying logistic response.

Note: Model often expressed as $\text{logit}[P(y \le j)] = \alpha_j - \beta' x$. Then, $\beta_j > 0$ has usual interpretation of positive effect (Software may use either. Same fit, estimates except for sign)

Other properties of cumulative logit models

Can use similar model with alternative cumulative link

$$\text{link}[P(y_i \le j)] = \alpha_j - \beta' x_i$$

of cumulative prob.'s (McCullagh 1980); e.g., cumulative probit model (link = inverse of standard normal cdf) applies naturally when underlying regression model has normal $y^*$.

Effects $\beta$ invariant to choice and number of response categories (If model holds for given response categories, holds with same $\beta$ when response scale collapsed in any way).

For subject $i$, let $(y_{i1}, \ldots, y_{ic})$ be binary indicators of the response, where $y_{ij} = 1$ when response in category $j$. For independent multinomial observations at values $x_i$ of the explanatory variables for subject $i$, the likelihood function is

$$\prod_{i=1}^{n}\prod_{j=1}^{c} P(Y_i = j \mid x_i)^{y_{ij}} = \prod_{i=1}^{n}\prod_{j=1}^{c} \left[P(Y_i \le j \mid x_i) - P(Y_i \le j-1 \mid x_i)\right]^{y_{ij}}$$

$$= \prod_{i=1}^{n}\prod_{j=1}^{c} \left[\frac{\exp(\alpha_j + \beta' x_i)}{1 + \exp(\alpha_j + \beta' x_i)} - \frac{\exp(\alpha_{j-1} + \beta' x_i)}{1 + \exp(\alpha_{j-1} + \beta' x_i)}\right]^{y_{ij}}$$

Model fitting requires iterative methods. Log likelihood is concave (Pratt 1981). To get standard errors,

Newton-Raphson inverts observed information matrix $-\partial^2 L(\beta)/\partial\beta_a\,\partial\beta_b$ (e.g., SAS PROC GENMOD)

Fisher scoring inverts expected information matrix $E(-\partial^2 L(\beta)/\partial\beta_a\,\partial\beta_b)$ (e.g., SAS PROC LOGISTIC).

McCullagh (1980) provided Fisher scoring algorithm for cumulative link models and described more general model also having dispersion effects.

Inference uses standard methods for testing $H_0$: $\beta_j = 0$ (likelihood-ratio, Wald, score tests) and inverting tests of $H_0$: $\beta_j = \beta_{j0}$ to get confidence intervals for $\beta_j$.

(Wald $z = (\hat\beta_j - \beta_{j0})/SE$, or $z^2 \sim \chi^2$, poorest method for small $n$)

Software for ML fitting includes PROC LOGISTIC and

GENMOD in SAS, the polr function (proportional odds logistic

regression) in MASS library distributed with R (or S-Plus), the

oglm program in Stata, and the plum program in SPSS.

Checking goodness of fit

With nonsparse contingency table data, can check goodness of fit using Pearson $X^2$, deviance $G^2$ comparing observed cell counts to expected frequency estimates.

At setting $x_i$ of predictors with $n_i = \sum_{j=1}^{c} n_{ij}$ multinomial observations, expected frequency estimates equal

$$\hat\mu_{ij} = n_i \hat P(y = j), \quad j = 1, \ldots, c.$$

Pearson test statistic is

$$X^2 = \sum_{i,j} \frac{(n_{ij} - \hat\mu_{ij})^2}{\hat\mu_{ij}}.$$

Deviance (likelihood-ratio test statistic for testing that model holds against unrestricted alternative) is

$$G^2 = 2\sum_{i,j} n_{ij} \log\left(\frac{n_{ij}}{\hat\mu_{ij}}\right).$$

df = No. multinomial parameters − no. model parameters

With sparse data, continuous predictors, can use such measures to compare nested models.

Example: Detecting trend in dose response

Effect of intravenous medication doses on patients with subarachnoid hemorrhage trauma (p. 207, OrdCDA)

                      Glasgow Outcome Scale (y)
Treatment                Veget.   Major    Minor    Good
Group (x)     Death      State    Disab.   Disab.   Recov.
Placebo        59         25       46       48       32
Low dose       48         21       44       47       30
Med dose       44         14       54       64       31
High dose      43          4       49       58       41

(Counts as entered in the SAS data step for this example.)

Model with linear effect of dose (scores $x = 1, 2, 3, 4$) on cumulative logits for outcome,

$$\text{logit}[P(y \le j)] = \alpha_j + \beta x$$

has ML estimate $\hat\beta = -0.176$ (SE = 0.056)

Likelihood-ratio test of $H_0$: $\beta = 0$ has test stat. = 9.6 (df = 1, P = 0.002), based on twice difference in maximized log likelihoods compared to simpler model with $\beta = 0$.

SAS for modeling dose-response data

data trauma;
input dose outcome count @@;
datalines;
1 1 59 1 2 25 1 3 46 1 4 48 1 5 32
2 1 48 2 2 21 2 3 44 2 4 47 2 5 30
3 1 44 3 2 14 3 3 54 3 4 64 3 5 31
4 1 43 4 2 4 4 3 49 4 4 58 4 5 41
;
proc logistic; freq count; * proportional odds cumulative logit model;
model outcome = dose / aggregate scale=none;
run;
---------------------------------------------------------------------
Deviance and Pearson Goodness-of-Fit Statistics

Criterion    Value      DF    Value/DF    Pr > ChiSq
Deviance     18.1825    11    1.6530      0.0774
Pearson      15.8472    11    1.4407      0.1469

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio    9.6124        1     0.0019
Score               9.4288        1     0.0021
Wald                9.7079        1     0.0018

Analysis of Maximum Likelihood Estimates

                                Standard    Wald
Parameter      DF    Estimate   Error       Chi-Square    Pr > ChiSq
Intercept 1    1     -0.7192    0.1588      20.5080       <.0001
Intercept 2    1     -0.3186    0.1564      4.1490        0.0417
Intercept 3    1      0.6916    0.1579      19.1795       <.0001
Intercept 4    1      2.0570    0.1737      140.2518      <.0001
dose           1     -0.1755    0.0563      9.7079        0.0018

Goodness of fit statistics: $X^2 = 15.8$ and $G^2 = 18.2$ (df = 16 − 5 = 11), P-values = 0.15 and 0.08.
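The goodness-of-fit statistics and the likelihood-ratio statistic can be recomputed from the counts and the reported estimates. The Python sketch below is an illustration, not the SAS computation itself: it plugs in the rounded estimates from the output above rather than refitting the model, so results agree with SAS only to about two decimals. The null model ($\beta = 0$) is fitted in closed form, since every dose group then shares the pooled outcome proportions.

```python
import math

counts = [[59, 25, 46, 48, 32],   # placebo
          [48, 21, 44, 47, 30],   # low dose
          [44, 14, 54, 64, 31],   # medium dose
          [43, 4, 49, 58, 41]]    # high dose
alphas = [-0.7192, -0.3186, 0.6916, 2.0570]  # rounded SAS estimates
beta = -0.1755

def fitted_probs(x):
    """Category probabilities under logit[P(y <= j)] = alpha_j + beta * x."""
    cum = [1.0 / (1.0 + math.exp(-(a + beta * x))) for a in alphas] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, 5)]

def pearson(obs, exp):
    return sum((o - e) ** 2 / e
               for ro, re in zip(obs, exp) for o, e in zip(ro, re))

def deviance(obs, exp):
    return 2 * sum(o * math.log(o / e)
                   for ro, re in zip(obs, exp) for o, e in zip(ro, re) if o > 0)

# Expected counts under the proportional odds model (dose scores 1..4).
expected = [[sum(row) * p for p in fitted_probs(dose)]
            for dose, row in enumerate(counts, start=1)]

# Expected counts under beta = 0: pooled column proportions in every row.
total = sum(sum(row) for row in counts)
pooled = [sum(row[j] for row in counts) / total for j in range(5)]
expected_null = [[sum(row) * p for p in pooled] for row in counts]

X2 = pearson(counts, expected)
G2 = deviance(counts, expected)
LR = deviance(counts, expected_null) - G2   # twice the log-likelihood difference
df = len(counts) * (5 - 1) - (len(alphas) + 1)
print(round(X2, 2), round(G2, 2), round(LR, 2), df)
```

The likelihood-ratio statistic equals the drop in deviance between the two nested models, which is why one deviance function serves both purposes.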

Odds ratio interpretation: For dose $i + 1$, estimated odds of outcome $\ge y$ (instead of $< y$) equal $\exp(0.176) = 1.19$ times estimated odds for dose $i$, with 95% confidence interval $e^{0.176 \pm 1.96(0.056)} = (1.07, 1.33)$.

Odds ratio for dose levels (rows) 1 and 4 equals $e^{(4-1)0.176} = 1.69$

Any equally-spaced scores (e.g. 0, 10, 20, 30) for dose provide same fitted values and same test statistics (different $\hat\beta$, SE).

Unequally-spaced scores more natural in many cases (e.g., doses may be 0, 125, 250, 500). Sensitivity analysis usually shows substantive results don't depend much on that choice, unless data highly unbalanced (e.g., Graubard and Korn 1987).

Alternative analysis treats dose as factor, using indicator variables. Deviance reduces only 0.12, df = 2. With $\beta_1 = 0$: $\hat\beta_2 = -0.12$, $\hat\beta_3 = -0.32$, $\hat\beta_4 = -0.52$ (SE = 0.18 each)

Testing $H_0$: $\beta_1 = \beta_2 = \beta_3 = \beta_4$ gives LR stat. = 9.8 (df = 3, P = 0.02).

Using ordinality often increases power (focused on df = 1).
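The odds ratio and its Wald interval quoted for the dose effect follow from the estimate and standard error in a few lines. This Python sketch is only an arithmetic illustration; it uses the unrounded values from the SAS output ($\hat\beta = -0.1755$, SE = 0.0563) and flips the sign because the interpretation is stated for the odds of outcome $\ge y$.

```python
import math

beta_hat, se = -0.1755, 0.0563   # dose row of the SAS output

effect = -beta_hat               # log odds ratio for outcome >= y, per dose step
odds_ratio = math.exp(effect)
ci = (math.exp(effect - 1.96 * se), math.exp(effect + 1.96 * se))
odds_ratio_1_vs_4 = math.exp((4 - 1) * effect)

print(round(odds_ratio, 2), tuple(round(v, 2) for v in ci),
      round(odds_ratio_1_vs_4, 2))
```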

For simplicity of showing data, our examples use contingency table

data, but in general the data file may have both categorical and
quantitative explanatory variables.
Example: SAS modeling of mental health

y = mental impairment
(1=well, 2=mild impairment, 3=moderate impairment, 4=impaired)

x1 = number of life events

x2 = socioeconomic status (1 = high, 0 = low)

Subj  Mental  SES  Life     Subj  Mental    SES  Life
1     Well    1    1        21    Mild      1    9
2     Well    1    9        22    Mild      0    3
3     Well    1    4        23    Mild      1    3
4     Well    1    3        24    Mild      1    1
5     Well    0    2        25    Moderate  0    0
6     Well    1    0        26    Moderate  1    4
7     Well    0    1        27    Moderate  0    3
8     Well    1    3        28    Moderate  0    9
9     Well    1    3        29    Moderate  1    6
10    Well    1    7        30    Moderate  0    4
11    Well    0    1        31    Moderate  0    3
12    Well    0    2        32    Impaired  1    8
13    Mild    1    5        33    Impaired  1    2
14    Mild    0    6        34    Impaired  1    7
15    Mild    1    3        35    Impaired  0    5
16    Mild    0    1        36    Impaired  0    4
17    Mild    1    8        37    Impaired  0    4
18    Mild    1    2        38    Impaired  1    8
19    Mild    0    5        39    Impaired  0    8
20    Mild    1    5        40    Impaired  0    9
data impair;
input mental ses life;
datalines;
1 1 1
1 1 9
1 1 4
...
4 0 8
4 0 9
;
proc logistic;
model mental = life ses ;
run;
proc genmod;
model mental = life ses / dist=multinomial link=clogit lrci type3;
run;
OUTPUT FROM PROC GENMOD

Analysis Of Parameter Estimates

                                Standard    Likelihood Ratio 95%     Chi-
Parameter     DF    Estimate    Error       Confidence Limits        Square
Intercept1    1     -0.2819     0.6423      -1.5615     0.9839       0.19
Intercept2    1      1.2128     0.6607      -0.0507     2.5656       3.37
Intercept3    1      2.2094     0.7210       0.8590     3.7123       9.39
life          1     -0.3189     0.1210      -0.5718    -0.0920       6.95
ses           1      1.1112     0.6109      -0.0641     2.3471       3.31

LR Statistics For Type 3 Analysis

Source    DF    Chi-Square    Pr > ChiSq
life      1     7.78          0.0053
ses       1     3.43          0.0641

e.g., 95% likelihood-ratio confidence interval for the cumulative odds ratio for SES is $(e^{-0.064}, e^{2.347}) = (0.94, 10.45)$; the odds of mental impairment below any particular point could be as much as 10.45 times as high for those with high SES compared to those with low SES, for a given level of life events

Alternative ways of summarizing effects

Some researchers find odds ratios difficult to interpret.

Can compare probabilities or cumulative probs for $y$ directly, such as comparing $\hat P(y = 1)$ or $\hat P(y = c)$ at maximum and minimum values of a predictor (at means of other predictors).

ex.: At mean life events of 4.3, $\hat P(y = 1) = 0.37$ at high SES and $\hat P(y = 1) = 0.16$ at low SES.

For high SES, $\hat P(y = 1) = 0.70$ at $x_1$ = min = 0 and $\hat P(y = 1) = 0.12$ at $x_1$ = max = 9.

For low SES, $\hat P(y = 1) = 0.43$ at $x_1$ = min and $\hat P(y = 1) = 0.04$ at $x_1$ = max.
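These fitted probabilities come straight from the first cumulative logit. The Python sketch below is an illustration using the rounded PROC GENMOD estimates shown earlier (Intercept1, life, ses), so the results can differ from the quoted values in the last digit; the function name is hypothetical.

```python
import math

intercept1, b_life, b_ses = -0.2819, -0.3189, 1.1112  # GENMOD estimates

def prob_well(life, ses):
    """Estimated P(y = 1) = P(y <= 1) from logit = a1 + b_life*life + b_ses*ses."""
    eta = intercept1 + b_life * life + b_ses * ses
    return 1.0 / (1.0 + math.exp(-eta))

for label, life in (("mean life events (4.3)", 4.3), ("min (0)", 0), ("max (9)", 9)):
    print(label, "high SES:", round(prob_well(life, 1), 3),
          "low SES:", round(prob_well(life, 0), 3))
```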

Summary measures of predictive power include

(1) concordance index (prob. that observations with different

outcomes are concordant with predictions)

(2) correlation between y and estimated mean of conditional
dist. of y from model fit, based on scores {vj } for y
(mimics multiple correlation, Sec. 3.4.6 of OrdCDA).

Checking fit and selecting a model

Lack of fit may result from omitted predictors (e.g., interaction between predictors), violation of proportional odds assumption, wrong link function, dispersion as well as location effects.

Some software (e.g., PROC LOGISTIC) provides score test of proportional odds assumption, by comparing model to more general non-proportional odds model with effects $\{\beta_j\}$. This test applicable also when $X^2$, $G^2$ don't apply, but is liberal (i.e., P(Type I error) too high) and more general model can have cumulative probs out-of-order.

Can check particular aspects of fit using (1) likelihood-ratio test to compare to more complex models (test statistic = change in deviance), (2) standardized cell residuals such as

$$r_{ij} = \frac{n_{ij} - \hat\mu_{ij}}{SE} \quad \text{or} \quad \frac{\sum_{k=1}^{j} n_{ik} - \sum_{k=1}^{j} \hat\mu_{ik}}{SE}$$

Even if proportional odds model has lack of fit, it may usefully summarize first-order effects and have good power for testing $H_0$: no effect, because of its parsimony (e.g., p. 30 example).

Other criteria besides significance tests can help select a good model, such as by minimizing

AIC = −2(log likelihood − number of parameters in model)

which penalizes a model for having many parameters. This attempts to find model for which fit is closest to reality, and overfitting (too many parameters) can hurt this.

Advantages of utilizing ordinality of response include:

No interval-scale assumption about distances between
response categories (i.e., no scores needed for y )
Greater variety of models, including ones that are more
parsimonious than models that ignore ordering (such as
baseline-category logit models)
Greater statistical power for testing effects (compared to
treating categories as nominal), because of focusing effect on
smaller df .

Improved power using ordinality

Consider $r \times c$ table with ordered rows, columns.

$H_0$: independence between $x$ and $y$

Pearson $X^2 = \sum \frac{(n_{ij} - \hat\mu_{ij})^2}{\hat\mu_{ij}}$ with $\hat\mu_{ij} = (n_{i+} n_{+j})/n$ ignores orderings, df = $(r-1)(c-1)$.

How does power compare to testing $H_0$: $\beta = 0$ against $H_a$: $\beta \ne 0$ in cumulative logit model, $\text{logit}[P(y \le j)] = \alpha_j + \beta x_i$, with scores $\{x_i = i\}$ for rows (i.e., using orderings, df = 1)?

(Could use LR test, score test, or Wald test)

Powers when underlying bivariate normal, correlation 0.20, uniform row and column probs, $n = 100$, significance level 0.05:

Pearson X^2    Ordinal test
0.23           0.31
0.16           0.42
0.12           0.48
0.09           0.51            (10 × 10 table)

(One row per table size; only the 10 × 10 label survives in this copy.)

Note: Inference for single parameter also less susceptible to problems (e.g., infinite estimates) due to sparseness of data

Sample size for comparing two groups

(Sec. 3.7.2 of OrdCDA)

Want power $1 - \beta$ in $\alpha$-level test for effect of size $\beta_0$ (in proportional odds model). Let $\{\pi_j\}$ = marginal probabilities for $y$.

For two-sided test with equal group sample sizes, need approximately (Whitehead 1993)

$$n = 12(z_{\alpha/2} + z_\beta)^2 \Big/ \Big[\beta_0^2 \Big(1 - \sum_j \pi_j^3\Big)\Big].$$

Setting $\{\pi_j = 1/c\}$ provides lower bound for $n$. Then, sample size $n(c)$ needed for $c$ categories satisfies

$$n(c)/n(2) = 0.75/[1 - 1/c^2].$$

Relative to continuous response ($c = \infty$), using $c$ categories has efficiency $(1 - 1/c^2)$.

Substantial loss of information from collapsing to binary response, but little gain with $c$ more than about 5. In medical research, continuous variables often converted to binary, which introduces measurement error and loss of power.
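The sample size formula and the efficiency comparison can be evaluated numerically. The Python sketch below is an illustration under assumptions made here, not values from the notes: the effect size $\beta_0 = 0.5$, the 0.05 level, and 90% power are arbitrary choices, and the function names are hypothetical.

```python
from statistics import NormalDist

def whitehead_n(beta0, probs, alpha=0.05, power=0.90):
    """n = 12 (z_{alpha/2} + z_beta)^2 / [beta0^2 (1 - sum_j pi_j^3)]."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 12 * (z_alpha + z_beta) ** 2 / (beta0 ** 2 * (1 - sum(p ** 3 for p in probs)))

def relative_size(c, beta0=0.5):
    """n(c)/n(2) for equally likely categories (beta0 cancels in the ratio)."""
    return whitehead_n(beta0, [1 / c] * c) / whitehead_n(beta0, [0.5, 0.5])

for c in (2, 3, 4, 5, 10):
    print(c, round(whitehead_n(0.5, [1 / c] * c)), round(relative_size(c), 3))
```

The ratio drops quickly from 1 at $c = 2$ toward its limit of 0.75, which is the numerical content of the remark that little is gained beyond about 5 categories.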

Cumulative logit models without proportional odds

(Sec. 3.6 of OrdCDA)

Generalized model permits effects of explanatory variables to differ for different cumulative logits,

$$\text{logit}[P(y_i \le j)] = \alpha_j + \beta_j' x_i, \quad j = 1, \ldots, c-1.$$

Each predictor has $c - 1$ parameters, allowing different effects for $\text{logit}[P(y_i \le 1)]$, $\text{logit}[P(y_i \le 2)]$, ..., $\text{logit}[P(y_i \le c-1)]$.

Even if this model fits better, for reasons of parsimony a simple model with proportional odds structure is sometimes preferable.

Effects of explanatory variables easier to summarize and interpret.

With large $n$, small P-value in test of proportional odds may reflect statistical significance, not practical significance.

Effect estimators using simple model are biased but may have smaller MSE than estimators from more complex model, and tests may have greater power, especially when more complex model has many more parameters.

Is variability in effects great enough to make it worthwhile to use more complex model?

Example: Religious fundamentalism by region (2006 GSS data)

                   y = Religious Beliefs
x = Region         Fundamentalist    Moderate     Liberal
Northeast          92 (14%)          352 (52%)    234 (34%)
Midwest            274 (27%)         399 (40%)    326 (33%)
South              739 (44%)         536 (32%)    412 (24%)
West/Mountain      192 (20%)         423 (44%)    355 (37%)

Create indicator variables $\{r_i\}$ for region and consider model

$$\text{logit}[P(y \le j)] = \alpha_j + \beta_1 r_1 + \beta_2 r_2 + \beta_3 r_3$$

Score test of proportional odds assumption compares with model having separate $\{\beta_i\}$ for each logit, that is, 3 extra parameters:

----------------------------------------------------------------------
Score Test for the Proportional Odds Assumption

Chi-Square    DF    Pr > ChiSq
93.0162       3     <.0001
----------------------------------------------------------------------

SAS for GSS religion and region data

data religion;
input region fund count;
datalines;
1 1 92
1 2 352
1 3 234
2 1 274
2 2 399
2 3 326
3 1 739
3 2 536
3 3 412
4 1 192
4 2 423
4 3 355
;
proc genmod; weight count; class region;
model fund = region / dist=multinomial link=clogit lrci type3 ;
run;
proc logistic; weight count; class region / param=ref;
model fund = region / aggregate scale=none;
run;
-----------------------------------------------------------------
GENMOD output:

Analysis Of Parameter Estimates

                                  Standard    Likelihood Ratio 95%     Chi-
Parameter       DF    Estimate    Error       Confidence Limits        Square
Intercept1      1     -1.2618     0.0632      -1.3863    -1.1383       398.10
Intercept2      1      0.4729     0.0603       0.3548     0.5910       61.56
region    1     1     -0.0698     0.0901      -0.2466     0.1068       0.60
region    2     1      0.2688     0.0830       0.1061     0.4316       10.48
region    3     1      0.8897     0.0758       0.7414     1.0385       137.89
region    4     0      0.0000     0.0000       0.0000     0.0000       .

Model assuming proportional odds has (with $\beta_4 = 0$)

$(\hat\beta_1, \hat\beta_2, \hat\beta_3) = (-0.07, 0.27, 0.89)$

For more general model,

$(\hat\beta_1, \hat\beta_2, \hat\beta_3) = (-0.45, 0.43, 1.15)$ for $\text{logit}[P(Y \le 1)]$

$(\hat\beta_1, \hat\beta_2, \hat\beta_3) = (0.09, 0.18, 0.58)$ for $\text{logit}[P(Y \le 2)]$.

Change in sign of $\hat\beta_1$ reflects lack of stochastic ordering of first and fourth regions

Compared to resident of West, a Northeast resident is less likely to be fundamentalist (see $\hat\beta_1 = -0.45 < 0$ for $\text{logit}[P(Y \le 1)]$) but slightly more likely to be fundamentalist or moderate and slightly less likely to be liberal (see $\hat\beta_1 = 0.09 > 0$ for $\text{logit}[P(Y \le 2)]$).
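With region as the only predictor, the more general model has as many parameters as there are free multinomial probabilities, so its estimates are simply sample cumulative logit differences from the West. The Python sketch below is an illustration of that calculation from the table counts (it does not reproduce the SAS fit), with hypothetical variable names.

```python
import math

# Counts by region: (fundamentalist, moderate, liberal)
table = {
    "Northeast": (92, 352, 234),
    "Midwest": (274, 399, 326),
    "South": (739, 536, 412),
    "West": (192, 423, 355),
}

def cumulative_logits(row):
    """Sample logit[P(Y <= j)] for j = 1, 2."""
    total = sum(row)
    logits = []
    running = 0
    for count in row[:-1]:
        running += count
        logits.append(math.log(running / (total - running)))
    return logits

baseline = cumulative_logits(table["West"])
effects = {
    region: [round(a - b, 2) for a, b in zip(cumulative_logits(row), baseline)]
    for region, row in table.items() if region != "West"
}
print(effects)
```

The Northeast effect changes sign between the two logits, while the Midwest and South effects shrink but stay positive, matching the estimates quoted for the more general model.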

Peterson and Harrell (1990) proposed partial proportional odds model falling between proportional odds model and more general model (Sec. 3.6.4 of OrdCDA),

$$\text{logit}[P(y_i \le j)] = \alpha_j + \beta' x_i + \gamma_j' u_i, \quad j = 1, \ldots, c-1.$$

An alternative possible model adds dispersion effects (McCullagh 1980, Sec. 5.4 of OrdCDA)

$$\text{logit}[P(y \le j)] = \frac{\alpha_j - \beta' x}{\exp(\gamma' x)}.$$

Example: Smoking Status and Degree of Heart Disease

Smoking         Degree of Heart Disease (y)
status          1            2            3            4            5
Smoker          350 (23%)    307 (20%)    345 (22%)    481 (31%)    67 (4%)
Non-smoker      334 (45%)    99 (13%)     117 (16%)    159 (22%)    30 (4%)

y ordinal: 1 = No disease, ..., 5 = Very severe disease

x binary: 1 = smoker, 0 = non-smoker

Sample cumulative log odds ratios: −1.04, −0.65, −0.46, −0.07.

Consider model letting effect of $x$ depend on $j$,

$$\text{logit}[P(Y \le j)] = \alpha_j + \beta_1 x + (j - 1)\beta_2 x.$$

Cumulative log odds ratios are

$$\log\theta^C_{11} = \beta_1, \quad \log\theta^C_{12} = \beta_1 + \beta_2, \quad \log\theta^C_{13} = \beta_1 + 2\beta_2, \quad \log\theta^C_{14} = \beta_1 + 3\beta_2.$$

Model fits well ($G^2 = 3.43$, df = 2, P-value = 0.18)

Obtained using Joe Lang's mph.fit function in R (analysis 2 at www.stat.ufl.edu/aa/ordinal/mph.html).

ML estimates $\hat\beta_1 = -1.017$ (SE = 0.094), $\hat\beta_2 = 0.298$ (SE = 0.047) give estimated cumulative log odds ratios

$$\log\hat\theta^C_{11} = -1.02, \quad \log\hat\theta^C_{12} = -0.72, \quad \log\hat\theta^C_{13} = -0.42, \quad \log\hat\theta^C_{14} = -0.12.$$
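The sample cumulative log odds ratios and the fitted linear trend can be checked with a short calculation. This Python sketch is an illustration from the table counts and the reported ML estimates; it does not refit the model, which the notes fit with mph.fit in R.

```python
import math

smoker = [350, 307, 345, 481, 67]
non_smoker = [334, 99, 117, 159, 30]

def cumulative_odds(row):
    """Sample odds of Y <= j versus Y > j, for j = 1..4."""
    total = sum(row)
    odds, running = [], 0
    for count in row[:-1]:
        running += count
        odds.append(running / (total - running))
    return odds

sample_log_or = [math.log(s / n) for s, n in
                 zip(cumulative_odds(smoker), cumulative_odds(non_smoker))]

beta1, beta2 = -1.017, 0.298          # reported ML estimates
fitted_log_or = [beta1 + (j - 1) * beta2 for j in (1, 2, 3, 4)]

print([round(v, 2) for v in sample_log_or])
print([round(v, 2) for v in fitted_log_or])
```

The sample values move toward zero roughly linearly in $j$, which is the pattern the $(j - 1)\beta_2 x$ term is designed to capture.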

Some Models that Lang's mph.fit R Function Can Fit by ML:

mph stands for multinomial Poisson homogeneous models, which have general form

$$L(\mu) = X\beta$$

for probabilities or expected frequencies $\mu$ in a contingency table, where $L$ is a general link function (Lang 2005).

Important special case is generalized loglinear model

$$C \log(A\mu) = X\beta$$

for matrices $C$ and $A$ and vector of parameters $\beta$.

This includes ordinal logit models, such as cumulative logit; e.g., $A$ forms cumulative probs and their complements at each setting of explanatory vars (each row has 0s and 1s), and $C$ forms contrasts of log probs to generate logits (each row contains 1, −1, and otherwise 0s).

Includes models for ordinal odds ratios, such as model where all global log odds ratios take common value $\beta$. (OrdCDA, Sec. 6.6)

Another special case has form $A\mu = X\beta$, which includes multinomial mean response model that mimics ordinary regression (scores in each row of $A$). (OrdCDA, Sec. 5.6)

2 Other Ordinal Response Models

a. Models using adjacent-category logits (ACL)
(Sec. 4.1 of OrdCDA)

$$\log[P(y_i = j)/P(y_i = j+1)] = \alpha_j + \beta' x_i$$

Odds ratio uses adjacent categories, whereas in cumulative logit model it uses entire response scale (so, interpretations use local odds ratios instead of cumulative odds ratios)

Model also has proportional odds structure, for these logits (effect $\beta$ same for each cutpoint $j$)

Corresponding model for category probabilities is

$$P(y_i = j) = \frac{\exp[\alpha_j^* + (c-j)\beta' x_i]}{1 + \sum_{k=1}^{c-1} \exp[\alpha_k^* + (c-k)\beta' x_i]}, \quad j = 1, \ldots, c-1,$$

with $\alpha_j^* = \sum_{k=j}^{c-1} \alpha_k$.

Anderson (1984) noted that if

$$(x \mid y = j) \sim N(\mu_j, \Sigma)$$

then

$$\log\frac{P(y = j \mid x)}{P(y = j+1 \mid x)} = \alpha_j + \beta_j' x$$

with

$$\beta_j = \Sigma^{-1}(\mu_j - \mu_{j+1})$$

Equally-spaced means imply ACL model holds with same effects for each logit.

ACL and cumulative logit models with proportional odds

structure fit well in similar situations and provide similar

substantive results (both imply stochastic orderings of

conditional distributions of y at different predictor values)

Which to use? Cumulative logit extends inference to underlying

continuum and is invariant with respect to choice of response

categories. ACL gives effects in terms of fixed categories,

which is preferable when want to provide interpretations for
given categories rather than underlying continuum. ACL effects
are estimable with retrospective studies (e.g., case-control).

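Connection with baseline-category logit models

Baseline-category logits (BCL) with baseline $c$ are

$$\log\frac{\pi_1}{\pi_c}, \ \log\frac{\pi_2}{\pi_c}, \ \ldots, \ \log\frac{\pi_{c-1}}{\pi_c}.$$

Since

$$\log\frac{\pi_j}{\pi_c} = \log\frac{\pi_j}{\pi_{j+1}} + \log\frac{\pi_{j+1}}{\pi_{j+2}} + \cdots + \log\frac{\pi_{c-1}}{\pi_c},$$

ACL model

$$\log\frac{\pi_j(x)}{\pi_{j+1}(x)} = \alpha_j + \beta' x$$

can be fitted with software for BCL model

$$\log\frac{\pi_j(x)}{\pi_c(x)} = \sum_{k=j}^{c-1} \alpha_k + \beta'(c-j)x = \alpha_j^* + \beta' u_j$$

with adjusted predictor $u_j = (c-j)x$.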

Example: Stem Cell Research and Religious Fundamentalism

                               Stem Cell Research
          Religious          Definitely   Probably    Probably    Definitely
Gender    Beliefs            Fund         Fund        Not Fund    Not Fund
Female    Fundamentalist     34 (22%)     67 (43%)    30 (19%)    25 (16%)
          Moderate           41 (25%)     83 (52%)    23 (14%)    14 (9%)
          Liberal            58 (39%)     63 (43%)    15 (10%)    12 (8%)
Male      Fundamentalist     21 (19%)     52 (46%)    24 (21%)    15 (13%)
          Moderate           30 (27%)     52 (47%)    18 (16%)    11 (10%)
          Liberal            64 (45%)     50 (36%)    16 (11%)    11 (8%)

For gender $g$ (1 = females, 0 = males) and religious beliefs treated quantitatively with $x = (1, 2, 3)$, ACL model

$$\log(\pi_j/\pi_{j+1}) = \alpha_j + \beta_1 x + \beta_2 g$$

is equivalent to BCL model

$$\log(\pi_j/\pi_4) = \alpha_j^* + \beta_1(4-j)x + \beta_2(4-j)g$$

We set first predictor equal to $3x$ in equation for $\log(\pi_1/\pi_4)$, $2x$ in equation for $\log(\pi_2/\pi_4)$, and $x$ in equation for $\log(\pi_3/\pi_4)$; e.g., for those liberal on religion ($x = 3$), values in model matrix for religion predictor are 9, 6, 3 for each gender.

Values in model matrix for gender are 3, 2, 1 for females and 0, 0, 0 for males.
data stemcell;
input religion scresrch gender count;
datalines;
1 1 0 21
1 1 1 34
1 2 0 52
1 2 1 67
...
3 4 1 12
;
proc catmod order=data; weight count; population religion gender;
model scresrch = (1 0 0 3 0, 0 1 0 2 0, 0 0 1 1 0,
1 0 0 3 3, 0 1 0 2 2, 0 0 1 1 1,
1 0 0 6 0, 0 1 0 4 0, 0 0 1 2 0,
1 0 0 6 3, 0 1 0 4 2, 0 0 1 2 1,
1 0 0 9 0, 0 1 0 6 0, 0 0 1 3 0,
1 0 0 9 3, 0 1 0 6 2, 0 0 1 3 1) / ML NOGLS;
---------------------------------------------------------------------
Maximum Likelihood Analysis of Variance

Source              DF    Chi-Square    Pr > ChiSq
Model|Mean          4     135.66        <.0001
Likelihood Ratio    13    12.00         0.5279

Effect        Parameter    Estimate    Std. Error    Chi-Square    Pr > ChiSq
Model         1            -0.5001     0.3305        2.29          0.1302
              2             0.4508     0.2243        4.04          0.0444
              3            -0.1066     0.1647        0.42          0.5178
(RELIGION)    4             0.2668     0.0479        31.07         <.0001
(GENDER)      5            -0.0141     0.0767        0.03          0.8539
---------------------------------------------------------------------

For moderates, estimated odds of (definitely fund) vs. (probably fund) are $\exp(0.2668) = 1.31$ times estimated odds for fundamentalists, whereas estimated odds of (definitely fund) vs. (definitely not fund) are $\exp[3(0.2668)] = 2.23$ times the estimated odds for fundamentalists, for each gender.

Ordinal models with trend in location display strongest association with most extreme categories. e.g., for liberals, estimated odds of (definitely fund) vs. (definitely not) are $\exp[2(3)(0.2668)] = 4.96$ times estimated odds for fundamentalists, for each gender.
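The odds ratios above can be recovered from fitted category probabilities. The Python sketch below is an illustration that plugs the five PROC CATMOD estimates into the equivalent BCL form with adjusted predictors $(4-j)x$ and $(4-j)g$; it assumes parameters 1 to 3 are the baseline-category intercepts, as the design matrix indicates, and the function names are hypothetical.

```python
import math

alpha_star = [-0.5001, 0.4508, -0.1066]   # intercepts for log(pi_j / pi_4)
beta_religion, beta_gender = 0.2668, -0.0141
C = 4                                      # number of response categories

def fitted_probs(x, g):
    """Category probabilities for religion score x (1..3) and gender g (1 = female)."""
    linear = [alpha_star[j - 1] + (C - j) * (beta_religion * x + beta_gender * g)
              for j in range(1, C)]
    weights = [math.exp(v) for v in linear] + [1.0]   # baseline category 4
    total = sum(weights)
    return [w / total for w in weights]

def odds(probs, a, b):
    """Odds of category a versus category b (1-based)."""
    return probs[a - 1] / probs[b - 1]

fund, moderate, liberal = (fitted_probs(x, 1) for x in (1, 2, 3))
adjacent = odds(moderate, 1, 2) / odds(fund, 1, 2)
extreme = odds(moderate, 1, 4) / odds(fund, 1, 4)
extreme_liberal = odds(liberal, 1, 4) / odds(fund, 1, 4)
print([round(p, 3) for p in fund])
print(round(adjacent, 2), round(extreme, 2), round(extreme_liberal, 2))
```

The fitted probabilities for female fundamentalists come out close to the observed percentages in the table, and the odds ratios are the same for males because gender enters additively on the logit scale.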

Model describes 18 multinomial probabilities (3 for each religion × gender combination) using 5 parameters. Deviance $G^2 = 12.00$, df = 18 − 5 = 13 (P-value = 0.53).

Similar substantive results with cumulative logit model. Religious beliefs effect larger ($\hat\beta_1 = 0.488$, SE = 0.080), since refers to entire response scale. However, statistical significance similar, with $(\hat\beta_1/SE) > 5$ for each model.

Connection with ordinal loglinear models
(Sec. 6.2-6.3 of OrdCDA)

For contingency tables, ACL models are equivalent to Poisson loglinear models, called association models, that use equally-spaced scores for y (Goodman 1979, 1985). e.g., for r x c table with ordered rows and columns, ACL model for row scores {$u_i$},

$\log[P(y = j+1)/P(y = j)] = \alpha_j + \beta u_i$

is equivalent to linear-by-linear association model for expected frequencies {$\mu_{ij} = E(n_{ij})$},

$\log \mu_{ij} = \lambda + \lambda_i^x + \lambda_j^y + \beta u_i v_j$,

with {$v_j = j$}. (Find $\log[\mu_{i,j+1}/\mu_{ij}]$ and simplify.)

Effect $\beta$ describes local log odds ratios (uniform association for {$u_i = i$}, {$v_j = j$}).
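A numerical sketch of this equivalence: build expected frequencies from a linear-by-linear association model and confirm that every local log odds ratio equals beta and that the adjacent-categories logits are linear in the row score. All parameter values below are hypothetical, chosen only for illustration.

```python
import math

def lxl_expected(lam, row_eff, col_eff, beta, u, v):
    """Expected frequencies under log mu_ij = lam + row_i + col_j + beta*u_i*v_j."""
    return [[math.exp(lam + row_eff[i] + col_eff[j] + beta * u[i] * v[j])
             for j in range(len(v))] for i in range(len(u))]

def local_log_odds_ratios(mu):
    """Log odds ratios for all 2x2 blocks of adjacent rows and columns."""
    return [[math.log(mu[i][j] * mu[i + 1][j + 1] / (mu[i][j + 1] * mu[i + 1][j]))
             for j in range(len(mu[0]) - 1)] for i in range(len(mu) - 1)]

# Hypothetical parameter values (not from the source)
beta = 0.25
u = [1, 2, 3]        # row scores u_i = i
v = [1, 2, 3, 4]     # column scores v_j = j
mu = lxl_expected(1.0, [0.0, 0.3, -0.2], [0.0, 0.5, 0.1, -0.4], beta, u, v)

lor = local_log_odds_ratios(mu)
# Adjacent-categories logits log[mu_{i,j+1}/mu_ij] within each row
acl = [[math.log(row[j + 1] / row[j]) for j in range(len(row) - 1)] for row in mu]

print(lor[0][0], acl[1][0] - acl[0][0])  # both equal beta up to rounding error
```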
Related literature for correspondence analysis models and equivalent canonical correlation models, which use an association term to model [$\mu_{ij} - \mu_{ij}(\text{indep})$] rather than [$\log \mu_{ij} - \log \mu_{ij}(\text{indep})$] (Goodman 1986).

Association models can use alternative measures (e.g., global odds ratios, Dale 1986, Sec. 6.6 of OrdCDA).

b. Models using continuation-ratio logits
(Sec. 4.2 of OrdCDA)

$\log[P(y_i = j)/P(y_i \ge j+1)]$, j = 1, ..., c - 1, or

$\log[P(y_i = j+1)/P(y_i \le j)]$, j = 1, ..., c - 1

Let $\omega_j = P(y = j \mid y \ge j) = \dfrac{\pi_j}{\pi_j + \cdots + \pi_c}$. Then

$\log \dfrac{\pi_j}{\pi_{j+1} + \cdots + \pi_c} = \log[\omega_j/(1 - \omega_j)]$.

Of interest when a sequential mechanism determines the response outcome (Tutz 1991) or for grouped survival data (Prentice and Gloeckler 1978).

Simple model with proportional odds structure is

$\text{logit}[\omega_j(x)] = \alpha_j + \beta' x$, j = 1, ..., c - 1.

More general model $\text{logit}[\omega_j(x)] = \alpha_j + \beta_j' x$ has fit equivalent to fit of c - 1 separate binary logit models, because multinomial factors into binomials,

$p(y_{i1}, \ldots, y_{ic}) = p(y_{i1})\, p(y_{i2} \mid y_{i1}) \cdots p(y_{ic} \mid y_{i1}, \ldots, y_{i,c-1})$
$= \text{bin}[1, y_{i1}; \omega_1(x_i)] \cdots \text{bin}[1 - y_{i1} - \cdots - y_{i,c-2},\, y_{i,c-1}; \omega_{c-1}(x_i)]$.
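The mapping between category probabilities and continuation probabilities can be sketched as below. The sample proportions come from the carrier row of the tonsil-size example that follows (19, 29, 24 out of 72); the function names are illustrative.

```python
def continuation_probs(pi):
    """omega_j = P(y = j | y >= j) for j = 1, ..., c-1."""
    omegas = []
    remaining = 1.0                 # P(y >= j), starts at 1 for j = 1
    for p in pi[:-1]:
        omegas.append(p / remaining)
        remaining -= p
    return omegas

def category_probs(omegas):
    """Invert the mapping: pi_j = omega_j * prod_{k<j} (1 - omega_k)."""
    pi = []
    survive = 1.0                   # probability of passing all earlier stages
    for w in omegas:
        pi.append(survive * w)
        survive *= (1.0 - w)
    pi.append(survive)              # last category takes what is left
    return pi

pi = [19 / 72, 29 / 72, 24 / 72]
omegas = continuation_probs(pi)
recovered = category_probs(omegas)
print(omegas, recovered)
```

Each omega is a binomial success probability at one stage of the sequence, which is the factorization that lets binary-logit software fit the model.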

Example: Tonsil Size and Streptococcus

                            Tonsil Size
Carrier    Not enlarged    Enlarged     Greatly Enlarged
Yes        19 (26%)        29 (40%)     24 (33%)
No         497 (37%)       560 (42%)    269 (20%)

Let x = whether carrier of Streptococcus pyogenes (1 = yes, 0 = no).

Continuation-ratio logit model fits well (deviance 0.01, df = 1):

$\log \dfrac{\pi_1}{\pi_2 + \pi_3} = \alpha_1 + \beta x, \quad \log \dfrac{\pi_2}{\pi_3} = \alpha_2 + \beta x$

has $\hat\beta = -0.528$ (SE = 0.196). Model estimates an assumed common value exp(-0.528) = 0.59 for cumulative odds ratio from first part of model and for local odds ratio from second part.

e.g., given that tonsils were enlarged, for carriers, estimated odds of response enlarged rather than greatly enlarged were 0.59 times estimated odds for non-carriers.

By contrast, cumulative logit model estimates exp(-0.6025) = 0.55 for each cumulative odds ratio, and ACL model estimates exp(-0.429) = 0.65 for each local odds ratio. (Both these models also fit well: Deviances 0.30, 0.24, df = 1.)

------------------------------------------------------------------------------
data tonsils; * look at data as indep. binomials;
input stratum carrier success failure;
n = success + failure;
datalines;
1 1 19 53
1 0 497 829
2 1 29 24
2 0 560 269
;
proc genmod data=tonsils; class stratum;
model success/n = stratum carrier / dist=binomial link=logit lrci type3;
run;
-----------------------------------------------------------------------

                     Analysis Of Parameter Estimates

                                   Standard    Likelihood Ratio 95%      Chi-
Parameter        DF    Estimate       Error      Confidence Limits     Square
Intercept         1      0.7322      0.0729      0.5905      0.8762    100.99
stratum    1      1     -1.2432      0.0907     -1.4220     -1.0662    187.69
stratum    2      0      0.0000      0.0000      0.0000      0.0000       .
carrier           1     -0.5285      0.1979     -0.9218     -0.1444      7.13

                LR Statistics For Type 3 Analysis

                          Chi-
Source          DF      Square    Pr > ChiSq
carrier          1        7.32        0.0068
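As a cross-check on the GENMOD listing, the same binomial logit model can be fitted with a few Newton-Raphson steps in plain Python. This is a minimal sketch, not the SAS algorithm itself: it assumes the four binomial rows of the data step, codes stratum 2 as the reference level (as in the output above), and uses a hand-written linear solver to stay dependency-free. The estimates and the carrier standard error should land close to the listed values.

```python
import math

# (stratum-1 indicator, carrier, successes, failures) from the data step above
rows = [(1, 1, 19, 53), (1, 0, 497, 829), (0, 1, 29, 24), (0, 0, 560, 269)]

def solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_binomial_logit(rows, iterations=25):
    """Newton-Raphson for logit(p) = b0 + b1*stratum1 + b2*carrier."""
    beta = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        score = [0.0] * 3
        info = [[0.0] * 3 for _ in range(3)]
        for s1, carrier, succ, fail in rows:
            x = (1.0, float(s1), float(carrier))
            n = succ + fail
            eta = sum(b * xi for b, xi in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for a in range(3):
                score[a] += x[a] * (succ - n * p)          # gradient of log-likelihood
                for c in range(3):
                    info[a][c] += n * p * (1 - p) * x[a] * x[c]  # information matrix
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta, info

beta_hat, info = fit_binomial_logit(rows)
# Standard error of the carrier effect from the inverse information matrix
se_carrier = math.sqrt(solve(info, [0.0, 0.0, 1.0])[2])
print([round(b, 4) for b in beta_hat], round(se_carrier, 4))
```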

Adjacent Categories Logit and Continuation Ratio Logit Models with Nonproportional Odds

As in cumulative logit case, model of proportional odds form fits poorly when there are substantive dispersion effects.

Each model has a more general non-proportional odds form, the ACL version being

$\log[P(y_i = j)/P(y_i = j+1)] = \alpha_j + \beta_j' x_i$

Unlike cumulative logit model, these models do not have structural problem of out-of-order cumulative probabilities.

Models lose ordinal advantage of parsimony, but effects still have ordinal nature, unlike BCL models.

Can fit general ACL model with software for BCL model, converting its {$\hat\beta_j^*$} estimates to $\hat\beta_j = \hat\beta_j^* - \hat\beta_{j+1}^*$, since

$\log \dfrac{\pi_j}{\pi_{j+1}} = \log \dfrac{\pi_j}{\pi_c} - \log \dfrac{\pi_{j+1}}{\pi_c}$.
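The conversion from baseline-category to adjacent-categories effects is a simple differencing step, sketched below. The input values are hypothetical, and the helper name `bcl_to_acl` is illustrative; the baseline category c is taken to have effect 0.

```python
def bcl_to_acl(beta_star):
    """Convert baseline-category logit effects (categories 1..c-1 vs. c)
    into adjacent-categories effects for log(pi_j / pi_{j+1})."""
    padded = list(beta_star) + [0.0]   # effect for baseline category c is 0
    return [padded[j] - padded[j + 1] for j in range(len(beta_star))]

# Hypothetical BCL estimates for a single predictor with c = 4 categories
beta_star = [0.9, 0.5, 0.2]
acl_effects = bcl_to_acl(beta_star)
print(acl_effects)
```

Summing the adjacent-categories effects from category j up to c - 1 recovers the baseline-category effect for category j.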

c. Stereotype model: Multiplicative paired-category logits
(Sec. 4.3 of OrdCDA)

ACL model with separate effects for each pair of adjacent categories is equivalent to standard BCL model

$\log \dfrac{\pi_j}{\pi_c} = \alpha_j + \beta_j' x, \quad j = 1, \ldots, c-1.$

Disadvantage: lack of parsimony (treats response as nominal)

c - 1 parameters for each predictor instead of a single parameter

No. parameters large when either c or no. of predictors large

Anderson (1984) proposed alternative model nested between ACL model with proportional odds structure and the general ACL or BCL model with separate effects {$\beta_j$} for each logit.

Stereotype model:

$\log \dfrac{\pi_j}{\pi_c} = \alpha_j + \phi_j \beta' x, \quad j = 1, \ldots, c-1.$

For predictor $x_k$, $\phi_j \beta_k$ represents log odds ratio for categories j and c for a unit increase in $x_k$. By contrast, general BCL model has log odds ratio $\beta_{jk}$ for this effect, which requires many more parameters.

{$\phi_j$} are parameters, treated as scores for categories of y.

Like proportional odds models, stereotype model has advantage of single parameter $\beta_k$ for effect of predictor $x_k$ (for given scores {$\phi_j$}).

Stereotype model achieves parsimony by using same scores for each predictor, which may or may not be realistic.

Identifiability requires location and scale constraints on {$\phi_j$}, such as ($\phi_1 = 1, \phi_c = 0$) or ($\phi_1 = 0, \phi_c = 1$).

Corresponding model for category probabilities is

$P(y_i = j) = \dfrac{\exp(\alpha_j + \phi_j \beta' x_i)}{1 + \sum_{k=1}^{c-1} \exp(\alpha_k + \phi_k \beta' x_i)}$

Model is multiplicative in parameters, which makes model fitting awkward (gnm add-on function to R fits this and other nonlinear models).
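A small sketch of the category-probability formula, under the constraint that the baseline category c has alpha_c = 0 and phi_c = 0 (so its numerator is 1). Parameter values are hypothetical and the function name is illustrative; this evaluates probabilities for given parameters and does not fit the model.

```python
import math

def stereotype_probs(alphas, phis, beta, x):
    """Category probabilities for the stereotype model.
    alphas, phis: length c-1; category c is the baseline (alpha_c = phi_c = 0).
    beta, x: coefficient and predictor vectors of equal length."""
    lin = sum(b * xi for b, xi in zip(beta, x))          # beta' x
    terms = [math.exp(a + phi * lin) for a, phi in zip(alphas, phis)]
    denom = 1.0 + sum(terms)
    return [t / denom for t in terms] + [1.0 / denom]

# Hypothetical values: c = 4 categories, two predictors, phi_1 = 1 and phi_c = 0
alphas = [0.2, 0.5, 0.1]
phis = [1.0, 0.6, 0.3]
beta = [0.8, -0.4]
x = [1.0, 2.0]

probs = stereotype_probs(alphas, phis, beta, x)
log_odds_1_vs_c = math.log(probs[0] / probs[-1])
print(probs, log_odds_1_vs_c)
```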

d. Cumulative Probit Models (Sec. 5.2 of OrdCDA)

Denote cdf of standard normal by $\Phi$.

Cumulative probit model is

$\Phi^{-1}[P(y \le j)] = \alpha_j + \beta' x, \quad j = 1, \ldots, c-1$

e.g., $P(y \le j) = 1/2$ when $\alpha_j + \beta' x = 0$, since $\Phi(0) = 1/2$ = P(standard normal r.v. < 0).

As in proportional odds models (logit link), effect $\beta$ is same for each cumulative probability. (Here, not appropriate to call this a proportional odds model, because interpretations do not apply to odds or odds ratios.)

Properties

Motivated by underlying normal regression model for latent variable $y^*$ with constant $\sigma$ (WLOG, can set $\sigma = 1$). Then, coefficient $\beta_k$ of $x_k$ has interpretation that a unit increase in $x_k$ corresponds to change in $E(y^*)$ of $\beta_k$ (change of $\beta_k$ standard deviations, when $\sigma \ne 1$), keeping fixed other predictor values.

Logistic and normal cdfs having same mean and standard deviation look similar, so cumulative probit models and cumulative logit models fit well in similar situations.

Standard logistic distribution $G(y) = e^y/(1 + e^y)$ has mean 0 and standard deviation $\pi/\sqrt{3} = 1.8$. ML estimates from cumulative logit models tend to be about 1.6 to 1.8 times ML estimates from cumulative probit models.

Quality of fit and statistical significance essentially same for cumulative probit and cumulative logit models. Both imply stochastic orderings at different x levels and are designed to detect location rather than dispersion effects.

Example: Religious fundamentalism by highest educational degree (GSS data from 1972 to 2006, huge n)

                                     Religious Beliefs
Highest Degree           Fundamentalist    Moderate        Liberal
Less than high school    4913 (43%)        4684 (41%)      1905 (17%)
High school              8189 (32%)        11196 (44%)     6045 (24%)
Junior college           728 (29%)         1072 (43%)      679 (27%)
Bachelor                 1304 (20%)        2800 (43%)      2464 (38%)
Graduate                 495 (16%)         1193 (39%)      1369 (45%)

For cumulative link model

$\text{link}[P(y \le j)] = \alpha_j + \beta x_i$

using scores {$x_i = i$} for highest degree,

$\hat\beta = -0.206$ (SE = 0.0045) for probit link

$\hat\beta = -0.345$ (SE = 0.0075) for logit link

----------------------------------------------------------------------
data religion;
input degree religion count;
datalines;
0 1 4913
0 2 4684
0 3 1905
1 1 8189
1 2 11196
1 3 6045
...
4 3 1369
;
proc logistic; weight count;
model religion = degree / link=probit aggregate scale=none;
----------------------------------------------------------------------
        Score Test for the Equal Slopes Assumption
        Chi-Square      DF     Pr > ChiSq
            0.2452       1         0.6205

     Deviance and Pearson Goodness-of-Fit Statistics
Criterion         Value     DF     Value/DF     Pr > ChiSq
Deviance        48.7072      7       6.9582         <.0001
Pearson         48.9704      7       6.9958         <.0001

                 Model Fit Statistics
                  Intercept       Intercept and
Criterion              Only          Covariates
AIC               105532.77           103395.09
SC                105534.19           103397.21
-2 Log L          105528.77           103389.09

                                   Standard          Wald
Parameter       DF    Estimate        Error    Chi-Square    Pr > ChiSq
Intercept 1      1     -0.2240      0.00799      785.6659        <.0001
Intercept 2      1      0.9400      0.00868    11736.5822        <.0001
degree           1     -0.2059      0.00447     2120.0908        <.0001
-----------------------------------------------------------------------
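Fitted category probabilities follow directly from the probit estimates in this listing. The sketch below assumes the degree scores 0 to 4 used in the data step and the parameterization in which the linear predictor is intercept plus slope times degree for P(religion <= j); the function name is illustrative.

```python
from statistics import NormalDist

intercepts = [-0.2240, 0.9400]   # Intercept 1, Intercept 2 from the output
slope = -0.2059                  # degree effect from the output

def fitted_probs(degree):
    """Estimated P(fundamentalist), P(moderate), P(liberal) at a degree score."""
    phi = NormalDist().cdf
    cum = [phi(a + slope * degree) for a in intercepts] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

for degree in range(5):
    print(degree, [round(p, 3) for p in fitted_probs(degree)])
```

For degree score 0 the fitted fundamentalist probability is about 0.41, near the observed 43% for less than high school.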

From probit $\hat\beta = -0.206$, for category increase in highest degree, mean of underlying response on religious beliefs estimated to decrease by 0.21 standard deviations.

From logit $\hat\beta = -0.345$, estimated odds of response in fundamentalist rather than liberal direction multiply by exp(-0.345) = 0.71 for each category increase in degree. e.g., estimated odds of fundamentalist rather than moderate or liberal for those with less than high school education are 1/exp[4(-0.345)] = 4.0 times estimated odds for those with graduate degree.

For each category increase in highest degree, mean of underlying response on religious beliefs estimated to decrease by $0.345/(\pi/\sqrt{3}) = 0.19$ standard deviations.
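The logit-link interpretations above reduce to three small computations, sketched here with the reported estimate. Variable names are illustrative.

```python
import math

beta_logit = -0.345   # cumulative logit estimate for highest degree

# Multiplicative effect on the cumulative odds per category increase in degree
per_category = math.exp(beta_logit)

# Less than high school vs. graduate degree: scores differ by 4
lt_hs_vs_grad = 1.0 / math.exp(4 * beta_logit)

# Standard deviation of the standard logistic distribution
logistic_sd = math.pi / math.sqrt(3)

# Shift in the underlying latent mean, in standard deviation units
latent_shift = abs(beta_logit) / logistic_sd

print(round(per_category, 2), round(lt_hs_vs_grad, 1),
      round(logistic_sd, 2), round(latent_shift, 2))
```

Dividing by the logistic standard deviation puts the logit estimate on roughly the same scale as the probit estimate of 0.21 standard deviations.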

Goodness of fit?

Cumulative probit: Deviance = 48.7 (df = 7)
Cumulative logit: Deviance = 45.4 (df = 7)

Either link treating education as factor passes goodness-of-fit test, but fit not practically different than with simpler linear trend model. e.g., Probit deviance = 5.2, logit deviance = 2.4 (df = 4).

Probit $\hat\beta_1 = 0.83$, $\hat\beta_2 = 0.56$, $\hat\beta_3 = 0.46$, $\hat\beta_4 = 0.17$, $\hat\beta_5 = 0$

e. Cumulative Log-Log Links (Sec. 5.3 of OrdCDA)

Logit and probit links have symmetric S shape, in sense that $P(y \le j)$ approaches 1.0 at same rate as it approaches 0.0.

Model with complementary log-log link

$\log\{-\log[1 - P(y \le j)]\} = \alpha_j + \beta' x$

approaches 1.0 at faster rate than approaches 0.0. It and corresponding log-log link,

$\log\{-\log[P(y \le j)]\}$,

based on underlying skewed distributions (extreme value) with cdf of form $G(y) = \exp\{-\exp[-(y - a)/b]\}$.

Model with complementary log-log link has interpretation that

$P(y > j \mid x \text{ with } x_k = x + 1) = P(y > j \mid x \text{ with } x_k = x)^{\exp(\beta_k)}$

Most software provides complementary log-log link, but can fit model with log-log link by reversing order of categories and using complementary log-log link.

Example: Life table for gender and race (percent)

                    Males               Females
Life Length     White    Black      White    Black
0-20              1.3      2.6        0.9      1.8
20-40             2.8      4.9        1.3      2.4
40-50             3.2      5.6        1.9      3.7
50-65            12.2     20.1        8.0     12.9
Over 65          80.5     66.8       87.9     79.2

Source: 2008 Statistical Abstract of the United States

For gender g (1 = female; 0 = male), race r (1 = black; 0 = white), and life length y, consider model

$\log\{-\log[1 - P(y \le j)]\} = \alpha_j + \beta_1 g + \beta_2 r$

Good fit with this model or a cumulative logit model or a cumulative probit model.

data lifetab;
input sex $ race $ age count;
datalines;
m w 20 13
f w 20 9
m b 20 26
f b 20 18
m w 40 28
f w 40 13
m b 40 49
f b 40 24
m w 50 32
f w 50 19
m b 50 56
f b 50 37
m w 65 122
f w 65 80
m b 65 201
f b 65 129
m w 100 805
f w 100 879
m b 100 668
f b 100 792
;
proc logistic; weight count; class sex race / param=ref;
model age = sex race / link=cloglog aggregate scale=none;
run;
proc genmod; weight count; class sex race;
model age = sex race / dist=multinomial link=CCLL lrci type3 obstats;
run;
----------------------------------------------------------------------------
                          The GENMOD Procedure

                     Analysis Of Parameter Estimates

                                   Standard    Likelihood Ratio 95%      Chi-
Parameter        DF    Estimate       Error      Confidence Limits     Square
Intercept1        1     -4.2127      0.1338     -4.4840     -3.9587    991.04
Intercept2        1     -3.1922      0.0911     -3.3741     -3.0168   1226.85
Intercept3        1     -2.5821      0.0764     -2.7340     -2.4347   1143.60
Intercept4        1     -1.5216      0.0623     -1.6458     -1.4015    596.43
sex        f      1     -0.5383      0.0703     -0.6769     -0.4011     58.57
sex        m      0      0.0000      0.0000      0.0000      0.0000       .
race       b      1      0.6107      0.0709      0.4725      0.7506     74.20
race       w      0      0.0000      0.0000      0.0000      0.0000       .

$\hat\beta_1 = -0.538$, $\hat\beta_2 = 0.611$

Gender effect:

$P(y > j \mid g = 0, r) = [P(y > j \mid g = 1, r)]^{\exp(0.538)}$

Given race, proportion of men living longer than a fixed time equals proportion for women raised to exp(0.538) = 1.71 power.

Given gender, proportion of blacks living longer than a fixed time equals proportion for whites raised to exp(0.611) = 1.84 power.

Cumulative logit model has gender effect = -0.604, race effect = 0.685.

If $\Omega$ denotes odds of living longer than some fixed time for white women, then estimated odds of living longer than that time are

exp(-0.604)$\Omega$ = 0.55$\Omega$ for white men

exp(-0.685)$\Omega$ = 0.50$\Omega$ for black women

exp(-0.604 - 0.685)$\Omega$ = 0.28$\Omega$ for black men
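The power interpretation of the complementary log-log estimates can be verified numerically. This sketch uses the reported effects and, as an illustration, the GENMOD estimate Intercept4 = -1.5216 for the cutpoint at age 65; treating white males as the reference group follows the reference coding in the output. The function name `survival` is illustrative.

```python
import math

beta_gender = -0.538   # g = 1 for female
beta_race = 0.611      # r = 1 for black

def survival(alpha_j, g, r):
    """P(y > j) implied by log{-log[1 - P(y <= j)]} = alpha_j + b1*g + b2*r."""
    return math.exp(-math.exp(alpha_j + beta_gender * g + beta_race * r))

male_power = math.exp(-beta_gender)   # men's proportion = women's ** male_power
black_power = math.exp(beta_race)     # blacks' proportion = whites' ** black_power

alpha_65 = -1.5216
white_men = survival(alpha_65, 0, 0)
white_women = survival(alpha_65, 1, 0)
black_men = survival(alpha_65, 0, 1)

# Multipliers of the odds of living longer under the cumulative logit fit
logit_multipliers = [math.exp(-0.604), math.exp(-0.685), math.exp(-0.604 - 0.685)]

print(round(male_power, 2), round(black_power, 2),
      [round(m, 2) for m in logit_multipliers])
```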

f. Modeling Non-Standard Count Data

Count responses often have zero inflation, e.g., number of medical appointments a subject had in past year; some subjects have 0 observation as result of chance, others because of doctor-avoidance phobia or (in U.S.) cost and/or lack of medical insurance.

Models for clustered zero-inflated count data include
(a) hurdle model that uses logistic regression to model whether an observation is zero or positive and a separate loglinear model with a truncated distribution for the positive counts
(b) a zero-inflated Poisson model that for each observation uses a mixture of a Poisson loglinear model and a degenerate distribution at 0
(c) a zero-inflated negative binomial model that allows overdispersion relative to the zero-inflated Poisson model.

Model (a) requires separate parameters for the effects of explanatory variables in the logistic model and in the loglinear model.

Models (b) and (c) require separate parameters for the effects of explanatory variables on the mixture probability and in the loglinear model.

The models can encounter fitting difficulties if there is zero-deflation at any settings of explanatory variables.

When $Y_t$ has relatively few distinct count outcomes, simple alternative approach applies a cumulative logit random effects model to the count data (Min and Agresti 2005).

The first category is the zero outcome and each other count outcome is a separate category.

When large number of count values recorded, collapse into ordered categories (at least 4 categories to avoid power loss).

This approach has advantage of single set of parameters for describing effects. Those parameters describe effects overall, rather than conditional on a response being positive.

Modeling Repeated Measures of Zero-Inflated Data

ex. Pharmaceutical study comparing two treatments for a disease with 118 patients, half randomly allocated to each treatment. Response = number of episodes of a certain side effect, with this count observed at six times.

Side Effect Frequencies for Treatment A and Treatment B
[Frequency table; surviving entries: 312, 278, Total 590]

Complete data and SAS code for various analyses at www.stat.ufl.edu/~aa/ordinal/ord.html

Other explanatory variable: time elapsed since previous observation.

Min and Agresti showed strong evidence of zero inflation for standard models for counts, such as Poisson model with a random intercept.

For ordinal approach, group $Y_t$ into (0, 1, 2, 3, 4, > 4).

Random effects model: For subject i,

$\text{logit}[P(Y_{it} \le j)] = u_i + \alpha_j + \beta_1 tr + \beta_2 \log(\text{time}), \quad j = 0, \ldots, 4,$

where tr is indicator for whether the subject uses treatment A.

From analysis discussed in text (p. 292) with independent random effects $u_i \sim N(0, \sigma^2)$ integrated out to get likelihood, $\hat\beta_1 = 0.977$ has SE = 0.431.

At each observation time and for a fixed time elapsed since previous observation, estimated odds that number of side effects falls below any fixed level with treatment A are exp(0.977) = 2.66 times estimated odds for treatment B.

Estimate $\hat\sigma_u = 1.73$ (with SE = 0.25) of variability among {$u_i$} suggests considerable within-subject positive correlation among the repeated responses.
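The grouping step of the ordinal approach and the reported treatment odds ratio can be sketched as follows. The helper name `to_ordinal_category` and the sample counts are illustrative, not from the study data.

```python
import math

def to_ordinal_category(count, top=4):
    """Map a count to an ordered category: 0, 1, ..., top keep their own
    category, and every count above `top` falls in one final category."""
    if count < 0:
        raise ValueError("counts must be non-negative")
    return count if count <= top else top + 1

# Hypothetical side-effect counts for one subject over six occasions
counts = [0, 0, 2, 7, 4, 5]
categories = [to_ordinal_category(c) for c in counts]

# Cumulative odds ratio for treatment A vs. B from the reported estimate
treatment_odds_ratio = math.exp(0.977)

print(categories, round(treatment_odds_ratio, 2))
```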

Bayesian Ordinal Data Analysis
(Chapter 11 of OrdCDA)

Recall the posterior density function is proportional to the product of the prior density function with the likelihood function.

Some highlights:

For multinomial data, Dirichlet distribution serves as conjugate prior over ($\pi_1, \ldots, \pi_c$) probability simplex. Useful for smoothing contingency tables (Lindley 1964, Good 1965).

For 2 x c ordinal table with two independent Dirichlet priors, Altham (1969) derived expression for posterior probability that one distribution is stochastically larger than the other. (p. 336 OrdCDA)

Logistic-normal prior (multivariate normal for vector of logits) adapts better to ordinality and is more flexible, through, e.g., $\text{Corr}[\text{logit}(\pi_a), \text{logit}(\pi_b)] = \rho^{|a-b|}$ (Leonard 1973, smoothing a histogram).
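Conjugacy makes Dirichlet smoothing a one-line update: the posterior is Dirichlet with the observed counts added to the prior parameters. The sketch below applies it to the carrier row of the tonsil-size table (19, 29, 24), with a uniform Dirichlet(1, 1, 1) prior chosen purely for illustration.

```python
def dirichlet_posterior_mean(counts, prior):
    """Posterior mean of multinomial probabilities under a Dirichlet prior:
    (n_j + a_j) / (n + sum of a_k)."""
    total = sum(counts) + sum(prior)
    return [(n + a) / total for n, a in zip(counts, prior)]

counts = [19, 29, 24]        # carriers: not enlarged, enlarged, greatly enlarged
prior = [1.0, 1.0, 1.0]      # hypothetical uniform prior
posterior_mean = dirichlet_posterior_mean(counts, prior)
sample_props = [n / sum(counts) for n in counts]

print(sample_props, posterior_mean)
```

The posterior means are pulled slightly from the sample proportions toward the prior means of 1/3 each.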

Bayesian Approaches with Ordinal Models

For modeling, in which parameters are real-valued, using a multivariate normal prior provides broad scope. Prior is non-informative when prior $\sigma$ values large (e.g., 1000).

Posterior distributions approximated using MCMC methods (e.g., with software such as WinBUGS), most simply using methods for normal responses based on latent variable connections (Albert and Chib 1993; P. Hoff 2009, First Course in Bayesian Statistical Methods, Ch. 12).

SAS ver. 9.2 has BAYES option in PROC GENMOD, for univariate Y (e.g., binomial, Poisson, normal) but not multinomial. (Sec. 11.3.5, 11.3.6 of OrdCDA) Can fit (1) continuation-ratio logit model using connection between multinomial and a product of binomials, (2) adjacent-categories logit model using connection with Poisson loglinear model.

Priors for cumulative logit models should recognize constraint $\alpha_1 < \alpha_2 < \cdots < \alpha_{c-1}$. Sec. 11.3.4 of OrdCDA shows ex., also summarizing by posterior $P(\beta > 0)$, $P(\beta < 0)$. Table A.8 (p. 353) shows PROC MCMC for Bayesian analysis.

Example: Tonsil Size and Streptococcus

                            Tonsil Size
Carrier    Not enlarged    Enlarged     Greatly Enlarged
Yes        19 (26%)        29 (40%)     24 (33%)
No         497 (37%)       560 (42%)    269 (20%)

Continuation-ratio logit model

$\log \dfrac{\pi_1}{\pi_2 + \pi_3} = \alpha_1 + \beta x, \quad \log \dfrac{\pi_2}{\pi_3} = \alpha_2 + \beta x$

where x = whether carrier of Streptococcus pyogenes.

$\beta$ is a cumulative log odds ratio for the 2 x 2 table comparing column 1 to columns 2 and 3 combined (first logit) and a local log odds ratio for the 2 x 2 table consisting of columns 2 and 3 (second logit).

Use normal priors for model parameters with means of 0.

For x, let 0.5 = yes, -0.5 = no (instead of 1 = yes and 0 = no), so the logit has the same prior variability for each logit.

-------------------------------------------------------------------------------
data tonsils; * look at data as indep. binomials;
input stratum carrier success failure;
n = success + failure; carrier2 = carrier - 0.5; * symmetrize prior;
datalines;
1 1 19 53
1 0 497 829
2 1 29 24
2 0 560 269
;
proc genmod data=tonsils; class stratum; * frequentist analysis;
model success/n = stratum carrier / dist=binomial link=logit lrci type3;
proc genmod data=tonsils; class stratum; * Bayesian analysis;
model success/n = stratum carrier2 / dist=binomial link=logit;
bayes coeffprior=normal initialmle diagnostics=mcerror nmc=2000000;
run; * noninformative, takes prior std dev = 1000 for all parameters;
proc genmod data=tonsils; class stratum; * Bayesian analysis;
model success/n = stratum carrier2 / dist=binomial link=logit;
bayes coeffprior=normal (var=1.0) initialmle diagnostics=mcerror nmc=2000000;
run; *informative, takes prior std dev = 1 for all parameters;
-------------------------------------------------------------------------------

Posterior Estimates of $\beta$. Results in ML row are ML estimate, SE, and 95% profile likelihood confidence interval.

Prior Distribution        Mean       Std Dev    95% Credible Interval
Normal (sigma = 1000)     -0.533     0.199      (-0.926, -0.146)
Normal (sigma = 1.0)      -0.518     0.194      (-0.902, -0.141)
ML                        -0.5285    0.198      (-0.922, -0.144)

Note: HPD credible interval ok for model parameters, but not sensible for odds ratios $e^{\beta}$.

Summary

Logistic regression for binary responses extends in various ways to handle ordinal responses: Use logits for cumulative probabilities, adjacent-response categories, continuation ratios. Stereotype model can treat baseline-category logits ordinally.

Other cumulative link models are available (probit, complementary-log-log), and it can be useful to handle count data with a multinomial model.

Which model to use? Apart from certain types of data in which grouped response models are invalid (e.g., cumulative logits with case-control data), we may consider (1) how we want to summarize effects (e.g., cumulative probs with cumulative logit, individual category probs with ACL) and (2) do we want a connection with an underlying latent variable model (natural with cumulative logit and other cumulative link models)?

Models extend to multivariate responses using marginal models and mixed models with random effects (Chap. 8-10).

Other methods require assigning fixed or midrank scores to response categories, such as extensions of nonparametric methods to allow for the ties that occur with ordinal data.

3 Software for Ordinal Modeling

See OrdCDA appendix for details, and also some details for SPSS (not covered here).

SAS

PROC FREQ provides large-sample and small-sample tests of independence in two-way tables, measures of association and their estimated SEs, and generalized CMH tests of conditional independence.

PROC GENMOD fits multinomial cumulative link models and Poisson loglinear models, and it can perform GEE analyses for marginal models as well as Bayesian model fitting for binomial and Poisson data.

PROC LOGISTIC fits cumulative link models.

PROC NLMIXED fits models with random effects and generalized nonlinear models (e.g., stereotype model).

PROC CATMOD can fit baseline-category logit models by ML, and hence adjacent-category logit models.

R (and S-Plus)

A detailed discussion of the use of R for models for categorical data is available on-line in the free manual prepared by Laura Thompson to accompany Agresti (2002). A link to this manual is at www.stat.ufl.edu/~aa/cda/software.html.

Specialized R functions available from various R libraries. Prof. Thomas Yee at Univ. of Auckland provides VGAM for vector generalized linear and additive models (www.stat.auckland.ac.nz/~yee/VGAM).

In VGAM, the vglm function fits wide variety of models. Possible models include the cumulative logit model (family function cumulative) with proportional odds or partial proportional odds or nonproportional odds, cumulative link models (family function cumulative) with or without common effects for each cutpoint, adjacent-categories logit models (family function acat), and continuation-ratio logit models (family functions cratio and sratio).

Many other R functions can fit cumulative logit and other cumulative link models. Thompson's manual (p. 121) describes the polr function from the MASS library. The syntax is simple, such as

library(MASS)
fit.cum <- polr(y ~ x, data=tab3.1, method="probit")
summary(fit.cum)

The package gee contains a function gee for ordinal GEE analyses. The package geepack contains a function ordgee for ordinal GEE analyses.

The package glmmAK contains a function cumlogitRE for using MCMC to fit cumulative logit models with random effects.

Can fit nonlinear models such as stereotype model using gnm add-on function to R by Firth and Turner:
www2.warwick.ac.uk/fac/sci/statistics/staff/research/turner/gnm/

R function mph.fit prepared by Joe Lang at Univ. of Iowa can fit many models for contingency tables that are difficult to fit with ML, such as mean response models, global odds ratio models, marginal models.

Stata

The ologit program (www.stata.com/help.cgi?ologit) fits cumulative logit models.

The oprobit program (www.stata.com/help.cgi?oprobit) fits cumulative probit models.

Continuation-ratio logit models can be fitted with the ocratio module (www.stata.com/search.cgi?query=ocratio) and with the seqlogit module.

The stereotype model can be fitted with the slogit program (www.stata.com/help.cgi?slogit).

The GLLAMM module (www.gllamm.org) can fit a very wide variety of models, including cumulative logit models with random effects. See www.stata.com/search.cgi?query=gllamm.

4 Other Work on Modelling Ordinal Responses

No attempt to be complete; see text notes at end of each chapter of OrdCDA for more references.

Modelling Association
Association models: Anderson and Vermunt (2000), Becker (1989), Becker and Clogg (1989), Gilula, Krieger and Ritov (1988), Goodman (1985), Haberman (1981), Kateri, Ahmad, and Papaioannou (1998), Rom and Sarkar (1992)
Square contingency tables and extensions: Agresti (1993), Agresti and Lang (1993), Becker (1990), Dale (1986), Ekholm et al. (2003), Kateri and Agresti (1997), Kateri and Papaioannou (1997), Sarkar (1989), Williamson, Kim, and Lipsitz (1995)
Correspondence analysis, correlation models: Beh (1997), Gilula (1986), Gilula and Ritov (1990), Goodman (1986), Goodman (1996), Ritov and Gilula (1993)

Modelling Agreement
Latent trait models: e.g., Uebersax and Grove (1993), Yang and Becker (1997)
Loglinear and association models: Agresti (1988), Becker (1989, 1990), Becker and Agresti (1992), Schuster and von Eye (2001), Valet, Guinot, and Mary (2007)
Random effects: Williamson and Manatunga (1997)
ROC and related methods: Toledano and Gatsonis (1996), Ishwaran and Gatsonis (2000)
Measures of agreement: Banerjee et al. (1999), Roberts and McNamee (2005)

Multivariate Models
Marginal models: Heagerty and Zeger (1996), Lipsitz, Kim, and Zhao (1994), Molenberghs and Lesaffre (1994, 1999), Lang, McDonald, and Smith (1999), Lumley (1996), Stram, Wei, and Ware (1988), Ten Have, Landis, and Hartzel (1998)
Random effects models: Ezzet and Whitehead (1991), Hartzel, Agresti, and Caffo (2001), Hedeker and Gibbons (1994), Tutz and Hennevogl (1996), Liu and Hedeker (2006), Ten Have et al. (2000), Xie, Simpson, and Carroll (2000)
Multilevel models: Fielding and Lang (2005), Gibbons and Hedeker (1997), Grilli and Rampichini (2003, 2007), Steele and Goldstein (2006), Zaslavsky and Bradlow (1999)

Nonparametric sorts of inference
Inference using rank statistics: Akritas and Brunner (1997), Bathke and Brunner (2003), Brunner and Puri (2001, 2002), Rayner and Best (2001)
Nonparametric random effects: Hartzel, Agresti, and Caffo (2001)

Bayesian Inference
Modelling an ordinal response: Lang (1999), Johnson and Albert (1999), Hoff (2009, Ch. 12)
Multivariate ordinal responses, hierarchical models: Albert and Chib (1993, 2001), Bradlow and Zaslavsky (1999), Cowles et al. (1996), Kaciroti et al. (2006), Qu and Tan (1998)
Association models: Iliopoulos, Kateri, and Ntzoufras (2007), Kateri, Nicolaou, and Ntzoufras (2005)
Case-control analyses with an ordinal response: Mukherjee et al. (2007, 2008), Mukherjee and Liu (2008)

Small-Sample Inference
Exact tests of independence and conditional independence for ordinal variables: Agresti, Mehta and Patel (1990 and linear-by-linear option in StatXact), Kim and Agresti (1997), Agresti and Coull (1998)
Improved tests from a decision-theoretic perspective: Cohen and Sackrowitz (1991), Berger and Sackrowitz (1997)
Higher-order approximations such as the saddlepoint essentially exact for single-parameter inference: Pierce and Peters (1992), Agresti, Lang, and Mehta (1993)

Goodness of Fit
Chi-squared statistics inappropriate with continuous predictors or highly sparse data
Generalization of Hosmer-Lemeshow statistic: Lipsitz, Fitzmaurice, Molenberghs (1996)
For cumulative logit models, can test proportional odds assumption: Brant (1990), Peterson and Harrell (1990)
Goodness-of-link testing: Genter and Farewell (1985)

Missing Data
Accounting for drop out: Molenberghs, Kenward, and Lesaffre (1997), Ten Have et al. (2000)
Score test of independence in two-way tables with ordered categories and extensions for stratified data: Lipsitz and Fitzmaurice (1996)
Comparison of likelihood-based and GEE methods for repeated ordinal responses: Mark and Gail (1994), Kenward, Lesaffre, and Molenberghs (1994)

Order-Restricted Inference
Estimate cell proportions (and conduct tests) assuming solely that a type of ordinal log odds ratio is uniformly nonnegative
Local odds ratios: Patefield (1982), Dykstra et al. (1995), Agresti and Coull (1998)
Cumulative odds ratios: Grove (1980), Robertson and Wright (1981), Cohen and Sackrowitz (1996), Evans et al. (1997)
Order restrictions on parameters in association models: Agresti, Chuang and Kezouh (1987), Galindo-Garre and Vermunt (2004), Iliopoulos, Kateri, and Ntzoufras (2007), Ritov and Gilula (1991, 1993)
Marginal modeling: Bartolucci et al. (2001), Bartolucci and Forcina (2002), Colombi and Forcina (2001)

Detailed references are in Bibliography of OrdCDA.

Other areas not discussed here include other model diagnostics, smoothing ordinal data (e.g., generalized additive models), paired preference modeling, survival modeling. Can find some info by looking up the topic in Subject Index of OrdCDA.

Partial Bibliography: Analysis of Ordinal Categorical Data

Some Books

Agresti, A. 2010. Analysis of Ordinal Categorical Data, 2nd ed. Wiley.
Clogg, C. C., and E. S. Shihadeh. 1994. Statistical Models for Ordinal Variables. Sage.
Fahrmeir, L., and G. Tutz. 2001. Multivariate Statist. Modelling based on Generalized Linear Models, 2nd ed. Springer-Verlag.
Johnson, V. E., and J. H. Albert. 1999. Ordinal Data Modeling. Springer.
McCullagh, P., and J. A. Nelder. 1983, 2nd edn. 1989. Generalized Linear Models. London: Chapman and Hall.

Some Survey Articles

Agresti, A. 1999. Modelling ordered categorical data: Recent advances and future challenges. Statist. Medic. 18: 2191-2207.
Agresti, A., and R. Natarajan. 2001. Modeling clustered ordered categorical data: A survey. Intern. Statist. Rev. 69: 345-371.
Anderson, J. A. 1984. Regression and ordered categorical variables. J. Roy. Statist. Soc. B 46: 1-30.
Chuang-Stein, C., and A. Agresti. 1997. A review of tests for detecting a monotone dose-response relationship with ordinal responses data. Statist. Medic. 16: 2599-2618.
Goodman, L. A. 1979. Simple models for the analysis of association in cross-classifications having ordered categories. J. Amer. Statist. Assoc. 74: 537-552.
Goodman, L. A. 1985. The analysis of cross-classified data having ordered and/or unordered categories: Association models, correlation models, and asymmetry models for contingency tables with or without missing entries. Ann. Statist. 13: 10-69.
Landis, J. R., E. R. Heyman, and G. G. Koch. 1978. Average partial association in three-way contingency tables: A review and discussion of alternative tests. Internat. Statist. Rev. 46: 237-254.
Liu, I., and A. Agresti. 2005. The analysis of ordered categorical data: An overview and a survey of recent developments (with discussion). Test 14: 1-73.
McCullagh, P. 1980. Regression models for ordinal data. J. Roy. Statist. Soc. B 42: 109-142.
