Three Approaches To Ordinal Classification (Slides 2009)
2 Boosting-like Approach
4 Conclusions
Ordinal classification consists in predicting a label taken
from a finite and ordered set for an object described
by some attributes.
Notation:
• K – number of classes
• y – actual label
• ŷ – predicted label
• x – attributes
• f (x) – prediction (ranking or utility) function
• L(·) – loss function
• ⟦·⟧ – Boolean test (1 if the condition is true, 0 otherwise)
Ordinal Classification – Probability Estimation:
• Prediction risk is defined by a loss matrix:
$$
L(y, \hat{y}) =
\begin{pmatrix}
0 & 1 & 2 & 3 \\
1 & 0 & 1 & 2 \\
2 & 1 & 0 & 1 \\
3 & 2 & 1 & 0
\end{pmatrix}
$$
Ordinal Classification – Probability Estimation:
• Bayes decision for the loss matrix L(y, ŷ) is given by:
$$
\hat{y}^{*} = \arg\min_{\hat{y}} \sum_{k=1}^{K} \Pr(y = k \mid x)\, L(k, \hat{y}).
$$
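The Bayes decision above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the class probabilities are hypothetical, and the loss matrix is assumed to be the absolute-error matrix shown earlier, with labels numbered 1 to K.

```python
def absolute_error_matrix(K):
    """K x K loss matrix with entry |y - y_hat| (rows: actual, columns: predicted)."""
    return [[abs(y - y_hat) for y_hat in range(K)] for y in range(K)]


def bayes_decision(probs, loss):
    """Return (label, risks): the 1-based label minimizing
    sum_k Pr(y = k | x) * L(k, y_hat), and the expected loss of every label."""
    K = len(probs)
    risks = [
        sum(probs[k] * loss[k][y_hat] for k in range(K))
        for y_hat in range(K)
    ]
    best = min(range(K), key=lambda y_hat: risks[y_hat])
    return best + 1, risks


loss = absolute_error_matrix(4)
probs = [0.1, 0.2, 0.3, 0.4]  # hypothetical estimates of Pr(y = k | x), k = 1..4
label, risks = bayes_decision(probs, loss)
print(label, risks)
```

For these probabilities the most probable class is 4, but the decision minimizing expected absolute error is 3, the median of the distribution. The choice of loss matrix therefore changes the prediction even when the probability estimates stay the same.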
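For a pair of objects with labels y◦ and y•, the pairwise order label is
$y_{\circ\bullet} = \operatorname{sgn}(y_{\circ} - y_{\bullet})$,
and f (x) is a ranking (or utility) function.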
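As an illustration of how the pairwise label is used, the sketch below counts misordered pairs. It assumes the common 0-1 pairwise rank loss, in which a pair with different labels is penalized when the sign of the score difference disagrees with the pairwise label (ties in score also count as errors); this exact form is an assumption and is not shown in the text. The function names and the data are hypothetical.

```python
def sgn(v):
    """Sign of v: -1, 0, or 1."""
    return (v > 0) - (v < 0)


def rank_loss(labels, scores):
    """Number of object pairs with different labels whose scores are
    ordered inconsistently with the labels (score ties count as errors)."""
    errors = 0
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            y_pair = sgn(labels[i] - labels[j])
            if y_pair != 0 and y_pair * (scores[i] - scores[j]) <= 0:
                errors += 1
    return errors


labels = [1, 2, 3, 2]            # hypothetical ordinal labels
scores = [-1.0, 0.5, 0.2, 0.9]   # hypothetical values of f(x)
print(rank_loss(labels, scores))
```

The double loop compares every object with every other one, which is the quadratic cost that threshold-based approaches avoid.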
$$
L(y, f(x), \theta) = \sum_{k=1}^{K-1} \llbracket y_k (f(x) - \theta_k) \le 0 \rrbracket.
$$
[Figure: plot over f (x), horizontal axis from −5 to 5]
Ordinal Classification – Threshold Loss:
• This approach shares characteristics of the previous two.
• Each object is compared to the thresholds instead of to all other
training objects – lower complexity, although linear algorithms
exist for rank loss minimization in ordinal classification
settings.
• Joint solution for all K − 1 binary problems – no need for
isotonization of conditional probabilities, but the result is a
single value.
• Weighted threshold loss can approximate any loss matrix.
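The threshold loss can be illustrated with a short Python sketch. It assumes the usual binary encoding y_k = +1 if y > k and −1 otherwise, counts a threshold as violated when y_k (f(x) − θ_k) ≤ 0, and predicts the index of the interval between thresholds into which f(x) falls. These conventions, the function names, and the threshold values are assumptions made for illustration.

```python
def threshold_loss(y, fx, thetas):
    """Number of thresholds theta_1..theta_{K-1} on the wrong side of f(x),
    using y_k = +1 if y > k else -1 (labels are 1..K)."""
    loss = 0
    for k, theta in enumerate(thetas, start=1):
        y_k = 1 if y > k else -1
        if y_k * (fx - theta) <= 0:
            loss += 1
    return loss


def predict_label(fx, thetas):
    """Label = 1 + number of thresholds strictly below f(x)."""
    return 1 + sum(fx > theta for theta in thetas)


thetas = [-2.0, 0.0, 2.0]  # hypothetical ordered thresholds for K = 4
fx = 0.5                   # hypothetical value of f(x)
for y in (1, 2, 3, 4):
    print(y, predict_label(fx, thetas), threshold_loss(y, fx, thetas))
```

With ordered thresholds and f(x) not equal to any threshold, the unweighted threshold loss equals the absolute error |y − ŷ| of the interval prediction, which is why weighting the individual thresholds gives access to other loss matrices.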
Boosting-like Algorithms for Three Approaches:
• Prediction function is an ensemble of decision rules:
$$
f(x) = \alpha_0 + \sum_{m=1}^{M} r_m(x).
$$
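The additive form of the ensemble can be sketched as follows. The representation of a decision rule used here (a conjunction of interval conditions on attributes that contributes a fixed response when the object is covered and 0 otherwise) is an assumption for illustration; the slides only state that f(x) is a sum of rule outputs. All names and values are hypothetical.

```python
def make_rule(conditions, response):
    """Build a decision rule r_m(x).

    conditions: list of (attribute_index, lower, upper) meaning
                lower <= x[attribute_index] < upper.
    response:   value returned when x satisfies every condition.
    """
    def rule(x):
        covered = all(lo <= x[i] < hi for i, lo, hi in conditions)
        return response if covered else 0.0
    return rule


def ensemble_predict(x, alpha0, rules):
    """f(x) = alpha0 + sum of r_m(x) over all rules in the ensemble."""
    return alpha0 + sum(rule(x) for rule in rules)


rules = [
    make_rule([(0, 0.0, 1.0)], 0.5),              # fires when 0 <= x[0] < 1
    make_rule([(1, 2.0, float("inf"))], -1.25),   # fires when x[1] >= 2
]
print(ensemble_predict([0.5, 3.0], 0.25, rules))  # both rules fire
print(ensemble_predict([2.0, 1.0], 0.25, rules))  # no rule fires
```

In a boosting-like procedure the rules would be added one at a time, each chosen to reduce the selected loss; that fitting step is omitted here.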
[Figure: critical difference diagram (CD = 1.076) on a rank scale from 4 to 1, comparing ENDER-Abs, RankRules, ORDER, and Ordinal ENDER]
Experimental Results:
• There is almost no quantitative difference in performance
or running time; RankRules is slightly slower.
• Qualitative differences: Ordinal ENDER is related to
probability estimation, whereas RankRules is related to AUC maximization.
• Ensembles of decision rules are competitive with RankBoost
AE, ORBoost-All, and SVM-IMC.
Ordinal Matrix Factorization:
• Given a sparse matrix Y of observed values, build a model
based on matrix factorization:
$$
Y \simeq \hat{Y} = U V^{T}, \qquad
\hat{y}_{ij} = \sum_{m=1}^{M} u_{im} v_{jm}.
$$
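The prediction step of the factorization model can be sketched in plain Python. The factor values below are hypothetical, and the sketch only evaluates the inner product for given U and V; how the factors are fitted to the observed ordinal entries is not described in the slides and is not attempted here.

```python
def predict_entry(U, V, i, j):
    """y_hat[i][j] = sum over m of U[i][m] * V[j][m]."""
    return sum(u * v for u, v in zip(U[i], V[j]))


def predict_matrix(U, V):
    """Full prediction matrix Y_hat = U V^T as a list of rows."""
    return [[predict_entry(U, V, i, j) for j in range(len(V))]
            for i in range(len(U))]


# Hypothetical factors with M = 2 latent dimensions:
U = [[1, 2], [0, 1]]          # one row per row of Y
V = [[3, 1], [1, 1], [0, 2]]  # one row per column of Y
Y_hat = predict_matrix(U, V)
print(Y_hat)
```

Because Y is sparse, only the observed entries would enter a training loss; `predict_entry` can be called on exactly those (i, j) pairs without forming the full matrix.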
Conclusions:
• Nature of ordinal classification?
• Three approaches to ordinal classification.
• Boosting-like algorithms: the differences between these approaches
are qualitative rather than quantitative.
• Ordinal matrix factorization: in progress...