Lando Rating Transitions 2002

Analyzing Rating Transitions and Rating Drift with Continuous Observations
Article in Journal of Banking & Finance, March 2002
DOI: 10.1016/S0378-4266(01)00228-X. Source: RePEc

Journal of Banking & Finance 26 (2002) 423–444
www.elsevier.com/locate/econbase

Analyzing rating transitions and rating drift with continuous observations

David Lando a,*, Torben M. Skødeberg b

a Department of Statistics and Operations Research, University of Copenhagen, Universitetsparken 5,
  DK-2100 Copenhagen Ø, Denmark
b Nordea Investment Management, Strandgade 3, DK-1401 Copenhagen K, Denmark

Abstract
We consider the estimation of credit rating transitions based on continuous-time
observations. Through simple examples and using a large data set from Standard and
Poor's, we illustrate the difference between estimators based on discrete-time cohort
methods and estimators based on continuous observations. We apply semi-parametric
regression techniques to test for two types of non-Markov effects in rating transitions:
duration dependence and dependence on previous rating. We find significant non-Markov
effects, especially for the downgrade movements. © 2002 Elsevier Science B.V.
All rights reserved.

JEL classification: C41; G33

Keywords: Rating transitions; Rating drift; Markov chains; Estimation
1. Introduction
Transition matrices are at the center of modern credit risk management. The
reports on rating migrations published by Standard and Poor's and Moody's
are studied by credit risk managers everywhere and several of the most
prominent risk management tools, such as J.P. Morgan's Credit Metrics and
McKinsey's Credit Portfolio View, are built around estimates of rating
migration probabilities.

* Corresponding author. Tel.: +45-3532-0683; fax: +45-3532-0678.
E-mail addresses: [email protected] (D. Lando), [email protected] (T.M. Skødeberg).
0378-4266/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved.
PII: S0378-4266(01)00228-X

In essence, the estimates published by these agencies and in the published
academic literature use a discrete-time setting and rely on a cohort method
which estimates the transition rates as follows: Given that there are $N_i$ firms in
a given rating category $i$ at the beginning of the year and that out of this
population $N_{ij}$ have migrated to the category $j$, the one-year transition rate
is estimated as

\[ \hat{p}_{ij} = \frac{N_{ij}}{N_i}, \qquad j \neq i. \tag{1} \]

An important consequence of this is that if a transition from $i$ to $j$ does not
occur in a given period, the estimate of the corresponding rate is 0.
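
As a minimal sketch of Eq. (1), the cohort estimate can be computed from each firm's rating at the start and at the end of the year. The function name `cohort_matrix`, the state coding and the toy ratings below are illustrative assumptions, not data from the study.

```python
import numpy as np

def cohort_matrix(start_states, end_states, n_states):
    """Cohort estimate p_ij = N_ij / N_i from start- and end-of-year ratings."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(start_states, end_states):
        counts[i, j] += 1
    n_i = counts.sum(axis=1, keepdims=True)
    # Rows with no firms at the start of the year are left as zeros.
    return np.divide(counts, n_i, out=np.zeros_like(counts), where=n_i > 0)

# Hypothetical data: states 0 = A, 1 = B, 2 = D (default).
start = [0] * 10 + [1] * 10
end = [0] * 9 + [1] + [0] + [1] * 8 + [2]
P_cohort = cohort_matrix(start, end, 3)
print(P_cohort)
```

In this toy sample no firm moves from A to D within the year, so the cohort estimate for that cell is exactly zero, which is the consequence noted above.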
The rating agencies of course have access to continuous-time data on rating
transitions and know the exact dates within a year on which a company's rating
changes. Similarly, a bank using an internal rating system will
have access to a complete history of rating transitions. We argue in the
following that it is crucial to base the estimation of transition rates on these
continuously observed histories to get efficient estimates of transition rates.
This point is particularly important when estimating rare events such as the
transition from, say, AAA in Standard and Poor's rating to default. Very
briefly stated, the maximum-likelihood estimator that one obtains for the
one-year transition probability from AAA to default will be (and should be)
non-zero even if there have been no direct or indirect defaults (i.e. defaults through a
sequence of downgrades) in the period of observation. For example, if in a
one-year period there are no transitions from AAA to default, but there are
transitions from AAA to AA and from AA to default (but by other firms), then
the estimator for transitions from AAA to default should be non-zero, since
evidently there is a chance of defaulting within a year after successive
downgrades, even if it did not happen for one single firm in the sample. The
continuous-time estimator captures this whereas the discrete-time method does
not.
Apart from getting a better grip on the rare events, the continuous-time
methodology based on modern survival analytic techniques (see Skødeberg,
1998, and similar observations in the parallel work by Kavvathas, 2000) has a
number of additional advantages:
1. The framework permits a rigorous formulation and testing of assumptions
on rating drift and other non-Markov type behavior (such as seasoning
effects) investigated in, for example, Altman and Kao (1992a,b), Lucas and
Lonski (1992), and Carty and Fons (1993).
2. The dependence on external covariates can be formulated and tested, and
changes in regimes either due to business cycles, as in for example Nickell
et al. (2000), or changes in rating policies, as indicated by Blume et al. (1998),
can be quantified.
3. The continuous-time formulation hooks up nicely with rating-based term
structure modeling in which one tries to estimate and calibrate yield curves for
different rating classes, see for example Jarrow et al. (1997), Lando (1998) and
Das and Tufano (1996).
4. Censoring is handled easily within the continuous-time framework.
According to Carty (1997) only a few (roughly 13%) of the migrations to the
not-rated category are related to changes in credit quality, and this observation is
used there and in Nickell et al. (2000) as an argument in support of excluding
the issuers who experience a transition to the not-rated category. Using the
survival theoretic setting of this paper, the conclusion is the opposite: The very
fact that transitions to the not-rated category are caused by events unrelated to
rating justifies the inclusion of these events as censored observations, thus
permitting a full use of the sample information. The time before a firm migrates
to the not-rated category contains valuable information on the changes that did not
occur in the time before this event. Note that the framework presented in Shumway
(2001) would also allow for censoring to be treated rigorously, but his
framework is still discrete-time.
5. When estimating homogeneous chains in continuous time by estimating
the generator of the continuous-time Markov chain, we avoid the embedding
problem for Markov chains (for more on this in a rating modeling context, see
Israel et al. (1999)). This problem arises because not every discrete-time
Markov chain can be realized as a discretized continuous-time chain. Hence it
may be impossible from a one-year transition matrix (for example if it contains
zeros in some of the non-default rows) to construct a continuous-time chain
which has the one-year transition matrix as its marginal. However, the
continuous-time chain is very useful in that it allows cash flows occurring at all
dates to be weighted by the exact survival probability corresponding to the
chosen time horizon.
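
The embedding problem in item 5 can be illustrated with a simple necessary condition: in a homogeneous continuous-time chain, if state j can be reached from state i at all, the probability of being in j is strictly positive at every positive horizon. A one-year matrix with a zero entry for a pair (i, j) where j is reachable from i through intermediate states therefore cannot be the marginal of such a chain. The sketch below checks only this one condition; the function name and the three-state matrix are illustrative assumptions.

```python
import numpy as np

def violates_embedding_condition(P):
    """True if some p_ij == 0 although j is reachable from i in several steps."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    step = (P > 0).astype(int)          # one-step reachability
    reach = step.copy()
    for _ in range(n - 1):
        # Extend reachability by one more step of the chain.
        reach = ((reach + reach @ step) > 0).astype(int)
    return bool(np.any((reach > 0) & (P == 0)))

# Hypothetical one-year matrix with states A, B, D: no direct A -> D
# transition, although D is reachable from A through B.
P_with_zero = [[0.90, 0.10, 0.00],
               [0.10, 0.80, 0.10],
               [0.00, 0.00, 1.00]]
print(violates_embedding_condition(P_with_zero))
```

Passing this check does not guarantee that a matrix is embeddable; it only rules out the most visible obstruction, zeros in non-default rows.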
One of the most important goals behind the current effort to revise the Basel
Capital Accord is to replace the existing risk weights with a system which more
clearly recognizes the differences in risk of various instruments. It is likely that
rating systems will play a larger role in quantifying these differences. The
statistical framework presented in this paper is a natural framework for
quantitatively assessing internal and external rating systems used by financial
institutions.
The outline of the paper is as follows:
In Section 2, we briefly describe the data used in our study. In Section 3, we
present the basic idea of continuous-time estimation in the framework of a
homogeneous Markov chain. A simple example illustrates the importance of
using continuous-time data. In Section 4 we describe how a time-inhomogeneous
transition probability matrix may be estimated. Section 5 outlines the
statistical framework and formulates a rigorous notion of rating drift. In
Section 6, the test results are presented, and Section 7 concludes. Appendix A
summarizes the technical material needed for the paper.
2. The data
The data cover 17 years of rating history in the S&P system, starting on 1
January 1981 and ending on 31 December 1997. There are a total of 6659 firms
which are rated at some point or another. The ratings are listed in a
classification with a total of 22 classes. The top rating is AAA. Then
follows AA+, and from then on each of the categories AA, A, BBB, BB, B and CCC
contains three ratings obtained by possibly adding + or − to the letter grade.
Finally, there are some instances of CC and C ratings and a default category
denoted D. For some results, we have chosen to look at groupings into eight
categories which contain the seven letter categories (without plus or minus, and
with CC and C grouped into CCC) and the default category. There are a total of
7282 transitions recorded within the system of eight categories, including
transitions to NR. For the system consisting of 18 classes (in which all ratings
including the letter C are grouped into one CCC category) there are a total of
11,606 transitions, again counting the transitions to the NR category.
For each firm the exact transition dates between ratings (including default)
are recorded and so are dates where the rating is discontinued. In these
cases the firm receives the "not rated" (NR) assignment. There are 114 cases of
transitions back from the NR category.

5405 out of the total population of 6659 firms are US companies. We do not
have the names of these firms but we do have access to the distribution of the
firms across industries. We are mainly looking at the aggregated data set to clearly
illustrate our methods and to get enough observed transitions between
categories to make inference possible. We do, however, briefly consider the results
for the financial firms separately at the end.
3. The time-homogeneous case
Appendix A contains an overview of the theory of Markov chain
modeling that we need for the entire paper. For this section we only need to
note the following facts: Throughout, we consider a $K$-state Markov chain
where we think of state 1 as the highest rating category and state $K$ as the
default state. We collect the transition probabilities of the Markov chain for a
given time horizon in a $K \times K$ matrix $P(t)$ whose $ij$th element is the probability
of migrating from state $i$ to state $j$ in a time period of length $t$. Just as a discrete-time
Markov process on the rating classes can be obtained by matrix multiplication
from the one-period transition matrix, there exists a simple representation of
the matrices $P(t)$ for arbitrary time horizons $t$ for a continuous-time chain on
the same state space. The generator matrix $\Lambda$ is a $K \times K$ matrix for which

\[ P(t) = \exp(\Lambda t), \qquad t \geq 0. \tag{2} \]

Here, $\Lambda t$ is the matrix $\Lambda$ multiplied by $t$ in every entry and the exponential
function is a matrix exponential, as defined in Appendix A. The critical thing to
note is that the transition probabilities for every time horizon are functions of
the generator. Hence, one can obtain maximum-likelihood estimators of the
transition probability matrices by first obtaining the maximum-likelihood
estimate of the generator and then applying the matrix exponential function to
this estimate, scaled by the time horizon. The entries of the generator $\Lambda$ satisfy

\[ \lambda_{ij} \geq 0 \quad \text{for } i \neq j, \qquad \lambda_{ii} = -\sum_{j \neq i} \lambda_{ij}. \]

These entries describe the probabilistic behavior of the holding time in state $i$ as
exponentially distributed with parameter $\lambda_i$, where $\lambda_i = -\lambda_{ii}$, and the
probability of jumping from state $i$ to $j$ given that a jump occurs is given by
$\lambda_{ij}/\lambda_i$. To estimate the elements of the generator under an assumption of
time-homogeneity we use the maximum likelihood estimator (see for example Küchler and
Sørensen, 1997)

\[ \hat{\lambda}_{ij} = \frac{N_{ij}(T)}{\int_0^T Y_i(s)\,ds}, \tag{3} \]

where $Y_i(s)$ is the number of firms in rating class $i$ at time $s$ and $N_{ij}(T)$ is the
total number of transitions over the period from $i$ to $j$, where $i \neq j$. The
intuition is straightforward: the numerator counts the number of observed
transitions from $i$ to $j$ over the entire period of observation. The denominator
is the number of firm-years spent in state $i$. Any period a firm spends in a
state will be picked up through the denominator. In this sense all information is
being used. We now illustrate the estimator both through a simple example and
on our data set. The simple example will give the intuition of the procedure.
The application to our data set will test the practical significance of using the
continuous-time technique.
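
As a hedged sketch of Eq. (3), the generator estimate can be computed from a list of rating spells, each recording the state, the time spent in it, and the state entered next (or `None` when the spell is censored, for example by a rating withdrawal or the end of the sample). The spell format, the function name `mle_generator` and the data are illustrative assumptions.

```python
import numpy as np

def mle_generator(spells, n_states):
    """spells: iterable of (state, years_in_state, next_state or None if censored)."""
    counts = np.zeros((n_states, n_states))
    exposure = np.zeros(n_states)
    for state, years, next_state in spells:
        exposure[state] += years            # firm-years count whether or not a jump follows
        if next_state is not None:
            counts[state, next_state] += 1  # observed transition N_ij
    generator = np.zeros((n_states, n_states))
    for i in range(n_states):
        if exposure[i] > 0:
            generator[i] = counts[i] / exposure[i]
        generator[i, i] = 0.0
        generator[i, i] = -generator[i].sum()  # rows of a generator sum to zero
    return generator

# Hypothetical spells; states 0 = A, 1 = B, 2 = D (default, absorbing).
spells = [
    (0, 2.0, 1),     # two years in A, then downgraded to B
    (1, 1.0, 2),     # one year in B, then default
    (0, 3.0, None),  # three years in A, then censored
    (1, 0.5, 0),     # half a year in B, then upgraded to A
    (0, 1.0, None),  # one year in A, then censored
]
generator_hat = mle_generator(spells, 3)
print(generator_hat)
```

Censored spells add to the denominator without adding a transition, which is how the time spent in a state without a rating change enters the estimate.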
To illustrate the estimator, consider a rating system consisting of two
non-default rating categories A and B and a default category D. Assume that we
observe over one year the history of 20 firms, of which 10 start in category A
and 10 in category B. Assume that over the year of observation, one A-rated
firm changes its rating to category B after one month and stays there the rest of
the year. Assume that over the same period, one B-rated firm is upgraded after
two months and remains in A for the rest of the period, and a firm which started
in B defaults after six months and stays there for the remaining part of the
period. In this case we have for one of the entries

\[ \hat{\lambda}_{AB} = \frac{N_{AB}(1)}{\int_0^1 Y_A(s)\,ds} = \frac{1}{9 + 1/12 + 10/12} = 0.10084. \]

Proceeding similarly with the other entries (and noting that the state D is
assumed to be absorbing and the diagonal elements just make sure that rows sum
to zero) we obtain the estimated generator

\[ \hat{\Lambda} = \begin{pmatrix} -0.10084 & 0.10084 & 0 \\ 0.10909 & -0.21818 & 0.10909 \\ 0 & 0 & 0 \end{pmatrix}. \]

From this, we obtain the maximum likelihood estimator of the one-year
transition matrix as

\[ \widehat{P(1)} = \begin{pmatrix} 0.90887 & 0.08618 & 0.00495 \\ 0.09323 & 0.80858 & 0.09819 \\ 0 & 0 & 1 \end{pmatrix}. \]

Had we instead used a cohort method the result would have been

\[ \widehat{P(1)} = \begin{pmatrix} 0.90 & 0.10 & 0 \\ 0.10 & 0.80 & 0.10 \\ 0 & 0 & 1 \end{pmatrix}. \]

As we see, the traditional method does not capture default risk in the A
category simply because there is no firm defaulting directly from A. Note that the
continuous-time method produces a strictly positive estimator for default from
A over one year despite the fact that no firm in the sample defaults in one year
from A. This is appropriate because the probability of migrating to B and the
probability of default from B are clearly both positive. As a side remark, note
that in a classical cohort analysis the firm upgraded from B does not provide
more information than the upgrade. Here, it matters exactly when the upgrade
took place, and the ten months spent in A with no further change contribute
information to the estimate of the transition intensity from rating class A.
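
The step from the estimated generator to the one-year matrix can be sketched numerically. The code below takes the generator exactly as printed above and applies a plain Taylor-series matrix exponential with scaling and squaring; this helper is a simple stand-in written for illustration (in practice a library routine such as `scipy.linalg.expm` would normally be used), and all names are assumptions.

```python
import numpy as np

def expm_taylor(A, squarings=10, terms=20):
    """Matrix exponential via a truncated Taylor series with scaling and squaring."""
    A = np.asarray(A, dtype=float) / 2.0 ** squarings
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A / k          # A^k / k!
        result = result + term
    for _ in range(squarings):
        result = result @ result     # undo the scaling: exp(A) = exp(A / 2^s)^(2^s)
    return result

# Exposure in A: nine firms for a full year, one month for the downgraded
# firm and ten months for the firm upgraded from B.
lam_AB = 1.0 / (9 + 1 / 12 + 10 / 12)

# Generator as printed in the text (rows and columns ordered A, B, D).
generator = np.array([[-0.10084, 0.10084, 0.0],
                      [0.10909, -0.21818, 0.10909],
                      [0.0, 0.0, 0.0]])
P_one_year = expm_taylor(generator)
print(round(lam_AB, 5))
print(np.round(P_one_year, 5))
```

The A-to-D entry of the result is strictly positive even though the generator has no direct A-to-D intensity, because the exponential accumulates paths through B.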
We now consider how this difference materializes in our Standard and
Poor's data set. We consider a 10-year sub-period from 1988 to 1998 to have a
reasonably large starting pool for the cohort method. If we use the cohort
method and estimate the one-year transition rates for each of the ten years and
then take an average of the estimated matrices we obtain the result presented in
Table 1. We have chosen to include transitions to and from the NR category.
As we will see later, we can easily modify the estimates to exclude that category
by using estimation under censoring. In Table 2 we report the estimated
generator using the estimator given in Eq. (3), and by taking the matrix exponential
of that estimator we obtain the estimated transition probabilities given in Table
3. As we can see from this table, the most important difference is the fact that
to four decimal places there is a measurable default probability even for the
highest rating categories. But note also the sizeable difference in the one-year
default probability of a CCC-rated firm when using the continuous-time
estimation method. One reason for this difference is the following: When using a
cohort method based on yearly samples, we will only record a migration from
CCC to default when a firm starts out in CCC at the beginning of the year in
which the default occurs. Many firms in the sample are downgraded to CCC
during the year and only stay there a short time before default. These will not
be recorded as defaults from CCC in the cohort method, but they will be
recorded in the method based on the continuous sample. This explains the
increase in the CCC default frequency. It should also be noted that in the
sample, almost all ratings observed to be in the C and CC category ended up in
default or with a rating withdrawal. The fact that these are grouped into the
CCC category may exaggerate the default frequency of this category.

Table 1
The average of 10 one-year transition matrices, each estimated using a cohort method in
the period 1988–1998
      NR      AAA     AA      A       BBB     BB      B       CCC     D
NR    0.9939  0.0001  0.0000  0.0001  0.0006  0.0006  0.0004  0.0000  0.0044
AAA   0.0266  0.9040  0.0607  0.0070  0.0000  0.0016  0.0000  0.0000  0.0000
AA    0.0302  0.0054  0.8786  0.0791  0.0039  0.0006  0.0008  0.0000  0.0000
A     0.0401  0.0004  0.0157  0.8903  0.0445  0.0068  0.0017  0.0001  0.0003
BBB   0.0583  0.0001  0.0028  0.0519  0.8375  0.0388  0.0068  0.0018  0.0018
BB    0.0906  0.0000  0.0003  0.0051  0.0795  0.7452  0.0587  0.0110  0.0095
B     0.1268  0.0000  0.0008  0.0015  0.0050  0.0730  0.7081  0.0326  0.0500
CCC   0.1658  0.0020  0.0000  0.0061  0.0089  0.0279  0.1003  0.4842  0.2048
D     0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  1.0000
Note: Transitions to and from the NR category are also shown.

Table 2
The maximum-likelihood estimator of the generator based upon continuous-time observation over
the 10-year period 1988–1998
       NR       AAA      AA       A        BBB      BB       B        CCC      D
NR    -0.0066   0.0000   0.0001   0.0003   0.0006   0.0010   0.0003   0.0000   0.0043
AAA    0.0248  -0.1062   0.0720   0.0071   0.0000   0.0024   0.0000   0.0000   0.0000
AA     0.0322   0.0068  -0.1301   0.0858   0.0044   0.0004   0.0004   0.0000   0.0000
A      0.0431   0.0004   0.0144  -0.1136   0.0499   0.0045   0.0011   0.0002   0.0000
BBB    0.0551   0.0003   0.0023   0.0548  -0.1691   0.0496   0.0061   0.0006   0.0003
BB     0.1017   0.0000   0.0012   0.0078   0.1101  -0.3213   0.0866   0.0108   0.0030
B      0.1713   0.0000   0.0027   0.0020   0.0061   0.0904  -0.4052   0.1038   0.0290
CCC    0.2099   0.0042   0.0000   0.0084   0.0000   0.0336   0.1301  -0.9697   0.5835
D      0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000

Table 3
The one-year transition matrix estimated from continuous-time data over the period 1988–1998 as
the matrix exponential of the maximum likelihood estimator of the generator
      NR      AAA     AA      A       BBB     BB      B       CCC     D
NR    0.9935  0.0000  0.0001  0.0003  0.0006  0.0009  0.0003  0.0000  0.0043
AAA   0.0248  0.8995  0.0640  0.0091  0.0005  0.0020  0.0001  0.0000  0.0001
AA    0.0321  0.0061  0.8788  0.0761  0.0057  0.0006  0.0004  0.0000  0.0001
A     0.0424  0.0004  0.0129  0.8944  0.0436  0.0047  0.0011  0.0002  0.0002
BBB   0.0545  0.0003  0.0023  0.0479  0.8479  0.0393  0.0063  0.0008  0.0008
BB    0.0965  0.0000  0.0012  0.0090  0.0869  0.7303  0.0612  0.0084  0.0065
B     0.1518  0.0001  0.0022  0.0024  0.0084  0.0643  0.6734  0.0534  0.0440
CCC   0.1429  0.0025  0.0003  0.0053  0.0017  0.0215  0.0674  0.3824  0.3760
D     0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  1.0000

The results above are mainly for illustration purposes. As we will see below
there are many reasons to believe that the data are not from a homogeneous
Markov chain and we need to modify our estimation methods to take into
account both non-homogeneities and the influence of exogenous variables
on the rating migration probabilities.
4. Estimating non-homogeneous chains
We have just seen how to use the maximum-likelihood estimator to estimate
the generator and transition matrices using continuous data. Even if the method
assumes time homogeneity, something which is hard to assume over the long
run, it is a useful tool for estimating a one-year transition matrix. However, as we
will see in this section, another, non-parametric method exists. This method is a
useful tool for replacing the cohort methods over longer periods of time.
Consider a non-homogeneous, continuous-time Markov process $\eta$ with finite state
space $S = \{1, 2, \ldots, K\}$ whose transition probability matrix for the period from
time $s$ to time $t$ is given by $P(s, t)$. Hence, the $ij$th element of this matrix describes
the probability that the chain starting in state $i$ at date $s$ is in state $j$ at date $t$.

In this section we will explain the so-called product-limit estimator, or
Aalen–Johansen estimator, for the transition probabilities $P(s, t)$ and the
relation to our example. Appendix A elaborates and provides further references.
Given that our sample has $m$ transitions over the period from $s$ to $t$, we can
estimate $P(s, t)$ consistently as

\[ \hat{P}(s, t) = \prod_{i=1}^{m} \left( I + \Delta\hat{A}(T_i) \right). \tag{4} \]

Here, $T_i$ is a jump time in the interval $(s, t]$ and

\[
\Delta\hat{A}(T_i) =
\begin{pmatrix}
-\frac{\Delta N_1(T_i)}{Y_1(T_i)} & \frac{\Delta N_{12}(T_i)}{Y_1(T_i)} & \frac{\Delta N_{13}(T_i)}{Y_1(T_i)} & \cdots & \frac{\Delta N_{1p}(T_i)}{Y_1(T_i)} \\
\frac{\Delta N_{21}(T_i)}{Y_2(T_i)} & -\frac{\Delta N_2(T_i)}{Y_2(T_i)} & \frac{\Delta N_{23}(T_i)}{Y_2(T_i)} & \cdots & \frac{\Delta N_{2p}(T_i)}{Y_2(T_i)} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\frac{\Delta N_{p-1,1}(T_i)}{Y_{p-1}(T_i)} & \frac{\Delta N_{p-1,2}(T_i)}{Y_{p-1}(T_i)} & \cdots & -\frac{\Delta N_{p-1}(T_i)}{Y_{p-1}(T_i)} & \frac{\Delta N_{p-1,p}(T_i)}{Y_{p-1}(T_i)} \\
0 & 0 & \cdots & \cdots & 0
\end{pmatrix}.
\]

Here, $\Delta N_{hj}(T_i)$ denotes the number of transitions observed from state $h$ to $j$ at
date $T_i$ [1], with $p$ the number of states (written $K$ elsewhere in the text).
$\Delta N_k(T_i)$ counts the total number of transitions away from state $k$ at date $T_i$
and $Y_k(T_i)$ is the number of firms in state $k$ right before date $T_i$. Hence the
diagonal element in row $k$ is, at a given date $T_i$, minus the fraction of the exposed
firms $Y_k(T_i)$ which leave the state at date $T_i$. The off-diagonal elements count
the specific types of transitions away from the state divided by the number of
exposed firms. Note that the variable $Y$ automatically incorporates censoring
in that nothing happens to the estimator on the day of a censoring event (if that
is the only event). The number of exposed firms changes, however, and this will
affect the estimate on the next date of an observed transition. Note that the
bottom row is zero in $\Delta\hat{A}$ because we do not model firms leaving the default
state. Note also that the rows of the matrix $I + \Delta\hat{A}(T_i)$ automatically sum to 1.
In summary, one may view this estimator as a cohort method applied to
extremely short time intervals.
[1] This notation is used because $N_{hj}(t)$ counts the total number of transitions observed from $h$ to $j$
from the starting date until time $t$, and $\Delta N_{hj}(T_i)$ then is an increment of this process. Note that if
observations were truly in continuous time, we would have no simultaneous jumps and for every
time point $t$ at most one off-diagonal element of $\Delta\hat{A}(t)$ would be non-zero. In practice there are ties
so that several off-diagonal elements can be non-zero at the same time point and the increment of a
particular jump type $\Delta N_{hj}(T_i)$ may even be larger than one. Given the relative richness of time
points (days) and the many types of transitions possible from each class, ties and conventions for
handling them do not seem to play an important role in our data set.

Let us briefly consider the method on the example of the previous section to
see how the estimator produces yet another candidate for estimating a one-year
transition probability matrix. To compute the one-year transition probability
matrix non-parametrically, we first compute

\[ \Delta\hat{A}(T_{1/12}) = \begin{pmatrix} -0.1 & 0.1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\Delta\hat{A}(T_{2/12}) = \begin{pmatrix} 0 & 0 & 0 \\ \frac{1}{11} & -\frac{1}{11} & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\Delta\hat{A}(T_{1/2}) = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -0.1 & 0.1 \\ 0 & 0 & 0 \end{pmatrix}. \]

Therefore we get the Aalen–Johansen estimator

\[ \widehat{P(0, 1)} = \begin{pmatrix} 0.90909 & 0.08181 & 0.00909 \\ 0.09091 & 0.81818 & 0.09091 \\ 0 & 0 & 1 \end{pmatrix}. \]
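
The product in Eq. (4) can be sketched for this three-state example as follows. The helper names `increment` and `aalen_johansen` are illustrative assumptions; the exposure counts (10, 11 and 10 firms) are read off the example above.

```python
import numpy as np

def increment(n_states, from_state, to_state, n_jumps, n_exposed):
    """Increment matrix for one jump date with a single type of transition."""
    dA = np.zeros((n_states, n_states))
    dA[from_state, to_state] = n_jumps / n_exposed
    dA[from_state, from_state] = -n_jumps / n_exposed
    return dA

def aalen_johansen(increments):
    """Product of (I + dA(T_i)) over the jump dates, in chronological order."""
    n = increments[0].shape[0]
    P = np.eye(n)
    for dA in increments:
        P = P @ (np.eye(n) + dA)
    return P

A, B, D = 0, 1, 2
jumps = [
    increment(3, A, B, 1, 10),  # month 1: one of 10 firms in A moves to B
    increment(3, B, A, 1, 11),  # month 2: one of 11 firms in B moves to A
    increment(3, B, D, 1, 10),  # month 6: one of 10 firms in B defaults
]
P_aj = aalen_johansen(jumps)
print(np.round(P_aj, 5))
```

Because each factor has rows summing to one, the product does too, and the order of multiplication follows the order of the jump dates.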
As we can see, there is a difference between the estimator based on the
generator and this estimator of the default probability. Hence it makes a difference
whether we estimate the one-year transition probability based on continuous
observations using the exponential of the generator or the non-parametric
Aalen–Johansen estimator. One can view the matrix exponential as a smoothed
version, and it is clearly this form which is most suited to risk management in
that it allows computation of estimated default and transition intensities over
arbitrarily short time intervals. The two methods are not dramatically different
for large data sets, as illustrated in Tables 4 and 5. As we see, the difference is
much less significant than the one between the cohort method and the methods
based on continuous samples.

By comparing this Aalen–Johansen estimator over longer time horizons to
estimators based on a time-homogeneity assumption, time inhomogeneities will
become apparent. This non-parametric estimator does not give a way of
detecting the sources of these inhomogeneities. To formulate statistical
hypotheses about this, we have to work with the transition intensities, and this is the topic
of Section 5.

Table 4
One-year transition probability matrix estimated for the year 1997
      AAA      AA       A        BBB      BB       B        CCC      D
AAA   0.95912  0.03982  0.00096  0.00010  0.00000  0.00000  0.00000  0.00000
AA    0.01249  0.93689  0.04519  0.00524  0.00015  0.00004  0.00000  0.00000
A     0.00011  0.01666  0.93097  0.04906  0.00274  0.00042  0.00001  0.00003
BBB   0.00002  0.00253  0.03635  0.90603  0.03955  0.01398  0.00030  0.00125
BB    0.00000  0.00012  0.00318  0.07866  0.85980  0.05411  0.00317  0.00096
B     0.00000  0.00005  0.00495  0.00385  0.07029  0.87618  0.02941  0.01527
CCC   0.00000  0.00004  0.00091  0.02523  0.02890  0.11823  0.52289  0.30380
D     0.00000  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000  1.00000
Note: The estimator is the maximum likelihood estimator based on continuous observations and
derived from the maximum likelihood estimator of the intensity matrix assuming
time-homogeneity. The estimator is only marginally different from the Aalen–Johansen estimator
for the same period.

Table 5
Aalen–Johansen estimator for the year 1997
      AAA      AA       A        BBB      BB       B        CCC      D
AAA   0.95866  0.03926  0.00184  0.00023  0.00001  0.00000  0.00000  0.00000
AA    0.01273  0.93714  0.04440  0.00544  0.00022  0.00007  0.00000  0.00000
A     0.00010  0.01682  0.93088  0.04880  0.00278  0.00061  0.00000  0.00001
BBB   0.00002  0.00252  0.03632  0.90736  0.03888  0.01353  0.00009  0.00128
BB    0.00000  0.00016  0.00347  0.07905  0.86016  0.05342  0.00294  0.00079
B     0.00000  0.00005  0.00507  0.00378  0.07066  0.87599  0.02797  0.01647
CCC   0.00000  0.00000  0.00045  0.02302  0.03104  0.12522  0.51784  0.30242
D     0.00000  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000  1.00000
Note: This estimator is a non-parametric estimator of the one-year transition probability matrix.
It is only marginally different from the parametric maximum likelihood estimator obtained in
Table 4.

5. Introducing covariates: The statistical framework
We now set up the appropriate framework for testing whether the transition
intensities of a Markov model depend on certain covariates. Our application is
to test for non-Markovian behavior, namely so-called rating drift (i.e. dependence
on previous rating) and waiting-time effects, but the framework could also
easily handle dependencies on macroeconomic variables (see Kavvathas (2000)
for examples of this).

In Appendix A, we have recalled how a matrix of transition intensities
characterizes the evolution of a non-homogeneous Markov chain. The key
assumption in the following is that the transition intensity for each type of rating
migration is influenced by an external, time-varying covariate. Let $Y_{hi}$ denote an
indicator process, which is 1 when firm $i$ is in state $h$ and 0 otherwise. We
assume that the intensity of transition from state $h$ to state $j$ for firm $i$ is given
as

\[ \lambda_{hji}(t) = Y_{hi}(t)\,\alpha_{hji}(t, Z_i(t)), \]

where $\alpha_{hji}(t, Z_i(t))$ has the multiplicative form

\[ \alpha_{hji}(t, Z_i(t)) = \alpha_{hj0}(t) \exp(\beta_{hj} Z_i(t)). \tag{5} \]

This statistical formulation is a semi-parametric multiplicative hazard model,
which is a proportional intensities regression model. The theory behind this
modeling is described in Andersen et al. (1991) and Section VII of Andersen
et al. (1993). We have summarized it in Appendix A.
Note that the time-varying baseline intensity $\alpha_{hj0}$ is unspecified (but
non-negative) and the parameter of interest is the regression coefficient $\beta_{hj}$. The
covariate $Z_i$ is designed to keep track of an influence on firm $i$'s transition intensity.
Non-zero values of the product of covariates and the parameter cause the
intensities to deviate from the baseline hazard. Thus, if the process of rating
changes exhibits non-Markov behavior the regression coefficients are
significantly different from zero, and this is exactly what we investigate via statistical
tests on $\beta$ in the coming subsections. The estimation of $\beta$ is
done by maximizing a so-called partial likelihood. The theory along with
further references is outlined in Appendix A.
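
As a rough sketch of the partial likelihood idea for a single transition type and a single covariate, each observed jump contributes the ratio of the jumping firm's relative intensity to the sum of relative intensities over all firms at risk in the same state at that date, so the unspecified baseline cancels. The code assumes no tied jump times, uses a grid search as a stand-in for a proper optimizer, and works on invented data; the function name and event format are illustrative assumptions.

```python
import numpy as np

def partial_log_likelihood(beta, events):
    """events: list of (z_of_jumping_firm, z_of_all_firms_at_risk) per observed jump."""
    total = 0.0
    for z_jump, z_risk in events:
        z_risk = np.asarray(z_risk, dtype=float)
        # log of exp(beta * z_jump) / sum over the risk set of exp(beta * z)
        total += beta * z_jump - np.log(np.sum(np.exp(beta * z_risk)))
    return total

# Hypothetical downgrade events out of one rating class; z = 1 if the firm
# entered the class through an upgrade, 0 otherwise. Each risk set includes
# the firm that jumps.
events = [
    (0, [0, 0, 0, 1, 1, 1]),
    (0, [0, 0, 1, 1, 1]),
    (1, [0, 1, 1, 1]),
    (0, [0, 1, 1]),
]
grid = np.linspace(-3.0, 3.0, 601)
values = [partial_log_likelihood(b, events) for b in grid]
beta_hat = grid[int(np.argmax(values))]

# Partial likelihood ratio statistic for the hypothesis beta = 0.
lr_statistic = 2.0 * (max(values) - partial_log_likelihood(0.0, events))
print(beta_hat, lr_statistic)
```

With one parameter, the statistic would usually be compared with a chi-square distribution with one degree of freedom (5% critical value about 3.84); with only four invented events that asymptotic comparison is purely illustrative.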
We have chosen to work with this model in this paper, since we are
concerned with testing non-Markov effects of transitions, i.e. the covariates will be
variables describing whether the previous move was an upgrade or a
downgrade or the duration in the present state for each firm $i$. It is, of course, likely
(see for example Bangia et al., 2000; Nickell et al., 2000; Kavvathas, 2000) that
macroeconomic variables or other indicators of the business cycle influence
rating intensities. Indeed, if a rating system attempts to keep the marginal
default probabilities relatively constant for a given rating category, then one
should see downgrades taking place more often in a recession. But since we do
not want in this paper to explain the macroeconomic influences and do not
want a fully parametric model to be misspecified due to business cycle
variables, we absorb such fluctuations through the baseline intensity. Hence this
test is specifically designed to allow us to focus precisely on the kind of
deviation from a Markov hypothesis that we are interested in.
The rst such deviation which we test for, is whether the last rating change
inuences the transition probabilities out of the present class. 2 The basic co-
variates are dened as

1; individual i was upgraded to the present rating class;
Zi t
0; otherwise:
The statistical test for the hypothesis of no rating drift is seen to be the simple
hypothesis
H : b 0: 6
This hypothesis is equivalent to no serial correlation in any rating class of
previous up- and downgrades. As recalled in the appendix, likelihood theory
provides a (partial) likelihood ratio test for the hypothesis H.
Several studies 3 address the issue of rating drift which is essentially a non-
Markov property in that the history of the rating process not just the current
rating carries information about the transition probabilities. But one needs to be careful
in defining what rating drift really means. To give an example of this, consider the notion
of rating momentum as used for example in Carty and Fons (1993). That study found the
following: for each rating category, the probability of a downgrade following a downgrade
within a year significantly exceeds that of an upgrade following a downgrade. This way of
tackling serial correlation effects is inappropriate since it does not recognize the dynamics
of the Markov chain, which may very well have a lower upgrade probability than downgrade
probability for several rating categories. One should differentiate the direction in which
one came into the current state, not the direction in which one leaves the current state.
Indeed, for high rating categories one would expect to see a small probability of an upgrade
compared to a downgrade, and in a low grade like BB the picture could be reversed. In a
continuous-time Markov model the binomial test hypothesis corresponds to asking whether

$$\sum_{j<h} \alpha_{hj}(t) = \sum_{j>h} \alpha_{hj}(t)$$

for all states except for the best and the default rating class, where $\alpha_{hj}(t)$ is
the transition intensity at time $t$ between state $h$ and $j$. This is not a reasonable
hypothesis. If in addition (as in some studies) data are aggregated across rating categories
such that only the total number of up- and downgradings are considered, then again the drift
could be a consequence of the composition of the firms in the sample: a high number of firms
in low categories would show different results than a sample with a high number of firms in
the high categories. Instead, a rigorous test of rating drift must check whether firms in a
specific rating class exhibit the same rating behavior regardless of whether they obtained
their current rating through an upgrade or a downgrade. Our specification takes care of this
problem. But note that extensions could readily be made: one might be interested in
differentiating which type of upgrade preceded the current rating, and not just note that it
was an upgrade. While this of course gives a more precise statistical hypothesis, it also
rapidly decreases the data underlying the test, and it becomes very hard to obtain
statistical power.

2 For this test, we only include data for firms that experience more than one change of
rating. Similarly, for duration dependence, we need at least one rating change.
3 See e.g. Altman and Kao (1992a,b), Carty and Fons (1993, 1994).
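The covariate behind this test, which records how a firm entered its current rating, can be
sketched in a few lines. This is only an illustration under stated assumptions: the ordered
scale `SCALE` is a fragment chosen for the example, a rating history is assumed to be a list
of `(entry_time, rating)` pairs, and the function name `previous_move_covariate` and the 0/1
coding are hypothetical choices, not taken from the paper.

```python
# Illustrative rating scale, best to worst (a fragment; assumed for this example).
SCALE = ["AAA", "AA+", "AA", "AA-", "A+", "A", "A-", "BBB+", "BBB", "BBB-"]
RANK = {rating: k for k, rating in enumerate(SCALE)}


def previous_move_covariate(history):
    """Return (entry_time, rating, z) for every spell entered through a rating change.

    history: list of (entry_time, rating) pairs in chronological order.
    z = 1 if the spell's rating was reached through a downgrade,
    z = 0 if it was reached through an upgrade.
    The first spell has no previous move and is therefore left out.
    """
    spells = []
    for (_, prev), (time, cur) in zip(history, history[1:]):
        if RANK[cur] == RANK[prev]:
            continue  # no rating change, nothing to classify
        z = 1 if RANK[cur] > RANK[prev] else 0  # larger rank means worse rating
        spells.append((time, cur, z))
    return spells


# Hypothetical firm: A, then A-, then BBB+, then back up to A-.
firm = [(0.0, "A"), (1.5, "A-"), (2.1, "BBB+"), (4.0, "A-")]
print(previous_move_covariate(firm))
```

Grouping all downgrades into one covariate value and all upgrades into the other keeps the
hypothesis coarse; a finer coding (for instance by the size of the previous move) would fit
the same structure but leaves fewer observations per group.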
Second, we study duration dependencies in this model. This requires the following definition
of the basic duration covariates $Z_i$:

$$Z_i(t) = \text{"time since last entry into the present state"}.$$

Since previous empirical evidence has suggested a lower intensity as a function of time spent
in a state, we have chosen the exponential form which not only keeps the intensity positive
but also lets the effect die out as the duration increases when the regression parameter is
negative.
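As a small numerical illustration of the exponential form, the sketch below evaluates the
factor exp(beta * Z) that multiplies the baseline intensity. The value of `beta` and the
durations are hypothetical and chosen only to show the shape; the time unit of the duration
is left unspecified here.

```python
import math


def duration_multiplier(beta, duration):
    """Factor exp(beta * Z) applied to the baseline intensity, with Z the time in state."""
    return math.exp(beta * duration)


beta = -0.5  # hypothetical regression parameter, for illustration only
for z in (0.0, 1.0, 2.0, 5.0):
    print(f"Z = {z:3.1f}  multiplier = {duration_multiplier(beta, z):.3f}")
```

For any real `beta` the multiplier is strictly positive, and for a negative `beta` it
decreases monotonically in the duration, which is the behavior the specification relies on.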
6. Test results
First, we consider the question of momentum or rating drift and ask if the intensity of being
upgraded from a state depends on whether the current state was reached through a downgrade or
an upgrade. We ask the same question for downgrades: Is there a tendency of a downgrade to be
more likely if the current state was reached through a downgrade? To get enough observed
transitions to make meaningful inference, we consider only transitions from the current state
to a neighboring state. We consider all possible ways of reaching the current state but group
together all the downgrades into the current state by assigning the same value of the
covariate for these firms. All the upgrades into the current state are then in the other
group. The results are shown in Tables 6 and 7. In all cases, except for current ratings BB,
CCC+ and CCC, we find a strong downgrade momentum which in several cases increases the
downgrade intensity by a factor of 3. For upgrades, the result is almost the opposite. There
is virtually no detectable effect on the upgrade intensity of a previous upgrade except for
ratings BBB+, BBB, BB+ and B+.
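To make the "factor of 3" concrete, the following sketch converts a few of the estimates in
Table 6 into multiplicative effects exp(beta) on the downgrade intensity. The four estimates
are copied from Table 6; the dictionary name `table6_beta` is just a label for this example.

```python
import math

# Estimates of beta for a few downgrade transitions, copied from Table 6.
table6_beta = {
    ("A+", "A"): 0.582,
    ("A-", "BBB+"): 1.180,
    ("BBB", "BBB-"): 1.180,
    ("BB", "BB-"): 0.144,
}

for (src, dst), beta in table6_beta.items():
    factor = math.exp(beta)  # multiplicative effect of a previous downgrade
    print(f"{src:>4} -> {dst:<5} beta = {beta:.3f}  exp(beta) = {factor:.2f}")
```

An estimate of 1.180 corresponds to a factor somewhat above 3, while the insignificant
estimate for BB to BB- corresponds to a factor close to 1.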
Table 6
Results shown are for the test of an effect of a previous downgrade on the intensity of a
downgrade to a neighboring state

Ratings
From    To      $\hat{\beta}$   std($\hat{\beta}$)   n1    n2    p
AA+     AA      0.897           0.281                149   65    <0.01
AA      AA-     0.936           0.211                314   100   <0.01
AA-     A+      0.871           0.172                490   162   <0.01
A+      A       0.582           0.147                663   198   <0.01
A       A-      0.868           0.160                842   193   <0.01
A-      BBB+    1.180           0.196                780   161   <0.01
BBB+    BBB     0.714           0.168                721   180   <0.01
BBB     BBB-    1.180           0.222                712   140   <0.01
BBB-    BB+     1.090           0.241                641   95    <0.01
BB+     BB      0.970           0.303                513   59    <0.01
BB      BB-     0.144           0.227                571   82    0.53
BB-     B+      0.858           0.253                522   74    <0.01
B+      B       1.010           0.282                575   87    <0.01
B       B-      0.541           0.457                437   43    <0.01
B-      CCC+    2.030           1.040                271   28    <0.01
CCC+    CCC     6.170           23.5                 194   15    0.20
CCC     CCC-    0.929           0.873                150   18    0.32

The first column shows the precise type of transition studied. The second column reports the
estimate of $\beta$. A positive (negative) $\beta$ implies that the downgrade intensity is
increased (decreased) by a factor of $\exp(\beta)$ compared to the case of a previous
upgrade. The standard deviation of the estimate is provided. n1 is the total number of times
we have observed a firm exposed to the given type of transition, i.e. the total number of
censored or uncensored observations in the "From" rating category. n2 reports the number of
actual transitions observed. p is the test statistic reported as a probability. So a test
statistic of <0.01 is significant at least at the level of one percent. We see highly
significant effects in virtually all categories.

Table 7
Results shown are for the test of an effect of a previous upgrade on the intensity of an
upgrade to a neighboring state

Ratings
From    To      $\hat{\beta}$   std($\hat{\beta}$)   n1    n2    p
AA+     AAA     0.106           0.525                149   15    0.84
AA      AA+     0.011           0.545                314   14    0.98
AA-     AA      0.132           0.268                490   56    0.62
A+      AA-     0.337           0.233                663   85    0.14
A       A+      0.449           0.190                842   116   0.02
A-      A       0.261           0.151                780   177   0.08
BBB+    A-      0.720           0.168                721   153   <0.01
BBB     BBB+    0.508           0.173                712   137   <0.01
BBB-    BBB     0.143           0.173                641   144   0.405
BB+     BBB-    0.535           0.174                513   152   <0.01
BB      BB+     0.100           0.187                571   122   0.60
BB-     BB      0.1947          0.190                522   114   0.315
B+      BB-     0.667           0.214                575   90    <0.01
B       B+      0.560           0.277                437   63    0.05
B-      B       0.490           0.477                271   22    0.31
CCC+    B-      6.150           25.7                 194   17    0.24
CCC     CCC+    7.280           45.1                 150   6     0.25

The first column shows the precise type of transition studied. The second column reports the
estimate of $\beta$. A positive (negative) $\beta$ implies that the upgrade intensity is
increased (decreased) by a factor of $\exp(\beta)$ compared to the case of a previous
downgrade. The standard deviation of the estimate is provided. n1 is the total number of
times we have observed a firm exposed to the given type of transition, i.e. the total number
of censored and uncensored observations in the "From" category. n2 reports the number of
actual transitions observed. p is the test statistic reported as a probability. A test
statistic of <0.01 is significant at least at the level of 1%.

Next, we ask if the duration in a given rating influences the downgrade or upgrade intensity.
Again, to make sure the data material is not too thin, we consider only transitions to
neighboring states. In Table 8 it is shown that in almost every case of downgrades we reject
the hypothesis that the duration has no influence. In fact, we see that $\hat{\beta}$ is
negative, meaning that $\exp(\hat{\beta}) < 1$. Thus, the intensity
$\alpha_{ji}(t) = \alpha_{j0}(t)\exp(\beta Z_{ji}(t))$ is negatively affected by an increase
in the duration, i.e. the longer the firm has been in the rating class, the smaller the
probability of a downgrade. In Table 9 we again find that the longer the firm has been in the
rating class, the smaller the probability of an upgrade. Combining the two duration analyses
we may conclude that a firm with a given rating has a lower probability of changing its
rating the more time it spends in its current state. A possible explanation of these effects
could be the reluctance of rating agencies to change a rating by more than one notch at a
time. If this is the case, then firms on the way (say) down through the rating system will
spend a relatively short time in the intermediate states. Hence those that stay there a short
amount of time are often firms on the way down.
Table 8
Results shown are for the test of an effect of the waiting time in the initial category
listed under "From" on the intensity of a downgrade to a neighboring state

Ratings
From    To      $\hat{\beta}$   std($\hat{\beta}$)   n1    n2    p
AAA     AA+      0.348          0.114                61    13    <0.01
AA+     AA      -0.405          0.067                149   65    <0.01
AA      AA-     -0.282          0.037                314   100   <0.01
AA-     A+      -0.380          0.041                490   162   <0.01
A+      A       -0.351          0.035                663   198   <0.01
A       A-      -0.547          0.046                842   193   <0.01
A-      BBB+    -0.628          0.064                780   161   <0.01
BBB+    BBB     -0.360          0.047                721   180   <0.01
BBB     BBB-    -0.555          0.056                712   140   <0.01
BBB-    BB+     -0.679          0.095                641   95    <0.01
BB+     BB      -0.708          0.134                513   59    <0.01
BB      BB-     -0.453          0.099                571   82    <0.01
BB-     B+      -0.621          0.110                522   74    <0.01
B+      B       -0.529          0.085                575   87    <0.01
B       B-      -0.683          0.155                437   43    <0.01
B-      CCC+    -0.902          0.216                271   28    <0.01
CCC+    CCC     -2.241          0.690                194   15    <0.01
CCC     CCC-    -0.704          0.259                150   18    <0.01

The first column shows the precise type of transition studied. The second column reports the
estimate of $\beta$. A negative (positive) $\beta$ implies that the downgrade intensity is
decreased (increased) after a duration of $t$ by a factor of $\exp(\beta t)$ compared to a
case where the duration has no effect. The standard deviation of the estimate is provided. n1
is the total number of times we have observed a firm exposed to the given type of transition,
i.e. the total number of censored or uncensored observations in the "From" rating category.
n2 reports the number of actual transitions observed. p is the test statistic reported as a
probability. A test statistic of <0.01 is significant at least at the level of one percent.
We see highly significant effects in virtually all categories.

The results are striking. One should however note that they build upon an aggregate treatment
of the firms in which we do not separate the industries to which the firms belong. Industry
effects are shown to be significant in Nickell et al. (2000) and Kavvathas (2000). An
analysis run separately on our data for the largest subsample of financial institutions does
produce some deviations. For example, the downgrade momentum is no longer present for the
categories from BB- up to BBB-. An explanation for this could be that financial institutions
are typically unable to compete when the rating goes into the speculative categories. Such an
institution will then often be taken over or merge to achieve consolidation and more
competitive funding rates. Hence a downgrade is not likely to be followed by another
downgrade in these categories. But since we do not have the identities of the firms, we are
unable to check that. Also, our subsamples would be relatively small in the various other
industry groups and it would be hard to get enough statistical power to test the hypotheses
we are interested in for each type of transition.
Table 9
Results shown are for the test of an effect of the waiting time in the initial category
listed under "From" on the intensity of an upgrade to a neighboring state

Ratings
From    To      $\hat{\beta}$   std($\hat{\beta}$)   n1    n2    p
AA+     AAA     -0.416          0.132                149   15    <0.01
AA      AA+     -0.226          0.096                314   14    <0.01
AA-     AA      -0.360          0.072                490   56    <0.01
A+      AA-     -0.331          0.057                663   85    <0.01
A       A+      -0.329          0.049                842   116   <0.01
A-      A       -0.376          0.045                780   177   <0.01
BBB+    A-      -0.449          0.057                721   153   <0.01
BBB     BBB+    -0.266          0.043                712   137   <0.01
BBB-    BBB     -0.346          0.051                641   144   <0.01
BB+     BBB-    -0.532          0.075                513   152   <0.01
BB      BB+     -0.540          0.085                571   122   <0.01
BB-     BB      -0.537          0.084                522   114   <0.01
B+      BB-     -0.383          0.071                575   90    <0.01
B       B+      -0.359          0.100                437   63    <0.01
B-      B       -0.430          0.189                271   22    0.012
CCC+    B-      -0.507          0.247                194   17    0.016
CCC     CCC+    -0.934          0.631                150   6     0.032

The first column shows the precise type of transition studied. The second column reports the
estimate of $\beta$. A negative (positive) $\beta$ implies that the upgrade intensity is
decreased (increased) after a duration of $t$ by a factor of $\exp(\beta t)$ compared to a
case where the duration has no effect. The standard deviation of the estimate is provided. n1
is the total number of times we have observed a firm exposed to the given type of transition,
i.e. the total number of censored or uncensored observations in the "From" rating category.
n2 reports the number of actual transitions observed. p is the test statistic reported as a
probability. A test statistic of <0.01 is significant at least at the level of 1%. We see
highly significant effects in virtually all categories.
7. Conclusion
There are two main conclusions from this paper: First, we show the importance of estimating
transition probabilities based on the full history of rating transitions. Using either the
maximum likelihood estimator in the homogeneous case or the non-parametric Aalen-Johansen
estimator in the non-homogeneous case, the default probabilities over (say) one year are
non-zero even for the highest rating category. This will affect risk measures both of the VaR
type (for small quantiles) and measures of risk taking into account expected loss given that
a certain threshold has been passed.
Second, we have presented a rigorous formulation of the notion of rating drift, a type of
non-Markovian behavior in the process of ratings. The conclusion from analyzing the data set
provided by Standard and Poor's is that there seem to be strong non-Markov effects for
downgrades in the aggregate data set, i.e. working on the entire population of firms without
differentiating for example between industries. Both the duration in a given state and the
direction from which the state was reached have significant effects on the downgrade
intensity. These effects would be consistent with a policy of taking a downgrade through a
series of mild downgrades. However, the effect becomes less pronounced (but still significant
in several categories) when looking for example at financial firms only. For upgrades, a
significant effect of the previous move is only present in a few states, whereas the duration
again seems to be a significant factor.
Acknowledgements
We are grateful to Reza Bahar and Standard and Poor's for providing the data set used in this
paper. Helpful comments from Tom Daula, Bharath Kumar, Stephen Smith and, especially, Peter
Vlaar are gratefully acknowledged. We also thank seminar participants at the 2000 Global
Derivatives Conference in Paris, University of Vienna, The Mathematical Research Institute at
Oberwolfach, Morgan Stanley Dean Witter and at the Federal Reserve Bank of Chicago/Journal of
Banking and Finance conference: "Risk Management in the Global Economy: Measurement,
Management, and Macroeconomic Implications". All errors and opinions are of course our own.
Lando acknowledges partial support by the Danish Natural and Social Science Research
Councils.
Appendix A. Markov chains, estimation and testing
A.1. Non-homogeneous Markov chains and transition intensities
This appendix provides a brief outline of the elements we need from the theory of finite
state space, non-homogeneous Markov chains. The finite state space we consider consists of
the rating categories including a default state and in some cases the "not rated" category as
well.
The evolution of a continuous-time non-homogeneous Markov chain $\eta$ is described through
transition matrices of the form $P(s,t)$ where the $ij$th element contains the transition
probability between states $i$ and $j$ from time $s$ to $t$, i.e.

$$p_{ij}(s,t) = \mathrm{Prob}(\eta_t = j \mid \eta_s = i), \quad s < t.$$

Recall that the Markov property says that

$$\mathrm{Prob}(\eta_t = j \mid \eta_{s_0} = i_0, \eta_{s_1} = i_1, \ldots, \eta_{s_{n-1}} = i_{n-1}, \eta_s = i) = \mathrm{Prob}(\eta_t = j \mid \eta_s = i)$$

whenever $s_0, s_1, \ldots, s_{n-1} < s$. This imposes the familiar restriction on the
transition matrices:

$$P(s,u) = P(s,t)P(t,u) \quad \text{for } s < t < u.$$
To understand the time-inhomogeneity, note that for a time-homogeneous Markov chain the
transition probability matrix is a function of the distance between dates and not the dates
themselves, i.e. in the homogeneous case, there would exist a family of transition matrices
indexed by one parameter, $(P(t))_{t \ge 0}$, and we could then write

$$P(u-s) = P(t-s)P(u-t) \quad \text{for } s < t < u,$$

keeping track only of the distance between the time points and not their location in calendar
time. For the processes we consider, it is always assumed that there exist transition
intensities for each type of transition, i.e. that for each $t$ and each pair of states
$i \neq j$ the limit

$$\lambda_{ij}(t) := \lim_{h \to 0} p_{ij}(t, t+h)/h$$

exists. It is typically more natural to formulate statistical hypotheses in terms of
transition intensities instead of through the probabilities. When these limits exist for all
transitions, then we also have for each row a sum of all the intensities

$$\lambda_i(t) := \sum_{j \neq i} \lambda_{ij}(t)$$

which, if multiplied by (a small) $\Delta t$, approximates the probability of leaving the
state $i$ within $\Delta t$. This function also gives us the duration distribution in state
$i$ in that

$$P(\eta_u = i \text{ for all } u \in [s,t] \mid \eta_s = i) = \exp\left(-\int_s^t \lambda_i(u)\,du\right).$$
Note that this probability is not the same as $p_{ii}(s,t)$, which gives the probability of
being in $i$ both at times $s$ and $t$, but does not restrict the chain to staying in $i$ in
the period between $s$ and $t$. It is only in the case of a homogeneous Markov chain that one
gets a simple formula for all the transition probabilities from the intensities. If the
intensities are constant (time-independent) then we have

$$P(t) := P(0,t) = \exp(\Lambda t) := \sum_{k=0}^{\infty} \frac{\Lambda^k t^k}{k!},$$

where we have given the definition of the matrix exponential as an infinite sum. This is
Eq. (2) and this equation gives us the maximum likelihood estimator used in Section 3 for the
transition probabilities as a function of the estimated intensities.
In the non-homogeneous case, the link between intensities and transition probabilities can be
described as follows, cf. Gill and Johansen (1990): define the cumulative intensity function
for a transition from state $i$ to $j$ as

$$A_{ij}(t) = \int_0^t \lambda_{ij}(s)\,ds, \qquad A_{ii}(t) = -\sum_{j \neq i} A_{ij}(t).$$
The transition matrix for a non-homogeneous chain is given from these cumulative intensities
as a limit:

$$P(s,t) = \prod_{(s,t]} (I + dA) = \lim_{\max|t_i - t_{i-1}| \to 0} \prod_i \left(I + A(t_i) - A(t_{i-1})\right),$$

where $s \le t_1 \le \cdots \le t_n \le t$, and where the $ij$th element of the matrix $A(t)$
is just $A_{ij}(t)$. The Aalen-Johansen estimator directly uses this link by estimating the
increments of the individual intensity functions. These increments are computed from observed
transitions divided by the number of exposed firms. All the cumulative intensities together
produce the estimator for the transition probabilities.
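The product of increments described above can be sketched as follows. This is a minimal
illustration, not the implementation used for the paper: the input layout (`jumps` as a list
of `(dn, y)` pairs, one per jump time), the helper names and the three-state toy data are all
assumptions made for the example.

```python
def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def aalen_johansen(jumps, n_states):
    """Product of (I + dA_hat) over the jump times, in chronological order.

    jumps: list of (dn, y) pairs, one per jump time, where dn[h][j] is the
    number of transitions from h to j observed at that time and y[h] is the
    number of firms exposed in state h just before it.
    """
    p = identity(n_states)
    for dn, y in jumps:
        step = identity(n_states)
        for h in range(n_states):
            if y[h] == 0:
                continue  # nobody exposed in h: no increment for this row
            for j in range(n_states):
                if j != h:
                    step[h][j] += dn[h][j] / y[h]  # off-diagonal increment
                    step[h][h] -= dn[h][j] / y[h]  # keeps the row sum at one
        p = matmul(p, step)
    return p


# Hypothetical toy data with states 0 (high grade), 1 (low grade), 2 (default).
toy_jumps = [
    ([[0, 1, 0], [0, 0, 0], [0, 0, 0]], [10, 10, 0]),  # one firm moves 0 -> 1
    ([[0, 0, 0], [0, 0, 1], [0, 0, 0]], [9, 11, 0]),   # one firm moves 1 -> 2
]
p_hat = aalen_johansen(toy_jumps, 3)
print("estimated default probability from state 0:", p_hat[0][2])
```

In the toy data no firm defaults directly from state 0, yet the estimated default probability
from state 0 is positive because a move from 0 to 1 is followed by a default from 1. This is
the mechanism behind the non-zero default probabilities for high rating categories.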
When we test hypotheses on the influence of factors, such as previous state, on transitions
we analyze each transition intensity separately. The outline of how this is done is provided
in the next section.
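As a numerical check on the formulas of this section, the sketch below takes constant
intensities and compares the matrix exponential series with the product of (I + Lambda dt)
over a fine partition; the two should agree closely. The generator `GEN` is hypothetical, and
the helper names and the number of series terms and partition steps are arbitrary choices for
the illustration.

```python
import math


def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def expm_series(gen, t, terms=30):
    """exp(gen * t) via the truncated power series sum_k (gen * t)^k / k!."""
    n = len(gen)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[x * t / k for x in row] for row in matmul(term, gen)]
        result = [[result[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return result


def product_integral(gen, t, steps=2000):
    """Product of (I + gen * dt) over an equidistant partition of [0, t]."""
    n = len(gen)
    dt = t / steps
    step = [[(1.0 if r == c else 0.0) + gen[r][c] * dt for c in range(n)] for r in range(n)]
    p = identity(n)
    for _ in range(steps):
        p = matmul(p, step)
    return p


# Hypothetical constant intensities: states 0 (high grade), 1 (low grade), 2 (default).
GEN = [
    [-0.10, 0.10, 0.00],
    [0.05, -0.15, 0.10],
    [0.00, 0.00, 0.00],
]

p_exp = expm_series(GEN, 1.0)
p_prod = product_integral(GEN, 1.0)
max_gap = max(abs(p_exp[r][c] - p_prod[r][c]) for r in range(3) for c in range(3))
print("largest difference between the two constructions:", max_gap)
print("p_00(0,1):", p_exp[0][0], " probability of never leaving 0:", math.exp(-0.10))
print("one-year default probability from state 0:", p_exp[0][2])
```

The second line of output illustrates the distinction made above: the probability of being in
state 0 at both ends of the year exceeds the probability of staying there throughout, because
the chain may leave and return.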
A.2. Statistical theory
The data record transitions between states, and we let $N_{hji}$ denote the number of
observed transitions from state $h$ to state $j$ by firm $i$. We assume in the basic model
that the intensity of transition from state $h$ to $j$ is given through

$$N_{hji}(t) = \int_0^t \alpha_{hji}(u) Y_{hi}(u)\,du + M_{hji}(t),$$

where

$$Y_{hi}(t) = \begin{cases} 1 & \text{if firm } i \text{ is in state } h \text{ at time } t, \\ 0 & \text{otherwise,} \end{cases}$$

and $M_{hji}$ is a martingale. The term $\int_0^t \alpha_{hji}(u) Y_{hi}(u)\,du$ is the
cumulative intensity for the transitions of firm $i$ between state $h$ and state $j$. Such a
transition can occur several times if a firm $i$ reenters the state $h$ several times. The
censoring variable $Y_{hi}$ sets the intensity of jumping away from state $h$ for firm $i$
equal to zero when the firm leaves the state either due to a migration to another rating or
to the NR category. When the firm leaves for state $j$ this can be viewed as a censored
observation of the transition from $h$ to any state different from $j$, and the transition to
NR is viewed as independent censoring as well.
The semi-parametric specification, explained in the paper,

$$\alpha_{hji}(t) = \alpha_{hj0}(t)\exp(\beta_{hj} Z_i(t))$$

is used to estimate and test for influence of the covariate process $Z_i$ on the transition
intensity from state $h$ to $j$. The baseline intensity $\alpha_{hj0}(t)$ is left
unspecified, and therefore a full likelihood function cannot be used. Instead, the regression
parameter $\beta_{hj}$ is found by maximizing the so-called partial likelihood

$$L(\beta_{hj}) = \prod_t \prod_i \frac{\exp(\beta_{hj} Z_i(t))}{S^{(0)}_{hj}(\beta_{hj}, t)},$$

where

$$S^{(0)}_{hj}(\beta_{hj}, t) = \sum_{i=1}^{n} Y_{hi}(t)\exp(\beta_{hj} Z_i(t)).$$
Note that the maximization is done for each type of transition separately: we use here the
fact that the partial likelihood of all the observed rating transitions used for estimating
the regression parameters of all transition types actually factors into a product of partial
likelihood functions, one for each transition type, which therefore can be maximized
separately. The maximization is done by setting the (partial) score function of $L(\beta)$
equal to zero, and this score function can be shown to equal

$$\frac{\partial \log L(\beta_{hj})}{\partial \beta_{hj}} = \frac{\partial}{\partial \beta_{hj}} \int_0^T \sum_{h,j,i} \left[\beta'_{hj} Z_i(t) - \log S^{(0)}_{hj}(\beta_{hj}, t)\right] dN_{hji}(t).$$
The asymptotic results are based on the martingale property of this expression (viewed, of
course, as a process in $t$) and it can be shown (see for example Andersen et al., 1993) that
the estimator is asymptotically normal. Furthermore, the (partial) likelihood ratio test for
testing $\hat{\beta}_{hj} = \beta_{hj}$, given as

$$LR = -2\log\left(\frac{L(\beta_{hj})}{L(\hat{\beta}_{hj})}\right) = 2\left[\log L(\hat{\beta}_{hj}) - \log L(\beta_{hj})\right],$$

has an asymptotic chi-square distribution with one degree of freedom. All our tests are based
on this result. Once the estimate of $\beta$ is obtained, one may go back and obtain a
Nelson-Aalen type estimator of the baseline intensity, but we will not be concerned with that
in this paper. For more on this, see Andersen et al. (1993).
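The estimation and testing steps of this section can be sketched for a single transition
type and a scalar covariate. The sketch is illustrative only: the data layout (one
`(z_event, risk_set)` pair per observed transition, with covariates evaluated at the
transition time), the Newton-Raphson iteration, the function names and the toy data are
assumptions made here, not a description of the software used for the paper.

```python
import math


def log_partial_likelihood(beta, events):
    """Sum over events of beta * z_event - log(sum over the risk set of exp(beta * z))."""
    total = 0.0
    for z_event, risk_set in events:
        total += beta * z_event - math.log(sum(math.exp(beta * z) for z in risk_set))
    return total


def score_and_information(beta, events):
    """First derivative and minus the second derivative of the log partial likelihood."""
    score, info = 0.0, 0.0
    for z_event, risk_set in events:
        weights = [math.exp(beta * z) for z in risk_set]
        s0 = sum(weights)
        mean = sum(w * z for w, z in zip(weights, risk_set)) / s0
        mean_sq = sum(w * z * z for w, z in zip(weights, risk_set)) / s0
        score += z_event - mean
        info += mean_sq - mean * mean
    return score, info


def fit_beta(events, tol=1e-10, max_iter=50):
    """Newton-Raphson on the partial score, started at beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, events)
        if abs(score) < tol:
            break
        beta += score / info
    return beta


def lr_test(events, beta_null=0.0):
    """Likelihood ratio statistic and its chi-square p-value (one degree of freedom)."""
    beta_hat = fit_beta(events)
    lr = 2.0 * (log_partial_likelihood(beta_hat, events)
                - log_partial_likelihood(beta_null, events))
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return beta_hat, lr, p_value


# Hypothetical data: covariate of the firm making the transition, and the
# covariates of all firms exposed in the "From" state at that time.
toy_events = [
    (1, [1, 1, 0, 0, 0, 0]),
    (1, [1, 0, 0, 0, 0]),
    (0, [1, 1, 0, 0, 0]),
    (1, [1, 1, 1, 0, 0, 0]),
    (0, [1, 0, 0, 0]),
    (1, [1, 1, 0, 0]),
]
beta_hat, lr, p_value = lr_test(toy_events)
print(f"beta_hat = {beta_hat:.3f}, LR = {lr:.3f}, p = {p_value:.3f}")
```

The p-value uses the identity that a chi-square variable with one degree of freedom is the
square of a standard normal variable, so its upper tail at `lr` equals
`erfc(sqrt(lr / 2))`. With only six hypothetical events the example says nothing about
significance in the actual data; it only shows the mechanics.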
References
Altman, E., Kao, D., 1992a. The implications of corporate bond rating drift. Financial
Analysts Journal, 64-75.
Altman, E., Kao, D., 1992b. Rating drift in high yield bonds. The Journal of Fixed Income,
15-20.
Andersen, P., Borgan, Ø., Gill, R., Keiding, N., 1993. Statistical Models Based on Counting
Processes. Springer, New York.
Andersen, P., Hansen, L.S., Keiding, N., 1991. Non- and semiparametric estimation of
transition probabilities from censored observation of a non-homogeneous Markov process.
Scandinavian Journal of Statistics, 153-167.
Bangia, A., Diebold, F., Schuermann, T., 2000. Ratings migration and the business cycle, with
applications to credit portfolio stress testing. Working paper. Oliver Wyman & Co and New
York University.
Blume, M., Lim, F., MacKinlay, A., 1998. The declining credit quality of US corporate debt:
Myth or reality. Journal of Finance 53 (4), 1389-1413.
Carty, L., 1997. Moody's rating migration and credit quality correlation, 1920-1996. Special
comment, Moody's Investors Service, New York.
Carty, L., Fons, J., 1993. Measuring changes in corporate credit quality. Moody's Special
Report, November.
Das, S., Tufano, P., 1996. Pricing credit-sensitive debt when interest rates, credit ratings
and credit spreads are stochastic. The Journal of Financial Engineering 5, 161-198.
Gill, R., Johansen, S., 1990. A survey of product-integration with a view towards
applications in survival analysis. The Annals of Statistics 18 (4), 1501-1555.
Israel, R., Rosenthal, J., Wei, J., 1999. Finding generators for Markov chains via empirical
transition matrices. Working paper, University of British Columbia and University of Toronto.
Jarrow, R., Lando, D., Turnbull, S., 1997. A Markov model for the term structure of credit
risk spreads. Review of Financial Studies, 481-523.
Kavvathas, D., 2000. Estimating credit rating transition probabilities for corporate bonds.
Working paper. Department of Economics. University of Chicago.
Küchler, U., Sørensen, M., 1997. Exponential Families of Stochastic Processes. Springer, New
York.
Lando, D., 1998. On Cox processes and credit risky securities. Review of Derivatives Research
2 (2), 99-120.
Lucas, D., Lonski, J., 1992. Changes in corporate credit quality 1970-1990. The Journal of
Fixed Income, 7-14.
Moody's, 1994. Corporate bond defaults and default rates 1970-1993. Moody's Special Report,
January.
Nickell, P., Perraudin, W., Varotto, S., 2000. Stability of ratings transitions. Journal of
Banking and Finance 24 (1-2), 203.
Shumway, T., 2001. Forecasting bankruptcy more accurately: A simple hazard model. Journal of
Business.
Skødeberg, T., 1998. Statistical analysis of rating transitions - a survival analytic
approach. Master's Thesis, University of Copenhagen.