Curve Fitting
There is a need to value all instruments consistently within a single valuation
framework. For this we need a risk-free yield curve, which will be a continuous zero
curve (because this is the standard format for all option pricing formulae). Thus, a
yield curve is a function r = r(t), where a single-payment investment for time t will
earn a continuous rate r = r(t); that is, a payment of 1 at initiation will be redeemed
by a payment of exp(r(t)t) at time t.
As explained in Zangari (1977) and Lin (2002) term structure estimation methods
can be classified into two groups: theoretical and empirical. Theoretical term
structure methods typically posit an explicit structure for a variable known as the
short rate of interest, whose value depends on a set of parameters that might be
large set of instruments, does the algorithm to find the best fit curve converge
sufficiently rapidly, and is the degree of error in the created curve sufficiently
(2) In the case of yield curves, how good do the forward rates look? These are
usually taken to be the 1 month or 3 month forward rates, but these are
virtually the same as the instantaneous rates. We will want to have positivity
and continuity of the forwards. It is required that forwards be positive to avoid
arbitrage, while continuity is required as the pricing of interest-sensitive
instruments is sensitive to the stability of forward rates. As pointed out in
McCulloch & Kochin (2000), ‘a discontinuous forward curve implies either
implausible expectations about future short-term interest rates, or implausible
expectations about holding period returns’. Thus, such an interpolation
method should probably be avoided, especially when pricing derivatives whose
value is dependent upon such forward values.
(3) How local is the interpolation method? If an input is changed, does the
interpolation function only change nearby, with no or minor spill-over
elsewhere, or can the changes elsewhere be material?
(4) Are the forwards not only continuous, but also stable? We can quantify the
degree of stability by looking for the maximum basis point change in the
forward curve given some basis point change (up or down) in one of the inputs.
Many of the simpler methods can have this quantity determined exactly, for
others we can only derive estimates.
(5) How local are hedges? Suppose we deal with an interest rate derivative of a
particular tenor. We assign a set of admissible hedging instruments, for
example, in the case of a swap curve, we might (even should) decree that the
admissible hedging instruments are exactly those instruments that were used to
bootstrap the yield curve. Does most of the delta risk get assigned to the
hedging instruments that have maturities close to the given tenors, or does a
material amount leak into other regions of the curve?
We will discuss criteria (1) and (2) as we proceed with each method that we
analyse. Criteria (3), (4) and (5) will be discussed much later.
In most cases we have the rates $r_1, r_2, \ldots, r_n$ at the nodes $t_1, t_2, \ldots, t_n$ and need to
determine the rate r(t) where t is not necessarily one of the $t_i$. Occasionally we will
have the forward rates rather than the rates themselves, and are required to perform
the interpolation on these. In these cases, we may wish to recover the rates using the
relationship $f(t) = \frac{\partial}{\partial t}\,[r(t)\,t]$.
For any $t \notin [t_1, t_n]$, the value of r(t) or f(t) will be that rate found at the nearer of
$t_1$ or $t_n$.
Note that the forward is positive if and only if the capitalization function is
increasing, equivalently, r(t)t is increasing.
Swap Curves
Let us first consider swap curves. Suppose a swap makes the fixed payments at time
t1, t2, …, tn; time is measured in years. As explained in Hull (2002, Section 6.4), a
swap just issued at par can be valued by
\[ R_n \sum_{i=1}^{n} \alpha_i Z(t_i) + Z(t_n) = 1 \tag{1} \]
where $R_n$ is the par swap rate, and $\alpha_i$ is the time in years from $t_{i-1}$ to $t_i$, calculated
with the relevant day count convention. In the theory, $R_n$ is now solved for, as
\[ R_n = \frac{1 - Z(t_n)}{\sum_{i=1}^{n} \alpha_i Z(t_i)} \tag{2} \]
Alternatively, we can inductively suppose that $Z(t_i)$ is known for $i = 1, 2, \ldots, n-1$,
and $R_n$ is known, to get
\[ Z(t_n) = \frac{1 - R_n \sum_{i=1}^{n-1} \alpha_i Z(t_i)}{1 + R_n \alpha_n} \tag{3} \]
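As an illustration of the induction in (3), the following minimal sketch bootstraps discount factors when a par rate is available at every payment date. The function name, the annual accruals and the sample par rates are hypothetical choices made only for this example.

```python
import math

def bootstrap_discount_factors(par_rates, accruals):
    """Apply formula (3) tenor by tenor.

    par_rates[k] and accruals[k] are R_n and alpha_n for the (k+1)-th payment date.
    """
    discount_factors = []
    annuity = 0.0  # running sum of alpha_i * Z(t_i) over the earlier payment dates
    for rate, alpha in zip(par_rates, accruals):
        z = (1.0 - rate * annuity) / (1.0 + rate * alpha)  # formula (3)
        discount_factors.append(z)
        annuity += alpha * z
    return discount_factors

# Hypothetical inputs: annual payments and par rates for 1 to 5 years.
par_rates = [0.030, 0.032, 0.034, 0.035, 0.036]
accruals = [1.0] * len(par_rates)
Z = bootstrap_discount_factors(par_rates, accruals)
# Continuously compounded zero rates r(t_n) = -ln Z(t_n) / t_n
zero_rates = [-math.log(z) / (k + 1) for k, z in enumerate(Z)]
```

Each bootstrapped swap reprices to par by construction, which is a convenient sanity check on any implementation of (1) to (3).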
At first blush, use of (3) assumes that inputs to the curve are available for all
standard tenors to maturity. This is typically not the case. For example, in
constructing a swap curve, we might use deposit rates in the very short term, forward
rate agreements or futures in the short to medium term, and swap rates in the longer
term. Typically, the FRA or futures rates will be available for calculation of the
relevant rates for all three-month tenors out to say two years.
The use of futures and FRAs will pose no difficulty. One applies a standard
convexity adjustment to futures prices to get an equivalent FRA rate. This convexity
adjustment will depend on some time and volatility parameters, but not on the yield
curve itself. However, the swap rates may only be available in say 2, 3, 4, up to 10
year tenors. What to do about tenors which are not in whole number of years away?
Even worse, the swap rates may only be available in say 2, 3, 4, 5 and 10 year tenors,
with the 6 to 9 year tenors insufficiently liquid to use with confidence. Thus, lack of
liquidity can reduce our information set dramatically.
One approach now advocated in some sources is to interpolate (linearly, say) the
input swap rates to the expiries which are not quoted, and then proceed with a
complete information set. However, this decouples the interpolation procedure from
the bootstrap procedure, even if the chosen interpolation method here is the same as
the interpolation method that will be used to find rates at points which are not nodes
after the bootstrap is completed. Rather, we rewrite (3) as
\[ r_n = -\frac{1}{t_n} \ln\left[ \frac{1 - R_n \sum_{j=1}^{n-1} \alpha_j Z(t, t_j)}{1 + R_n \alpha_n} \right] \tag{4} \]
and this gives us a very useful iterative formula: we guess initial rates $r_n$ for each
of the quoted expiries, interpolate the yield curve itself with our chosen method
to determine any missing $r_j$, and hence any $Z(t, t_j)$, and use
this formula to extract new estimates of the $r_n$. The initial guess might, for
example, be the continuous equivalent of the input swap rate, but in reality, any
guess will suffice. We then iterate; convergence is fast over the entire yield curve.
Figure 1. This method used in finding a swap curve, with the limiting curve in the contrasting
Thus, the interpolation method applies not only to the spaces between standard
tenors, but to the (typically larger) spaces between the input tenors.
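The iteration built on (4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: payments are taken to be annual with unit accruals, the quoted tenors are whole years, and plain linear interpolation of the rates stands in for "our chosen method". All function names and sample quotes are hypothetical.

```python
import math

def interp_rate(t, nodes, rates):
    """Linear interpolation of r between nodes, flat outside (illustrative choice)."""
    if t <= nodes[0]:
        return rates[0]
    if t >= nodes[-1]:
        return rates[-1]
    for k in range(1, len(nodes)):
        if t <= nodes[k]:
            w = (t - nodes[k - 1]) / (nodes[k] - nodes[k - 1])
            return (1.0 - w) * rates[k - 1] + w * rates[k]

def iterate_bootstrap(nodes, par_rates, tol=1e-14, max_iter=100):
    """Repeat formula (4) until the node rates stop changing."""
    rates = [math.log(1.0 + R) for R in par_rates]  # initial guess
    for _ in range(max_iter):
        new_rates = []
        for tn, R in zip(nodes, par_rates):
            # Discount factors at the earlier annual payment dates 1, ..., tn - 1,
            # read off the current guess of the curve (alpha_j = 1 assumed).
            annuity = sum(math.exp(-interp_rate(tj, nodes, rates) * tj)
                          for tj in range(1, int(tn)))
            new_rates.append(-math.log((1.0 - R * annuity) / (1.0 + R)) / tn)
        change = max(abs(a - b) for a, b in zip(new_rates, rates))
        rates = new_rates
        if change < tol:
            break
    return rates

# Hypothetical quotes: par swap rates at 1, 2, 3, 5 and 10 years only.
nodes = [1, 2, 3, 5, 10]
par_rates = [0.030, 0.032, 0.034, 0.036, 0.040]
rates = iterate_bootstrap(nodes, par_rates)
```

At the fixed point every quoted swap reprices to par off the interpolated curve, even though the 4, 6, 7, 8 and 9 year rates were never quoted.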
Bond Curves
Bootstrapping bond curves poses new problems. Let us first consider the case where
there are only a few bonds for construction of the yield curve, and we require an
output yield curve which prices those bonds exactly. In this case, we can consider two
different ways of realizing the value of any of the bonds: the all-in (dirty) price of the
bond, adjusted if necessary for any defined payment lags in the market, and the sum
of the present value of all of the cash flows due to the owner as found off of the
desired yield curve.
We easily set up equations very similar to (4): there will be one equation for each
bond, and the rate on the left-hand side will be the rate for the maturity date of the
bond. The first guess could, for example, be the continuous equivalent of the yield to
maturity of the bond if such an input exists (in other words, if the market trades on
or calculates yield to maturities of bonds). Again, in reality, any initial estimate will
typically suffice.
In many markets, there will rather be a surfeit of bond information, with many
bonds of different maturities trading. We must assume that, modulo liquidity
issues, the bonds are reasonably homogeneous, or can be homogenized using some
procedure which will occur prior to input to a bootstrap algorithm. Because of
liquidity issues, one may prefer to exclude some of the bonds, and use only a subset
of the bonds to bootstrap the yield curve; those left out are then deemed to be
marked to market at the price one obtains by stripping them off the yield curve,
rather than the illiquid (and hence by now ‘erroneous’) last price at which they traded.
A key issue is to decide on how many bonds to include as bona fide inputs to the
bootstrap. To exclude too many runs the risk of excluding market information
which is actually meaningful; on the other hand, including too many could result
in a yield curve that is implausible, a yield curve that admits arbitrage, or a
bootstrap algorithm that fails to converge. In this case, we need to consider
constructing a yield curve that ‘does as good a job as possible’ in recovering the
prices of the inputs.
In this case, what needs to be done can easily be understood; we will not deal with
the specifics here, as they involve some multi-dimensional minimization problem.
One needs to fix some set of node points, for example, they could be the maturity
dates of those bonds that are deemed to be the most important, or could be the same
nodes as exist in the swap curve, for example. One then postulates values of the yield
curve at each of those node points, and completes the yield curve by using the chosen
interpolation method. We can then calculate the value of each bond as stripped off
this curve, versus the value that it is trading at in the market. The error (typically
squared, and possibly weighted in order to attach more importance to some bonds
than to others) is then summed across all the bonds. The values of the curve at the
node points are perturbed, using some optimisation routine, to minimise this
summed error.
We now go on to consider a variety of interpolation methods.
and so
\[ r(t) = -\frac{1}{t} \ln\left[ \frac{t - t_i}{t_{i+1} - t_i}\, \delta_{i+1} + \frac{t_{i+1} - t}{t_{i+1} - t_i}\, \delta_i \right] \tag{5} \]
Raw Interpolation
This method is linear on the logarithm of discount factors, and as we shall see,
corresponds to piecewise constant forward curves. To a good approximation, any
forward curve that has the same area between each node would work. This means
that if a piecewise linear approximation starts too high, it has to go too low to
average to the right value, but then it starts the next interval too low and has to go
too high to average to the right value. This method is very stable, is trivial to
implement, and is usually a base method one implements in a system before any
others. One can often find mistakes in fancier methods by comparing the raw
method with the more sophisticated method.
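A minimal sketch of the raw method, assuming the flat extrapolation outside $[t_1, t_n]$ described earlier; the function name and the sample curve are hypothetical. Because r(t)t is interpolated linearly, the forward returned on each interval is the constant discrete forward for that interval.

```python
import bisect

def raw_rate_and_forward(t, nodes, rates):
    """Return (r(t), f(t)) when r(t)*t is linear between nodes.

    Outside the node range both values are taken flat at the nearer node rate.
    """
    if t <= nodes[0]:
        return rates[0], rates[0]
    if t >= nodes[-1]:
        return rates[-1], rates[-1]
    k = bisect.bisect_left(nodes, t)  # nodes[k-1] < t <= nodes[k]
    t0, t1 = nodes[k - 1], nodes[k]
    # Discrete forward over the interval: constant slope of r(t)*t
    forward = (rates[k] * t1 - rates[k - 1] * t0) / (t1 - t0)
    rate = (rates[k - 1] * t0 + forward * (t - t0)) / t
    return rate, forward

# Hypothetical three-node curve
nodes = [1.0, 2.0, 5.0]
rates = [0.030, 0.035, 0.040]
r_mid, f_mid = raw_rate_and_forward(1.5, nodes, rates)
```

Comparing the output of a more elaborate interpolator with this function at the nodes is one way to apply the cross-check suggested in the text.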
Since the instantaneous forward curve is $f(t) = \frac{\partial}{\partial t}\,[r(t)\,t]$, the interpolating function
for the yield curve is $r(t) = K + \frac{C}{t}$. Given the two endpoints, this solves as
\[ f(t) := K = \frac{r_{i+1} t_{i+1} - r_i t_i}{t_{i+1} - t_i}, \qquad C = \frac{(r_i - r_{i+1})\, t_i\, t_{i+1}}{t_{i+1} - t_i} \]
and after some manipulation, we get
Remarkably, this method is quite popular, being provided as one of the default
methods by many software vendors. However, it is clear from (11) that this method
does not guarantee positive forward rates. As a trivial (not necessarily practicable)
example, if we have a two-point curve, with nodes (1,6%) and (30,2%) then the
forward rates are negative from about the 26th year.
All these simple methods have continuity difficulties associated with them. Thus,
they should not be used for anything but naive interpolation of yield curves,
after which criteria such as rate smoothness, forward rate smoothness etc. are
Cubic Splines
As before, suppose $t_1, t_2, \ldots, t_n$ and $r_1, r_2, \ldots, r_n$, with $r_i := r(t_i)$, are known. To complete a
cubic spline, we desire coefficients $(a_i, b_i, c_i, d_i)$ for $1 \le i \le n-1$. Given these
coefficients, the function value at any term t will be
\[ r(t) = a_i + b_i (t - t_i) + c_i (t - t_i)^2 + d_i (t - t_i)^3, \qquad t_i \le t \le t_{i+1} \tag{12} \]
Note that
\[
\begin{aligned}
r'(t) &= b_i + 2c_i (t - t_i) + 3d_i (t - t_i)^2, & t_i < t < t_{i+1} \\
r''(t) &= 2c_i + 6d_i (t - t_i), & t_i < t < t_{i+1} \\
r'''(t) &= 6d_i, & t_i < t < t_{i+1}
\end{aligned}
\]
Let us define
\[ b_n := b_{n-1} + 2c_{n-1} h_{n-1} + 3d_{n-1} h_{n-1}^2 \tag{14} \]
so that $b_n$ is the derivative of the interpolating function at the right hand endpoint. In
the most general case (de Boor 1978, 2001, Chapter IV), the specification of the
remaining n linear constraints is equivalent to specifying $b_1, b_2, \ldots, b_n$ (as any n
additional conditions will do, assuming there is no redundancy - the point in de
Boor, 1978, 2001, is that such a view can be an aid to classification). In particular, if
we take this approach, defining $b_1, b_2, \ldots, b_n$, then $c_1, c_2, \ldots, c_{n-1}$ and $d_1, d_2, \ldots, d_{n-1}$
follow easily, as for each i, we have two equations in two unknowns, which easily
solve as:
\[ m_i = \frac{a_{i+1} - a_i}{h_i} \tag{15} \]
Then, using the natural cubic spline, forward rates after about 28 years are
negative. On the other hand, with the inputs
the forward rates are satisfactory. In both cases, the discrete forward rate in the
20–30 year period is 5%. This illustrates another property that is missing from these
analytic splining methods: locality. The interpolation in a region should take into
account the data in that region, and not the data some distance away.
One determines the coefficients using the well-known natural cubic spline
algorithm (Burden & Faires 1997, Algorithm 3.4).
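With the unknowns ordered as $(b_1, c_1, d_1, \ldots, b_{n-1}, c_{n-1}, d_{n-1})$, the system reads, row by row:
\[
\begin{aligned}
c_1 &= 0, \\
b_i h_i + c_i h_i^2 + d_i h_i^3 &= a_{i+1} - a_i, & i &= 1, \ldots, n-1, \\
b_i + 2c_i h_i + 3d_i h_i^2 - b_{i+1} &= 0, & i &= 1, \ldots, n-2, \\
c_i + 3d_i h_i - c_{i+1} &= 0, & i &= 1, \ldots, n-2, \\
b_{n-1} + 2c_{n-1} h_{n-1} + 3d_{n-1} h_{n-1}^2 &= 0.
\end{aligned}
\]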
The first equation above, $c_1 = 0$, is the left-hand condition $f''(t_1) = 0$. The last equation
above, $b_{n-1} + 2c_{n-1} h_{n-1} + 3d_{n-1} h_{n-1}^2 = 0$, is the right-hand condition $f'(t_n) = 0$.
derivative of r(t)t zero at the short end would make the first derivative of r(t) zero,
and this is contrary to typically observed features of yield curves.
The entire system of equations can once again be rewritten into a tridiagonal
system, in a way very much analogous to the ordinary cubic spline algorithm, and so
this is a special case implementation of Crout’s method of solving a tridiagonal system.
Bessel (Hermite) cubic spline. This method stands in relation to the Bessel method as
the quadratic-normal method stands in relationship to the natural cubic spline. Thus,
it is exactly the Bessel method, but applied to the function r(t)t rather than r(t).
Thus, the interpolation formulae are the same as (21) and (22).
\[ m_i = \frac{r_{i+1} - r_i}{t_{i+1} - t_i} \qquad (1 \le i \le n-1) \tag{24} \]
but simple examples show that this method may fail to be locally monotone
immediately to the interior side of the endpoints. In particular, negative forward
rates are possible, even likely. Thus, rather we define
\[ b_1 = 0 = b_n \tag{25} \]
102 P. S. Hagan and G. West
Then we also include the adjustment (Hyman 1983, Equation 2.3), which ensures
that no spurious extrema are introduced in the interpolated function. This
adjustment is
\[
b_i = \begin{cases}
\min(\max(0, b_i),\ 3\min(m_{i-1}, m_i)) & \text{if the curve is locally increasing at } i \\
\max(\min(0, b_i),\ 3\max(m_{i-1}, m_i)) & \text{if the curve is locally decreasing at } i
\end{cases} \tag{27}
\]
Note that the requirement that the interpolatory curve preserves the geometry of
the curve does not guarantee that the forward function is positive.
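Adjustment (27) can be sketched as below. The text does not spell out here what "locally increasing at i" means, so this sketch assumes it means that both adjacent secant slopes $m_{i-1}$ and $m_i$ are positive (and both negative for "locally decreasing"). Node derivatives at the two ends and at nodes where the slopes change sign are left untouched, which is also an assumption of this illustration. The function name is hypothetical.

```python
def hyman_adjust(b, m):
    """Apply adjustment (27) to interior node derivatives.

    b: derivatives b_1..b_n at the nodes (0-based list of length n)
    m: secant slopes m_1..m_{n-1} between consecutive nodes (length n-1)
    Returns an adjusted copy of b.
    """
    adjusted = list(b)
    for i in range(1, len(b) - 1):  # interior nodes only
        left, right = m[i - 1], m[i]
        if left > 0 and right > 0:    # assumed meaning of "locally increasing at i"
            adjusted[i] = min(max(0.0, b[i]), 3.0 * min(left, right))
        elif left < 0 and right < 0:  # assumed meaning of "locally decreasing at i"
            adjusted[i] = max(min(0.0, b[i]), 3.0 * max(left, right))
    return adjusted

# Hypothetical derivative estimates that overshoot the local slopes
example = hyman_adjust([0.0, 5.0, -1.0, 0.0], [1.0, 2.0, 0.5])
```

In the example the second derivative is capped at three times the smaller adjacent slope, and the third, which has the wrong sign for an increasing curve, is set to zero.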
Quartic Splines
Quartic Forward Spline
According to Adams (2001), the interpolation method that guarantees the smoothest
interpolation of the continuous instantaneous forward rates is a quartic spline of
that continuous forward curve. See also van Deventer & Inai (1997), Adams & van
Deventer (1994), and Lim & Xiao (2002). A variation of this method is implemented
in Quant Financial Research (2003).
As before, suppose $t_1, t_2, \ldots, t_n$ and $f_1, f_2, \ldots, f_n$, with $f_i := f(t_i)$, are known. To complete
the requisite spline for f, we desire coefficients $(a_i, b_i, c_i, d_i, e_i)$ for $1 \le i \le n-1$. Given
Figure 2. The forward curves under various cubic interpolation methods for the given rates
Thus we have 5n−8 equations in 5n−5 unknowns, and so we need three more
conditions. The following three conditions are specified in Adams (2001):
• $f''(t_1) = 0$,
• $f'(t_n) = 0$,
• $f''(t_n) = 0$.
The system is actually a banded matrix with widths 2 and 6. As a banded
matrix, we write it in the form suggested in Press et al. (1992, Section 2.4): so a
5(n−1)×9 matrix A. The scheme below is [A||x||b], where Ax = b, and A is written in
2-1-6 bandwidth form.
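With $x = (a_1, b_1, c_1, d_1, e_1, \ldots, a_{n-1}, b_{n-1}, c_{n-1}, d_{n-1}, e_{n-1})$, the equations Ax = b read, row by row:
\[
\begin{aligned}
2c_1 + 6d_1 t_1 + 12e_1 t_1^2 &= 0, \\
a_i + b_i t_i + c_i t_i^2 + d_i t_i^3 + e_i t_i^4 &= f_i, \\
a_i + b_i t_{i+1} + c_i t_{i+1}^2 + d_i t_{i+1}^3 + e_i t_{i+1}^4 &= f_{i+1}, \\
-b_i - 2c_i t_{i+1} - 3d_i t_{i+1}^2 - 4e_i t_{i+1}^3 + b_{i+1} + 2c_{i+1} t_{i+1} + 3d_{i+1} t_{i+1}^2 + 4e_{i+1} t_{i+1}^3 &= 0, \\
-2c_i - 6d_i t_{i+1} - 12e_i t_{i+1}^2 + 2c_{i+1} + 6d_{i+1} t_{i+1} + 12e_{i+1} t_{i+1}^2 &= 0, \\
-6d_i - 24e_i t_{i+1} + 6d_{i+1} + 24e_{i+1} t_{i+1} &= 0, \\
b_{n-1} + 2c_{n-1} t_n + 3d_{n-1} t_n^2 + 4e_{n-1} t_n^3 &= 0, \\
2c_{n-1} + 6d_{n-1} t_n + 12e_{n-1} t_n^2 &= 0,
\end{aligned}
\]
where the two interpolation rows are repeated for $i = 1, \ldots, n-1$ and the three continuity rows for $i = 1, \ldots, n-2$.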
The first equation above is the first extra condition, while the last two equations
above are the other extra conditions.
This system is solved with the bandwidth matrix algorithms.
The bad news (and this should be expected) is that, like the cubic splining
methods, there is no guarantee of the absence of negative forward rates. These
methods are demanding such high smoothness criteria that any desired stiffness is
completely lost from the system. Thus, we can have enormous and completely
implausible fluctuations in the output curve. For example, if we have the following
forward curve:
Term (years)   Forward rate
0.1            2.00%
1              2.00%
2              2.00%
6              3.00%
7              2.00%
30             2.00%
then the quartic spline on forwards has negative values. See Figure 3. By simply
adjusting the 6 year rate from 3% to 2%, we get a forward curve which is flat
everywhere, at 2%.
\[
\begin{aligned}
f(t) &= b_i + 2c_i t + 3d_i t^2 + 4e_i t^3 + 5g_i t^4, \qquad t_i \le t \le t_{i+1} \\
f'(t) &= 2c_i + 6d_i t + 12e_i t^2 + 20g_i t^3 \\
f''(t) &= 6d_i + 24e_i t + 60g_i t^2 \\
f'''(t) &= 24e_i + 120g_i t
\end{aligned}
\]
The first equation above is the first extra condition, while the three final equations
above, are, in order, the second, third and fourth extra conditions specified above.
This system is solved with the bandwidth matrix algorithms.
The problems are even worse than before. This method produces negative forward
rates for the inputs seen earlier. As another tamer example, with the very innocuous
rate inputs
one has a very unstable interpolant, with wild fluctuations in the output curve, in
particular at the short end (where the tenors of the inputs are closer together). See
Figure 4. Even though the discrete forwards are completely reasonable, lying in the
range 4–8%, the interpolated curve is not.
The problem with many of the schemes we have seen so far is that we do not have
instantaneous forward rates as inputs to a yield curve, we have (or can rearrange our
inputs so that we have) discrete forwards for entire intervals. Many of the methods
we have seen so far are implicitly treating discrete forwards not as a property of the
entire interval, but as a property of the right endpoint of that interval, and ignore the
interval itself. We will change focus appropriately in the following section.
\[ f_0 = f_1^d - \tfrac{1}{2}\,(f_1 - f_1^d), \tag{31} \]
\[ f_n = f_n^d - \tfrac{1}{2}\,(f_{n-1} - f_n^d). \tag{32} \]
The interpolation algorithm will proceed on the rates $f_i$. For $i = 1, 2, \ldots, n-1$ this
choice amounts to interpolating $f_i$ from the average values $f_{i+1}^d$ and $f_i^d$ at the
midpoints of the adjacent intervals. The values $f_0$ and $f_n$ were selected so that
$f'(0) = 0 = f'(t_n)$.
For some emerging markets, we may know the overnight rate f(0). If so, this
should be used for the end-point in preference to (31).
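\[ x(t) = \frac{t - t_{i-1}}{t_i - t_{i-1}} \tag{36} \]
for $i = 1, 2, \ldots, n$.
Here and later we use the following simple fact: suppose $x := x(s) = \frac{s - t_{i-1}}{t_i - t_{i-1}}$ and
$G' = g$. Then
\[ \int_{t_{i-1}}^{t} g(x(s))\, ds = (t_i - t_{i-1}) \left[ G\!\left( \frac{t - t_{i-1}}{t_i - t_{i-1}} \right) - G(0) \right] \tag{37} \]
Beyond $t_n$, we should use flat extrapolation: $f(t) = f(t_n)$ for all $t > t_n$.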
In the next section we enforce monotonicity and convexity. Before doing this,
however, let us note some properties of the basic interpolator.
First, the accuracy of the interpolator is $O(\Delta t)^2$ as $\Delta t \to 0$. This is because
(1) (30) can be viewed as first approximating f(t) at the midpoints of the intervals
by its average over the interval, and then linearly interpolating to find $f_i$ at the
end points of the interval;
(2) the value of any smooth function f(t) at the midpoint of an interval is within
$O(\Delta t)^2$ of the average value of the function over the interval;
(3) linear interpolation has an error of $O(\Delta t)^2$.
Moreover, discount factors rely on the integrals $\int f(t')\,dt'$. Since f(t) has an
$O(\Delta t)^2$ error, the error in the discount factor is $O(\Delta t)^3$.
Second, (30) implies that $f_i$ is between the average values of f(t) on the adjacent
intervals:
\[ \min(f_i^d, f_{i+1}^d) \le f_i \le \max(f_i^d, f_{i+1}^d). \tag{38} \]
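\[ 0 < \frac{\partial f}{\partial f_i^d} < \frac{3}{2}, \tag{41} \]
\[ -\frac{1}{3}\,\frac{t_i - t_{i-1}}{t_{i+1} - t_{i-1}} < \frac{\partial f}{\partial f_{i+1}^d} < \frac{t_i - t_{i-1}}{t_{i+1} - t_{i-1}} \tag{42} \]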
Now, by considering (38), we see that sufficient conditions for our monotonicity
requirements over the ith interval are:
\[ f_{i-1} \le f_i^d \le f_i \;\Rightarrow\; f(t) \text{ is monotone increasing} \tag{43} \]
So, for example, if $g_{i-1} > 0$, $g_i > 0$ then $g'(0) < 0$ and $g'(1) > 0$, so g has a minimum on
the interval. In full generality, the analysis breaks down into eight cases, where the
values g'(0) and g'(1) are positive or negative, and where one (but not both) is zero.
See Figure 5. The eight regions are the rays labelled (b), (d), (f), and (h) and the
angled regions (a), (c), (e), (g) between them. The origin is excluded.
Observe that
(a) If $g_{i-1} > -2g_i$ and $g_{i-1} > -\tfrac{1}{2} g_i$ then g(x) has a minimum in 0 < x < 1.
(b) If $g_{i-1} > 0$ and $g_i = -\tfrac{1}{2} g_{i-1}$ then g(x) has a minimum at x = 1.
(c) If $g_{i-1} > 0$ and $-2g_{i-1} < g_i < -\tfrac{1}{2} g_{i-1}$ then g(x) is monotone decreasing.
(d) If $g_{i-1} > 0$ and $g_i = -2g_{i-1}$ then g(x) has a maximum at x = 0.
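The case analysis above can be checked numerically. The sketch below assumes the basic quadratic $g(x) = g_{i-1}(1 - 4x + 3x^2) + g_i(-2x + 3x^2)$, which is the form the text later integrates via (47), and classifies a pair $(g_{i-1}, g_i)$ from the signs of g'(0) and g'(1). The function names and return labels are hypothetical, and the boundary rays are folded into the monotone cases for brevity.

```python
def g(x, g_left, g_right):
    """Basic quadratic on 0 <= x <= 1 with g(0) = g_left and g(1) = g_right."""
    return g_left * (1 - 4 * x + 3 * x * x) + g_right * (-2 * x + 3 * x * x)

def classify(g_left, g_right):
    """Describe the shape of g from the signs of g'(0) and g'(1)."""
    d0 = -4.0 * g_left - 2.0 * g_right  # g'(0)
    d1 = 2.0 * g_left + 4.0 * g_right   # g'(1)
    if d0 == 0.0 and d1 == 0.0:
        raise ValueError("the origin is excluded")
    if d0 < 0.0 and d1 > 0.0:
        return "interior minimum"
    if d0 > 0.0 and d1 < 0.0:
        return "interior maximum"
    if d0 <= 0.0 and d1 <= 0.0:
        return "monotone decreasing"  # includes the rays where one derivative is zero
    return "monotone increasing"

shape_a = classify(1.0, 1.0)    # case (a)
shape_c = classify(1.0, -1.0)   # case (c): -2 g_{i-1} < g_i < -g_{i-1}/2
shape_b = classify(1.0, -0.5)   # ray (b): minimum attained at x = 1
```

For ray (b) the function is decreasing on the whole interval, so the "minimum at x = 1" of the text and the "monotone decreasing" label returned here describe the same shape.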
Interpolation Methods for Curve Construction 111
\[ A = -\frac{g_{i-1}\, g_i}{g_{i-1} + g_i} \tag{56} \]
This is very promising. Whenever the data forces a maximum or minimum in the
interval, the maximum deviation from the average value is $|g_{i-1} g_i/(g_{i-1} + g_i)|$, which is
smaller than the smallest deviation of the endpoint.
For the applications of interest we may assume that all the average values $f_i^d$ are
positive, for otherwise all interpolants f(t) would be zero somewhere. By (38) all the
$f_i$ are positive, except possibly $f_0$ and $f_n$.
Thus f(t) can only be negative if it has a negative local minimum within the
interval, which occurs in the quadrant $g_i > 0$, $g_{i-1} > 0$. Since $g_{\min} = -g_{i-1} g_i/(g_{i-1} + g_i)$, it
suffices to require:
\[ 0 < f_{i-1} < 3 f_i^d \quad \text{and} \quad 0 < f_i < 3 f_i^d \tag{58} \]
If the application sets $f_0$, then we cannot apply the first shift.
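\[ f_i^d + \tfrac{1}{2}\theta_i^- < f_i < f_i^d + 2\theta_i^- \quad \text{if } f_{i-1}^d < f_i^d < f_{i+1}^d \]
\[ f_i^d - \tfrac{1}{2}\lambda\theta_i^- < f_i < f_i^d \quad \text{if } f_{i-1}^d < f_i^d,\ f_i^d \ge f_{i+1}^d \]

\[
\begin{aligned}
f_{i,1}^{\min} &= \min\!\left(f_i^d + \tfrac{1}{2}\theta_i^-,\ f_{i+1}^d\right), & f_{i,1}^{\max} &= \min\!\left(f_i^d + 2\theta_i^-,\ f_{i+1}^d\right) && \text{if } f_{i-1}^d < f_i^d \le f_{i+1}^d \\
f_{i,1}^{\min} &= \max\!\left(f_i^d - \tfrac{1}{2}\lambda\theta_i^-,\ f_{i+1}^d\right), & f_{i,1}^{\max} &= f_i^d && \text{if } f_{i-1}^d < f_i^d,\ f_i^d > f_{i+1}^d \\
f_{i,1}^{\min} &= f_i^d, & f_{i,1}^{\max} &= \min\!\left(f_i^d - \tfrac{1}{2}\lambda\theta_i^-,\ f_{i+1}^d\right) && \text{if } f_{i-1}^d \ge f_i^d,\ f_i^d \le f_{i+1}^d \\
f_{i,1}^{\min} &= \max\!\left(f_i^d + 2\theta_i^-,\ f_{i+1}^d\right), & f_{i,1}^{\max} &= \max\!\left(f_i^d + \tfrac{1}{2}\theta_i^-,\ f_{i+1}^d\right) && \text{if } f_{i-1}^d \ge f_i^d > f_{i+1}^d
\end{aligned}
\]
where
\[ \theta_i^- = \frac{t_i - t_{i-1}}{t_i - t_{i-2}}\,\left(f_i^d - f_{i-1}^d\right) \tag{68} \]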
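Similar considerations about the smoothness of f(t) in the interval i+1 lead to the
target ranges
\[ f_{i,2}^{\min} \le f_i \le f_{i,2}^{\max} \tag{69} \]
\[
\begin{aligned}
f_{i,2}^{\min} &= \max\!\left(f_{i+1}^d - 2\theta_i^+,\ f_i^d\right), & f_{i,2}^{\max} &= \max\!\left(f_{i+1}^d - \tfrac{1}{2}\theta_i^+,\ f_i^d\right) && \text{if } f_i^d < f_{i+1}^d \le f_{i+2}^d \\
f_{i,2}^{\min} &= \max\!\left(f_{i+1}^d + \tfrac{1}{2}\lambda\theta_i^+,\ f_i^d\right), & f_{i,2}^{\max} &= f_{i+1}^d && \text{if } f_i^d < f_{i+1}^d,\ f_{i+1}^d > f_{i+2}^d \\
f_{i,2}^{\min} &= f_{i+1}^d, & f_{i,2}^{\max} &= \min\!\left(f_{i+1}^d + \tfrac{1}{2}\lambda\theta_i^+,\ f_i^d\right) && \text{if } f_i^d \ge f_{i+1}^d,\ f_{i+1}^d \le f_{i+2}^d \\
f_{i,2}^{\min} &= \min\!\left(f_{i+1}^d - \tfrac{1}{2}\theta_i^+,\ f_i^d\right), & f_{i,2}^{\max} &= \min\!\left(f_{i+1}^d - 2\theta_i^+,\ f_i^d\right) && \text{if } f_i^d \ge f_{i+1}^d > f_{i+2}^d
\end{aligned}
\]
where
\[ \theta_i^+ = \frac{t_{i+1} - t_i}{t_{i+2} - t_i}\,\left(f_{i+2}^d - f_{i+1}^d\right) \tag{71} \]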
To ameliorate the max’s, min’s, and general ugliness of the interpolant, we use the
following procedure:
(a) Add an additional interval at the beginning and the end:
\[ t_{-1} = t_0 - (t_1 - t_0), \qquad f_0^d = f_1^d - \frac{t_1 - t_0}{t_2 - t_0}\,\left(f_2^d - f_1^d\right) \tag{72} \]
\[ t_{n+1} = t_n + (t_n - t_{n-1}), \qquad f_{n+1}^d = f_n^d + \frac{t_n - t_{n-1}}{t_n - t_{n-2}}\,\left(f_n^d - f_{n-1}^d\right) \tag{73} \]
(b) Select the $f_i$'s by linearly interpolating on the midpoints of the intervals:
\[ f_i = \frac{t_i - t_{i-1}}{t_{i+1} - t_{i-1}}\, f_{i+1}^d + \frac{t_{i+1} - t_i}{t_{i+1} - t_{i-1}}\, f_i^d, \qquad \text{for } i = 0, 1, \ldots, n \tag{74} \]
Note that with the false intervals, this formula works for i = 0 and i = n.
(c) For each $i = 1, 2, \ldots, n-1$,
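Steps (a) and (b) above can be sketched as follows; step (c) is not covered. The function name and the list layout (times $t_0, \ldots, t_n$ and discrete forwards $f_1^d, \ldots, f_n^d$, with at least two intervals) are assumptions made for this illustration.

```python
def node_forwards(times, fd):
    """Steps (a) and (b): extend by one false interval at each end, then apply (74).

    times = [t_0, ..., t_n], fd = [f^d_1, ..., f^d_n] with n >= 2.
    Returns [f_0, ..., f_n].
    """
    n = len(fd)
    t = times
    # (72) and (73): extrapolated discrete forwards on the false intervals
    fd_first = fd[0] - (t[1] - t[0]) / (t[2] - t[0]) * (fd[1] - fd[0])
    fd_last = fd[-1] + (t[n] - t[n - 1]) / (t[n] - t[n - 2]) * (fd[-1] - fd[-2])
    # T[j] holds t_{j-1}; FD[j] holds f^d_j, for j = 0, ..., n+1
    T = [t[0] - (t[1] - t[0])] + list(t) + [t[n] + (t[n] - t[n - 1])]
    FD = [fd_first] + list(fd) + [fd_last]
    f = []
    for i in range(n + 1):
        t_prev, t_i, t_next = T[i], T[i + 1], T[i + 2]
        # (74): linear interpolation between the midpoints of adjacent intervals
        f.append((t_i - t_prev) / (t_next - t_prev) * FD[i + 1]
                 + (t_next - t_i) / (t_next - t_prev) * FD[i])
    return f

# Hypothetical discrete forwards on three one-year intervals
f_nodes = node_forwards([0.0, 1.0, 2.0, 3.0], [0.02, 0.03, 0.04])
```

With equal interval lengths, each interior $f_i$ is simply the average of the two neighbouring discrete forwards.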
15-May-03 0.88456260
7-Aug-03 0.88456260
6-Nov-03 0.92185792
6-May-04 0.94334882
12-May-05 0.86603225
11-May-06 0.80999250
10-May-07 0.81417719
8-May-08 0.83105534
7-May-09 0.80773549
6-May-10 0.79198677
12-May-11 0.82822284
10-May-12 0.82815909
9-May-13 0.82757452
7-May-15 0.83522826
10-May-18 0.83603242
11-May-23 0.84646560
12-May-33 0.86299257
The curve in Figure 7 results. After applying the amelioration step with λ = 0.20,
the curve in Figure 8 results.
This curve is clearly a smoother, better looking forward curve. However,
by using amelioration the values of f(t) in interval i depend on the average values
$f_k^d$ in intervals i−2, i−1, i, i+1, and i+2. The raw monotone convex spline without
the amelioration step depends only on the average values in the neighbouring
intervals i−1, i, and i+1. Whether the better smoothness properties make up for
the poorer locality properties is an application-by-application judgement, which
may depend on trader preferences. An interpolation application should offer both
Figure 8. Monotone convex ameliorated spline interpolation applied to BMA basis factor
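form for r:
\[
\begin{aligned}
r(t)\,t &= \int_0^t f(s)\, ds \\
&= \int_0^{t_{i-1}} f(s)\, ds + \int_{t_{i-1}}^t f(s)\, ds \\
&= r_{i-1} t_{i-1} + \int_{t_{i-1}}^t f(s)\, ds \\
&= r_{i-1} t_{i-1} + (t - t_{i-1}) f_i^d + \int_{t_{i-1}}^t g(s)\, ds
\end{aligned}
\]
where i is found so that $t_{i-1} \le t < t_i$ and $g(s) = f(s) - f_i^d$, as defined. We now apply the
footnote equation (37). For example, if we are in region (i) or (v), then via (47) we have
\[
\begin{aligned}
\int_{t_{i-1}}^t g(s)\, ds &= \int_{t_{i-1}}^t \left[ g_{i-1}\left(1 - 4x(s) + 3x(s)^2\right) + g_i\left(-2x(s) + 3x(s)^2\right) \right] ds \\
&= (t_i - t_{i-1}) \left[ g_{i-1}\left(x - 2x^2 + x^3\right) + g_i\left(-x^2 + x^3\right) \right] \\
&=: I_t
\end{aligned}
\]
where $x = \frac{t - t_{i-1}}{t_i - t_{i-1}}$. The other cases are equally elementary. Hence
\[ r(t) = \frac{1}{t} \left[ t_{i-1} r_{i-1} + f_i^d (t - t_{i-1}) + I_t \right] \tag{81} \]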
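Formula (81) can be sketched for the basic quadratic case only, that is, regions (i) and (v), with no monotonicity or positivity adjustments. The function name, the list layout and the sample numbers are hypothetical; the node forwards $f_0, \ldots, f_n$ are taken as given inputs.

```python
def rate_from_forwards(t, times, rates, f_nodes):
    """Evaluate r(t) by (81) for 0 < t <= t_n, using the basic quadratic for g.

    times = [t_0 = 0, ..., t_n], rates = [r_0, ..., r_n], f_nodes = [f_0, ..., f_n].
    """
    i = next(k for k in range(1, len(times)) if t <= times[k])
    t0, t1 = times[i - 1], times[i]
    fd = (rates[i] * t1 - rates[i - 1] * t0) / (t1 - t0)  # discrete forward f_i^d
    g0 = f_nodes[i - 1] - fd  # g_{i-1} = f_{i-1} - f_i^d
    g1 = f_nodes[i] - fd      # g_i     = f_i     - f_i^d
    x = (t - t0) / (t1 - t0)
    # I_t: integral of g from t_{i-1} to t
    I_t = (t1 - t0) * (g0 * (x - 2 * x**2 + x**3) + g1 * (-x**2 + x**3))
    return (t0 * rates[i - 1] + fd * (t - t0) + I_t) / t

# Hypothetical inputs; rates[0] is unused because t_0 = 0
times = [0.0, 1.0, 2.0, 4.0]
rates = [0.0, 0.030, 0.035, 0.040]
f_nodes = [0.028, 0.035, 0.042, 0.046]
r_at_node = rate_from_forwards(2.0, times, rates, f_nodes)
```

Since $I_t$ vanishes at x = 1, the interpolated curve passes through the input rates at every node whatever node forwards are supplied.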
Figure 9. The forward raw and monotone convex splines applied to the curve seen earlier in
Figure 2
Figure 10. The forward raw and monotone convex splines applied to the forward curve seen
Figure 3
\[ f''(t) = 2c_i \tag{87} \]
Let us now define the jumps in the first derivatives, and the weighted values in the
second derivative, for attempted minimization:
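The penalty function, which we hope to minimize, is defined, for some prescribed
$w \in (0,1)$, as
\[ P_w = w \sum_{i=1}^{n-1} J_{1,i}^2 + (1-w) \sum_{i=1}^{n} J_{2,i}^2 \tag{90} \]
\[ 0 = x_{3i-2} + x_{3i-1} h_i + x_{3i} h_i^2 - x_{3i+1} =: g_{n+i}(x), \qquad i = 1, 2, \ldots, n-1 \tag{92} \]
\[ P_w = w \sum_{i=1}^{n-1} \left(x_{3i+2} - x_{3i-1} - 2x_{3i} h_i\right)^2 + 4(1-w) \sum_{i=1}^{n} x_{3i}^2 h_i^2 \]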
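Now introducing Lagrange multipliers $\lambda_1, \lambda_2, \ldots, \lambda_{2n-1}$ we have the solution
\[ \nabla P_w = \sum_{k=1}^{2n-1} \lambda_k \nabla g_k \]
in other words
\[ \frac{\partial P_w}{\partial x_i} = \sum_{k=1}^{2n-1} \lambda_k \frac{\partial g_k}{\partial x_i}, \qquad i = 1, 2, \ldots, 3n \tag{93} \]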
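Let $z = (x_1, x_2, \ldots, x_{3n}, \lambda_1, \lambda_2, \ldots, \lambda_{2n-1})$ be the unknown and required vector. We
have a system of 5n−1 equations ((91), (92) and (93)) in the same number of
unknowns $z_1, z_2, \ldots, z_{5n-1}$:
\[
0 = [F(z)]_i = \begin{cases}
-z_{3n+1} - z_{4n+1} & \text{if } i = 1 \\
-z_{3n+(i+2)/3} + z_{4n-1+(i+2)/3} - z_{4n+(i+2)/3} & \text{if } i = 4, 7, \ldots, 3n-5 \\
-z_{4n} + z_{5n-1} & \text{if } i = 3n-2 \\
-2w\left(z_5 - z_2 - 2z_3 h_1\right) - \tfrac{1}{2} z_{3n+1} h_1 - z_{4n+1} h_1 & \text{if } i = 2 \\
2w\left(-z_{i-3} - 2z_{i-2} h_{(i-2)/3} + 2z_i + 2z_{i+1} h_{(i+1)/3} - z_{i+3}\right) - \tfrac{1}{2} z_{3n+(i+1)/3} h_{(i+1)/3} - z_{4n+(i+1)/3} h_{(i+1)/3} & \text{if } i = 5, 8, \ldots, 3n-4 \\
2w\left(z_{3n-1} - z_{3n-4} - 2z_{3n-3} h_{n-1}\right) - \tfrac{1}{2} z_{4n} h_n & \text{if } i = 3n-1 \\
-4w\, h_{i/3} \left(z_{i+2} - z_{i-1} - 2z_i h_{i/3}\right) + 8(1-w) z_i h_{i/3}^2 - \tfrac{1}{3} z_{3n+i/3} h_{i/3}^2 - z_{4n+i/3} h_{i/3}^2 & \text{if } i = 3, 6, \ldots, 3n-3 \\
8(1-w) z_{3n} h_n^2 - \tfrac{1}{3} z_{4n} h_n^2 & \text{if } i = 3n \\
z_{3(i-3n)-2} + \tfrac{1}{2} z_{3(i-3n)-1} h_{i-3n} + \tfrac{1}{3} z_{3(i-3n)} h_{i-3n}^2 - f_{i-3n}^d & \text{if } 3n+1 \le i \le 4n \\
z_{3(i-4n)-2} + z_{3(i-4n)-1} h_{i-4n} + z_{3(i-4n)} h_{i-4n}^2 - z_{3(i-4n)+1} & \text{if } 4n+1 \le i \le 5n-1
\end{cases}
\]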
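Because the penalty is quadratic and the constraints are linear, the system F(z) = 0 is linear in z and can be solved directly. The sketch below is one possible way to do so, not the authors' implementation. It assumes that constraint (91), which is not reproduced above, is the average condition $x_{3i-2} + \tfrac{1}{2} x_{3i-1} h_i + \tfrac{1}{3} x_{3i} h_i^2 = f_i^d$ implied by the rows $3n+1 \le i \le 4n$, and it assembles the matrix from the Hessian of $P_w$ and the constraint gradients rather than case by case. The function names and the dense pure-Python solver are illustrative choices.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting (dense, pure Python)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def min_penalty_quadratics(h, fd, w=0.5):
    """Solve (91)-(93); returns [(a_i, b_i, c_i)] for intervals i = 1..n (n >= 2)."""
    n = len(fd)
    N, K = 3 * n, 2 * n - 1
    A = [[0.0] * (N + K) for _ in range(N + K)]
    rhs = [0.0] * (N + K)
    for i in range(1, n):  # Hessian of w * J_{1,i}^2
        v = {3 * i + 1: 1.0, 3 * i - 2: -1.0, 3 * i - 1: -2.0 * h[i - 1]}
        for p, vp in v.items():
            for q, vq in v.items():
                A[p][q] += 2.0 * w * vp * vq
    for i in range(1, n + 1):  # Hessian of 4 (1 - w) x_{3i}^2 h_i^2
        A[3 * i - 1][3 * i - 1] += 8.0 * (1.0 - w) * h[i - 1] ** 2
    for k in range(1, K + 1):
        if k <= n:  # assumed form of (91): average of f on interval k equals f^d_k
            i = k
            grad = {3 * i - 3: 1.0, 3 * i - 2: h[i - 1] / 2.0, 3 * i - 1: h[i - 1] ** 2 / 3.0}
            rhs[N + k - 1] = fd[i - 1]
        else:       # (92): continuity of f at the end of interval i
            i = k - n
            grad = {3 * i - 3: 1.0, 3 * i - 2: h[i - 1], 3 * i - 1: h[i - 1] ** 2, 3 * i: -1.0}
        for p, gp in grad.items():
            A[N + k - 1][p] = gp    # constraint row g_k(x) = 0
            A[p][N + k - 1] = -gp   # -lambda_k * dg_k/dx_p in the stationarity rows (93)
    z = solve(A, rhs)
    return [tuple(z[3 * i:3 * i + 3]) for i in range(n)]

# Hypothetical interval lengths and discrete forwards
h = [1.0, 2.0, 1.5]
fd = [0.02, 0.03, 0.025]
coeffs = min_penalty_quadratics(h, fd, w=0.5)
```

A production implementation would exploit the banded structure of the system instead of a dense solve; the dense version is used here only to keep the sketch short.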
method is very popular. However, this method does not guarantee positive forward
rates, nor is it local. In fact, empirically its behaviour is often similar to the
behaviour we have seen for some of the cubic or quartic spline algorithms.
Nevertheless, this approach deserves attention, because it correctly interprets the
information provided as information about intervals, not as information about the
endpoint of the interval. Additional conditions to guarantee that each forward is
positive on the domain of interest would need to be introduced, and penalty
minimization under these conditions appears to be non-trivial.
this criterion is concerned. As expected, the simple methods described earlier are the
most local. Finally, using amelioration in the monotone convex method
compromises locality only very slightly.
Forward Stability
We want to measure how stable an interpolation method is – given a change in one
of the inputs, how much can the output interpolated curve be changed? We measure
this noise feature on the yield curve rates or on the forwards – both as inputs and as
outputs. Appropriate norms for any interpolation method M, given a yield curve or a
given forward curve, would be
\[ \|M(r)\| = \sup_t \max_i \left| \frac{\partial r(t)}{\partial r_i} \right| \]
\[ \|M(f)\| = \sup_t \max_i \left| \frac{\partial f(t)}{\partial f_i^d} \right| \]
where the input rates are $r_1, r_2, \ldots, r_n$ or $f_1^d, f_2^d, \ldots, f_n^d$ respectively, and the
bootstrapped curves are r and f respectively.
Except in some simple cases, these norms cannot be determined analytically, and
would need to be estimated empirically. In those cases for any given curve this was
achieved by measuring the maximum difference, in the supremum norm, between the
original bootstrapped curve, and any of the 2n curves which arise when any of the n
nodes are blipped up or down by one basis point. The difference was estimated by
testing at discrete points along the entire curve, in steps of 0.01 of a year. These two
measures (the theoretical derivative measure and the discrete measure) will be
equivalent up to the scaling constant 10000 (since mathematically one basis point is
$10000^{-1}$), so in actual fact we will take the value found empirically and multiply it by 10000.
The second issue is to decide over what set of curves to calculate this norm. For
this we simply considered a fairly large set of plausible curves that we have seen
trading in various markets.
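The empirical procedure described above (blip each node up and down by one basis point, compare curves on a grid of 0.01 years, rescale by 10000) can be sketched as follows. Linear interpolation on rates is used only as a stand-in for whichever method is being tested, and the function names and sample curve are hypothetical.

```python
def linear_rate(t, nodes, rates):
    """Linear interpolation on rates, flat outside the nodes (stand-in method)."""
    if t <= nodes[0]:
        return rates[0]
    if t >= nodes[-1]:
        return rates[-1]
    for k in range(1, len(nodes)):
        if t <= nodes[k]:
            w = (t - nodes[k - 1]) / (nodes[k] - nodes[k - 1])
            return (1.0 - w) * rates[k - 1] + w * rates[k]

def stability_norm(interp, nodes, rates, step=0.01, bump=1e-4):
    """Largest curve move, in basis points per basis point of input change."""
    n_steps = int(round((nodes[-1] - nodes[0]) / step))
    grid = [nodes[0] + k * step for k in range(n_steps + 1)]
    base = [interp(t, nodes, rates) for t in grid]
    worst = 0.0
    for i in range(len(rates)):
        for sign in (1.0, -1.0):  # the 2n blipped curves
            bumped = list(rates)
            bumped[i] += sign * bump
            diff = max(abs(interp(t, nodes, bumped) - b) for t, b in zip(grid, base))
            worst = max(worst, diff)
    return worst / bump  # dividing by 1 bp is the multiplication by 10000

nodes = [1.0, 2.0, 5.0, 10.0]
rates = [0.030, 0.035, 0.040, 0.042]
norm_linear = stability_norm(linear_rate, nodes, rates)
```

The same function can be pointed at a forward-curve interpolator by passing discrete forwards as the inputs and a function returning f(t) as `interp`.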
Clearly all of the earlier methods then have a norm of 1. The later arguments show
that the monotone convex method initially has a norm of 1.5, although this will be
compromised by the modifications which enforce monotonicity and convexity, and
then by the amelioration of the interpolant. We found that, with an extensive set of
plausible test curves, the norm for this method, unameliorated and with amelioration
at λ = 0.2, was never more than about 2.
All other methods, with the exception of the two quartic methods, had norms
varying from about 1.2 to about 1.6. Perhaps the fact that the monotone convex
method has a slightly worse norm is to be expected: more constraints are being
imposed, so the norm should be higher. The two quartic methods had norms in
excess of 100000 for our test set, and we have no reason to expect that, in the class of
arbitrage free curves, these methods have finite norm. The instability observed was
gross, and seemed to be the rule more than the exception.
Localness of Hedges
What we assume is that exactly the instruments which were used to bootstrap the
yield curve are those that are available for creating hedge portfolios. Given a
portfolio with certain risky cashflows, we wish to construct the portfolio of these
underlying instruments that will hedge those cash flows.
We assume that the set of risky cash flows are all known, for example, swaps have
been decomposed into a fixed coupon bond and a floating rate note, the latter of
which has been discarded as riskless, and any flows with optionality have had some
sort of Greek substitution as flows without optionality. Finally, we assume that we
are in the world where our bootstrap prices all of the inputs exactly.
In order to proceed with this analysis, we first review the two classic approaches to
hedging: bumping and waves.
In bumping, we form new curves indexed by j: to create the jth curve one bumps
up the jth input rate by say 1 basis point, bootstraps the curve again, and reprices the
risky portfolio with the new curve; the difference between the new and old price
forms a vector ΔV, indexed by the bootstrapping instruments.
One now calculates the matrix P where $P_{ij}$ is the change in price of the jth
bootstrapping instrument under the ith curve. As each of the bootstraps prices the
input set exactly to par, this matrix is diagonal: it is only the bumped instrument
that reprices, now that its fair rate has changed. The hedge vector Q is chosen to
satisfy PQ = ΔV.
The portfolio Q hedges the given risky portfolio exactly in the case
where the valuation curve moves because exactly one of the inputs is blipped by
1 bp. One then hopes that the portfolio provides an adequate hedge against more
general moves. Typically 1 bp is small enough to be in the linear regime, and so one
is adequately protected against all small changes in the inputs.
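A deliberately simplified sketch of the bumping hedge follows. To keep the bootstrap trivial, it assumes the input instruments are unit zero-coupon bonds maturing at the nodes (rather than the deposits, FRAs and swaps of the text), that the curve is linear on rates between nodes, and that the risky portfolio is a single cash flow at 4.5 years. All names and numbers are hypothetical.

```python
import math

def linear_rate(t, nodes, rates):
    """Linear interpolation on rates, flat outside the nodes (illustrative choice)."""
    if t <= nodes[0]:
        return rates[0]
    if t >= nodes[-1]:
        return rates[-1]
    for k in range(1, len(nodes)):
        if t <= nodes[k]:
            w = (t - nodes[k - 1]) / (nodes[k] - nodes[k - 1])
            return (1.0 - w) * rates[k - 1] + w * rates[k]

def pv(cashflows, nodes, rates):
    """Present value of (time, amount) pairs off the interpolated curve."""
    return sum(c * math.exp(-linear_rate(t, nodes, rates) * t) for t, c in cashflows)

def bump_hedge(portfolio, nodes, rates, bump=1e-4):
    """Solve P Q = dV when P is diagonal: Q_j = dV_j / P_jj."""
    base_value = pv(portfolio, nodes, rates)
    base_prices = [pv([(t, 1.0)], nodes, rates) for t in nodes]
    Q = []
    for j, tj in enumerate(nodes):
        bumped = list(rates)
        bumped[j] += bump  # the jth curve: input j up by 1 bp
        dV = pv(portfolio, nodes, bumped) - base_value
        P_jj = pv([(tj, 1.0)], nodes, bumped) - base_prices[j]
        Q.append(dV / P_jj)
    return Q

nodes = [1.0, 2.0, 3.0, 4.0, 5.0, 7.0, 10.0]
rates = [0.030, 0.032, 0.034, 0.035, 0.036, 0.037, 0.038]
Q = bump_hedge([(4.5, 1.0)], nodes, rates)
```

With this local interpolation method all of the hedge falls on the 4 and 5 year instruments; a less local method would spread nonzero entries of Q into other tenors, which is the effect the localness criterion is designed to detect.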
In waves: one approach is to again form new curves, this time indexed by i: the ith
curve has the original yield curve incremented by a triangle, with left hand endpoint
at $t_{i-1}$, height of say 1 or 10 basis points and apex at $t_i$, and right hand endpoint at
$t_{i+1}$. (The first and last triangles will in fact be right angled, with their apex at the first
and last time points respectively.) The matrix $P$ is defined as before. This time we see
that $P$ is not diagonal, but is typically upper triangular – if there are no overlapping
inputs then the ith curve is the same as the mark to market curve before time $t_{i-1}$, so
if $i > j$, then the jth instrument does not change in value under this curve. (An example
where we have no overlapping inputs would be where we have LIBOR instruments
to 3 months, then 3×6, 6×9, 9×12, 12×15, 15×18 FRAs, and then annual swaps.
If we had in addition a 2×5 FRA say then the inputs would be overlapping. In this
case, $P$ will have a subdiagonal band of width 1; if we then added a 1×4 FRA the
band width would increase to 2.) Also $Q$ is defined as before. No problems arise with
this setup: the matrix is invertible and has a low condition number. In the case where the
matrix is upper triangular, the eigenvalues are exactly the diagonal elements, and by
inspection we found that the condition number for any interpolation method was
typically as low as about 50. In the case where there were subdiagonal elements we
used The MathWorks (2004) to test the condition number and again found it to
be quite innocuous, never more than 100 say.
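The wave construction can be sketched in the same way. The following Python is a minimal illustration, not the implementation used for the results reported here. It assumes non-overlapping inputs, represented as toy annual coupon bonds maturing at the node times, and a made-up smooth base curve; the triangle is taken to be zero outside its support. The names triangle, pv and wave_hedge are hypothetical. With these inputs $P$ is upper triangular, so the hedge is obtained by back substitution.

```python
import math

def triangle(times, i, height, t):
    """Triangular increment with apex `height` at times[i], falling linearly
    to zero at the neighbouring nodes. The first and last triangles are
    one-sided (right angled). Assumption: zero outside the support."""
    if t == times[i]:
        return height
    if i > 0 and times[i - 1] < t < times[i]:
        return height * (t - times[i - 1]) / (times[i] - times[i - 1])
    if i < len(times) - 1 and times[i] < t < times[i + 1]:
        return height * (times[i + 1] - t) / (times[i + 1] - times[i])
    return 0.0

def pv(flows, rate):
    """Present value of (time, amount) flows given a zero-rate function."""
    return sum(a * math.exp(-rate(t) * t) for t, a in flows)

def wave_hedge(times, base_rate, instruments, portfolio, height=1e-4):
    """Return (P, dV, Q) with P Q = dV, where curve i is base + triangle i."""
    n = len(times)
    base_port = pv(portfolio, base_rate)
    base_inst = [pv(inst, base_rate) for inst in instruments]
    P, dV = [], []
    for i in range(n):
        def waved(t, i=i):
            return base_rate(t) + triangle(times, i, height, t)
        dV.append(pv(portfolio, waved) - base_port)
        P.append([pv(instruments[j], waved) - base_inst[j] for j in range(n)])
    # Back substitution, valid because P is upper triangular here.
    Q = [0.0] * n
    for i in reversed(range(n)):
        s = sum(P[i][j] * Q[j] for j in range(i + 1, n))
        Q[i] = (dV[i] - s) / P[i][i]
    return P, dV, Q

# Toy example: annual nodes, 4% annual coupon bonds as inputs (assumption),
# and a single flow of 100 at 4.5 years to be hedged.
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
base_rate = lambda t: 0.03 + 0.002 * t
instruments = [[(float(k), 0.04) for k in range(1, m)] + [(float(m), 1.04)]
               for m in range(1, 7)]
portfolio = [(4.5, 100.0)]
P, dV, Q = wave_hedge(times, base_rate, instruments, portfolio)
for t, q in zip(times, Q):
    print(f"{t:.0f}y instrument: {q:8.3f}")
```

In this sketch the entries of $P$ with $i > j$ vanish because the jth bond has no cash flow strictly inside the support of the ith triangle, which is the argument given above for upper triangularity.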
We will consider comparative tests of the algorithms developed here in some
simple cases. We will suppose our risky portfolio is in fact a single fairly simple
instrument. We consider a 30 year swap curve constructed with some LIBOR and
FRA rates until 18 months, and then annual swap rates from 2 years onwards until
15 years, and then rates every 5 years. Our instrument to be hedged will be a 4½ year
swap. First, the hedge portfolio is constructed by bumping. Intuitively, one would
expect the hedge to consist more or less of a position of half the nominal in a 4 year
swap, and half in a 5 year swap. Typical results are in Figure 14. For clarity, we have
truncated the graph to reflect only the region of interest; hedge leakage outside of
the region shown was immaterial for all the methods displayed here. (Once again,
the quartic methods showed grossly unreasonable behaviour, and have been
eliminated from the analysis. Also, the performance of the minimal method is quite
Here we see a number of interesting features: all of the methods which involved
clamping of any sort (cubic, financial, quadratic-normal, and as mentioned the
quartic methods) have fairly significant leakage of hedge into buckets which are well
away from ‘the area of interest’. The local splining methods, and the Bessel methods,
fare much better. Under raw interpolation the hedge portfolio consists entirely of 4
and 5 year swaps; the unameliorated monotone convex method requires a position in
the 3 year swap, but nothing elsewhere. The ameliorated method shows significant
hedge leakage.
Figure 14. Hedge instruments required for hedging a 4.5 year swap under bumping
Figure 15. Hedge instruments required for hedging a 4.5 year swap under waves
Let us now construct the hedge portfolio with waves. Typical results are in
Figure 15. Two points arise: in the first place, the hedging portfolios that arise under
this approach are very intuitive and very stable. Secondly, no obvious criteria emerge
to enable us to claim the superiority of one method over another.
Several questions should present themselves. For example, what happens to
the hedge as we move into the swap, so the payment dates no longer coincide
with the payment dates of instruments available in the market? Furthermore,
what happens to hedge algorithms when we have overlapping input sets (say as
the 1×4 or 2×5 FRAs mentioned earlier)? We intend to deal further with the
issues of suitability and robustness of hedge algorithms in detail in forthcoming
We have reviewed many interpolation methods available and have introduced a
couple of new methods. In the final analysis, the choice of which method to use will
always be subjective, and needs to be decided on a case by case basis. But we hope to
have provided some warning flags about many of the methods, and have outlined
several qualitative and quantitative criteria for selecting which method
to use.
We wish to thank an anonymous referee for several useful suggestions.
By this we mean the intervals between the fixed payments of the swap, such as three or six months.
For example, it is not reasonable to expect a bootstrapper to differentiate bonds according to tax status,
rather, some value adjustment should be made a priori to the prices one set of bonds or the other, so that
what is input can be considered to have the same tax status.
To paraphrase the nomenclature of de Boor (1978, 2001): a cubic spline where the first derivative is
known (in addition to the function values) is called a Hermite spline. The Bessel method is an
intermediate method, where the derivatives are estimated from the function values, and then the Hermite
method is applied.
A matrix in a-1-b bandwidth form means that the entry in the ith row and jth column in this
representation actually lies in the ith row and $(i + j - a - 1)$th column in the canonical matrix representation.
Note that the entries in the top left and bottom right of the bandwidth matrix are redundant; they can be
set to anything. This redundancy is denoted by a ×.
Why not $f(r) = a_i + b_i(r - r_i) + c_i(r - r_i)^2 + d_i(r - r_i)^3 + e_i(r - r_i)^4$? We would prefer this format on principle.
However, the bandwidth matrices which result are very unwieldy.
This observation is consistent with Adams (2001) where, after interpolating the forward curves, one
additional piece of information is needed to recover the interpolatory function on the yields i.e. the
method of Adams (2001) has one remaining degree of freedom.
If $g_{i-1} = 0 = g_i$, then the curve is flat, and $g = 0$.
It is advisable, to ensure no problems with code execution (‘division by zero’), to trap the cases $x(r) = 0$ and
$x(r) = 1$ up front: in these cases, $g(r) = g_{i-1}$ and $g(r) = g_i$ respectively.
The bound function is given by $\mathrm{bound}(a, x, b) = \min(\max(a, x), b)$.
The same trapping as before has $I_{t_{i-1}} = 0 = I_{t_i}$.
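Two of the notes above translate directly into small helpers, sketched here in Python. The function names and the list-of-lists storage are assumptions made for illustration. The band expansion takes the column index of the canonical representation to be i + j − a − 1 (with 1-based i and j) and ignores the redundant corner entries.

```python
def bound(a, x, b):
    """Clamp x to [a, b]: bound(a, x, b) = min(max(a, x), b)."""
    return min(max(a, x), b)

def band_to_full(band, a, b):
    """Expand an n x (a + 1 + b) bandwidth representation into an n x n
    matrix. Entry (i, j) of the band form (1-based) is placed at row i,
    column i + j - a - 1 of the full matrix."""
    n = len(band)
    full = [[0.0] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, a + b + 2):
            col = i + j - a - 1
            if 1 <= col <= n:          # redundant corner entries fall outside
                full[i - 1][col - 1] = band[i - 1][j - 1]
    return full

# A 1-1-1 (tridiagonal) example; None marks the redundant entries.
band = [[None, 2.0, 1.0],
        [1.0, 2.0, 1.0],
        [1.0, 2.0, None]]
full = band_to_full(band, 1, 1)
```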
