Carlos Eduardo Dissertation

Instituto Nacional de Matemática Pura e Aplicada
A new pairs trading strategy based on linear state space models and the Kalman filter
Carlos Eduardo Silva de Moura
Advisor:
Adrian Pizzinga
Rio de Janeiro - January, 2014
To my family, girlfriend and friends, with love...
Acknowledgements
Foremost, I would like to express my sincere gratitude to my advisor Prof. Adrian Pizzinga and
to Jorge Zubelli for their continuous support of my master's study and research, and for their
patience, motivation, enthusiasm, and immense knowledge. Their guidance helped me throughout
the research and the writing of this master's thesis. I could not have imagined having a better
advisor and mentor for my master's study.
Besides my advisors, I would like to thank the rest of my teachers at IMPA: Prof. Alcides
Neto, Prof. Paulo Cezar Carvalho, Prof. Luca Mertens, Prof. Rodrigo Novinski, Prof. Beatriz
Mendes, Prof. Max de Souza, Prof. Claudia Sagastizábal, Prof. Sergei Vieira, Prof. Hugo de
La Cruz and Prof. Gustavo Silva Araujo for their encouragement, insightful comments, and
hard questions.
My sincere thanks also go to Giuliano Lorenzoni, Guilherme Abry, Elsio Oliveira and all the
people who worked with me in this period.
I would like to thank all my friends for their support and friendship.
I would like to thank my girlfriend Jessica for her encouragement and support.
I would like to thank my siblings Marcos and Ana for their friendship and the happy moments.
Last but not least, I would like to thank my family: my parents Margarida and George, for
supporting me in the first place and for all their love throughout my life.
Abstract
Among many strategies for financial trading, pairs trading has been playing an important role
in practical and academic frameworks. Loosely speaking, it consists of a statistical arbitrage
tool for identifying and exploiting inefficiencies of two long-term related financial assets. When
a significant deviation from this equilibrium is observed, a profit might result. In this work,
we propose a pairs trading strategy entirely based on linear state space models designed for
modeling the spread formed with a pair of assets. Once an adequate state space model for
the spread is estimated, we calculate conditional probabilities that the spread will return to its
long-term mean. The strategy is activated upon large values of these conditional probabilities:
if the latter become large, the spread is bought or sold accordingly. Three applications with
real data from the US and Brazilian markets are offered and indicate that a very basic portfolio
consisting of a single spread already outperforms some of the main market benchmarks.
Key words: Kalman filter, mean-reverting conditional probabilities, pairs trading, spread,
state space models, statistical arbitrage.
Resumo
Among the many strategies in the financial market, one of the most popular in academic
studies is the strategy known as pairs trading. It consists of a statistical arbitrage strategy
that seeks to identify and exploit inefficiencies of two financial assets related in the long term.
When a deviation from this equilibrium between the prices is significant, a profit can be
obtained by applying such strategies. In this work, a pairs trading strategy is proposed that is
entirely based on state space models suitable for the time series of the spread formed between
two assets. Once the adequate state space model for the spread is estimated, the conditional
probabilities that the spread returns to its long-term mean are calculated. The strategy is
executed when high values of these probabilities are observed: the spread is bought or sold.
Three applications with real data from the Brazilian and US markets are offered and indicate
that a very basic portfolio, consisting of a single spread (a pair of assets), obtained better
results than some of the main market benchmarks.
Key words: statistical arbitrage, Kalman filter, state space model, pairs trading,
mean-reverting conditional probabilities, spread.
Contents
Introduction
1 Pairs Trading: a glimpse at the literature
2 Statistical Arbitrage Strategies
3 Proposed Models
3.1 What is a pair?
3.2 Unobserved component models: the stochastic spread approach
3.3 ARMA models: generalizing the stochastic spread approach
4 A new pairs trading strategy
4.1 Mean-reverting conditional probabilities $p_{up}$ and $p_{down}$: theory
4.2 Mean-reverting conditional probabilities $p_{up}$ and $p_{down}$: practical evaluation
4.3 The strategy
5 Applications
5.1 The data and some computational details
5.2 Results
6 Conclusion
A Appendix
A.1 Linear state space models and the Kalman filter
A.2 Proof of Proposition 1
A.3 Proof of Proposition 2
A.4 Matlab Code
References
Introduction
Introduction
Pairs trading is a type of statistical arbitrage strategy that was first implemented
in the mid-1980s by Nunzio Tartaglia and his group at Morgan Stanley (cf. [39]). Nowadays,
pairs trading is widely used by investment banks and hedge funds. In general terms, a pairs
trading strategy aims at identifying and exploiting, mostly by statistical methods, market
inefficiencies observed with two long-term related assets; the two assets are said to form a pair.
When a significant deviation of the prices between the two assets is detected, a trading position
is carried out: the higher priced asset is sold (this is the so-called short position by market
practitioners) and the lower priced asset is bought (that is: a long position is taken), with the
hope that the mispricing will correct to the long-term equilibrium value (cf. [11] and [39]).
In this work, we consider two linear state space models appropriate for modeling spreads
(stationary linear combinations of long-term related assets), with the intent of testing a new
quantitative strategy involving pairs trading. The first model is the unobserved component
model proposed by [11]. Such model, which has a Gaussian linear state space form, is a
discrete-time version of the linear mean-reverting Ornstein-Uhlenbeck model. The second model
is the traditional stationary autoregressive moving-average, or ARMA, model (cf. [5], [6], [19]
and [12]), whose particular specifications are also dealt with in this work under appropriate
linear state space forms. We shall prove that this second class of models, even though lacking
support from finance theory, encompasses the former proposal by [11] as a particular case.
Moving on, we develop a methodology for calculating conditional probabilities (given past and
current spread data) that the spread will return to its long-term mean within $k$ steps ahead
(the frequency can be daily or intra-daily), whenever it deviates from the long-term mean at a
given time instant. To this end, we propose an alternative augmented state space form for a
given model, previously selected and estimated with spread data, and with this enlarged state
space form we apply the Kalman filter $k$-steps-ahead prediction (see, for instance, [20] and [9])
to obtain conditional mean vectors and covariance matrices of the future spreads. The latter are
all that is needed for calculating the conditional probabilities previously mentioned. The
quantitative strategy we shall pursue here is activated according to the rule: if the spread is
found to be considerably below (above) its long-term mean and the conditional probability that
the spread will increase above (decrease below) its long-term mean within $k$ steps ahead is
reasonably large, buy (sell) the spread.
The dissertation is organized as follows. Chapter 1 briefly reviews the literature on pairs
trading, without claiming exhaustiveness. Chapter 2 discusses pairs trading from the statistical
arbitrage standpoint, enumerating some of its main practical features. Chapter 3 presents the
two aforementioned linear state space models, discusses their mathematical properties
and embeds each of them into the state space modeling/Kalman filter framework. Chapter 4
formally discusses how the conditional probabilities that the spread will mean-revert are
calculated, addresses the corresponding computational issues and describes step-by-step how the
quantitative strategy shall be implemented. Chapter 5 offers three applications to real data from
the US and Brazilian markets and compares the performances of the proposed strategy with the
main benchmarks and with a former pairs trading strategy already tackled by market
practitioners. Analysis regarding computational efforts for estimation and goodness-of-fit is
included. Chapter 6 offers a discussion about the main results obtained in the former chapter and
makes some comments regarding the use of the methodology in real scenarios. The appendices
review the main Kalman filter techniques used in the work and provide the proofs of the
technical results.
Chapter 1
Pairs Trading: a glimpse at the literature
This chapter briefly discusses earlier works on pairs trading strategies, focusing mainly
on spread modeling. A common feature of these models (which, to some extent, shall
also be pursued by the models of this paper) consists of recognizing the spread, associated
with a pair of stocks (cf. the naive definition of pair given in Chapter 3), as some kind of
mean-reverting stochastic process, whose parameters are estimated with financial market data.
The last paper reviewed in this chapter, in turn, is an empirical investigation that uses a
simple standard deviation strategy to show that pairs trading can be profitable after costs.
Elliot et al. [11] developed a Gaussian linear state space model for the mean reversion
behavior of the spread between paired stocks in a continuous time setting. It is assumed that
the observed spread $S_t$ is a noisy observation of some mean-reverting unobserved spread $x_t$.
The set-up for parameter estimation was based on a version of the expectation-maximization
(EM) algorithm previously developed in [10]. The pairs trading strategy proposed is this: if
$S_t$ is larger/smaller than the one-step-ahead estimate $\hat{x}_{t|t-1}$, then the spread is
regarded as too large/small, and so the trader could take a short/long position in the spread
portfolio. Therefore, a profit is expected whenever a price correction occurs.
Triantafyllopoulos & Montana [37] have extended the modeling framework proposed by
Elliot et al. in several ways. First, they introduced time-varying autoregressive (or
mean-reverting) parameters, something that potentially allows the model to adapt itself to
sudden changes in the data. Second, they developed and implemented a Bayesian approach for
estimating the parameters, providing an on-line estimation scheme. Lastly, they advocated a
procedure known as flexible least squares (FLS) to estimate the co-integration coefficient
recursively, unveiling a possible time-varying co-integration relationship between the
two asset prices.
Vidyamurthy [39] explored the pairs trading universe in his book. He gives a good background
on the theme and discusses several techniques for choosing pairs, focusing on
co-integration tests. Moreover, the author explains how pairs trading works and surveys some
methods for dealing with the problem in real settings, for instance: common trends/cointegration
models, arbitrage pricing theory (APT), distance measures and state space models/Kalman
filter.
It is also worth citing the paper by Avellaneda & Lee [4]. These authors employed principal
components analysis and sector Exchange Traded Funds (ETFs) for extracting risk factors.
For each method, they modeled the corresponding residuals as mean-reverting processes.
Finally, Gatev et al. [16] studied a pairs trading strategy in the U.S. equity market with
daily data over the period from 1962 through 2002. In their study, stocks from companies
that had at least one day without trading were discarded. A pair for each
stock was formed by minimizing the squared deviations between the two normalized daily price
series, where dividends were reinvested. The basic strategy consisted of opening a position in
a pair when prices diverge by more than two historical standard deviations and unwinding the
position whenever the prices cross each other; should prices not cross before the end
of the trading interval, gains and losses are calculated at the end of the last trading day. Later,
the performance of this strategy by [16] for the Brazilian stock market case was addressed by
[29]. The latter investigated the period from 2000 until 2006 and tested different thresholds for
opening long and short positions, ranging between 1.5 and 3 standard deviations. For the data
set used, the best options were those between 1.5 and 2 standard deviations.
Chapter 2
Statistical Arbitrage Strategies
Quoting [22], "when the two legs of a spread are highly correlated and therefore the
opportunity for profit from price divergence is of short duration, the trade is called an arbitrage.
True arbitrage has, theoretically, no trading risk, however it is offset by small profits and limited
opportunity for volume".
Statistical arbitrage is a class of strategies widely used by hedge funds and proprietary
traders. The distinctive feature of such strategies is that profits can be made by exploiting
statistical mispricing of one or more assets, based on their regular behavior. Despite the use of
the term "arbitrage", such class is not riskless. One of the simplest, albeit very popular,
strategies that fits the definition of statistical arbitrage is pairs trading (cf. [11]). Other types
of statistical arbitrage are discussed in [39] and [30].
Following [39], the first use of a pairs trading strategy is attributed to the Wall Street
quant Nunzio Tartaglia, who was at Morgan Stanley in the mid-1980s. Pairs trading is based
on the arbitrage pricing theory (APT) (cf. [32]). Informally speaking, if two stocks have
similar characteristics, then the prices of both assets must be more or less the same; that is,
they maintain some degree of equilibrium. When prices diverge, it is likely that one of the
assets is overpriced and/or the other is underpriced. Basically, pairs trading schemes involve
selling the higher priced asset and buying the lower priced asset with the hope that the
mispricing will be ultimately corrected towards the long-term equilibrium value. The difference
between the two observed prices is termed spread. Therefore, the idea behind a given pairs
trading strategy is to trade on the oscillations about the equilibrium value of the spread. The
oscillations of the spread occur because the latter is allegedly mean-reverting. One can put on a
trade when the spread deviates substantially from its equilibrium value and unwind the trade
when the equilibrium is restored (cf. [11]). In order for the trade to be potentially profitable, and
therefore be executable, the deviation must be reasonably larger than trading costs.
Pairs trading is a market-neutral trading strategy. Hence, this strategy strives to provide
positive returns in both bull and bear markets by selecting a large number of long and short
positions with no net exposure to the market (cf. [28] and [25]). The main risks involved in
pairs trading are: (1) the divergence risk: the long-term equilibrium relation between the
assets may change or even vanish; and (2) the horizon risk: the spread does not converge within a
given time horizon, hence forcing traders to close their positions before the convergence,
due to worsened mispricing or a margin call (cf. [13]). Additional details about pairs trading can
be found in [30] and [39].
Chapter 3
Proposed Models
3.1 What is a pair?
The idea behind a pair (of stocks, bonds, foreign exchanges, commodities etc.) is closely
linked to the econometric concept of cointegration. Rigorously, two time series
$Y_t \sim I(1)$ and $X_t \sim I(1)$ are said to be cointegrated iff $aY_t + bX_t \sim I(0)$ for
some $a \neq 0$ and $b \neq 0$; the notation $I(d)$ means integrated of order $d$.
This definition shall be enough for the aims of this
work. For richer expositions on the theme and more general definitions, see [21], [19] and [12].
Consider now
\[
S_t = \log(P_t^1) - \left[\mu + \gamma \log(P_t^2)\right],
\tag{3.1.1}
\]
where $P_t^1$ and $P_t^2$ are the prices of assets $A_1$ and $A_2$ at time $t$, respectively.
The time frequency can be daily or some kind of intraday frequency (second, minute, hour etc.).
If $\log(P_t^1)$ and $\log(P_t^2)$ are cointegrated, the spread $S_t$ is stationary; that is:
$S_t \sim I(0)$. In such case, $\gamma$ is the cointegration coefficient, $\mu$ is the mean of the
cointegration relationship, and $A_1$ and $A_2$ form a pair.
Cointegration, once verified, suggests that $S_t$ would wander around an equilibrium value.
This is actually the main ingredient for achieving success in pairs trading. Such value is zero,
in view of $\mu$ in Eq.(3.1.1). Any sizable deviation from this value can be traded against.
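As a minimal sketch of Eq.(3.1.1), the spread can be computed from two price series once the mean of the cointegration relationship (called `mu` below) and the cointegration coefficient (called `gamma` below) are available. How these two quantities are estimated is not covered here and is left as an assumption; the function name `log_spread` and the sample prices are hypothetical and serve only as illustration.

```python
import math

def log_spread(p1, p2, mu, gamma):
    """Spread S_t = log(P_t^1) - [mu + gamma * log(P_t^2)] for each t."""
    if len(p1) != len(p2):
        raise ValueError("price series must have the same length")
    return [math.log(a) - (mu + gamma * math.log(b)) for a, b in zip(p1, p2)]

# Hypothetical prices, for illustration only.
prices_a1 = [100.0, 102.0, 101.0]
prices_a2 = [50.0, 50.5, 50.2]
spread = log_spread(prices_a1, prices_a2, mu=math.log(2.0), gamma=1.0)
```

With these made-up inputs the first price of `prices_a1` is exactly twice the first price of `prices_a2`, so the first spread value sits at the equilibrium value of zero.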
3.2 Unobserved component models: the stochastic spread approach
Following [11], in this section we assume that the observed spread $S_t$, associated
with a given pair of assets $A_1$ and $A_2$, is a noisy realization of the unobserved or actual
mean-reverting spread $x_t$:
\[
\begin{aligned}
S_t &= x_t + D\varepsilon_t, \\
x_t - x_{t-1} &= a - b\,x_{t-1} + C\eta_t,
\end{aligned}
\tag{3.2.1}
\]
where $a \in \mathbb{R}$, $0 < b < 2$, $C > 0$ and $(\varepsilon_t, \eta_t)' \sim \mathrm{NID}(0, I_2)$.
Adapting Eqs.(A.1.1) of Appendix A.1 in order to obtain an appropriate state space
representation for the model in Eqs.(3.2.1), just define
$Z_t = 1$, $d_t = 0$, $H_t = D^2$, $T_t = B \equiv 1 - b$, $c_t = A \equiv a$, $R_t = 1$ and $Q_t = C^2$.
Then the Kalman filter formulae in Eqs.(A.1.2) of Appendix A.1 turn to
\[
\begin{aligned}
\nu_t &= S_t - a_{t|t-1}, & F_t &= P_{t|t-1} + D^2, \\
K_t &= B P_{t|t-1} F_t^{-1}, & L_t &= B - K_t, \\
a_{t+1|t} &= A + B a_{t|t-1} + K_t \nu_t, & P_{t+1|t} &= B P_{t|t-1} L_t' + C^2,
\end{aligned}
\qquad t = 1, \ldots, n.
\tag{3.2.2}
\]
Eqs.(3.2.2) can be started under the initial conditions $a_{1|0} = A/(1 - B)$ and
$P_{1|0} = C^2/(1 - B^2)$.
Notice that the latter are precisely the unconditional first- and second-order moments of the
stationary process $x_t$.
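The recursions in Eqs.(3.2.2) are scalar, so they can be written down directly. The sketch below is only an illustration under the assumption that the parameters `A`, `B`, `C` and `D` are already known (estimation is not shown); the function name `kalman_filter_spread` and the sample spread values are hypothetical, and this is not the Matlab code of Appendix A.4.

```python
def kalman_filter_spread(spreads, A, B, C, D):
    """Run the scalar recursions of Eqs.(3.2.2) over the observed spreads.

    Returns the one-step-ahead predictions a_{t|t-1}, their variances
    P_{t|t-1}, and the final pair (a_{n+1|n}, P_{n+1|n}).
    Requires |B| < 1 so that the stationary initial conditions exist.
    """
    if not abs(B) < 1.0:
        raise ValueError("stationarity requires |B| < 1")
    a = A / (1.0 - B)            # a_{1|0}: unconditional mean of x_t
    P = C**2 / (1.0 - B**2)      # P_{1|0}: unconditional variance of x_t
    preds, variances = [], []
    for s in spreads:
        preds.append(a)
        variances.append(P)
        v = s - a                # innovation
        F = P + D**2             # innovation variance
        K = B * P / F            # Kalman gain
        L = B - K
        a = A + B * a + K * v    # a_{t+1|t}
        P = B * P * L + C**2     # P_{t+1|t}
    return preds, variances, (a, P)

# Hypothetical spread values and parameters, for illustration only.
preds, variances, (a_next, P_next) = kalman_filter_spread(
    [0.1, -0.05, 0.2], A=0.0, B=0.9, C=0.1, D=0.05)
```

Following the rule attributed to [11] in Chapter 1, comparing each observed spread with the corresponding entry of `preds` would indicate whether the spread is regarded as too large or too small.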
This model proposed by Elliot et al. has three interesting features. The first is that it
has at least some support from finance theory, since it can be viewed as a discrete-time version
of the Ornstein-Uhlenbeck continuous-time stochastic process (see [33]). The second is that it
recognizes a mean-reverting behavior for the spread. The last good property is a consequence
of the next result, the proof of which is in Appendix A.2:
Proposition 1. If $S_t$ follows the unobserved component model by Elliot et al. given in
Eqs.(3.2.1), then $S_t \sim$ ARMA(1,1) with restrictions.
This last proposition, besides encapsulating this proposal by Elliot et al. in a more
general class of mean-reverting statistical models (next section), suggests a procedure for
selecting/discarding Eqs.(3.2.1) as a probabilistic description of some spread time series: if one
obtains evidence from the data that the latter cannot be adequately fitted by any ARMA(1,1)
model, then the proposal by Elliot et al. is necessarily misspecified and should not be
considered in a pairs trading scheme.
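To make the ARMA(1,1) claim more tangible, note that subtracting $(1-b)S_{t-1}$ from $S_t$ in the unobserved component model leaves a constant plus a combination of the current state noise, the current observation noise and the lagged observation noise, whose autocovariances vanish beyond lag one. The sketch below is not the proof of Appendix A.2; it merely matches those two autocovariances to an MA(1) term, under the assumption that the two noise sequences are independent with unit variances. The function name `implied_arma11` and the parameter values are hypothetical.

```python
import math

def implied_arma11(a, b, C, D):
    """Map the parameters of the unobserved component model to ARMA(1,1) ones.

    Returns (phi0, phi1, theta, sigma2) such that
    S_t = phi0 + phi1*S_{t-1} + u_t + theta*u_{t-1}, with Var(u_t) = sigma2.
    """
    B = 1.0 - b
    g0 = C**2 + D**2 * (1.0 + B**2)   # lag-0 autocovariance of the MA part
    g1 = -B * D**2                    # lag-1 autocovariance of the MA part
    if g1 == 0.0:
        return a, B, 0.0, g0          # no MA term when B = 0 or D = 0
    r = g1 / g0
    theta = (1.0 - math.sqrt(1.0 - 4.0 * r**2)) / (2.0 * r)  # root with |theta| <= 1
    sigma2 = g1 / theta
    return a, B, theta, sigma2

# Hypothetical parameter values, for illustration only.
phi0, phi1, theta, sigma2 = implied_arma11(a=0.0, b=0.1, C=0.1, D=0.05)
```

One restriction visible in this sketch is that the moving average coefficient is a function of $(b, C, D)$ and takes the opposite sign of the autoregressive coefficient $1-b$; an unrestricted ARMA(1,1) model has no such link.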
3.3 ARMA models: generalizing the stochastic spread approach

Because of their mean-reverting behavior, stationary autoregressive-moving average
(ARMA) dynamics can always be considered as valid attempts for modeling the spread $S_t$.
For instance, one could assume that $S_t \sim$ ARMA(2,2), that is,
\[
S_t = \phi_0 + \phi_1 S_{t-1} + \phi_2 S_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2},
\tag{3.3.1}
\]
where $\varepsilon_t \sim \mathrm{NID}(0, \sigma^2)$ and $(\phi_1, \phi_2)'$ are such that the polynomial
$\phi_p(z) = 1 - \phi_1 z - \phi_2 z^2$, $z \in \mathbb{C}$, has its two roots outside the unit circle.
The latter assumption on the coefficients is a sufficient condition for $S_t$ to
be a stationary process (see [5], [6] and [19]). The same
restrictions could be imposed on the moving average coefficients $\theta_1$ and $\theta_2$ in order
to guarantee that $S_t$ is invertible; that is, $\varepsilon_t$ can be written as a function of
$S_t, S_{t-1}, \ldots$, by means of an AR($\infty$) representation for $S_t$ (again, see [5], [6]
and [19]). Fortunately, such question
regarding invertibility is immaterial under the state space modeling/Kalman filter framework,
since the latter always makes both likelihood function evaluation and forecasting attainable
tasks independently of the invertibility question, as cleverly discussed by [19], Chaps. 4, 5 and
13.
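The stationarity condition on the autoregressive coefficients can be checked numerically by locating the roots of the autoregressive polynomial. The following is a minimal sketch for the ARMA(2,2) case only; the function name `ar2_is_stationary` is hypothetical and the coefficient values in the usage lines are made up for illustration.

```python
import cmath

def ar2_is_stationary(phi1, phi2):
    """True if all roots of 1 - phi1*z - phi2*z**2 lie outside the unit circle."""
    if phi2 == 0.0:
        # Degenerate AR(1) case: single root z = 1/phi1 (no root if phi1 == 0).
        return abs(phi1) < 1.0
    # Roots of phi2*z**2 + phi1*z - 1 = 0 via the quadratic formula.
    disc = cmath.sqrt(phi1**2 + 4.0 * phi2)
    roots = [(-phi1 + disc) / (2.0 * phi2), (-phi1 - disc) / (2.0 * phi2)]
    return all(abs(z) > 1.0 for z in roots)

# Hypothetical coefficients, for illustration only.
ok = ar2_is_stationary(0.5, 0.3)
not_ok = ar2_is_stationary(0.5, 0.6)
```

The same function could be applied to the moving average coefficients, with signs adjusted to the moving average polynomial, if one wished to check invertibility as well.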
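One can use Eqs.(A.1.1) of Appendix A.1 to accommodate the model in Eq.(3.3.1), and
any other stationary ARMA(p, q) model, under state space representations. Although there
is no unique way of doing such conversion and the literature has been frequently offering and
defending several state space forms for ARIMA models (to cite a few books: [20], [5], [6], [19]
and [9]), in this work the following alternative for the ARMA(2, 2) model given in Eq.(3.3.1)
shall be used in the sequel:
\[
Z_t = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad d_t = 0, \quad H_t = 0, \quad Q_t = \sigma^2,
\]
\[
T_t = \begin{pmatrix}
\phi_1 & \phi_2 & 1 & \theta_1 & \theta_2 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}, \quad
c_t = \begin{pmatrix} \phi_0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad
R_t = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}.
\]
The Kalman filter formulae in Eqs.(A.1.2) (cf. Appendix A.1) are carried out with the
matrices above and can be initialized under the initial conditions
\[
a_1 = \left( \frac{\phi_0}{1 - \phi_1 - \phi_2}, \; \frac{\phi_0}{1 - \phi_1 - \phi_2}, \; 0, \; 0, \; 0 \right)'
\quad \text{and} \quad
\mathrm{vec}(P_1) = (I - T \otimes T)^{-1}\, \mathrm{vec}(R Q R').
\]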
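A small sketch may help to see how the system matrices of the ARMA(2,2) state space form fit together. It assumes the state vector is ordered as (current spread, lagged spread, next noise, current noise, lagged noise), which is what the transition matrix implies. To stay free of linear algebra libraries, the initial covariance is obtained here by iterating the equation P = T P T' + R Q R' until it stops changing, which is a swapped-in technique and not the vec/Kronecker formula of the text; for a stationary model both give the same matrix. All function names and parameter values are hypothetical.

```python
def arma22_state_space(phi0, phi1, phi2, theta1, theta2, sigma2):
    """System matrices for the state (S_t, S_{t-1}, e_{t+1}, e_t, e_{t-1})'."""
    Z = [1.0, 0.0, 0.0, 0.0, 0.0]
    T = [[phi1, phi2, 1.0, theta1, theta2],
         [1.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 1.0, 0.0]]
    c = [phi0, 0.0, 0.0, 0.0, 0.0]
    R = [0.0, 0.0, 1.0, 0.0, 0.0]
    mean = phi0 / (1.0 - phi1 - phi2)      # unconditional mean of the spread
    a1 = [mean, mean, 0.0, 0.0, 0.0]
    return Z, T, c, R, sigma2, a1

def matmul(X, Y):
    """Plain matrix product of two lists of lists."""
    return [[sum(X[i][p] * Y[p][j] for p in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def stationary_covariance(T, R, Q, tol=1e-12, max_iter=10000):
    """Solve P = T P T' + R Q R' by fixed-point iteration (stationary T)."""
    n = len(T)
    RQR = [[R[i] * Q * R[j] for j in range(n)] for i in range(n)]
    Tt = [list(row) for row in zip(*T)]
    P = [row[:] for row in RQR]
    for _ in range(max_iter):
        TPT = matmul(matmul(T, P), Tt)
        new = [[TPT[i][j] + RQR[i][j] for j in range(n)] for i in range(n)]
        diff = max(abs(new[i][j] - P[i][j]) for i in range(n) for j in range(n))
        P = new
        if diff < tol:
            return P
    raise RuntimeError("no convergence; is the AR part stationary?")

# Hypothetical parameter values, for illustration only.
Z, T, c, R, Q, a1 = arma22_state_space(0.3, 0.5, 0.2, 0.3, 0.1, 1.0)
P1 = stationary_covariance(T, R, Q)
```

For larger models the closed-form vec/Kronecker expression or a dedicated Lyapunov solver would normally be preferred over this iteration.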
Chapter 4
A new pairs trading strategy
In this chapter, we discuss the main elements of a quantitative pairs trading strategy
entirely based on the estimation of the state space models proposed in Chapter 3. Firstly, in
Section 4.1, we give theoretical details on how conditional probabilities that the spread will
return to its long-term mean, within $k$ steps ahead from a given time instant $t$, are defined.
Moving on, in Section 4.2 we explore the practical matters for effectively calculating the
aforementioned probabilities in an on-line fashion: as it will be shown, once an appropriate state
space model is estimated by maximum likelihood (see Appendix A.1), the implementation of the
usual Kalman filter prediction equations given in Eqs.(A.1.2) to an augmented version of the
model shall be everything needed. Finally, in Section 4.3 the quantitative strategy is described
step-by-step, where the contents derived in Sections 4.1 and 4.2 are merged with the trading rule
that involves buying or selling the spread accordingly.
4.1 Mean-reverting conditional probabilities $p_{up}$ and $p_{down}$: theory
The main target for success is to achieve, from a statistical/probabilistic standpoint, a
minimum confidence that a future observed value of the spread will not take too long to cross
back some long-term value (for instance: its unconditional mean), once the spread observed at
some time $t$ is somewhat distant from that same long-term value. If such task can be
accomplished, one might buy (or sell) the spread at that time $t$, whenever chances are that he
or she will be able to make a profit soon.
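Formally, the strategy that we now begin to build is strongly based upon the ability of
calculating conditional probabilities that the spread will revert to its long-term mean, or any
other convenient value $c$ to be chosen, within $k$ steps ahead, given the past and current
spread data; that is:
\[
\begin{aligned}
p_{up}(t, k, c) &= P\left[(S_{t+1} > c) \cup (S_{t+2} > c) \cup \cdots \cup (S_{t+k} > c) \mid \mathcal{F}_t\right] \\
&= 1 - P\left[S_{t+1} \le c, S_{t+2} \le c, \ldots, S_{t+k} \le c \mid \mathcal{F}_t\right] \\
&= 1 - F_{S_{t+1}, S_{t+2}, \ldots, S_{t+k} \mid \mathcal{F}_t}(c, c, \ldots, c), \\
p_{down}(t, k, c) &= P\left[(S_{t+1} < c) \cup (S_{t+2} < c) \cup \cdots \cup (S_{t+k} < c) \mid \mathcal{F}_t\right] \\
&= P\left[(-S_{t+1} > -c) \cup (-S_{t+2} > -c) \cup \cdots \cup (-S_{t+k} > -c) \mid \mathcal{F}_t\right] \\
&= 1 - P\left[-S_{t+1} \le -c, -S_{t+2} \le -c, \ldots, -S_{t+k} \le -c \mid \mathcal{F}_t\right] \\
&= 1 - F_{-S_{t+1}, -S_{t+2}, \ldots, -S_{t+k} \mid \mathcal{F}_t}(-c, -c, \ldots, -c),
\end{aligned}
\tag{4.1.1}
\]
where $\mathcal{F}_t$ is the $\sigma$-field generated by past and current spread data; that is,
$\mathcal{F}_t \equiv \sigma(S_1, \ldots, S_{t-1}, S_t)$.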
If the assumption that a specific Gaussian linear state space model is appropriate for the spread
holds (something that needs checking in practical implementations), the conditional distribution
functions described in Eqs.(4.1.1) correspond to
\[
S_{t,k} \mid \mathcal{F}_t \equiv
\begin{pmatrix} S_{t+1} \\ S_{t+2} \\ \vdots \\ S_{t+k} \end{pmatrix} \Bigg| \, \mathcal{F}_t
\sim N\left(\mu_{t,k}, \Sigma_{t,k}\right),
\tag{4.1.2}
\]
where $\mu_{t,k} \equiv E(S_{t,k} \mid \mathcal{F}_t)$ and $\Sigma_{t,k} \equiv \mathrm{Var}(S_{t,k} \mid \mathcal{F}_t)$.
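Sticking to the notation established in Appendix A.1 for key quantities related to the Kalman
filter and also defining
$P_{t+i,t+j|t} \equiv \mathrm{Cov}(\alpha_{t+i}, \alpha_{t+j} \mid \mathcal{F}_t)$, for
$i, j = 1, 2, \ldots, k$ and $i < j$ (recall that $P_{t+i,t+j|t} = P_{t+j,t+i|t}'$), it follows
that each entry of $\mu_{t,k}$ is given by
\[
\begin{aligned}
E(S_{t+i} \mid \mathcal{F}_t) &= E(Z_{t+i}\alpha_{t+i} + d_{t+i} + \varepsilon_{t+i} \mid \mathcal{F}_t) \\
&= Z_{t+i} E(\alpha_{t+i} \mid \mathcal{F}_t) + d_{t+i} + E(\varepsilon_{t+i} \mid \mathcal{F}_t) \\
&= Z_{t+i} E(\alpha_{t+i} \mid \mathcal{F}_t) + d_{t+i} + E(\varepsilon_{t+i}) \\
&= Z_{t+i}\, a_{t+i|t} + d_{t+i}.
\end{aligned}
\tag{4.1.3}
\]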
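Regarding $\Sigma_{t,k}$, its diagonal and off-diagonal blocks are given respectively by
\[
\begin{aligned}
\mathrm{Var}(S_{t+i} \mid \mathcal{F}_t)
&= Z_{t+i} \mathrm{Var}(\alpha_{t+i} \mid \mathcal{F}_t) Z_{t+i}' + \mathrm{Var}(\varepsilon_{t+i} \mid \mathcal{F}_t) \\
&\quad + Z_{t+i} \mathrm{Cov}(\alpha_{t+i}, \varepsilon_{t+i} \mid \mathcal{F}_t)
 + \mathrm{Cov}(\varepsilon_{t+i}, \alpha_{t+i} \mid \mathcal{F}_t) Z_{t+i}' && (4.1.4) \\
&= Z_{t+i} P_{t+i|t} Z_{t+i}' + H_{t+i}, && (4.1.5)
\end{aligned}
\]
\[
\begin{aligned}
\mathrm{Cov}(S_{t+i}, S_{t+j} \mid \mathcal{F}_t)
&= \mathrm{Cov}(Z_{t+i}\alpha_{t+i} + d_{t+i} + \varepsilon_{t+i},\; Z_{t+j}\alpha_{t+j} + d_{t+j} + \varepsilon_{t+j} \mid \mathcal{F}_t) \\
&= Z_{t+i} \mathrm{Cov}(\alpha_{t+i}, \alpha_{t+j} \mid \mathcal{F}_t) Z_{t+j}'
 + Z_{t+i} \mathrm{Cov}(\alpha_{t+i}, \varepsilon_{t+j} \mid \mathcal{F}_t) && (4.1.6) \\
&\quad + \mathrm{Cov}(\varepsilon_{t+i}, \alpha_{t+j} \mid \mathcal{F}_t) Z_{t+j}'
 + \mathrm{Cov}(\varepsilon_{t+i}, \varepsilon_{t+j} \mid \mathcal{F}_t) \\
&= Z_{t+i} P_{t+i,t+j|t} Z_{t+j}'.
\end{aligned}
\]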
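For the scalar unobserved component model of Section 3.2 the conditional mean vector and covariance matrix of the next k spreads can be written out by hand, which may help to see what the general formulas amount to. The sketch below assumes a time-invariant model with transition coefficient `B`, constant `A`, state noise scale `C` and observation noise scale `D`, and it uses the fact that, for i < j, the conditional covariance between the states at t+i and t+j equals the conditional variance of the state at t+i multiplied by B raised to j-i (obtained by iterating the state equation forward). This is a shortcut for the scalar case only, not the augmented-state route proposed in the thesis; the function name and input values are hypothetical.

```python
def spread_forecast_moments(a_next, P_next, A, B, C, D, k):
    """Mean vector and covariance matrix of (S_{t+1},...,S_{t+k}) given F_t.

    a_next and P_next are a_{t+1|t} and P_{t+1|t} from the Kalman filter.
    """
    means, state_vars = [], []
    a, P = a_next, P_next
    for _ in range(k):
        means.append(a)            # Z = 1, d = 0, so E(S_{t+i}|F_t) = a_{t+i|t}
        state_vars.append(P)
        a = A + B * a              # a_{t+i+1|t}
        P = B * P * B + C**2       # P_{t+i+1|t}
    cov = [[0.0] * k for _ in range(k)]
    for i in range(k):
        cov[i][i] = state_vars[i] + D**2                    # Z P Z' + H
        for j in range(i + 1, k):
            cov[i][j] = cov[j][i] = state_vars[i] * B ** (j - i)
    return means, cov

# Hypothetical inputs, for illustration only.
means, cov = spread_forecast_moments(0.2, 0.01, A=0.0, B=0.9, C=0.1, D=0.05, k=3)
```

The observation noise enters only the diagonal of the covariance matrix, in line with the fact that the cross terms involving the observation noise vanish in the covariance expression.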
4.2 Mean-reverting conditional probabilities $p_{up}$ and $p_{down}$: practical evaluation
For each $t$, the first- and second-order conditional moments displayed in Eqs.(4.1.3)
and (4.1.4) are trivially obtained from the Kalman filter in Eqs.(A.1.2) applied with the data
subset $\{S_1, S_2, \ldots, S_t\}$ enlarged with $k$ missing values after the last spread $S_t$;
that is, one has to consider $\{S_1, S_2, \ldots, S_t, \mathrm{NaN}, \mathrm{NaN}, \ldots, \mathrm{NaN}\}$,
where the acronym "NaN" means "Not a Number" and appears $k$ times exactly right after $S_t$.
Following [9], Sec.4.9, this is
the recognition that, under the state space modeling approach, forecasting is a particular case
of missing values estimation. On the other hand, to be calculated, Eqs.(4.1.6) shall depend on
additional implementation of Kalman recursions other than those revisited in Appendix A.1;
specifically, those derived in [9], Sec.4.5, with appropriate adaptations for the case of missing
values. In order to avoid this computational effort, which is not always available as a
ready-to-use option offered by commercial software, nor is considered in the usual Kalman filter
codes suggested in textbooks, in this work we propose an alternative. Our proposal makes
use of already implemented formulae known to time series analysts.
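The idea that forecasting is a special case of missing-value estimation can be sketched for the scalar model of Section 3.2: whenever an observation is missing, the update step is skipped and only the prediction step is carried out. The code below is an illustrative assumption about how such a filter could look, with hypothetical names and data; it is not the thesis's Matlab implementation.

```python
import math

def kalman_predict_with_missing(series, A, B, C, D):
    """Scalar Kalman filter in which NaN entries are treated as missing.

    Returns the lists of a_{t|t-1} and P_{t|t-1} for every position.
    """
    a = A / (1.0 - B)
    P = C**2 / (1.0 - B**2)
    preds, variances = [], []
    for s in series:
        preds.append(a)
        variances.append(P)
        if math.isnan(s):
            # Missing observation: no update, only the prediction step.
            a = A + B * a
            P = B * P * B + C**2
        else:
            v = s - a
            F = P + D**2
            K = B * P / F
            a = A + B * a + K * v
            P = B * P * (B - K) + C**2
    return preds, variances

# Hypothetical data: three observed spreads followed by k = 3 missing values.
observed = [0.1, -0.05, 0.2]
k = 3
preds, variances = kalman_predict_with_missing(
    observed + [math.nan] * k, A=0.0, B=0.9, C=0.1, D=0.05)
forecasts = preds[-k:]          # conditional means 1, 2 and 3 steps ahead
forecast_vars = variances[-k:]  # conditional state variances 1, 2 and 3 steps ahead
```

This yields the conditional means and variances for each horizon, but not the covariances between different horizons, which is the gap the augmented state space form is meant to close.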
The building block for routinely evaluating Eqs.(4.1.3), (4.1.4) and (4.1.6) for each time $t$
is to use an augmented state space form equivalent to a given time series model formerly
selected and estimated with the spread data. In this paper, the models considered are those
previously discussed in Sections 3.2 and 3.3. This task consists of adding $k$ new blocks to
the state vector in Eqs.(A.1.1), each one having the same dimension as the original state vector.
Formally:
\[
\begin{aligned}
Y_t &= \begin{pmatrix} Z_t & 0 & \cdots & 0 \end{pmatrix}
\begin{pmatrix} \alpha_t \\ \alpha_{t-1} \\ \vdots \\ \alpha_{t-k} \end{pmatrix} + d_t + \varepsilon_t, \\
\begin{pmatrix} \alpha_{t+1} \\ \alpha_t \\ \vdots \\ \alpha_{t-(k-1)} \end{pmatrix}
&= \begin{pmatrix}
T_t & 0 & \cdots & 0 & 0 \\
I & 0 & \cdots & 0 & 0 \\
0 & I & \cdots & 0 & 0 \\
\vdots & & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & I & 0
\end{pmatrix}
\begin{pmatrix} \alpha_t \\ \alpha_{t-1} \\ \vdots \\ \alpha_{t-k} \end{pmatrix}
+ \begin{pmatrix} c_t \\ 0 \\ \vdots \\ 0 \end{pmatrix}
+ \begin{pmatrix} R_t \\ 0 \\ \vdots \\ 0 \end{pmatrix} \eta_t,
\end{aligned}
\tag{4.2.1}
\]
where $Z_t$, $d_t$, $T_t$, $R_t$ and $c_t$ are the system matrices of the original model. With
this enlarged state space form, we apply the Kalman filter $k$-steps-ahead prediction at a given
time $t$ to obtain first- and second-order conditional moments of
$(\alpha_{t+1}', \ldots, \alpha_{t+k}')'$; with these quantities,
the calculation of the first- and second-order moments displayed in Eqs.(4.1.3), (4.1.4) and
(4.1.6) becomes straightforward.
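Building the augmented system matrices is mostly bookkeeping: the original transition matrix goes in the top-left block, identity blocks on the block sub-diagonal shift each stored state one slot down, and every other block is zero. The following sketch assumes time-invariant system matrices stored as plain lists; the function name `augment_state_space` and the example values are hypothetical.

```python
def augment_state_space(T, c, R, Z, k):
    """Stack k lagged copies of the state below the original one.

    T is m x m, c and Z have length m, R is m x r (lists of lists).
    Returns (T_aug, c_aug, R_aug, Z_aug) with state dimension m*(k+1).
    """
    m = len(T)
    n = m * (k + 1)
    T_aug = [[0.0] * n for _ in range(n)]
    for i in range(m):
        for j in range(m):
            T_aug[i][j] = T[i][j]            # top-left block: original T
    for i in range(m, n):
        T_aug[i][i - m] = 1.0                # identity blocks shift the lags down
    c_aug = list(c) + [0.0] * (m * k)
    R_aug = [list(row) for row in R] + [[0.0] * len(R[0]) for _ in range(m * k)]
    Z_aug = list(Z) + [0.0] * (m * k)
    return T_aug, c_aug, R_aug, Z_aug

# Hypothetical scalar model (m = 1) augmented with k = 2 lagged states.
T_aug, c_aug, R_aug, Z_aug = augment_state_space(
    T=[[0.9]], c=[0.1], R=[[1.0]], Z=[1.0], k=2)
```

Running any standard Kalman prediction routine on these augmented matrices for k steps would then return, in a single covariance matrix, the joint second-order moments of all k future states.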
Denote the vectors of unknown parameters associated with Eqs.(4.2.1) and (A.1.1) by
$\psi^{\dagger}$ and $\psi$ respectively, and the corresponding likelihood functions by
$L^{\dagger}$ and $L$. Since the augmented model does not carry any new parameters, it trivially
follows that $\psi^{\dagger} = \psi$. Even
though it is not that easy to claim the same for the maximum likelihood estimators obtained
under $L^{\dagger}$ and $L$, the next proposition, whose proof is in Appendix A.3, asserts that it
is indeed the case:
Proposition 2. $\arg\max L(\psi) = \arg\max L^{\dagger}(\psi^{\dagger})$.
This result and its proof are admittedly inspired by Theorem 2 of [3], but we decided to include
them here in detail, with proper adaptations from the former proof, in order to make this paper
more self-contained.
The interpretation of Proposition 2 is that there are no changes in maximum likelihood
estimation when considering the augmented model in Eqs.(4.2.1); hence, one does not need to use
the latter to estimate the parameters, something that would create additional and unnecessary
computational effort. Instead, the estimation of unknown parameters can be accomplished
using the original model in Eqs.(A.1.1) and the estimates obtained shall be used with the
augmented model. From a practical standpoint, this result shall prove to be a key one in the
applications of Chapter 5 for speeding up the calculation of the probabilities in Eqs.(4.1.1).
Finally, once $\mu_{t,k}$ and $\Sigma_{t,k}$ in Eq.(4.1.2) are calculated, the conditional
probabilities in Eqs.(4.1.1) shall be evaluated through standard numerical multiple integration
algorithms, which have been adapted for the multivariate normal distribution framework (see for
instance [8], [17], [18] and [7]).
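As a rough illustration of this last step, the two probabilities can also be approximated by simulation once the conditional mean vector and covariance matrix of the future spreads are available. The sketch below swaps the numerical integration algorithms cited above for plain Monte Carlo sampling, which is cruder and slower to converge but needs only the standard library; the function names, the number of draws and the inputs are hypothetical.

```python
import math
import random

def cholesky(M):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][p] * L[j][p] for p in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def mean_reverting_probabilities(mean, cov, c, n_draws=20000, seed=0):
    """Monte Carlo estimates of (p_up, p_down) for a Gaussian vector of future spreads."""
    rng = random.Random(seed)
    L = cholesky(cov)
    k = len(mean)
    hits_up = hits_down = 0
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        path = [mean[i] + sum(L[i][j] * z[j] for j in range(i + 1))
                for i in range(k)]
        hits_up += any(s > c for s in path)      # some future spread rises above c
        hits_down += any(s < c for s in path)    # some future spread falls below c
    return hits_up / n_draws, hits_down / n_draws

# Hypothetical one-step-ahead case: mean -0.1, variance 0.04, long-term value c = 0.
p_up, p_down = mean_reverting_probabilities([-0.1], [[0.04]], c=0.0)
```

In the one-step case shown, the estimates can be checked against the univariate normal distribution function, which is a convenient sanity test before moving to larger horizons.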
4.3 The strategy

Assuming that a particular state space model has been already estimated with available
time series data from the spread process $S_t$ (the latter is associated with a pair of assets
$A_1$ and $A_2$) and that the numerical devices discussed in Section 4.2 have been implemented,
we are now able to propose our trading rule. Summarily, this can be split in two mutually
exclusive situations:
• If the observed value of $S_t$ is found minimally below (let us say: by $\delta$ units) a
long-term value $c$, the very same used along Eqs.(4.1.1) and previously fixed at a particular
value (for instance: $c = 0$, should one choose the spread mean), and $p_{up}$ in Eq.(4.1.1) is
found above some large value $p^{*}_{up}$, buy the spread.
• If the observed value of $S_t$ is found minimally above $c$ (without loss of generality,
consider the same amount $\delta$) and $p_{down}$ in Eq.(4.1.1) is found above some large value
$p^{*}_{down}$, sell the spread.
The items above deserve some qualification. Firstly, the meaning of the expression "buy
the spread" is: the lower priced asset (in this case, $A_1$; see Eq.(3.1.1)) is bought and the
other asset is sold. The expression "sell the spread" could be analogously explained. Secondly,
it is worth noticing that either the first situation (long position) or the second (short position)
shall occur only when the spread deviates from $c$ by more than $\delta$, a threshold
that guarantees a minimally profitable trade after costs. Thirdly, since their values are set a
priori, $p^{*}_{up}$ and $p^{*}_{down}$ necessarily reflect risk aversion and one does have the
option of choosing different values for each one. Fourthly, the position (either long or short)
shall be closed whenever the spread hits the long-term value $c$, or when it does not return
within $k$ steps ahead (recall Eqs.(4.1.1)). Finally, even though the two situations are mutually
exclusive, these are certainly not exhaustive: indeed, if the conditions required for each of them
are not met, the capital remains invested in some fixed income market until one of the triggers is
activated. The choices for the parameters $\delta$, $p^{*}_{up}$, $p^{*}_{down}$, $c$ and $k$
considered in this paper will be given in the applications of Chapter 5.
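The two situations of the trading rule can be summarized as a small decision function. The sketch below is only an illustration: the argument names (`delta` for the minimum deviation from the long-term value, `p_up_min` and `p_down_min` for the probability thresholds) and the numbers in the usage line are hypothetical, and the rules for closing a position are not included.

```python
def trading_signal(spread, c, delta, p_up, p_down, p_up_min, p_down_min):
    """Return 'buy', 'sell' or 'hold' for the current spread observation."""
    if spread < c - delta and p_up > p_up_min:
        return "buy"      # buy the spread: A1 is bought and A2 is sold
    if spread > c + delta and p_down > p_down_min:
        return "sell"     # sell the spread: A1 is sold and A2 is bought
    return "hold"         # no trigger: capital stays in the fixed income market

# Hypothetical values, for illustration only.
signal = trading_signal(spread=-0.05, c=0.0, delta=0.02,
                        p_up=0.9, p_down=0.1, p_up_min=0.8, p_down_min=0.8)
```

Because the two entry conditions require the spread to be on opposite sides of the long-term value, at most one of them can hold at a time, which reflects the mutually exclusive nature of the two situations.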

When the strategy just described is adopted, the main risk one might be exposed to is
that related to specific fundamental changes: the prices of $A_1$ and $A_2$ may diverge, which
means that the spread, not stationary anymore, is not hitting its former long-term value $c$.
Actually, the parameter $k$ has precisely the function of mitigating such divergence risk.
Another aspect is that the target return must always be higher than the return earned in the
fixed income market, because the latter is the opportunity cost inherent to this strategy. The
parameter $\delta$ is present here to try to control such point.
Chapter 5
Applications
This chapter presents the results of applying the models from Chapter 3 and the pairs trading strategy derived in Chapter 4 to real data from the US and Brazilian markets. In Section 5.1 we describe the data used in the estimations and justify our choice of the stocks as candidates to form pairs. In Section 5.2 we present the results regarding co-integration tests, model estimation and goodness-of-fit, and the strategy performances.
5.1 The data and some computational details

All the financial time series used in the implementations have been obtained from the Bloomberg Professional service. Four of them, considered in two of the three exercises offered here, consist of daily prices of US securities. The first exercise uses two stocks: Exxon Mobil Corporation (traded on the NYSE with the symbol XOM) and Southwest Airlines Co (traded on the NYSE with the symbol LUV). Exxon Mobil Corporation is the world's largest publicly traded international oil and gas company and has its headquarters located in Texas in the US. Southwest Airlines Co operates passenger airlines that provide scheduled air transportation services in the United States. The other exercise in the US equity market consists of modeling daily prices of two ETFs:
Market Vectors Gold Miners ETF (traded on the NYSE Arca with the symbol GDX) and SPDR Gold Shares (traded on the NYSE Arca with the symbol GLD). Market Vectors Gold Miners seeks to replicate as closely as possible, before fees and expenses, the price and yield performance of the NYSE Arca Gold Miners Index. SPDR Gold Shares seeks to replicate the performance of the price of gold bullion. An ETF (Exchange Traded Fund) is a security that tracks an index, a commodity or a basket of assets like an index fund, but is traded as a stock on an exchange. For these four securities, the period considered goes from September 22nd, 2011 till September 20th, 2012. Two other series, corresponding to the third exercise, are daily stock prices of Vale (traded on the BMF&BOVESPA stock exchange in Sao Paulo with the symbol VALE5) and Bradespar (traded on the BMF&BOVESPA stock exchange in Sao Paulo with the symbol BRAP4). Vale is the second largest mining company in the world and the largest private company in Brazil. It is the largest producer of iron ore in the world and the second largest of nickel. Bradespar is an investment company seeking to create value for its shareholders through relevant interests in companies that are leaders in their operational areas. Currently, Bradespar holds a stake in Vale, acting directly in senior management, with members on the Board of Directors and Advisory Committees. We have used available data for these two stocks from August 29th, 2011 till September 20th, 2012. In view of the definition of pair given and discussed in Chapter 3, the securities described above have been chosen mainly because, given the details above, XOM and LUV, GDX and GLD, and VALE5 and BRAP4 are supposedly long-term related.
Also, the following asset class indexes have been used in the evaluation of the strategy results:
Libor - 1 year: This indicator stands for London Interbank Offered Rate. It is the rate that banks use to borrow from and lend to one another in the wholesale money markets in London.
Standard and Poor's 500 Index (S&P): This is a capitalization-weighted index of 500 stocks representing all major industries and is designed to measure performance of the broad domestic economy through changes in the aggregate market.
Inter-bank deposit certificate (CDI): This indicator is the overnight rate in Brazil. As such, it plays the very same role as the Libor does. Despite being a market rate, the CDI is closely tied to the interest rate fixed by the Brazilian Central Bank in the course of monetary policy decisions.
Bovespa Index (Ibovespa): This is the main indicator of the Brazilian stock market's average performance. The relevance of this index comes from several reasons; one of them is the integrity of its historical series, which has been regularly calculated without any methodological change since its inception in 1968.
5.2 Results
We begin by checking whether XOM-LUV, GDX-GLD and VALE5-BRAP4 show degrees of mutual equilibrium in the periods considered. This is assessed by testing co-integration hypotheses; see Section 3.1. For such, we used the two-step Engle-Granger co-integration test, which is essentially an augmented Dickey-Fuller unit root test performed with the ordinary least squares (OLS) residuals (this is the second step), obtained after regressing one time series on the other (this is the first step); the critical values for the unit root test must be conveniently modified, cf. [14], [12], Chap.6, and [27]. Once the data support the co-integration hypothesis, the spread to be considered in upcoming analyses shall be simply the OLS residuals; recall Eq.(3.1.1) in Section 3.1. The Engle-Granger test results were obtained from EViews 4.0.
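The two steps of the test can be illustrated with a short, self-contained Python sketch. It is a simplified illustration and not the EViews procedure used here: it runs a plain (non-augmented) Dickey-Fuller regression without lagged differences, the function names and the synthetic data are hypothetical, and the resulting statistic would still have to be compared with critical values suited to residual-based tests, such as those of [27].

```python
import random

def ols_line(x, y):
    """Intercept and slope of the OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def dickey_fuller_stat(e):
    """t-statistic of rho in e_t - e_{t-1} = rho * e_{t-1} + u_t (no lags, no constant)."""
    lagged = e[:-1]
    diff = [b - a for a, b in zip(e[:-1], e[1:])]
    sxx = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, diff)) / sxx
    resid = [d - rho * l for l, d in zip(lagged, diff)]
    s2 = sum(u * u for u in resid) / (len(diff) - 1)
    return rho / (s2 / sxx) ** 0.5

def engle_granger(p1, p2):
    """Step 1: regress p1 on p2. Step 2: unit root statistic on the residuals."""
    a, b = ols_line(p2, p1)
    spread = [y - a - b * x for x, y in zip(p2, p1)]
    return spread, dickey_fuller_stat(spread)

# Synthetic example: p2 is a random walk and p1 = 1 + 0.8 * p2 + stationary noise.
rng = random.Random(0)
p2 = [10.0]
for _ in range(499):
    p2.append(p2[-1] + rng.gauss(0.0, 0.1))
p1 = [1.0 + 0.8 * x + rng.gauss(0.0, 0.05) for x in p2]
spread, stat = engle_granger(p1, p2)
```

For a co-integrated pair such as this synthetic one, `stat` is strongly negative, and `spread` is the OLS residual series that plays the role of the spread in the subsequent modeling.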
Looking at Table 5.1, we see that the data give enough evidence in favor of co-integration for both XOM-LUV and VALE5-BRAP4. The pair GDX-GLD can also be considered co-integrated at a 10% level.
Table 5.1: Engle-Granger cointegration tests with the pairs.

Pairs            Dickey-Fuller test*
XOM-LUV          -3.006**
VALE5-BRAP4      -4.059**
GDX-GLD          -1.952***

* Critical values considered have been taken from [27].
** Pair was considered stationary at a 5% level.
*** Pair was considered stationary at a 10% level.
Moving on, we now examine the information in Tables 5.2, 5.3 and 5.4. These contain information concerning goodness-of-fit for three parsimonious ARMA(p, q) models and the model proposed by Elliot et al., along with some diagnostics performed with the standardized residuals $\nu_t^S = \nu_t / \sqrt{F_t}$, where $\nu_t$ and $F_t$ are obtained from Eqs.(A.1.2). The implementations have been carried out using MATLAB 7.6.0. The unknown parameters were estimated by maximum likelihood, in which we adopted the exact log-likelihood function displayed in Eq.(A.1.3). See Appendix A.1. First, we see that, for the spreads from both the US and Brazilian markets, all the estimated models reproduce the data with almost the same capability according to the Pseudo $R^2$ and MSE measures. However, the AIC and BIC criteria do reveal that the AR(1) model, which is the simplest option, shows the best complexity/fit relation.
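The text does not state how the AIC and BIC figures were normalized. The short Python check below shows that the AR(1) entries for XOM-LUV in Table 5.2 are numerically consistent with criteria divided by the sample size. This is an inference from the reported numbers only: the function name is hypothetical, and the number of parameters (2) and the sample size (252) are assumptions rather than values given in the text.

```python
import math

def aic_bic_per_obs(reported_loglik, n_params, n_obs):
    """Information criteria divided by the sample size."""
    minus_two_ll = -2.0 * reported_loglik
    aic = (minus_two_ll + 2 * n_params) / n_obs
    bic = (minus_two_ll + n_params * math.log(n_obs)) / n_obs
    return aic, bic

# AR(1) for XOM-LUV in Table 5.2: reported log-likelihood -989.044.
# n_params=2 and n_obs=252 are assumptions, not values stated in the text.
aic, bic = aic_bic_per_obs(-989.044, n_params=2, n_obs=252)
print(round(aic, 3), round(bic, 3))  # prints 7.865 7.893
```

Under this reading, each additional parameter raises the AIC by 2/252 and the BIC by log(252)/252, which is why the criteria favor the AR(1) model whenever the log-likelihood barely changes across models.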
Before addressing the diagnostics, it is worth bearing in mind that, if a given linear Gaussian state space model is adequate for the data at hand, the standardized residuals must behave like i.i.d. standard normal random variables. Regarding serial dependence, Ljung-Box tests for both the level and the square of the standardized residuals showed good results for all models and spreads from both markets. The Jarque-Bera normality test and the Kupiec coverage tests agreed on revealing adequacy for the pair XOM-LUV. On the other hand, even though the Kupiec tests suggested that the standardized residuals from all the models estimated with the VALE5-BRAP4 and GDX-GLD spreads seem to come from a probability distribution similar to the standard normal distribution as regards tails, the Jarque-Bera test unveiled discrepancies. Therefore, some care must be exercised in interpreting and even using the conditional probabilities $p_{up}$ and $p_{down}$ in Eqs.(4.1.1) in trading decisions; indeed, $p_{up}$ and $p_{down}$ are not tail probabilities.
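For reference, the unconditional coverage test of [24] compares the observed frequency of quantile violations with its nominal level through a likelihood ratio. The Python sketch below is only an illustration: the function names and the violation count are hypothetical. The p-values reported in Tables 5.2 to 5.4 are numerically consistent with a chi-square reference distribution with one degree of freedom (for example, a statistic of 0.434 corresponds to a p-value of about 0.510).

```python
import math

def kupiec_lr(violations, n_obs, p):
    """Likelihood ratio statistic of the unconditional coverage test."""
    x, n = violations, n_obs

    def loglik(q):
        # Binomial log-likelihood without the combinatorial constant.
        out = 0.0
        if x > 0:
            out += x * math.log(q)
        if n - x > 0:
            out += (n - x) * math.log(1.0 - q)
        return out

    return -2.0 * (loglik(p) - loglik(x / n))

def chi2_sf_1dof(stat):
    """Upper tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical count: 13 standardized residuals above the 95% normal quantile
# in a sample of 252 observations.
lr = kupiec_lr(violations=13, n_obs=252, p=0.05)
p_value = chi2_sf_1dof(lr)
```

A large p-value means that the observed violation frequency is compatible with the nominal 5%, which speaks only about the tails and not about the whole distribution.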
Table 5.2: Results from the estimations with the pair XOM-LUV (p-values in parentheses).

Attribute                        AR(1)      AR(2)      ARMA(1,1)  Elliot
Log-likelihood                   -989.044   -989.044   -989.044   -989.081
Pseudo R2                        0.896      0.896      0.896      0.902
MSE x 10^4                       2.299      2.298      2.299      2.161
AIC                              7.865      7.873      7.873      7.882
BIC                              7.893      7.915      7.915      7.938
LR Kupiec test (superior)*       0.077      0.077      0.077      0.434
                                 (0.781)    (0.781)    (0.781)    (0.510)
LR Kupiec test (inferior)*       0.434      0.434      0.434      0.434
                                 (0.510)    (0.510)    (0.510)    (0.510)
Ljung-Box test 1 (20 lags)**     13.718     13.727     13.726     13.684
                                 (0.845)    (0.844)    (0.844)    (0.846)
Ljung-Box test 2 (20 lags)***    29.557     26.679     29.669     29.582
                                 (0.148)    (0.145)    (0.145)    (0.152)
Jarque-Bera test                 0.709      0.706      0.706      0.703
                                 (0.685)    (0.687)    (0.689)    (0.688)
Mean****                         0.069      0.068      0.068      0.085
Variance****                     0.999      0.999      0.999      0.997

* These are likelihood ratio unconditional coverage tests proposed by [24]. The first and second tests check standardized residual violations of the 95% and 5% standard normal distribution quantiles (that is: 1.65 and -1.65), respectively.
** This test has been performed with the standardized residuals.
*** This test has been performed with the squared standardized residuals.
**** These sample statistics have been calculated with the standardized residuals.
Table 5.3: Results from the estimations with the pair GDX-GLD (p-values in parentheses).

Attribute                        AR(1)      ARMA(1,1)  Elliot
Log-likelihood                   -885.036   -885.124   -885.126
Pseudo R2                        0.935      0.935      0.935
MSE x 10^4                       3.325      3.323      3.337
AIC                              7.040      7.048      7.056
BIC                              7.068      7.091      7.112
LR Kupiec test (superior)*       0.434      0.434      0.434
                                 (0.510)    (0.510)    (0.510)
LR Kupiec test (inferior)*       0.296      0.987      0.296
                                 (0.587)    (0.320)    (0.587)
Ljung-Box test 1 (20 lags)**     28.829     29.411     28.853
                                 (0.091)    (0.080)    (0.091)
Ljung-Box test 2 (20 lags)***    12.664     12.606     12.707
                                 (0.891)    (0.894)    (0.890)
Jarque-Bera test                 299.843    309.519    299.517
                                 (0.000)    (0.000)    (0.000)
Mean****                         -0.023     -0.024     -0.019
Variance****                     1.002      1.002      1.002

* These are likelihood ratio unconditional coverage tests proposed by [24]. The first and second tests check standardized residual violations of the 95% and 5% standard normal distribution quantiles (that is: 1.65 and -1.65), respectively.
** This test has been performed with the standardized residuals.
*** This test has been performed with the squared standardized residuals.
**** These sample statistics have been calculated with the standardized residuals.
Table 5.4: Results from the estimations with the pair VALE5-BRAP4 (p-values in parentheses).

Attribute                        AR(1)       AR(2)       ARMA(1,1)   Elliot
Log-likelihood                   -1053.502   -1060.960   -1068.300   -1068.310
Pseudo R2                        0.767       0.780       0.788       0.789
MSE x 10^4                       0.905       0.857       0.8130      0.8110
AIC                              8.377       8.444       8.502       8.510
BIC                              8.405       8.486       8.544       8.566
LR Kupiec test (superior)*       0.987       0.987       0.015       0.015
                                 (0.320)     (0.320)     (0.903)     (0.903)
LR Kupiec test (inferior)*       2.952       0.434       0.434       0.434
                                 (0.086)     (0.510)     (0.510)     (0.510)
Ljung-Box test 1 (20 lags)**     29.706      23.628      18.040      18.035
                                 (0.075)     (0.259)     (0.585)     (0.585)
Ljung-Box test 2 (20 lags)***    13.891      13.694      16.381      16.473
                                 (0.836)     (0.832)     (0.693)     (0.687)
Jarque-Bera test                 22.602      24.308      24.880      24.914
                                 (0.002)     (0.001)     (0.001)     (0.001)
Mean****                         -0.019      -0.029      -0.045      -0.049
Variance****                     1.004       1.003       1.002       1.002

* These are likelihood ratio unconditional coverage tests proposed by [24]. The first and second tests check standardized residual violations of the 95% and 5% standard normal distribution quantiles (that is: 1.65 and -1.65), respectively.
** This test has been performed with the standardized residuals.
*** This test has been performed with the squared standardized residuals.
**** These sample statistics have been calculated with the standardized residuals.
Now, we start the discussion about the pairs trading strategy performances. The parameter $c$ is set to zero, which is the long-term mean of the spreads, since these are precisely the OLS residual time series from the co-integration regressions. Regarding the parameter $\delta$, since operating costs, due to slippage (the difference between the expected trade price and the actual trade price) and transaction fees, are being ignored here, its value is set to 0.5% to overcome this flaw. In view of these two choices for $c$ and $\delta$, a position to buy (sell) the spread is opened if, and only if, the spread is less (greater) than $-\delta$ ($+\delta$). Finally, regarding the conditional probabilities $p_{up}$ and $p_{down}$, their threshold values $\bar{p}_{up}$ and $\bar{p}_{down}$ are both set to 80%, and the parameter $k$ was defined arbitrarily as 25 days, meaning that the position will be closed if, once the spread is bought or sold, the pair does not return to its long-term mean after 25 days at current market prices, the latter being an event with conditional probability of 20% at the most (whenever model assumptions are satisfied).
Table 5.5 and Figures 5.1, 5.2, 5.3 and 5.4 display the results corresponding to the pair XOM-LUV for the four linear state space models under investigation. These also show results from traditional benchmarks of the US financial market already detailed in Section 5.1, and the performance of something we term the plain strategy. The spread for this strategy is defined as the ratio between the highest and lowest priced assets, and the trading strategy, formerly addressed by [16], consists of opening a position with the two assets whenever their spread deviates by more than two historical (sample) standard deviations, and unwinding the position when it returns to the spread historical mean; in case prices do not converge before the end of the trading interval, gains and losses are calculated at the end of the last trading day. Analyzing Table 5.5, Sharpe ratios, used here as the main criterion for choosing the best strategy (since these measure return performance adjusted to market risk, cf. [34] and [35]), indicate that the best trading options in the period considered have been our pairs trading strategy implemented with the AR(2) and ARMA(1,1) models, both presenting the same cumulative return and historical volatility. Cumulative and average returns corresponding to these two models are larger than the others, except for the plain strategy. However, due to its considerably larger volatility, the latter has a worse Sharpe ratio. Additionally, low correlations with the stock index (S&P) are observed; this was expected, since this type of strategy is supposedly market neutral. Figures 5.1, 5.2, 5.3 and 5.4 depict cumulative returns for the four state space models, together with cumulative returns from the market indices and the plain strategy, corroborating and illustrating the findings from Table 5.5.
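The performance measures used in this comparison can be computed from a series of daily returns as in the following Python sketch. The exact formulas behind the tables are not stated in the text, so the conventions below are assumptions: simple daily returns, a constant daily risk-free rate (the role played by the Libor or the CDI) and annualization by the square root of 252 trading days. The function names are hypothetical, and the sketch does not claim to reproduce the reported figures.

```python
import math

def sharpe_ratio(daily_returns, daily_risk_free, periods_per_year=252):
    """Annualized Sharpe ratio from daily simple returns (assumed convention)."""
    n = len(daily_returns)
    excess = [r - daily_risk_free for r in daily_returns]
    mean = sum(excess) / n
    var = sum((e - mean) ** 2 for e in excess) / (n - 1)
    return math.sqrt(periods_per_year) * mean / math.sqrt(var)

def cumulative_return(daily_returns):
    """Compounded return over the whole period."""
    total = 1.0
    for r in daily_returns:
        total *= 1.0 + r
    return total - 1.0

# Hypothetical daily returns of a strategy.
returns = [0.01, -0.005, 0.02, 0.0]
sr = sharpe_ratio(returns, daily_risk_free=0.0)
cum = cumulative_return(returns)
```

Because the ratio divides the mean excess return by its standard deviation, a strategy with a slightly lower return but much lower volatility can rank above one with a higher cumulative return, which is the situation described for the plain strategy.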
Table 5.5: USA market data: performance measures from four different models for the spread and from three benchmarks.

                     XOM-LUV                                       Benchmarks
Measures             AR(1)      AR(2)      ARMA(1,1)  ELLIOT     LIBOR      S&P        Plain strategy
Average return       0.066%     0.077%     0.077%     0.066%     0.0004%    0.077%     0.080%
Volatility*          0.645%     0.595%     0.595%     0.645%     0.00004%   1.034%     0.751%
Cumulative return    15.648%    18.606%    18.606%    15.648%    0.095%     17.573%    19.086%
Sharpe ratio         1.601      2.066      2.066      1.602      -          1.122      1.678
Correlation**        0.160      0.150      0.150      0.160      0.021      1.000      0.006

* This is the standard deviation calculated with the daily returns.
** Correlation between the daily returns from the strategy P/L and the equity market (S&P).
*** Ratio between accumulated returns from the strategy P/L and the Libor in percentage terms.
Figure 5.1: AR(1) Model. Comparison among cumulative returns: strategy P/L with the pair XOM-LUV, Libor, S&P and plain strategy.
Figure 5.2: AR(2) Model. Comparison among cumulative returns: strategy P/L with the pair XOM-LUV, Libor, S&P and plain strategy.
Figure 5.3: ARMA(1,1) Model. Comparison among cumulative returns: strategy P/L with the pair XOM-LUV, Libor, S&P and plain strategy.
Figure 5.4: ELLIOT Model. Comparison among cumulative returns: strategy P/L with the pair XOM-LUV, Libor, S&P and plain strategy.
Now, analysing the results from the pair GDX-GLD in Table 5.6 and Figures 5.5, 5.6 and 5.7, we see that the best performance among the strategies, relying once again on Sharpe ratio comparisons, is that of the plain strategy. This has also shown low cumulative return and small volatility in the period. The low volatility is due to the small number of trades (only one) in the trading interval. Notice also that, in terms of cumulative returns, the S&P and the Elliot et al. model had the best performances, and the S&P shows a higher risk-return ratio in the period considered.
Table 5.6: USA market data: performance measures from three different models for the spread and from three benchmarks.

                     GDX-GLD                            Benchmarks
Measures             AR(1)      ARMA(1,1)  ELLIOT     LIBOR      S&P        Plain strategy
Average return       0.047%     0.047%     0.065%     0.0004%    0.077%     0.019%
Volatility*          1.388%     1.388%     1.364%     0.00004%   1.034%     0.242%
Cumulative return    8.772%     8.772%     13.485%    0.095%     17.573%    4.121%
Sharpe ratio         0.415      0.415      0.651      -          1.122      1.105
Correlation**        0.068      0.068      0.062      0.021      1.000      0.181

* This is the standard deviation calculated with the daily returns.
** Correlation between the daily returns from the strategy P/L and the equity market (S&P).
*** Ratio between accumulated returns from the strategy P/L and the Libor in percentage terms.
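Figure 5.5: AR(1) Model. Comparison among cumulative returns: strategy P/L with the pair GDX-GLD, Libor, S&P and plain strategy.
Figure 5.6: ARMA(1,1) Model. Comparison among cumulative returns: strategy P/L with the pair GDX-GLD, Libor, S&P and plain strategy.
Figure 5.7: ELLIOT Model. Comparison among cumulative returns: strategy P/L with the pair GDX-GLD, Libor, S&P and plain strategy.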
Likewise, both Table 5.7 and Figures 5.8, 5.9, 5.10 and 5.11 show the results for the pair VALE5-BRAP4. The best performances, relying once again on Sharpe ratio comparisons, are those corresponding to the AR(1) and AR(2) models; these have also shown low correlations with the Ibovespa domestic stock index. Figures 5.8, 5.9, 5.10 and 5.11 suggest that cumulative returns coming from our pairs trading strategy, implemented with the best two models, maintained an upward trend with relatively low volatility, which is in line with the best Sharpe ratios. On the other hand, even though the Ibovespa presented the largest final return in the period considered amongst all the investment alternatives, one should notice its very risky behavior (compare volatilities in Table 5.7), which has certainly contributed to some temporary losses. This can be seen in the persistent downward reversals for this index in Figures 5.8, 5.9, 5.10 and 5.11. Notice also that, in terms of cumulative returns, the Ibovespa has been the worst investment option at several moments in the period considered.
Table 5.7: Brazilian market data: performance measures from four different models for the spread and from three benchmarks.

                     VALE5-BRAP4                                   Benchmarks
Measures             AR(1)      AR(2)      ARMA(1,1)  ELLIOT     CDI        IBOVESPA   Plain strategy
Average return       0.084%     0.083%     0.063%     0.063%     0.036%     0.095%     0.035%
Volatility*          0.9505%    0.933%     0.881%     0.881%     0.005%     1.494%     0.261%
Cumulative return    19.878%    19.592%    9.872%     9.872%     8.550%     21.064%    8.202%
Sharpe ratio         0.789      0.784      0.0994     0.0994     -          0.5547     -0.0884
Correlation**        0.036      0.050      0.033      0.033      0.038      1.000      0.009
%CDI***              232.49%    229.15%    115.46%    115.46%    100.00%    246.36%    95.93%

* This is the standard deviation calculated with the daily returns.
** Correlation between the daily returns from the strategy P/L and the equity market (IBOVESPA).
*** Ratio between accumulated returns from the strategy P/L and the CDI in percentage terms.
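Figure 5.8: AR(1) Model. Comparison among cumulative returns: strategy P/L with the pair VALE5-BRAP4, CDI, IBOVESPA and plain strategy.
Figure 5.9: AR(2) Model. Comparison among cumulative returns: strategy P/L with the pair VALE5-BRAP4, CDI, IBOVESPA and plain strategy.
Figure 5.10: ARMA(1,1) Model. Comparison among cumulative returns: strategy P/L with the pair VALE5-BRAP4, CDI, IBOVESPA and plain strategy.
Figure 5.11: ELLIOT Model. Comparison among cumulative returns: strategy P/L with the pair VALE5-BRAP4, CDI, IBOVESPA and plain strategy.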
Finally, Table 5.8 shows the computational gain, in terms of estimation time, due to Proposition 2 of this work. Even though the information corresponds to model estimations with a portfolio with just one pair of assets on a daily basis, it is plausible to assume that the augmented model would also be excessively time consuming, should we have adopted the modeling and pairs trading strategy proposed in this work with intraday high-frequency data; estimation times would have increased even further in the case of a portfolio containing several pairs. For instance, the augmented model with $k = 25$ for the Elliot model required almost three minutes to be estimated; the original model took less than three seconds.
Table 5.8: Computational times (seconds) for maximum likelihood estimation of the models with the pair VALE5-BRAP4.

Models       Original model   Augmented model (k=10)   Augmented model (k=…)   Augmented model (k=…)   Augmented model (k=…)
ELLIOT       2.481            6.113                    14.049                  44.603                  152.091
AR(1)        0.579            1.125                    2.250                   7.513                   24.086
AR(2)        0.939            1.737                    4.026                   12.725                  41.557
ARMA(1,1)    0.891            2.004                    5.716                   18.154                  58.531
Chapter 6
Conclusion
In this dissertation we developed a new pairs trading strategy based on linear state space models and the Kalman filter. As opposed to other approaches found in the literature, neither point forecasts nor confidence bands constitute any basis for decisions regarding trading operations; instead, we look at the conditional probability that the value of the mispriced spread will mean-revert by some pre-established horizon. The evidence gathered from the three applications detailed in Chapter 5, even though limited (and therefore far from being conclusive), suggests that this change of direction in usual pairs trading paradigms might work well in practice. At the end of this work, we address some points potentially relevant and in tune with the financial market reality for the case of implementing the strategy under real scenarios.
We start by suggesting further investigation into how the parameters $c$, $k$ and $\delta$, which have been held constant in the examples of this work, should be set (notice that nothing prevents them from being estimated or, should one prefer, optimized under usual back-testing schemes). One might also enhance the use of such parameters. For instance, the parameter $\delta$, although designed here to simultaneously take into account the transaction costs from both long and short positions, might be split in two: one value for one type of position and another for the other. In the case of short positions, a very important cost, which anyone willing to adopt any pairs trading strategy (including ours) must pay attention to, is the cost of renting the asset. Since transaction fees vary according to the type of investment, an analysis of the latter would help to identify how suitable our strategy would be. More details on these costs in the Brazilian market can be found at www.bmfbovespa.com.br, and for the New York Stock Exchange, one is referred to nyse.nyx.com.
Moving on, we now take a closer look at the question of distributional assumptions, as strong violations of normality can make the quantities $p_{up}$ and $p_{down}$ rather unreliable as proxies for the true conditional probability of mean-reverting. An alternative for dealing with such an inconvenient situation is to rely on Monte Carlo simulation of future trajectories of the spread $S_t$, $k$ steps ahead. For the case of the ARMA models, this would require modeling the error term with the aid of the standardized residuals. A second alternative, which releases one from choosing/modeling error distributions (but is considerably more demanding in computational terms), is to adopt some bootstrap procedure to estimate the mean-reverting conditional probabilities. Wall & Stoffer [40] and Rodriguez & Ruiz [31] are two papers from a certainly large list of references on bootstrapping state space models; these two papers seem to have methodologies that address the aims being discussed here.
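The simulation idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a procedure taken from this work or from [40] and [31]: the spread is assumed to follow an AR(1) around $c$, the function name and its arguments are hypothetical, and the option of resampling the errors from a user-supplied list (for instance, rescaled standardized residuals) is only a crude stand-in for a proper bootstrap.

```python
import random

def reversion_probability(s0, phi, sigma, k, c=0.0, n_paths=2000,
                          shocks=None, seed=0):
    """Estimate P(spread touches or crosses c within k steps | S_t = s0).

    Assumed dynamics: S_{t+1} - c = phi * (S_t - c) + e_{t+1}.
    If `shocks` is given, errors are resampled from it; otherwise they are
    drawn from N(0, sigma^2).
    """
    rng = random.Random(seed)
    above = s0 > c
    hits = 0
    for _ in range(n_paths):
        s = s0
        for _ in range(k):
            e = rng.choice(shocks) if shocks is not None else rng.gauss(0.0, sigma)
            s = c + phi * (s - c) + e
            if (above and s <= c) or (not above and s >= c):
                hits += 1
                break
    return hits / n_paths

# Hypothetical values: spread 1% above c, persistent AR(1), 25-day horizon.
prob = reversion_probability(s0=0.01, phi=0.9, sigma=0.005, k=25)
```

The estimate is the fraction of simulated paths that revert within the horizon, so it does not rely on the normality of the errors when they are resampled from the data.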
Lastly, we discuss the use of our strategy with high-frequency data. The analysis of these data is complicated by irregular temporal spacing, daily patterns and price discreteness (cf. [1], Ch.7). Another major characteristic of high-frequency data is the strong intra-day seasonal behavior of the volatility, as pointed out by [15], Ch.4. A data generating process with strong seasonal patterns cannot be stationary. Therefore, controlling these periodical movements before fitting any time series model to the data should be a mandatory initial step. In light of these issues typically related to high-frequency situations, other state space models shall be combined with the pairs trading strategy proposed in this work.
Appendix A
Appendix
A.1 Linear state space models and the Kalman filter
By a Gaussian linear state space model we mean the following measurement equation, state equation and initial state vector:

$$
\begin{aligned}
Y_t &= Z_t \alpha_t + d_t + \varepsilon_t, & \varepsilon_t &\sim NID(0, H_t), \\
\alpha_{t+1} &= T_t \alpha_t + c_t + R_t \eta_t, & \eta_t &\sim NID(0, Q_t), \\
\alpha_1 &\sim N(a_1, P_1).
\end{aligned}
\qquad \text{(A.1.1)}
$$

The former equation is an affine function relating the observed $p$-variate time series $Y_t$ to the generally unobserved $m$-variate state vector $\alpha_t$, and the latter equation gives the state evolution through a Markovian structure. The random errors $\varepsilon_t$ and $\eta_t$ are independent (in time, between each other and of $\alpha_1$). The system matrices $Z_t$, $d_t$, $H_t$, $T_t$, $c_t$, $R_t$ and $Q_t$ are deterministic or, at most, depend on the past of $Y_t$. In the latter case, [20], Sec. 3.7, refers to Eq.(A.1.1) as a conditionally Gaussian state space model.
For a given time series of size $n$ and any time instants $t, j \in \{1, 2, \dots, n\}$, define $\mathcal{F}_j \equiv \sigma(Y_1, \dots, Y_j)$, $a_{t|j} \equiv E(\alpha_t \mid \mathcal{F}_j)$ and $P_{t|j} \equiv Var(\alpha_t \mid \mathcal{F}_j)$. The Kalman filtering consists of recursive equations for these first- and second-order conditional moments, corresponding to one-step-ahead prediction ($j = t-1$) and smoothing ($j = n$). The formulae corresponding to prediction are given below:

$$
\begin{aligned}
\nu_t &= Y_t - Z_t a_{t|t-1} - d_t, & F_t &= Z_t P_{t|t-1} Z_t' + H_t, \\
K_t &= T_t P_{t|t-1} Z_t' F_t^{-1}, & L_t &= T_t - K_t Z_t, \\
a_{t+1|t} &= T_t a_{t|t-1} + c_t + K_t \nu_t, & P_{t+1|t} &= T_t P_{t|t-1} L_t' + R_t Q_t R_t',
\end{aligned}
\qquad t = 1, \dots, n. \qquad \text{(A.1.2)}
$$

The derivations of Eqs.(A.1.2) are found in [9]. There are several other references on this subject that deserve mentioning, such as [20], [21], [5], [6], [19] and [36].
In practice, system matrices include unknown parameters that must be estimated. By grouping all unknown parameters of the model described in (A.1.1) in a vector $\psi$, and denoting its corresponding parametric space by $\Psi$, one can obtain an exact log-likelihood function, using some outputs from Eqs.(A.1.2):

$$
\log L(\psi) = -\frac{np}{2} \log 2\pi - \frac{1}{2} \sum_{t=1}^{n} \left( \log |F_t| + \nu_t' F_t^{-1} \nu_t \right), \quad \psi \in \Psi. \qquad \text{(A.1.3)}
$$

The maximum likelihood estimator of $\psi$ is defined by $\hat{\psi} \equiv \arg\max_{\psi \in \Psi} \log L(\psi)$. When the normality assumption for $(\varepsilon_t', \eta_t')'$ is violated, Eq.(A.1.3) should be viewed as a quasi log-likelihood function and $\hat{\psi}$, in its turn, as a quasi maximum likelihood estimator.
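To make the prediction recursions and the log-likelihood concrete, the following Python sketch specializes Eqs.(A.1.2) and (A.1.3) to a time-invariant model with scalar observation and scalar state. It is an illustration under stated assumptions and not a translation of the MATLAB code of Section A.4: the function name is hypothetical, $R$ is fixed at 1, and the filter is initialized at the stationary mean and variance of the state, which requires $|T| < 1$.

```python
import math

def kalman_loglik(y, T, Q, H=0.0, Z=1.0, c=0.0, d=0.0):
    """Prediction recursions (A.1.2) and log-likelihood (A.1.3), scalar case.

    Time-invariant system with R = 1; requires |T| < 1 so that the
    stationary mean and variance can initialize the filter.
    """
    a = c / (1.0 - T)              # a_1: unconditional state mean
    P = Q / (1.0 - T * T)          # P_1: unconditional state variance
    loglik = 0.0
    std_resid = []
    for obs in y:
        v = obs - Z * a - d        # innovation nu_t
        F = Z * P * Z + H          # innovation variance F_t
        K = T * P * Z / F          # Kalman gain K_t
        L = T - K * Z
        a = T * a + c + K * v      # a_{t+1|t}
        P = T * P * L + Q          # P_{t+1|t}
        loglik -= 0.5 * (math.log(2.0 * math.pi) + math.log(F) + v * v / F)
        std_resid.append(v / math.sqrt(F))
    return loglik, std_resid

# AR(1) spread observed without measurement error (H = 0), hypothetical data.
ll, resid = kalman_loglik([0.5, -1.0, 0.25], T=0.5, Q=2.0)
```

With $H = 0$ and $Z = 1$ the recursions reproduce the exact Gaussian AR(1) likelihood: the first observation is evaluated under the stationary distribution and each later one under its one-step-ahead conditional distribution. The second output contains the standardized residuals used in the diagnostics of Chapter 5.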
A.2 Proof of Proposition 1

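From the second equation of Eqs.(3.2.1), it follows that

$$x_t = a + (1-b)x_{t-1} + C\varepsilon_t \equiv a + Bx_{t-1} + \tilde{\varepsilon}_t,$$

where $\tilde{\varepsilon}_t \sim N(0, C^2)$. Therefore,

$$(1-BL)x_t = x_t - Bx_{t-1} = a + \tilde{\varepsilon}_t,$$

leading to

$$x_t = \frac{1}{(1-BL)}\,a + \frac{1}{(1-BL)}\,\tilde{\varepsilon}_t = \frac{a}{(1-B)} + \frac{1}{(1-BL)}\,\tilde{\varepsilon}_t, \qquad \text{(A.2.1)}$$

where $L$ is the usual lag operator (recall: $0 < b < 2$). Now place Eq.(A.2.1) in the first equation of Eqs.(3.2.1) to get

$$S_t = \frac{a}{(1-BL)} + \frac{1}{(1-BL)}\,\tilde{\varepsilon}_t + D\omega_t = \frac{a}{(1-BL)} + \frac{1}{(1-BL)}\,\tilde{\varepsilon}_t + \tilde{\omega}_t, \qquad \text{(A.2.2)}$$

where $\tilde{\omega}_t \sim N(0, D^2)$. Applying the operator $(1-BL)$ on both sides of Eq.(A.2.2),

$$\tilde{S}_t \equiv (1-BL)S_t = a + \tilde{\varepsilon}_t + \tilde{\omega}_t - B\tilde{\omega}_{t-1}. \qquad \text{(A.2.3)}$$

From Eq.(A.2.3), and since $\tilde{\varepsilon}_t$ and $\tilde{\omega}_t$ are independent, it is straightforward to see that

$$\gamma(0) = C^2 + (1 + B^2)D^2, \quad \gamma(1) = -BD^2, \quad \gamma(k) = 0, \; k \geq 2, \qquad \text{(A.2.4)}$$

where $\gamma(k) = Cov(\tilde{S}_t, \tilde{S}_{t-k})$, $k = 1, 2, \dots$ From Eqs.(A.2.4) and [5], p.89, Prop.3.2.1, it follows that $\tilde{S}_t \sim MA(1)$.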
A.3 Proof of Proposition 2

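We have to prove that the likelihood functions from the original and augmented models are equal; in other words, that $L = L^{*}$ over all the parametric space, where $L$ corresponds to the original model and $L^{*}$ to the augmented one. For such, it is sufficient to show that $\nu_t = \nu_t^{*}$ for each $t = 1, \dots, n$. Notice that

$$\nu_t = y_t - Z_t a_{t|t-1} - d_t \quad \text{and} \quad \nu_t^{*} = y_t - Z_t^{*} a_{t|t-1}^{*} - d_t^{*}, \qquad \text{(A.3.1)}$$

where $a_{t|t-1}^{*} \equiv E(\alpha_t^{*} \mid \mathcal{F}_{t-1})$ and $d_t^{*} = d_t$. Besides, under the augmented model in Eqs.(4.2.1), the recursive solution for the measurement equation, for an arbitrary $s = 1, \dots, t-1$, is

$$
Y_s = \begin{bmatrix} Z_s & 0 \end{bmatrix} \left\{ \prod_{j=1}^{s-1} T_j^{*}\, \alpha_1^{*} + \sum_{j=1}^{s-2} \prod_{k=j+1}^{s-1} T_k^{*} \left( R_j^{*} \eta_j + c_j^{*} \right) + R_{s-1}^{*} \eta_{s-1} + c_{s-1}^{*} \right\} + d_s + \varepsilon_s, \qquad \text{(A.3.2)}
$$

where $\alpha_1^{*}$ is an initial state vector with appropriate 1st and 2nd moments and, to save space, the augmented system matrices are written as

$$
T_j^{*} = \begin{bmatrix} T_j & 0 & \cdots & 0 \\ I & 0 & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ 0 & \cdots & I & 0 \end{bmatrix}, \qquad
R_j^{*} = \begin{bmatrix} R_j \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \qquad
c_j^{*} = \begin{bmatrix} c_j \\ 0 \\ \vdots \\ 0 \end{bmatrix}.
$$

Now, observe that

$$
\prod_{j=1}^{s-1} T_j^{*} =
\begin{bmatrix}
\prod_{j=1}^{s-1} T_j & 0 & \cdots & 0 \\
\prod_{j=1}^{s-1} T_{j+1} & 0 & \cdots & 0 \\
\prod_{j=1}^{s-1} T_{j+2} & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
T_{s-1} & 0 & \cdots & 0
\end{bmatrix}
\qquad \text{(A.3.3)}
$$

and

$$
\sum_{j=1}^{s-2} \prod_{k=j+1}^{s-1} T_k^{*} \left( R_j^{*} \eta_j + c_j^{*} \right) =
\sum_{j=1}^{s-2}
\begin{bmatrix}
\prod_{k=j+1}^{s-1} T_k \,(R_j \eta_j + c_j) \\
\prod_{k=j+1}^{s-1} T_{k+1} \,(R_j \eta_j + c_j) \\
\vdots \\
T_{s-1} (R_j \eta_j + c_j) \\
(R_j \eta_j + c_j)
\end{bmatrix}.
\qquad \text{(A.3.4)}
$$

Placing (A.3.3) and (A.3.4) properly in (A.3.2) implies

$$
Y_s = Z_s \left\{ \left[ \prod_{j=1}^{s-1} T_j \right] \alpha_1 + \sum_{j=1}^{s-2} \left[ \prod_{k=j+1}^{s-1} T_k \right] (R_j \eta_j + c_j) + c_{s-1} + R_{s-1} \eta_{s-1} \right\} + d_s + \varepsilon_s, \qquad \text{(A.3.5)}
$$

which coincides with the recursive solution of the measurement equation from the original model (A.1.1). Finally, combine Eq.(A.3.5) and Eq.(A.3.1) to obtain the result.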
A.4 Matlab Code
function [para, sumll] = AJUSTEKFAR1(r,k,p,delta,prob)
% Estimates the AR(1) state space model by maximum likelihood and runs the back-test.
dados = importdata('SPREAD_GG_1ANO.csv');
spread = dados.data;
tic
Y = spread;
n = length(Y);
pos = n-252+1;              % keep the last 252 observations
Y = Y(pos:n,1:3);
%plot(Y);
%legend('Observed spread');
para0 = [0,0];
[x,fval,exitflag,output] = fminunc(@loglikAR1,para0,optimset('Display','iter',...
    'MaxFunEvals',5000,'HessUpdate','bfgs'),Y,r);
para = x
sumll = fval
exitflag
output
phi = 1/(1+exp(para(1)));
sigeta = exp(para(2));
%FKAR1(para,Y,nc,r);
FKAR1_Trading(para,Y,r,k,n,p,delta,prob);
toc
end
function sumll = loglikAR1(para0,Y,r)
% Objective function minimized by fminunc, evaluated at the initial guess para0.
n = length(Y);
phi = 1/(1+exp(para0(1)));
sigeta = exp(para0(2));
Z = [1 zeros(1,r-1)];
T = [phi; eye(r-1) zeros(r-1,1)];
c = zeros(r,1);
R = [1; zeros(r-1,1)];
Q = sigeta;
% Initial state mean and covariance
at = pinv((eye(r)-T))*c;
Pt = reshape(pinv((eye((r)^2)-kron(T,T)))*vec2mat(R*Q*R',1),r,r);
ll = zeros(n,1);
for t=1:n
    Ft = Z*Pt*Z';
    Kt = T*Pt*Z'*inv(Ft);
    inov = Y(t,1) - Z*at;
    at = T*at + Kt*inov + c;
    Lt = T - Kt*Z;
    Pt = T*Pt*Lt' + R*Q*R';
    ll(t) = 0.5*(log(Ft) + (inov^2)/(Ft));
end
sumll = sum(ll);
end
function [at,Pt,Ft] = FKAR1_Trading(para,Y,r,k,n,p,delta,prob)
% p : posio do ncio do backtesting

%n : tamanho da amostra
%r : ordem do modelo AR
%K : nmero de passos
phi = 1/(1+exp(para(1)));
sigeta = exp(para(2));
flag = 0; % Ativa o tipo de estratgia
% Matrizes do Sistema Modelo Aumentado
Z = [1 zeros(1,k1)];
T = [phi zeros(1,k1); eye(r+k2) zeros(k1,1)];
c = zeros(r+k1,1);
R = [1; zeros(k1,1)];
Q = sigeta;
APPENDIX
A.4.
MATLAB CODE
a1 = pinv((eye(r+k1)T))*c;
P1 =reshape(pinv((eye((r+k1)^2)kron(T,T)))*vec2mat(R*Q*R',1),r+k1,r+k1);
% Dimenso dos vetores de saida
%RECURSOES DE KALMAN
%Condio Inicial
% Matrizes de Outputs
%yup guarda as probabilidades de subir
%ydown guarda as probabilidades de cair
%cota guarda as cotas da estratgia e dos principais benchmarking
yup = zeros(np,4);
ydown = zeros(np,4);
cota = zeros(n(p1),3);
cota(1,:)=100;
% Cota inicial
for s=p:n
at = zeros(r+k1,1,s+k+1);
Pt = zeros(r+k1,r+k1,s+k+1);
at(:,:,1)= a1;
Pt(:,:,1)= P1 ;
for t=1:s+k
55
56
APPENDIX A.
APPENDIX
if (t<=s)
Ft = Z*Pt(:,:,t)*Z';
Kt = T*Pt(:,:,t)*Z'*inv(Ft);
inov= Y(t,1) Z*at(:,:,t);
Lt= T Kt*Z;
at(:,:,t+1) = T*at(:,:,t) + Kt*inov +c;
Pt(:,:,t+1) = T*Pt(:,:,t)*Lt'+ R*Q*R';
else
at(:,:,t+1) = T*at(:,:,t)+c;
Pt(:,:,t+1) = T*Pt(:,:,t)*T'+ R*Q*R';
end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Estrategia %%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% de %%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Trading %%%%%%%%%%%%%%%%%%%%%%%%
% c (ponto de ativao da estratgia)
% delta desvio em relao ao equilibrio de longo prazo (c).
X=zeros(1,k);
c=0;
A.4.
MATLAB CODE
if flag == 0
contdown =0;
contup=0;
%Probabilidade de Cair
ydown(tp(k1),1) = Y(s,1);
ydown(tp(k1),2) = 1 mvncdf(X,at(:,:,s+k)',Pt(:,:,s+k));
ydown(tp(k1),3) = c + delta;
yup(tp(k1),1) = Y(s,1) ;
yup(tp(k1),2) = 1 mvncdf(X,at(:,:,s+k)',Pt(:,:,s+k));
yup(tp(k1),3) = c + delta;
cota(tp(k2),1)= cota(tp(k1),1)*(1+Y(s,2));
if (ydown(tp(k1),1) >= c + delta && ydown(tp(k1),2)> prob)

flag =1;
contdown = contdown +1;
ydown(tp(k1),4)= contdown;
if contdown ~= 1
%
cota(tp(k2),1)= cota(tp(k1),1)*(1+(Y(s1,1) Y(s,1)));
57
58
APPENDIX A.
APPENDIX
%
%
end
elseif
(yup(tp(k1),1) <= c delta
&& yup(tp(k1),2)> prob )
% Probabilidade de Subir
flag=2;
contup = contup + 1;
yup(tp(k1),4)= contup;
if contup ~= 1
%
cota(tp(k2),1)= cota(tp(k1),1)*(1 + (Y(s,1) Y(s1,1)));
%
%
end
end
elseif (flag == 1)
contdown
= contdown +1;
%ydown(tp(k1),2) = 1 mvncdf(X,at(:,:,s+k)',Pt(:,:,s+k));
ydown(tp(k1),3) = c;
yup(tp(k1),1) = Y(s,1) ;
%yup(tp(k1),2) = 1 mvncdf(X,at(:,:,s+k)',Pt(:,:,s+k));
        yup(t-p-(k-1),3) = c;
        if contdown <= k
            if ydown(t-p-(k-1),1) <= c
                % Unwind the strategy
                flag = 0;
            else
                % Keep the strategy open: the spread has not reverted yet
                % and the number of days held does not exceed k.
            end
        else
            flag = 0;
        end
    else
        contup = contup + 1;
        yup(t-p-(k-1),1) = Y(s,1);
        %yup(t-p-(k-1),2) = 1 - mvncdf(X,at(:,:,s+k)',Pt(:,:,s+k));
        yup(t-p-(k-1),3) = c;
        %ydown(t-p-(k-1),2) = 1 - mvncdf(X,at(:,:,s+k)',Pt(:,:,s+k));
        ydown(t-p-(k-1),3) = c;
        if contup <= k
            if yup(t-p-(k-1),1) >= c
                % Unwind the strategy
                flag = 0;
            else
                % Keep the strategy open: the spread has not reverted yet
                % and the number of days held does not exceed k.
            end
        else
            flag = 0;
        end
    end
end
plot(cota(:,1),'blue');
hold on
plot(cota(:,2),'green');
hold on
plot(cota(:,3),'red');
legend('P/L','CDI','Ibovespa');
cota
yup
ydown
end
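
The recursions in the listing above can be checked by hand on a toy case. The following is a minimal sketch, not the thesis code: it assumes a scalar AR(1) spread observed without noise, and the parameter values (phi, q), the four observations in y and the horizon k are made up for illustration. Under these assumptions the filter reduces to a(t+1) = c + phi*y(t), and the h-step-ahead variance has the closed form q*(1 - phi^(2h))/(1 - phi^2), which the assertions verify. The last lines compute a one-step marginal reversion probability with erfc; this is a simplification of the joint k-variate probability that the listing obtains from mvncdf.

```matlab
% Toy check of the Kalman recursions (illustrative values, not thesis estimates).
% Model (assumed): S(t+1) = c + phi*S(t) + eta(t), eta ~ N(0,q), y(t) = S(t).
phi = 0.8;  c = 0;  q = 0.04;        % hypothetical AR(1) parameters
T = phi;  Z = 1;  R = 1;  Q = q;     % scalar system "matrices"
y = [0.30; 0.22; 0.35; 0.41];        % hypothetical observed spreads
s = numel(y);                        % number of observations
k = 3;                               % forecast horizon

a = zeros(1, s+k+1);                 % predicted state means
P = zeros(1, s+k+1);                 % predicted state variances
a(1) = c/(1 - phi);                  % stationary mean as initial condition
P(1) = q/(1 - phi^2);                % stationary variance as initial condition

for t = 1:s+k
    if t <= s                        % observation available: filter
        F = Z*P(t)*Z';               % innovation variance
        K = (T*P(t)*Z')/F;           % Kalman gain
        v = y(t) - Z*a(t);           % innovation
        L = T - K*Z;
        a(t+1) = T*a(t) + K*v + c;
        P(t+1) = T*P(t)*L' + R*Q*R';
    else                             % beyond the sample: predict only
        a(t+1) = T*a(t) + c;
        P(t+1) = T*P(t)*T' + R*Q*R';
    end
end

% With no observation noise the one-step forecast is c + phi*y(s) with variance q.
assert(abs(a(s+1) - (c + phi*y(s))) < 1e-12);
assert(abs(P(s+1) - q) < 1e-12);

% h-step-ahead variance of an AR(1): q*(1 - phi^(2h))/(1 - phi^2).
for h = 1:k
    assert(abs(P(s+h) - q*(1 - phi^(2*h))/(1 - phi^2)) < 1e-12);
end

% Marginal probability that the next spread is at or below the equilibrium c,
% using the normal CDF written with erfc (base MATLAB, no toolbox needed).
m  = a(s+1);
sd = sqrt(P(s+1));
pRevert = 0.5*erfc(-(c - m)/(sd*sqrt(2)));
fprintf('One-step-ahead mean %.4f, sd %.4f, P(S <= c) = %.4f\n', m, sd, pRevert);
```

Run as a script, the assertions pass silently and the final line prints the one-step-ahead forecast together with the marginal probability. Using the division by F instead of inv(F) is a numerical preference and does not change the recursion.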
Bibliography
[1] AIT-SAHALIA, Y. and HANSEN, L. (2010). Handbook of Financial Econometrics. 2nd
edition. Springer-Verlag.
[2] ANDERSON, B. D. O. and MOORE, J. B. (1979). Optimal Filtering. Prentice Hall.
[3] ATHERINO, R., PIZZINGA, A. and FERNANDES, C. (2010). A row-wise stacking of
the runoff triangle: state space alternatives for IBNR reserve prediction. Astin Bulletin,
vol.40(2), p.917-946.
[4] AVELLANEDA, M. and LEE, J. H. (2010). Statistical arbitrage in the US equities market. Quantitative Finance, vol.10(7), p.761-782.
[5] BROCKWELL, P. J. and DAVIS, R. A. (1991). Time Series: Theory and Methods. 2nd
edition. Springer-Verlag.
[6] BROCKWELL, P. J. and DAVIS, R. A. (2003). Introduction to Time Series and Forecasting.
2nd edition. Springer Texts in Statistics.
[7] DREZNER, Z. (1994). Computation of the Trivariate Normal Integral. Mathematics of
Computation, vol.63, p.289-294.
[8] DREZNER, Z. and WESOLOWSKY, G. O. (1989). On the Computation of the Bivariate
Normal Integral. Journal of Statistical Computation and Simulation, vol.35, p.101-107.
[9] DURBIN, J. and KOOPMAN, S. J. (2001). Time Series Analysis by State Space Methods.
Oxford Statistical Science Series.
[10] ELLIOTT, R. J. and KRISHNAMURTHY, V. (1999). New finite-dimensional filters for
parameter estimation of discrete-time linear Gaussian models. IEEE Transactions on
Automatic Control, vol.44, p.938-951.
[11] ELLIOTT, R. J., VAN DER HOEK, J. and MALCOLM, W. P. (2005). Pairs Trading.
Quantitative Finance, vol.5(3), p.271-276.
[12] ENDERS, W. (2004). Applied Econometric Time Series. 2nd edition. John Wiley & Sons.
[13] ENGELBERG, J., GAO, P. and JAGANNATHAN, R. (2009). An anatomy of Pairs trading: the role of idiosyncratic news, common information and liquidity. Third Singapore
International Conference on Finance.
[14] ENGLE, R. and GRANGER, C. (1987). Co-Integration and Error Correction: Representation, Estimation, and Testing. Econometrica, vol.55, n.2, p.251-276.
[15] FOUQUE, J. P., PAPANICOLAOU, G. and SIRCAR, K. R. (2000). Derivatives in Financial
Markets with Stochastic Volatility. Cambridge University Press.
[16] GATEV, E., GOETZMANN, W. and ROUWENHORST, K. (2006). Pairs Trading: Performance of a Relative Value Arbitrage Rule. The Review of Financial Studies, vol.19,
p.797-827.
[17] GENZ, A. (1992). Numerical Computation of Multivariate Normal Probabilities. Journal
of Computational and Graphical Statistics, vol.1, p.141-149.
[18] GENZ, A. (2004). Numerical Computation of Rectangular Bivariate and Trivariate Normal
and t Probabilities. Statistics and Computing, vol.14, No. 3, p.251-260.
[19] HAMILTON, J. D. (1994). Time Series Analysis. Princeton University Press.
[20] HARVEY, A. C. (1989). Forecasting, Structural Time Series Models and The Kalman
Filter. Cambridge University Press.
[21] HARVEY, A. C. (1993). Time Series Models. 2nd edition. Harvester Wheatsheaf.
[22] KAUFMAN, P. J. (2005). New Trading Systems and Methods. Fourth Edition. John Wiley
& Sons, Inc.
[23] KORN, R. and KORN, E. (2001). Option Pricing and Portfolio Optimization: Modern
Methods of Financial Mathematics (Graduate Studies in Mathematics). AMS.
[24] KUPIEC, P. (1995). Techniques for Verifying the Accuracy of Risk Management Models.
Journal of Derivatives, vol.3, p.73-84.
[25] JACOBS, B. and LEVY, K. (2005). Market Neutral Strategies. John Wiley & Sons.
[26] JOHNSON, B. (2010). Algorithmic Trading & DMA: An introduction to direct access trading
strategies. 4Myeloma Press.
[27] MACKINNON, J. G. (2010). Critical Values for Cointegration Tests. Queen's Economics
Department Working Paper No. 1227. Queen's University.
[28] NICHOLAS, J. G. (2000). Market-Neutral Investing: Long/Short Hedge Fund Strategies.
Bloomberg Press.
[29] PERLIN, M. S. (2007). Evaluation of Pairs Trading Strategy at the Brazilian Financial
Market. Journal of Derivatives & Hedge Funds, vol.15, p.122-136.
[30] POLE, A. (2007). Statistical Arbitrage: Algorithmic Trading Insights and Techniques. John
Wiley & Sons, Inc.
[31] RODRIGUEZ, A. and RUIZ, E. (2009). Bootstrap Prediction Intervals in State Space
Models. The Journal of Time Series Analysis, vol.30, No. 2.
[32] ROSS, S. (1976). The Arbitrage Theory of Capital Asset Pricing. Journal of Economic
Theory, vol.13, p.341-360.
[33] RAMPERTSHAMMER, S. (2007). An Ornstein-Uhlenbeck Framework for Pairs Trading.
Department of Mathematics and Statistics, the University of Melbourne, Unpublished Note.
[34] SHARPE, W. F. (1966). Mutual Fund Performance. The Journal of Business, vol.39,
p.119-138.
[35] SHARPE, W. F. (1994). The Sharpe Ratio. The Journal of Portfolio Management, vol.21,
p.49-58.
[36] SHUMWAY, R. H. and STOFFER, D. S. (2006). Time Series Analysis and Its Applications
(With R Examples). 2nd edition. Springer.
[37] TRIANTAFYLLOPOULOS, K. and MONTANA, G. (2009). Dynamic modeling of mean-reverting spreads for statistical arbitrage. Computational Management Science, vol.8,
p.23-49.
[38] TSAY, R. S. (2010). Analysis of Financial Time Series. 3rd edition, John Wiley & Sons,
Inc.
[39] VIDYAMURTHY, G. (2004). Pairs Trading, Quantitative Methods and Analysis. John
Wiley & Sons, Inc.
[40] WALL, K. and STOFFER, S. (2002). A State Space Approach to Bootstrapping Conditional Forecasts in ARMA Models. The Journal of Time Series Analysis, vol.23, No. 6.

