Statistical Arbitrage and High-Frequency Data
with an Application to Eurostoxx 50 Equities
May 2010
Authors:
Christian L. Dunis
Gianluigi Giorgioni
Jason Laws
Jozef Rudy
Corresponding author and presenter:
Jozef Rudy
Liverpool John Moores University
Outline
• Motivation
• Data used
– Data provider
– 2 types of data: HF and daily
– In- and out-of-sample periods
• Methodology
– Pair trading system
– Calculation of adaptive parameters
– Entry and exit points, stoploss
• Preliminary out-of-sample results
– Average trading results for all 176 pairs
• Further analysis
– Relation between in-sample information ratio, t-stat and out-of-sample information ratio
• Final results
– Results for 5 best pairs based on in-sample information ratio, t-stat
– Comparison with benchmarks
Motivation
• Recent poor performance (see Gatev et al., 2006) of market-neutral strategies (see Vidyamurthy, 2004, for an introduction)
• Technique developed in the 1980s by Wall Street quant Nunzio Tartaglia; now well known (Alexander and Dimitriu, 2002; Burgess, 2003)
• Most trading ideas are well known across Wall Street; the practical implementation and the choice of parameters make each strategy “unique” (Chan, 2009)
• Application of a pair trading strategy to HF and daily equity data, and comparison of the results (Nath, 2003)
Data used
• Eurostoxx 50 Equities:
– Daily data: 3rd Jan 2000 – 17th Nov 2009
– Intraday data: 3rd Jul 2009 – 17th Nov 2009
• Various intraday intervals: 5, 10, 20, 30 and 60 minutes
• Each share from 1 of 10 sectors: Basic Materials,
Communications, Consumer Cyclical, Consumer Non-
cyclical, Diversified, Energy, Financial, Industrial,
Technology and Utilities
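• In- and out-of-sample periods (out-of-sample spans as given on the benchmark comparison slide):
– HF data: out-of-sample 10 Sep – 17 Nov 2009
– Daily data: out-of-sample 1 Jan – 17 Nov 2009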
Methodology I
• Pair trading model: $z_t = P_{Y,t} - \beta_t P_{X,t}$, with pairs formed only from shares in the same industry
• Alternative approaches for beta calculation:
– fixed beta (calculated by OLS)
– moving window beta (calculated by rolling OLS)
– Double exponential smoothing-based prediction model (DESP)
– Kalman filter with constant system and observation noise variances (Bentz, 2003)
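As a rough illustration of two of the beta estimators listed above, the sketch below computes a rolling-window OLS beta and a scalar Kalman-filter beta for a spread of the form z[t] = P_Y[t] - beta[t] * P_X[t]. It assumes a regression without intercept and a random-walk beta. The function names, the `window` and `noise_ratio` parameters and the starting values `beta0` and `p0` are illustrative assumptions, not the authors' implementation.

```python
def rolling_ols_beta(py, px, window):
    """No-intercept OLS beta of py on px over a trailing window.

    Returns a list aligned with the inputs; entries before the first
    full window are None.
    """
    betas = [None] * len(py)
    for t in range(window - 1, len(py)):
        xs = px[t - window + 1 : t + 1]
        ys = py[t - window + 1 : t + 1]
        sxx = sum(x * x for x in xs)
        sxy = sum(x * y for x, y in zip(xs, ys))
        betas[t] = sxy / sxx
    return betas


def kalman_beta(py, px, noise_ratio, beta0=1.0, p0=1.0):
    """Scalar Kalman filter for py[t] = beta[t] * px[t] + eps[t],
    with beta[t] = beta[t-1] + eta[t].

    noise_ratio = Var(eta) / Var(eps) is held constant (assumed tuning
    knob); the observation variance is normalised to 1.
    """
    beta, p = beta0, p0
    betas = []
    for y, x in zip(py, px):
        p = p + noise_ratio            # predict: state variance grows
        innovation = y - beta * x      # one-step-ahead forecast error
        s = x * x * p + 1.0            # innovation variance
        k = p * x / s                  # Kalman gain
        beta = beta + k * innovation   # update the beta estimate
        p = (1.0 - k * x) * p          # update the state variance
        betas.append(beta)
    return betas


def spread(py, px, betas):
    """z[t] = py[t] - beta[t] * px[t]; None where beta is unavailable."""
    return [None if b is None else y - b * x
            for y, x, b in zip(py, px, betas)]
```

A larger `noise_ratio` lets the Kalman beta adapt faster at the price of a noisier estimate, which is the same trade-off a shorter rolling window makes for OLS.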
Methodology II
• Parameters optimized by a genetic algorithm:
– Rolling OLS: length of the rolling window
– DESP: smoothing parameter and number of look-ahead periods
– Kalman filter: signal-to-noise ratio (system noise / observation noise)
• Genetic optimization algorithm:
– Objective: maximization of the in-sample information ratio
– Started with 100 generations
– Mutation and crossover allowed
– Only 6 randomly chosen pairs optimized; the resulting parameter values are used for all pairs
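The optimization step can be sketched roughly as follows for a single integer parameter such as the rolling-window length. This is a toy genetic algorithm under stated assumptions: the population size, truncation selection, averaging crossover, mutation step, elitism and the annualisation in `information_ratio` are all illustrative choices, since the slides only state the objective, the 100 generations and that mutation and crossover are allowed.

```python
import random
import statistics


def information_ratio(returns, periods_per_year=252):
    """Annualised mean return divided by annualised standard deviation
    (assumed definition; periods_per_year depends on the sampling interval)."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return statistics.mean(returns) / sd * periods_per_year ** 0.5


def genetic_search(fitness, low, high, pop_size=20, generations=100,
                   mutation_rate=0.2, seed=0):
    """Maximise fitness(n) over the integers low..high with a toy GA."""
    rng = random.Random(seed)
    population = [rng.randint(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]        # truncation selection
        children = [ranked[0]]                   # elitism: keep the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) // 2                 # averaging crossover
            if rng.random() < mutation_rate:     # mutation: small random step
                child += rng.randint(-5, 5)
            children.append(min(high, max(low, child)))
        population = children
    return max(population, key=fitness)
```

In use, `fitness` would be a hypothetical function that runs the pair trading rule in-sample for a given parameter value and returns `information_ratio` of the resulting returns.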
Methodology III
• Spread generated by the pair trading model:
– Normalized: $n_t = (z_t - \bar{z}) / s_z$
– $\bar{z}$ and $s_z$ (mean and standard deviation of $z_t$) calculated from the entire in-sample period
– Entry into the spread: $|n_t| > 2$
– Exit from the spread: $|n_t| < 0.5$
– Stop loss not used
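The entry and exit rules above can be sketched as a small state machine. The sketch assumes the sample standard deviation, that a spread above +2 is sold and a spread below -2 is bought (the usual mean-reversion convention, not stated on the slide), and that a position changes on the same bar as the signal. The function and parameter names are illustrative.

```python
import statistics


def normalize(spread, in_sample_len):
    """n[t] = (z[t] - mean) / stdev, with mean and stdev taken from the
    in-sample part of the spread only."""
    z_bar = statistics.mean(spread[:in_sample_len])
    s_z = statistics.stdev(spread[:in_sample_len])
    return [(z - z_bar) / s_z for z in spread]


def positions(n, entry=2.0, exit_=0.5):
    """Position in the spread at each step: +1 long, -1 short, 0 flat."""
    pos, out = 0, []
    for value in n:
        if pos == 0 and abs(value) > entry:
            pos = -1 if value > 0 else 1   # sell a rich spread, buy a cheap one
        elif pos != 0 and abs(value) < exit_:
            pos = 0                        # spread has reverted: close
        out.append(pos)
    return out
```

Because no stop loss is used, an open position is held until the normalized spread falls back inside the exit band, however far it moves in between.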
Methodology in practice
• Bayer AG and Arcelor Mittal pair sampled at a 20-minute interval
[Figure: two panels plotted against time. Left: normalized spread and positions. Right: cumulative equity curve (cumulative return, 0% to 35%).]
Costs of trading
• Trading costs one-way for both shares (long
and short): 0.3%
– Transaction costs: 0.2% (0.1% * 2)
– Bid-ask spread: 0.1% (0.05% * 2)
• Net return calculation:
$\mathrm{Ret}_t = \ln(P_{X,t} / P_{X,t-1}) - \ln(P_{Y,t} / P_{Y,t-1}) - TC$
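A minimal sketch of the net return calculation, using the 0.3% one-way cost from this slide (2 × 0.1% transaction costs plus 2 × 0.05% bid-ask spread). It assumes the cost is charged once when the pair is opened and once when it is closed, and that no return accrues while flat. The argument names `long_px`, `short_px` and `in_trade` are hypothetical.

```python
import math

ONE_WAY_COST = 0.003  # 0.3% one-way for both shares, as stated on the slide


def net_returns(long_px, short_px, in_trade, cost=ONE_WAY_COST):
    """Per-period net log return of a pair that is long one share and
    short the other.

    in_trade[t] is True if the pair is held over the period ending at t.
    The cost is deducted in every period where the holding state changes
    (assumed convention).
    """
    rets = []
    prev = False
    for t in range(1, len(long_px)):
        gross = 0.0
        if in_trade[t]:
            gross = (math.log(long_px[t] / long_px[t - 1])
                     - math.log(short_px[t] / short_px[t - 1]))
        tc = cost if in_trade[t] != prev else 0.0
        rets.append(gross - tc)
        prev = in_trade[t]
    return rets
```

A full round trip therefore costs 0.6% in total, which is why frequent trading at short sampling intervals is penalised heavily.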
Preliminary out-of-sample results
• Results for different approaches:
• Detailed results for the Kalman filter approach:
Some further analysis
• 95% confidence bounds for the correlation between in- and out-of-sample
information ratio
• 95% confidence bounds for the correlation between the in-sample t-stats
of the ADF test and out-of-sample information ratio
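The slide does not say how the 95% bounds were obtained. One common way to get such bounds is a percentile bootstrap over pairs, sketched below as an assumption rather than as the authors' procedure. Here `xs` would hold the in-sample statistic of each pair and `ys` its out-of-sample information ratio; the number of draws and the seed are arbitrary.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def bootstrap_corr_bounds(xs, ys, draws=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence bounds for the correlation."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    while len(stats) < draws:
        idx = [rng.randrange(n) for _ in range(n)]   # resample pairs jointly
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2 or len(set(by)) < 2:
            continue  # skip degenerate resamples with zero variance
        stats.append(pearson(bx, by))
    stats.sort()
    lo = stats[int((1 - level) / 2 * draws)]
    hi = stats[int((1 + level) / 2 * draws) - 1]
    return lo, hi
```

If the resulting interval lies entirely above or below zero, the in-sample statistic carries some information about out-of-sample performance at that confidence level.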
Results after further analysis I
• 5 best pairs based on the in-sample information ratios:
• 5 best pairs based on the in-sample t-stat of the ADF test:
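The selection of the 5 best pairs can be sketched as a simple ranking. The statistic below is a plain Dickey-Fuller t-statistic with a constant and no lagged differences, a simplification of the ADF test named on the slide. Treating the most negative t-stat as "best" is an assumption, as are the function names.

```python
def df_tstat(z):
    """t-statistic of rho in the regression dz[t] = c + rho * z[t-1] + e[t]."""
    x = z[:-1]
    y = [b - a for a, b in zip(z[:-1], z[1:])]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    sxy = sum((v - mx) * (w - my) for v, w in zip(x, y))
    rho = sxy / sxx
    c = my - rho * mx
    rss = sum((w - c - rho * v) ** 2 for v, w in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5   # standard error of rho
    return rho / se


def top_pairs(scores, k=5, lowest_first=False):
    """Names of the k best pairs by score.

    Use lowest_first=True for t-stats, where a more negative value is
    stronger evidence of mean reversion; leave it False for information
    ratios.
    """
    ranked = sorted(scores, key=scores.get, reverse=not lowest_first)
    return ranked[:k]
```

The t-statistic has to be compared with Dickey-Fuller critical values rather than the usual Student-t ones, but for ranking pairs only the ordering matters.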
Results after further analysis II
• 5 best pairs based on the in-sample t-stat (calculated from daily data)
• 5 best pairs based on the in-sample t-stat (calculated from daily frequency
data)
Comparison with benchmarks
• Comparison of portfolio of 5 best pairs with benchmarks:
– Using HF data in the out-of-sample period (10 Sep – 17 Nov 2009)
– Using daily data in the out-of-sample period (1 Jan – 17 Nov 2009)
References
• Alexander, C. and Dimitriu, A. (2002) The Cointegration Alpha: Enhanced Index
Tracking and Long-Short Equity Market Neutral Strategies. SSRN eLibrary,
http://ssrn.com/paper=315619
• Bentz, Y. (2003) Quantitative Equity Investment Management with Time-Varying
Factor Sensitivities. In Dunis, C., Laws, J. and Naïm, P. [eds.] Applied Quantitative
Methods for Trading and Investment. John Wiley & Sons, Chichester, 213-237.
• Burgess, A. N. (2003) Using Cointegration to Hedge and Trade International
Equities. In Dunis, C., Laws, J. and Naïm, P. [eds.] Applied Quantitative Methods for
Trading and Investment. John Wiley & Sons, Chichester, 41-69.
• Chan, E. (2009) Quantitative Trading: How to Build Your Own Algorithmic Trading
Business, John Wiley & Sons, Inc., New Jersey.
• Gatev, E., Goetzmann, W. N. and Rouwenhorst, K. G. (2006) Pairs Trading:
Performance of a Relative-Value Arbitrage Rule. The Review of Financial Studies.
19, 3, 797-827.
• Nath, P. (2003) High Frequency Pairs Trading with U.S. Treasury Securities: Risks
and Rewards for Hedge Funds. SSRN eLibrary, http://ssrn.com/paper=565441
Thank you for your attention