
Deep Learning for Predictions in Emerging Currency Markets

Svitlana Galeshchuk1,2 and Sumitra Mukherjee3


1Department of Accounting and Audit, Ternopil National Economic University, Ternopil, Ukraine
2Laboratoire d'Informatique de Grenoble, Université Grenoble Alpes, Grenoble, France
3College of Engineering and Computing, Nova Southeastern University, Fort Lauderdale, U.S.A.

Keywords: Neural Networks, Deep Learning, Convolution Networks, Exchange Rate Prediction, Emerging Markets.

Abstract: Accurate prediction of exchange rates is critical for devising robust monetary policies. Machine learning
methods such as shallow neural networks have higher predictive accuracy than time series models when
trained on input features carefully crafted by domain knowledge experts. This suggests that deep neural
networks, with their ability to learn abstract features from raw data, may provide improved predictive
accuracy with raw exchange rates as inputs. The preponderance of research focuses on developed currency
markets. The paucity of research in emerging currency markets, and the crucial role that stable currencies
play in such economies, motivates us to investigate the effectiveness of deep networks for exchange rate
prediction in emerging markets. Literature suggests that the Efficient Market Hypothesis, which posits that
asset prices reflect all relevant information, may not hold in such markets because of extraneous factors
such as political instability and governmental interventions. This motivates our hypothesis that inclusion of
carefully chosen macroeconomic factors as input features may improve the predictive accuracy of deep
networks in emerging currency markets. This position paper proposes novel input features based on
currency clusters and presents our method for investigating the hypothesis using exchange rates from
developed as well as emerging currency markets.

1 INTRODUCTION

Transactions worth billions of dollars a day take place in the foreign exchange market, making it one of the largest financial markets in the world (Report on global foreign exchange market activity in 2013). Exchange rates are expressed in terms of a base-quote currency pair that represents the number of units of quote currency that may be exchanged for each unit of the base currency. Accurate prediction of forex rates is critical for formulating robust monetary policies and developing effective trading and hedging strategies in the foreign exchange market (Lukas and Taylor, 2007).

Econometric models are not effective for exchange rate predictions when the forecast horizon is less than a year (Meese and Rogoff, 1983). Time series models are poor at predicting the direction of change in rates. Shallow artificial neural networks and support vector machines perform marginally better when using carefully crafted input features; significant efforts by domain experts may be needed to obtain such features from raw input data.

The recent success of deep neural networks in a variety of domains may be partially attributable to their ability to learn abstract features from raw data (LeCun et al., 2015). This suggests that deep networks may be effective in predicting foreign exchange rates based on raw time series data.

Our first objective is to investigate whether deep neural networks are significantly better at foreign exchange rate prediction than time series models and shallow networks when raw exchange rate data are used as input features. Our preliminary results using exchange rates between the US dollar and three major currencies in mature markets (Euro, British Pound, and Japanese Yen) suggest that deep convolution networks do perform better than extant methods.

The preponderance of research in foreign exchange prediction focuses on established markets. In response to the paucity of research in emerging currency markets, and in recognition of the fact that stable currency markets play a crucial role in determining the well-being of such economies, our second objective is to adapt deep network models for predicting exchange rates in emerging markets.

Galeshchuk S. and Mukherjee S.
Deep Learning for Predictions in Emerging Currency Markets.
DOI: 10.5220/0006250506810686
In Proceedings of the 9th International Conference on Agents and Artificial Intelligence (ICAART 2017), pages 681-686
ISBN: 978-989-758-220-2
Copyright © 2017 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
ICAART 2017 - 9th International Conference on Agents and Artificial Intelligence

As representative emerging markets we consider countries in the Eastern Partnership (EaP). The Eastern Partnership is an initiative of the European Union that aims to foster improved economic relationships with the post-Soviet states of Armenia, Azerbaijan, Belarus, Georgia, Moldova, and Ukraine. Improved macroeconomic conditions in the EaP countries are a pre-condition for their economic integration with the European Union. Research suggests that currency market stability is one of the most important indicators of sustainable development and growth in these economies and that accurate prediction of exchange rates is critical to the formulation of robust monetary policies. This lends further impetus to our study of developing improved models for exchange rate prediction in emerging markets.

Literature suggests that the Efficient Market Hypothesis, which posits that asset prices reflect all relevant information, may not hold in emerging markets because of extraneous factors such as political instability and governmental interventions. This motivates our hypothesis that inclusion of carefully chosen macroeconomic factors as input features may improve the predictive accuracy of deep networks in emerging currency markets. An ancillary goal of this study is to develop a novel set of input features that are obtained by forming clusters of currency markets based on distance metrics derived from correlation measures.

The roadmap for the remainder of this position paper is as follows: Section 2 formally defines the exchange rate prediction problem. Section 3 briefly discusses the related literature. Section 4 describes our proposed methodology. Section 5 concludes with some observations.

2 THE PREDICTION PROBLEM

We use a standard formulation of the exchange rate prediction problem where our goal is to predict the direction of change. Let $x_t$ and $x_{t+k}$ denote the values of an exchange rate between a pair of currencies in periods $t$ and $t+k$, respectively, for some $k > 0$. Define the direction of change $y_t(k) = 1$ if the rate increases in $k$ periods, i.e. if $x_{t+k} - x_t > 0$; otherwise, $y_t(k) = 0$. Our objective is to learn a function $f: \mathbb{R}^n \to \{0,1\}$ such that $f(x_t, x_{t-1}, \dots, x_{t-n+1}) = y_t(k)$. We train models to predict the direction of change. Let $\hat{y}_t(k) = g(x_t, x_{t-1}, \dots, x_{t-n+1})$ be the predicted direction of change $k$ periods forward, where $g$ is a function learnt by a model. A $k$ period forward prediction model is evaluated by its classification accuracy on out-of-sample observations, where classification accuracy is defined as the percentage of test cases for which the predicted direction of change $\hat{y}_t(k)$ equals the true direction of change $y_t(k)$.

3 RELATED WORK

Exchange rate prediction methods may be categorized into econometric methods, time series models, and machine learning techniques. We review these approaches briefly and then discuss deep neural networks.

3.1 Econometric Models

Econometric models predict exchange rates based on economic factors. The Mundell-Fleming model (1962), Dornbusch's (1976) asset-market approach to exchange rates, and New Keynesian models are examples of such models; a good survey can be found in Engel (2013). These models are widely used by central bankers around the world. However, research indicates that these models are not effective when the prediction horizon is less than a year (Neely and Sarno, 2002). Meese and Rogoff (1983) demonstrated that such models fail to outperform a random walk in out-of-sample predictions, and their findings are still widely accepted.

3.2 Time Series Models

An excellent survey of time series forecasting models can be found in Box et al. (2015). Autoregressive Integrated Moving Average (ARIMA) models and Exponential Smoothing (ETS) models are the most commonly used time series models for foreign exchange rate prediction. ARIMA models can deal with non-stationary data by differencing transformations and subsume autoregressive models and moving average models as special cases. ETS models are non-stationary and can capture trends and seasonality. Time series models may provide satisfactory point estimates for exchange rates, but the direction of change implied by these estimates is often a poor indicator of the true direction.
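The gap between point accuracy and directional accuracy noted in Section 3.2 can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the rates and forecasts are invented, and the helper names (`implied_direction`, `direction_accuracy`, `mean_abs_error`) are hypothetical and not taken from any package used in this study. A forecast implies an upward move when it lies above the current rate, and classification accuracy is the share of cases where the implied and true directions agree, as defined in Section 2.

```python
from typing import List, Sequence


def implied_direction(current: Sequence[float], later: Sequence[float]) -> List[int]:
    """Return 1 where the later (or forecast) rate is above the current rate, else 0."""
    return [1 if b > a else 0 for a, b in zip(current, later)]


def direction_accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Share of test cases in which predicted and true directions agree."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


def mean_abs_error(forecast: Sequence[float], realised: Sequence[float]) -> float:
    """Average absolute gap between point forecasts and realised rates."""
    return sum(abs(f - r) for f, r in zip(forecast, realised)) / len(realised)


# Invented example: current rates, realised rates k periods later,
# and hypothetical point forecasts that are close in level.
x_t = [1.1000, 1.1010, 1.1005, 1.1020]
x_tk = [1.1010, 1.1005, 1.1020, 1.1030]
x_hat = [1.0995, 1.1012, 1.1000, 1.1015]

y_true = implied_direction(x_t, x_tk)   # true directions of change
y_pred = implied_direction(x_t, x_hat)  # directions implied by the forecasts
mae = mean_abs_error(x_hat, x_tk)
acc = direction_accuracy(y_pred, y_true)
print(f"MAE={mae:.6f}, directional accuracy={acc:.2f}")
```

In this contrived case the forecasts miss the realised rates by less than 0.002 on average yet imply the wrong direction every time, which is why the study evaluates models on direction rather than on point error.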


3.3 Artificial Neural Networks

Artificial neural networks (ANNs) with a single hidden layer often outperform time series models in providing point estimates for exchange rates, as demonstrated in Dunis et al. (2011), Thinyane and Millin (2011), Nag (2002), and Galeshchuk (2016). However, the direction of change implied by these point estimates is often unacceptably inaccurate. This renders these methods less useful as a basis for formulating monetary policies. This further motivates us to investigate the ability of deep networks to predict the direction of change in forex rates.

3.4 Deep Neural Networks

Deep learning techniques, originally introduced by Ivakhnenko (1971) and then Hinton (2002, 2006), have been successfully applied in a variety of domains including face detection (Osadchy et al., 2013), speech recognition (Sukittanon et al., 2004), object recognition (Schmidhuber, 2005), document categorization (Hinton and Salakhutdinov, 2006), and natural language processing (Lee et al., 2009). Deep learning networks have also been used for time series predictions (Busseti et al., 2012; Langkvist et al., 2014) and for financial predictions (Ribeiro and Noel, 2011; Chao et al., 2011; Yeh et al., 2014; Lai et al.). Restricted Boltzmann machines and auto-encoders have been used for dimensionality reduction and unsupervised pre-training. Applications are discussed in Larochelle et al. (2009), Masci et al. (2011), and Vincent et al. (2007).

Deep convolution networks (DN) are attractive for high dimensional prediction and classification problems (LeCun et al., 2015). DNs are suitable for exchange rate prediction for two main reasons. First, high level features abstracted by the network may serve as noise filters, and dimensionality reduction techniques may help abstract input features. Second, the temporally local correlation between consecutive observations may be exploited to reduce the number of parameters to be estimated in the network by connecting only a small number of adjacent inputs to each unit in a hidden layer.

Our work is motivated by results from experiments to compare the accuracy of deep networks with baseline models (ARIMA, ETS, and ANN) in predicting the direction of change of exchange rates for EUR/USD, GBP/USD, and USD/JPY (Galeshchuk and Mukherjee, 2017). Results demonstrate that trained deep networks achieve better out-of-sample prediction accuracy than baseline methods.

Units in a DN receive inputs from small contiguous receptive fields that collectively cover the entire set of input features. This allows units to act as local filters and to exploit local correlation between contiguous inputs. Units share weights and bias parameters to create a feature map. This not only results in a significant reduction in the number of parameters to be estimated but also facilitates detection of features irrespective of their actual position in the input field. The reduction in the number of parameters may be very significant as the number of layers in the network and the number of units in each layer increase.

Recurrent neural networks are an effective class of neural network designed to handle sequence dependence. Stacked Long Short-Term Memory (LSTM) is a type of recurrent neural network used in deep learning which makes effective use of model parameters, converges quickly, and outperforms deep feed forward neural networks. For this reason, it is often used for time-series predictions. Adapted for dimensionality reduction and unsupervised pre-training tasks, LSTMs have been successfully used for unsupervised extraction of abstract input features for prediction problems. The approach has also proved effective in financial predictions.

4 METHODOLOGY

In this section we describe the data sets to be used in this study, discuss additional features to be used for prediction in emerging markets, present baseline models including shallow neural networks, and describe our deep convolution networks.

4.1 Data Sets

For developed currency markets, we use the daily closing rates for three currency pairs to train and test our models: Euro and US Dollar (EUR/USD), British Pound and US Dollar (GBP/USD), and US Dollar and Japanese Yen (USD/JPY). The rates may be downloaded from: http://www.global-view.com/forex-trading-tools/forex-history/. Data for the years 2000 to 2015 are considered. For emerging currency markets we use the exchange rates of EaP countries to the US Dollar: AZN/USD, AMD/USD, BYR/USD, MDL/USD, UAH/USD, and GEL/USD. For each data set we train models for daily, monthly, and quarterly predictions.
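To make the link between a raw rate series and the training examples concrete, the following Python sketch turns a series into input windows of the n most recent rates and direction-of-change labels for a horizon of k periods, following the formulation in Section 2. It is a minimal sketch under stated assumptions: `make_windows` and `chronological_split` are hypothetical helper names, the rates are invented, and the chronological train/test split is our assumption about how out-of-sample evaluation could be arranged rather than a description of the study's procedure.

```python
from typing import List, Sequence, Tuple


def make_windows(rates: Sequence[float], n: int, k: int) -> Tuple[List[List[float]], List[int]]:
    """Build (window, label) pairs from a rate series.

    Each window holds the n most recent rates up to period t (oldest first).
    Its label is 1 if the rate k periods later exceeds the rate at t, else 0.
    """
    windows: List[List[float]] = []
    labels: List[int] = []
    for t in range(n - 1, len(rates) - k):
        windows.append(list(rates[t - n + 1 : t + 1]))
        labels.append(1 if rates[t + k] > rates[t] else 0)
    return windows, labels


def chronological_split(items: list, train_fraction: float = 0.75) -> Tuple[list, list]:
    """Keep the earliest observations for training and the latest for testing."""
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


# Invented closing rates; the sampling frequency (daily, monthly, quarterly)
# determines what one "period" means for the horizon k.
rates = [1.10, 1.12, 1.11, 1.13, 1.15, 1.14, 1.16]
X, y = make_windows(rates, n=3, k=1)
X_train, X_test = chronological_split(X)
y_train, y_test = chronological_split(y)
print(len(X), y)
```

The same function can be applied to a daily, monthly, or quarterly series; only the meaning of one period changes. Splitting by time rather than at random keeps future observations out of the training set.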


4.2 Input Macroeconomic Features

In order to provide better exchange rate prediction at the macroeconomic level, researchers develop monetary models of exchange rates based on fundamental economic data. We will include indicators of the real sector (GDP growth, unemployment, wages), the current and capital accounts (current account balance, openness as the ratio of total imports and exports to GDP), public and private foreign debt, capital flows, the ratio of international reserves to three months of imports, and international variables (interest rates and price ratios). Some additional factors that may need to be considered include money growth, fiscal growth, and a measure for the degree of political instability and market liberalization.

Improved exchange rate prediction models are particularly challenging to develop in volatile emerging markets with political instability, as is the case in EaP economies. The EU is the main economic partner of EaP states. Financial markets of EaP countries and Russia are still highly coupled through trade and political relationships in the post-Soviet period. The high co-volatility of these markets requires us to identify distinct patterns of linkages among European, EaP, and Russian markets. Furthermore, contagion effects of crises are widely observed, as local currency deterioration worsens macroeconomic indicators in trading partners.

The core currencies in the EU-EaP-Russia area will be modelled as a network. The correlation between these exchange rates will be computed for a selected time horizon. We will use a 3 month horizon since payments in international trade are made within up to 90 days. Then, each correlation coefficient in the correlation matrix of the N markets will be mapped to a metric distance between pairs of indices to form an N×N distance matrix with values ranging between 0 and 1. This distance matrix will be used to construct a minimal spanning tree (MST) in a fully connected graph where the vertices represent the currencies and the arc lengths are inversely proportional to the strength of the correlations between the currencies. Clusters will be formed by removing the longest edges of the MST. Strongly correlated currencies are connected by short links and belong to the same cluster; unrelated currencies connected by longer links belong to different clusters. This will provide insights regarding the pattern by which currency crises spread in the EaP economies and permit us to investigate synchronization among the currency markets in the EaP area.

4.3 Baseline Models

We use a random walk model, two time series models (ARIMA and ETS), and a single layered neural network as baseline models. The time series models provide point estimates for the rates. We predict output class $\hat{y}_t(k) = 1$ if $\hat{x}_{t+k} > x_t$, and 0 otherwise, where $\hat{x}_{t+k}$ is the point estimate for the rate $k$ periods ahead and $x_t$ is the current rate. The predicted direction of change $\hat{y}_t(k)$ is compared with the actual direction of change $y_t(k)$. Results for ARIMA and ETS are obtained using the auto.arima model and the ets model from the R library forecast with default parameters (Hyndman and Khandakar, 2008).

A neural network model with a single hidden layer will also be used in our study as a baseline model. The units have sigmoid transfer functions and use gradient descent and backpropagation for training. The model is trained on vectors with features $x_t, x_{t-1}, \dots, x_{t-n+1}$ as inputs and $x_{t+k}$ as output to predict a point estimate $\hat{x}_{t+k}$ for the $k$ period forward rate. As in the case of the time series models, we predict the output class $\hat{y}_t(k) = 1$ if $\hat{x}_{t+k} > x_t$, and 0 otherwise, to compare the actual and predicted directions of change. Results are obtained using the R package nnet. Model parameters are tuned through cross-validation by performing a grid search over the parameter ranges using the tune function from the R package e1071. For details of these packages, see https://cran.r-project.org/web/packages/nnet/nnet.pdf and https://cran.r-project.org/web/packages/e1071/e1071.pdf.

4.4 Deep Convolution Network

The deep convolution network has layers of hidden units separating the input layer from the output unit. We use $b_j^l$ to denote the internal bias of the $j$th unit in the $l$th layer and $w_{ij}^l$ to represent the weight of the connection to that unit from the $i$th unit in the $(l-1)$th layer. For an input vector $x$, the output of the $j$th unit in the $l$th layer is computed as $h_j^l(x) = \phi(a_j^l)$, where $a_j^l = b_j^l + \sum_i w_{ij}^l h_i^{l-1}(x)$, and $\phi(a) = \max(0, a)$ is the rectified linear unit function. The output uses a softmax transfer function. The Adam optimizer (Kingma and Ba, 2015) is used to minimize a cross-entropy loss function. The open source library TensorFlow is used to create the DN models (https://www.tensorflow.org/).

4.5 Stacked Long Short-term Memory

We intend to use a Stacked Long Short-Term Memory (LSTM) deep network with mechanisms for
exchange-rate prediction in this experiment. An LSTM network is a type of recurrent neural network used in deep learning because very large architectures can be successfully trained.

The output value of a recurrent neural network (Galeshchuk, 2014) can be formulated as:

$$y = F\left(\sum_{j=1}^{m} v_j h_j - T\right),$$

$$h_j = G\left(\sum_{i=1}^{n} w_{ij} x_i + \sum_{q=1}^{m} w_{qj}\, p_q(t-1) + w_{3j}\, y(t-1) - T_j\right)$$

where $F$, $G$ are logistic activation functions, $m$ is the number of neurons in the hidden layer, $v_j$ is the weight coefficient from the $j$-th neuron of the hidden layer to the output neuron, $h_j$ is the output value of the $j$-th neuron of the hidden layer, $T$ is the threshold of the output neuron, $n$ is the number of neurons in the input layer, $w_{ij}$ are the weight coefficients from the $i$-th input neuron to the $j$-th neuron of the hidden layer, $x_i$ are the input values, $T_j$ are the thresholds of the neurons of the hidden layer, $w_{qj}$ is the synapse from the $q$-th context neuron of the hidden layer to the $j$-th neuron of the same (hidden) layer, $p_q(t-1)$ is the output value of the $q$-th context neuron of the hidden layer at the previous moment of time $t-1$, $w_{3j}$ is the synapse from the context output neuron to the $j$-th neuron of the hidden layer, and $y(t-1)$ is the value of the context output neuron at the previous moment of time $t-1$.

For the version of LSTM used, the hidden layer function $\mathcal{H}$ is implemented by the following composite function (see Graves et al., 2013):

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i)$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f)$$
$$c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o)$$
$$h_t = o_t \tanh(c_t)$$

where $\sigma$ is the logistic sigmoid function, and $i$, $f$, $o$, $c$ are respectively the input gate, forget gate, output gate and cell activation vectors, all of which are the same size as the hidden vector $h$.

5 CONCLUSIONS

This position paper outlines our approach for developing improved models for exchange rate prediction using deep neural networks. The ability of deep networks to learn abstract features from raw data motivates this approach. Preliminary results confirm that our deep network produces significantly higher predictive accuracy than the baseline models for developed currency markets. We now plan to adapt this model for exchange rate prediction in emerging currency markets by including macroeconomic factors as input features. A novel set of input features based on currency clusters may help improve the predictive accuracy of such models. This study will be among the first to integrate information about market liberalization and political stability with macroeconomic indicators and time-series data on exchange rates and transaction volume. Inclusion of these factors as predictors should improve predictive accuracy for exchange rates, especially in emerging markets.

REFERENCES

Box G. E. P., Jenkins G. M., Reinsel G. C., Ljung G. M. 2015. Time Series Analysis: Forecasting and Control, 5th Edition, Wiley.
Busseti E., Osband I., Wong S. 2012. Deep Learning for Time Series Modeling. CS 229 Final Project Report.
Chao J., Shen F., Zhao J. 2011. Forecasting Exchange Rate with Deep Belief Networks. Proceedings of International Joint Conference on Neural Networks, San Jose, California, USA.
Dornbusch R. 1976. Exchange Rate Expectations and Monetary Policy. Journal of International Economics 6 (3): 231-244.
Dunis C. L., Laws J., Sermpinis G. 2011. Higher order and recurrent neural architectures for trading the EUR/USD exchange rate. Quantitative Finance 11 (4): 615-629.
Engel C. 2013. Exchange rates and interest parity. National Bureau of Economic Research: 77.
Fleming J. M. 1962. Domestic financial policies under fixed and floating exchange rates. IMF Staff Papers 9: 369-379.
Galeshchuk S. 2016. Neural networks performance in exchange rate prediction. Neurocomputing 172: 446-452.
Galeshchuk S. 2014. Neural-based method of measuring exchange-rate impact on international companies' revenue. In Distributed Computing and Artificial Intelligence, 11th International Conference. Springer International Publishing: 529-536.
Galeshchuk S., Mukherjee S. 2016. Deep Networks for Predicting Direction of Change in Foreign Exchange Rates. Intelligent Systems in Accounting, Finance and Management: early view papers.
Graves A., Mohamed A. R., Hinton G. 2013. Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing: 6645-6649. IEEE.
Hinton G. E. 2002. Training products of experts by minimizing contrastive divergence. Neural Comput. 14: 1771-1800.
Hinton G. E., Osindero S., Teh Y. 2006. A fast learning
algorithm for deep belief nets. Neural Computation 18: 1527-1554.
Hinton G. E., Salakhutdinov R. 2006. Reducing the dimensionality of data with neural networks. Science 313 (5786): 504-507.
Hyndman R. J., Khandakar Y. 2008. Automatic time series forecasting: the forecast package for R. Journal of Statistical Software 27 (3): 1-22. http://ideas.repec.org/a/jss/jstsof/27i03.html.
Kingma D. P., Ba J. L. 2015. Adam: a Method for Stochastic Optimization. International Conference on Learning Representations: 1-13.
Lai A., Li M. K., Pong F. W. Forecasting Trade Direction and Size of Future Contracts Using Deep Belief Network. Stanford University.
Langkvist M., Karlsson L., Loutfi A. 2014. A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recognition Letters 42: 11-24.
Larochelle H., Bengio Y., Louradour J., Lamblin P. 2009. Exploring strategies for training deep neural networks. The Journal of Machine Learning Research 10: 1-40.
LeCun Y., Bengio Y., Hinton G. 2015. Deep Learning. Nature 521: 436-444.
LeCun Y., Bottou L., Bengio Y., Haffner P. 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86 (11): 2278-2324.
Lee H., Largman Y., Pham P., Ng A. 2009. Unsupervised feature learning for audio classification using convolutional deep belief networks. Advances in Neural Information Processing Systems 22.
Lukas M., Taylor M. 2007. The Obstinate Passion of Foreign Exchange Professionals: Technical Analysis. Journal of Economic Literature 45 (4): 936-972.
Masci J., Meier U., Ciresan D., Schmidhuber J. 2011. Stacked Convolutional Auto-Encoders for Hierarchical Feature Extraction. Lecture Notes in Computer Science 6791: 52-59.
Meese R., Rogoff K. 1983. The Out-of-Sample Failure of Empirical Exchange Rate Models: Sampling Error or Misspecification? NBER Chapters, in Exchange Rates and International Macroeconomics: 67-112.
Mundell R. A. 1963. Capital mobility and stabilization policy under fixed and flexible exchange rates. Canadian Journal of Economic and Political Science 29 (4): 475-485.
Nag A. 2002. Forecasting daily foreign exchange rates using genetically optimized neural networks. Journal of Forecasting 21 (7): 501-511.
Neely C., Sarno L. 2002. How well do monetary fundamentals forecast exchange rates? Federal Reserve Bank of St. Louis Working Paper Series: 2002-2007.
Osadchy M., LeCun Y., Miller M. 2013. Synergistic face detection and pose estimation with energy-based models. Journal of Machine Learning Research 8: 1197-1215.
Report on global foreign exchange market activity in 2013. April 2013. Triennial Central Bank Survey. Basel, Switzerland: Bank for International Settlements. http://www.bis.org/publ/rpfx13fx.pdf.
Ribeiro B., Noel L. 2011. Deep Belief Networks for Financial Prediction. Proceedings of ICONIP 2011, Part III, LNCS 7064: 766-773.
Schmidhuber J. 2005. Deep Learning in Neural Networks: An Overview. Neural Networks: 85-117.
Simard P. Y., Steinkraus D., Platt J. C. 2003. Best
Sukittanon S., Surendran A. C., Platt J. C., Burges C. J. 2004. Convolutional networks for speech detection. Interspeech: 1077-1080.
Thinyane H., Millin J. 2011. An investigation into the use of intelligent systems for currency trading. Computational Economics 37 (4): 363-374.
Vincent P., Larochelle H., Bengio Y., Manzagol P. Extracting and Composing Robust Features with Denoising Autoencoders. Proceedings of the 25th International Conference on Machine Learning (ICML 08): 1096-1103.
Wagner N., Michalewicz Z., Khouja M., McGregor R. R. 2007. Time Series Forecasting for Dynamic Environments: The DyFor Genetic Program Model. Trans. Evol. Comp. 11 (4): 433-452.
Xiao R. 2014. Deepnet: deep learning toolkit in R. R package version 0.2. http://CRAN.R-project.org/package=deepnet.
Yeh S-H., Wang C. J., Tsai M. F. 2014. Corporate Default Prediction via Deep Learning. ISF.
