Event-Driven LSTM For Forex Price Prediction
Abstract: The majority of studies in the field of AI-guided financial trading
focus purely on applying machine learning algorithms to continuous historical
price and technical analysis data. However, due to the non-stationary and
highly volatile nature of the Forex market, most algorithms fail when put into
real practice. We developed novel event-driven features which indicate a
change in trend direction. We then build deep learning models to predict a
retracement point, which provides an ideal entry point for maximum profit. We
use a simple recurrent neural network (RNN) as our baseline model and compare
it with long short-term memory (LSTM), bidirectional long short-term memory
(BiLSTM) and gated recurrent unit (GRU) models. Our experimental results show
that the proposed event-driven feature selection, together with the proposed
models, can form a robust prediction system which supports accurate trading
strategies with minimal risk. Our best model on 15-minute interval data for
the EUR/GBP currency pair achieved RME 0.006 × 10^-3, RMSE 2.407 × 10^-3,
MAE 1.708 × 10^-3 and MAPE 0.194%, outperforming previous studies.

Keywords: Deep neural network, LSTM, GRU, Machine-learning techniques,
Feature engineering, Financial prediction, Foreign exchange, Technical
analysis

I. INTRODUCTION

Forex Trading (FX) is the largest financial market in the world, consisting
of multiple international participants, including professionals and
individuals, who invest and speculate for profit because of its high
liquidity. An automated system which could predict correct entry and exit
points would help investors generate considerable profit with minimum risk.
In recent years, machine learning has been used by many academics to study
the exchange rate market. A summary of research in this field from 2009 to
2015 is provided in [1].

As a type of non-stationary time-series data, financial trading data is
highly volatile and complex. Technical analysis can smooth out noise and help
to identify trends, and has become more popular in trading research. In
addition, based on the "history repeating itself" theory, analysts believe
that patterns underlying historical data will repeat in the future, so being
able to identify historical price movements is important for predicting
future price trends. Therefore, the use of historical data and technical
indicators is required for effective examination of trends that occur within
desired trading windows. The literature shows that the vast majority of work
in this field relies on historical data and technical analysis. This can be
seen in [2], where a change of direction in the FX market was predicted using
the closing price and the moving average individually, with the moving
average outperforming the closing price. More examples can be found in [3],
[4], [5], [6], [7].

In this paper we use an alternative method to handle the noisy and chaotic
environment of high-frequency trading data. In 1935, R.N. Elliott introduced
“Elliott Wave Theory” [8]. One of its main components is to identify
different waves in financial data. These include impulse waves that set up a
pattern, and corrective waves that oppose the larger trend. The
identification of such waves helps us to discover the correct entry and exit
trading points, which would ultimately generate the most profit. Fig. 1 is an
example of a typical Elliott Wave. We can see that the peak or trough point
“e1” and the retracement point “e3” are the best market entry points, and
“e4” is the best market exit point.

Fig. 1 Example of Elliott Waves (left is an uptrend, right is a downtrend)

The determination of peak or trough points within a certain window requires
future price information. We therefore introduce a moving average crossover
event (“e2”) as confirmation of the formation of point “e1”, and consider
point “e3” as our prediction target to enter the market. Point “e4” could be
another prediction target, to exit the market; however, we do not cover that
in this paper. Unlike other research that uses all historical data, we select
the data at “e2” and go back “n” timesteps as training data, while ignoring
any other data which does not suggest a reliable trading opportunity. The
selected data is therefore more compact and carries more relevant
information.

We developed an LSTM model to predict the price at the retracement point
“e3”. Technical indicators are used as features. We used four currency pairs
for experimentation. A simple RNN is used as the baseline model. The results
confirm that the designed architecture outperforms the baseline.

Fig. 2 Example of ZigZag

The remainder of this paper is structured as follows: Section II discusses
the related work. Section III describes details of feature and training data
selection. Section IV explains the model architecture, followed by experiment
details in Section V. Our
results appear in Section VI and Section VII presents our conclusion.

II. RELATED WORK

Forecasting FX trends is challenging due to random fluctuations in price
caused by market uncertainties such as political conditions, regulatory
policy changes, economic conditions, banking operations and capital
movements. Researchers have therefore tried a variety of approaches in this
field. In order to gauge the direction of our study and the suitability of
our own algorithm in generating reliable and valid results, we compare and
discuss recent work in market prediction.

A. Technical analysis indicators

Technical analysis (TA) indicators are commonly used features in Forex
prediction, and many papers confirm the effectiveness of TA indicators in
Forex forecasts. [9] analysed the performance of six simple machine learning
models in predicting a binary classification of price movements (up and down)
for the USD/JPY currency pair. Nine features were generated from raw data
based on technical indicators (MA, RSI and WR). [10] introduced an SVM to
predict future price trend directions, using seventy indicators derived from
technical analysis as features. The proposed model achieved an 81% accuracy
rate in forecasting future price trends.

Due to the rapid growth of computational power in recent years, a wide range
of neural networks have been applied to Forex trading prediction. However,
most papers conducting neural network research still use historical price
data and technical indicators as features for training. In 2011, [11]
forecast FX markets by feeding price and technical indicators into a neural
network system. Similarly, [12] predicted future prices using popular
technical indicators: RSI, CCI, MACD and ROC. That paper focused more on
building a trading agent that feeds predicted prices into a decision module,
and its experimental results showed the model to be effective. [13] proposed
a Convolutional Neural Network (CNN) model to forecast monthly and weekly
price trends. The features chosen were still technical indicators; however,
they were used in conjunction with exchange rates, commodity prices and world
indices. [13] achieved a 65% accuracy rate for monthly price trends and 60%
for weekly price trends. [14] used nine indicators, each with two different
parameters, together with a close price line chart as input for the CNN
models. [15] used 5-, 10- and 20-period moving average (MA5, MA10 and MA20)
line charts as input images to build a CNN to predict weekly price movements.
The target movements were classified as up, down and non-movement based on a
movement percentage threshold. More examples can be seen in [16], [17], [18].

B. Long Short-Term Memory (LSTM)

In recent years, LSTM has gained popularity in time series prediction due to
its capability to remember previous inputs and its ability to prevent
information from the past being lost. [19] introduced a feature fusion
LSTM-CNN model for forecasting stock prices which takes the characteristics
of both chart image data and time series data. The proposed model first used
an SC-CNN model to extract hidden patterns in stock chart images, then used
an ST-LSTM model to work on close prices and trading volumes. Kim and Kim’s
study tested performance on four different stock chart images and found that
candlestick charts are the best stock chart images for predicting stock
prices, ahead of bar charts, line charts and filled-line charts. This study
also could not prevent the lagging issue that occurred in other LSTM stock
price predictions. The authors suggested adding noise-cancelling methods such
as an autoencoder or wavelet transformation, or adding technical indicators
to the image, to achieve better performance. [20] took a similar approach,
extracting features with a CNN and then feeding the CNN outputs to an LSTM to
predict stock prices. [16] introduced an LSTM-based agent to learn the
temporal pattern in data and developed an automatic trading system based on
historical price data and current market conditions. [21] is another example
of using LSTM to predict stock prices.

LSTM and its variations have proven their effectiveness in time series
forecasting and have dominated the financial time series forecasting domain
in recent years. The survey in [22] lists more than 80 publications in the
last 5 years.

III. FEATURE AND TRAINING DATA SELECTION

A. A sequence of events

As mentioned previously, the sequence of events is derived from Elliott Wave
Theory to determine the peaks and troughs of an uptrend or downtrend. We then
incorporate the ZigZag indicator from technical analysis. ZigZag is a popular
technical indicator which identifies peaks and troughs by finding the highest
high or the lowest low within a certain period. ZigZag allows traders to
observe price movements holistically and avoid market noise from small
movements. The ZigZag point (e1) is the first event of the sequence.

The ZigZag indicator is defined by three parameters: Depth, Deviation and
Backstep. “Depth” is an integer which requires that a candidate peak cannot
have a higher high, and a candidate trough cannot have a lower low, within
“Depth” periods of the candidate. “Deviation” refers to the amount of
deviation (in pips) that is required to identify a new peak (from a trough)
or a new trough (from a peak). Lastly, “Backstep” refers to the minimum
number of periods required between adjacent peaks and troughs. Fig. 2 is an
example of a ZigZag line.

ZigZag points are transition points within a certain time window and could be
ideal entry or exit trading points to make a profit. However, the calculation
of ZigZag requires future price information [19], so we introduce a second
event which can confirm the ZigZag point.

The second event we detect is a moving average crossover event (e2). A moving
average is a calculation which produces a series of averages of the price
points of an instrument over a specified time frame. There are two commonly
used moving averages: the simple moving average (SMA) and the exponential
moving average (EMA). The EMA is a weighted average of the last N prices,
where the weighting decreases exponentially with each previous price/period,
so the EMA gives greater weight to more recent prices. Its formula is as
follows:

$$\mathrm{EMA}(t) = \mathrm{Price}(t) \times k + \mathrm{EMA}(y) \times (1 - k)$$

where t is today, y is yesterday, N is the number of days in the EMA, and
$k = \frac{2}{N+1}$.

A moving average crossover occurs when two or more moving average lines cross
over each other. Amongst traders
this is interpreted as a signal that a change in trend is occurring. In
trading, the crossover point is often used as a trigger for a trading action,
either to enter (buy or sell) or exit (sell or buy) the market.

Lastly, we detect a retracement point (e3), which is also our prediction
target. A retracement is any temporary reversal in price within a major price
trend. A retracement can be a confirmation of a trend. It can also help
traders identify whether the current trend is likely to continue or whether a
reversal is taking place. The right retracement point is the location for
traders to enter the market, giving them the potential for good profits at
minimum risk. In this paper we aim to predict the price at the retracement
point.

An example of a sequence of the above three events is shown in Fig. 3.

B. Feature extraction

In addition to the above events, technical indicators are also used as
features. A technical indicator is a mathematical calculation based on
historical price, volume, or (in the case of futures contracts) open interest
information that aims to forecast financial market direction. Technical
indicators are normally used in conjunction with other techniques, such as
the occurrence of a sequence of events, to determine the next trade action.
We generated 28 technical indicators with the TA-Lib package for our models.
The 28 indicators are derived from six types of indicators with different
window sizes, namely Moving Average Convergence Divergence (MACD), Simple
Moving Average (SMA), Relative Strength Index (RSI), Average Directional
Index (ADX), Bollinger Bands and Williams %R (WR). All these indicators are
non-volume based due to the difficulty of collecting reliable volume data.

C. Training data selection

In order to capture price movements of time series data, most researchers
choose whole continuous datasets as training data. To reduce noise from
highly volatile high-frequency trading data, we select the data at each
crossover event (e2) and go back n timesteps for training. Our prediction
target is the price at the retracement point e3.

The feature processing and training data selection method is shown in
Fig. 4.

IV. MODEL ARCHITECTURE

An RNN is a generalization of a feedforward neural network which is capable
of processing sequences of data one element at a time while retaining an
internal memory. Because of its recurrent nature, an RNN applies the same
function to every input and uses the output from the previous input together
with the current timestep data as the input to that function. The recurrence
keeps a memory of the previous state, which in principle allows the network
to learn long-term dependencies in a sequence and take the entire context
into account when making a prediction. Fig. 5 depicts the RNN architecture.

Although an RNN has the capability to predict sequences of data, it suffers
from vanishing and exploding gradient problems. LSTM is a modified version of
the RNN which addresses the vanishing gradient problem. The LSTM architecture
is composed of a cell and three gates: an input gate, an output gate and a
forget gate. Fig. 6 describes the LSTM architecture.

Fig. 6 Long Short-Term Memory Architecture
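As a rough illustration of the cell-and-three-gates structure described above, the sketch below implements one forward step of an LSTM cell in plain Python and runs it over a window of n timesteps. It follows the common textbook formulation, which is an assumption here because the text does not specify the exact variant. The names `lstm_step`, `run_lstm` and `init_params`, the hidden size of 8, the window length of 10 and the random weights are all hypothetical and chosen only for the example; the 28 input features mirror the number of technical indicators mentioned in Section III.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, U, b, x, h):
    """Return W.x + U.h + b, with matrices given as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + b_k
        for w_row, u_row, b_k in zip(W, U, b)
    ]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step (textbook formulation, assumed here).

    params maps "i", "f", "o", "g" (input gate, forget gate, output gate,
    candidate cell value) to a tuple (W, U, b).
    """
    pre = {name: affine(W, U, b, x_t, h_prev)
           for name, (W, U, b) in params.items()}
    i = [sigmoid(z) for z in pre["i"]]    # input gate: how much new content to write
    f = [sigmoid(z) for z in pre["f"]]    # forget gate: how much old cell state to keep
    o = [sigmoid(z) for z in pre["o"]]    # output gate: how much cell state to expose
    g = [math.tanh(z) for z in pre["g"]]  # candidate cell content
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t


def init_params(n_features, hidden, seed=0):
    """Small random weights and zero biases; illustrative values only."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    return {name: (mat(hidden, n_features), mat(hidden, hidden), [0.0] * hidden)
            for name in ("i", "f", "o", "g")}


def run_lstm(window, params, hidden):
    """Feed a window of feature vectors through the cell; return the last hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x_t in window:
        h, c = lstm_step(x_t, h, c, params)
    return h


# Hypothetical window: n = 10 timesteps before a crossover event, 28 features each.
n_steps, n_features, hidden = 10, 28, 8
data_rng = random.Random(1)
window = [[data_rng.gauss(0.0, 1.0) for _ in range(n_features)]
          for _ in range(n_steps)]
h_last = run_lstm(window, init_params(n_features, hidden), hidden)
print(len(h_last))
```

In a full model the last hidden state would typically be passed to a dense output layer to regress the target price, and the weights would be learned rather than drawn at random; a deep learning library would normally replace this hand-written cell.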