Article
Financial Time Series Forecasting: A Data Stream
Mining-Based System
Zineb Bousbaa 1, *,†,‡ , Javier Sanchez-Medina 2,‡ and Omar Bencharef 1,‡
1 Computer and System Engineering Laboratory, Faculty of Science and Technology, Cadi Ayyad University,
Marrakech 40000, Morocco; [email protected]
2 Innovation Center for the Information Society (CICEI), Campus of Tafira, University of Las Palmas de Gran
Canaria, 35017 Las Palmas de Gran Canaria, Spain; [email protected]
* Correspondence: [email protected]
† Current address: Faculty of Sciences and Technology, Cadi Ayyad University, Marrakesh 40000, Morocco.
‡ These authors contributed equally to this work.
Abstract: Data stream mining (DSM) represents a promising approach to forecasting financial time series such as exchange rates. Financial historical data generate several types of cyclical patterns that evolve, grow, decrease, and eventually die out. Within historical data, we can observe long-term, seasonal, and irregular trends. All these changes make traditional static machine learning models ill-suited to such study cases. The statistically unstable evolution of financial market behavior yields a progressive deterioration in any trained static model. Those models do not provide the characteristics required to evolve continuously and sustain good forecasting performance as the data distribution changes. Online learning without DSM mechanisms can also miss sudden or quick changes. In this paper, we propose a possible DSM methodology that copes with this instability by implementing an incremental and adaptive strategy. The proposed algorithm includes the online Stochastic Gradient Descent (SGD) algorithm, whose weights are optimized using the Particle Swarm Optimization (PSO) metaheuristic, to identify repetitive chart patterns in the FOREX historical data by forecasting the EUR/USD pair's future values. Trend changes are detected using a statistical technique that tests whether the received time series instances are stationary. The sliding window size is therefore minimized as changes are detected and maximized as the distribution becomes more stable.
Results, though preliminary, show that the model prediction is better using flexible sliding windows that adapt according to the detected distribution changes using stationarity, compared to learning with a fixed window size that does not incorporate any techniques for detecting and responding to pattern shifts.

Keywords: data stream mining; forex; online learning; adaptive learning; incremental learning; sliding window; concept drift; financial time series forecasting
Some studies limit their focus to forecasting future trends, while others go beyond that to
implement trading strategies that work on maximizing profit.
Among the difficulties of FTSERF, dealing with the data's volatile and chaotic nature is the biggest one. To learn from historical financial time series datasets, a model must be able to adapt to new patterns continuously. Within the data stream mining (DSM) context, this adaptivity is called reacting to detected changes. The task is challenging, as it requires both recognizing real changes and avoiding false alerts. There are many change detection techniques; Ref. [6] cites several, including the CUSUM test, the geometric moving average test, statistical tests, and drift detection methods.
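To make the idea concrete, the following is a minimal sketch of a CUSUM-style change detector, one of the techniques cited above; the slack and threshold values are illustrative only, not tuned for FOREX data.

```python
import numpy as np

def cusum_detect(values, reference_mean, slack=0.005, threshold=0.05):
    """Two-sided CUSUM: accumulate deviations from a reference mean and
    flag a change when either cumulative sum exceeds the threshold."""
    s_pos = s_neg = 0.0
    for t, x in enumerate(values):
        s_pos = max(0.0, s_pos + (x - reference_mean) - slack)
        s_neg = max(0.0, s_neg - (x - reference_mean) - slack)
        if s_pos > threshold or s_neg > threshold:
            return t  # index at which the change is flagged
    return None  # no change detected

# Simulated exchange rate with a level shift halfway through
rng = np.random.default_rng(0)
rates = np.concatenate([rng.normal(1.10, 0.002, 100),
                        rng.normal(1.14, 0.002, 100)])
print(cusum_detect(rates, reference_mean=1.10))  # flags shortly after index 100
```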
Our paper’s contribution is to show how integrating DSM techniques into the online
learning process can increase FTSERF performance. In our experimental study, we propose
the SGD algorithm optimized using the PSO. We managed the changes that occurred
in the financial time series trends by implementing a sliding window mechanism. The
sliding window size is flexible. It is minimized when a high fluctuation is detected and
maximized when the time series pattern is more or less stable. As a change detection
mechanism, we test for each sliding window the stationarity of the data stream within,
and based on the results, we decide to maintain or adapt the window size that will be
passed as input to our forecasting model. We compared running the learning process with a traditional algorithm versus one that integrates DSM techniques. The traditional algorithm combines the SGD with parameters periodically optimized using the PSO metaheuristic; the DSM version adds the adaptive sliding window mechanism and the statistical stationarity test.
The remainder of this paper is structured as follows. The next section presents a literature review concerning data mining and the application of DSM to FTSERF. Section 3 describes our dataset's components and its preprocessing, analysis, and input selection processes. Section 4 presents the proposed forecasting system components, illustrates its architecture and algorithm, and reports the experimental studies, the analysis of their results, and further discussion. Finally, some concluding remarks are made in Section 5.
2. Literature Review
2.1. Machine Learning Application to Financial Forecasting
2.1.1. Overview
The financial forecasting research field is very dynamic. Forecasting the value of financial assets attracts the interest of multiple profiles, from accountants, mathematicians, and statisticians to computer scientists. Moreover, when it comes to investing, investors with no scientific background integrate easily into the field. Tools like technical analysis are widely used, as they are easy to apply and have shown effectiveness in trading. Although financial asset markets first appeared in the 16th century (see Figure 1), the first structured approaches to the economic forecasting of these assets date from the last century [7–10].
Research employs methods ranging from mathematical models to statistical models like ARMA, suggested by Wold in 1938 by combining the AR and MA schemes [11], as well as macroeconomic models like Tinbergen's (1939) [12]. Since then, various accounting, mathematical, statistical, and machine learning models have been constructed to predict different kinds of financial assets, and their number has increased exponentially. One apparent reason for this increase is that the field is among the most highly funded, since financial investment generates profit. Funding comes from many sources, mainly big asset management companies and investment banks.
Figure 1. The chronology of financial asset appearance and the existing forecasting approaches for
their valuation [7–10].
Back in the 1980s, exploring historical data was more difficult since the amount of data
had grown, but computational power was still weak [13]. By the 1990s, more economic and
statistical models had appeared, and they showed better performance than a random walk.
Still, these models do not perform the same way over all the financial assets [14].
By the beginning of the 21st century, however, machine learning models flourished because computers had become faster and more capable. During this time, many hybrid algorithms were proposed, such as autoregressive integrated moving average (ARIMA) models and algorithms that combine neural networks with traditional time series methods. We also note the rapid, exponential growth in the number of papers in this research field from 1998 to 2016, covering a wide range of topics from credit rating to inflation forecasting and risk management, which has saturated the field and made finding innovative ideas more challenging [13].
On the other hand, scientists became more aware in the 1980s of how important it was
to process textual data. There were attempts to import other predictors developed from
linguistics by Frazier in 1984 [15]. More progress has been achieved, such as word spotting
using naïve statistical methods (Brachman and Khabaza 1996) [16]. Sentiment analysis
resources were proposed at the beginning of the 21st century (Hu and Liu 2004) [17].
Sentiment analysis is a critical component that employs both machine learning algorithms
and knowledge-based methodologies, particularly for analyzing the sentiments of social
media users [18]. From 2010 on, social media data increased exponentially, which attracted
the interest of the news analytics community in processing real-time data (Cambria et al.
2014) [13,19].
While reviewing some surveys that shed light on research in financial forecasting,
we find some surveys that suggest categorizing the proposed models in different ways.
The study in [20] distinguished between singular and hybrid models, which include non-
linear models such as artificial neural networks (ANN), support vector machines (SVMs),
particle swarm optimization (PSO), as well as linear models like autoregressive integrated
moving average (ARIMA), etc. Meanwhile, the study in [21] revealed fundamental and
technical analysis as the two approaches that are commonly used to analyze and predict
financial market behaviors. It also distinguished between statistical and machine learning
approaches, assuming that machine learning approaches deal better with complex, dynamic,
and chaotic financial time series. In addition, Ref. [21] points out that the profit analysis of
the suggested forecasting techniques in real-world applications is generally neglected. As a solution, they suggest a detailed process for creating a smart trading system, composed of data preparation, algorithm definition, training, forecasting evaluation, trading strategies, and money evaluation [21]. Papers can also be summarized based on
their primary goal, which could be preprocessing, forecasting, or text mining. They can
likewise be classified based on the nature of the dataset, whether qualitatively derived
from technical analysis or quantitatively retrieved from financial news, business financial
reports, or other sources [21]. In addition, Ref. [22] distinguished between parametric
statistical methods like Discriminant Analysis and Logistic Regression, Non-Parametric
Statistical Methods like Decision Tree and Nearest Neighbor, or Soft Computing techniques
like Fuzzy Logic, Support Vector Machine, and Genetic Algorithm.
Essential conclusions are drawn from the survey [21], which considers that no approach is better in every way than another, as there is no well-established methodology to guide the construction of a successful, intelligent trading system. Another issue, highlighted in [22], concerns the use of scale-dependent metrics like RMSE and MSE versus scale-independent ones such as MAPE. On the other hand, some papers assume that once a forecasting system is trained, it can forecast future values well in advance. However, this is not possible for financial time series, as they change over time, making retraining with updated data necessary to capture the most recent knowledge about the state of the market. This final issue motivated our work on applying DSM to FTSERF.
The PSO technique is an iterative algorithm that converges progressively closer, and quickly, to the optimum within the solution search space. It is based on a population of solutions collaborating to reach the optimal result [28]. Despite its limitations, such as the inability to guarantee good convergence and its computational cost, researchers still use it in financial optimization problems in particular and in other study fields as well, as it helps push past convergence limits in many cases. PSO was first proposed by [29,30] as a solution for problems presented in the form of nonlinear continuous functions. In the literature, many studies show how PSO has helped achieve new benchmarks in forecasting time series and other types of data. In [31],
authors have experimented with how PSO can help optimize the results given by neural
networks compared to the classic backpropagation algorithm for time series forecasting.
On the other hand, the article [32] shows how currency exchange rate forecasting capacity
can be polished by adjusting the model function parameters and using the PSO as a booster
to the generalized regression neural network algorithm’s performance. Another work
combining PSO and neural networks is [33], which worked on predicting the Singapore
stock market index. Results demonstrate the effectiveness of using the particle swarm
optimization technique to train neural network weights. The performance of the PSO
FFNN is assessed by optimizing the PSO settings. In addition, recurrent neural network
results predicting stock market behavior have been optimized using PSO in [34]. The
study’s findings demonstrate that the model used in this work has a noticeable impact on
performance compared to before the hybridization. Finally, we cite [35], where a competitive swarm optimizer, a PSO variant, proved its efficiency on datasets with a large number of features. The experiment shows how the proposed technique converges quickly to greater accuracy.
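For illustration, the sketch below implements the canonical PSO update rules from [29,30] for minimizing a loss over continuous weights; the inertia and acceleration coefficients are common textbook values, not the settings of our experiments.

```python
import numpy as np

def pso_minimize(loss, dim, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-1.0, 1.0)):
    """Canonical particle swarm optimization over a continuous search space."""
    rng = np.random.default_rng(0)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))   # particle positions
    v = np.zeros((n_particles, dim))              # particle velocities
    pbest, pbest_val = x.copy(), np.array([loss(p) for p in x])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([loss(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest

# Example: fit linear regression weights by minimizing squared error on a window
X = np.random.default_rng(1).normal(size=(50, 3))
y = X @ np.array([0.5, -0.2, 0.1])
best_w = pso_minimize(lambda w_: np.mean((X @ w_ - y) ** 2), dim=3)
```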
An adaptive learner must constantly interact with its environment to optimize the reward [38]. Prequential learning is also one of the efficient techniques used for online learning. It serves as an alternative
to the standard holdout evaluation that was carried over from batch-setting issues. The
prequential analysis is specifically created for stream environments. Each sample serves two distinct functions: it is processed sequentially in its order of receipt and then rendered inaccessible. The approach makes a prediction for each instance to test the model and then trains the model on that same sample (partial fit). The model is continually put to the test on
fresh samples. Validation techniques are also frequently used to help models be adaptive
and incremental. We can mention the following methods: data division into training,
test, and holdout groups. We can also cite cross-validation techniques including k-fold,
leave-one-out, leave-one-group-out, and nested. The problem with validation techniques is the
risk of overfitting. Techniques for adapting a model are many. We can mention, for example,
computing the area under the ROC Curve (AUC) using constant time and memory [39].
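As a concrete sketch of the prequential (test-then-train) protocol described above, assuming scikit-learn's SGDRegressor as illustrative tooling (not the paper's own code) and a synthetic stream of (features, target) pairs:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.01, size=500)

model = SGDRegressor(learning_rate="constant", eta0=0.01)
squared_errors = []

for i in range(len(X)):
    xi = X[i].reshape(1, -1)
    if i > 0:
        y_pred = model.predict(xi)[0]      # 1. test on the unseen sample first
        squared_errors.append((y[i] - y_pred) ** 2)
    model.partial_fit(xi, y[i:i + 1])      # 2. then train on that same sample

print("prequential MSE:", np.mean(squared_errors))
```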
Prequential accuracy can be biased toward identifying the principal class; AUC should therefore be the foundation for drift identification in unbalanced streams. The study in [46] includes the Page–Hinkley (PH) statistical test with some updates, and it achieves the best outcomes, or very nearly so, except that ADWIN might require more time and memory when change is constant or when there is no change.
Regarding concept drift detection for regression problems, there is a technique that
consists of studying the eigenvalues and eigenvectors. It allows the characterization of
the distribution using an orthogonal basis. This approach is the object of the principal
component analysis. A second technique is to monitor covariance. In probability theory
and statistics, covariance measures the joint variability of two random variables, or how
far two random variables differ when combined. It is also used for two sets of numerical
data, calculating deviations from the mean. The covariance between two random variables,
X and Y, is 0 if they are independent; the converse, however, is not true [47]. For further
details, Ref. [48] shows the covariance matrix types. The cointegration study is another
technique that detects concept drift by identifying the degree of sensitivity of two variables to the same average price over a specified period. Its use in econometrics is common, and it can be used to discover mean-reversion trading techniques in finance [49]. In our study, we compared the use of a fixed versus a flexible window size, and we also study the stationarity of the process. Concerning the AUC, the study in [46] concludes that class imbalance has an impact on both prequential accuracy and AUC; however, AUC is statistically more discriminant. While accuracy can only reveal genuine drifts, AUC shows both real and virtual drifts. The authors used post hoc analysis, and the results confirm that AUC performs the best but has issues with highly imbalanced streams. Another family of methods is adaptive decision trees, which can adaptively learn from the data stream without needing to know how frequently or quickly the stream will change [6]. The ADWIN window
is also a great technique for adaptive learning; its size can be fixed or variable. ADWIN is a parameter-free adaptive-size sliding window: when two large sub-windows are too different, the window's older section is dropped. ADWIN reacts both to sudden, infrequent changes and to slow, gradual ones. To evaluate ADWIN's performance, we can follow the evolution of the false alarm probability, the probability of accurate detection, and the average delay time in detection. Some hybrid models, such as ADWIN with the Kalman filter, have demonstrated that they produce good results. Our study case is related to regression problems; more details will be given in the preliminaries section.
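The following simplified sketch conveys ADWIN's core mechanism only: it compares the means of an older and a newer sub-window and drops the older section when they diverge. The real algorithm replaces the fixed cut threshold used here with an adaptive Hoeffding-style bound evaluated over all split points.

```python
from collections import deque

def adwin_like_update(window: deque, x: float, cut: float = 0.01) -> bool:
    """Append x; if the means of the older and newer halves differ by more
    than `cut` (a stand-in for ADWIN's statistical bound), drop the older
    half and report a change."""
    window.append(x)
    n = len(window)
    if n < 10:
        return False
    half = n // 2
    items = list(window)
    older_mean = sum(items[:half]) / half
    newer_mean = sum(items[half:]) / (n - half)
    if abs(older_mean - newer_mean) > cut:
        for _ in range(half):   # forget the outdated sub-window
            window.popleft()
        return True
    return False
```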
The literature also shows how combining machine learning and finance knowledge can lead to better predictions and more efficient decision systems.
The authors of [51] conducted an interesting study in which some forecasting simulations were made using econometric methods and others using machine learning methods. The experiment reveals the importance of considering financial factors, such as market maturity, in addition to technical factors, such as the forecasting methods and evaluation metrics used, for good forecasting performance. Results show how Support Vector Machines (SVMs), a machine learning method, gave better results than the autoregressive (AR) model, an econometric method. Advanced machine learning techniques show efficiency in detecting market anomalies in numerous significant financial markets. The authors also criticize studies that judge machine learning efficiency based on experiments applying traditional models instead of advanced ones with sliding windows and optimization mechanisms. They also note that good forecasting results do not necessarily lead to good returns. In addition, many researchers do not consider the transaction cost in their trading simulations.
In the literature, we can find multiple research studies showing how financial and
machine learning methods can both contribute to efficient financial forecasting and invest-
ment systems. The study in [52] shows how combining statistical and machine learning
metrics enhances the forecasting system’s evaluation performance. They compare the
forecasting abilities of well-known machine learning techniques, multilayer perceptron (MLP) and support vector machine (SVM) models, with the deep learning algorithm long short-term memory (LSTM), in order to predict the opening and closing stock prices for
the Istanbul Stock Exchange National 100 Index (ISE-100). The evaluation metrics used
are MSE, RMSE, and R2 . In addition, statistical tests are made using IBM SPSS statistics
software in order to evaluate the different machine learning models’ results. The findings
of this study demonstrate how favorable MLP and LSTM machine learning models are
for estimating opening and closing stock prices. Authors of the study in [53] also recom-
mended combining fundamental economic knowledge with machine learning systems,
as experts’ judgment strengthens ultimate risk assessment. Their experiment compared
machine learning algorithms to a statistical model in risk modeling. The study shows
how extreme gradient boosting (XGBoost) succeeds in generating stress-testing scenarios
surpassing the classical method. However, class imbalance complicates class detection for machine learning models. Another limitation is that their dataset was restricted to the Portuguese environment and needs to be expanded to other markets to improve the system's validation.
We also find finance and economy-oriented papers that have explored machine learn-
ing algorithms and proved their efficiency to forecast financial market patterns and generate
good returns for investors. For example, authors in [7] demonstrated how asset pricing with
machine learning algorithms is promising in finance. They implemented linear, tree, and
neural network-based models. They used machine learning portfolios as metastrategies,
where the first metastrategy combines all the models they have built and the second one
selects the best-performing models. Results show how high-dimensional machine learning
approaches can approximate unknown and potentially complex data-generating processes
better than traditional economic models.
We also cite the study in [54], which shows how institutional investors use machine
learning to estimate stock returns and analyze systemic financial risks for better investment
decisions. The authors concluded that big data analysis can efficiently contribute to detect-
ing outliers or unusual patterns in the market. They also recommend data-driven or data
science-based research as a promising avenue for the finance industry.
Despite its efficiency in forecasting, machine learning applications in financial markets
have some limitations. For example, authors in [55] concluded that machine learning sys-
tems’ performance can vary depending on the studied market conditions. They evaluated
the following machine learning algorithms: elastic net (Enet), gradient-boosted regression
trees (GBRTs), random forest (RF), variable subsample aggregation (VASA), and neural net-
works with one to five layers (NN1–NN5). In addition, they tested ordinary least squares
(OLS), LASSO regression, an efficient neural network, and gradient-boosted regression
trees equipped with a Huber loss function. They encountered difficulties when using data
from the US market but achieved good results as they worked on the Chinese stock market
time series. However, their experiment did not involve advanced optimization techniques
for hyperparameter selection and adaptation to each time series nature.
An interesting book that compares econometric models to machine learning models
is [56]. The study shows how econometrics leans more toward statistical significance, while
machine learning models focus more on the data’s behavior over time. Machine learning
advantages include the fact that they do not skip important information related to data
interaction, unlike econometric models. In addition, machine learning models have the
capacity to break down complex trends into simple patterns. They also better prevent
overfitting using validation datasets. On the other hand, econometric models’ advantage
is the fact that their results are explicable, unlike those of machine learning methods,
whose learning process includes black boxes. Overall, the references provide valuable
insights into the advantages and limitations of machine learning and econometric models
in financial forecasting and highlight the need for careful evaluation and interpretation of
their results. In our current work, we focused on showing how DSM techniques can boost
online algorithms to adapt to financial time series data with time-varying characteristics.
Statistical methods play a major role in our system, as we use the stationarity test to detect
if there is a change in the data stream distribution.
Before making a prediction, the learner often receives a description of the situation. The learner's objective is to maximize the cumulative benefit or, conversely, to reduce the cumulative loss [65]. Finally, we find non-incremental studies in the literature that rely on batch learning, such as [66,67]. A batch learning algorithm accepts a set or a series of observations as a single input; the algorithm creates its model and does not continue to learn. In contrast to online learning, batch learning is static [65].
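In scikit-learn terms (an illustrative tooling choice, not the paper's own code), the contrast looks like this: the batch model is fitted once and frozen, while the online model keeps adjusting as chunks arrive.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X @ np.array([1.0, -0.5, 0.2, 0.0]) + rng.normal(scale=0.01, size=300)

# Batch learning: the whole series is one input, and the model never learns again.
batch_model = LinearRegression().fit(X, y)

# Online learning: the model keeps adapting as each 15-instance chunk arrives.
online_model = SGDRegressor()
for start in range(0, len(X), 15):
    online_model.partial_fit(X[start:start + 15], y[start:start + 15])
```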
Adaptive approaches include several categories. First, we can mention concept drift detection for updating decision systems, as in [68,69]. Secondly, forgetting factors are widely used in the FTSERF field; they are especially suited to models that rely on weight updates for model tuning, with Refs. [70,71] as two examples. The third technique is order selection, which analyzes statistics, makes decisions based on sentiment analysis modules, and votes, as in [72,73]. The fourth technique is pattern selection, which entails identifying profitable patterns and testing them, as in [74,75]. Last, weight updating is a very common technique that uses new data to adjust the system parameters, or weights; it is proposed in many studies, such as [76,77]. More information concerning the state of the art of DSM application to FTSERF will be presented in another work, our global survey.
3. Preliminaries
3.1. Dataset Description
In our dataset, we have chosen to include three currency pairs. The EUR/USD pair
represents our target to forecast, while the GBP/USD and the JPY/USD pairs are included
because of their significant impact and correlation to our target pair. Each pair’s historical
data have open, high, low, and close prices. The dataset we used ranges from 30 May 2000
to 28 February 2017, later expanded to 30 November 2022. More information can be found
in the data availability statement.
Our dataset also integrates 12 technical indicators calculated for each of the three pairs: the stochastic oscillator, the Relative Strength Index (RSI), the StochRSI oscillator, the Moving Average Convergence Divergence (MACD), the Average Directional Index (ADX), the Williams %R, the Commodity Channel Index (CCI), the Average True Range (ATR), the High-Low index, the ultimate oscillator, the Price Rate of Change indicator (ROC), the Bull Power, and the Bear Power. In addition, we used historical gold price data in Euros, US Dollars, British Pounds, and Japanese Yen.
If we draw a random line through some of these data points in the space, the straight line's equation would be Y = mX + b, where m is the slope and b is the Y-axis intercept. A machine learning model tries to predict what will happen with a new set of inputs based on what happened with a known set of inputs. The discrepancy between the predicted and actual values is the error:

$$E = Y_{\text{actual}} - Y_{\text{predicted}} \qquad (1)$$
The concept of a cost function or a loss function is relevant here. The loss function
calculates the error for a single training example in order to assess the performance of the
machine learning algorithm.
The cost function, on the other hand, is the average of all the loss functions from the
training samples. If the dataset has N total points and we want to minimize the error for
each of those N points, the total squared error would be the cost function. Any machine
learning algorithm’s goal is to lower the cost function. To do this, we identify the value
of X that results in the value of Y that is most similar to actual values. To locate the cost
function minima, we devise the gradient descent algorithm formula [82].
The gradient descent algorithm looks like this:
Repeat until convergence
m
1
θ j := θ j −
m ∑ (hθ (xi ) − yi )xij (2)
i =1
The SGD is similar to the gradient descent algorithm structure, with the difference
that it processes one training sample at each iteration instead of using the whole dataset.
SGD is widely used for training large datasets because it is computationally faster and can
be processed in a distributed way. The fundamental idea is that we can arrive at a location
that is quite near the actual minimum by focusing our analysis on just one sample at a
time and following its slope. SGD has the drawback that, despite being substantially faster
than gradient descent, its convergence route is noisier. Since the gradient is only roughly
calculated at each step, there are frequent changes in the cost. Even so, it is the best option
for online learning and big datasets.
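Setting m = 1 in Equation (2) yields the per-sample update SGD applies; the sketch below adds an explicit learning rate alpha for clarity (an assumption, since Equation (2) absorbs the step size).

```python
import numpy as np

def sgd_step(theta, x, y, alpha=0.01):
    """One SGD step for linear regression: Equation (2) with m = 1,
    plus an explicit learning rate alpha."""
    error = theta @ x - y          # h_theta(x) - y
    return theta - alpha * error * x

# The weights drift toward the generating coefficients one sample at a time
rng = np.random.default_rng(0)
true_w = np.array([0.4, -0.3, 0.1])
theta = np.zeros(3)
for _ in range(2000):
    x = rng.normal(size=3)
    theta = sgd_step(theta, x, true_w @ x)
print(theta)   # close to true_w
```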
Algorithm 1: The SGD algorithm optimized using the PSO metaheuristic with an adaptive sliding window

Input: The window is initialized with the first 15 elements, and the SGD and PSO metaheuristic parameters are initialized by learning from the first window instances.

Initialization: windowsize = 15; numrows = 15; rangeMin = 0; rangeMax = 15; PSOApplication = 1;
while we have a new input with 15 instances or more do
    Save the previous window in a variable previousWindow;
    Save the next 15 instances in a variable currentWindow;
    Create the target variable combining our target variable from the previousWindow and currentWindow datasets;
    if the target variable is stationary then windowsize = 15;
    else windowsize = 1;
    /* Put in currentWindow the instances ranging from numrows+1-15 to numrows+1 */
    numrows = numrows + windowsize;
    Validate the current SGD model using the current window;
    Give the current window instances as input to the SGD to obtain new weights for our online model;
    if numrows >= 60 * PSOApplication then
        Give the last 60 days as input to the PSO optimizer and receive as output the new weights for our SGD model;
        PSOApplication += 1;
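Below is a compact sketch of this window-adaptation loop. The augmented Dickey–Fuller (ADF) test from statsmodels stands in for the stationarity check (the listing only says "stationary"), the file name is a placeholder, and the validation, SGD update, and periodic PSO call are abbreviated to comments.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def next_window_size(prev_target, curr_target, alpha=0.05):
    """Keep the 15-instance window while the combined target series is
    stationary (ADF p-value < alpha); otherwise shrink the step to 1."""
    combined = np.concatenate([prev_target, curr_target])
    p_value = adfuller(combined, maxlag=4)[1]   # ADF null: non-stationary
    return 15 if p_value < alpha else 1

series = np.loadtxt("eurusd_close.csv")   # placeholder for the EUR/USD target
num_rows, pso_application = 15, 1
while num_rows + 15 <= len(series):
    prev_window = series[num_rows - 15:num_rows]
    curr_window = series[num_rows:num_rows + 15]
    num_rows += next_window_size(prev_window, curr_window)
    # validate the current SGD model on the window, then partial-fit it
    if num_rows >= 60 * pso_application:
        # re-optimize the SGD weights with PSO on the last 60 days
        pso_application += 1
```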
The implementation of the SGD and PSO is inspired by our previous work [2], where we developed the classic gradient descent optimized using Particle Swarm Optimization. We used almost the same gradient descent algorithm for the SGD, as the difference between them is limited to the training data used. For gradient descent, all the training data are received as input to Algorithm 2, and then backpropagation is applied to adjust the model weights. The SGD algorithm receives a new part of the training data at each iteration and adjusts its weights to that subset of data using backpropagation.
Figure 8 shows our proposed method flowchart, which consists of adapting the sliding
window size based on the stationarity statistical test. After this, the forecasting model
receives the sliding window instances as input for validation and training.
The function f(x) is the gradient descent function for predicting our target variable, which is the EUR/USD exchange rate:

$$f(x) = w_0 + w_1 x_1 + w_2 x_2 + \cdots + w_d x_d = w_0 + \sum_{i=1}^{d} w_i x_i \qquad (4)$$
Our goal is to find the optimal weights of the regression function in our iterative process by optimizing our loss function. Weights are optimized in our SGD using the following formula, where the tolerance is fixed to 0.001 in our experimental part and the error is simply the difference between the real and predicted values of the target variable, the EUR/USD exchange rate in our case:

$$w_i := w_i + \alpha \cdot \text{error} \cdot x_i, \qquad \text{error} = y - f(x) \qquad (5)$$
Our second metric is a classification metric, where we study the accuracy of predicting whether the exchange rate will rise or fall. It is presented as a percentage, and its formula is the following:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (6)$$
The third metric we used is the average relative variance (ARV), which relates the model's error to the variation of the data points around their mean. With ARV = 1, the forecasting model performs identically to simply predicting the mean over the series; the model is considered worse than taking the mean if ARV > 1:

$$\text{ARV} = \frac{\sum_{i=1}^{N} (y_i - f(x_i))^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \qquad (7)$$
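In code, the two evaluation metrics above can be computed as follows, assuming `y_true` and `y_pred` arrays; the accuracy here counts rise/fall directions derived from consecutive values, which is one way to obtain the TP/TN/FP/FN counts.

```python
import numpy as np

def directional_accuracy(y_true, y_pred):
    """Percentage of instances where the predicted rise/fall matches the real
    one (equivalent to (TP + TN) / (TP + TN + FP + FN) on up/down labels)."""
    real_up = np.diff(y_true) > 0
    pred_up = np.diff(y_pred) > 0
    return 100.0 * np.mean(real_up == pred_up)

def average_relative_variance(y_true, y_pred):
    """ARV: < 1 means better than predicting the series mean, > 1 means worse."""
    return np.sum((y_true - y_pred) ** 2) / np.sum((y_true - np.mean(y_true)) ** 2)
```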
Each particle's position is then updated by adding its velocity:

$$x_d \leftarrow x_d + v_d \qquad (8)$$

As illustrated in Algorithm 2, the velocity is calculated based on the search space range, the best particle weight found, the best swarm weight found, and the current weight.
Figures 9 and 10 show that the SGD itself is making good progress. After receiving
1000 instances, the mean squared error (MSE) became more stable, and the variance reached
its best stability after receiving 3000 instances.
We updated the learning rate by adding or subtracting 20% of its value and by multiplying it by 0.99 or 1.01. We observed that the learning rate had no impact on the model's performance improvement in the case of our architecture: the results did not change and stayed similar to those obtained with the default parameters. Even with the aforementioned updates to the learning rate, the error still does not stabilize until the algorithm has processed around 1000 instances.
Figure 11 shows the EUR/USD close price historical data from 1 January 2001 to
1 January 2004. We notice that the value range changed completely between 2001–2002 and 2003–2004, revealing the importance of online learning for financial time series processing.
Figures 12 and 13 show the predicted values in orange versus the actual values in blue
using the SGD alone. On the other hand, Figure 14 shows the results as we integrate the
PSO metaheuristic every 60 days into the learning process. The accuracy for all the plots
is good and reaches 82%. This means that the models correctly predict the price direction
in 82% of the cases. The added value of the PSO metaheuristic is noticeable in terms of
the margin error, which decreases significantly as the price decreases. The PSO helped
minimize the margin error between the predicted and actual values as the price crashed
between instances 20 and 30.
Figures 15 and 16 show the EUR/USD daily close price time series and histogram, respectively, from 30 May 2000 to 28 July 2000. The price values show the volatility of the time series data stream that we need to deal with using concept drift detection techniques.
Tables 1–3 summarize statistical values such as the mean and the variance. They also contain a p-value indicating whether the data are stationary. If the p-value is higher than 0.05, the null hypothesis (H0) cannot be rejected, and the data are non-stationary. The results show that, for this two-month time series, the trend is stationary every 15 days, but over a whole month or two months it is non-stationary.
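The split statistics of Tables 1–3 can be reproduced along these lines, assuming the ADF test supplies the quoted p-values and `close_prices` holds the two months of daily closes; `maxlag=1` keeps the test usable on the short 15-day quarters.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def split_stats(series, n_parts):
    """Print mean, variance, and ADF p-value for each equal part of the
    series, mirroring Tables 1-3 (whole series, halves, quarters)."""
    for part in np.array_split(np.asarray(series), n_parts):
        p = adfuller(part, maxlag=1)[1]  # p > 0.05: cannot reject H0, non-stationary
        print(f"mean={part.mean():.6f}  var={part.var():.6f}  p-value={p:.4f}")

for n_parts in (1, 2, 4):
    split_stats(close_prices, n_parts)
```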
We made tests to compare the fixed and flexible window sizes. For the fixed-size case,
the chosen size is 15 instances at each iteration because, according to our statistical studies,
the data tend to have the same pattern every two weeks. For the flexible window size, we
study the next 15 days’ stationarity. If the data are stationary, the algorithm receives 15 new
instances. If the data are not stationary, the algorithm receives only one new instance.
Table 4 shows the prediction results for the year 2000 EUR/USD historical data, as it represents the first data received by the system. In most intervals, except [75:90], [90:105], and [120:135], the mean squared error for the flexible-size window case exceeds that of the fixed-size window case. Meanwhile, for all intervals, we notice that the accuracy using the flexible-size window exceeds or equals the accuracy obtained using a fixed-size window. To illustrate the predicted vs. the real values, Figures 17 and 18 show the interval [60:74]. We can see that, at each point, the real and predicted values are closer in the flexible approach than in the fixed window approach. The ARV results are all far smaller than 1, which means that our model predicts far better than simply taking the mean. ARV also shows the data points' variation, and from the obtained values we can see that the instances are not too correlated to one another.
Table 1. The EUR/USD statistics from 30 May 2000 to 28 July 2000. Mean: 0.9819729708789546; Variance: 0.008933902258522159.

Table 2. The EUR/USD statistics from 30 May 2000 to 28 July 2000 split into two equal parts.

Table 3. The EUR/USD statistics from 30 May 2000 to 28 July 2000 split into four equal parts.
Table 4. The flexible vs. the fixed sliding window results from year 2000 EUR/USD historical data.
Figure 11. The EUR/USD close price historical data from 1 January 2001 to 1 January 2004.
Figure 12. The real vs. the predicted values using the SGD algorithm.
Figure 13. The real vs. the predicted values using the SGD algorithm on a bigger test dataset.
Figure 14. The real vs. the predicted values using the SGD algorithm optimized using the PSO
metaheuristic every 60 days.
Figure 15. The EUR/USD daily close price time series from 30 May 2000 to 28 July 2000.
Figure 16. The EUR/USD daily close price histogram from 30 May 2000 to 28 July 2000.
Figure 17. The flexible window: the predicted vs. the real value for the interval [60:74].
Figure 18. The fixed window: the predicted vs. the real value for the interval [60:74].
4.7. Discussions
Figure 9 shows the regression score variance. We see that the model must perform the learning through multiple sliding windows and receive a certain number of instances to reach the point where we can rely on the proposed algorithm's results for decision making. The same is true for Figure 10, where the mean squared error convergence reached its limit after receiving approximately 1000 instances. One of the biggest challenges of using gradient descent algorithms is building a model that converges as much as possible. In addition, the best convergence is not guaranteed on the first algorithm execution: since the weights are primarily initialized randomly, little by little we narrow the search space of the optimal weights to a smaller range.
The learning rate controls the step size as the gradient descends. If it is set too high, the path becomes unstable; if it is set too low, convergence is slow; and if it is set to zero, the model does not pick up any new information from the gradients. As we updated the learning rate alpha by decreasing or increasing its value, we did not notice a difference, and we still obtained the best convergence beyond 1000 received instances. The fact that reducing the error to some extent only requires receiving a certain amount of data may help explain those results.
On the other hand, Figure 11 reveals the importance of DSM for discarding old, irrelevant models and building newer ones that fit the new data trends. However, keeping the irrelevant models aside for potential future use can be a good idea, as in some study cases patterns can reappear occasionally or periodically.
Figures 12–14 compare integrating the PSO metaheuristic into online learning versus not using it. The positive impact is noticeable as the price crashes: the margin error was significantly reduced when the PSO was used. Even though the computational and time costs of using the PSO are higher, integrating it periodically to enhance the forecasting quality is promising.
The volatility illustrated in Figures 15 and 16 is one of the biggest challenges encountered in FTSERF. It has to be managed by minimizing the risks it entails. In cases of high volatility, using flexible sliding windows becomes a must. By doing this, we can guarantee that the windows are the right size to catch emerging trends and make wise choices.
As seen in Figures 17 and 18, flexible sliding windows gave the suggested algorithm an optimal duration, accuracy, and error margin. The periodic PSO integration and the adaptive sliding windows achieved the fastest convergence. The training and forecasting performances of the algorithm with a flexible window size are better than those of the learning algorithm with a fixed window size.
In traditional machine learning, future fluctuations are adjusted based on previous expectation errors: historical knowledge about past fluctuations is invested, and the model makes decisions or forecasts based on the training it went through. However, as we integrate DSM techniques, adaptive expectations are also ensured by analyzing the statistical distribution of every new data stream. The model receives fifteen instances at each iteration, reduced to one instance per iteration as soon as a high level of volatility is detected in the fifteen instances of the new sliding window. This makes the model more adaptive than real-time approaches that lack DSM techniques for detecting and reacting to change.
Author Contributions: Conceptualization, Z.B.; methodology, Z.B.; software, Z.B.; validation, Z.B.;
formal analysis, Z.B.; investigation, Z.B.; resources, Z.B.; data curation, Z.B.; writing—original draft
preparation, Z.B.; writing—review and editing, Z.B.; visualization, Z.B.; supervision, O.B. and J.S.-M.;
project administration, O.B. and J.S.-M.; funding acquisition, O.B. and J.S.-M. All authors have read
and agreed to the published version of the manuscript.
Funding: This work was supported by a scholarship received from Erasmus+ exchange program and
funding from Centro de Innovación para la Sociedad de la Información, University of Las Palmas de
Gran Canaria (CICEI-ULPGC).
Data Availability Statement: We published the dataset we used in this research at the following link:
https://github.com/zinebbousbaa/eurusdtimeseries accessed on 27 March 2023.
Conflicts of Interest: The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
References
1. Gerlein, E.A.; McGinnity, M.; Belatreche, A.; Coleman, S. Evaluating machine learning classification for financial trading: An
empirical approach. Expert Syst. Appl. 2016, 54, 193–207. [CrossRef]
2. Bousbaa, Z.; Bencharef, O.; Nabaji, A. Stock Market Speculation System Development Based on Technico Temporal Indicators
and Data Mining Tools. In Heuristics for Optimization and Learning; Springer: Berlin/Heidelberg, Germany, 2021; pp. 239–251.
3. Stitini, O.; Kaloun, S.; Bencharef, O. An Improved Recommender System Solution to Mitigate the Over-Specialization Problem
Using Genetic Algorithms. Electronics 2022, 11, 242. [CrossRef]
4. Jamali, H.; Chihab, Y.; García-Magariño, I.; Bencharef, O. Hybrid Forex prediction model using multiple regression, simulated annealing, reinforcement learning and technical analysis. Int. J. Artif. Intell. (ISSN 2252-8938) 2023. [CrossRef]
5. Bifet, A.; Holmes, G.; Pfahringer, B.; Kranen, P.; Kremer, H.; Jansen, T.; Seidl, T. Moa: Massive online analysis, a framework for
stream classification and clustering. In Proceedings of the First Workshop on Applications of Pattern Analysis, PMLR, Windsor,
UK, 1–3 September 2010; pp. 44–50.
6. Bifet, A. Adaptive Stream Mining: Pattern Learning and Mining from Evolving Data Streams; Ios Press: Amsterdam, The Netherlands,
2010; Volume 207.
7. Thornbury, W.; Walford, E. Old and New London: a Narrative of Its History, Its People and Its Places; Cassell Publisher: London, UK,
1878; Volume 6.
8. Cummans, J. A Brief History of Bond Investing. 2014. Available online: http://bondfunds.com/ (accessed on 24 February 2018).
9. BIS site development project. Triennial central bank survey: Foreign exchange turnover in April 2016. Bank Int. Settl. 2016.
Available online: https://www.bis.org/publ/rpfx16.htm (accessed on 24 February 2018).
10. Lange, G.M.; Wodon, Q.; Carey, K. The Changing Wealth of Nations 2018: Building a Sustainable Future; Copyright: International
Bank for Reconstruction and Development, The World Bank 2018, License type: CC BY, Access Rights Type: open, Post date: 19
March 2018; World Bank Publications: Washington, DC, USA, 2018; ISBN 978-1-4648-1047-3.
11. Makridakis, S.; Hibon, M. ARMA models and the Box–Jenkins methodology. J. Forecast. 1997, 16, 147–163. [CrossRef]
12. Tinbergen, J. Statistical testing of business cycle theories: Part i: A method and its application to investment activity. In Statistical
Testing of Business Cycle Theories; Agaton Press: New York, NY, USA, 1939; pp. 34–89.
13. Xing, F.Z.; Cambria, E.; Welsch, R.E. Natural language based financial forecasting: a survey. Artif. Intell. Rev. 2018, 50, 49–73.
[CrossRef]
14. Cheung, Y.W.; Chinn, M.D.; Pascual, A.G. Empirical exchange rate models of the nineties: Are any fit to survive? J. Int. Money
Financ. 2005, 24, 1150–1175. [CrossRef]
15. Clifton, C., Jr.; Frazier, L.; Connine, C. Lexical expectations in sentence comprehension. J. Verbal Learn. Verbal Behav. 1984,
23, 696–708. [CrossRef]
16. Brachman, R.J.; Khabaza, T.; Kloesgen, W.; Piatetsky-Shapiro, G.; Simoudis, E. Mining business databases. Commun. ACM 1996,
39, 42–48. [CrossRef]
17. Hu, M.; Liu, B. Mining and summarizing customer reviews. In Proceedings of the Tenth ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, Seattle, WA, USA, 22–25 August 2004; pp. 168–177.
18. Ali, T.; Omar, B.; Soulaimane, K. Analyzing tourism reviews using an LDA topic-based sentiment analysis approach. MethodsX
2022, 9, 101894. [CrossRef]
19. Cambria, E.; White, B. Jumping NLP curves: A review of natural language processing research. IEEE Comput. Intell. Mag. 2014,
9, 48–57. [CrossRef]
20. Rather, A.M.; Sastry, V.; Agarwal, A. Stock market prediction and Portfolio selection models: A survey. Opsearch 2017, 54, 558–579.
[CrossRef]
21. Cavalcante, R.C.; Brasileiro, R.C.; Souza, V.L.; Nobrega, J.P.; Oliveira, A.L. Computational intelligence and financial markets: A
survey and future directions. Expert Syst. Appl. 2016, 55, 194–211. [CrossRef]
22. Gadre-Patwardhan, S.; Katdare, V.V.; Joshi, M.R. A Review of Artificially Intelligent Applications in the Financial Domain. In
Artificial Intelligence in Financial Markets; Springer: Berlin/Heidelberg, Germany, 2016; pp. 3–44.
23. Curry, H.B. The method of steepest descent for non-linear minimization problems. Q. Appl. Math. 1944, 2, 258–261. [CrossRef]
24. Shao, H.; Li, W.; Cai, B.; Wan, J.; Xiao, Y.; Yan, S. Dual-Threshold Attention-Guided Gan and Limited Infrared Thermal Images for
Rotating Machinery Fault Diagnosis Under Speed Fluctuation. IEEE Trans. Ind. Inform. 2023, 1–10. [CrossRef]
25. Lv, L.; Zhang, J. Adaptive Gradient Descent Algorithm for Networked Control Systems Using Redundant Rule. IEEE Access 2021,
9, 41669–41675. [CrossRef]
26. Sirignano, J.; Spiliopoulos, K. Stochastic gradient descent in continuous time. Siam J. Financ. Math. 2017, 8, 933–961. [CrossRef]
27. Audrino, F.; Trojani, F. Accurate short-term yield curve forecasting using functional gradient descent. J. Financ. Econ. 2007,
5, 591–623.
28. Bonyadi, M.R.; Michalewicz, Z. Particle swarm optimization for single objective continuous space problems: A review. Evol.
Comput. 2017, 25, 1–54. [CrossRef]
29. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural
Networks, Perth, WA, Australia, 27 November–1 December 1995; IEEE: Piscataway, NJ, USA, 1995; Volume 4, pp. 1942–1948.
30. Shi, Y.; Eberhart, R. A modified particle swarm optimizer. In Proceedings of the 1998 IEEE International Conference on
Evolutionary Computation Proceedings. IEEE World Congress on Computational Intelligence (Cat. No. 98TH8360), Anchorage,
AK, USA, 4–9 May 1998; IEEE: Piscataway, NJ, USA, 1998; pp. 69–73.
31. Jha, G.K.; Thulasiraman, P.; Thulasiram, R.K. PSO based neural network for time series forecasting. In Proceedings of the
2009 International Joint Conference on Neural Networks, Atlanta, GA, USA, 14–19 June 2009; IEEE: Piscataway, NJ, USA, 2009;
pp. 1422–1427.
32. Wang, K.; Chang, M.; Wang, W.; Wang, G.; Pan, W. Predictions models of Taiwan dollar to US dollar and RMB exchange rate
based on modified PSO and GRNN. Clust. Comput. 2019, 22, 10993–11004. [CrossRef]
33. Junyou, B. Stock Price forecasting using PSO-trained neural networks. In Proceedings of the 2007 IEEE Congress on Evolutionary
Computation, Singapore, 25–28 September 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 2879–2885.
34. Yang, F.; Chen, J.; Liu, Y. Improved and optimized recurrent neural network based on PSO and its application in stock price
prediction. Soft Comput. 2021, 27, 3461–3476. [CrossRef]
35. Huang, C.; Zhou, X.; Ran, X.; Liu, Y.; Deng, W.; Deng, W. Co-evolutionary competitive swarm optimizer with three-phase for
large-scale complex optimization problem. Inf. Sci. 2023, 619, 2–18. [CrossRef]
36. Auer, P. Online Learning. In Encyclopedia of Machine Learning and Data Mining; Sammut, C., Webb, G.I., Eds.; Springer: Boston,
MA, USA, 2016; pp. 1–9.
37. Benczúr, A.A.; Kocsis, L.; Pálovics, R. Online machine learning algorithms over data streams. J. Encycl. Big Data Technol. 2018,
1207–1218.
38. Julie, A.; McCann, C.Z. Adaptive Machine Learning for Changing Environments. 2018. Available online: https://www.turing.ac.
uk/research/research-projects/adaptive-machine-learning-changing-environments (accessed on 1 September 2018).
39. Grootendorst, M. Validating your Machine Learning Model. 2019. Available online: https://towardsdatascience.com/validating-
your-machine-learning-model-25b4c8643fb7 (accessed on 26 September 2018).
40. Gepperth, A.; Hammer, B. Incremental learning algorithms and applications. In European Symposium on Artificial Neural Networks
(ESANN); HAL: Bruges, Belgium, 2016.
41. Li, S.Z. Encyclopedia of Biometrics: I-Z; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009; Volume 2.
42. Vishal Nigam, M.J. Advantages of Adaptive AI Over Traditional Machine Learning Models. 2019. Available online: https:
//insidebigdata.com/2019/12/15/advantages-of-adaptive-ai-over-traditional-machine-learning-models/ (accessed on 15 De-
cember 2018).
43. Santos, J.D.D. Understanding and Handling Data and Concept Drift. 2020. Available online: https://www.explorium.ai/blog/
understanding-and-handling-data-and-concept-drift/ (accessed on 24 February 2018).
44. Brownlee, J. A Gentle Introduction to Concept Drift in Machine Learning. 2020. Available online: https://machinelearningmastery.
com/gentle-introduction-concept-drift-machine-learning/ (accessed on 10 December 2018).
45. Das, S. Best Practices for Dealing With Concept Drift. 2021. Available online: https://neptune.ai/blog/concept-drift-best-
practices (accessed on 8 November 2018).
46. Brzezinski, D.; Stefanowski, J. Prequential AUC for classifier evaluation and drift detection in evolving data streams. In
Proceedings of the International Workshop on New Frontiers in Mining Complex Patterns; Springer: Berlin/Heidelberg, Germany, 2014;
pp. 87–101.
47. Dodge, Y. The Concise Encyclopedia of Statistics; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2008.
48. Chan, J.; Choy, S. Analysis of covariance structures in time series. J. Data Sci. 2008, 6, 573–589. [CrossRef]
49. Ruppert, D.; Matteson, D.S. Statistics and Data Analysis for Financial Engineering; Springer: Berlin/Heidelberg, Germany, 2011;
Volume 13.
50. Zhang, C.; Zhang, Y.; Cucuringu, M.; Qian, Z. Volatility forecasting with machine learning and intraday commonality. arXiv 2022,
arXiv:2202.08962.
51. Hsu, M.W.; Lessmann, S.; Sung, M.C.; Ma, T.; Johnson, J.E. Bridging the divide in financial market forecasting: machine learners
vs. financial economists. Expert Syst. Appl. 2016, 61, 215–234. [CrossRef]
52. DEMİREL, U.; Handan, Ç.; Ramazan, Ü. Predicting stock prices using machine learning methods and deep learning algorithms:
The sample of the Istanbul Stock Exchange. Gazi Univ. J. Sci. 2021, 34, 63–82. [CrossRef]
53. Guerra, P.; Castelli, M.; Côrte-Real, N. Machine learning for liquidity risk modelling: A supervisory perspective. Econ. Anal.
Policy 2022, 74, 175–187. [CrossRef]
54. Kou, G.; Chao, X.; Peng, Y.; Alsaadi, F.E.; Herrera-Viedma, E. Machine learning methods for systemic risk analysis in financial
sectors. Technol. Econ. Dev. Econ. 2019, 25, 716–742. [CrossRef]
55. Leippold, M.; Wang, Q.; Zhou, W. Machine learning in the Chinese stock market. J. Financ. Econ. 2022, 145, 64–82. [CrossRef]
56. Shivarova, A. Matthew F. Dixon, Igor Halperin, and Paul Bilokon: Machine Learning in Finance from Theory to Practice. 2021. Available online: https://rdcu.be/daRTw (accessed on 8 November 2018).
57. Das, S.R.; Mishra, D.; Rout, M. A hybridized ELM-Jaya forecasting model for currency exchange prediction. J. King Saud-Univ.-
Comput. Inf. Sci. 2020, 32, 345–366. [CrossRef]
58. Nayak, S.C. Development and performance evaluation of adaptive hybrid higher order neural networks for exchange rate
prediction. Int. J. Intell. Syst. Appl. 2017, 9, 71. [CrossRef]
59. Yu, L.; Wang, S.; Lai, K.K. An Online BP Learning Algorithm with Adaptive Forgetting Factors for Foreign Exchange Rates
Forecasting. In Foreign-Exchange-Rate Forecasting with Artificial Neural Networks; Springer: Boston, MA, USA, 2007; pp. 87–100.
[CrossRef]
60. Soares, S.G.; Araújo, R. An on-line weighted ensemble of regressor models to handle concept drifts. Eng. Appl. Artif. Intell. 2015,
37, 392–406. [CrossRef]
61. Carmona, J.; Gavalda, R. Online techniques for dealing with concept drift in process mining. In Proceedings of the International
Symposium on Intelligent Data Analysis; Springer: Berlin/Heidelberg, Germany, 2012; pp. 90–102.
62. Yan, H.; Ouyang, H. Financial time series prediction based on deep learning. Wirel. Pers. Commun. 2018, 102, 683–700. [CrossRef]
63. Barddal, J.P.; Gomes, H.M.; Enembreck, F. Advances on concept drift detection in regression tasks using social networks theory.
Int. J. Nat. Comput. Res. (IJNCR) 2015, 5, 26–41. [CrossRef]
64. Chen, J.F.; Chen, W.L.; Huang, C.P.; Huang, S.H.; Chen, A.P. Financial time-series data analysis using deep convolutional neural
networks. In Proceedings of the 2016 7th International Conference on Cloud Computing and Big Data (CCBD), Macau, China,
16–18 November 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 87–92.
65. Sammut, C.; Webb, G.I. Encyclopedia of Machine Learning; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2011.
66. Kumar Chandar, S. Fusion model of wavelet transform and adaptive neuro fuzzy inference system for stock market prediction.
J. Ambient. Intell. Humaniz. Comput. 2019, 1–9. [CrossRef]
67. Pradeepkumar, D.; Ravi, V. Forex rate prediction: A hybrid approach using chaos theory and multivariate adaptive regression
splines. In Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications; Springer:
Berlin/Heidelberg, Germany, 2017; pp. 219–227.
68. Wang, L.Y.; Park, C.; Yeon, K.; Choi, H. Tracking concept drift using a constrained penalized regression combiner. Comput. Stat.
Data Anal. 2017, 108, 52–69. [CrossRef]
69. Baier, L.; Hofmann, M.; Kühl, N.; Mohr, M.; Satzger, G. Handling Concept Drifts in Regression Problems–the Error Intersection
Approach. arXiv 2020, arXiv:2004.00438.
70. Maneesilp, K.; Kruatrachue, B.; Sooraksa, P. Adaptive parameter forecasting for forex automatic trading system using fuzzy time
series. In Proceedings of the 2011 International Conference on Machine Learning and Cybernetics, Guilin, China, 10–13 July 2011;
IEEE: Piscataway, NJ, USA, 2011; Volume 1, pp. 189–194.
71. Yu, L.; Wang, S.; Lai, K.K. An online learning algorithm with adaptive forgetting factors for feedforward neural networks in
financial time series forecasting. Nonlinear Dyn. Syst. Theory 2007, 7, 51–66.
72. Ilieva, G. Fuzzy Supervised Multi-Period Time Series Forecasting; Sciendo: Warszawa, Poland, 2019.
73. Bahrepour, M.; Akbarzadeh-T, M.R.; Yaghoobi, M.; Naghibi-S, M.B. An adaptive ordered fuzzy time series with application to
FOREX. Expert Syst. Appl. 2011, 38, 475–485. [CrossRef]
74. Martín, C.; Quintana, D.; Isasi, P. Grammatical Evolution-based ensembles for algorithmic trading. Appl. Soft Comput. 2019,
84, 105713. [CrossRef]
75. Hoan, M.V.; Mai, L.C.; Hui, D.T. Pattern discovery in the financial time series based on local trend. In Proceedings of the International
Conference on Advances in Information and Communication Technology; Springer: Berlin/Heidelberg, Germany, 2016; pp. 442–451.
76. Yu, L.; Wang, S.; Lai, K.K. Forecasting Foreign Exchange Rates Using an Adaptive Back-Propagation Algorithm with Optimal
Learning Rates and Momentum Factors. In Foreign-Exchange-Rate Forecasting with Artificial Neural Networks; Springer: Boston,
MA, USA, 2007; pp. 65–85.
77. Castillo, G.; Gama, J. An adaptive prequential learning framework for Bayesian network classifiers. In Proceedings of the European
Conference on Principles of Data Mining and Knowledge Discovery; Springer: Berlin/Heidelberg, Germany, 2006; pp. 67–78.
78. Ramírez-Gallego, S.; Krawczyk, B.; García, S.; Woźniak, M.; Herrera, F. A survey on data preprocessing for data stream mining:
Current status and future directions. Neurocomputing 2017, 239, 39–57. [CrossRef]
79. Husson, F.; Lê, S.; Pagès, J. Analyse de Données avec R; Presses universitaires de Rennes: Rennes, France, 2016.
80. Brockwell, P.J.; Davis, R.A. Introduction to Time Series and Forecasting; Springer: Berlin/Heidelberg, Germany, 2002.
81. Binder, M.D.; Hirokawa, N.; Windhorst, U. Encyclopedia of Neuroscience; Springer: Berlin/Heidelberg, Germany, 2009; Volume 3166.
82. Pandey, P. Understanding the Mathematics behind Gradient Descent. 2019. Available online: https://towardsdatascience.com/
understanding-the-mathematics-behind-gradient-descent-dde5dc9be06e (accessed on 18 March 2019).
83. Clerc, M.; Siarry, P. Une nouvelle métaheuristique pour l’optimisation difficile: La méthode des essaims particulaires. J3eA 2004,
3, 007. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.