A Conceptual Model of Investment-Risk Prediction in The Stock Market Using EVT with Machine Learning - A Semisystematic Literature Review
Review
https://doi.org/10.3390/risks11030060
risks
Review
A Conceptual Model of Investment-Risk Prediction in the Stock
Market Using Extreme Value Theory with Machine Learning:
A Semisystematic Literature Review
Melina 1, *, Sukono 2 , Herlina Napitupulu 2 and Norizan Mohamed 3
1 Doctoral Program of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Padjadjaran,
Sumedang 45363, Indonesia
2 Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Padjadjaran,
Sumedang 45363, Indonesia
3 Faculty of Ocean Engineering Technology and Informatics, Universiti Malaysia Terengganu,
Kuala Terengganu 21030, Malaysia
* Correspondence: [email protected]
Abstract: The COVID-19 pandemic has been an extraordinary event, the type of event that rarely
occurs but that has major impacts on the stock market. The pandemic has created high volatility
and caused extreme fluctuations in the stock market. The stock market can be characterized as
either linear or nonlinear. One method that can detect extreme fluctuations is extreme value theory
(EVT). This study employed a semisystematic literature review on the use of the EVT method to
estimate investment risk in the stock market. The literature used was selected by applying the
preferred reporting items for systematic review and meta-analyses (PRISMA) guidelines, sourced
from the ScienceDirect.com, ProQuest, and Scopus databases. A bibliometric analysis was conducted
to determine the study characteristics and identify any research gaps. The results of the analysis show
that studies on this topic are rarely carried out. Research in this field is generally performed only
in univariate cases and is very complicated in multivariate cases. Given these limitations, further
research could focus on developing a conceptual model that is dynamic and sensitive to extreme
fluctuations, with multivariable inputs, in order to predict investment risk. The model developed
here considered the variables that affect stock price fluctuations as the input data. The combination
of VaR–EVT and machine-learning methods is effective in increasing model accuracy because it
combines linear and nonlinear models.
Keywords: COVID-19; extreme value theory; machine learning; nonlinear; VaR
Citation: Melina, Sukono, Herlina Napitupulu, and Norizan Mohamed. 2023. A Conceptual Model of
Investment-Risk Prediction in the Stock Market Using Extreme Value Theory with Machine Learning: A
Semisystematic Literature Review. Risks 11: 60. https://doi.org/10.3390/risks11030060
MSC: 91G50; 91G70; 62M20; 60G70; 68T07
Academic Editor: Mogens
Steffensen
2019 to 29 December 2022; the data were sourced from finance.yahoo.com (accessed on
29 January 2023).
Figure 1 shows that the composite stock index greatly fluctuated and fell to its lowest
point after COVID-19 was declared a pandemic by the World Health Organization on
11 March 2020. The high volatility in the stock market creates a high level of risk. This high
risk can lead to large profits or large losses for investors. These conditions usually raise
doubts among investors about their investment activities because it is difficult to identify
the best decisions. Therefore, investors need an appropriate method that considers the
dynamics of extreme values in order to mitigate uncertainty before making investment
decisions.
The amount of risk or maximum loss that may occur should be estimated for every
investment. J.P. Morgan proposed a concept called value at risk (VaR), which summarizes
the maximum expected losses on investments at a specified level of confidence (Morgan 1996).
This method is very popular in investment-risk prediction, and Basel II recommended it
as the main risk management tool (Rossignolo et al. 2012). However, in 2008, the global
financial crisis revealed that VaR ignores liquidity risk and underestimates correlation
risk. Therefore, these risks are very important to control. Tail risks are associated
with negative events that have a large impact but a low probability of occurrence. The
emergence of the extreme value theory (EVT) helped to address this problem. Parkinson
(1980) was a pioneer in the use of the EVT method in finance. EVT is a method used to
assess the risk posed by extreme events, such as natural disasters
and pandemics, which have major social and economic impacts. This method can be
used to study the frequency of rare events and to develop models that predict the
frequency of extreme events in the future, in order to estimate the magnitude of the risks faced
(Longin 2000). In May 2012, the Basel Committee on Banking Supervision mentioned that
several weaknesses had been identified in the use of value at risk because it is unable to
capture tail risk. Since then, expected shortfall (ES), also called conditional value at risk (CVaR), has
been recommended for calculating market, credit, and operational risks (Tabasi et al. 2019).
Risks 2023, 11, 60 3 of 24
According to Trabelsi and Tiwari (2019), CVaR is the expected loss under the condition that
it exceeds VaR.
A combination of several models shows better performance than a single model and is
the main direction in forecasting (Hajirahimi and Khashei 2019). The hybrid method is an
appropriate alternative that produces more accurate results than a single
model (Büyükşahin and Ertekin 2019). A combination of the EVT and artificial neural
network (ANN) methods has been applied in various studies. For example, Ibn Musah et al.
(2018) investigated the risks associated with the principal stock exchange of Ghana by
combining EVT with ANNs. Log-return data were used in the empirical
analysis. ANNs were used to forecast whether the market would rise or fall in a 5-month
trading period. EVT was used to calculate the measure of risk associated with both
tails of the daily return dataset and to determine the maximum monthly return to clarify
whether it was increasing or decreasing. The network was trained to model the maximum
monthly increase and decrease, as well as to ascertain market trends over the previous 5
months. The results show that the stock will rise in the 4th to 5th months, whereas in the
3rd to 4th months, it experiences losses. The GPD fitted with the POT method shows good
agreement with the EVT above a certain threshold.
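Log-returns of the kind used in that empirical analysis can be computed directly from a price series. The following is a minimal Python sketch; the function name `log_returns` and the sample prices are illustrative assumptions and are not taken from Ibn Musah et al. (2018).

```python
import math

def log_returns(prices):
    """Return the log-returns r_t = ln(P_t / P_{t-1}) of a price series."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

# Hypothetical closing prices, for illustration only.
prices = [100.0, 102.0, 99.0, 101.5, 97.0]
returns = log_returns(prices)

# Log-returns are additive: their sum equals the log of the total price ratio.
total = sum(returns)
print(returns)
print(total, math.log(prices[-1] / prices[0]))
```

Because EVT-based risk measures are usually stated for losses, the negated returns would typically be passed on to the tail analysis.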
VaR can be calculated much more accurately by using EVT. For example,
Omari et al. (2020) implemented a dynamic method for forecasting 1-day-ahead VaR
that combines GARCH models and EVT to examine the extreme behavior of major
economic stock indices before and during the outbreak of the pandemic.
Comprehensive in-sample volatility modeling was implemented with skewed Student’s-t
distribution assumptions, and information selection criteria were used to establish
goodness of fit. Furthermore, the VaR quantiles were estimated by using the conditional-
EVT (C-EVT) framework to obtain out-of-sample VaR forecasts. The combined
GARCH and EVT model performed relatively well in estimating the risk for all stock
indices. The back-testing results demonstrate that the E-GARCH skewed-Student’s-t and
C-EVT models are the most appropriate techniques for measuring and forecasting
VaR in comparison with the conventional method.
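Back-testing compares realized losses with the VaR forecasts and counts the days on which the loss exceeds the forecast. The sketch below shows only this simple violation count, not the specific back-tests used by Omari et al. (2020); the function names and sample numbers are assumptions made for illustration.

```python
def count_violations(losses, var_forecasts):
    """Count the days on which the realized loss exceeds the VaR forecast."""
    if len(losses) != len(var_forecasts):
        raise ValueError("losses and var_forecasts must have the same length")
    return sum(1 for loss, var in zip(losses, var_forecasts) if loss > var)

def violation_rate(losses, var_forecasts):
    """Share of days with a VaR violation."""
    return count_violations(losses, var_forecasts) / len(losses)

# Hypothetical daily losses and a constant 1-day-ahead VaR forecast.
losses = [0.010, 0.030, 0.005, 0.040, 0.020]
var_forecasts = [0.025] * len(losses)

violations = count_violations(losses, var_forecasts)
rate = violation_rate(losses, var_forecasts)
print(violations, rate)
```

For a well-calibrated VaR at confidence level α, the violation rate over a long sample should be close to 1 − α; formal back-tests assess whether the observed deviation is statistically significant.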
The GARCH–EVT combination method implemented by Echaust and Just (2020)
aimed to determine the predictive ability of value-at-risk estimates when each estimate
is made with the optimal choice of the tail of the distribution. Five methods were
applied to select the tail, namely the distance-metric method with the mean absolute
penalty function, the minimization of the AMSE estimate, the path-stability algorithm, the
fixed-quantile procedure, and the automated eyeball method. The model
performed relatively well in estimating the risk for all threshold choices, but the
optimal tail selection methods did not improve the value-at-risk prediction accuracy; using
the C-EVT approach while taking the 95th percentile of the sample as the threshold could
obtain an accurate estimate of the tail risk.
In investing, analyzing stocks is very important for observing current situations and
conditions. Investors can predict stock prices by analyzing stock fluctuation trends in
historical data on stock price movements. These stock price forecasts provide
an overview of future stock returns. These
results are very important data for predicting investment risk. Data are crucial factors for
improving forecasting accuracy. Internet data and social media are regarded as significant
data sources for many public and private organizations, particularly in academia and
industry for research, thanks to the sophistication of information and communication
technology (Firdaniza et al. 2022). Developments in computing technology, with the
emergence of new technologies and the widespread adoption of artificial intelligence
techniques to make everyday tasks much more intelligent and predictable and to anticipate
changes (Najem et al. 2022), have made machine-learning-based forecasting popular. Wu
et al. (2021) collected considerable online oil news and used the convolutional neural
network method to automatically extract and filter relevant information. The experimental
results show that social media information contributes to oil price forecasting.
Melina et al. (2022) developed a short-term prediction model to predict the price of
shares listed on the stock exchange based in Jakarta, Indonesia, during the pandemic, using
an ANN-based machine-learning approach. The proposed model predicts stock prices
with factors that influence stock fluctuations, including the COVID-19 trend indicator and
the COVID-19 government response stringency index in Indonesia, as input variables.
As a result, the proposed model achieved high forecast accuracy in terms of stock price
prediction.
Recent research conducted by Ilyas et al. (2022) proposed a new hybrid method built
around a fully modified Hodrick–Prescott (FMHP) filter to improve prediction accuracy.
This method consists of three main components: machine-learning-based prediction, novel
features, and a noise-filtering technique. The FMHP aids in removing noise from the
financial dataset and smoothing it out. Sentiment features based on Twitter data and stock
price characteristics are examples of novel features. The machine-learning algorithms used
in the study include random forests, ARIMA, recurrent neural networks, and support
vector regression algorithms. Several new features are embedded for predicting stock
prices, such as the return open price, return of firm, return close price, changes in return
close price, changes in return open price, and volume per total. Sentiment scores, sentiment
features, and preprocessed Twitter data are all fed into the training model. To produce
precise forecasts for the closing price of the stock, the model learns from the supplied data.
The hybrid FMHP model improves its prediction accuracy to 70.88%, the error rate to 0.1,
and the root-mean-square error (RMSE) to 0.04.
This description shows that research on investment-risk prediction in the stock market
that uses the EVT method uses one input variable, namely daily stock returns. This model
is static because it does not consider other variables that arise from extraordinary events
that cause fluctuations in the stock market. The novelty of this research is the proposed
conceptual model for predicting investment risk in the stock market using an EVT approach
based on machine learning, which is dynamic and sensitive to extreme fluctuations. This
model was developed with multivariable inputs. Factors that affect stock fluctuations
and variables that arise from extraordinary events need to be considered when building
a model. The combination of VaR–EVT and machine-learning methods is effective for
increasing model accuracy because it combines linear and nonlinear models. We conclude
that modeling investment-risk predictions on the stock market with an EVT approach,
based on machine learning, is necessary for the development of investment-risk models on
the stock market in the future. This model can read heavy tail patterns in the distribution
of data; therefore, it can detect extreme values. This model can also study the relationship
patterns of nonlinear variables that affect stock price fluctuations when extraordinary
events occur and then create turmoil in the stock market. This model has the potential
to produce accurate results. It is dynamic and sensitive to extreme fluctuations because
it considers extreme variables that arise from extraordinary events, making stock market
input data volatile.
This research is very useful for investors in the stock market, policymakers, gov-
ernments, banks, academics, research institutions, and researchers. It is hoped that a
conceptual model for predicting investment risk, one that is dynamic and sensitive to
extreme fluctuations, will minimize the prediction error of investment risk in the stock
market because it will consider the variables that arise as a result of extraordinary events,
such as the COVID-19 pandemic, or other pandemics that will occur in the future, so that
the collapse of the financial sector does not happen again.
2. Results
In this section, we will present an analysis of the results obtained on the basis of the
plan represented by the previously defined research questions. The series of activities
carried out displays the results of the study selection, selection by quality assessment,
bibliometric analysis, and analysis of general characteristics of the literature. In addition,
the results of a review of the bibliographical information, publications, citations by year,
articles by the number of citations, journals by the number of citations, keywords, stock
markets covered, methodologies, and properties will be presented.
2.1. Planning
Planning when conducting S-SLR is very important when performing a baseline study
and when reducing publication bias in this study. The scope of this S-SLR was determined
on the basis of the objectives represented by the research questions. We concentrated on
and limited ourselves to articles on the topic of the hybrid method including VaR and
CVaR while taking the EVT approach. The fundamental question is, what is the purpose
of this study? This study benefited from an S-SLR on the use of the EVT method to
estimate investment risk in the stock market, as a study basis and reference for developing
a conceptual model for predicting investment risk in the stock market that is dynamic
and sensitive to extreme fluctuations. Table 1 presents some research questions (QR) from
this study.
QR | Questions
QR1 | What is the purpose of this research?
QR2 | How did the VaR–CVaR model function as an EVT method for predicting investment risk in the stock market during the COVID-19 pandemic?
QR3 | What are the input variables commonly used?
QR4 | What is the investment-risk-prediction model that is dynamic and sensitive to extreme fluctuations?
The answers to QR1 are described above; the answers to QR2 and QR3 are presented in
Section 2.5; and the solution to QR4 is presented in Section 4.
Criteria | IC | EC
IC1 | The study of the analysis, prediction, forecasting, and estimation of investment risk in the stock market with the VaR–CVaR hybrid method with the EVT approach. | Studies that are not related to the analysis, prediction, forecasting, and estimation of investment risk in the stock market.
IC2 | Research articles from peer-reviewed international journals. | None of the research articles.
IC3 | Articles published in the period 2019 to 2022. | Articles published outside the period 2019 to 2022.
IC4 | Using English. | Using a language other than English.
The search strategy was carried out by using keywords that matched the topic of this
study, namely (“forecasting” OR “prediction” OR “predicting”) AND (“VaR” OR “CVaR”
OR “risk”) AND (“stock market”) AND (“extreme value theory” OR “EVT”). By using
these keywords, it was hoped that studies using the VaR–CVaR hybrid method, the EVT
approach, and those focusing on the stock market would be filtered.
QA | Information
QA1 | Is the article analysis, forecasting, estimation, or prediction of investment risk in the stock market?
QA2 | Does the article use the hybrid VaR–CVaR method with EVT, block maxima, peaks over threshold, GEV distribution, and GPD?
QA3 | Is the primary source of the stock market data in the form of stocks?
A literature search was performed using the Publish or Perish 8 software for the
Scopus database sources, using search tools for peer-reviewed journal articles on
www.sciencedirect.com (accessed on 31 January 2023) for sources in the ScienceDirect database,
and using search tools on www.proquest.com (accessed on 31 January 2023) for the
ProQuest database source. Table 4 presents the process of searching the literature on the basis
of the keywords.
K | Query | Results: S (Scopus) | Results: SD (ScienceDirect) | Results: PQ (ProQuest) | Results: Total
K1 | (“forecasting” OR “prediction” OR “predicting”) AND (“var” OR “cvar” OR “risk”) | 200 | 383,468 | 1,122,461 | 1,506,129
K2 | K1 AND (“stock market”) | 200 | 11,514 | 76,943 | 88,657
K3 | K2 AND (“extreme value theory” OR “EVT”) | 8 | 152 | 578 | 738
According to the IC presented in Table 2, some of the literature did not meet
the IC2 criterion; thus, 13 articles were deleted from the SD sources, and 361 articles were
deleted from the PQ sources, leaving 364 from the three databases. Next, two articles were
deleted because of duplication, leaving 362 articles. Articles were also deleted if the title
and abstract were deemed not relevant to the topic. At this stage, 264 articles were
deleted, leaving 98 articles. Further selection was conducted by reading the contents of the
articles. By following the QA presented in Table 3, 85 articles were removed because they
did not meet QA1, QA2, or QA3. Table 5 presents the studies that were selected on the basis
of the QA.
The result retained 13 selected articles, which were then used for the S-SLR. The
selected literature was compressed and compiled in a .ris file, a file type that is supported
by a number of reference managers. This format file can be used as an input file in
VOSviewer software. Figure 2 presents the stages of applying PRISMA in the search
process and strategies for obtaining relevant studies.
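The screening arithmetic reported above can be checked with a few lines of Python. This is only a bookkeeping sketch: the counts are copied from the K3 search results and the screening text, while the stage labels and the function name `screening_flow` are illustrative assumptions.

```python
# Records identified by query K3 in each database (from the search results table).
identified = {"Scopus": 8, "ScienceDirect": 152, "ProQuest": 578}

# Records removed at each screening stage (from the text); labels are illustrative.
removals = [
    ("failed IC2 (ScienceDirect)", 13),
    ("failed IC2 (ProQuest)", 361),
    ("duplicates", 2),
    ("title/abstract not relevant", 264),
    ("failed QA1-QA3 on full text", 85),
]

def screening_flow(identified, removals):
    """Return a list of (stage label, records remaining) pairs."""
    remaining = sum(identified.values())
    flow = [("identified", remaining)]
    for label, removed in removals:
        remaining -= removed
        flow.append((label, remaining))
    return flow

flow = screening_flow(identified, removals)
for label, remaining in flow:
    print(f"{label:30s} {remaining:4d}")
```

The remaining counts after each stage reproduce the figures quoted in the text: 738, 364 (after both IC2 removals), 362, 98, and finally 13 articles.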
Figure 3 shows a visualization of the bibliometric network, divided into three clusters.
Cluster 1 is red, cluster 2 is green, and cluster 3 is blue. In cluster 1, the items of
model, value, risk, approach, VaR, generalized Pareto distribution, return, daily return,
and high-frequency data have strong relationships because they are in the same cluster.
This cluster shows the existence of a word circle that refers to the approach used in the
investment prediction model on the stock market, namely the word circle “generalized
Pareto distribution”. These words indicate that the most widely used method is the POT
method, which is based on the generalized Pareto distribution, rather than the block
maxima method, to identify extreme values. In the GPD method, the extreme value is that
which exceeds the threshold. Generally, this model uses daily return data as the input. In
this cluster, risk and return items are also dominant. This clarifies that investment always
contains elements of risk and return. The goal of investors is to achieve the maximum profit
while accounting for the elements of risk and return; therefore, the higher the expected
return, the higher the risk that will be borne.
In cluster 2, extreme value theory and study are very dominant items. In this cluster,
there are also the items of GARCH, accuracy, back testing, the stock market index, and
performance. This cluster explains that the hybrid VaR model with the extreme value
theory and GARCH approaches is very dominant in this study. The back-testing method is
used for model validation.
In cluster 3, stock market and estimation are the dominant items, as seen from the size
of each circle. In this cluster, there are also the items of analysis, data, shortfall, and prediction.
The dominance of stock market items and items contained in this cluster illustrates that
the selection process from the literature has been carried out in accordance with this study,
namely the analysis, prediction, and estimation of investment risk in the stock market.
Figure 4 shows the relationship between extreme value theory and other items.
Figure 4 shows that extreme value theory items have a strong relationship with VaR
items, as well as a direct relationship with daily returns, but no relationship with
high-frequency-data items. This relation illustrates that VaR calculations can be performed with
high-frequency data. However, there are very few cases of using high-frequency data in the
EVT method because high-frequency data include multivariate cases. A bridging method
is needed so that the EVT approach can accommodate high-frequency data as an input
model for estimating investment risk. This image illustrates the investment-risk-prediction
model with the EVT approach, generally using only one data input, namely daily returns.
This model works well in univariate cases and has weaknesses in multivariate cases. These
findings can be used as basic reference points for developing future models.
2.5. General Characteristics of the Literature
At this stage, we describe and analyze the general characteristics of the literature on
the basis of publications, citations, publications by journals, keywords, and others.
Figure 5 shows the number of article publications and citations from 2019 to 2022.
In 2019, three articles were published; in 2020, six articles were published; in 2021, three
articles were published; and in 2022, one article was published. Figure 5 also shows the
total number of citations per year. In 2019, two articles yielded 45 citations. This is the
highest number of citations obtained for articles published during the COVID-19 pandemic.
In 2020, six articles yielded 29 citations; the articles published in 2021 yielded 8 citations; and
the article published in 2022 yielded 4 citations. This illustrates that research on investment-risk
predictions in the stock market using the VaR or CVaR method with the EVT approach has very
rarely been carried out.
2.5.2. Citations
Table 6 presents the cited articles and information on each journal that published each
article.
Table 6 shows the most cited articles. The most cited article was written by
Karmakar and Paul (2019), published in the International Journal of Forecasting, and
obtained 32 citations. The second-most-cited article was written by Tabasi et al. (2019),
published in Administrative Sciences, and was cited 11 times. The third-most-cited article was
written by Sobreira and Louro (2020), published in Finance Research Letters, and was cited eight times.
The fourth-most-cited article was written by Ji et al. (2020), published in the Journal of
Empirical Finance, and was cited seven times. The fifth-most-cited article was written by
Bień-Barkowska (2020), published in the journal Entropy, and was cited seven times. The sixth-most-cited
article was written by Song et al. (2021), published in the Journal of Asian Economics, and was cited
five times. Furthermore, the article written by Chebbi and Hedhli (2022),
published in the Quarterly Review of Economics and Finance, was cited four times. Finally,
the thirteenth-most-cited article was written by Ghourabi et al. (2021), published in
the International Journal of Finance and Economics, and was cited one time. The number of citations
illustrates that research on this topic is still scant and that more research is needed.
2.5.3. Journals
Table 7 presents the most influential journals in this study. The data and information
were sourced from www.scimagojr.com (accessed on 2 February 2023). The table is sorted
by the most citations.
Number | Journal | ISSN | Country | Publisher | H Index | Quartiles | SJR 2021 | Articles | Citations
1 | International Journal of Forecasting | 1692070 | Netherlands | Elsevier | 100 | Q1 | 1.99 | 1 | 32
2 | Administrative Sciences | 20763387 | Switzerland | MDPI AG | 23 | Q2 | 0.48 | 1 | 11
3 | Finance Research Letters | 15446123 | Netherlands | Elsevier BV | 62 | Q1 | 2.01 | 1 | 8
4 | Entropy | 10994300 | Switzerland | MDPI | 81 | Q2 | 0.55 | 2 | 8
5 | Journal of Empirical Finance | 9275398 | Netherlands | Elsevier | 80 | Q1 | 1.20 | 1 | 7
6 | Journal of Asian Economics | 10490078 | Netherlands | Elsevier | 51 | Q2 | 0.65 | 1 | 5
7 | Physica A: Statistical Mechanics and its Applications | 3784371 | Netherlands | Elsevier | 170 | Q1 | 0.89 | 1 | 4
8 | Quarterly Review of Economics and Finance | 10629769 | Netherlands | Elsevier | 55 | Q2 | 0.69 | 1 | 4
9 | Economies | 22277099 | Switzerland | MDPI | 19 | Q2 | 0.44 | 1 | 2
10 | Global Business Review | 9721509 | India | Sage Publications India Pvt. Ltd. | 30 | Q2 | 0.45 | 1 | 2
11 | Economic Modeling | 2649993 | Netherlands | Elsevier | 87 | Q2 | 1.07 | 1 | 2
12 | International Journal of Finance and Economics | 10769307 | UK | John Wiley and Sons Ltd. | 41 | Q2 | 0.42 | 1 | 1
Table 7 shows all the studies sourced from reputable journals. In total, four articles
were sourced from Q1 journals, and nine articles were sourced from Q2 journals. This
illustrates that the literature in this study was of high quality and scientific because it all
came from reputable journals. This fact also explains that research on the analysis and
prediction of the level of investment risk in the capital market is a very important topic for
scientific developments, especially risk management.
2.5.4. Keywords
In research articles, the list of keywords contains the most important words, making the
article searchable for other researchers. In addition, keywords are needed for bibliometric
analyses. Figure 6 shows the 10 most commonly used keywords in the selected literature.
Figure 6 shows as many as 65 keywords used in all studies. Value at risk was the
most frequently used keyword, used in 12% of studies; the second-most-frequently-used
keyword was extreme value theory, used in 9% of studies; and the third-most-frequently-used
keywords were back testing and expected shortfall, used in 5% of the studies. These
keywords indicate that the selected literature adhered to the topic of this study.
2.5.6. Methodology
Table 8 presents the methodology used in this study to model investment-risk
predictions with the EVT approach.
Table 8 shows a summary of the proposed model for modeling investment-risk esti-
mation, which showed better performance than that of competing models.
3.2. Methods
This study is a semisystematic literature review (S-SLR) of the hybrid of VaR, CVaR,
and the EVT method in the analysis and estimation of investment risk in the stock market.
The review identifies and assesses gaps in the literature with scientific evidence to provide a
framework and background for developing a conceptual model for predicting investment risk
in the stock market that is dynamic and sensitive to extreme fluctuations. The stages in
an S-SLR are divided into three main phases: planning, conducting, and analyzing and
reporting (Kitchenham and Charters 2007).
The S-SLR planning stage begins with determining the objectives of this study and
then determining the research questions to ensure that the review is focused. This stage
also determines the need for researchers to summarize all available information about the
topic being studied to identify gaps in previous research.
The stages associated with conducting the review are identifying research and selecting
the main studies. Research identification generates a search strategy and selects the initial
articles on the basis of defined keywords, aiming to detect as many relevant studies as
possible. The selection process was carried out by using PRISMA guidelines that are based
on inclusion and exclusion criteria. An assessment of the quality of the studies was carried
out to provide more-detailed inclusion/exclusion criteria and minimize publication bias.
Analyzing and reporting the review consist of the following stages:
• Interpret all available research to provide specific answers to the research questions
developed at the planning stage.
• Perform a bibliometric analysis by using the VOSviewer application. The bibliometric
analysis is carried out on the selected studies to determine the relationships between
words contained in the article; next, the results were processed to identify shifts in
topics in the article (Sukono et al. 2022).
• Analyze the general characteristics of the literature and examine the mathematical
model to predict investment risk in the stock market in reference to the methods and
models used in the development of the conceptual model.
• Determine gaps in the literature from models and methods to predict investment risks
in the stock market by using EVT. The goal is to identify gaps to fill, which will assist
in developing future models.
• Report the review, propose a conceptual model, and provide directions for future
studies.
4. Discussion
In this section, we review and analyze the literature, gaps in the existing literature,
and conceptual models for predicting investment risks in the stock market that are
dynamic and sensitive to extreme fluctuations.
(σ), and the shape parameter (ξ). According to Chebbi and Hedhli (2022), this method is
inefficient because it identifies only one extreme value and ignores other extreme values;
this method focuses only on events with a larger magnitude. The BM method largely
removes data because only one extreme value from each block is used; thus, in practice, it
is increasingly being replaced by methods based on peaks over threshold (POT), where all
the data representing extreme values are used.
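The data loss of the BM method relative to POT can be illustrated with a small Python sketch. The function names, the block size, the threshold, and the sample losses are all assumptions chosen for illustration; they do not come from any of the reviewed studies.

```python
def block_maxima(losses, block_size):
    """Keep the maximum of each consecutive, non-overlapping block.

    A trailing block shorter than block_size is dropped.
    """
    n_blocks = len(losses) // block_size
    return [
        max(losses[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]

def exceedances(losses, threshold):
    """Keep every observation that exceeds the threshold (the POT view)."""
    return [x for x in losses if x > threshold]

# Hypothetical losses, for illustration only.
losses = [0.5, 1.2, -0.3, 2.1, 0.0, 0.7, 3.4, 1.1, 0.2, 0.9]

bm_sample = block_maxima(losses, block_size=5)   # one value per block
pot_sample = exceedances(losses, threshold=1.0)  # all values above the threshold

print(bm_sample)
print(pot_sample)
```

With two blocks, BM retains two observations, whereas POT with this threshold retains four, including large values (1.2 and 1.1) that BM discards because they are not the maximum of their block.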
One well-known EVT model is the POT, which assumes that extreme risks are inde-
pendently and identically distributed from the generalized Pareto distribution (GPD) (Ji
et al. 2020). The POT method is preferred over the BM method (Song et al. 2021). This can
be seen from the literature used in this study, in which the POT method was used to identify
extreme values. The POT method is generally used because of its efficiency when data
on extreme events are limited (Chen and Yu 2020). According to Ji et al. (2019), the GPD
assumes a flexible structure by changing the shape parameter to accommodate various
tail behaviors in the general framework of the EVT. Research by Bień-Barkowska (2020)
concluded that the POT method is more efficient for practical applications because it uses
all large realizations of variables, provided that they exceed a sufficiently high threshold.
The POT method is one way of identifying extreme data behavior patterns by deter-
mining the extreme threshold value. Data that exceed the threshold are extreme values
(Saputra et al. 2022). The threshold value (u) is determined as optimally as possible, re-
sulting in a minimum error rate. Let X1 , X2 , X3 , . . . , Xn be a sequence of independent and
identically distributed random variables, with a common distribution function, F. The POT
model approach focuses on estimating the distribution function, Fu , of values of X above a
high u. The distribution of excesses over a high u is defined as follows:
$$F_u(y) = P(X - u \le y \mid X > u) = \frac{F(u + y) - F(u)}{1 - F(u)} = \frac{F(x) - F(u)}{1 - F(u)} \tag{1}$$
The function F (u) can be estimated nonparametrically by using the empirical dis-
tribution function as an estimate of the cumulative distribution function (Omari et al.
2020):
$$F(u) = \frac{n - N_u}{n} \tag{5}$$
where n is the total number of observations and Nu is the number of observations that
exceed the threshold. By substituting Equation (3) and Equation (5) into Equation (4), an
estimate for F ( x ) can be obtained as follows:
$$\hat{F}(x) = 1 - \frac{N_u}{n}\left(1 + \hat{\xi}\,\frac{x - u}{\hat{\sigma}}\right)^{-1/\hat{\xi}} \tag{6}$$
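Equations (5) and (6) translate directly into code. The following Python sketch is illustrative only: the function names and the numerical values are assumptions, and the GPD parameters are taken as given rather than estimated. It also assumes that the shape parameter is nonzero and that x is not below the threshold.

```python
def empirical_cdf_at_threshold(n, n_u):
    """Equation (5): share of observations that do not exceed the threshold."""
    return (n - n_u) / n


def pot_tail_cdf(x, u, sigma, xi, n, n_u):
    """Equation (6): POT estimate of F(x) for x >= u, assuming xi != 0."""
    if x < u:
        raise ValueError("the tail estimator is only defined for x >= u")
    if xi == 0:
        raise ValueError("this sketch does not cover the xi = 0 case")
    return 1.0 - (n_u / n) * (1.0 + xi * (x - u) / sigma) ** (-1.0 / xi)


# Hypothetical inputs: 1000 observations, 50 of them above u = 2.0.
f_u = empirical_cdf_at_threshold(n=1000, n_u=50)
f_at_u = pot_tail_cdf(2.0, u=2.0, sigma=0.8, xi=0.2, n=1000, n_u=50)
f_above = pot_tail_cdf(3.0, u=2.0, sigma=0.8, xi=0.2, n=1000, n_u=50)
```

At x = u the tail estimator reduces to the empirical value in Equation (5), which is a quick consistency check on any implementation.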
The high quantile estimator, or the VaR, for α ≥ F̂ (u) can be obtained by inverting
Equation (6), as follows:
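$$\alpha = 1 - \frac{N_u}{n}\left(1 + \hat{\xi}\,\frac{q_\alpha(F) - u}{\hat{\sigma}}\right)^{-1/\hat{\xi}} \tag{7}$$
$$\left(1 + \hat{\xi}\,\frac{q_\alpha(F) - u}{\hat{\sigma}}\right)^{-1/\hat{\xi}} = \frac{n}{N_u}(1 - \alpha) \tag{8}$$
$$\hat{\xi}\,\frac{q_\alpha(F) - u}{\hat{\sigma}} = \left(\frac{n}{N_u}(1 - \alpha)\right)^{-\hat{\xi}} - 1 \tag{9}$$
$$q_\alpha(F) - u = \frac{\hat{\sigma}}{\hat{\xi}}\left[\left(\frac{n}{N_u}(1 - \alpha)\right)^{-\hat{\xi}} - 1\right] \tag{10}$$
$$\widehat{\mathrm{VaR}}_\alpha = q_\alpha(F) = u + \frac{\hat{\sigma}}{\hat{\xi}}\left[\left(\frac{n}{N_u}(1 - \alpha)\right)^{-\hat{\xi}} - 1\right] \tag{11}$$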
where α is the confidence level of VaR, Nu is the number of observations that exceed the threshold, n
is the total number of observations, σ̂ is the estimated scale parameter, and ξ̂ is the estimated shape parameter.
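As a worked illustration of Equation (11), the following Python sketch computes the POT-based VaR from given parameter estimates. The function name and all numerical inputs are hypothetical. The branch for a zero shape parameter uses the standard limit of Equation (11) and is an addition that does not appear in the derivation above.

```python
import math


def pot_var(alpha, u, sigma, xi, n, n_u):
    """Equation (11): VaR at confidence level alpha from the fitted GPD tail."""
    ratio = (n / n_u) * (1.0 - alpha)
    if xi == 0:
        # Limit of Equation (11) as xi -> 0 (exponential tail); not in the source.
        return u - sigma * math.log(ratio)
    return u + (sigma / xi) * (ratio ** (-xi) - 1.0)


# Hypothetical inputs: threshold 2.0, 50 exceedances out of 1000 observations.
var_99 = pot_var(alpha=0.99, u=2.0, sigma=0.8, xi=0.2, n=1000, n_u=50)

# Substituting the result back into Equation (7) should recover alpha.
alpha_back = 1.0 - (50 / 1000) * (1.0 + 0.2 * (var_99 - 2.0) / 0.8) ** (-1.0 / 0.2)
```

With these illustrative inputs the 99% VaR is approximately 3.52, and substituting it back into Equation (7) returns the confidence level of 0.99, which confirms that Equations (7) and (11) are inverses of each other.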
The conditional expected loss, given that the loss exceeds VaR, is referred to as CVaR.
In contrast to VaR, CVaR is never smaller than VaR at the same confidence level, because it
measures the average loss in the extreme tail of the distribution. CVaR can be defined as follows
(Long et al. 2020):
$$\mathrm{CVaR}_\alpha(X) := E[X \mid X \ge \mathrm{VaR}_\alpha(X)] \tag{12}$$
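An empirical counterpart of Equation (12) simply averages the losses that reach or exceed the VaR level. The Python sketch below is a minimal illustration; the function name, the loss sample, and the VaR level are hypothetical.

```python
def empirical_cvar(losses, var_level):
    """Equation (12): average of the losses that are at least as large as VaR."""
    tail = [x for x in losses if x >= var_level]
    if not tail:
        raise ValueError("no observation reaches the VaR level")
    return sum(tail) / len(tail)


# Hypothetical loss sample and a hypothetical VaR level of 8.
losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cvar = empirical_cvar(losses, var_level=8)  # mean of 8, 9, 10 -> 9.0
```

Because every loss in the averaged tail is at least as large as the VaR level, the result can never fall below VaR.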
The combination of EVT with other models yields better forecasting accuracy, as
shown in research conducted by Chaiboonsri and Wannapan (2021), which aimed to me-
thodically devise a quantum-wave distribution (QWD) to better analyze risks and returns
for stock markets in ASEAN countries, especially in extreme value predictions of VaR and
ES, as based on quantum mechanics (QM). The scope of the research process starts from ob-
servation and screening data; next, the raw data are modified by a Gaussian–random-walk
distributional set and QWD. Afterward, two values are inserted into the function of the
GPD extreme value analysis. By setting the prior density for parameters at the Bayesian
estimation u, heavy loss tails are clarified and evaluated. Bayesian simulations and statistics
are applied to the present estimation outputs. Bayesian inference for calculating risks and
the ES predictions are both compatible with the distribution produced by the QM carried
out in the wave equation. Quantum distributions are empirically notable for generating
genuine distributions, and they may be able to close the information gap in data analyses.
Ghourabi et al. (2021) conducted research that aimed to evaluate the estimation ability
of the generalized autoregressive score model to calculate risk scores by applying EVT.
The generalized autoregressive score section is responsible for capturing the dynamics of
transient volatility. EVT provides a model of extreme tail behavior. This method produces
much-more-accurate VaR predictions. In research performed by Chen and Yu (2020), the au-
thors proposed an asymmetric power autoregressive conditional heteroscedasticity model
with the generalized Pareto distribution, aiming to determine the optimal margin level.
Estimations of VaR were measured by using Equation (11). The residual tail distribution
of the APARCH model was estimated by using the generalized Pareto distribution, based
on EVT, by using Equation (3). The result was that the proposed model offered better
1-day forecasts than the other models did. Research by Ji et al. (2020) introduced a general
framework of a self-exciting point process (SEPP) with a truncated generalized Pareto distribution to measure
extreme risk in the stock market under price limits. Similar to GARCH modeling, where
the variance is a function of past shocks and where the variance in the sign distribution
depends on previous events through intensity, the flexible, truncated, generalized Pareto
distribution works to accommodate price constraints. The measurement results showed
that the proposed process can accurately explain the empirical data. Research conducted
by Ji et al. (2019) focused on investigating the extreme risk of financial asset returns
by using an agent-based model. The spread of extreme risk is caused by two important
mechanisms that contribute to the stylized facts, namely panic aggregation and market fraction
movements. Extreme risks above a certain threshold can be treated as independent and identically
distributed draws from the generalized Pareto distribution by using Equation (3). A Monte Carlo
simulation was performed for the VaR estimation. The results showed that the proposed
model had good performance in predicting VaR. Tabasi et al. (2019) conducted research to
calculate market risk in Iran’s largest stock exchange, by estimating the CVaR. This research
applied the GARCH model, in combination with the POT model, assuming t or normal
distributions for the random variable. The GARCH procedure described the random variable’s volatility,
and the EVT was then used to model the residuals. After the estimation of the VaR and the
ES, the validity of these estimations needed to be investigated by the back-testing models.
The results of the study showed that utilizing the POT model had a positive impact on the
models and on the estimation of risk in the financial market.
Predicting VaR with the EVT approach alone is limited, because an unconditional EVT
model cannot capture the dynamics of VaR. The GARCH approach allows the model to dynamically
capture the volatility characteristics of financial time series. Predicting the VaR of financial
markets by accounting for the volatility in the extreme value approach is predominant in
the literature. A good model combines several components with complementary goals, as in
the research by Karmakar and Paul (2019), who employed the CGARCH–EVT-Copula model
to predict the intraday portfolio VaR and ES (or CVaR) by using high-frequency data. EVT
focuses directly on the tails and could therefore yield better estimates and forecasts of risk.
Because return series are not independently and identically distributed, as EVT assumes,
a GARCH model is first fitted to the return series. The GARCH–EVT model is used to draw the marginal distributions,
and the multivariate dependence structure between markets is modeled by a parametric
family of extreme value copulas that are well suited to non-normal distributions and
nonlinear dependence. The combined GARCH–EVT-Copula model becomes the natural
choice for estimating the portfolio VaR, as well as the ES or CVaR.
The POT approach using Equation (3) captured the extreme values, and VaR was
estimated by using Equation (11). Back-testing evidence showed that the employed model
performed relatively better than the other models. A study by Banerjee and Paul (2020)
explored the MCS-GARCH model for forecasting intraday VaR and ES for both developed
and emerging markets.
That study proposed the MCS-GARCH model for superior volatility estimation because
it expresses the intraday conditional variance in prices as a product of three components:
the daily variance component, the intraday variance component, and the diurnal variance
pattern. The results showed that the combined conditional-EVT model performed much better
than the standalone GARCH model.
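The two-step conditional EVT idea discussed above (filter the series with a GARCH-type model, then apply the POT method to the standardized residuals) can be sketched in Python as follows. This is a plain GARCH(1,1) recursion, not the MCS-GARCH, CGARCH, or APARCH specifications of the cited studies. The parameter values are hypothetical and are taken as given rather than estimated, the series is assumed to be expressed as losses, and the conditional mean defaults to zero.

```python
import math


def garch11_variances(losses, omega, a, b):
    """Return conditional variances; the last entry is the one-step-ahead forecast."""
    # Start the recursion at the sample second moment (an assumption of this sketch).
    sigma2 = [sum(x * x for x in losses) / len(losses)]
    for x in losses:
        sigma2.append(omega + a * x * x + b * sigma2[-1])
    return sigma2


def standardized_residuals(losses, sigma2):
    """Divide each loss by its conditional standard deviation."""
    return [x / math.sqrt(s2) for x, s2 in zip(losses, sigma2)]


def conditional_var(sigma2_next, z_quantile, mu=0.0):
    """Scale a residual quantile (e.g., from Equation (11)) by the volatility forecast."""
    return mu + math.sqrt(sigma2_next) * z_quantile


# Hypothetical daily losses and hypothetical GARCH(1,1) parameters.
losses = [0.010, -0.020, 0.015, 0.030, -0.005]
sigma2 = garch11_variances(losses, omega=1e-6, a=0.10, b=0.85)
residuals = standardized_residuals(losses, sigma2)
# z_quantile would come from a POT fit to `residuals`; 2.5 is a placeholder value.
var_next = conditional_var(sigma2[-1], z_quantile=2.5)
```

The design choice here is that the GARCH step handles the time-varying volatility, so that the residuals passed to the POT step are closer to the independent and identically distributed setting that EVT assumes.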
In research conducted by Miloš (2020), procedures were developed to assess tail risk
portfolios on the basis of using EVT, without the need to use multivariate constraining
relationships. This study overcame the main drawback of EVT against multivariate cases
The $b_k$ parameter is the bias, which has the effect of increasing or decreasing the
net input of the activation function $\varphi(\cdot)$. The result of Equation (13) is later made
nonlinear by the activation function, before it becomes a neuron output signal, as shown
in Equation (14):
$$y_k = \varphi(u_k + b_k) \tag{14}$$
The values of the parameters $b_1, b_2, b_3$ and $w_{k1}, w_{k2}, w_{k3}, \ldots, w_{kn}$ are obtained as
a result of learning from the input variables. The value of the weight is often limited to
prevent it from becoming too large; this is generally achieved through the decay parameter,
which is usually set to a value of 0.1. Next, the weights take random values, which are
updated using the observed data, thus indicating the presence of nonlinear elements in the
forecasts generated by this machine learning. The output of this model is a prediction based
on the results of learning and testing variables that affect stock fluctuations, including
X-FV, where the lowest error rate is based on two measured metrics: mean-square error
and RMSE (Bakar et al. 2021).
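As an illustration of Equation (14), the following Python sketch computes the output of a single neuron and the RMSE metric. It assumes that Equation (13) is the usual weighted sum of the inputs and that the activation function is the logistic sigmoid; both are assumptions of this example, as are the function names and numbers.

```python
import math


def sigmoid(z):
    """Logistic activation (one possible choice for the function phi)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(weights, inputs, bias, activation=sigmoid):
    """Equation (14): y_k = phi(u_k + b_k), with u_k the weighted input sum."""
    u_k = sum(w * x for w, x in zip(weights, inputs))
    return activation(u_k + bias)


def rmse(actual, predicted):
    """Root-mean-square error between observed and predicted values."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Hypothetical weights, inputs, and bias.
y_k = neuron_output(weights=[0.5, -0.25], inputs=[2.0, 4.0], bias=0.0)  # 0.5
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

In this example the weighted sum is zero, so the sigmoid returns 0.5; a positive bias would shift the output above 0.5 and a negative bias below it.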
Furthermore, the EVT method will identify extreme values of the machine-learning
output by using Equation (3), to obtain the parameters σ and ξ. These parameters will later
be used to obtain a 1-day-ahead estimate of investment risk by using Equation (11). Back-testing
is performed to validate the model (Berger and Moys 2021). Figure 10 shows the
framework for the conceptual model of the stock market.
This model will continuously predict short-term investment risk. The purpose of
this short-term prediction is that the output of the model will follow the dynamics of the
variables that affect the stock market ecosystem. Variable changes that occur every day
will be the input data for the next prediction; thus, this model is dynamic and sensitive to
extreme fluctuations.
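One simple back-testing check compares the empirical failure rate (the share of days on which the realized loss exceeds the forecast VaR) with the expected rate 1 − α. The Python sketch below is a hypothetical illustration of that check, not the specific procedure of Berger and Moys (2021); all names and numbers are assumptions.

```python
def var_violations(losses, var_forecasts):
    """Flag the days on which the realized loss exceeds the VaR forecast."""
    return [loss > var for loss, var in zip(losses, var_forecasts)]


def failure_rate(losses, var_forecasts):
    """Share of days with a VaR violation."""
    hits = var_violations(losses, var_forecasts)
    return sum(hits) / len(hits)


# Hypothetical realized losses and 1-day-ahead VaR forecasts.
realized = [0.5, 1.2, 3.4, 0.8, 2.9, 0.3, 0.7, 1.1, 0.2, 0.9]
forecasts = [3.0] * 10
rate = failure_rate(realized, forecasts)  # one violation in ten days -> 0.1
```

For a 99% VaR the expected failure rate is 0.01, so a rate of 0.1 over a long sample would suggest that the model underestimates risk; with only ten days, as here, no conclusion can be drawn.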
5. Conclusions
In this study, an S-SLR was conducted to research the topic of investment-risk prediction
in the stock market. The aim was to utilize the S-SLR to develop a predictive model for
the level of investment risk in the stock market, which is dynamic and sensitive to extreme
fluctuations. This study started from the planning stage, and at the selection study stage,
13 relevant articles had been identified in the literature. A bibliometric analysis was carried
out to obtain quantitative and qualitative descriptions of the literature based on the year of
publication, citations, journal sources, methodology, etc. Next, the results were processed
with VOSviewer software to identify the mapping of words in articles that were relevant to
this study. This S-SLR was developed by using quality literature. This is reflected in the
identification of journal sources from the literature, where all the studies were sourced from
reputable journals from Q1 and Q2. The S-SLR showed that most of the research in this
field uses only daily returns as input data. This series of processes provides insights into
Author Contributions: Conceptualization, M.; methodology, S., H.N. and N.M.; validation, S.; formal
analysis, S.; investigation, S., H.N. and N.M.; resources, S.; writing—original draft preparation, M.;
writing—review and editing, S.; visualization, S.; supervision, S. All authors have read and agreed to
the published version of the manuscript.
Funding: The APC was funded by Universitas Padjadjaran. Grant number 2203/UN6.3.1/PT.00/2022.
Data Availability Statement: Not applicable.
Acknowledgments: The authors are grateful to the Directorate of Research, Community Service and
Innovation or DRPM Universitas Padjadjaran for providing an internal research grant, fiscal year
2022, and to the “Academic Leadership Grant (ALG)” program under Sukono.
Conflicts of Interest: The authors declared no conflict of interest.
References
Altig, Dave, Scott Baker, Jose Maria Barrero, Nicholas Bloom, Philip Bunn, Scarlet Chen, Steven J. Davis, Julia Leather, Brent Meyer,
Emil Mihaylov, et al. 2020. Economic Uncertainty before and during the COVID-19 Pandemic. Journal of Public Economics 191:
104274. [CrossRef] [PubMed]
Bakar, Maharani A., Norizan Mohamed, Danang A. Pratama, M. Fawwaz, A. Yusran, Nor Azlida Aleng, Z. Yanuar, and L. Niken.
2021. Modelling Lock-down Strictness for COVID-19 Pandemic in ASEAN Countries by Using Hybrid ARIMA-SVR and Hybrid
SEIR-ANN. Arab Journal of Basic and Applied Sciences 28: 204–24. [CrossRef]
Balkema, A. A., and L. de Haan. 1974. Residual Life Time at Great Age. The Annals of Probability 2: 792–804. [CrossRef]
Banerjee, A., and Samit Paul. 2020. Idiosyncrasies of Intraday Risk in Emerging and Developed Markets: Efficacy of the MCS-GARCH
Model and Extreme Value Theory. Global Business Review 1–23. [CrossRef]
Berger, Theo, and Gunnar Moys. 2021. Value-at-Risk Backtesting: Beyond the Empirical Failure Rate. Expert Systems with Applications
177: 114893. [CrossRef]
Bień-Barkowska, Katarzyna. 2020. Looking at Extremes without Going to Extremes: A New Self-Exciting Probability Model for
Extreme Losses in Financial Markets. Entropy 22: 789. [CrossRef]
Buizza, Caterina, César Quilodrán Casas, Philip Nadler, Julian Mack, Stefano Marrone, Zainab Titus, Clémence Le Cornec, Evelyn
Heylen, Tolga Dur, Luis Baca Ruiz, et al. 2022. Data Learning: Integrating Data Assimilation and Machine Learning. Journal
of Computational Science 58: 101525. [CrossRef]
Büyükşahin, Ümit Çavuş, and Şeyda Ertekin. 2019. Improving Forecasting Accuracy of Time Series Data Using a New ARIMA-ANN
Hybrid Method and Empirical Mode Decomposition. Neurocomputing 361: 151–63. [CrossRef]
Chaiboonsri, Chukiat, and Satawat Wannapan. 2021. Applying Quantum Mechanics for Extreme Value Prediction of VaR and ES in the
ASEAN Stock Exchange. Economies 9: 13. [CrossRef]
Chebbi, Ali, and Amel Hedhli. 2022. Revisiting the Accuracy of Standard VaR Methods for Risk Assessment: Using the Copula-EVT
Multidimensional Approach for Stock Markets in the MENA Region. Quarterly Review of Economics and Finance 84: 430–45.
[CrossRef]
Chen, Yan, and Wenqiang Yu. 2020. Setting the Margins of Hang Seng Index Futures on Different Positions Using an APARCH-GPD
Model Based on Extreme Value Theory. Physica A: Statistical Mechanics and Its Applications 544: 123207. [CrossRef]
Chen, Yanjun, Kun Liu, Yuantao Xie, and Mingyu Hu. 2020. Financial Trading Strategy System Based on Machine Learning.
Mathematical Problems in Engineering 2020: 3589198. [CrossRef]
Echaust, Krzysztof, and Małgorzata Just. 2020. Value at Risk Estimation Using the GARCH-EVT Approach with Optimal Tail Selection.
Mathematics 8: 114. [CrossRef]
Fausett, Laurene. 1994. Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. Upper Saddle River: Prentice-Hall,
Inc.
Firdaniza, Firdaniza, Budi Nurani Ruchjana, Diah Chaerani, and Jaziar Radianti. 2022. Information Diffusion Model in Twitter: A
Systematic Literature Review. Information 13: 13. [CrossRef]
Ghourabi, Mohamed E. L., Asma Nani, and Imed Gammoudi. 2021. A Value-at-Risk Computation Based on Heavy-Tailed Distribution
for Dynamic Conditional Score Models. International Journal of Finance & Economics 26: 2790–99. [CrossRef]
Hajirahimi, Zahra, and Mehdi Khashei. 2019. Hybrid Structures in Time Series Modeling and Forecasting: A Review. Engineering
Applications of Artificial Intelligence 86: 83–106. [CrossRef]
Haykin, Simon. 2009. Neural Networks and Learning Machines, 3rd ed. New York: Pearson Education, Inc.
Hidayana, Rizki Apriva, Herlina Napitupulu, and Sukono Sukono. 2022. An Investment Decision-Making Model to Predict the Risk
and Return in Stock Market: An Application of ARIMA-GJR-GARCH. Decision Science Letters 11: 235–46. [CrossRef]
Ibn Musah, Abdul-Aziz, Jianguo Du, Hira Salah ud din Khan, and Alhassan Alolo Abdul-Rasheed Akeji. 2018. The Asymptotic
Decision Scenarios of an Emerging Stock Exchange Market: Extreme Value Theory and Artificial Neural Network. Risks 6: 132.
[CrossRef]
Ilyas, Qazi M., Khalid Iqbal, Sidra Ijaz, Abid Mehmood, and Surbhi Bhatia. 2022. A Hybrid Model to Predict Stock Closing Price Using
Novel Features and a Fully Modified Hodrick–Prescott Filter. Electronics 11: 3588. [CrossRef]
Ji, Jingru, Donghua Wang, and Dinghai Xu. 2019. Modelling the Spreading Process of Extreme Risks via a Simple Agent-Based Model:
Evidence from the China Stock Market. Economic Modelling 80: 383–91. [CrossRef]
Ji, Jingru, Donghua Wang, Dinghai Xu, and Chi Xu. 2020. Combining a Self-Exciting Point Process with the Truncated Generalized
Pareto Distribution: An Extreme Risk Analysis under Price Limits. Journal of Empirical Finance 57: 52–70. [CrossRef]
Kalfin, Sukono, Sudradjat Supian, and Mustafa Mamat. 2022. Insurance as an Alternative for Sustainable Economic Recovery after
Natural Disasters: A Systematic Literature Review. Sustainability 14: 4349. [CrossRef]
Karmakar, Madhusudan, and Girja K. Shukla. 2015. Managing Extreme Risk in Some Major Stock Markets: An Extreme Value
Approach. International Review of Economics & Finance 35: 1–25. [CrossRef]
Karmakar, Madhusudan, and Samit Paul. 2019. Intraday Portfolio Risk Management Using VaR and CVaR: A CGARCH-EVT-Copula
Approach. International Journal of Forecasting 35: 699–709. [CrossRef]
Kitchenham, Barbara, and Stuart Charters. 2007. Guidelines for Performing Systematic Literature Reviews in Software Engineering.
Available online: https://www.elsevier.com/__data/promis_misc/525444systematicreviewsguide.pdf (accessed on 29 January
2023).
Liberati, Alessandro, Douglas G. Altman, Jennifer Tetzlaff, Cynthia Mulrow, Peter C. Gøtzsche, John P. A. Ioannidis, Mike Clarke, P. J.
Devereaux, Jos Kleijnen, and David Moher. 2009. The PRISMA statement for reporting systematic reviews and meta-analyses of
studies that evaluate healthcare interventions: Explanation and elaboration. BMJ 339: b2700. [CrossRef]
Long, H. V., H. B. Jebreen, I. Dassios, and D. Baleanu. 2020. On the Statistical GARCH Model for Managing the Risk by Employing a
Fat-Tailed Distribution in Finance. Symmetry 12: 1698. [CrossRef]
Longin, François M. 2000. From Value at Risk to Stress Testing: The Extreme Value Approach. Journal of Banking & Finance 24: 1097–130.
[CrossRef]
Martin, Ian W. R., and Stefan Nagel. 2022. Market Efficiency in the Age of Big Data. Journal of Financial Economics 145: 154–77.
[CrossRef]
Melina, Sukono, Herlina Napitupulu, Aceng Sambas, Anceu Murniati, and Valentina Adimurti Kusumaningtyas. 2022. Artificial
Neural Network-Based Machine Learning Approach to Stock Market Prediction Model on the Indonesia Stock Exchange During
the COVID-19. Engineering Letters 30: 988–1000.
Miloš, Božović. 2020. Portfolio Tail Risk: A Multivariate Extreme Value Theory Approach. Entropy 22: 1425. [CrossRef]
Morgan, John Pierpont. 1996. RiskMetrics Technical Document, 4th ed. New York: RiskMetrics.
Najem, Rihab, Meryem Fakhouri Amr, Ayoub Bahnasse, and Mohamed Talea. 2022. Artificial Intelligence for Digital Finance, Axes and
Techniques. Procedia Computer Science 203: 633–38. [CrossRef]
O’Donnell, Niall, Darren Shannon, and Barry Sheehan. 2021. Immune or At-Risk? Stock Markets and the Significance of the COVID-19
Pandemic. Journal of Behavioral and Experimental Finance 30: 1–10. [CrossRef]
Omari, Cyprian, Simon Mundia, Immaculate Ngina, Mundia Maina, and Immaculate Ngina. 2020. Forecasting Value-at-Risk of
Financial Markets under the Global Pandemic of COVID-19 Using Conditional Extreme Value Theory. Journal of Mathematical
Finance 10: 569–97. [CrossRef]
Parkinson, Michael. 1980. The Extreme Value Method for Estimating the Variance of the Rate of Return. The Journal of Business 53:
61–65. [CrossRef]
Pickands, James. 1975. Statistical Inference Using Extreme Order Statistics. The Annals of Statistics 3: 119–31. [CrossRef]
Qiu, Mingyue, and Yu Song. 2016. Predicting the Direction of Stock Market Index Movement Using an Optimized Artificial Neural
Network Model. PLoS ONE 11: e0155133. [CrossRef]
Rossignolo, Adrian F., Meryem Duygun Fethi, and Mohamed Shaban. 2012. Value-at-Risk Models and Basel Capital Charges: Evidence
from Emerging and Frontier Stock Markets. Journal of Financial Stability 8: 303–19. [CrossRef]
Saputra, Moch Panji Agung, Sukono, and Diah Chaerani. 2022. Estimation of Maximum Potential Losses for Digital Banking
Transaction Risks Using the Extreme Value-at-Risks Method. Risks 10: 10. [CrossRef]
Singvejsakul, Jittima, Chukiat Chaiboonsri, and Songsak Sriboonchitta. 2021. The Optimization of Bayesian Extreme Value: Empirical
Evidence for the Agricultural Commodities in the US. Economies 9: 30. [CrossRef]
Sobreira, Nuno, and Rui Louro. 2020. Evaluation of Volatility Models for Forecasting Value-at-Risk and Expected Shortfall in the
Portuguese Stock Market. Finance Research Letters 32: 101098. [CrossRef]
Song, Shijia, Fei Tian, and Handong Li. 2021. An Intraday-Return-Based Value-at-Risk Model Driven by Dynamic Conditional Score
with Censored Generalized Pareto Distribution. Journal of Asian Economics 74: 101314. [CrossRef]
Sukono, Hafizan Juahir, Riza Andrian Ibrahim, Moch Panji Agung Saputra, Yuyun Hidayat, and Igif Gimin Prihanto. 2022. Application
of Compound Poisson Process in Pricing Catastrophe Bonds: A Systematic Literature Review. Mathematics 10: 2668. [CrossRef]
Tabasi, Hamed, Vahidreza Yousefi, Jolanta Tamošaitienė, and Foroogh Ghasemi. 2019. Estimating Conditional Value at Risk in the
Tehran Stock Exchange Based on the Extreme Value Theory Using GARCH Models. Administrative Sciences 9: 40. [CrossRef]
Trabelsi, Nader, and Aviral K. Tiwari. 2019. Market-Risk Optimization among the Developed and Emerging Markets with CVaR
Measure and Copula Simulation. Risks 7: 78. [CrossRef]
Ullah, Malik Z., Fouad O. Mallawi, Mir Asma, and Stanford Shateyi. 2022. On the Conditional Value at Risk Based on the Laplace
Distribution with Application in GARCH Model. Mathematics 10: 3018. [CrossRef]
Wu, Binghui, and Tingting Duan. 2017. A Performance Comparison of Neural Networks in Forecasting Stock Price Trend. International
Journal of Computational Intelligence Systems 10: 336–46. [CrossRef]
Wu, Binrong, Lin Wang, Sirui Wang, and Yu-Rong Zeng. 2021. Forecasting the US Oil Markets Based on Social Media Information
during the COVID-19 Pandemic. Energy 226: 120403. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.