
Neural Computing and Applications (2022) 34:14327–14339

https://doi.org/10.1007/s00521-022-07472-2

REVIEW

Machine learning-driven credit risk: a systemic review


Si Shi1 • Rita Tse1,2 • Wuman Luo1 • Stefano D’Addona3 • Giovanni Pau4,5

Received: 12 February 2022 / Accepted: 26 May 2022 / Published online: 16 July 2022
© The Author(s) 2022

Abstract
Credit risk assessment is at the core of modern economies. Traditionally, it is measured by statistical methods and manual
auditing. Recent advances in financial artificial intelligence stemmed from a new wave of machine learning (ML)-driven
credit risk models that gained tremendous attention from both industry and academia. In this paper, we systematically
review a series of major research contributions (76 papers) over the past eight years using statistical, machine learning and
deep learning techniques to address the problems of credit risk. Specifically, we propose a novel classification methodology
for ML-driven credit risk algorithms and their performance ranking using public datasets. We further discuss the challenges
including data imbalance, dataset inconsistency, model transparency, and inadequate utilization of deep learning models.
The results of our review show that: 1) most deep learning models outperform classic machine learning and statistical
algorithms in credit risk estimation, and 2) ensemble methods provide higher accuracy compared with single models.
Finally, we present summary tables in terms of datasets and proposed models.

Keywords Credit risk · Machine learning · Deep learning · Statistical learning

Corresponding author: Giovanni Pau, [email protected]
Si Shi, [email protected]
Rita Tse, [email protected]
Wuman Luo, [email protected]
Stefano D’Addona, [email protected]

1 Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR, China
2 Engineering Research Centre of Applied Technology on Machine Translation and Artificial Intelligence of Ministry of Education, Macao Polytechnic University, Macao SAR, China
3 Department of Political Science, University of Roma Tre, Rome, Italy
4 Department of Computer Science and Engineering, University of Bologna, Bologna, Italy
5 UCLA Samueli Computer Science, University of California, Los Angeles, USA

Content courtesy of Springer Nature, terms of use apply. Rights reserved.

1 Introduction

Advances in machine learning have heavily affected industry and academia in the past decades, ultimately transforming people's daily lives. Artificial Intelligence (AI) has been applied to almost every human activity, including pattern recognition, image classification, business, agriculture, transportation, and finance. This paper focuses on machine learning applied to finance and credit risk estimation.

Modern financial systems rely on credit and trust. Credit risk is a fundamental parameter that measures and predicts the default probability of a debtor. The correct estimation of credit risk is paramount for the entire system. Failures in credit risk estimation can lead to systemic failures such as the sub-prime crisis of 2008. Consequently, lenders devote large amounts of resources to predicting the creditworthiness of consumers and companies in order to develop appropriate lending strategies that minimize their risks.

Historically, credit risk approaches have used statistical methods such as Linear Discriminant Analysis [1] and Logistic Regression [2]. These methods, however, do not easily handle large datasets.

Advances in computing power and the availability of large credit datasets paved the way for AI-driven credit risk estimation algorithms based on traditional machine learning and deep learning. Conventional machine learning techniques, e.g., k-Nearest Neighbor [3], Random Forest [4] and Support Vector Machines [5], are more effective and flexible than statistical methods. In particular, deep learning techniques [6], a vital branch of machine learning, outperform their predecessors in both accuracy and efficiency when applied to large credit risk data lakes.

This paper presents a systemic review of credit risk estimation algorithms. It critically analyzes both the major statistical approaches and AI-based techniques. The aim is to provide a comprehensive overview of the current leading credit risk estimation technology, providing justification and connections between past and present works. This work proposes a novel taxonomy combining finance with machine learning techniques. In addition, this work ranks the techniques' performance in terms of accuracy and costs. This paper also discusses the challenges and possible solutions in terms of four aspects: data imbalance, dataset inconsistency, model transparency, and inadequate utilization of deep learning methods.

The remainder of the paper is organized as follows. The survey methodology is discussed in Sect. 2. Section 3 introduces the principles of statistical learning, machine learning and deep learning. Section 4 analyzes credit risk-related applications in detail. In Sect. 5, the presented algorithms are discussed and ranked by their performance on public datasets. Finally, results and current challenges are summarized in Sect. 6, while Sect. 7 concludes this work.

2 Survey methodology

2.1 Methodology

We applied the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses, Fig. 1) review methodology in our paper. First, we searched five platforms: Google Scholar, ACM, IEEE Xplore, SpringerLink, and ScienceDirect. We used the keywords "machine learning" or "deep learning" combined with "credit risk". The search returned 2400 articles in total. Then, we applied a filtering algorithm that considers the trade-off between publication year and citations. After removing 1400 duplicate records, 800 ineligible records, and 76 incomplete articles, we obtained 124 screened records. Based on relevance, we excluded 24 articles less related to the topic. After manually checking whether each paper has clear evaluation metrics, we excluded another 24 papers. Finally, we kept 76 studies as our source for reviewing, selected in terms of relevance to the research topic, precision of evaluation metrics, publication time, and number of citations. Figure 1 depicts the PRISMA flow diagram.

2.2 Inclusion and exclusion criteria

In this paper, we select three inclusion criteria: (1) the relevance of the research topic, (2) the precision of evaluation metrics, (3) the publication year and citations. Moreover, papers are excluded if they are duplicated, incomplete, published too early, weakly related to the topic, lacking clear metrics, or comparatively low in citations. We show the whole workflow of the selection process in Fig. 2.

2.3 The datasets and approaches of the reviewed articles

The datasets most commonly used by the papers under review are the German and Australian public credit datasets from the UCI Machine Learning Repository [7–11]. In addition, some studies collect and mine their own data. For example, Chee Kian Leong (2015) uses data from a firm in Singapore [12]. The authors of [13–26] all employ their own unique datasets. Those articles mainly emphasize the significance and the veracity of the original data.

We discuss the principles and applications of the overall machine learning approaches. The traditional machine learning models for credit risk comprise Support Vector Machines (SVMs) [5], k-Nearest Neighbor (k-NN) [3], Random Forests (RFs) [4], Decision Trees (DTs) [27–29], AdaBoost [30], Extreme Gradient Boost (XGBoost) [31], Stochastic Gradient Boosting (SGB) [32], Bagging [33], Extreme Learning Machine (ELM) [34] and Genetic Algorithm (GA) [35]. Neural network models generally belong to deep learning methods. They include Convolutional Neural Networks (CNNs) [36], Deep Belief Neural Networks (DBNs) [37], Artificial Neural Networks (ANNs) [38], Long Short-Term Memory (LSTM) [39], Restricted Boltzmann Machines (RBMs) [40], Deep Multi-Layer Perceptron (DMLP) [41], and Recurrent Neural Networks (RNNs) [42].

Summary tables and bar charts covering all the methods of the reviewed papers are provided.

2.4 Taxonomy

The taxonomy is shown in Fig. 3. We can divide it into two parts: the first concerns computing technology and the second the credit risk application domain. The two parts are further categorized into subsections. These two parts are connected and fused with each other. All the right-side sub-domains include the left-side techniques, and all the techniques can be applied in the financial domains.
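As a quick consistency check on the screening counts reported in Sect. 2.1, the arithmetic of the PRISMA flow can be reproduced in a few lines. This is only a sketch: the variable names and dictionary labels are ours, chosen for illustration, and the numbers are the ones quoted in the text.

```python
# Consistency check of the PRISMA screening counts quoted in Sect. 2.1.
# All variable names and dictionary labels are illustrative, not part of PRISMA.
identified = 2400

removed_before_screening = {
    "duplicate records": 1400,
    "ineligible records": 800,
    "incomplete articles": 76,
}
screened = identified - sum(removed_before_screening.values())

excluded_after_screening = {
    "weakly related to the topic": 24,
    "no clear evaluation metrics": 24,
}
included = screened - sum(excluded_after_screening.values())

print(f"screened records: {screened}")  # screened records: 124
print(f"included studies: {included}")  # included studies: 76
```

Both intermediate totals agree with the 124 screened records and the 76 reviewed studies stated in the text.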

Fig. 1 The PRISMA flow diagram. From: Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021;372:n71. doi: https://doi.org/10.1136/bmj.n71

Fig. 2 The workflow of selecting papers

Fig. 3 The taxonomy of selecting paper

3 Computing approaches

This section briefly introduces the three main computing techniques used for credit analysis, i.e., statistical learning, machine learning and deep learning, each of which has its own characteristics while sharing similar principles. Statistical approaches are the traditional way to classify a customer's or enterprise's credit behavior. However, with the rapid development of artificial intelligence, machine learning and deep learning have gradually taken the place of statistical analysis.

3.1 Statistical learning approaches

We divide the statistical approaches into three groups: discriminant analysis, logistic regression and Bayesian-related models.

LDA (Linear Discriminant Analysis) is a classic technique for predicting the group to which a sample belongs [1]. It aims at generating characteristics that can separate the classes of a binary variable.

Logistic regression is a classification algorithm which uses the logistic sigmoid function to squash the output of a linear function into the interval (0, 1) and interprets that value as a probability [6].

Naïve Bayes methods are statistical learning algorithms that apply Bayes' theorem with the "naïve" assumption of conditional independence between every pair of features given the class variable [43]. A Bayesian network is a probabilistic model based on graphs. It represents the conditional dependence structure of a set of random variables that comply with Bayes' theorem [44].

3.2 Machine learning methods

We review a series of conventional machine learning algorithms that are well suited to the credit risk area.

k-NN [3] is a classification method that assigns to an input x the majority class among its k nearest neighbors in the dataset [3].

Tree-related methods are effective in the credit risk domain. Typical examples include DTs [27–29], Random Forests (RFs) [4], Classification and Regression Trees (CART) [45], C4.5 [46], and Diverse Ensemble Creation by Oppositional Relabeling of Artificial Training Examples (DECORATE) [47].

Support Vector Machine [5] constructs a hyperplane (a decision boundary) that separates classes in a high-dimensional feature space. It outputs a class identity according to whether $w^{T}x + b$ is positive or not [6]. Here, w is the weight vector normal to the hyperplane (the margin between the negative and positive supporting hyperplanes equals $2/\lVert w \rVert$), while b is the bias.

Boosting is an ensemble method that combines individual models to gain higher capacity [6]. Adaptive Boosting (AdaBoost) is one of the most popular boosting algorithms, as the weights are re-assigned to each instance,
with higher weights assigned to incorrectly classified instances [48]. SGB (Stochastic Gradient Boosting) [32] incorporates randomness as an integral part of the Gradient Boosting algorithm. This family of algorithms also includes Extreme Gradient Boost (XGBoost) [31], which is similar to Gradient Boosting. However, it parallelizes tree construction rather than building the trees in a strictly serial manner.

Bagging is an ensemble method that reuses the same kind of model, training algorithm, and objective function several times [6]. It is also known as bootstrap aggregation [33].

Extreme learning machine [34] was developed by Guang-Bin Huang in 2006. It targets single-hidden-layer feedforward neural networks (SLFNs): it randomly chooses the hidden nodes and analytically determines the output weights of the SLFN [34].

Genetic algorithm (GA) [35] is a heuristic search algorithm for search and optimization problems. It first generates an initial population and then obtains a fitness score for every individual in it. Individuals are then selected for the reproduction of offspring [49].

3.3 Deep learning methods

Deep learning uses more layers and more units within a layer than traditional machine learning. It can represent functions of increasing complexity [6]. In this section, we review some crucial deep learning methods used in credit risk.

Artificial Neural Networks [38] were inspired by biological neural network systems. An ANN generally has three kinds of layers: input, hidden and output layers. Given a feature vector x, the ANN outputs $\hat{y}$ through the following formula [50]:

$$\hat{y} = a_2\left(a_1\left(a^{(1)} x + a_0^{(1)}\right) a^{(2)} + a_0^{(2)}\right) \tag{1}$$

where $a_0^{(1)}, a^{(1)}, a_0^{(2)}, a^{(2)}$ are weights and $a_1, a_2$ are activation functions.

Recurrent neural networks (RNNs) are a family of neural networks for processing sequential data [6]. They handle sequential information better than spatial data, which Convolutional Neural Networks (CNNs) process effectively. RNNs introduce state variables that store past information and, together with the current inputs, determine the current outputs [51].

LSTM [39] was first developed to produce paths along which the gradient can flow for long durations [6]. It is a variant of Recurrent Neural Networks (RNNs). Compared with traditional RNNs, it mitigates vanishing and exploding gradients when processing long sequences.

DMLP is a Multi-Layer Perceptron with multiple hidden layers. It is a directed neural network. To update the weights, the loss function for DMLP uses softmax and cross-entropy [50].

LeCun et al. first introduced CNNs [36], which have been widely applied in image processing, voice recognition, automatic question-answering (QA) systems, and many other computing fields. CNNs consist of an input layer, convolutional layers, pooling layers and fully connected layers.

The convolution function is as follows [6]:

$$s(t) = \int_{-\infty}^{\infty} x(a)\, w(t - a)\, da \tag{2}$$

where w(a) is a weighting function and a is the age of a measurement.

Hinton et al. introduced DBNs [37], which are a class of deep neural networks. A typical DBN consists of several hidden layers of Restricted Boltzmann Machines. The output of a lower-level RBM serves as the input of the next higher-level RBM [50].

RBMs are some of the most common building blocks of deep probabilistic models. They are undirected probabilistic graphical models containing a layer of observable variables and a single layer of latent variables [6]. An RBM has an energy function similar to that of the Boltzmann Machine. The function is as follows [50]:

$$E(v, h) = -\sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j - \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij} v_i h_j \tag{3}$$

where $a_i, b_j$ are biases for the binary variables $v_i, h_j$, and $a_{ij}$ are the weights between j and i.

4 Credit risk application with computing algorithms

In the past decades, many scholars have employed various computing algorithms and models for credit risk prediction and assessment. Binary classification is the most fundamental and essential computing technique in credit risk scenarios. In this section, we divide the related studies into two groups from the perspective of finance: consumer and corporate.

4.1 Consumer credit risk

Consumer credit scoring is one of the main parts of credit risk management. It is a kind of system which determines the creditworthiness of a customer based on his/her past credit situation. In [52], the Bayesian network method is improved to find out whether there is a change in credit risk profiles. Numerous approaches have been implemented in
this domain. Typical examples include Extreme learning machines (ELM) [7], Ensemble of classifiers [8], Bayesian networks [12], Deep Genetic Cascade Ensembles [11], a hybrid model with convolutional neural networks and the Relief algorithm [22], Genetic Programming [53], feature selection [54], RNN [55], an ensemble of supervised learning and statistical learning [56], Radial Basis Function [57], the TreeSHAP method for Stochastic Gradient Boosting [58], a real-time binary classification model [59], CNN [60], MLP [61], etc. The authors in [62] compared traditional and machine learning models in the credit score evaluation area.

Predicting a consumer's future credit condition is also valuable for credit risk and quantitative analysis. The authors in [19] conduct a comparison between deep learning techniques and other machine learning methods. It shows that XGBoost outperforms traditional machine learning techniques such as Logistic Regression, SVM and Random Forest. It turns out that a hybrid model is capable of predicting credit risk. In [63], a unique model named TRUST (Trainable Undersampling with SELF Training) was shown to be effective.

CCF (Credit card fraud) is a specific crime in the banking system and is becoming a substantially growing problem worldwide [64]. Detecting it helps to control credit risk in banking security. A novel framework called DEAL (Deep Ensemble Algorithm) is employed in [64]. Recurrent Neural Network (RNN) [65], Boosted Decision Tree [66–68], and a deep learning structure with advanced feature engineering [69] display satisfactory performance. The authors in [70] conduct a comparison among Deep Learning, Logistic Regression and Gradient Boosted Tree. In [71–73], the authors implemented the LR, SVM, k-NN, NB, RF, DT and MLP methods and found that they were all robust, while tree-related models had the best performance. By using an auto-encoder, the authors in [70] create features with domain expertise, which is shown to improve predictive power. In [74], Visual Analytics were used to help reduce the incidence of false positives.

4.2 Corporate credit risk

Corporate credit risk also calls for machine learning and deep learning.

Deep learning plays a significant role in corporate credit rating and assessment. The two-layer additive risk model [13], Artificial Neural Network [15], LSTM and AdaBoost [9], denoising-based neural network [21], deep belief network [14, 75], probabilistic neural network (PNN) [76], Genetic algorithm with neural network [77], and CNN [78] all show great competency in estimation and assessment.

Online supply chain financial risk can be controlled by proper estimation and assessment. The authors in [24] construct a deep belief network based on the Restricted Boltzmann Machine and a softmax classifier. The dataset came from the annual financial reports of notable Chinese companies. The model shows an accuracy far beyond that of SVM and Logistic Regression. In [79, 80], SVM and XGBoost were more accurate than LR and NB in supply chain fraud detection.

Because of the lingering effect of the Global Financial Crisis in 2008, a large number of corporations are under the threat of bankruptcy. Neural networks can help those in danger detect the early signals of collapse. A series of machine learning methods are applied to predict bankruptcy [16, 81]. Bagging, boosting and random forest have the best performance. In [10], random forests are shown to outperform most of the other machine learning models.

In [26], statistical methods (probit models and CART, Classification And Regression Trees) and machine learning methods (Neural Networks and k-NN) are applied and compared for prediction in the financial intermediary domain.

International finance, of which peer-to-peer (P2P) lending is an important branch, has flourished in the past decades. Normally, it carries greater credit risk than the common financial industry. Neural Networks [17, 82, 83], Attention Mechanism LSTM [20], word embedding models [84], Ensemble Learning Method [85, 86], and Restricted Boltzmann Machine (RBM) [87] have all been used to predict the risk of the P2P industry.

Mortgage credit and prepayment risk are vital issues for measuring a borrower's behavior in the real estate financial industry. In [88], the authors use deep neural networks to find a highly nonlinear relationship between a borrower's behavior and risk factors. Deep learning is shown to be effective in measuring mortgage risks.

Big data technology has triggered a massive transformation of finance. According to Denis Ostapchenya, a financial expert, big data in banking can be deployed to assess risks when trading stocks or checking the creditworthiness of a loan applicant. Big Data analysis also accelerates and secures processes which require compliance verification, auditing, and reporting [89]. In the credit risk domain, the combination of machine learning, big data and specific financial techniques has achieved satisfactory results. BP neural networks, genetic algorithm [90], logistic regression with XGBoost and AdaBoost [91, 92], Synthetic Minority Oversampling Technique [93], and integrated and mixed models [94, 95] all play a vital role in predicting and classifying credit risk.

5 Performance ranking of machine learning techniques

5.1 Data imbalance

Data imbalance often occurs in credit risk classification because the number of good borrowers differs hugely from the number of bad borrowers. SMOTE [93] is one of the most widely used approaches to address this problem. In addition, over-sampling and under-sampling techniques are also employed. Nevertheless, data imbalance has been severely underestimated in many credit risk studies.

Fig. 4 The accuracy from German credit data

Table 1 Rank from German credit data

Methods   Rank according to AUC   Rank according to ACC
Bagging   1                       3
LR        2                       4
SVM       3                       6
ANN       4                       7
Decorate  5                       5
ELM       6                       2
AdaBoost  7                       12
MLP       8                       8
CART      9                       13
RF        10                      1
NB        11                      11
k-NN      12                      10
C4.5      13                      9

5.2 Evaluation metrics

In this review, we select ACC (accuracy) and AUC as the main metrics for performance evaluation. ACC is the number of correctly classified samples divided by the total number of samples, while AUC is the area under the ROC curve, which is also a measurement of the precision of classification.

ACC is calculated as follows:

$$\mathrm{ACC} = \frac{TP + TN}{TP + FP + FN + TN} \tag{4}$$

where TP denotes true positives, TN true negatives, FP false positives, and FN false negatives.

AUC [96] can be expressed by the following formula:

$$\mathrm{AUC} = \frac{1 + \frac{TP}{TP + FN} - \frac{FP}{FP + TN}}{2} \tag{5}$$
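To make Eqs. (4) and (5) concrete, the sketch below evaluates both metrics from the four confusion-matrix counts. The function names and the counts are hypothetical, chosen only to mimic an imbalanced portfolio, and we assume that both classes are present so that no denominator is zero. Note that Eq. (5) depends on a single confusion matrix: it equals the area under the ROC curve obtained by joining (0, 0), the classifier's (FPR, TPR) point and (1, 1) with straight lines.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Eq. (4): correctly classified samples over all samples."""
    return (tp + tn) / (tp + fp + fn + tn)


def auc_single_point(tp: int, fp: int, fn: int, tn: int) -> float:
    """Eq. (5): (1 + TPR - FPR) / 2, assuming both classes are present."""
    tpr = tp / (tp + fn)  # true positive rate
    fpr = fp / (fp + tn)  # false positive rate
    return (1 + tpr - fpr) / 2


# Hypothetical imbalanced test set: 50 defaulters (positives), 150 good borrowers.
model = dict(tp=20, fp=10, fn=30, tn=140)
# Baseline that labels every borrower as "good" (never predicts a default).
always_good = dict(tp=0, fp=0, fn=50, tn=150)

for name, counts in [("model", model), ("always good", always_good)]:
    print(name, round(accuracy(**counts), 3), round(auc_single_point(**counts), 3))
# model 0.8 0.667
# always good 0.75 0.5
```

In this made-up example the baseline reaches an ACC of 0.75 without detecting a single defaulter, while its AUC stays at 0.5, which illustrates why the data imbalance discussed in Sect. 5.1 makes ACC alone a fragile basis for comparison.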
5.3 Ranking of techniques

There is no consensus on the specific ranking of each machine learning technique. In this section, we propose our own ranking, based on a thorough and objective investigation. Because the open-source German and Australian credit risk datasets have uniform judging criteria, we select the common techniques appearing in the related literature to compare their performance. We use the mean value of each metric for each method. The bar charts are shown in Fig. 4.

The graph shows that machine learning methods universally have a higher accuracy than statistical methods. Bagging has the highest AUC and Random Forest (RF) has the highest ACC. Logistic Regression is the most powerful tool among the statistical methods in credit risk classification. Naïve Bayes (NB), k-Nearest Neighbor (k-NN) and Classification and Regression Trees (CART) rank comparatively low on the German credit dataset. The detailed ranking results are shown in Table 1. All of the methods in Table 1 are abbreviations for the notions introduced in the former sections.

Fig. 5 The accuracy from Australian credit data

Similarly, we sort and calculate the mean ACC and AUC appearing in the Australian Credit Risk dataset. The result is shown in Fig. 5. It turns out that the accuracy in
the Australian dataset exceeds that in the German dataset because the imbalance ratio of the German dataset is comparatively higher. The best AUC is achieved by ANN. The best ACC belongs to the ELM method.

From Fig. 5, we can conclude that deep learning methods are more potent than traditional machine learning and statistical methods. The specific ranking is shown in Table 2.

In short, based on ACC and AUC values, deep learning techniques perform better on public credit risk datasets than machine learning and statistical learning methods.

Table 2 Rank from Australian credit data

Methods   Rank according to AUC   Rank according to ACC
ANN       1                       2
k-NN      2                       7
ELM       3                       1
CART      4                       5
LR        5                       4
SVM       6                       3
MLP       7                       6

All of the above methods are abbreviations for the notions introduced in the former sections

6 Discussions

6.1 Existing survey papers

In this section, we review several typical surveys published recently. In [97], the majority of machine learning methods and data imbalance are discussed, but the discussion focuses only on the card fraud domain and the authors did not consider the synergetic effects of models. Xolani Dastile, Turgay Celik et al. [50] conducted a thorough investigation of systematic machine learning and its application in credit risk. Nevertheless, the role of deep learning models in credit risk is not fully covered. In [98], the principles of machine learning methods are not clearly presented. In [99], an abundant bibliography is shown. However, the structure of the paper is not balanced. Siddharth Bhatore et al. [100] presented a complete review of machine learning in credit risk with clear graphs, but they overlooked the limitations of the datasets to some extent. In [101], problems similar to those of [98] occur.

In our work, we give a comprehensive analysis and provide a detailed comparison among methods, hoping to improve on existing results.

6.2 The summary tables

We summarize our survey in the following four tables. A complete summary table is given in S1 Table.

Table 3 shows that LR and Bayesian models are the most used among the statistical learning techniques.

As shown in Table 4, AdaBoost, SVM, tree-related models, k-NN and Bagging are the most frequently implemented models among the machine learning techniques, while SGB (Stochastic Gradient Boosting) and ELM (Extreme Learning Machine) are cited relatively rarely.

Table 5 shows that ANN and MLP are the most widely used deep learning models. Moreover, nearly all of the listed deep learning methods have a balanced citation distribution.

We list several important works that use unique datasets in Table 6. Almost all of them deploy their own computing models that improve the original algorithms. The results show that the models are effective.

Table 3 Papers containing statistical learning models

Source LDA LR Bayesian
[9] ? ? ?
[12] ? ?
[23] ? ?
[102] ?
[92] ?
[72] ? ?
[79] ? ?
[103] ? ?
[78] ? ?
[95] ? ?
[73] ? ?
[80] ? ?

LDA stands for linear discriminant analysis, LR stands for logistic regression

6.3 Challenges

We summarize four major challenges in the research of machine learning-driven credit risk. First, data imbalance in credit risk is quite severe. Although several approaches such as over-sampling and under-sampling (usually chosen to under-sample the majority) have been proposed to solve this problem, the results are still unsatisfactory in terms of both effectiveness and efficiency. Second, the shortage of benchmark datasets is serious. Most existing works use private datasets, so the performance comparisons cannot be entirely fair. Third, most machine learning models are black boxes since they are generally not transparent. Information transparency deserves attention.

Table 4 Papers containing machine learning models

Source AdaBoost SVM Tree kNN Bagging XGBoost SGB ELM GA
[13] ? ? ?
[81] ? ? ? ?
[7] ? ? ? ?
[8] ? ? ?
[9] ? ? ?
[10] ? ? ?
[16] ? ? ?
[18] ? ? ?
[19] ? ? ?
[11] ?
[26] ? ?
[53] ?
[76] ? ?
[90] ?
[93] ? ? ?
[92] ? ? ?
[72] ? ? ?
[78] ? ? ? ? ?

All of the above methods are abbreviations for the notions introduced in the former sections

Table 5 Papers containing deep learning models Fourth, the application of deep learning models is still
limited in credit risk.
Source CNN MLP ANN LSTM RBM DBN RNN
These four challenges are what we are supposed to
[14] ? ? overcome in future work. We hope more and more deep
[7] ? advanced models will emerge in this area.
[9] ? ?
[17] ?
[12] ? 7 Conclusions
[20] ?
[21] ? In conclusion, we have witnessed an overall application of
[22] ? machine learning as well as deep learning methods in credit
[75] ? ? risk area. We build a taxonomy which links computing
[23] ? algorithms and finance. We also briefly introduce the
[24] ? principles of statistical and machine learning approaches.
[25] ? ? ? ? As for public datasets, we rank them according to their
[26] ? accuracy. In addition, we list some of the accuracy for the
[55] ? private and unique datasets. A checklist is provided in S2
[65] ? Table.
[102] ? The results show that deep learning methods are more
[87] ? powerful than the traditional machine learning and statis-
[93] ?
tical approaches although they haven’t been fully
[71] ?
employed. Also, the conclusion that ensembles of several
[78] ? ?
methods outperform a single one has been proved in some
of the related researches [9, 11, 75, 81, 103, 104].
[60] ?
In the future, we are supposed to find proper solutions to
[61] ?
the challenges mentioned above. First, we should find new
All of the above methods are abbreviations for the notions introduced ways to tackle the problem of imbalanced data. Second, we
in the former sections
will find a comprehensive judging criterion to make up for
the default of specific methods and the inconsistency of

123

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


14336 Neural Computing and Applications (2022) 34:14327–14339

Table 6 A summary table with unique datasets

Source | Datasets | Model | Accuracy
[13] | Balanced FICO (Fair Isaac Corporation) | Two-Layer Additive Risk Model | 0.7404
[14] | Credit Default Swap | DBN | 1
[15] | 76 small businesses from a bank in Italy | Feedforward networks | 0.87
[16] | NYU’s Salomon Center database | Boosting | 0.8631
[17] | A leading European P2P platform, Bondora | Neural networks | 0.7438
[12] | An original credit scoring dataset of a firm providing credit and loans in Singapore | Neural networks | 0.84
[18] | Lending Club and Kaggle, German credit | Bolasso-based RF | 0.8975
[19] | A dataset belonging to National Bank of Canada | XGBoost | 0.7822 (AUC)
[20] | A real-world P2P Chinese data platform | AM LSTM | 0.669 (AUC)
[21] | Lending Club | A denoising-autoencoder-based neural network model | 0.875
[22] | A real-world dataset from a Chinese company | Relief-CNN | 0.6989 (AUC)
[23] | The credit data from CBRC, German dataset, Darden dataset | FV-SMO | 0.849 (AUC)
[24] | The database from Tai’an | DBN | 0.9604
[25] | Bloomberg and Compustat | CNN2d | 0.9013
[26] | Mortgage loan data from a commercial bank | CART | 0.917
All of the above methods are abbreviations for the notions introduced in the former sections.
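
Table 6 reports plain accuracy for some studies and AUC for others, so the figures are not directly comparable. The following is a minimal sketch, not taken from any of the reviewed papers, of why the two metrics can disagree on imbalanced credit data: two scoring models reach the same accuracy while their AUC differs sharply. The labels, the scores, the 0.5 decision threshold and the helper names `accuracy` and `roc_auc` are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predicted labels that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy sample: 1 = default, 0 = non-default (10% default rate).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

# Illustrative model A: assigns the same score to every applicant.
constant_scores = [0.1] * 10
constant_labels = [1 if s >= 0.5 else 0 for s in constant_scores]

# Illustrative model B: ranks the defaulter above every non-defaulter,
# but no score crosses the (assumed) 0.5 threshold.
ranking_scores = [0.05, 0.10, 0.20, 0.10, 0.30, 0.20, 0.10, 0.40, 0.20, 0.45]
ranking_labels = [1 if s >= 0.5 else 0 for s in ranking_scores]

print(accuracy(y_true, constant_labels), roc_auc(y_true, constant_scores))  # 0.9 0.5
print(accuracy(y_true, ranking_labels), roc_auc(y_true, ranking_scores))    # 0.9 1.0
```

Both models label every applicant as non-default at the 0.5 threshold, so each reaches an accuracy of 0.9, yet only the second ranks the defaulter above all non-defaulters. Reporting a ranking metric such as AUC alongside accuracy is one possible way to make results on differently balanced datasets easier to compare.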

datasets. Third, we should seek improvements in machine learning methods to address model transparency. Fourth, we should apply new and improved deep learning models to the credit risk classification problem.

Moreover, in recent years, some authors have proposed a series of representative nature-inspired metaheuristic algorithms, such as monarch butterfly optimization (MBO) [105], the earthworm optimization algorithm (EOA) [106], elephant herding optimization (EHO) [107], the moth search algorithm (MS) [108], the slime mould algorithm (SMA) [109], hunger games search (HGS) [110], the colony predation algorithm (CPA) [111] and Harris hawks optimization (HHO) [112]. They can also be applied to credit risk prediction. In addition, the Runge Kutta optimizer (RUN) [113] is an algorithm that avoids the metaphor-based design typical of other metaheuristic algorithms. In general, these novel intelligent computational algorithms have not yet been sufficiently applied in finance because of the complexity and instability of risk-related problems. However, they may yield promising results as the analysis tools become more mature.

Last but not least, big data technology and its application in credit risk is a newly booming area. We will explore it and use the scale and efficiency of big data tools such as MapReduce and the Hadoop platform to obtain better results.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s00521-022-07472-2.

Acknowledgements This work was supported in part by the Macao Polytechnic University – Edge Sensing and Computing: Enabling Human-centric (Sustainable) Smart Cities (RP/ESCA-01/2020) and by the Emilia Romagna Region within the European S3 program with the Project LiBER. We want to thank the Macao government and the Emilia Romagna regional government for supporting this work.

Author Contributions Author Si Shi wrote and edited the whole paper. Author RT provided the funding and resources and contributed to the conceptualization of the paper. Author WL took part in the revision and conceptualization of the paper. Author SD’A helped form the initial financial framework and revise the paper. Author GP guided the whole research process and revised the paper.

Funding Open access funding provided by Alma Mater Studiorum - Università di Bologna within the CRUI-CARE Agreement. This work was funded in part by the Macao Polytechnic University – Edge Sensing and Computing: Enabling Human-centric (Sustainable) Smart Cities (RP/ESCA-01/2020).

Availability of data and materials Not applicable.

Declarations

Conflict of interest The authors declare that they have no conflict of interest.

Ethics approval The authors comply with the ethical requirements.


Consent to participate The authors all consent to participate in the paper editing.

Consent for publication The authors all consent to the publication of the paper.

Code availability Not applicable.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Moo-Young M (2019) Comprehensive biotechnology. Elsevier, Amsterdam
2. Cox DR (1958) The regression analysis of binary sequences. J R Stat Soc Ser B 20(2):215–232
3. Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27
4. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
5. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
6. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge
7. Bequé A, Lessmann S (2017) Extreme learning machines for credit scoring: An empirical evaluation. Expert Syst Appl 86:42–53
8. Abellán J, Castellano JG (2017) A comparative study on base classifiers in ensemble methods for credit scoring. Expert Syst Appl 73:1–10
9. Shen F, Zhao X, Kou G et al (2021) A new deep learning ensemble credit risk evaluation model with an improved synthetic minority oversampling technique. Appl Soft Comput 98:106852
10. Ghatasheh N (2014) Business analytics using random forest trees for credit risk prediction: a comparison study. Int J Adv Sci Technol 72(2014):19–30
11. Pławiak P, Abdar M, Acharya UR (2019) Application of new deep genetic cascade ensemble of svm classifiers to predict the australian credit scoring. Appl Soft Comput 84:105740
12. Leong CK (2016) Credit risk scoring with bayesian network models. Comput Econ 47(3):423–446
13. Chen C, Lin K, Rudin C, et al (2018) An interpretable model with globally consistent explanations for credit risk. arXiv preprint arXiv:1811.12615
14. Luo C, Wu D, Wu D (2017) A deep learning approach for credit scoring using credit default swaps. Eng Appl Artif Intell 65:465–470
15. Angelini E, Di Tollo G, Roli A (2008) A neural network approach for credit risk evaluation. Q Rev Econ Finance 48(4):733–755
16. Barboza F, Kimura H, Altman E (2017) Machine learning models and bankruptcy prediction. Expert Syst Appl 83:405–417
17. Byanjankar A, Heikkilä M, Mezei J (2015) Predicting credit risk in peer-to-peer lending: A neural network approach. In: 2015 IEEE symposium series on computational intelligence, IEEE, pp 719–725
18. Arora N, Kaur PD (2020) A bolasso based consistent feature selection enabled random forest classification algorithm: an application to credit risk assessment. Appl Soft Comput 86:105936
19. Marceau L, Qiu L, Vandewiele N, et al (2019) A comparison of deep learning performances with other machine learning algorithms on credit scoring unbalanced data. arXiv preprint arXiv:1907.12363
20. Wang C, Han D, Liu Q et al (2018) A deep learning approach for credit scoring of peer-to-peer lending using attention mechanism lstm. IEEE Access 7:2161–2168
21. Fan Q, Yang J (2018) A denoising autoencoder approach for credit risk analysis. In: Proceedings of the 2018 international conference on computing and artificial intelligence, pp 62–65
22. Zhu B, Yang W, Wang H, et al (2018) A hybrid deep learning model for consumer credit scoring. In: 2018 International Conference on Artificial Intelligence and Big Data (ICAIBD), IEEE, pp 205–208
23. Zhang Q, Wang J, Lu A et al (2018) An improved smo algorithm for financial credit risk assessment-evidence from china’s banking. Neurocomputing 272:314–325
24. Xu RZ, He MK (2020) Application of deep learning neural network in online supply chain financial credit risk assessment. In: 2020 international conference on computer information and big data applications (CIBDA), IEEE, pp 224–232
25. Golbayani P, Wang D, Florescu I (2020) Application of deep neural networks to assess corporate credit rating. arXiv preprint arXiv:2003.02334
26. Galindo J, Tamayo P (2000) Credit risk assessment using statistical and machine learning: basic methodology and risk modeling applications. Comput Econ 15(1):107–143
27. Quinlan JR (1993) C4.5: Programs for machine learning. Morgan Kaufmann 38(48):49
28. Breiman L, Friedman JH, Olshen RA et al (1984) Classification and regression trees. Wadsworth, Pacific Grove
29. Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106
30. Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55(1):119–139
31. Chen T, Guestrin C (2016) Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pp 785–794
32. Friedman JH (2002) Stochastic gradient boosting. Comput Stat Data Anal 38(4):367–378
33. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
34. Huang GB, Zhu QY, Siew CK (2006) Extreme learning machine: theory and applications. Neurocomputing 70(1–3):489–501
35. Holland JH (1975) Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. U Michigan Press
36. LeCun Y, Boser B, Denker JS et al (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551
37. Hinton GE, Osindero S, Teh YW (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554


38. McCulloch WS, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5(4):115–133
39. Hochreiter S, Schmidhuber J (1997) LSTM can solve hard long time lag problems. In: Advances in neural information processing systems, pp 473–479
40. Smolensky P (1986) Information processing in dynamical systems: foundations of harmony theory. Colorado Univ at Boulder Dept of Computer Science, Tech. rep
41. Wan S, Liang Y, Zhang Y, et al (2018) Deep multi-layer perceptron classifier for behavior analysis to estimate parkinson’s disease severity using smartphones. IEEE Access 6:36825–36833
42. Elman JL (1990) Finding structure in time. Cogn Sci 14(2):179–211
43. Buitinck L, Louppe G, Blondel M, et al (2013) API design for machine learning software: experiences from the scikit-learn project. In: ECML PKDD Workshop: languages for data mining and machine learning, pp 108–122
44. Liu S, McGree J, Ge Z et al (2015) Computational and statistical methods for analysing big data with applications. Academic Press
45. Grajski KA, Breiman L, Di Prisco GV, et al (1986) Classification of eeg spatial patterns with a tree-structured methodology: Cart. IEEE Trans Biomed Eng BME-33(12):1076–1086
46. Quinlan JR et al (1996) Bagging, boosting, and C4.5. AAAI/IAAI 1:725–730
47. Melville P (2003) Creating diverse ensemble classifiers. Computer Science Department, University of Texas at Austin
48. Kumar A (2022) The ultimate guide to adaboost algorithm : What is adaboost algorithm? https://www.mygreatlearning.com/blog/adaboost-algorithm/. Accessed 27 March 2022
49. Muthee A (2021) The basics of genetic algorithms in machine learning. https://www.section.io/engineering-education/the-basics-of-genetic-algorithms-in-ml/. Accessed 27 March 2022
50. Dastile X, Celik T, Potsane M (2020) Statistical and machine learning models in credit scoring: a systematic literature survey. Appl Soft Comput 91:106263
51. Zhang A, Lipton ZC, Li M, et al (2021) Dive into deep learning. arXiv preprint arXiv:2106.11342
52. Masmoudi K, Abid L, Masmoudi A (2019) Credit risk modeling using bayesian network with a latent variable. Expert Syst Appl 127:157–166
53. Tran K, Duong T, Ho Q (2016) Credit scoring model: a combination of genetic programming and deep learning. In: 2016 Future Technologies Conference (FTC), IEEE, pp 145–149
54. Ha VS, Nguyen HN (2016) Credit scoring with a feature selection approach based deep learning. In: MATEC Web of Conferences, EDP Sciences, p 05004
55. Babaev D, Savchenko M, Tuzhilin A, et al (2019) Et-rnn: Applying deep learning to credit loan applications. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp 2183–2190
56. Twala B (2010) Multiple classifier application to credit risk assessment. Expert Syst Appl 37(4):3326–3336
57. Zhang T, Zhang W, Wei X et al (2018) Multiple instance learning for credit risk assessment with transaction data. Knowl Based Syst 161:65–77
58. Roa L, Correa-Bahnsen A, Suarez G et al (2021) Super-app behavioral patterns in credit risk models: financial, statistical and regulatory implications. Expert Syst Appl 169:114486
59. Abakarim Y, Lahby M, Attioui A (2018) Towards an efficient real-time approach to loan credit approval using deep learning. In: 2018 9th International Symposium on Signal, Image, Video and Communications (ISIVC), IEEE, pp 306–313
60. Dastile X, Celik T (2021) Making deep learning-based predictions for credit scoring explainable. IEEE Access 9:50426–50440
61. Iwai K, Akiyoshi M, Hamagami T (2020) Structured feature derivation for transfer learning on credit scoring. In: 2020 IEEE International Conference on systems, man, and cybernetics (SMC), IEEE, pp 818–823
62. Kumar MR, Gunjan VK (2020) Review of machine learning models for credit scoring analysis. Ingeniería Solidaria 16(1)
63. Chi J, Zeng G, Zhong Q, et al (2020) Learning to undersampling for class imbalanced credit risk forecasting. In: 2020 IEEE International Conference on data mining (ICDM), IEEE, pp 72–81
64. Arya M, Sastry GH (2020) Deal-‘deep ensemble algorithm’ framework for credit card fraud detection in real-time data stream with google tensorflow. Smart Sci 8(2):71–83
65. Hsu TC, Liou ST, Wang YP et al (2019) Enhanced recurrent neural network for combining static and dynamic features for credit card default prediction. In: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 1572–1576
66. Alam TM, Shaukat K, Hameed IA, et al (2020) An investigation of credit card default prediction in the imbalanced datasets. IEEE Access 8:201173–201198
67. Yiheng Wei QMYu Qi (2020) Fraud detection by machine learning. In: 2020 2nd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), IEEE, pp 101–115
68. Shivanna A, Agrawal DP (2020) Prediction of defaulters using machine learning on azure ml. In: 2020 11th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), IEEE, pp 0320–0325
69. Zhang X, Han Y, Xu W et al (2021) Hoba: a novel feature engineering methodology for credit card fraud detection with a deep learning architecture. Inf Sci 557:302–316
70. Rushin G, Stancil C, Sun M, et al (2017) Horse race analysis in credit card fraud-deep learning, logistic regression, and gradient boosted tree. In: 2017 systems and information engineering design symposium (SIEDS), IEEE, pp 117–121
71. Can B, Yavuz AG, Karsligil EM, et al (2020) A closer look into the characteristics of fraudulent card transactions. IEEE Access 8:166095–166109
72. Ahmed F, Shamsuddin R (2021) A comparative study of credit card fraud detection using the combination of machine learning techniques with data imbalance solution. In: 2021 2nd International Conference on Computing and Data Science (CDS), IEEE, pp 112–118
73. Khatri S, Arora A, Agrawal AP (2020) Supervised machine learning algorithms for credit card fraud detection: a comparison. In: 2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence), IEEE, pp 680–683
74. Torres RAL, Ladeira M (2020) A proposal for online analysis and identification of fraudulent financial transactions. In: 2020 19th IEEE International Conference on machine learning and applications (ICMLA), IEEE, pp 240–245
75. Yu L, Yang Z, Tang L (2016) A novel multistage deep belief network based extreme learning machine ensemble learning paradigm for credit risk assessment. Flex Serv Manuf J 28(4):576–592
76. Huang X, Liu X, Ren Y (2018) Enterprise credit risk evaluation based on neural network algorithm. Cogn Syst Res 52:317–324
77. Oreski S, Oreski G (2014) Genetic algorithm-based heuristic for feature selection in credit risk assessment. Expert Syst Appl 41(4):2052–2064


78. Feng B, Xue W, Xue B, et al (2020) Every corporation owns its image: Corporate credit ratings via convolutional neural networks. In: 2020 IEEE 6th International Conference on Computer and Communications (ICCC), IEEE, pp 1578–1583
79. Dong Y, Xie K, Bohan Z et al (2021) A machine learning model for product fraud detection based on svm. In: 2021 2nd International Conference on Education, Knowledge and Information Management (ICEKIM), IEEE, pp 385–388
80. Zhou Y, Song X, Zhou M (2021) Supply chain fraud prediction based on xgboost method. In: 2021 IEEE 2nd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), IEEE, pp 539–542
81. García V, Marqués AI, Sánchez JS (2019) Exploring the synergetic effects of sample types on the performance of ensembles for credit risk and corporate bankruptcy prediction. Inf Fusion 47:88–101
82. Giudici P, Hadji-Misheva B, Spelta A (2020) Network based credit risk models. Qual Eng 32(2):199–211
83. Chen YR, Leu JS, Huang SA, et al (2021) Predicting default risk on peer-to-peer lending imbalanced datasets. IEEE Access 9:73103–73109
84. Liang K, He J (2020) Analyzing credit risk among chinese p2p-lending businesses by integrating text-related soft information. Electron Commer Res Appl 40:100947
85. Song Y, Wang Y, Ye X et al (2020) Multi-view ensemble learning based on distance-to-model and adaptive clustering for imbalanced credit risk assessment in p2p lending. Inf Sci 525:182–204
86. Niu K, Zhang Z, Liu Y et al (2020) Resampling ensemble model based on data distribution for imbalanced credit risk evaluation in p2p lending. Inf Sci 536:120–134
87. Yang J, Li Q, Luo D (2019) Research on p2p credit risk assessment model based on rbm feature extraction-take sme customers as an example. Open J Bus Manag 7(4):1553–1563
88. Sirignano J, Sadhwani A, Giesecke K (2016) Deep learning for mortgage risk. arXiv preprint arXiv:1607.02470
89. Ostapchenya D (2021) The role of big data in banking : How do modern banks use big data? https://www.finextra.com/blogposting/20446/the-role-of-big-data-in-banking–how-do-modern-banks-use-big-data. Accessed 27 March 2022
90. Du G, Liu Z, Lu H (2021) Application of innovative risk early warning mode under big data technology in internet credit financial risk assessment. J Comput Appl Math 386:113260
91. Gao L, Xiao J (2021) Big data credit report in credit risk management of consumer finance. Wireless Communications and Mobile Computing 2021
92. Wang H (2021) Credit risk management of consumer finance based on big data. Mobile Information Systems 2021
93. Niu A, Cai B, Cai S (2020) Big data analytics for complex credit risk assessment of network lending based on smote algorithm. Complexity 2020
94. Pérez-Martín A, Pérez-Torregrosa A, Vaca M (2018) Big data techniques to measure credit banking risk in home equity loans. J Bus Res 89:448–454
95. Tang H, Zhang Y, Qiao Q, et al (2020) Risk assessment of credit field based on pso-svm. In: 2020 2nd International Conference on Economic Management and Model Engineering (ICEMME), IEEE, pp 809–813
96. Tomczak JM, Zieba M (2015) Classification restricted boltzmann machine for comprehensible credit scoring model. Expert Syst Appl 42(4):1789–1796
97. Lucas Y, Jurgovsky J (2020) Credit card fraud detection using machine learning: A survey. arXiv preprint arXiv:2010.06479
98. Wang X, Xu M, Pusatli ÖT (2015) A survey of applying machine learning techniques for credit rating: Existing models and open issues. In: International Conference on neural information processing, Springer, pp 122–132
99. Breeden JL (2020) Survey of machine learning in credit risk. Available at SSRN 3616342
100. Bhatore S, Mohan L, Reddy YR (2020) Machine learning techniques for credit risk evaluation: a systematic literature review. J Bank Financ Technol 4(1):111–138
101. Leo M, Sharma S, Maddulety K (2019) Machine learning in banking risk management: a literature review. Risks 7(1):29
102. Chi G, Uddin MS, Abedin MZ, et al (2019) Hybrid model for credit risk prediction: an application of neural network approaches. Int J Artif Intell Tools 28(05):1950017
103. Najadat H, Altiti O, Aqouleh AA, et al (2020) Credit card fraud detection based on machine and deep learning. In: 2020 11th International Conference on Information and Communication Systems (ICICS), IEEE, pp 204–208
104. Chen X, Li S, Xu X, et al (2020) A novel gsci-based ensemble approach for credit scoring. IEEE Access 8:222449–222465
105. Wang GG, Deb S, Cui Z (2019) Monarch butterfly optimization. Neural Comput Appl 31(7):1995–2014
106. Wang GG, Deb S, Coelho LDS (2018) Earthworm optimisation algorithm: a bio-inspired metaheuristic algorithm for global optimisation problems. Int J Bioinsp Comput 12(1):1–22
107. Wang GG, Deb S, Coelho LdS (2015) Elephant herding optimization. In: 2015 3rd international symposium on computational and business intelligence (ISCBI), IEEE, pp 1–5
108. Wang GG (2018) Moth search algorithm: a bio-inspired metaheuristic algorithm for global optimization problems. Memetic Comput 10(2):151–164
109. Li S, Chen H, Wang M et al (2020) Slime mould algorithm: a new method for stochastic optimization. Future Gener Comput Syst 111:300–323
110. Yang Y, Chen H, Heidari AA et al (2021) Hunger games search: visions, conception, implementation, deep analysis, perspectives, and towards performance shifts. Expert Syst Appl 177:114864
111. Tu J, Chen H, Wang M et al (2021) The colony predation algorithm. J Bionic Eng 18(3):674–710
112. Heidari AA, Mirjalili S, Faris H et al (2019) Harris hawks optimization: algorithm and applications. Future Gener Comput Syst 97:849–872
113. Ahmadianfar I, Heidari AA, Gandomi AH et al (2021) Run beyond the metaphor: an efficient optimization algorithm based on runge kutta method. Expert Syst Appl 181:115079

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
