ABSTRACT Credit scoring models have been widely used in traditional financial institutions for many years. However, using these models in P2P lending has limitations. First, the credit data of P2P lending usually contains both dense numerical features and sparse categorical features. Second, existing credit scoring models generally cannot be updated online. Loan transactions in P2P lending are very frequent, and the new data causes the data distribution to change; a credit scoring model that does not consider data updates suffers serious deviation or even failure in subsequent credit assessment. In this paper, we propose a new online integrated credit scoring model (OICSM) for P2P lending. OICSM integrates gradient boosting decision tree and neural network so that the credit scoring model can handle the two types of features more effectively and be updated online. Offline and online experiments based on real and representative credit datasets are conducted to verify the effectiveness and superiority of the proposed model. Experimental results demonstrate that OICSM can significantly improve performance due to its advantage in deep learning over the two features, and it can further correct model deterioration due to its online dynamic update capability.
INDEX TERMS Online P2P lending, deep learning, credit scoring model, machine learning, online update.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 8, 2020 177307
Z. Zhang et al.: Deep Learning Based Online Credit Scoring Model for P2P Lending
Second, existing credit scoring models require offline training, which makes it difficult to realize online learning and updating of the models. These models are generally constructed and verified offline. They cannot be updated online while they are running; they are usually retrained offline with new data after a period of time (such as one month, one quarter or even longer). However, especially in P2P lending, loan transactions are very frequent. A large number of new loan transactions will be generated, which will cause the data distribution of lending to change before the model is retrained. The lack of the latest data will affect the effectiveness of an updated credit scoring model. A credit scoring model must be able to be trained and updated online to be suitable for scenarios where P2P loan data grows rapidly and changes frequently.

In order to solve the above problems, we propose an online integrated credit scoring model (OICSM) for P2P lending based on machine learning methods. OICSM integrates gradient boosting decision tree (GBDT) and neural network (NN) so that the credit scoring model has online training and update capabilities and can handle multiple types of features. GBDT has good performance in learning over dense numerical data [2], [4], while the NN method is better at learning over sparse categorical data [3], [5]. The proposed OICSM can effectively process dense numerical features and sparse categorical features at the same time. Furthermore, because GBDT cannot process batch data, OICSM uses a neural network to perform knowledge distillation on the knowledge learned by GBDT to realize batch processing, so OICSM can be updated online dynamically.

To verify the effectiveness of OICSM, we select two real and representative credit datasets of P2P lending platforms, Lending Club (LC) in the United States and Paipaidai (PPD) in China. Experimental results show that OICSM not only solves the above two problems effectively, but also has better performance than existing credit scoring models.

This paper makes the following contributions:
• We study the P2P lending credit scoring model from the new perspective of online update. To the best of our knowledge, the problem of online update of credit scoring models has not attracted the attention of researchers. With the generation of new loan data, we verified that an un-updated model has a great impact on the performance of credit assessment.
• We propose an online integrated credit scoring model for P2P lending based on deep learning. OICSM can not only learn over both categorical and numerical credit data effectively, but also update dynamically using the newly generated loan data to avoid prediction deviations. It is especially applicable to P2P lending, which generally has massive and various borrowers and very frequent loan transactions.
• We select two real and representative credit datasets of P2P lending platforms and several representative baseline models for comparison. Offline and online experiments are conducted to verify the effectiveness of OICSM. Experimental results illustrate that OICSM can significantly improve the performance of credit scoring and has the ability to update the model online.

The rest of this paper is organized as follows. We introduce the related work in Section II. We describe the design and implementation of OICSM in Section III. Performance evaluations and results discussion of OICSM are presented in Sections IV and V. Section VI concludes this study.

II. RELATED WORK
Credit scoring is essentially a binary classification method. It is generally used to predict the default probability of loan applicants, and accordingly divides loan applicants into defaulters and non-defaulters [1]. The corresponding models of credit scoring have roughly gone through three development stages: the linear discriminant method, the statistical method, and the machine learning method.

The linear discriminant method was first adopted by Durand and used to discriminate between benign and non-performing loans [9]. So far, it is still used as a benchmark method in a certain range and can be well applied. In 1970, Orgler first introduced a linear regression model in credit scoring [10]. Since this method was later proved to be flawed, non-linear statistical methods (e.g. logistic regression) [11], [12] and nonparametric statistical methods (e.g. decision tree, Bayesian network model) [13], [14] were introduced successively.

With the great development of optimization theory and computer technology, the machine learning method has gradually become the mainstream of personal credit assessment research, and the performance has been greatly improved. Typical methods include neural network [15]–[17], genetic algorithm [18], support vector machine [19], reject inference algorithm [6], gradient boosting decision tree [2], [4], et al. There are also some works that improve the performance through ensemble models [7], [20]–[22]. These methods solve problems such as increased data size and unbalanced data structure from different angles, and have greatly promoted the development of personal credit assessment. According to the current research, GBDT and NN are particularly outstanding in the field of credit scoring due to their good performance [2]–[5], but they also have weaknesses.

Gradient boosting decision tree (GBDT) is a very popular ensemble learning algorithm in recent years. It performs well in various machine learning tasks, such as click prediction [23] and learning to rank [24]. In the field of credit scoring, Chang et al. [4] use eXtreme gradient boosting tree (XGBoost) and Ma et al. [2] use LightGBM to build credit scoring models respectively. XGBoost and LightGBM are the two most popular variants of GBDT. The significant advantage of GBDT depends on its superiority in processing dense numerical features [25], [26]. But meanwhile, it has two limitations [8]. First, it is difficult to update the GBDT model online because the learned base trees are not differentiable. As a result, the credit scoring model can
only be updated offline after a fixed period. The update interval will cause the data distribution to change, which will make the model biased or even invalid. In addition, GBDT does not work well on sparse categorical features, and it usually fails to generate trees effectively. Although some variants of GBDT can convert categorical features into dense numeric features, the raw information may be hurt during the conversion process, resulting in a reduction of model accuracy. Some variants of GBDT [27] can also directly use the categorical features in tree learning, but these models usually over-fit since the data in each category is too little.

Neural networks can learn complex and non-linear knowledge from massive data. When applying them to the field of credit scoring, two advantages help construct models effectively. First, the batch-mode backpropagation algorithm means they can not only learn over large-scale data efficiently, but also use the newly generated loan data to update the model dynamically, without needing to train the model from scratch. Second, they are excellent at learning over sparse categorical features through the embedding structure [28], [29]. However, their inefficiency in learning over dense numerical features is a weakness [8]. Currently, NN has been well applied to the field of credit scoring, such as the Wide & Deep learning model [1] and the RNN model [3], but the weakness above has not been completely overcome. Although a fully connected neural network (FCNN) can be used to learn over numerical features, it easily falls into local optima due to its complex structure [30]. Therefore, in learning with numerical data, the performance of NN does not exceed GBDT [25].

P2P credit data mainly includes two data types: sparse categorical data and dense numerical data. In addition, a P2P lending platform generally has a huge number of users and very frequent transactions. With the rapid increase of users and transactions, new loan data also accumulates rapidly, which will change the data distribution. If a credit scoring model cannot be updated in time, the prediction results are likely to deviate or fail. GBDT and NN each have advantages, but using either method alone cannot meet the above-mentioned requirements of P2P lending credit scoring.

Currently, some papers try to combine the two methods. Some researchers construct tree-like NN models [31], but these works are mainly for computer vision tasks. Humbird et al. try to convert decision trees to NN [32], but it consumes many computing resources. [33], [34] use GBDT and NN together directly, but they cannot be used online efficiently due to the inherent weakness of GBDT. In addition, the DeepGBM [8] framework is designed for online prediction tasks such as flight delay prediction by integrating GBDT and NN. Chen et al. [35] propose a credit assessment model based on DeepGBM for the home credit default risk of banks, but it does not consider the problem of deviations in the model caused by changes in data distribution and cannot be updated online. Although the above works have made great progress, no similar attempt has been made for credit scoring in P2P lending.

III. METHODOLOGY
We present the design of the online integrated credit scoring model (OICSM) for P2P lending based on the framework of DeepGBM [8]. OICSM integrates the advantages of GBDT and NN. It not only can learn over the two different data types of P2P lending data, but can also be updated online.

A. GENERAL DESCRIPTION OF OICSM
In the data warehouse of P2P lending platforms, there are mainly three categories of data: pre-loan data, unfinished loan data, and finished loan data (i.e. the P2P credit data used in our model). The state of each corresponding loan dataset is constantly changing over time, as shown in Figure 1. Specifically, at the pre-loan stage, new loan applications are divided into two categories after credit assessment, namely accepted loans and rejected loans. An accepted loan first forms an unfinished loan after the loan is obtained by the applicant. Then it finally forms a finished loan after the repayment period ends. In other words, the states of a loan data are in a progressive relationship. The variable that controls the progressive relationship is time, and the data in each state is gradually completed. Due to P2P lending's wide customer coverage and large number of users, the number of loan transactions is huge and the data is updated frequently. These have become the two most significant characteristics of P2P loan data.

Figure 2 shows the framework of OICSM proposed in this paper. In this model, the finished loan data in the data warehouse is preprocessed first. We divide the data into two types with numerical and categorical features, and encode them separately. Next, the two types of data are imported into the ‘‘learning over two features’’ module for offline training to generate an initial credit scoring model. The method of learning over two features will be detailed in Section 3.2. When a new loan application appears, the corresponding applicant will be assessed by the trained credit scoring model. If the application is accepted, the applicant will get a loan. After the repayment period ends, a new finished loan data is generated. The states of the three categories of data (new loan application data, unfinished loan data, and new finished loan data) are dynamically changing. As time goes on, more and more new finished loan data are accumulated as a new dataset. In OICSM, when a predetermined model update period or new dataset size is reached, the credit scoring model can be updated online with the new data, without offline retraining. The above process is continuously executed in a loop to form a dynamically updated P2P credit scoring model using online data. The offline training and online dynamic update of OICSM will be described in Section 3.3.

B. DEEP LEARNING METHOD BASED ON TWO FEATURES
As mentioned earlier, the P2P credit data mainly includes sparse categorical and dense numerical features. In this section, we show the method to learn over the two different data types simultaneously for P2P lending.
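The preprocessing step described above — dividing each finished loan record into its dense numerical part and sparse categorical part and encoding them separately — can be sketched as follows. This is a minimal illustration: the field names ("loan_amnt", "grade", ...) are hypothetical and not taken from the real LC/PPD schemas.

```python
# Sketch: split a loan record into numerical and categorical parts,
# and build an integer id table (sparse encoding) for categorical values.
# Field names here are invented for illustration only.

def split_features(record, categorical_keys):
    """Partition one loan record into (numerical, categorical) parts."""
    num = {k: float(v) for k, v in record.items() if k not in categorical_keys}
    cat = {k: str(v) for k, v in record.items() if k in categorical_keys}
    return num, cat

def build_vocab(records, categorical_keys):
    """Map each (feature, category value) pair to an integer id."""
    vocab = {}
    for r in records:
        for k in categorical_keys:
            vocab.setdefault((k, str(r[k])), len(vocab))
    return vocab

records = [
    {"loan_amnt": 5000, "int_rate": 13.5, "grade": "B", "purpose": "car"},
    {"loan_amnt": 12000, "int_rate": 9.2, "grade": "A", "purpose": "house"},
]
num, cat = split_features(records[0], {"grade", "purpose"})
vocab = build_vocab(records, {"grade", "purpose"})
```

The two resulting parts would then feed the two learning modules separately, as described in the next subsection.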
The learning method in this paper contains two modules: CatNN and GBDT2NN [8]. CatNN is a neural network structure that performs better on learning over sparse categorical data, and GBDT2NN is also a neural network structure, distilled from GBDT, that performs better on processing dense numerical data.

1) CatNN FOR CATEGORICAL DATA
Neural networks are widely used to construct prediction models over sparse categorical data. We directly use existing neural network structures that have already proven successful to play the role of CatNN. CatNN can convert high-dimensional sparse categorical vectors into dense vectors effectively using embedding technology, as shown in Equation 1:

E_{V_i}(x_i) = embedding_lookup(V_i, x_i),  (1)

where x_i is the value of the ith feature of sample x, V_i stores all embeddings of the ith feature, and E_{V_i}(x_i) denotes the embedding vector for x_i.
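The lookup in Equation 1 can be illustrated with a minimal sketch: V_i is a table holding one dense vector per category value of the ith feature, and E_{V_i}(x_i) simply selects the row for the observed value x_i. The table contents below are arbitrary toy numbers, not learned embeddings.

```python
# Toy illustration of Equation 1: an embedding lookup selects the dense
# vector stored for the observed category value.

def embedding_lookup(V_i, x_i):
    """Return the dense embedding vector stored for category value x_i."""
    return V_i[x_i]

# Toy embedding table for a 3-value categorical feature, dimension 2.
V_grade = {
    "A": [0.10, -0.30],
    "B": [0.25, 0.40],
    "C": [-0.50, 0.05],
}
e = embedding_lookup(V_grade, "B")  # E_{V_i}(x_i) for x_i = "B"
```

In a real CatNN the table V_i is a trainable parameter matrix updated by backpropagation; here it is fixed for clarity.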
b: LEAF EMBEDDING DISTILLATION
The essential difference in structure between the tree-based model and the NN model makes it difficult to transform between them directly. But an NN model can be used to fit the functions of the tree model approximately to realize knowledge distillation. We use an NN to fit the result clusters of the decision tree, so that it is close to the structure function of the decision tree.

The structure function of a tree t is denoted as C^t(x), and its return value is the output leaf index of sample x. The leaf index C^t(x_i) of the ith training sample x_i on tree t is denoted by its one-hot encoding L^{t,i}. Then, embedding technology is used to reduce the dimension of L^{t,i}. In addition, due to the existence of a bijection between leaf indexes and leaf values, the leaf values can be used to learn the embedding. The embedding learning process can be denoted by Equation 7, where ||(·) is a concatenation operation, which connects the one-hot encoded leaf index vectors of multiple trees in a tree group T into a multi-hot vector, and G^{T,i} = H(||_{t∈T}(L^{t,i}); ω^T) is a one-layered fully connected NN that converts the multi-hot vector into the dense embedding G^{T,i}. Correspondingly, we need to use this new embedding as the distillation target of the NN model, and the learning process can be extended from Equation 6 to Equation 8:

L^T = min_{θ_T} (1/n) Σ_{i=1}^{n} L′(N(x_i[I^T]; θ_T), G^{T,i}),  (8)

where I^T is the set of features used in T. If the number of trees in T is large, we only select the top features in I^T according to their importance.
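The distillation objective in Equation 8 can be sketched with a deliberately simplified stand-in: here the model N(x; θ) is a single linear unit, the loss L′ is a squared error, and the leaf-embedding targets G^{T,i} are one-dimensional toy values; a real GBDT2NN would use a deeper network and embeddings learned from actual tree groups.

```python
# Toy sketch of Equation 8: fit a small model so its output approximates
# the leaf-embedding target of each training sample. The "NN" is a single
# linear layer trained by plain full-batch gradient descent on a squared
# loss (one possible choice of L'). Features and targets are invented.

def fit_distillation(xs, targets, lr=0.1, steps=500):
    """Fit w, b so that w*x + b matches the 1-d embedding targets."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = sum(2 * (w * x + b - g) * x for x, g in zip(xs, targets)) / n
        gb = sum(2 * (w * x + b - g) for x, g in zip(xs, targets)) / n
        w -= lr * gw
        b -= lr * gb
    return w, b

# Hypothetical 1-d features and their (toy) leaf-embedding targets.
xs = [0.0, 1.0, 2.0, 3.0]
targets = [1.0, 3.0, 5.0, 7.0]   # consistent with target = 2x + 1
w, b = fit_distillation(xs, targets)
```

Because the fitted network is differentiable end-to-end, the distilled model can later be updated with new batches, which is exactly the property the tree ensemble itself lacks.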
Through the above procedure, the output of the NN model obtained by knowledge distillation from T is:

y_T(x) = w × N(x[I^T]; θ_T) + w_0.  (9)

Because a GBDT uses k tree groups, the final output of GBDT2NN is:

y_GBDT2NN(x) = Σ_{j=1}^{k} y_{T_j}(x).  (10)

C. MODEL TRAINING AND ONLINE DYNAMIC UPDATE
In this section, we present the offline training and online dynamic update of OICSM. The implementation process is shown in Algorithm 1.

Algorithm 1 The Implementation Process of OICSM
Require: Offline credit data Doff; batch training data (newly generated credit data) Dbatch; initialized trainable parameters w1, w2; hyper-parameters α, β.
Ensure: 0 (non-defaulter) or 1 (defaulter).
1: // Offline Training
2: Train GBDT with Doff;
3: Use Eqn. (7) to obtain the leaf embeddings for the tree groups;
4: Use Eqn. (8) to learn GBDT2NN;
5: Train CatNN with Doff;
6: Combine CatNN and GBDT2NN to get the offline OICSM;
7: Use the loss function Loffline = αL(ŷ(x), y) + β Σ_{j=1}^{k} L^{T_j} to train OICSM;
8: // Online Update
9: When a predetermined model update period or newly generated dataset size is reached, instead of retraining from scratch, use only Dbatch to update OICSM with the loss function Lonline = L(ŷ(x), y).

1) OFFLINE TRAINING
Assume that the credit data (finished loan data) used in offline training is denoted by Doff = {(x1, y1), ..., (xi, yi), ..., (xn, yn)}, where xi = (fNum, fCat), fNum and fCat denote the numerical and categorical features respectively, and yi ∈ {0, 1}, where 1 means default and 0 means no default. For the GBDT2NN module in OICSM, we use Doff = {(fNum, fCatToNum, y)} to train the GBDT model, where fCatToNum denotes numeric features converted from categorical features through a certain feature engineering method. The feature engineering method used here will be described in Section IV. For a trained GBDT model, we use Equation 7 to obtain the leaf embeddings of the tree groups. Thus, a GBDT2NN model can be learned by using Equation 8.

For the CatNN module, unlike GBDT2NN, a feature engineering method is used to convert numeric features into categorical features. Doff = {(fNumToCat, fCat, y)} is used to train the CatNN model. Finally, we combine GBDT2NN and CatNN through Equation 11 to get the offline OICSM model:

ŷ(x) = σ′(w1 × y_GBDT2NN(x) + w2 × y_Cat(x)),  (11)

where w1, w2 are trainable parameters and σ′ is the sigmoid function. The loss function is shown in Equation 12:

Loffline = αL(ŷ(x), y) + β Σ_{j=1}^{k} L^{T_j},  (12)

where L is the cross-entropy loss function, L^T is the embedding loss defined by Equation 8, and α, β are the weight hyper-parameters used to adjust the importance of the two losses.

2) ONLINE UPDATE
In the update period of a traditional credit scoring model, a large number of new loan transactions are finished, which makes the distribution of the new data inconsistent with the data used to train the offline model. Thus, the predictions of the offline model will be biased or even invalid. This problem can be solved by manual updating. However, during this period, with a large amount of new loan data coming, if the model predictions are biased, it may cause serious losses to the platforms and investors of P2P lending. A credit scoring model should have the ability to update online dynamically with newly generated data to adapt to the changes in data distribution.

The proposed OICSM can not only process the two different features effectively, but its batch processing mode can also process massive data and update online dynamically in a timely manner. The update process is that, when a predetermined model update period or new dataset size is reached, the newly generated credit data is input as batch training data into the currently running model, and then the model parameters are updated accordingly, to better adapt to changes in data distribution.

Due to the need for dynamic update, the loss function of the online model is different from that of the offline model. Letting α = 1, β = 0 in Equation 12, we obtain the loss function of the online model as shown in Equation 13:

Lonline = L(ŷ(x), y).  (13)

IV. EXPERIMENT SETUP
In this section, we describe the experimental settings in detail, including compared models, data description, and the specific experimental design.

A. COMPARED MODELS
In our experiments, we select the following baseline models to compare with OICSM:
• Logistic Regression (LR): LR is widely used in the construction of credit scoring models [11], [12].
• GBDT: GBDT is a very popular tree-based algorithm with good performance, and is widely used in the construction of credit scoring models [2], [4]. There are multiple variants of GBDT; we select LightGBM [27] in this
paper. It is good at learning over numeric features, but cannot process categorical features well, and it cannot be updated online.
• Wide&Deep: Wide&Deep [36] is designed for recommender systems, and [1] builds a credit scoring model based on it. It is a deep learning framework composed of a shallow linear model and a deep neural network.
• DeepFM: DeepFM [37] improves the Wide&Deep learning framework by adding an additional FM component. In this paper, we use DeepFM as the basic CatNN. Both it and Wide&Deep are good at learning over categorical features and can be updated online, but they cannot process numerical features well.
• GBDT2NN: GBDT2NN is a part of the OICSM proposed in this paper. Similar to GBDT, it can learn over numerical features but is not good at processing categorical features. But different from GBDT, it can be updated online.

TABLE 1. Details of datasets used in experiments. Sample is the number of samples. Num and Cat are the numbers of numerical and categorical features, respectively. Qn is the nth quarter.
TABLE 2. Details of the credit datasets used in offline experiments.
TABLE 3. Details of batch data division for LC and PPD credit datasets. Sample is the number of samples. Qn is the nth quarter.

B. DATA DESCRIPTION
To verify the effectiveness of OICSM, Lending Club (LC) in the United States and Paipaidai (PPD) in China are chosen as the test datasets.
For LC, we select its published credit data from 2015 to 2017. This dataset contains more than 800,000 items and more than 100 features. For PPD, we select its published credit data from 2013-11 to 2014-11. This dataset contains more than 80,000 items and more than 200 features.

Since the original datasets contain a lot of post-loan features and noisy data, we execute pre-processing operations including deleting post-loan features, deleting features with smaller variances, deleting items and features with a lot of missing values, and ignoring unfinished loan items. After that, the details of the two datasets used in our experiments are shown in Table 1.

In addition, we execute different feature engineering for different baseline models to improve their performance and increase the credibility of the comparative experiments. Specifically, for models that cannot learn over categorical features, such as GBDT, we use label-encoding [38] and binary-encoding methods to convert categorical features to numerical features. For models that cannot learn over numerical features (LR, DeepFM and Wide&Deep), we discretize the numerical features into categorical features. After the above processing, each baseline model can use the information of all features.

C. EXPERIMENTAL DESIGN
We design two experiments, executed offline and online respectively, to verify the effectiveness and superiority of the proposed OICSM.

1) OFFLINE EXPERIMENT
The purpose of this experiment is to verify the offline performance of OICSM, that is, its effectiveness in learning over two different features. To imitate real business scenarios, we divide each dataset into two parts based on time stamps. For the LC credit dataset, the data in 2015 is used as the training set, and the data in 2016Q1-2016Q2 (Qn is the nth quarter) is used as the test set. For the PPD credit dataset, the data in 2013.11-2014.08 is used as the training set, and the remaining data is used as the test set. The details of the divided datasets are shown in Table 2.

2) ONLINE EXPERIMENT
There are two purposes of this experiment: to verify whether the models that can be updated online are better than the models that cannot, and to verify whether our model is superior to the other baseline models that can be updated online. Next, we detail the experimental design from two aspects: the division of batch data, and the model training approach.

First, in terms of batch data division, we divide each credit dataset into 6 consecutive batches (Batches 0 to 5) according to the time slice. Specifically, for the LC credit dataset, we use the data in 2015 as Batch 0, and the data in 2016Q1-2017Q1 is divided into 5 consecutive Batches 1 to 5, which use the quarter (Q) as the time slice. For the PPD credit dataset, the data in 2013.11-2014.05 is used as Batch 0, and the remaining data is divided into 5 consecutive batches which use the month (M) as the time slice. The specific details of the divided datasets are shown in Table 3.

Here, we use quarter (Q) and month (M) as time slices because it is more convenient to observe the online update process and the performance changes of the models over time. In real applications, smaller time slices can be used as required. Moreover, the smaller the time slices, the better the performance will be. To verify this, we also design a comparison experiment by introducing smaller time slices: specifically, for the LC credit dataset, we use quarter (Q) and month (M) as time slices respectively for comparison; and for the PPD credit dataset, we use month (M) and half month (HM) as time slices respectively for comparison.
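The batch division above can be sketched as follows. This is a simplified illustration: timestamps are plain "YYYY-MM" strings and the records are toy data, not the real LC/PPD samples.

```python
# Sketch: group loan records into consecutive batches by a time slice
# (quarter "Q" or month "M"), as in the online experiment design.

def time_slice_key(timestamp, slice_unit):
    """Map 'YYYY-MM' to a batch key: 'YYYY-Qn' for quarters, as-is for months."""
    year, month = timestamp.split("-")
    if slice_unit == "Q":
        quarter = (int(month) - 1) // 3 + 1
        return f"{year}-Q{quarter}"
    return timestamp  # month-level slice

def divide_into_batches(records, slice_unit):
    """Group records into consecutive batches ordered by time slice."""
    batches = {}
    for r in records:
        batches.setdefault(time_slice_key(r["date"], slice_unit), []).append(r)
    return [batches[k] for k in sorted(batches)]

records = [{"date": d} for d in
           ["2016-01", "2016-02", "2016-04", "2016-05", "2016-07"]]
quarterly = divide_into_batches(records, "Q")  # groups into Q1, Q2, Q3
monthly = divide_into_batches(records, "M")    # one batch per month
```

The same records thus yield fewer, larger batches under the coarser slice and more, smaller batches under the finer one, which is exactly the trade-off the comparison experiment probes.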
Second, in terms of model training, we distinguish two types of models based on whether they can be updated online. Updatable models, including Wide&Deep, DeepFM, GBDT2NN and OICSM, are trained using the data of each batch over time. Specifically, for the ith batch, we use only the samples in that batch to train or update the model. Samples in the (i+1)th batch are used for evaluation.

For non-updatable models, including GBDT and the offline version of OICSM (denoted as OICSM-off), we only use Batch 0 to train them, and then use the trained models to predict Batch 1-Batch 5 separately, without updating the model.

FIGURE 3. Online performance comparison on LC dataset.
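The online protocol above — train on Batch 0, then for each later batch evaluate first and update afterwards with the cross-entropy loss Lonline, never retraining from scratch — can be sketched with a logistic model standing in for OICSM's differentiable NN structure. All data below is synthetic toy data, and the simple scorer is an assumption for illustration, not the authors' architecture.

```python
# Sketch of the rolling online-update protocol: evaluate on batch i with
# the model updated through batch i-1, then update with batch i.
import math, random

class OnlineScorer:
    def __init__(self):
        self.w, self.b = 0.0, 0.0

    def predict(self, x):
        return 1.0 / (1.0 + math.exp(-(self.w * x + self.b)))

    def update(self, batch, lr=0.5, epochs=200):
        """Gradient steps on cross-entropy; same routine offline and online."""
        for _ in range(epochs):
            gw = sum((self.predict(x) - y) * x for x, y in batch) / len(batch)
            gb = sum((self.predict(x) - y) for x, y in batch) / len(batch)
            self.w -= lr * gw
            self.b -= lr * gb

def accuracy(model, batch):
    return sum((model.predict(x) > 0.5) == (y == 1) for x, y in batch) / len(batch)

# Toy batches: label 1 ("default") when the single feature is positive.
random.seed(0)
def make_batch(n=50):
    return [(x, 1 if x > 0 else 0) for x in
            (random.uniform(-2, 2) for _ in range(n))]

batches = [make_batch() for _ in range(4)]
model = OnlineScorer()
model.update(batches[0])                        # "offline" training on Batch 0
scores = []
for i in range(1, len(batches)):
    scores.append(accuracy(model, batches[i]))  # evaluate on batch i first
    model.update(batches[i])                    # then update online with batch i
```

A non-updatable baseline would simply skip the final `update` call inside the loop, which is the only difference between the two model families in this experiment.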
V. RESULTS DISCUSSION
The experimental results are analyzed and discussed in this section. Considering that the credit data is unbalanced and the overall accuracy is not appropriate to evaluate the models, we use the area under the ROC curve (AUC) [7] as the performance evaluation indicator. All experiments are run 5 times, each time using a different random number seed.

A. OFFLINE EXPERIMENT RESULTS
The offline experimental results of all models on the two credit datasets are shown in Table 4. The results show that:
• LR has the worst performance, because LR has difficulty fitting the true distribution of massive and complex credit data.
• GBDT performs better than Wide&Deep and DeepFM on both datasets. We can see that the number of numerical features is significantly larger than the number of categorical features in both the LC and PPD credit datasets. GBDT performs better than the NN models (Wide&Deep and DeepFM) in learning tasks with more numerical features.
• GBDT2NN, as an integral part of OICSM, is distilled from GBDT. The experimental results show that GBDT2NN is superior to GBDT on both credit datasets. This indicates that, for credit datasets that contain both numerical and categorical features, GBDT2NN can improve the performance of GBDT through knowledge distillation.
• OICSM is superior to all other baseline models. The results show that OICSM increases the AUC by 1%-7% compared to the other four baseline models, and even more compared to LR. Because OICSM combines the advantages of GBDT and NN, it has the ability to deal with categorical and numerical features at the same time effectively. This superiority of OICSM shows that it is very suitable for P2P lending credit scoring.

B. ONLINE EXPERIMENT RESULTS
FIGURE 4. Online performance comparison on PPD dataset.
The online experimental results of all models on the LC and PPD datasets are shown in Tables 5 to 6. From the results of the AUC scores, we can see that, with the addition of each batch of data, the performance of all models changes accordingly. To show these changes more clearly, the AUC scores in Table 5 and Table 6 are plotted in Figure 3 and Figure 4 respectively. From the results we can see that:
• For the non-updatable models GBDT and OICSM-off, on Batch 1 they have good performance on both credit datasets. However, because they cannot be updated online, their performance drops rapidly after Batch 1. When Batch 5 arrives, the performance of GBDT becomes the worst.
• For the updatable baseline models Wide&Deep, DeepFM and GBDT2NN, their performance on the LC credit dataset also gradually declines after Batch 1, but the decline rate is slower than that of GBDT and OICSM-off. On the PPD credit dataset, their performance is relatively stable. This proves that these updatable models have obvious advantages over non-updatable models. It is necessary to update the credit scoring model online in time with the newly generated data.
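The AUC metric used throughout this section can be computed directly from its rank-statistic definition: the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter, with ties counting one half. A minimal sketch (toy scores, O(n²) pairwise form rather than an optimized implementation):

```python
# Sketch: AUC via pairwise comparison of positive and negative scores.

def auc(labels, scores):
    """Area under the ROC curve from its rank-statistic definition."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    total = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                total += 1.0
            elif p == q:
                total += 0.5
    return total / (len(pos) * len(neg))

labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
value = auc(labels, scores)
```

Because AUC depends only on the ranking of scores, it is insensitive to the class imbalance noted above, which is why it is preferred here over overall accuracy.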
TABLE 7. The AUC scores of OICSM using different time slices on LC dataset. Q is quarter and M is month.
TABLE 8. The AUC scores of OICSM using different time slices on PPD dataset. M is month and HM is half month.
• In addition, on these two credit datasets, the performance As shown in Table 7, the AUC scores of OICSM (M) are
of GBDT2NN is better than Wide&Deep and DeepFM. higher than OICSM (Q) in all batch stages. Similarly
This indicates that, for credit datasets that contain more in Table 8, the AUC scores of OICSM (HM) are higher than
numerical features than categorical features, GBDT2NN is superior to the other models with NN structures.
• Finally, the performance of OICSM is better than that of all baseline models, because it can not only use the newly generated batch data to update the model online dynamically, but also learn over the categorical and numerical features effectively at the same time.
In addition, the smaller the time slices, the better the performance of the models. We design a comparison experiment to verify this. The experimental results on the LC and PPD datasets are shown in Tables 7 and 8, respectively. OICSM (Q) and OICSM (M) in Table 7 denote the results using a quarter (Q) and a month (M) as the time slice, respectively. OICSM (M) and OICSM (HM) in Table 8 denote the results using a month (M) and half a month (HM) as the time slice, respectively. Please note that, to show the comparison results more clearly, we average the results obtained with the smaller time slices. For example, for the 2016Q1 credit dataset of LC, we first divide it into three subsets, 2016-01, 2016-02 and 2016-03, then repeat the online experiments on these three subsets to obtain three AUC scores. Finally, the average of the three AUC scores is used to represent the performance of OICSM (M). Meanwhile, the AUC score of OICSM (Q) on 2016Q1 can be obtained from Table 5. Through the above processing, OICSM (Q) and OICSM (M) are comparable on the same dataset. The results show that OICSM (Q) performs worse than OICSM (M) in all batch stages; thus, the performance based on the smaller time slice is better. This result further verifies that updating the credit scoring model in a timely manner can improve the classification performance and stability of the model, and avoid the model deviation caused by changes in the data distribution.

C. EXPERIMENT SUMMARY
In summary, the following conclusions can be drawn from the experimental results:
• In the offline experiment, the performance of OICSM is better than that of all baseline models, which shows that an effective credit scoring model needs the capability to learn over both the categorical and numerical features simultaneously.
• In the online experiment, the performance of updatable models is better than that of non-updatable models, which shows that it is necessary to use the newly generated data to update the model online dynamically to correct the model deviation caused by changes in the data distribution.
• Combining the offline and online experiments, the performance of OICSM is better than that of all baseline models, which shows that our model is effective and can simultaneously solve the two problems in existing models.
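The averaging procedure described above can be sketched in a few lines. The AUC implementation and the record layout below are illustrative stand-ins, not the paper's actual evaluation pipeline:

```python
from statistics import mean

def auc_score(labels, scores):
    """Plain pairwise AUC (fine for a sketch; use
    sklearn.metrics.roc_auc_score in practice)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def monthly_average_auc(quarter):
    """Split a quarterly record list into monthly subsets, score each
    subset, and average the AUCs -- mirroring how OICSM (M) is made
    comparable to OICSM (Q) on the same quarter."""
    months = sorted({r["month"] for r in quarter})
    aucs = []
    for m in months:
        subset = [r for r in quarter if r["month"] == m]
        aucs.append(auc_score([r["label"] for r in subset],
                              [r["score"] for r in subset]))
    return mean(aucs)

# Toy 2016Q1-style data with hypothetical model scores.
records = [
    {"month": "2016-01", "label": 1, "score": 0.9},
    {"month": "2016-01", "label": 0, "score": 0.2},
    {"month": "2016-02", "label": 1, "score": 0.7},
    {"month": "2016-02", "label": 0, "score": 0.4},
    {"month": "2016-03", "label": 1, "score": 0.6},
    {"month": "2016-03", "label": 0, "score": 0.6},
]
print(monthly_average_auc(records))  # average of 1.0, 1.0 and 0.5
```

The per-month AUCs are averaged into a single score so that a model evaluated on monthly slices can be compared against one evaluated on the whole quarter.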
VI. CONCLUSION
In this paper, we propose a new credit scoring model, OICSM, for P2P lending. OICSM is composed of two parts: a gradient boosting decision tree (GBDT) component and a neural network (NN) component. This integration can not only learn over the two types of features simultaneously, but also update online dynamically using the batch processing capability of its NN structure. To verify the effectiveness and superiority of the proposed OICSM, we select two real and representative credit datasets of P2P lending and design offline and online experiments. Experimental results demonstrate that OICSM outperforms all other baseline models. However, the method still has a cold start problem; in future work, we will try to address it with transfer learning. OICSM can make a more accurate assessment of a loan applicant's credit and is especially suitable for P2P lending with very frequent transactions and massive numbers of users.
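As a rough illustration of the online dynamic update idea (not the actual OICSM architecture, which integrates GBDT and NN components), the sketch below updates a simple logistic scorer with one gradient step per newly arriving batch, so the model adapts as the data distribution drifts:

```python
import numpy as np

class OnlineScorer:
    """Minimal logistic scorer updated per incoming batch -- a sketch of
    the online-update loop, with hypothetical names throughout."""
    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.w + self.b)))

    def partial_fit(self, X, y):
        # One gradient step on the new batch: previous weights are kept,
        # so knowledge accumulates while the model tracks recent data.
        p = self.predict_proba(X)
        self.w -= self.lr * (X.T @ (p - y)) / len(y)
        self.b -= self.lr * np.mean(p - y)

rng = np.random.default_rng(0)
model = OnlineScorer(n_features=3)
for _ in range(200):                              # stream of new batches
    X = rng.normal(size=(32, 3))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)     # toy default label
    model.partial_fit(X, y)
```

In a full system each arriving batch of settled loans would be labeled (default or repaid) and fed to `partial_fit`, rather than being simulated as above.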
ZAIMEI ZHANG received the Ph.D. degree in management science and engineering from Hunan University, China, in 2011. She is currently an Assistant Professor with the College of Economics and Management, Changsha University of Science and Technology, China. Her research interests include financial engineering, big data, and artificial intelligence.

YAN LIU received the Ph.D. degree in computer science and technology from Hunan University, China, in 2010. He is currently an Associate Professor with the College of Computer Science and Electronic Engineering, Hunan University, China. His research areas include big data, artificial intelligence, and parallel and distributed systems.