Credit Scoring with a Feature Selection Approach Based Deep Learning
Abstract. In financial risk, credit risk management is one of the most important issues in financial decision-making.
Reliable credit scoring models are crucial for financial agencies to evaluate credit applications and have been widely
studied in the fields of machine learning and statistics. Deep learning is a powerful classification tool which is
currently an active research area and successfully solves classification problems in many domains. Deep learning
provides training stability, generalization, and scalability with big data, and is quickly becoming the
algorithm of choice when high predictive accuracy is required. Feature selection is a process of selecting a subset of relevant
features, which can decrease the dimensionality, reduce the running time, and improve the accuracy of classifiers. In
this study, we constructed a credit scoring model based on deep learning and feature selection to evaluate the
applicant’s credit score from the applicant’s input features. Two public datasets, the Australian and German credit datasets,
were used to test our method. Experimental results on these real-world data showed that the proposed method
achieves a higher prediction rate than a baseline method on some datasets and also shows performance comparable to,
and sometimes better than, the feature selection methods widely used in credit scoring.
© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution
License 4.0 (http://creativecommons.org/licenses/by/4.0/).
MATEC Web of Conferences 54, 05004 (2016) DOI: 10.1051/matecconf/20165405004
MIMT 2016
In step 2, we use deep learning with n-fold cross validation to train the classifier. In the j-th cross validation,
we obtain a set $(F_j, A_j^{learn}, A_j^{validation})$ consisting of the feature importance, the learning accuracy, and the
validation accuracy, respectively. We use those values to compute the score criterion in step 3.

In step 3, we use the results from step 1 and step 2 to build the score criterion, which is used in step 4. The
score of the i-th feature is calculated by:
(1)

... tried with value of 10. The averages of classification results are depicted in Fig. 1.
The best subset contains 19 features and its accuracy is 74.68 %.
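The cross-validation loop of step 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the deep network is replaced by a small nearest-centroid stand-in (the hypothetical `CentroidClassifier`), its `feature_importance` is an assumed proxy, the data are synthetic rather than the Australian or German datasets, and the fold count of 5 is an arbitrary choice. Only the shape of the output, one triple of feature importance, learning accuracy, and validation accuracy per fold, follows the text.

```python
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle sample indices and split them into n_folds disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[k::n_folds] for k in range(n_folds)]


def accuracy(model, X, y):
    """Fraction of rows whose predicted label equals the true label."""
    hits = sum(model.predict(row) == label for row, label in zip(X, y))
    return hits / len(y)


class CentroidClassifier:
    """Hypothetical stand-in for the deep network: nearest class centroid."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [r for r, l in zip(X, y) if l == label]
            # Per-feature mean of all training rows with this label.
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, row):
        # Label whose centroid has the smallest squared distance to the row.
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids[label]))
        return min(self.centroids, key=sq_dist)

    def feature_importance(self):
        # Assumed proxy: per-feature spread between the class centroids.
        return [max(col) - min(col) for col in zip(*self.centroids.values())]


def cross_validate(X, y, n_folds=5, make_model=CentroidClassifier):
    """Return one (F_j, A_j_learn, A_j_validation) triple per fold j."""
    results = []
    for val_idx in k_fold_indices(len(y), n_folds):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_tr, y_tr = [X[i] for i in train_idx], [y[i] for i in train_idx]
        X_va, y_va = [X[i] for i in val_idx], [y[i] for i in val_idx]
        model = make_model().fit(X_tr, y_tr)
        results.append((model.feature_importance(),
                        accuracy(model, X_tr, y_tr),
                        accuracy(model, X_va, y_va)))
    return results


# Synthetic data (assumption): 60 rows, 3 features, only feature 0 is informative.
rng = random.Random(1)
y = [i % 2 for i in range(60)]
X = [[rng.gauss(2.0 * label, 1.0), rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]

results = cross_validate(X, y, n_folds=5)
for j, (F_j, a_learn, a_val) in enumerate(results, start=1):
    print(j, [round(f, 2) for f in F_j], round(a_learn, 2), round(a_val, 2))
```

The per-fold triples returned by `cross_validate` are the quantities that step 3 combines into the score criterion; passing a different `make_model` factory is how an actual deep learning classifier would be substituted for the stand-in.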