SVM - Report
Abstract
The assessment of risk of default on credit is important for financial institutions.
Logistic regression and discriminant analysis are techniques traditionally used in
credit scoring for determining likelihood to default based on consumer application
and credit reference agency data. We test support vector machines against these
traditional methods on a large credit card database. We find that they are competitive
and can be used as the basis of a feature selection method to discover those features
that are most significant in determining risk of default.
1. Introduction
Credit scoring is the set of decision models and techniques that aid lenders in granting
consumer credit by assessing the risk of lending to different consumers. It is an
important area of research that enables financial institutions to develop lending
strategies to optimise profit. Additionally, bad debt is a growing social problem that
could be tackled partly by better informed lending enabled by more accurate credit
scoring models. A range of different data mining and statistical techniques have been
used since the 1930s, when numerical score cards were first introduced by mail-order
companies (Thomas et al. 2002, Section 1.3). It is now common for financial
institutions to use statistical methods such as logistic regression (LR) and linear
discriminant analysis (LDA) to build credit scoring models. Potential borrowers are classified according to their probability of default on a loan, based on application and credit reference agency data collected about them. Such models are used by setting a threshold on the estimated probability of default and rejecting loan applications that exceed it.
In this paper, our general framework is to compare the performance of SVM against
several other well-known algorithms: LR, LDA and k-nearest neighbours (kNN). We
extend the work on assessing SVM for credit scoring in several ways.
1. SVM is tested against a much larger database of credit card customers than
has been considered in the literature so far. We restrict our attention to those accounts opened in the same three-month period. Hand (2006) points out that
for many classification problems, the data suffers from population drift, in that
the class distributions shift over time. This is particularly true of credit data
with customer behaviour changing over time due to economic circumstances
or changes in product development and marketing. For this reason a clearer
model can be developed if it is based on data taken from a narrow time period
within which there is likely to be less variability in these circumstances.
Typically, credit data is not easily separable by any decision surface. This is natural
since the data at time of application cannot capture the complexities in each individual
customer’s life that may lead to default. The application data can at best only provide
an indication of default. Consequently, it is usual for the rates of misclassification on
credit data to be between around 20% and 30% (eg see Baesens et al, 2003). This
would be considered a poor result for many other classification problems but is
typical of credit data. The poor separability of the credit data is illustrated in Figure 1.
The good cases tend to cluster towards the bottom-right and the bad towards the top-
left, but this is only a very general trend and there is no clear separation.
Figure 1. Partial least squares was used to transform data for 100 good cases (black) and 100 bad cases (white), selected randomly from the data, into two factors given as the x and y axes of the graph.
3. Methods
The SVM is a relatively new learning algorithm that can be used for classification.
We compare its performance against three older statistical classification methods: LR,
LDA and kNN. All algorithms are described briefly below for a sequence of n training examples (x_1, y_1), …, (x_n, y_n) with feature vectors x_i and class labels y_i. For credit scoring, the class label is either bad or good.
3.1. Support Vector Machine (SVM) classifier
SVM separates binary classified data by a hyperplane such that the margin width
between the hyperplane and the examples is maximized. Statistical learning theory
shows that maximizing the margin width reduces the complexity of the model, consequently reducing the expected generalization error. For problems where data is
not separable by a hyperplane, typical of most real-world classification problems, a
soft margin is used. In this case, training examples are allowed some slack to be on
the wrong side of the margin. However, they accrue a penalty proportional to how far
they are on the wrong side. The sum of the penalties is minimized whilst maximizing
the margin width. A parameter C controls the relative cost of each goal in the overall
optimization problem. The SVM optimization problem can be expressed
algebraically as a dual form quadratic programming problem.
Let y_i ∈ {−1, +1} for all i = 1, …, n. Then the SVM optimization problem is

    max_α  Σ_{i=1}^n α_i − (1/2) Σ_{i,j=1}^n y_i y_j α_i α_j k(x_i, x_j)

subject to constraints

    0 ≤ α_i ≤ C for all i = 1, …, n   and   Σ_{i=1}^n y_i α_i = 0

where α_i is the Lagrange multiplier for training example i. The kernel function k can be used to implement non-linear models of the data. For this paper, we consider three commonly used kernels.
Linear model: k(x_i, x_j) = x_i · x_j
Polynomial model: k(x_i, x_j) = (x_i · x_j + 1)^d, for a chosen degree d
Gaussian RBF model: k(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²)), for a chosen width σ
Using non-linear kernels means that it is not feasible to extract an explicit scorecard, although predictions can still be made with them.
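As a concrete illustration, the three kernel functions can be written in a few lines of NumPy. The feature vectors and the degree and width settings below are invented for the sketch; they are not values used in the paper's experiments.

```python
import numpy as np

# Two hypothetical standardized feature vectors (invented for illustration).
xi = np.array([1.0, 0.5, -0.2])
xj = np.array([0.3, -1.0, 0.8])

def linear_kernel(a, b):
    # k(x_i, x_j) = x_i . x_j
    return a @ b

def poly_kernel(a, b, d=3):
    # k(x_i, x_j) = (x_i . x_j + 1)^d; degree d is an assumed setting.
    return (a @ b + 1) ** d

def rbf_kernel(a, b, sigma=1.0):
    # k(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2)); sigma is assumed.
    return np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

print(linear_kernel(xi, xj), poly_kernel(xi, xj), rbf_kernel(xi, xj))
```

Note that the RBF kernel always lies in (0, 1], equalling 1 only when the two vectors coincide, whereas the linear kernel is unbounded.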
The vector of Lagrange multipliers α is sufficient to define the output decision rule. A classification prediction is made on a new example x as

    ŷ = sgn( Σ_{i=1}^n α_i y_i k(x_i, x) + b )

where b is a threshold term computed as

    b = y_j − Σ_{i=1}^n α_i y_i k(x_i, x_j)   for any j ∈ {1, …, n} such that 0 < α_j < C.
Training examples are called “support vectors” (SVs) if they are on the margin or are
on the wrong side of the margin. This is because together they are sufficient to
“support” the optimal separating hyperplane, since only SVs have α_i > 0. It
follows that the decision rule can be expressed simply in terms of SVs. See Vapnik
(1998) and Cristianini and Shawe-Taylor (2000) for details about SVMs.
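The decision rule can be checked numerically. The sketch below uses scikit-learn (not the SVMlight implementation used in the paper) on invented toy data: `SVC` exposes the products α_i y_i as `dual_coef_` and the threshold b as `intercept_`, so summing α_i y_i k(x_i, x) over the support vectors and adding b should reproduce the fitted classifier's decision values.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Invented toy data standing in for standardized application features.
X = np.vstack([rng.normal(-1.0, 1.0, (40, 2)), rng.normal(1.0, 1.0, (40, 2))])
y = np.array([-1] * 40 + [+1] * 40)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# f(x) = sum_i alpha_i y_i k(x_i, x) + b, summed over support vectors only.
x_new = np.array([[0.2, -0.1]])
f = clf.dual_coef_ @ (clf.support_vectors_ @ x_new.T) + clf.intercept_
print(np.sign(f.ravel()), clf.predict(x_new))
```

Only the support vectors enter the sum, which is why the decision rule can be stored and evaluated without the rest of the training set.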
For LDA, examples are projected onto a direction w chosen to separate the projected class means relative to the within-class spread, where for each class y the projected mean and variance are

    m_y = (1/|C_y|) Σ_{i∈C_y} w · x_i,   s_y² = (1/|C_y|) Σ_{i∈C_y} (w · x_i − m_y)²,   where C_y = {i = 1, …, n | y_i = y}.
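The role of the projected class statistics m_y and s_y² can be illustrated with Fisher's criterion, which scores a candidate direction w by the separation of the projected class means relative to the projected class variances. The data and direction below are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
# Invented two-class data: label -1 for bad cases, +1 for good cases.
X = np.vstack([rng.normal(-1.0, 1.0, (50, 2)), rng.normal(1.0, 1.0, (50, 2))])
y = np.array([-1] * 50 + [+1] * 50)

w = np.array([1.0, 1.0])          # a candidate projection direction
proj = X @ w                      # w . x_i for every example

m, s2 = {}, {}
for label in (-1, +1):
    p = proj[y == label]          # projections of the examples in class C_y
    m[label], s2[label] = p.mean(), p.var()   # m_y and s_y^2

# Fisher criterion: between-class separation over within-class spread.
J = (m[+1] - m[-1]) ** 2 / (s2[+1] + s2[-1])
print(J)
```

A larger J means the two classes are better separated along w; LDA chooses the w that maximizes this ratio.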
For kNN, a new example is classified by the majority class among its k nearest neighbours taken from the training set. The probability of the example belonging to class y is estimated as p̂ = k_y / k, where k_y is the number of those neighbours belonging to class y (Hand 1981, Section 2.4). We use the usual Euclidean distance measure to determine the neighbourhood of an example. Henley and Hand (1997) use kNN for credit scoring and compare it with other methods including LR.
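The estimate p̂ = k_y / k can be sketched directly in NumPy. The training data, query point and value of k below are invented for illustration and are far smaller than the paper's data set and its k = 2000.

```python
import numpy as np

rng = np.random.default_rng(2)
# Invented training data: class 0 = bad, class 1 = good.
X_train = np.vstack([rng.normal(-1.0, 1.0, (100, 2)),
                     rng.normal(1.0, 1.0, (100, 2))])
y_train = np.array([0] * 100 + [1] * 100)

def knn_prob(x, k=15):
    """Estimate P(good | x) as k_y / k over the k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]               # indices of k neighbours
    return y_train[nearest].mean()                # fraction in class "good"

p_good = knn_prob(np.array([1.0, 1.0]))
print(p_good)
```

For a query point deep inside the "good" cluster, nearly all of the k neighbours belong to that class, so the estimate approaches 1.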
Error rates are reported on the test set as the proportion of test examples wrongly
classified. It is possible to set a threshold term for the decision rule output by each
algorithm to control the distribution of cases classified as good or bad. For example,
in LR, we set a threshold t and classify all examples x with P(y = 1 | x) < t as good (ŷ = 0). Otherwise x is classified as a bad case. The threshold setting depends on a
prior assumption of the relative cost of misclassifying good or bad cases. For
example, we expect that a bad case misclassified as good, and so given a loan, would
yield a greater loss – ie the loss of a substantial part of the loan value – than a good
case misclassified as bad, leading to a loan not being made and the subsequent loss of
interest payments. However, it is not reasonable to assume this relative cost for
assessment. Also, using error rates makes it difficult to compare algorithms with
different threshold terms that would lead to different distributions of misclassification
of good and bad cases. Therefore it is usual to measure performance with a receiver
operating characteristic (ROC) curve, which plots sensitivity (true positive rate) against 1 − specificity (false positive rate) for the full range of possible threshold
values. This is a typical performance measure for credit scoring (eg Engelmann et al.
2003, Baesens et al. 2003). The area under the ROC curve (AUC) is used as a single
summary statistic for measuring performance and comparing algorithms (DeLong et
al. 1988). Note that a ROC curve is constructed for SVM by varying the threshold
term b. Reducing this threshold will increase the number of cases classified as bad (ŷ = +1).
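A short sketch of the ROC and AUC computation, using scikit-learn on invented labels and scores: for the eight cases below, 15 of the 16 (good, bad) score pairs are ranked correctly, so the AUC comes out as 15/16.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Invented scores: higher score means the model leans towards "bad" (class 1).
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([0.1, 0.2, 0.6, 0.3, 0.8, 0.7, 0.4, 0.9])

# One (fpr, tpr) point per candidate threshold, swept over all score values.
fpr, tpr, thresholds = roc_curve(y_true, scores)
auc = roc_auc_score(y_true, scores)
print(auc)
```

The AUC equals the probability that a randomly chosen bad case receives a higher score than a randomly chosen good case, which is what makes it threshold-free.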
Guyon et al. (2002) propose using the square of the weights from the hyperplane
generated by SVM as a feature selection criterion. They show that this criterion minimizes generalization risk and apply the technique to cancer classification. They use a recursive procedure, removing a few features at a time. However, since for the credit
scoring problem, there are relatively few features to begin with, we do not need to
apply a recursive procedure. We simply use the magnitude of weights on features as a
feature selection criterion. We set a threshold of 0.1 and all features with weights
greater than this will be selected as significant features. This threshold level is chosen
since we found it yields approximately the same number of features as the LR method
described above. Since the data is standardized, it is reasonable to directly compare
the magnitudes of weights on different features.
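The criterion can be sketched as follows, using scikit-learn on invented toy data with one informative feature and one pure-noise feature; the 0.1 cut-off is the threshold described above, while everything else is an assumption of the sketch.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(3)
# Invented data: feature 0 drives the class label, feature 1 is pure noise.
informative = rng.normal(0.0, 1.0, 400)
noise = rng.normal(0.0, 1.0, 400)
X = StandardScaler().fit_transform(np.column_stack([informative, noise]))
y = (informative > 0).astype(int)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
weights = np.abs(clf.coef_.ravel())   # |w| per feature, data standardized

# Keep features whose standardized weight magnitude exceeds the threshold.
selected = np.where(weights > 0.1)[0]
print(weights, selected)
```

Because the inputs are standardized, the weight magnitudes are on a comparable scale, and the informative feature receives a much larger weight than the noise feature.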
4. Results
Results are given in this section for pre-classification parameter tuning, algorithm
comparison and significant feature discovery using LR and SVM.
It is interesting that the non-linear kernels do not perform better than the simple linear
model. In particular, the polynomial kernel performs poorly. This may be because this
non-linear model is over-fitting the data. This is evident in the difference between the
relatively high training AUC and low test AUC. However, the results are not
sufficient to assert this conclusively.
The best results are achieved when large numbers of SVs are extracted. Over 50% of
training examples are SVs. This is due to the fact that credit data is not easily
separable by any decision surface as explained in Section 2, so many of the training
examples remain misclassified.
For kNN, test AUC was stable at over 0.760 for values of k between 500 and 4000. It
is usual for performance to be stable across a wide range of values of k for large
training sets (Olsson 2006). We choose the mid-range figure k=2000 as an optimal
value for the following comparative experiments.
The standard deviations are relatively low (less than 1% of mean AUC) indicating
that the measured performance is stable.
SVM with a linear or Gaussian model performs best yielding the highest AUC.
However, the differences in performance are small and are not significant. Schebesch
and Stecking (2005) reach a similar conclusion with their experiments.
The only algorithms that stand out as particularly poor are SVM with polynomial
kernel and kNN. As mentioned in Section 4.1, we suspect the polynomial kernel over-
fits the training data. The poor result with kNN corroborates the results given by
Baesens et al (2003).
Neither LR with interaction variables nor SVM with non-linear kernels give an
improvement over the simpler models. This indicates that the data is broadly linearly
separable. Gayler (2006) has argued that interaction variables are less stable than the
main effects and they would usually only be included in a model if the modeller has
prior belief in their relevance to credit scoring. Our results tend to support this view.
Figure 2 shows typical ROC curves taken from one experiment. It is clear that the
ROC curve for SVM, LR and LDA are all very similar. The only algorithm which
gives a distinctly poor ROC curve is kNN which is outperformed by SVM across the
whole range of the graph.
Figure 2. ROC curves for performance on test data, comparing the performance of linear SVM with LR, LDA and kNN. Panels: SVM (unbroken line) against LR, LDA and kNN (broken lines), respectively.
Error rates can be derived by setting a cut-off threshold for each model and predicting
those test cases with a score computed from the model below the cut-off as bads and
those above as goods. Error rates are given by comparing predicted against actual
classifications across the test set. However, error rates are not comparative since each
classifier will yield a different distribution of errors on good and bad cases. This is
why using AUC is a better comparative measure, since it measures predictive
performance across all possible chosen cut-off thresholds. Nevertheless, it is
interesting to review error rates on good and bad cases for SVM to ensure that they
represent an acceptable performance. The natural cut-off for SVM is the threshold
term b described in Section 3.1. For linear SVM the error rates for good and bad
cases in the test set are 27.4% and 29.6% respectively and for SVM with a Gaussian
RBF kernel the error rates for good and bad cases are 27.2% and 30.1%. These
outcomes are within the range of error rates we would expect from predicting default
in credit data, as we discussed in Section 2.
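For concreteness, per-class error rates at a fixed cut-off can be computed as in the sketch below. The scores, labels and cut-off are invented, and the resulting rates merely illustrate the calculation rather than reproduce the paper's figures.

```python
import numpy as np

# Invented model scores and true labels (1 = bad, 0 = good).
y_true = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
scores = np.array([-1.2, -0.8, -0.3, 0.2, -0.5, 0.9, 0.4, -0.1, 1.1, 0.6])

cutoff = 0.0                        # stands in for a model's natural cut-off
y_pred = (scores > cutoff).astype(int)

good_err = np.mean(y_pred[y_true == 0] == 1)   # goods misclassified as bad
bad_err = np.mean(y_pred[y_true == 1] == 0)    # bads misclassified as good
print(good_err, bad_err)
```

Moving the cut-off trades one error rate against the other, which is exactly the trade-off the ROC curve summarizes across all cut-offs.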
The direction of SVM weights and LR coefficient estimates is the same and indicates
how each feature contributes to the risk of default. A positive value indicates higher
risk and a negative value a lower risk. For example, an applicant who has already
applied for credit several times (F6) will be more likely to default and a home owner
(F1) is less likely to default.
These results show that the two methods agree strongly on the most significant
features. The fact that two very different methods give the same results provides
further confidence that these features can be taken forward for use in credit scorecards
to determine the risk of default for individual applicants for credit. It shows that SVM
can be used successfully for feature selection in credit scoring.
5. Conclusions
SVMs are a relatively new technique for application to credit scoring. We test them
on a much larger credit data set than has been used in previous studies. We find that
SVMs are successful in comparison to established approaches to classifying credit
card customers who default. This corroborates the findings of previous researchers. In
addition, we find that, unlike many other learning tasks, a large number of support
vectors are required to achieve the best performance. This is due to the nature of the
credit data for which the available application data can only be broadly indicative of
default. Finally, we show that SVM can be used successfully as a feature selection
method to determine those application variables that can be used to most significantly
indicate the likelihood of default.
There are several further lines of investigation. Firstly, we discovered that the type of
product (F9 in Table 3) is an important indicator of default. It would be interesting to
build separate models for each product to determine how performance and significant
features vary between them. Secondly, we took data from just one three-month
period to avoid the problem of population drift (Hand 2006). It would be interesting to
see how models and performance change across time and how robust simple and
complex models are when tested against test sets drawn from later dates.
Acknowledgements
We used SVM light for this project which is available at
http://svmlight.joachims.org and is documented by Joachims (1999). This
research is funded through EPSRC grant EP/D505380/1.
References
Baesens B, van Gestel T, Viaene S, Stepanova M, Suykens J and Vanthienen J
(2003). Benchmarking state-of-the-art classification algorithms for credit scoring.
Journal of the Operational Research Society 54: 1082-1088.
Cristianini N and Shawe-Taylor J (2000). Support vector machines and other kernel-
based learning methods. Cambridge University Press.
Huang C-L, Chen M-C, Wang C-J (2007). Credit scoring with a data mining
approach based on support vector machines. Expert Systems with Applications
33(4):847-856
Huang Z, Chen H, Hsu C-J, Chen W-H, Wu S (2004). Credit rating analysis with
support vector machines and neural networks: a market comparative study. Decision
Support Systems (Special issue: Data mining for financial decision making)
37(4):543-558.
Lee Y-C (2007). Application of support vector machines to corporate credit rating
prediction. Expert Systems with Applications 33(1):67-74.
Li ST, Shiue W, Huang MH (2006). The evaluation of consumer loans using support
vector machines. Expert Systems with Applications 30(4):772-782.
Schebesch K B and Stecking R (2005). Support vector machines for classifying and
describing credit applicants: detecting typical and critical regions. Journal of the
Operational Research Society 56:1082-1088.
Thomas LC, Edelman DB and Crook JN (2002). Credit Scoring and its Applications.
SIAM Monographs on Mathematical Modeling and Computation. SIAM: Philadelphia, USA.
Van Gestel T, Baesens B, Suykens JAK, Van den Poel D, Baestaens D, Willekens M
(2006). Bayesian kernel based classification for financial distress detection. European
Journal of Operational Research 172: 979-1003.