Bankruptcy Prediction Report
Abstract
Bankruptcy prediction is the task of predicting bankruptcy and various measures of
financial distress of firms. It is a vast area of finance and accounting research, important
in part because creditors and investors need to evaluate the likelihood that a firm may go
bankrupt. The aim of predicting financial distress is to develop a predictive model that
combines various econometric parameters to forecast the financial condition of a firm.
Many methods have been proposed in this domain, based on statistical hypothesis testing,
statistical modeling (e.g., generalized linear models), and, more recently, artificial
intelligence (e.g., neural networks, Support Vector Machines, decision trees). In this paper
we document our observations as we explore, build, and compare some widely used
classification models pertinent to bankruptcy prediction: Extreme Gradient Boosting for
Decision Trees, Random Forests, Decision Trees, Naïve Bayes, Balanced Bagging and
Logistic Regression. We have chosen the Polish companies' bankruptcy dataset, in which
synthetic features were used to reflect higher-order statistics. A synthetic feature is a
combination of the econometric measures using arithmetic operations (addition,
subtraction, multiplication, division). We begin with data preprocessing and exploratory
analysis, imputing the missing values using several popular imputation techniques: Mean,
k-Nearest Neighbors, Expectation-Maximization and Multivariate Imputation by Chained
Equations (MICE). To address the class imbalance, we apply the Synthetic Minority
Oversampling Technique (SMOTE) to oversample the minority class labels. We then
model the imputed and resampled datasets with the models above, using K-Fold Cross
Validation. Finally, we analyze and evaluate the performance of the models on the
validation datasets using several metrics such as accuracy, precision and recall, and rank
the models accordingly. Towards the end, we discuss the challenges we faced and suggest
ways to improve the prediction, including scope for future work.
Bankruptcy Prediction: Mining the Polish Bankruptcy Data
1. Introduction
The purpose of bankruptcy prediction is to assess the financial condition of a company
and its future prospects in the context of long-term operation on the market [4].
It is a vast area of finance and econometrics that combines expert knowledge about the
phenomenon with historical data of prosperous and unsuccessful companies. Typically,
enterprises are quantified by numerous indicators that describe their business condition,
which are then used to induce a mathematical model from past observations [5].
There are several issues associated with bankruptcy prediction. Two main problems are
the following. First, the econometric indicators describing a firm's condition are proposed
by domain experts, but it is rather unclear how to combine them into a successful model.
Second, the historical observations used to train a model are usually affected by the
imbalanced data phenomenon, because there are typically many more successful companies
than bankrupt ones. As a consequence, the trained model tends to predict companies as
successful (majority class) even when some of them are distressed firms. Both of these
issues strongly influence the final predictive capability of the model.
Turning to modern approaches in the field of bankruptcy prediction, it is worth noting
that survival methods are now being applied, and option valuation approaches involving
stock price variability have been developed. Under structural models, a default event is
deemed to occur when a firm's assets reach a sufficiently low level compared to its
liabilities. Neural network models and other sophisticated models have also been tested on
bankruptcy prediction. Modern methods applied by business information companies go
beyond the content of annual accounts and consider current events such as age,
judgements, bad press, payment incidents and payment experiences from creditors.
Since the 1990s, artificial intelligence and machine learning have become a major research
direction in bankruptcy prediction. In the era of increasing volumes of data, it turned out
that linear models such as logistic regression or logit (probit) models are unable to reflect
non-trivial relationships among economic metrics. Moreover, the estimated weights of the
linear models are rather unreliable indicators of the importance of the metrics.
In order to obtain comprehensible models with an easy-to-understand knowledge
representation, decision rules expressed in terms of first-order logic were induced using
different techniques, to name only a few, rough sets (Dimitras, Slowinski, Susmaga, &
Zopounidis, 1999) [9] and evolutionary programming (Zhang et al., 2013). However, the
classification accuracy of decision rules is very often insufficient; therefore, more
accurate methods were applied to bankruptcy prediction. One of the most successful
models was support vector machines (SVM) (Shin, Lee, & Kim, 2005). The disadvantages
of SVM are that the kernel function must be carefully hand-tuned and that it is impossible
to obtain a comprehensible model.
A different approach aims at automatic feature extraction from data, i.e., automatic
non-linear combination of econometric indicators, which alleviates the problem of a specific
kernel function determination in the case of SVM. This approach applies neural networks
to the bankruptcy prediction (Bell, Ribar, & Verchio, 1990; Cadden, 1991; Coats & Fant,
1991; Geng, Bose, & Chen, 2015; Koster, Sondak, & Bourbia, 1991; Salchenberger, Cinar,
& Lash, 1992; Serrano-Cinca, 1996; Tam, 1991; Tam & Kiang, 1992; Wilson & Sharda,
1994; Zhang, Hu, Patuwo, & Indro, 1999) [10]. The main problem with neural networks
lies in the fact that they can fail in the case of multimodal data. Typically, the econometric
metrics need to be normalized/standardized so that all features are of the same
magnitude. This is also necessary for training neural networks so that the errors can be
backpropagated properly. However, the normalization/standardization of the data does not
reduce the problem of data multimodality, which may drastically reduce the predictive
capabilities of the neural networks. That is why it has been advocated to take advantage
of a different learning paradigm, namely, the ensemble of classifiers (Kittler, Hatef, Duin,
& Matas, 1998) [11]. The idea of ensemble learning is to train and combine typically
weak classifiers to obtain better predictive performance. The first, but still very
successful, approaches were bagging (Breiman, 1996) [12] and boosting (Freund & Schapire,
1996; Friedman, 2001; 2002; Zięba, Tomczak, Lubicz, & Świątek, 2014) [13]. The idea of
boosting was further developed to the case of unequal classification costs (Fan, Stolfo,
Zhang, & Chan, 1999) and imbalanced data (Galar, Fernandez, Barrenechea, Bustince, &
Herrera, 2012) [14]. Recently, the boosting method was modified to optimize a Taylor
expansion of the loss function, an approach known as Extreme Gradient Boosting (Chen
& He, 2015a), which obtains state-of-the-art results in many problems in Kaggle
competitions. It has also been shown that the ensemble classifier can be successfully
applied to bankruptcy prediction (Nanni & Lumini, 2009) [15] and that it significantly
beats other methods (Alfaro, García, Gámez, & Elizondo, 2008) [16].
2. Methodology
2.1 Data
The dataset we have considered for the bankruptcy prediction problem is the Polish
companies' bankruptcy data, hosted by the University of California Irvine (UCI) Machine
Learning Repository, a large repository of freely accessible datasets for research and
learning purposes intended for the machine learning and data science community. The
data was collected from
Table 2: Descriptions of the 64 features (columns: ID, Description).
Table 1 shows the total number of features and instances in the dataset, and the number
of samples in each class (bankrupt or not bankrupt) for all 5 datasets. The features
are explained in Table 2 above. As shown in the table, there are 64 features, labelled X1
through X64, and each one is a synthetic feature. A synthetic feature is a combination
of the econometric measures using arithmetic operations (addition, subtraction,
multiplication, division). Each synthetic feature is a single regression model that is
developed in an evolutionary manner. The purpose of the synthetic features is to combine
the econometric indicators proposed by the domain experts into complex features. The
synthetic features can be seen as analogous to the hidden features extracted by neural
networks, but the fashion in which they are extracted is different.
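To make the idea concrete, the snippet below sketches how a synthetic feature of this kind could be formed. The base measures and combinations shown are hypothetical illustrations only; the actual X1–X64 definitions ship precomputed with the dataset.

```python
import pandas as pd

# Hypothetical base econometric measures (NOT the dataset's real inputs)
base = pd.DataFrame({
    "net_profit": [120.0, -35.0],
    "total_assets": [1500.0, 900.0],
    "total_liabilities": [800.0, 950.0],
})

# Illustrative synthetic features built with simple arithmetic operations
base["synthetic_ratio"] = base["net_profit"] / base["total_assets"]
base["synthetic_equity_share"] = (base["total_assets"] - base["total_liabilities"]) / base["total_assets"]
print(base)
```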
Figure 1: Sparsity matrix for dataset ‘Year 1’. The white spaces indicate
missing data values for the feature in the corresponding column.
Next, we explore the correlation among the features in the Year 1 data as an example.
Shown in Figure 2 is a correlation heatmap for the Year 1 data that describes the degree
of nullity relationship between the different features. The range of this nullity correlation
is from -1 to 1 (-1 ≤ R ≤ 1). Features with no missing values are excluded from the
heatmap, and if the nullity correlation is very close to zero (-0.05 < R < 0.05), no value is
displayed. A perfect positive nullity correlation (R = 1) indicates that whenever the first
feature is missing the second feature is missing as well, while a perfect negative nullity
correlation (R = -1) means that when one of the features is missing the other is not.
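A minimal sketch of how such plots can be produced with the missingno library listed in Table 5; the toy dataframe below is an assumption standing in for one year's data, since the project actually loads the raw .arff files.

```python
import numpy as np
import pandas as pd
import missingno as msno
import matplotlib.pyplot as plt

# Toy frame standing in for one year's data
rng = np.random.RandomState(0)
df = pd.DataFrame(rng.randn(100, 5), columns=[f"X{i}" for i in range(1, 6)])
df[df > 1.5] = np.nan                      # introduce some missing values

msno.matrix(df)     # sparsity matrix: white gaps mark missing values (cf. Figure 1)
msno.heatmap(df)    # nullity correlation heatmap, -1 <= R <= 1 (cf. Figure 2)
plt.show()
```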
We have visually seen the sparsity in the data, as well as the correlation among the features
with respect to missing values. Now, let us see how much of the data is actually missing. In
Table 3 below, the second column shows the total number of instances in each dataset, and
the third column shows the number of instances (rows) with a missing value in at least one
of the features. A naive approach to dealing with missing values would be to drop all such
rows, as in listwise deletion. But dropping all such rows leads to a tremendous data loss.
Column 4 shows the number of instances that would remain in each dataset if all rows with
missing values were dropped, and Column 5 shows the percentage of data lost if those rows
were indeed dropped. As the data loss in most of the datasets is over 50%, it is clear that
we cannot simply drop the rows with missing values, as doing so severely reduces the
representativeness of the data.
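The percentage in Column 5 can be computed directly from a dataframe. A small sketch, assuming df holds one year's features:

```python
import pandas as pd

def listwise_loss(df: pd.DataFrame) -> float:
    """Percentage of rows that would be dropped by listwise deletion."""
    rows_with_missing = df.isnull().any(axis=1).sum()
    return 100.0 * rows_with_missing / len(df)

# Example on a toy frame with one incomplete row out of two (50% loss)
toy = pd.DataFrame({"X1": [1.0, None], "X2": [2.0, 3.0]})
print(f"{listwise_loss(toy):.1f}% of rows lost")
```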
Data Set | # Total Instances | # Bankrupt Instances | # Non-Bankrupt Instances | % Minority Class
Year 1 | 7027 | 271 | 6756 | 3.85%
Year 2 | 10173 | 400 | 9773 | 3.93%
Year 3 | 10503 | 495 | 10008 | 4.71%
Year 4 | 9792 | 515 | 9277 | 5.25%
Year 5 | 5910 | 410 | 5500 | 6.93%
Table 4: Assessing the data imbalance for all the datasets (counts are per forecasting period).
Dropping all the rows with missing values (listwise deletion) introduces bias and affects the
representativeness of the results. The only viable alternative to listwise deletion is
imputation. Imputation is the process of replacing missing data with substituted values; it
preserves all the cases by replacing each missing value with an estimate based on the other
available information. In our project we explored four imputation techniques, which we
describe in the subsequent sections:
1. Mean Imputation
2. k-Nearest Neighbors Imputation
3. Expectation-Maximization Imputation
4. Multivariate Imputation Using Chained Equations
k-NN imputation replaces NaNs in the data with the corresponding value from the
nearest-neighbor row or column, depending upon the requirement. The nearest-neighbor
row or column is the closest row or column by Euclidean distance. If the corresponding
value from the nearest neighbor is also NaN, the next nearest neighbor is used. We used
the fancyimpute library to perform k-NN data imputation, with 100 nearest neighbors.
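The sketch below shows roughly how the four imputations can be invoked on a feature matrix X containing NaNs, using the libraries named in Table 5. The exact call signatures vary by release (sklearn's Imputer was later replaced by SimpleImputer, fancyimpute formerly exposed complete() instead of fit_transform(), and the impyute module path is an assumption), so treat this as a sketch rather than the project's exact code.

```python
import numpy as np
from sklearn.preprocessing import Imputer        # mean imputation (pre-0.22 sklearn API)
from fancyimpute import KNN, MICE                # k-NN and MICE imputation
from impyute.imputation.cs import em             # EM imputation (assumed module path)

rng = np.random.RandomState(0)
X = rng.randn(50, 6)
X[rng.rand(50, 6) < 0.1] = np.nan                # sprinkle ~10% missing values

X_mean = Imputer(strategy="mean").fit_transform(X)   # replace NaNs with column means
X_knn  = KNN(k=3).fit_transform(X)                   # toy k; the report uses k = 100
X_em   = em(X)                                       # expectation-maximization estimates
X_mice = MICE().fit_transform(X)                     # chained-equations imputation
```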
[Figure 3 (pipeline diagram): raw data (.arff files) → formatted 5 years' data → imputation → SMOTE oversampling → modeling (6 models × 4 imputers = 24 analyses) → results.]
Figure 3 shows the data modeling pipeline for our project. After obtaining the formatted
datasets from the raw data (.arff files), we impute the missing values via 4 independent
imputer methods (Mean, k-NN, EM and MICE). We then oversample each of these 4
imputed datasets with the SMOTE oversampling technique, obtaining 4 imputed and
oversampled datasets that are ready for the data modeling step. We model each of these
4 datasets with the 6 models listed above, using the K-Fold Cross Validation technique
for validation.
Hence, towards the end of the modeling step, we obtain 24 different results (6 models ×
4 imputer datasets). In each of the sub-sections that follow, we first explain the model
briefly and specify the hyperparameters used in the experiment. Later, in the Results
section, we report the (cross-validation-average) performance of each model on the
validation data.
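A minimal sketch of the oversampling step on stand-in data; the array shapes and class ratio are illustrative, mimicking the imbalance in Table 4, and older imblearn releases name the method fit_sample instead of fit_resample.

```python
import numpy as np
from imblearn.over_sampling import SMOTE

rng = np.random.RandomState(0)
X_imputed = rng.randn(200, 64)                   # stand-in for 64 imputed synthetic features
y = np.array([0] * 190 + [1] * 10)               # ~5% minority class, as in Table 4

X_res, y_res = SMOTE(random_state=0).fit_resample(X_imputed, y)
print(np.bincount(y), "->", np.bincount(y_res))  # [190 10] -> [190 190]
```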
$$\hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i \mid y)$$

Gaussian Naïve Bayes implements the Gaussian Naive Bayes algorithm for classification.
The likelihood of the features is assumed to be Gaussian:

$$P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\left(-\frac{(x_i - \mu_y)^2}{2\sigma_y^2}\right)$$

The parameters $\sigma_y$ and $\mu_y$ are estimated using maximum likelihood.
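A short sketch of the scikit-learn classifier used for this model, on stand-in data; the real pipeline feeds it the imputed, SMOTE-balanced folds.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.RandomState(0)
X = rng.randn(100, 64)                 # stand-in for the 64 synthetic features
y = rng.randint(0, 2, size=100)        # 0 = not bankrupt, 1 = bankrupt

gnb = GaussianNB().fit(X, y)           # estimates mu_y and sigma_y per class and feature
proba = gnb.predict_proba(X[:5])       # posteriors from the product of Gaussian likelihoods
print(proba)
```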
$$J(\theta) = \sum_{i=1}^{n} \|x_i\|$$

We implemented the model with $\lambda = 1$, giving equal weights to all the features,
using L1 regularization.
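Assuming the passage above refers to the Logistic Regression model (the surrounding context is incomplete), the following is a hedged sketch of an L1-regularized configuration; in scikit-learn the regularization strength is expressed as C = 1/λ, so λ = 1 corresponds to C = 1.0.

```python
from sklearn.linear_model import LogisticRegression

# L1 penalty with C = 1/lambda = 1.0; the liblinear solver supports the L1 penalty
logreg = LogisticRegression(penalty="l1", C=1.0, solver="liblinear")
# logreg.fit(X_train, y_train) would be called on an imputed, SMOTE-balanced fold
```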
A random forest combines many decision trees in order to improve predictive accuracy
and control over-fitting. In random forests, each tree in the ensemble is built from a
sample drawn with replacement from the training set. Also, when splitting a node during
the construction of the tree, the split that is chosen is no longer the best split among all
features; instead, it is the best split among a random subset of the features. As a result of
this randomness, the bias of the forest usually increases slightly but, due to averaging, its
variance decreases, usually more than compensating for the increase in bias and hence
yielding an overall better model. In our model, the number of estimators used is 5, and
we have used entropy as the measure of the quality of a split.
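A brief sketch of the stated configuration (5 estimators, entropy split criterion), using scikit-learn's RandomForestClassifier:

```python
from sklearn.ensemble import RandomForestClassifier

# Each tree is grown on a bootstrap sample; splits are chosen from a random feature subset
rf = RandomForestClassifier(n_estimators=5, criterion="entropy", random_state=0)
# rf.fit(X_train, y_train) on an imputed, SMOTE-balanced training fold
```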
3. Code
The programming environment used for the project is Python v3.6. We used an Intel Core
i5 2.5 GHz Dual Core processor with 8 GB Memory (RAM) and 1 TB of storage (disk
space) to run our experiments. Our code workflow exactly mimics the data modeling
pipeline shown in Figure 3.
We used the libraries listed in Table 5 to run our experiments and achieve our results.
Library | Description
numpy | Data organization and statistical operations.
pandas | Data manipulation and analysis; storing and manipulating numerical tables.
matplotlib | Plotting library.
scipy.io | Loading .arff raw data.
missingno | Generating nullity matrices and correlation heatmaps for missing data.
fancyimpute | Performing k-NN and MICE imputation.
impyute | Performing EM imputation.
sklearn.preprocessing.Imputer | Performing Mean imputation.
sklearn.model_selection.KFold | Performing K-Fold Cross Validation.
imblearn.over_sampling.SMOTE | Performing SMOTE oversampling.
xgboost.XGBClassifier | Extreme Gradient Boosting classifier.
sklearn.ensemble.RandomForestClassifier | Random Forest classifier.
sklearn.linear_model.LogisticRegression | Logistic Regression classifier.
imblearn.ensemble.BalancedBaggingClassifier | Balanced Bagging classifier.
sklearn.tree.DecisionTreeClassifier | Decision Tree classifier.
sklearn.naive_bayes.GaussianNB | Gaussian Naïve Bayes classifier.
sklearn.metrics | Performance evaluation metrics such as accuracy score, recall, precision, ROC curve, etc.
Table 5: Libraries used for the project.
4. Then we perform imputation of the missing data using Mean, k-NN, EM and MICE
imputation techniques and generate fresh dataframes of imputed data.
5. We apply SMOTE oversampling on all these imputed dataframes to obtain fresh
imputed-and-oversampled dataframes, and store them in a dictionary.
6. We create (instantiate) the 6 classifier models (GNB, LR, DT, RF, XGB, BB) and
store them in a dictionary.
7. We iterate over all the models. For each model, we iterate over all 4 imputed-and-
oversampled dataset collections; each collection has 5 dataframes corresponding to
the 5 years' data. On each year's dataset, we train the model using K-Fold Cross
Validation and store the results in nested dictionaries (a condensed sketch of this
loop follows this list).
8. We use the nested dictionaries of results to export the results as CSV files and
generate charts using Excel.
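A condensed, self-contained sketch of the loop in steps 6 and 7; the dictionary names and toy data below are stand-ins for the real per-imputer, per-year collections built in the earlier steps.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
# Stand-ins for the nested dictionaries built in earlier steps
imputed_data = {"Mean": {"Year 1": (rng.randn(60, 64), rng.randint(0, 2, 60))}}
models = {"DT": DecisionTreeClassifier(random_state=0)}

results = {}
for model_name, model in models.items():
    for imputer_name, yearly in imputed_data.items():
        for year, (X, y) in yearly.items():
            scores = []
            for tr, va in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
                model.fit(X[tr], y[tr])
                scores.append(accuracy_score(y[va], model.predict(X[va])))
            results.setdefault(model_name, {}).setdefault(imputer_name, {})[year] = np.mean(scores)

print(results)   # cross-validation-average accuracy per model / imputer / year
```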
4. Results
Our results are organized as follows. First, we report the accuracy score of the 6 models
we have experimented with, using a plot of the accuracy score against each of the
imputation methods (Mean, k-NN, EM and MICE), and, within each, on each of the 5
datasets (Year 1 – Year 5). Later, we also report the accuracy scores by year, i.e., a plot
of accuracy scores for each year's dataset against the 4 imputation techniques, and,
within each, the 6 models.
[Two figures: accuracy scores plotted against the imputation techniques (Mean, KNN, EM, MICE).]
20
Bankruptcy Prediction: Mining the Polish Bankruptcy Data
Although the Year 5 dataset achieved an accuracy above 60% for every imputation
technique, the average accuracy across all years is lower than that of GNB for the
corresponding imputation techniques.
[Figure: Accuracy Score vs. imputation techniques (Mean, KNN, EM, MICE).]
could obtain only 91.10%. The highest difference between the performance of the RF
(90.28%) and DT (87.48%) models was noted for the k-NN imputation method.
[Figure: Random Forest Classifier, Accuracy Score vs. imputation techniques (Mean, KNN, EM, MICE).]
[Figure: Accuracy Score vs. imputation techniques (Mean, KNN, EM, MICE).]
[Figure: Accuracy Score vs. imputation techniques (Mean, KNN, EM, MICE).]
[Figure: Year 1, Accuracy Score vs. imputation techniques (Mean, K-NN, EM, MICE).]
[Figure: Year 2, Accuracy Score vs. imputation techniques (Mean, K-NN, EM, MICE).]
[Figure: Year 3, Accuracy Score vs. imputation techniques (Mean, K-NN, EM, MICE).]
[Figure: Year 4, Accuracy Score vs. imputation techniques (Mean, K-NN, EM, MICE).]
[Figure: Year 5, Accuracy Score vs. imputation techniques (Mean, K-NN, EM, MICE).]
Model | Mean | k-NN | EM | MICE
Gaussian Naïve Bayes | 51.58 | 51.77 | 51.77 | 51.63
Logistic Regression | 49.00 | 49.16 | 48.58 | 49.48
Decision Tree | 91.10 | 87.48 | 90.24 | 89.72
Random Forests | 92.89 | 90.28 | 91.86 | 91.32
Extreme Gradient Boosting | 90.67 | 85.75 | 89.36 | 88.82
Balanced Bagging | 96.59 | 95.14 | 96.07 | 95.86
Table 6: Mean accuracies (%) across all years' datasets for various models and imputation methods.
5. Discussion
As noted in the Results section, the best model we have experimented with so far is the
Balanced Bagging classifier with the Mean imputation method. We implemented the
Balanced Bagging classifier with a Decision Tree as the base estimator and more than 5
estimators. The number of samples to draw from the given training data X to train each
base estimator is 1, and the number of features to draw from the training set X to train
each base estimator is also 1. The samples are drawn with replacement, and the
out-of-bag samples are not used to estimate the generalization error. Figure 15 shows the
effect of varying the number of estimators in the Balanced Bagging classifier for the
various years' datasets. We observe from the figure that, as the number of estimators
grew, the accuracy curves initially rose quickly; after about 10 estimators they began to
converge, although they still fluctuated slightly. The highest accuracy was obtained on the
Year 1 dataset and the lowest on the Year 5 dataset. This is also depicted in Figure 16,
which shows the accuracy of the Balanced Bagging classifier by year for the Mean
imputation technique.
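A hedged sketch of this configuration with imblearn's BalancedBaggingClassifier, interpreting the sample and feature counts of 1 as fractions of 1.0; the base_estimator argument matches the imblearn versions of that era, while newer releases rename it to estimator.

```python
from imblearn.ensemble import BalancedBaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bb = BalancedBaggingClassifier(
    base_estimator=DecisionTreeClassifier(),
    n_estimators=10,      # accuracy largely converges beyond roughly 10 estimators
    max_samples=1.0,      # fraction of samples drawn for each base estimator
    max_features=1.0,     # fraction of features drawn for each base estimator
    bootstrap=True,       # samples are drawn with replacement
    oob_score=False,      # out-of-bag samples not used to estimate generalization error
    random_state=0,
)
# bb.fit(X_train, y_train) on a mean-imputed training fold
```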
Table 6 summarizes the mean accuracies across all the years' datasets for the various
models and imputation methods. Looking at the performance of the models for each
imputation technique, we were surprised by the results of the mean imputation technique.
We expected the MICE imputation technique to yield the best model accuracy, treating
mean imputation only as a baseline. It turned out, however, that Mean imputation gave
better results than most of the other imputation techniques, even though the logic of its
operation is naïve and simple.
When it comes to model ranking, Balanced Bagging outperforms all other models in terms
of accuracy for every imputation technique. While Extreme Gradient Boosting was
expected to be the next best model, it ended up in third place, giving away second place
to the Random Forest model. The Random Forest model, as seen in its analysis in the
Results section, performed slightly better than the Decision Tree model, pushing the
Decision Tree model to fourth place. Although Logistic Regression was expected to
perform better than the Gaussian Naïve Bayes model, it showed the worst performance
among all the models; the Gaussian Naïve Bayes model was only slightly better than
Logistic Regression and ranked fifth.
Figure 16: Performance comparison of various years’ data for Balanced Bagging model.
6. Future Work
In this section we discuss future work in bankruptcy prediction. So far, in our
experiments, we have dealt with the synthetic features, which are arithmetic combinations
of core econometric features. It is also possible to gather more core features and hence
synthesize more synthetic features by varying the arithmetic operations performed on
these core features. Likewise, it is possible to synthesize further features by treating the
current synthetic features as base features. Doing so may result in better prediction of
bankruptcy, but whether such highly complex synthetic features are meaningful in terms
of financial economics has to be thoroughly validated by domain experts. It is also
feasible to reduce the dimensionality of the features. However, for a dataset like the
Polish bankruptcy data, with so much missing data, it becomes difficult to rank the
features and perform feature extraction; features dropped in this manner might actually
have had a significant impact on the prediction had the data not been so sparse. The
takeaway is that if data collected in the future for bankruptcy prediction is made less
sparse, it will be possible to apply all the techniques mentioned above and hence obtain
better predictive models.
7. Conclusion
This section summarizes the work done in this project thus far. We have successfully
built and evaluated 6 classification models: Gaussian Naïve Bayes, Logistic Regression,
Decision Trees, Random Forests, Extreme Gradient Boosting and Balanced Bagging
classifiers. The training sets were balanced by oversampling the minority class labels
using the Synthetic Minority Oversampling Technique. We also imputed the missing
values in the data using 4 imputation techniques: Mean, k-Nearest Neighbors (k-NN),
Expectation-Maximization (EM) and Multivariate Imputation by Chained Equations
(MICE). The biggest challenge was dealing with the missing/sparse data. Since the
companies being evaluated for bankruptcy do not all operate on the same timelines, it is
difficult to gather meaningful data and organize it. The features on which the bankruptcy
prediction is based are not as straightforward as the financial ratios found on company
balance sheets, and need to be thoroughly studied and validated. We have documented
our findings and suggested the best bankruptcy prediction model we have seen in our
project.
8. References
[1] Zięba, M., Tomczak, S., & Tomczak, J. (2016). Ensemble boosted trees with synthetic
features generation in application to bankruptcy prediction. Expert Systems with
Applications.
[2] Zhang, Y., Wang, S., & Ji, G. (2013). A rule-based model for bankruptcy prediction
based on an improved genetic ant colony algorithm. Mathematical Problems in
Engineering, 2013.
[3] Wikipedia contributors. "Bankruptcy prediction". Wikipedia, The Free
Encyclopedia, <https://en.wikipedia.org/wiki/Bankruptcy_prediction>
[4] Constand, R. L., & Yazdipour, R. (2011). Firm failure prediction models: a critique
and a review of recent developments. Advances in Entrepreneurial Finance (pp.
185–204). Springer.
[5] Altman, E. I., & Hotchkiss, E. (2010). Corporate financial distress and bankruptcy:
Predict and avoid bankruptcy, analyze and invest in distressed debt: 289. Hoboken,
New Jersey: John Wiley & Sons.
[6] Koh, H. C., & Killough, L. N. (1990). The use of multiple discriminant analysis in
the assessment of the going-concern status of an audit client. Journal of Business
Finance & Accounting, 17, 179–192.
[7] Laitinen, E. K. (1991). Financial ratios and different failure processes. Journal of
Business Finance & Accounting, 18, 649–673.
[8] Wilcox, J. W. (1973). A prediction of business failure using accounting data.
Journal of Accounting Research, 11, 163–179.
[9] Chen, T., & He, T. (2015b). xgboost: extreme gradient boosting. R package version
0.3-0. Technical Report .
[10] Friedman JH (2001). “Greedy function approximation: a gradient boosting
machine.” Annals of Statistics, pp. 1189–1232.
[11] Bache, K., & Lichman, M. (2013). UCI Machine Learning Repository. URL
http://archive.ics.uci.edu/ml
[12] Friedman J, Hastie T, Tibshirani R, et al. (2000). “Additive logistic regression: a
statistical view of boosting (with discussion and a rejoinder by the authors).” The
annals of statistics, 28(2), 337–407.
[13] Kittler, J., Hatef, M., Duin, R. P., & Matas, J. (1998). On combining classifiers.
Pattern Analysis and Machine Intelligence, IEEE Transactions on, 20, 226–239.
[14] Quinlan, J.R. (1983a). Learning efficient classification procedures and their
application to chess endgames. In R.S. Michalski, J.G. Carbonell & T.M. Mitchell,
(Eds.), Machine learning: An artificial intelligence approach. Palo Alto: Tioga
Publishing Company.
[15] Merwin, C. L. (1942). Financing small corporations in five manufacturing
industries. NBER Books p. 1926-1936. New York: National Bureau of Economic
Research, Inc.
[16] Shapiro, A. (1983). The role of structured induction in expert systems. Ph.D.
Thesis, University of Edinburgh.
[17] Sinkey, J. F. (1975). A multivariate statistical analysis of the characteristics of
problem banks. The Journal of Finance, 30, 21–36.