

IAES International Journal of Artificial Intelligence (IJ-AI)

Vol. 8, No. 2, June 2019, pp. 168~174

ISSN: 2252-8938, DOI: 10.11591/ijai.v8.i2.pp168-174

Prediction of bankruptcy using big data analytic based on fuzzy c-means algorithm

Arup Guha, N. Veeranjaneyulu

Vignan’s Foundation for Science, Technology and Research, Vadlamudi, India

Article Info

Article history:
Received Feb 18, 2019
Revised Apr 14, 2019
Accepted May 16, 2019

Keywords:
Artificial neural network
Cluster-based sampling
Fuzzy c means clustering
Genetic algorithm
Machine learning
Under-sampling technique

ABSTRACT
This paper proposes an optimization approach that applies cluster-based sampling with the fuzzy
c-means algorithm to the classifier in order to select the most appropriate instances of
bankruptcy. The method was examined using a clustering method and a GA-based artificial neural
network to address the existing data imbalance issue. The objective of this paper is to optimize
the selected GA-ANN design model using the fuzzy c-means algorithm to predict corporate
bankruptcies from the financial ratios of companies across several industries over the period
from 1994 to 2014. The effectiveness of the method was demonstrated by comparing its accuracy
rate with the results of the existing method. The accuracy rate of the method was found to be
78.2% and the misclassification rate 0.2178.

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Arup Guha,
Vignan’s Foundation for Science,
Technology and Research,
Vadlamudi, India.
Email: [email protected]

1. INTRODUCTION
The increasing amount of data has led to the evolution of data science and to its application to
complex classification problems that inform important managerial decisions [1]. This paper is based
on an optimization approach for an ANN (artificial neural network) using fuzzy clustering, a type
of sampling technique for extracting appropriate information from a random data set [2].
Clustering is an effective data mining technique that is widely used for descriptive learning in
analytics for predicting corporate bankruptcy [3]. It is based on determining groups with similar
features within a large random data set. It is a popular sampling technique for imbalanced data
because it is very difficult to identify patterns among data containing extreme values, either
very high or very low [4]. Handling such imbalanced data is important prior to model development
because, if the difference in size between the classes is too large, the cases of bankruptcy are
ignored in the analysis. The basic methods considered in bankruptcy prediction apply undersampling
to the majority class and oversampling to the minority class.
The paper applies a genetic algorithm (GA) in combination with an artificial neural network (ANN),
i.e., the GA-ANN modelling technique [5]. Conducting classification tasks on unbalanced data
usually deteriorates classification performance. If the difference in data size between the two
categories is large, most of the data is classified as the majority class in order to decrease the
overall misclassification [4]. Therefore, handling unbalanced data is a crucial procedure in model
development. This remains a major drawback of classification and prediction techniques, and it
affects classification performance. Performance metrics such as the AUROC, the AR, or the
H-measure have no definite criteria for evaluating the excellence of model performance. Refining
the data improves the performance of the ANN because it keeps the computation time in check and
avoids spending extra computing resources on training ANNs [5]. Optimizing the required data helps
improve classification accuracy and thereby enhances the prediction results.

Journal homepage: http://iaescore.com/online/index.php/IJAI

The proposed method is applied to bankruptcy prediction using financial data that were collected
with a focus on the proportion of small and medium-scale bankrupt firms. The intention of the
study is to clarify and investigate how a machine learning technique can be exploited within the
field of economics. More specifically, the aim of the research is to refine how machine learning
strategies can be harnessed to predict corporate bankruptcies. We intend to apply an approach that
selects the optimal training data set and finds proper connection weights for training the ANN
model, employing a multi-modal GA with the fuzzy c-means algorithm to find multiple solutions for
the cut-off values of every cluster. By employing clustering and optimal selection in this way,
the neural network is improved, because a feature selection method that identifies the most
effective features for the classifier enhances the accuracy of corporate bankruptcy prediction [4].
The remainder of the paper is organized as follows. Section 2 describes existing sampling
techniques and how researchers have used them over time to solve complex problems of imbalanced
data management. Section 3 presents the proposed fuzzy cluster-based technique for solving the
bankruptcy problem more specifically. Section 4 presents the implementation and experimental
results of the proposed technique. Section 5 concludes with the findings of the proposed algorithm,
together with research gaps and limitations.

2. LITERATURE REVIEW
Previous studies on the data imbalance problem fall into two approaches: the data level and the
algorithm level. This section discusses the various concepts, techniques and systems based on the
existing research.

2.1. Undersampling technique

Undersampling reduces the number of instances in order to balance a dataset consisting of a
majority class and a minority class. It is efficient when dealing with large amounts of data, and
it is helpful because the training time on the dataset is reduced. However, the method has
disadvantages: it risks distorting the original distribution of the majority class, and
potentially useful data is discarded. It is crucial to have a relevant dataset to improve the
classification performance of a model by sampling data with similar properties. Random
undersampling, the simplest method, reduces the dataset by removing a randomly sampled subset from
the majority class. However, partial data can still be used in data modeling because, in the era
of big data, the remaining amount of data is sufficient for analysis [4].
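Random undersampling as described above can be illustrated with a minimal Python sketch. The firm records, the function name and the choice to keep exactly as many majority instances as there are minority instances are assumptions made for illustration; they are not taken from the cited studies.

```python
import random

def random_undersample(majority, minority, seed=0):
    """Balance two classes by randomly dropping majority-class instances."""
    rng = random.Random(seed)
    # Assumption: keep as many majority instances as there are minority instances.
    keep = min(len(majority), len(minority))
    kept_majority = rng.sample(majority, keep)
    return kept_majority, list(minority)

# Hypothetical toy data: 10 non-bankrupt firms (majority), 3 bankrupt firms (minority).
non_bankrupt = [("firm_nb_%d" % i, 0) for i in range(10)]
bankrupt = [("firm_b_%d" % i, 1) for i in range(3)]

kept, minority = random_undersample(non_bankrupt, bankrupt)
print(len(kept), len(minority))  # 3 3
```

The discarded majority instances are simply lost, which is the drawback noted in the text.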
A cluster-based undersampling approach first clusters all instances of the data, dividing them
into several clusters [6]. Next, it selects a suitable number of majority-class instances from
each cluster on the basis of the ratio of the number of majority-class instances to the number of
minority-class instances within the cluster. Clustering, ensemble and undersampling methods were
combined in one study to solve the class imbalance problem [7]. The authors first clustered the
instances of the majority class and then constructed multiple training datasets comprising sampled
majority-class instances from each cluster, preserving the instances of the minority class.
Evolutionary sampling based on GA has been deployed to selectively remove instances from the
majority class [8,9]. However, previous studies on evolutionary sampling using GA have reported
that exploring optimal or near-optimal solutions is time-consuming, since the instances of the
majority class become the strings for the GA search. Thus, this study suggests cluster-based
sampling supported by GA in order to handle the inefficiency of the existing evolutionary
sampling method.
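The proportional selection step of cluster-based undersampling can be sketched as follows. This is one possible reading of the description above, not the exact procedure of [6]: cluster assignments are assumed to be given, the function name and toy data are hypothetical, and a cluster without minority instances is treated as if it had one.

```python
import random
from collections import defaultdict

def cluster_undersample(labels, clusters, target, seed=0):
    """Pick about `target` majority (label 0) indices, spread over clusters.

    Each cluster's quota is proportional to its majority/minority ratio,
    so clusters dominated by the majority class contribute more instances.
    """
    rng = random.Random(seed)
    majority = defaultdict(list)
    minority_count = defaultdict(int)
    for idx, (y, c) in enumerate(zip(labels, clusters)):
        if y == 0:
            majority[c].append(idx)
        else:
            minority_count[c] += 1
    # Assumption: a cluster with no minority instance counts as having one.
    ratio = {c: len(m) / max(minority_count[c], 1) for c, m in majority.items()}
    total = sum(ratio.values())
    selected = []
    for c, members in majority.items():
        quota = min(len(members), round(target * ratio[c] / total))
        selected.extend(rng.sample(members, quota))
    return sorted(selected)

# Hypothetical toy data: 0 = non-bankrupt (majority), 1 = bankrupt (minority).
labels   = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0]
clusters = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
picked = cluster_undersample(labels, clusters, target=3)
print(picked)
```

In this toy case cluster 0 has a ratio of 6 and cluster 1 a ratio of 1.5, so two of the three selected instances come from cluster 0.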

2.2. Clustering of non-bankruptcy firm data based on majority class

A cluster-based boosting algorithm using the Instance Hardness Threshold and the CBoost algorithm
was proposed in one study as a robust framework for predicting bankruptcy effectively from an
imbalanced financial dataset [3]. The framework was verified on the KBD (Korean bankruptcy
dataset), which has a small balancing ratio in both the training and testing phases. In the
experiments, the proposed model achieved 86.8% in AUC, i.e., the area under the ROC curve, and
outperformed other existing methods for bankruptcy prediction on imbalanced data. Machine learning
methods were applied to a dataset collected from manufacturing companies in Korea in order to
predict their future state with the help of certain financial measures [10]. Using several machine
learning methods, the results showed an accuracy of more than 95%. However, that study also has a
limitation in the form of a dimensionality issue.

2.3. Under sampling technique based on genetic algorithms (GA)

A re-sampling approach was proposed in one study to handle unbalanced data sets [5]. In this
approach, oversampling and undersampling are combined with the help of a genetic algorithm (GA).
The application of the genetic algorithm is based on a set of predetermined criteria and
the unbalance rate. The approach has been tested on literature as well as industrial datasets, and
a desired improvement in classification performance was observed [11].
An undersampling approach combined with a GA-ANN model was proposed in a study to improve on
traditional classification approaches, which were usually costly and slow [5]. The undersampling
approach is based on K-means cluster distribution in order to solve the problem of imbalanced
data. The method is effective in enhancing the sampling rate, improves the final classification
and, at the same time, has a lower processing time. The GA-ANN method in that study uses the
genetic algorithm to optimize the biases and weights of the neural network, resulting in better
performance. To increase classification accuracy, a new genetic algorithm based on oversampling
was proposed to handle class-imbalanced data sets [12]. It creates optimized minority-class
instances to balance the training datasets. The experimental results on imbalanced datasets showed
better performance than previous sampling methods in terms of AUC and F-measure.
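The idea of letting a GA optimize the weights and bias of a neural network can be sketched on a deliberately tiny case. The example below is an illustration under stated assumptions, not the GA-ANN of [5]: the "network" is a single threshold neuron, the toy data are hypothetical, and the operators (elitism, tournament selection, uniform crossover, Gaussian mutation) and their parameters are arbitrary choices.

```python
import random

def predict(weights, x):
    """Single threshold neuron; weights = [w_1, ..., w_d, bias]."""
    z = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
    return 1 if z > 0 else 0

def accuracy(weights, data):
    return sum(predict(weights, x) == y for x, y in data) / len(data)

def evolve_weights(data, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    n_genes = len(data[0][0]) + 1  # one weight per input plus the bias
    population = [[rng.uniform(-1, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=lambda w: accuracy(w, data), reverse=True)
        history.append(accuracy(population[0], data))
        next_gen = [population[0][:]]  # elitism: best individual survives unchanged
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=lambda w: accuracy(w, data))
            p2 = max(rng.sample(population, 3), key=lambda w: accuracy(w, data))
            # Uniform crossover followed by Gaussian mutation of some genes.
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            child = [g + rng.gauss(0, 0.3) if rng.random() < 0.2 else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=lambda w: accuracy(w, data))
    return best, history

# Hypothetical toy data: two financial ratios per firm, label 1 = bankrupt.
data = [((0.2, 0.1), 0), ((0.4, 0.3), 0), ((0.1, 0.5), 0),
        ((0.9, 0.8), 1), ((0.7, 0.9), 1), ((0.8, 0.6), 1)]
best, history = evolve_weights(data)
print(accuracy(best, data))
```

Because the best individual is copied unchanged into the next generation, the best fitness recorded in `history` never decreases.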

3. PROPOSED MODEL
The proposed cluster-based method is based on a clustering algorithm; in this study, fuzzy c-means
clustering is adopted. The fuzzy c-means algorithm applies the concept of fuzzy logic, where an
object may belong to more than one cluster. This type of classification is easy to interpret,
since all the clusters are well separated. In this technique, values are assigned to all the
weights. The procedure is repeated until the centroid of each cluster has been computed with the
help of the fuzzy partition. The concept is related to the development of the k-means algorithm
for sensor networks. Using the fuzzy c-means algorithm, the nodes can partition the data into
different measure-dependent groups [13]. The role of the algorithm is to classify the data into
separate groups. Each of the separated groups is then used to find its centroid, and based on
these centroids, high-priority and low-priority values are determined for the bankruptcy and
non-bankruptcy data. The purpose of the proposed model is to determine the risk of bankruptcy
within the predicted range of the gathered data. Our implementation uses an enhanced GANN, i.e., a
multimodal GA-based neural network. The model considers the following set of 12 attributes:
- Constant capital or fixed assets
- Current assets, inventory and receivables or short-term liabilities
- (Receivables * 365) / total assets
- (Net profit + depreciation) / total assets
- Total sales / total assets
- Short-term liabilities / total assets
- Working capital / total assets
- Working capital / sales
- (Current liabilities * 365) / cost of products sold
- (Current assets - inventory - receivables) / long-term liabilities
- (Inventory * 365) / sales
- Net profit / inventory
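
Several of the attributes listed above are plain ratios of balance-sheet items and could be computed as in the following sketch. Only attributes whose formula is unambiguous in the list are included; the field names, the `Financials` container and the example figures are hypothetical, and a zero denominator is mapped to NaN as an assumption.

```python
from dataclasses import dataclass

@dataclass
class Financials:
    # Hypothetical field names for one firm-year of financial data.
    total_assets: float
    sales: float
    working_capital: float
    short_term_liabilities: float
    inventory: float
    net_profit: float
    depreciation: float

def safe_div(numerator, denominator):
    """Return NaN instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")

def ratios(f):
    return {
        "profit_plus_depreciation_to_assets":
            safe_div(f.net_profit + f.depreciation, f.total_assets),
        "sales_to_assets": safe_div(f.sales, f.total_assets),
        "short_term_liabilities_to_assets":
            safe_div(f.short_term_liabilities, f.total_assets),
        "working_capital_to_assets": safe_div(f.working_capital, f.total_assets),
        "working_capital_to_sales": safe_div(f.working_capital, f.sales),
        "inventory_days": safe_div(f.inventory * 365, f.sales),
        "profit_to_inventory": safe_div(f.net_profit, f.inventory),
    }

firm = Financials(total_assets=1000.0, sales=500.0, working_capital=200.0,
                  short_term_liabilities=300.0, inventory=100.0,
                  net_profit=50.0, depreciation=10.0)
firm_ratios = ratios(firm)
print(firm_ratios["inventory_days"])  # 73.0
```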

The step-by-step process of the proposed model is shown in the figure.
Figure: Proposed model of bankruptcy prediction
The model comprises the following steps:
Step 1: In the first step of the model design, we gathered the financial data of companies across
several industries in India, along with their financial ratios, for the period 1994 to 2014. Big
data related to bankruptcy is considered. The bankruptcy and non-bankruptcy data are stored in
merged data_10X.csv. The data is then preprocessed to clean noisy, null and missing data and
stored in transformed_new data.csv under a dedicated path for preprocessed data. Figure 1 shows
the clustered data along with their centroids, obtained using fuzzy c-means clustering.


Figure 1. Fuzzy c-means clustering
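
The fuzzy c-means procedure behind this clustering step can be sketched in plain Python. This is the textbook formulation with fuzzifier m = 2 and Euclidean distance, written under the assumption that the paper uses the standard algorithm; the function names, parameters and toy points are illustrative and not taken from the authors' implementation.

```python
import math
import random

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fuzzy_c_means(points, c=2, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Return (centroids, memberships) for a list of equal-length tuples."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalized so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centroids = []
    for _ in range(max_iter):
        # Centroid k is the mean of all points weighted by u[i][k] ** m.
        centroids = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centroids.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / w_sum
                for d in range(dim)))
        # Membership update: closer centroids receive a larger share.
        change = 0.0
        for i in range(n):
            dist = [max(euclid(points[i], ck), 1e-12) for ck in centroids]
            for k in range(c):
                new = 1.0 / sum((dist[k] / dj) ** (2.0 / (m - 1.0)) for dj in dist)
                change = max(change, abs(new - u[i][k]))
                u[i][k] = new
        if change < tol:
            break
    return centroids, u

# Hypothetical toy data: two well-separated groups in a 2-D ratio space.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, memberships = fuzzy_c_means(points, c=2)
hard = [row.index(max(row)) for row in memberships]  # most likely cluster per point
print(hard)
```

Unlike k-means, every point keeps a degree of membership in every cluster; the hard labels are derived only at the end by taking the largest membership.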

Step 2: The gathered data is preprocessed, passed through the fuzzy c-means algorithm and then
filtered. From this data, a 12*12 correlation matrix is formed over the attributes, and
the matrix is arranged as a correlation heatmap, shown in Figure 2. With the help of
the correlation matrix, the maximum priority of each attribute value can be determined.

Figure 2. Correlation matrices with heatmap
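
A correlation matrix over the attributes can be computed as in the sketch below. The paper does not state which correlation coefficient is used; Pearson correlation is assumed here, the three attribute columns are hypothetical stand-ins for the 12 attributes, and constant columns (zero variance) are assumed not to occur.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def correlation_matrix(columns):
    """Map every pair of attribute names to their correlation."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical attribute columns (one value per firm).
columns = {
    "sales_to_assets": [1.0, 2.0, 3.0, 4.0],
    "working_capital_to_assets": [2.0, 4.0, 6.0, 8.0],
    "short_term_liabilities_to_assets": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(columns)
print(round(matrix["sales_to_assets"]["working_capital_to_assets"], 6))  # 1.0
```

With 12 attribute columns the same function yields the 12*12 matrix that a heatmap would display.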

Step 3: The clustered attribute data is then analysed with the help of histograms in order to
predict bankruptcy and non-bankruptcy data, as shown in the figure. The Hadoop MapReduce algorithm
has been applied to the preprocessed data.
Step 4: The bankruptcy and non-bankruptcy status of the data is found with the first attribute,
i.e., constant capital or fixed assets, and we proceeded likewise with each attribute. The matrices
were determined along with a heat map that is classified colour-wise by attribute range, as shown
in Figure 3.


Figure 3. Histogram analysis of processed data

An agglomerative hierarchical clustering technique has been employed in this case to improve
the efficiency of the bankruptcy prediction. After clustering the extracted attributes, the
cluster feature vector is applied to modify the classifiers for predicting bankruptcy from the data.
Step 5: The preprocessed data and the clustered data are stored in transformed_new data.csv.
The file is created automatically and renamed data.csv, which is our main data. This main data is
then separated into training data and testing data for the prediction of bankruptcy, considering
the set of 12 attributes. Classification is done with a support vector machine, logistic
regression and GA-ANN for comparison.
Step 6: Before the data is classified, the classifier is trained in order to predict bankruptcy
accurately. The bankruptcy prediction results are enhanced by employing the multi-modal GA-based
neural network. The correlation matrix yields the maximum values of the attributes on the basis of
a mapping technique. This data is then given to the classifier, as shown in Figure 4.

Figure 4. Correlation matrices of bankruptcy data and non bankruptcy data with the status of ID (0 and 1)

Step 7: Finally, the confusion matrix is calculated from the TP, FP, TN and FN counts to analyze
the performance of the GA-ANN classifier. The matrix shows the capacity of the classifier to
predict bankruptcy and non-bankruptcy data, along with the misclassification rate.
TP means bankruptcy data was classified as bankruptcy. FP means non-bankruptcy data was classified
as bankruptcy. FN means bankruptcy data was classified as non-bankruptcy. TN means non-bankruptcy
data was classified as non-bankruptcy. Once the proposed scheme is designed, the performance of
the method is evaluated based on accuracy, precision, specificity and sensitivity, as shown in
Figure 5.
Step 8: The final comparison is made with the existing methods to assess the effectiveness
of the method.

Figure 5. Performance evaluation matrix
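
The evaluation measures named in Step 7 follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions and assumes that bankruptcy (status 1) is the positive class; the label vectors are hypothetical toy data, not results from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN with `positive` as the bankruptcy class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def evaluation_metrics(tp, fp, tn, fn):
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "misclassification_rate": (fp + fn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

# Hypothetical toy labels: 1 = bankrupt, 0 = non-bankrupt.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
scores = evaluation_metrics(*counts)
print(counts)              # (2, 1, 4, 1)
print(scores["accuracy"])  # 0.75
```

Accuracy and the misclassification rate always sum to 1, which is why the two figures are reported together.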

4. EXPERIMENTS AND RESULTS

4.1. Research data and experiments
The dataset comprises financial ratios of several small and medium-scale companies from
1994 to 2014. The bankruptcy and non-bankruptcy status of the data is shown in Figure 6. The
number of non-audited companies is found to be comparatively high among the total firms. The
dataset was split into two subsets: 80% of the data forms the training dataset, which is used to
develop the undersampling method for class balancing, and 20% forms the validation dataset, which
is arranged with respect to the training data distribution. A two-stage selection process for the
input variables was applied based on the previous methods [1,3]. The final variables were chosen
based on the variant test, and these variables were used for the credit evaluation of the selected
companies. The model is implemented using Python 3.6 and Anaconda Navigator.
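
An 80/20 split in which the validation set follows the class distribution of the training data can be sketched as a stratified split. Whether the authors stratified in exactly this way is an assumption; the function name and the toy label vector are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split indices so both subsets keep roughly the same class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    train, valid = [], []
    for members in by_class.values():
        rng.shuffle(members)
        cut = int(round(train_fraction * len(members)))
        train.extend(members[:cut])
        valid.extend(members[cut:])
    return sorted(train), sorted(valid)

# Hypothetical toy labels: 40 non-bankrupt firms (0) and 10 bankrupt firms (1).
labels = [0] * 40 + [1] * 10
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))  # 40 10
```

Undersampling would then be applied to the training indices only, so that the validation set keeps the original imbalance.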

Figure 6. Status of bankruptcy and nonbankruptcy

4.2. Result and analysis

The effectiveness of the cluster-based GA-ANN undersampling method using the fuzzy c-means
algorithm, applied to the classifier, was investigated for the bankruptcy prediction application.
Here, we set the GA to search for the cut-off of each cluster, which represents the minimum
distance of the clusters from the centroid. The optimization techniques applied using GA-ANN led
to accurate prediction on this feature matrix. The classification algorithms used in the
classification model to predict bankruptcy were genetic algorithm based artificial neural
networks, logistic regression, support vector machines and decision trees. The tested genetic
algorithm based artificial neural network achieved an accuracy rate of 78.21%, compared against
the accuracy rate of the existing method, and a misclassification rate of 0.2178. The
effectiveness of the method was demonstrated by comparing its accuracy rate with the results of
the existing method, as shown in Figure 7. Thus, the method has proved effective in handling such
imbalanced datasets prior to model development.

Figure 7. Comparison of model for accuracy rate
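
The role of the per-cluster cut-off searched by the GA can be illustrated with a small sketch. It assumes one possible reading of the description: a chromosome holds one distance cut-off per cluster, and a majority-class instance is kept when its distance to its cluster centroid does not exceed that cut-off. The function name, the distances and the cut-off values are hypothetical.

```python
def select_by_cutoff(distances, clusters, cutoffs):
    """Return indices of instances lying within their cluster's cut-off distance."""
    return [i for i, (dist, c) in enumerate(zip(distances, clusters))
            if dist <= cutoffs[c]]

# Hypothetical majority-class instances: distance to centroid and cluster id.
distances = [0.1, 0.5, 0.9, 0.2, 0.7]
clusters = [0, 0, 0, 1, 1]

# One candidate chromosome: cut-off 0.6 for cluster 0 and 0.3 for cluster 1.
chromosome = [0.6, 0.3]
kept = select_by_cutoff(distances, clusters, chromosome)
print(kept)  # [0, 1, 3]
```

A GA would evaluate many such chromosomes, scoring each by the performance of the classifier trained on the instances it keeps.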

5. CONCLUSION
This study verified the effectiveness of the proposed approach of cluster-based undersampling
using the fuzzy c-means algorithm to optimize GA-ANN for effective prediction of bankruptcy. In
this approach, the data is structured by classifying it with a clustering technique while
simultaneously optimizing the ANN model. The method improved the effectiveness of the classifier
and decreased the data imbalance rate at the same time. The experimental results showed an
accuracy of 78.2%, which was compared with the existing methods.

REFERENCES
[1] Tambe, P. (2014). Big data investment, skills, and firm value. Management Science, 60 (6), 1452-1469.
[2] Kim, K. J., & Ahn, H. (2012). A corporate credit rating model using multi-class support vector machines with an
ordinal pairwise partitioning approach. Computers & Operations Research, 39 (8), 1800-1811
[3] Le, T., Le Son, H., Vo, M., Lee, M., & Baik, S. (2018). A cluster-based boosting algorithm for bankruptcy
prediction in a highly imbalanced dataset. Symmetry, 10(7), 250.
[4] Kim, H. J., Jo, N. O., & Shin, K. S. (2016). Optimization of cluster-based evolutionary undersampling for the
artificial neural networks in corporate bankruptcy prediction. Expert Systems with Applications, 59, 226-234.
[5] Song, A., & Xu, Q. (2018). Imbalanced Data Classification Based on MBCDK-means Undersampling and GA-
ANN. In International Conference on Artificial Neural Networks (pp. 349-358). Springer, Cham.
[6] Yen, S. J., & Lee, Y. S. (2009). Cluster-based under-sampling approaches for imbalanced data distributions. Expert
Systems with Applications, 36 (3), 5718-5727
[7] Kang, P., Cho, S., & MacLachlan, D. L. (2012). Improved response modeling based on clustering, under-sampling,
and ensemble. Expert System with Applications, 39 (8), 6738-6753.
[8] Khoshgoftaar, T. M., Seliya, N., & Drown, D. J. (2010). Evolutionary data analysis for the class imbalance
problem. Intelligent Data Analysis, 14 (1), 69-88
[9] García, S., & Herrera, F. (2009). Evolutionary undersampling for classification with imbalanced datasets: Proposals
and taxonomy. Evolutionary Computation, 17 (3), 275-306.
[10] Chow, J. C. (2018). Analysis of Financial Credit Risk Using Machine Learning. arXiv preprint arXiv:1802.05326.
[11] Vannucci, M., & Colla, V. (2017). Genetic Algorithms Based Resampling for the Classification of Unbalanced
Datasets. In International Conference on Intelligent Decision Technologies (pp. 23-32). Springer, Cham.
[12] Dong, S., & Wu, Y. (2018, July). A genetic algorithm-based approach for class-imbalanced learning. In Third
International Workshop on Pattern Recognition (Vol. 10828, p. 108281D). International Society for Optics and
Photonics.
[13] Qin, J., Fu, W., Gao, H., & Zheng, W. X. (2017). Distributed k-means algorithm and fuzzy c-means
algorithm for sensor networks based on multi agent consensus theory. IEEE transactions on cybernetics, 47(3),
772-783.

No ratings yet
Developing a website for English-speaking practice to English as a foreign language learners at the university level
12 pages
U-Net for wheel rim contour detection in robotic deburring
No ratings yet
U-Net for wheel rim contour detection in robotic deburring
14 pages
Adaptive kernel integration in visual geometry group 16 for enhanced classification of diabetic retinopathy stages in retinal images
No ratings yet
Adaptive kernel integration in visual geometry group 16 for enhanced classification of diabetic retinopathy stages in retinal images
12 pages
Optimizing deep learning models from multi-objective perspective via Bayesian optimization
No ratings yet
Optimizing deep learning models from multi-objective perspective via Bayesian optimization
10 pages
Errata Leon Ed
No ratings yet
Errata Leon Ed
7 pages
Tutorial 3 Pre Processing of ABAQUS
No ratings yet
Tutorial 3 Pre Processing of ABAQUS
8 pages
Deep learning-based techniques for video enhancement, compression and restoration
No ratings yet
Deep learning-based techniques for video enhancement, compression and restoration
13 pages
Squeeze-excitation half U-Net and synthetic minority oversampling technique oversampling for papilledema image classification
No ratings yet
Squeeze-excitation half U-Net and synthetic minority oversampling technique oversampling for papilledema image classification
10 pages
Detecting road damage utilizing retinanet and mobilenet models on edge devices
No ratings yet
Detecting road damage utilizing retinanet and mobilenet models on edge devices
11 pages
Event detection in soccer matches through audio classification using transfer learning
No ratings yet
Event detection in soccer matches through audio classification using transfer learning
9 pages
PTC LMS Download
No ratings yet
PTC LMS Download
5 pages
Multi-task deep learning for Vietnamese capitalization and punctuation recognition
No ratings yet
Multi-task deep learning for Vietnamese capitalization and punctuation recognition
11 pages
Graph-based methods for transaction databases: a comparative study
No ratings yet
Graph-based methods for transaction databases: a comparative study
10 pages
Hybrid horned lizard optimization algorithm-aquila optimizer for DC motor
No ratings yet
Hybrid horned lizard optimization algorithm-aquila optimizer for DC motor
10 pages
Hybrid object detection and distance measurement for precision agriculture: integrating YOLOv8 with rice field sidewalk detection algorithm
No ratings yet
Hybrid object detection and distance measurement for precision agriculture: integrating YOLOv8 with rice field sidewalk detection algorithm
11 pages
Automatic detection of dress-code surveillance in a university using YOLO algorithm
No ratings yet
Automatic detection of dress-code surveillance in a university using YOLO algorithm
8 pages
A novel scalable deep ensemble learning framework for big data classification via MapReduce integration
No ratings yet
A novel scalable deep ensemble learning framework for big data classification via MapReduce integration
15 pages
Enhancing emotion recognition model for a student engagement use case through transfer learning
No ratings yet
Enhancing emotion recognition model for a student engagement use case through transfer learning
11 pages
Artificial intelligence algorithms to predict customer satisfaction: a comparative study
No ratings yet
Artificial intelligence algorithms to predict customer satisfaction: a comparative study
9 pages
Ramesh QA 8409
No ratings yet
Ramesh QA 8409
4 pages
Hybrid model detection and classification of lung cancer
No ratings yet
Hybrid model detection and classification of lung cancer
11 pages
A proposed approach for plagiarism detection in Myanmar Unicode text
No ratings yet
A proposed approach for plagiarism detection in Myanmar Unicode text
9 pages
Abstractive summarization using multilingual text-to-text transfer transformer for the Turkish text
No ratings yet
Abstractive summarization using multilingual text-to-text transfer transformer for the Turkish text
10 pages
Evaluating ChatGPT’s Mandarin “yue” pronunciation system in language learning
No ratings yet
Evaluating ChatGPT’s Mandarin “yue” pronunciation system in language learning
8 pages
A comparative study of natural language inference in Swahili using monolingual and multilingual models
No ratings yet
A comparative study of natural language inference in Swahili using monolingual and multilingual models
8 pages
A contest of sentiment analysis: k-nearest neighbor versus neural network
No ratings yet
A contest of sentiment analysis: k-nearest neighbor versus neural network
9 pages
Primary phase Alzheimer's disease detection using ensemble learning model
No ratings yet
Primary phase Alzheimer's disease detection using ensemble learning model
9 pages
Improved convolutional neural networks for aircraft type classification in remote sensing images
No ratings yet
Improved convolutional neural networks for aircraft type classification in remote sensing images
8 pages
Enhancing fall detection and classification using Jarratt‐butterfly optimization algorithm with deep learning
No ratings yet
Enhancing fall detection and classification using Jarratt‐butterfly optimization algorithm with deep learning
10 pages
Exploring DenseNet architectures with particle swarm optimization: efficient tomato leaf disease detection
No ratings yet
Exploring DenseNet architectures with particle swarm optimization: efficient tomato leaf disease detection
9 pages
Dynamic Item Processor Notes
0% (1)
Dynamic Item Processor Notes
2 pages
3 5-5
No ratings yet
3 5-5
2 pages
Hindi spoken digit analysis for native and non-native speakers
No ratings yet
Hindi spoken digit analysis for native and non-native speakers
7 pages
Arabic Pad User Guide
No ratings yet
Arabic Pad User Guide
3 pages
Monitoring File System For Windows: Information Security
No ratings yet
Monitoring File System For Windows: Information Security
3 pages
PV Elite 2016
No ratings yet
PV Elite 2016
6 pages
Auto Print Course 10.0.700
No ratings yet
Auto Print Course 10.0.700
19 pages
MACHINE LEARNING FOR BEGINNERS: A Practical Guide to Understanding and Applying Machine Learning Concepts (2023 Beginner Crash Course)
From Everand
MACHINE LEARNING FOR BEGINNERS: A Practical Guide to Understanding and Applying Machine Learning Concepts (2023 Beginner Crash Course)
Elaine Tate
No ratings yet
DATA MINING and MACHINE LEARNING. PREDICTIVE TECHNIQUES: ENSEMBLE METHODS, BOOSTING, BAGGING, RANDOM FOREST, DECISION TREES and REGRESSION TREES.: Examples with MATLAB
From Everand
DATA MINING and MACHINE LEARNING. PREDICTIVE TECHNIQUES: ENSEMBLE METHODS, BOOSTING, BAGGING, RANDOM FOREST, DECISION TREES and REGRESSION TREES.: Examples with MATLAB
César Pérez López
No ratings yet
Data Mining: Fundamentals and Applications
From Everand
Data Mining: Fundamentals and Applications
Fouad Sabry
No ratings yet

Journal homepage: http://iaescore.com/online/index.php/IJAI

2.1. Undersampling technique

2.2. Clustering of non-bankruptcy firm data based on majority class

2.3. Undersampling technique based on genetic algorithms (GA)

IJ-AI Vol. 8, No. 2, June 2019: 168 – 174

Figure 1. Fuzzy c-means clustering

Figure 2. Correlation matrices with heatmap

Figure 3. Histogram analysis of processed data

Figure 5. Performance evaluation matrix

4. EXPERIMENTS AND RESULTS

Figure 6. Status of bankruptcy and non-bankruptcy

4.2. Result and analysis

Figure 7. Comparison of model for accuracy rate
