Comparative Analysis of Multi-Label Classification Algorithms
Abstract—Multi-label classification (MLC) has attracted interest in many fields over the last few years. It allows the classification of datasets in which each instance can be associated with one or more labels, and it has proved to be a superior strategy to single-label classification for such data. In this paper, we provide an overview of multi-label classification approaches. We also discuss the various tools that implement MLC approaches. Lastly, we present an experimental study that compares different multi-label classification algorithms. After applying these techniques and comparing their accuracies, we found that Random Forest performs better than the other multi-label classification algorithms compared, with 96% accuracy.

Keywords— Multi-label classification (MLC), Multi-label dataset (MLD), Single-label, Problem transformation, Algorithm adaptation, Ensemble.

I. INTRODUCTION

A colossal amount of information is currently being gathered and stored in databases around the globe. What do we do with this information? The answer lies in data mining. Data mining is the computational process of extracting information from large datasets in order to recognize patterns in the data. Over the years, numerous techniques have been developed to process vast datasets and draw conclusions from them, including classification, clustering, regression and association. In this paper, we concentrate on classification techniques, particularly multi-label classification.

Classification is a supervised learning technique that assigns labels to the instances of a given dataset based on training data. Classification can be categorized into single-label and multi-label.

Single-label classification associates one label with each instance, whereas multi-label classification associates a subset of labels with each instance. Multi-label classification should not be mistaken for multiclass classification. In multiclass classification, an object cannot belong to more than one class simultaneously. In multi-label classification it can: for example, an object in a scenery dataset can belong to the classes ‘sky’, ‘water’ and ‘bird’ all at the same time. Multi-label classification has an edge over commonly used classification techniques because it handles the association of each instance with one or more classes or labels. This paper aims to give a brief overview of multi-label classification approaches and their challenges. Further, we present an experimental study in which we compare different MLC approaches. A recent survey on the advancement of multi-label classification and its usage in different applications can be found in [17].

The objective of this research work is the analysis and application of multi-label classification algorithms on an undergraduate student dataset.

This research was undertaken to understand the effect of graduate students' personality traits, such as proficiency in English and expressiveness, and participation in extra and co-curricular activities, along with their academic performance, on securing a successful campus placement and on their chances of pursuing higher studies. College management could use the results to plan ahead and provide appropriate career guidance to students. This study was initiated to find out what fits best for a student's career path.

The rest of the paper is organized as follows: after the introduction in Section I, Section II details multi-label classification, Section III highlights the challenges faced by MLC, Section IV discusses the tools available for MLC, Section V discusses the experimental study and the results achieved, Section VI provides the conclusion, and the last section lists the references.

II. MULTILABEL CLASSIFICATION

Multi-label classification can be defined as the problem of finding a model that maps an input A to a binary vector B, rather than to a scalar output as in the single-label classification problem.

Initially inspired by text categorization and medical diagnosis, multi-label classification is now expanding into other fields such as audio categorization, bioinformatics and many more. MLC has gained popularity in recent times because of its ability to solve real-world problems. For instance, single-label classification can label an email message as work or research, but not both, whereas in reality the message could belong to both. MLC labels it as both, because it allows an object to be associated with one or more classes. This property makes multi-label classification more competitive than the previously used classification approach. MLC methods can be divided into three types:

A. Problem Transformation Method
B. Algorithm Adaptation Method
C. Ensemble Methods

The Problem Transformation method transforms the multi-label classification problem into a set of binary classification problems, each of which is then handled by a single-label classifier.
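As a rough illustration of the binary-vector view of labels and of the problem transformation idea, the sketch below encodes label sets as 0/1 vectors and then trains one binary classifier per label, a scheme commonly called binary relevance. The label names come from the email example; the feature values, the toy `NearestNeighbour` base learner and all function names are assumptions made for illustration only, and are not the setup used in the experimental study.

```python
from typing import Dict, List, Sequence, Set

# Hypothetical label space, borrowed from the email example in the text.
LABELS: List[str] = ["work", "research"]


def to_binary_vector(label_set: Set[str]) -> List[int]:
    """Encode a set of labels as a 0/1 vector with one position per label."""
    return [1 if name in label_set else 0 for name in LABELS]


class NearestNeighbour:
    """Toy single-label base classifier: 1-nearest neighbour."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "NearestNeighbour":
        self.X = [list(row) for row in X]
        self.y = list(y)
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        # Squared Euclidean distance to every training instance.
        dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in self.X]
        return self.y[dists.index(min(dists))]


def fit_binary_relevance(X, Y) -> Dict[str, NearestNeighbour]:
    """Split the multi-label problem into one binary problem per label."""
    models = {}
    for j, label in enumerate(LABELS):
        y_binary = [row[j] for row in Y]  # column j: has this label or not
        models[label] = NearestNeighbour().fit(X, y_binary)
    return models


def predict_labels(models, x) -> Set[str]:
    """Merge the per-label binary decisions back into a label set."""
    return {label for label, model in models.items() if model.predict_one(x) == 1}


# Made-up features: [meeting-related word count, paper-related word count].
X_train = [[3, 0], [0, 4], [2, 3], [0, 0]]
Y_train = [to_binary_vector(s)
           for s in ({"work"}, {"research"}, {"work", "research"}, set())]

models = fit_binary_relevance(X_train, Y_train)
print(to_binary_vector({"work", "research"}))   # [1, 1]
print(sorted(predict_labels(models, [2, 2])))    # ['research', 'work']
```

Each per-label model sees the same features but only one column of the label matrix, so any single-label learner could be substituted for the toy nearest-neighbour classifier. Note that this scheme treats labels independently and therefore ignores correlations between them.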
The Algorithm Adaptation method modifies existing single-label classifiers so that they can handle multi-label data directly. An ensemble
2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC)
Fig. 5.1. Comparison of MLC algorithms

VI. CONCLUSION

This paper reviewed multi-label classification and showed, through an experimental study, how different algorithms can be applied to a practical problem. Using the MEKA tool, the BR, CC, PS, LS and Random Forest algorithms were applied. The results indicate that Random Forest performs best, followed by BR, CC, PS and LS. We also observed that, although multi-label classification offers clear advantages, it still faces many difficulties.

REFERENCES

[1] Read, J., Reutemann, P., Pfahringer, B., & Holmes, G., “MEKA: A multi-label/multi-target extension to WEKA,” Journal of Machine Learning Research, 17(21), 1–5, 2016.
[2] Asma Aldrees, Azeddine Chikh, and Jawad Berri, “Comparative Evaluation of Four Multi-labels Classification Algorithms in Classifying Learning Objects,” in David C. Wyld et al. (Eds.): CCSEA, CLOUD, DKMP, SEA, SIPRO, 2016.
[3] Herrera, F., Rivera, A.J., del Jesus, M.J., “Multilabel Classification: Problem Analysis, Metrics and Techniques,” 2016, XVI, 194 p., 72 illus., Hardcover, ISBN: 978-3-319-4110-1.
[4] Jadon Mayurisingh Nareshpalsingh, Prof. Hiteshri N. Modi, “Multi-label Classification Methods: A Comparative Study,” International Research Journal of Engineering and Technology (IRJET), Volume 04, Issue 12, Dec. 2017.
[5] Zeynep Ceylan, Ebru Pekel, “Comparison of Multi-Label Classification Methods for Prediagnosis of Cervical Cancer,” IJISAE, 2017, 5(4), 232–236.
[6] M. Zhang, Z. Zhou, “A review on multi-label learning algorithms,” IEEE Trans. Knowl. Data Eng. 26 (8) (2014) 1819–1837, doi:10.1109/TKDE.2013.39.
[7] Eva Gibaja, Sebastián Ventura, “A Tutorial on Multi-Label Learning,” ACM Computing Surveys, Vol. 9, No. 4, Article 39, Publication date: March 2010.
[8] Hyunki Lim, Jaesung Lee, Dae-Won Kim, “Optimization approach for feature selection in multi-label classification,” Elsevier, 2017.
[9] G. Tsoumakas, M.-L. Zhang, and Z.-H. Zhou, “Tutorial on learning from multi-label data,” in ECML PKDD, Bled, Slovenia, 2009.
[10] M.-L. Zhang and K. Zhang, “Multi-label learning by exploiting label dependency,” in Proc. 16th ACM SIGKDD Int. Conf. KDD, Washington, DC, USA, 2010, pp. 999–1007.
[11] Amirhossein Akbarnejad, Mahdieh Soleymani Baghshah, “A probabilistic multi-label classifier with missing and noisy labels handling capability,” 0167-8655/© 2017.
[12] Wei Weng, Yaojin Lin, Shunxiang Wu, Yuwen Li, Yun Kang, “Multi-label learning based on label-specific features and local pairwise label correlation,” 0925-2312/© 2017 Elsevier.
[13] Wei Bi, James T. Kwok, “Multi-Label Classification on Tree- and DAG-Structured Hierarchies,” International Conference on Machine Learning, Bellevue, WA, USA, 2011.
[14] Tsoumakas, G., Vlahavas, I., Sechidis, K., “On the stratification of multi-label data,” Machine Learning and Knowledge Discovery in Databases, 145–158, 2011.
[15] Grigorios Tsoumakas, Ioannis Katakis, and Ioannis Vlahavas, “Effective and efficient multilabel classification in domains with large number of labels,” in Proceedings of the ECML/PKDD 2008 Workshop on Mining Multidimensional Data (MMD’08), 2008.
[16] Raed Alazaidah, Farzana Kabir Ahmad, “Trending Challenges in Multi Label Classification,” (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 7, No. 10, 2016.
[17] Charte, F., Rivera, A.J., del Jesus, M.J., Herrera, F., “MLSMOTE: approaching imbalanced multilabel learning through synthetic instance generation,” Knowl.-Based Syst., Elsevier, 89, 385–397 (2015).
[18] Grigorios Tsoumakas, Ioannis Katakis, and Ioannis Vlahavas, “Mining Multi-label Data,” in O. Maimon, L. Rokach (Ed.), Springer, 2nd edition, 2010, pp. 1–20.
[19] Jesse Read, Bernhard Pfahringer, Geoffrey Holmes, Eibe Frank, “Classifier chains for multi-label classification,” in Proceedings of the ECML/PKDD, volume 5782, pages 254–269, Springer, 2009.
[20] Yuhong Guo, Suicheng Gu, in Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence.