Machine Learning
Article
An Improved Multilabel k‑Nearest Neighbor Algorithm Based
on Value and Weight
Zhe Wang 1,2, Hao Xu 2, Pan Zhou 2,* and Gang Xiao 1
Abstract: Multilabel data share important features, including label imbalance, which has a significant
influence on the performance of classifiers. Because of this problem, a widely used multilabel clas‑
sification algorithm, the multilabel k‑nearest neighbor (ML‑kNN) algorithm, has poor performance
on imbalanced multilabel data. To address this problem, this study proposes an improved ML‑kNN
algorithm based on value and weight. In this improved algorithm, labels are divided into minority
and majority, and different strategies are adopted for different labels. By considering the latent
label information carried by the nearest neighbors, a value calculation method is proposed and used
to directly classify majority labels. Additionally, to address the misclassification problem caused by
a lack of nearest neighbor information for minority labels, weight calculation is proposed. The pro‑
posed weight calculation converts distance information with and without label sets in the nearest
neighbors into weights. The experimental results on multilabel datasets from different benchmarks
demonstrate the performance of the algorithm, especially for datasets with high imbalance. Differ‑
ent evaluation metrics show that the results are improved by approximately 2–10%. The verified
algorithm could be applied to multilabel classification in various fields involving label imbalance,
such as drug molecule identification, building identification, and text categorization.
the classifier performance decreases as the amount and complexity of the data increase. The algorithm
adaptation method adjusts existing multiclass classification algorithms to address the multilabel
classification problem and can flexibly perform multilabel classification [21]. Thus, the algorithm
adaptation method has recently received considerable attention.
In multilabel problems, a class with a larger number of instances could be defined as
a majority class, corresponding to the majority label. In contrast, a class with a smaller
number of instances could be defined as a minority class, corresponding to the minority
label [22]. In multilabel data, an imbalance often occurs between the minority and majority
labels. Therefore, multilabel classification algorithms face a challenge in that existing meth‑
ods cannot be directly used as a solution to address an imbalanced problem in multilabel
classification. When classifying a test instance with a minority label, most of its nearest
neighbors may be unlabeled, and the classification will give the test instance a negative
bias. Hence, the overall performance of the classification is affected. As a widely used
algorithm, ML‑kNN has inspired many improved variants. However, existing ML‑kNN‑based
algorithms perform poorly when classifying imbalanced multilabel datasets, and their
predictions tend to be biased toward the majority labels. Therefore, the classifier should
be redesigned so that it can classify imbalanced data.
In this paper, an improved ML‑kNN algorithm is proposed on the basis of value and
weight (hereafter called VWML‑kNN) to address the imbalanced multilabel problem. The
proposed algorithm divides labels into majority and minority labels and uses different
classification strategies for different labels. Unlike conventional ML‑kNN‑based methods,
the value of an instance in VWML‑kNN is obtained by comprehensively considering the
label distribution of its nearest neighbors, and majority labels are classified by computing
a new maximum a posteriori (MAP) estimate from the obtained values. Then, VWML‑kNN
calculates the distances between labeled and unlabeled nearest neighbors and converts
these distances into different weights. Finally, the weight and new MAP are combined
to classify minority labels. The experimental results on multilabel datasets from different
benchmarks show that performance is improved by VWML‑kNN, especially
for datasets with high imbalance.
2. Related Work
This section outlines the development of multilabel classification methods, especially
ML‑kNN‑based methods.
Godbole et al. [17] proposed a problem transformation algorithm called LP. Specifi‑
cally, LP converts a multilabel dataset into a new multiclass dataset, regarding each distinct
label combination (or label set) as a class. It can improve classification accuracy but may ex‑
acerbate the label imbalance problem, resulting in overfitting. An effective multilabel clas‑
sification method, called ML‑kNN, was proposed by Zhang et al. [20]. ML‑kNN assumes
that the final classification results of the data with similar characteristics are also related
to the label of instances with similar characteristics. It is the first lazy learning method
based on a conventional kNN method that considers the label selection information of the
k‑nearest neighbors (kNN) of one instance. It also uses the highest MAP to adaptively
adjust the decision boundary for each new instance. However, most multilabel classifiers
perform poorly in minority‑class classification problems in imbalanced datasets. Younes
et al. [23] proposed a generalization of an ML‑kNN algorithm called DML‑kNN. Unlike
ML‑kNN, DML‑kNN considers the dependencies between labels and accounts for all labels
in the neighborhood rather than the assigned nearest neighbor label to calculate the MAP.
Cheng et al. [24] proposed a multilabel classification method called instance‑based learn‑
ing and logistic regression (IBLR) based on label correlation and dependency. Moreover,
in IBLR, interdependencies between labels can be captured, and model‑ and similarity‑
based inferences for multilabel classification can be combined. An MLCWkNN algorithm
is proposed in [25] based on the Bayesian theorem. The linear weighted sum of the kNN is
calculated using the least squares error to approximate the query instance.
3. Methods
This section introduces the measurement indicators for evaluating the degree of mul‑
tilabel imbalanced data and explains the proposed VWML‑kNN algorithm. Then, the eval‑
uation metrics and datasets of the experiments are discussed.
The average level of the imbalance in the dataset is defined as MeanIR, which also
represents the mean of all labels’ IR and can be calculated as
\mathrm{MeanIR} = \frac{1}{|Y|} \sum_{y = Y_1}^{Y_{|Y|}} IR(y) \qquad (3)
According to IR and MeanIR, majority and minority labels can be defined as follows:
when the IR of y is lower than the MeanIR, y is defined as a majority label; otherwise, it is
defined as a minority label.
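As an illustration, the following minimal Python sketch (not the authors' implementation) computes the per‑label IR and MeanIR for a binary label matrix and splits the labels accordingly. Since Equation (1) is referenced but not reproduced in this section, the sketch assumes the usual definition of IR(y) as the count of the most frequent label divided by the count of label y.

```python
# Minimal sketch (not the authors' code): per-label imbalance ratio IR and
# MeanIR for a 0/1 label matrix Y of shape (m, q). The IR definition assumed
# here is count(most frequent label) / count(label y).
import numpy as np

def imbalance_ratios(Y):
    """Return the per-label IR vector and MeanIR (Equation (3))."""
    counts = Y.sum(axis=0).astype(float)             # instances carrying each label
    counts = np.where(counts == 0, 1e-12, counts)    # guard against labels never assigned
    ir = counts.max() / counts                       # assumed form of Equation (1)
    mean_ir = ir.mean()                              # Equation (3)
    return ir, mean_ir

def split_labels(Y):
    """Majority labels: IR below MeanIR; minority labels: the rest."""
    ir, mean_ir = imbalance_ratios(Y)
    majority = np.where(ir < mean_ir)[0]
    minority = np.where(ir >= mean_ir)[0]
    return majority, minority
```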
Therefore, the number of the nearest neighbors of xi with or without label j can be counted
in advance, as expressed in Equation (4).
P(y_j = 1) = \frac{s + \sum_{i=1}^{m} y_j(x_i)}{s \times 2 + m} \qquad (5)
P\!\left(C_j(x_i) \mid y_j = 0\right) = \frac{s + \kappa'_j[r]}{s \times (K + 1) + \sum_{r=0}^{K} \kappa'_j[r]} \qquad (6)

P\!\left(C_j(x_i) \mid y_j = 0\right) = \frac{s + \kappa'_j[r]}{s \times (K + 1) + \sum_{r=0}^{K} \kappa'_j[r]} \qquad (8)
In Equation (9), \kappa_j[r] counts the number of training instances that contain label j and have r nearest neighbors containing label j. In Equation (10), \kappa'_j[r] counts the number of training instances that do not contain label j and have r nearest neighbors containing label j. The initial value of \kappa_j[r] and \kappa'_j[r] is 0, and r ranges from 0 to K:

\kappa_j[r] = \sum_{i=1}^{m} [y_j \in Y_i] \cdot [C_j(x_i) = r], \quad 0 \le r \le K \qquad (9)

\kappa'_j[r] = \sum_{i=1}^{m} [y_j \notin Y_i] \cdot [C_j(x_i) = r], \quad 0 \le r \le K \qquad (10)
P\!\left(C_j(x_i) \mid y_j = 0\right) = \frac{s + h'_j[z]}{s \times (K + 1) + \sum_{z=0}^{K} h'_j[z]} \qquad (12)
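The counting behind Equations (5), (9), and (10) and the smoothed likelihoods of the form used in Equations (6), (8), and (12) can be sketched as follows. This is an illustrative implementation in the spirit of ML‑kNN; the variable names (Y, neighbors, s) are chosen here and are not taken from the paper.

```python
# Illustrative sketch of the smoothed prior (Equation (5)), the counts of
# Equations (9)/(10), and likelihoods of the form in Equations (6)/(8)/(12).
# Y is a 0/1 label matrix (m x q); neighbors[i] holds the indices of the K
# nearest neighbors of training instance i; s is the smoothing constant.
import numpy as np

def fit_counts(Y, neighbors, K, s=1.0):
    m, q = Y.shape
    prior1 = (s + Y.sum(axis=0)) / (s * 2 + m)         # Equation (5)
    prior0 = 1.0 - prior1
    kappa = np.zeros((q, K + 1))                        # kappa_j[r],  Equation (9)
    kappa_p = np.zeros((q, K + 1))                      # kappa'_j[r], Equation (10)
    for i in range(m):
        r = Y[neighbors[i]].sum(axis=0)                 # C_j(x_i) for every label j
        for j in range(q):
            if Y[i, j] == 1:
                kappa[j, r[j]] += 1
            else:
                kappa_p[j, r[j]] += 1
    # Smoothed likelihoods P(C_j(x) = r | y_j = 1) and P(C_j(x) = r | y_j = 0)
    like1 = (s + kappa) / (s * (K + 1) + kappa.sum(axis=1, keepdims=True))
    like0 = (s + kappa_p) / (s * (K + 1) + kappa_p.sum(axis=1, keepdims=True))
    return prior1, prior0, like1, like0
```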
Majority labels have enough prior information, so whether the test instance contains label y
can be directly determined by the new MAP. Minority labels require additional information
to classify the test instance because of insufficient prior information.
During the classification, we adopt different strategies for majority and minority labels. For
each nearest neighbor of a test instance, the closer the neighbor, the greater its similarity
to the test instance. We propose a weight conversion strategy based on this idea.
Weight transform: first, the nearest neighbors of the test instance are divided into ConSet
and NconSet depending on whether they have label y or not. Specifically, ConSet contains the nearest
neighbors that have the label y, whereas NconSet contains the nearest neighbors without
the label y. Furthermore, the distance between the set and test instance is calculated. To
convert the distance into a weight, an appropriate function should be selected. Through
experiments, the Gaussian function was found to be a suitable weight transform function.
With the Gaussian function, the weight changes gradually: during the conversion, the weight
does not become excessively large when the distance is very small, and it does not drop to 0
when the distance is large. The Gaussian function is defined as follows:
w = a \times e^{-\frac{(s, d')^2}{2 b^2}} \qquad (13)
where (s, d′ ) defines the distance between the set and the test instance, b represents the
standard deviation, and a is generally regarded as 1.
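A minimal sketch of the weight transform of Equation (13) is given below. The set‑to‑instance distance is an assumption here (mean Euclidean distance from the test instance to the members of the neighbor set), since the paper does not spell out how (s, d') is computed; a and b follow the text (a = 1, b the standard deviation).

```python
# Minimal sketch of the weight transform in Equation (13); the distance
# function is an assumption, not the authors' definition.
import numpy as np

def set_distance(x, subset):
    """Assumed distance between test instance x and a neighbor set (2D array)."""
    if len(subset) == 0:
        return np.inf                       # empty set -> weight decays to 0
    return float(np.mean(np.linalg.norm(subset - x, axis=1)))

def gaussian_weight(dist, a=1.0, b=1.0):
    """Convert a distance into a weight, Equation (13)."""
    return a * np.exp(-(dist ** 2) / (2.0 * b ** 2))
```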
Therefore, the decision function of the minority label can be obtained, as expressed in
Equation (14):
y_j(x) = \arg\max_{j \in \{0,1\}} \left( \frac{t}{K} \times w + \left(1 - \frac{t}{K}\right) P(y_j)\, P(C_j(x_i) \mid y_j) \right) \qquad (14)
If j = 1, this implies that the test instance contains minority label j. Otherwise, the test
instance does not contain minority label j. Here, w represents the weight after distance
conversion and t represents the proportion of weight in the decision function.
Substituting the weights into Equation (14) yields Equation (15):
y_j(x) = \arg\max_{j \in \{0,1\}} \left( \frac{t}{K} \times a \times e^{-\frac{(s, d')^2}{2 b^2}} + \left(1 - \frac{t}{K}\right) P(y_j)\, P(C_j(x) \mid y_j) \right) \qquad (15)
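The minority‑label decision of Equations (14) and (15) can be read as comparing a weighted score for y_j = 1 (using the ConSet weight) against the score for y_j = 0 (using the NconSet weight). The following sketch encodes that reading; all names are illustrative rather than the authors'.

```python
# Illustrative reading of the minority-label decision in Equations (14)/(15).
def classify_minority(w_con, w_ncon, prior1, prior0, like1_r, like0_r, t, K):
    """w_con / w_ncon: Gaussian weights of the neighbor sets with / without
    label j; like1_r / like0_r: P(C_j(x) = r | y_j = 1) / P(C_j(x) = r | y_j = 0)
    for the observed neighbor count r; t/K sets the share of the weight term."""
    alpha = t / K
    score1 = alpha * w_con + (1 - alpha) * prior1 * like1_r
    score0 = alpha * w_ncon + (1 - alpha) * prior0 * like0_r
    return 1 if score1 >= score0 else 0
```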
The pseudocode of VWML‑kNN is presented in Algorithm 1.
As shown in the pseudocode, from step 1 to step 10, prior information within the train‑
ing dataset is calculated. In steps 12–14, the value of each unknown instance is obtained
by calculating the label distribution of the nearest neighbors. In steps 16–17, when classify‑
ing the minority label, the weights between the unknown instance and different label sets
of nearest neighbors are calculated. Finally, the test instance is classified using the new
decision function.
Algorithm 1: VWML‑kNN
Input: A multi‑label dataset D, test instance x
1. For i = 1 to m do:
2. Identify k nearest neighbors N(xi ) of xi
3. end for
4. For j = 1 to q do:
5. Calculate the IR and MeanIR according to Equations (1) and (3)
6. Estimate the prior probabilities P(yj = 1) and P(yj = 0) according to Equations (5) and (6)
7. If the label j of xi is 1 (yij = 1) and xi has z nearest neighbors containing label j
8. hj[z] = hj[z] + 1
9. Else h'j[z] = h'j[z] + 1
10. end for
11. Identify k nearest neighbors N(x) of x
12. For j = 1 to q do:
13. Calculate Cj(x) according to Equation (4)
14. Calculate the hj[z] and h'j[z] of x according to steps 7 to 9
15. If j is a majority label, return y according to Equations (11) and (12)
16. If j is a minority label, calculate the distances (s, d') between ConSet/NconSet and x
17. Convert the distances into weights according to Equation (13)
18. Return y according to Equation (15)
19. end for
20. end
Finally, one‑error measures how often the top‑ranked label is not in the true label set:
\mathrm{One\text{-}error}(f) = \frac{1}{m} \sum_{i=1}^{m} g\!\left(\arg\max_{y} f(x_i, y)\right) \qquad (18)

g(y) = \begin{cases} 0, & y \in Y_i \\ 1, & y \notin Y_i \end{cases} \qquad (19)
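A direct implementation of Equations (18) and (19) might look as follows, where scores holds the ranking values f(x_i, y) and Y_true the ground‑truth label matrix (both names are illustrative).

```python
# Minimal sketch of the one-error metric in Equations (18)/(19): the fraction of
# instances whose top-ranked label is not in the true label set. `scores` is an
# (m x q) array of ranking values f(x_i, y); `Y_true` is the 0/1 label matrix.
import numpy as np

def one_error(scores, Y_true):
    top = scores.argmax(axis=1)                       # top-ranked label per instance
    miss = Y_true[np.arange(len(top)), top] == 0      # g(.) = 1 when it is not a true label
    return float(miss.mean())
```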
3.4. Datasets
As shown in Table 1, three benchmark multilabel datasets of varying sizes and fields were
selected as experimental datasets: Enron, Corel5k, and yeast [34]. Enron is an e‑mail dataset
comprising approximately 500,000 real‑world e‑mails from 150 Enron employees. The raw corpus
has no labeling information but can be used for internal threat detection based on text and
social network analysis. Corel5k contains a total of 5000 pictures collected by Corel, cov‑
ering multiple themes such as dinosaurs, cars, beaches, etc. Yeast consists of micro‑array
expression data, as well as phylogenetic profiles of yeast.
The performance of the classifier is related to not only the number of labels but also the
characteristics of the dataset [35]. To show the different characteristics of the dataset, Card,
TCS [36], and Dens are introduced as the measurement of the datasets. Card indicates the
mean number of labels for each instance and is defined in Equation (20); Dens measures
the density of labels, defined in Equation (21); and TCS evaluates the complexity of the
dataset and is defined in Equation (22). A larger value implies a higher complexity of the
dataset, which increases the difficulty of the prediction of the correct classification result
for the classifier:
\mathrm{Card}(D) = \frac{1}{m} \sum_{i=1}^{m} |Y_i|, \qquad (20)

\mathrm{Dens}(D) = \frac{1}{q} \cdot \frac{1}{m} \sum_{i=1}^{m} |Y_i|, \qquad (21)
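Equations (20) and (21) translate directly into code. The sketch below assumes a 0/1 label matrix Y of shape (m, q); TCS is omitted because its Equation (22) is not reproduced in this excerpt.

```python
# Minimal sketch of the dataset statistics in Equations (20) and (21).
def cardinality(Y):
    return float(Y.sum(axis=1).mean())      # Card(D), Equation (20)

def density(Y):
    return cardinality(Y) / Y.shape[1]      # Dens(D) = Card(D) / q, Equation (21)
```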
By changing the values of k and t, we explored suitable parameters for VWML‑kNN and
analyzed the influence of different parameters. In our experiments, t was set to 1, 3, 5, and
7, and k was set to 5, 7, and 10 in each dataset. Other parameters in the algorithm were
selected as default parameters. A 10‑fold cross‑validation was used in this experiment. A
total of 10 experiments were performed on each dataset, and the results were averaged.
Figures 1–4 present the change in each evaluation metric with different parameter val‑
ues of k and t on different datasets. Among the experimental results, an overall superior
performance is achieved when k = 10 and t = 3, because this setting yields the lowest values
of the evaluation metrics on these datasets. With other values of k, the experimental results
are influenced to a greater extent by the imbalanced characteristics of the data. Intuitively,
when t = 1 or t = 7, the clas‑
sification result is not good enough. This is due to the existence of two types of extreme
instances in the dataset. We found that instances in the dataset that are quite close or far
away both lead to a large weight difference between instances that contain labels and
those that do not contain labels. If the instances are quite close, when t = 7, latent infor‑
mation such as the label distribution cannot be acquired, resulting in poor classification
accuracy. If they are quite far from each other, when t = 1, the MAP accounts for a large
proportion of the decision function, resulting again in poor classification accuracy.
Figure 1. Hamming loss for different values of k and t in different datasets. Different colors represent different values of t.

Figure 2. One‑error for different values of k and t in different datasets. Different colors represent different values of t.

Figure 3. Ranking loss for different values of k and t in different datasets. Different colors represent different values of t.

Figure 4. The results of different evaluation metrics in different datasets when t = 3.
Table 2. Experimental results from different multilabel classification methods on the Enron dataset.
Table 3. Experimental results from different multilabel classification methods on the Corel5k dataset.
Table 4. Experimental results from different multilabel classification methods on the yeast dataset.
Therefore, the experimental results demonstrate that VWML‑kNN can effectively clas‑
sify imbalanced multilabel data and has the best performance among the selected multil‑
abel classification algorithms.
5. Conclusions
This paper established an algorithm for the classification of imbalanced multilabel
data. Labels were divided into minority and majority labels, and different strategies for
different labels were adopted. A value calculation method was proposed to determine the value of
labels, which is then used to compute the MAP. In the classification of minority labels, the nearest
neighbors of the test instance were divided into sets with and without labels. Because of
a lack of prior information on minority labels, the algorithm calculated the distances be‑
tween the test instance and different nearest neighbor sets and converted these distances
into weights of nearest neighbor instances with and without labels. Finally, the MAPs of
the value calculation and weights were combined to classify the minority label. The results
of a series of experiments conducted on different datasets demonstrate the ability of the
established algorithm to classify imbalanced multilabel data. The results indicate that our
proposed VWML‑kNN achieves outstanding results on datasets with high TCS and high
MeanIR. Therefore, the proposed algorithm can be applied to the multilabel classification
of various fields that involve label imbalance, such as drug molecule identification, build‑
ing identification, and text categorization. The VWML‑kNN also has some limitations. For
example, the calculation method of the distance could be improved from the ordinary Eu‑
clidean metric. Moreover, the features are not sufficiently fused in the VWML‑kNN. In
the future, the authors plan to conduct in‑depth studies on multilabel imbalanced classification,
especially on the relationships within labels.
Author Contributions: Conceptualization, Z.W. and P.Z.; methodology, Z.W. and H.X.; software,
H.X.; validation, Z.W., H.X. and P.Z.; formal analysis, Z.W. and H.X.; investigation, Z.W. and H.X;
resources, H.X. and G.X.; data curation, H.X.; writing—original draft preparation, Z.W. and H.X;
writing—review and editing, H.X., P.Z. and G.X.; visualization, Z.W.; supervision, G.X.; project
administration, P.Z. and G.X.; funding acquisition, Z.W., P.Z. and G.X. All authors have read and
agreed to the published version of the manuscript.
Funding: This research was funded by the Science and Technology Key Research Planning Project
of Zhejiang Province, China, under Grant no. 2021C03136; the Lishui Major Research and Develop‑
ment Program, China, under Grant no. 2019ZDYF03; the Postdoctoral Research Program of Zhejiang
University of Technology under Grant no. 236527; and the Public Welfare Technology Application
Research Program Project of Lishui, China, under Grant no. 2022GYX12.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The datasets used for the experiments conducted in this paper can be
found in the Mulan repository (http://mulan.sourceforge.net/datasets‑mlc.html, URL accessed on 28
June 2021).
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Qian, W.; Huang, J.; Wang, Y.; Xie, Y. Label distribution feature selection for multi‑label classification with rough set. Int. J.
Approx. Reason. 2021, 128, 32–35. [CrossRef]
2. Maser, M.; Cui, A.; Ryou, S.; Delano, T.; Yue, Y.; Reisman, S. Multilabel Classification Models for the Prediction of Cross‑Coupling
Reaction Conditions. J. Chem. Inf. Model. 2021, 61, 156–166. [CrossRef]
3. Bashe, A.; Mclaughlin, R.J.; Hallam, S.J. Metabolic pathway inference using multi‑label classification with rich pathway features.
PLoS Comput. Biol. 2020, 16, e1008174.
4. Che, X.; Chen, D.; Mi, J.S. A novel approach for learning label correlation with application to feature selection of multi‑label data.
Inf. Sci. 2019, 512, 795–812. [CrossRef]
5. Huang, M.; Sun, L.; Xu, J.; Zhang, S. Multilabel Feature Selection Using Relief and Minimum Redundancy Maximum Relevance
Based on Neighborhood Rough Sets. IEEE Access 2020, 8, 62011–62031. [CrossRef]
6. Chen, Z.M.; Wei, X.S.; Jin, X.; Guo, Y.W. Multi‑label image recognition with joint class‑aware map disentangling and label
correlation embedding. In Proceedings of the 2019 IEEE International Conference on Multimedia and Expo, Shanghai, China,
8–12 July 2019; pp. 622–627.
7. Ben‑Cohen, A.; Zamir, N.; Ben‑Baruch, E.; Friedman, I.; Zelnik‑Manor, L. Semantic Diversity Learning for Zero‑Shot Multi‑Label
Classification. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–
17 October 2021; pp. 640–650.
8. Yu, G.; Domeniconi, C.; Rangwala, H.; Zhang, G.; Yu, Z. Transductive multi‑label ensemble classification for protein function
prediction. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
Beijing, China, 12–16 August 2012; pp. 1077–1085.
9. Maltoudoglou, L.; Paisios, A.; Lenc, L.; Martinek, J.; Kral, P.; Papadopoulos, H. Well‑calibrated confidence measures for multi‑
label text classification with a large number of labels. Pattern Recognit. 2022, 122, 108271. [CrossRef]
10. Maragheh, H.K.; Gharehchopogh, F.S.; Majidzadeh, K.; Sangar, A.B. A new hybrid based on long short‑term memory network
with spotted hyena optimization algorithm for multi‑label text classification. Mathematics 2022, 10, 488. [CrossRef]
11. Bhusal, D.; Panday, S.P. Multi‑label classification of thoracic diseases using dense convolutional network on chest radiographs.
arXiv 2022, arXiv:2202.03583.
12. Xu, H.; Cai, Z.; Li, W. Privacy‑preserving mechanisms for multi‑label image recognition. ACM Trans. Knowl. Discov. Data 2022,
16, 1–21. [CrossRef]
13. García‑Pedrajas, N.E. ML‑k’sNN: Label Dependent k Values for Multi‑Label k‑Nearest Neighbor Rule. Mathematics 2023, 11, 275.
14. Crammer, K.; Singer, Y. On the algorithmic implementation of multiclass kernel‑based vector machines. J. Mach. Learn. Res.
2002, 2, 265–292.
15. Gao, B.B.; Zhou, H.Y. Learning to Discover Multi‑Class Attentional Regions for Multi‑Label Image Recognition. IEEE Trans.
Image Process. 2021, 30, 5920–5932. [CrossRef] [PubMed]
16. Wu, C.W.; Shie, B.E.; Yu, P.S.; Tseng, V.S. Mining top‑K high utility itemset. In Proceedings of the 18th ACM SIGKDD Interna‑
tional Conference on Knowledge Discovery and Data Mining, Beijing, China, 12–16 August 2012; pp. 78–86.
17. Godbole, S.; Sarawagi, S. Discriminative methods for multi‑labeled classification. In Proceedings of the Pacific‑Asia Conference
on Knowledge Discovery and Data Mining, Sydney, Australia, 26–28 May 2004; pp. 22–30.
Computation 2023, 11, 32 15 of 15
18. Boutell, M.R.; Luo, J.; Shen, X.; Brown, C.M. Learning multi‑label scene classification. Pattern Recognit. 2004, 37, 1757–1771.
[CrossRef]
19. Elisseeff, A.E.; Weston, J. A kernel method for multi‑labelled classification. In Proceedings of the 14th International Conference
on Neural Information Processing Systems: Natural and Synthetic, Vancouver, BC, Canada, 3–8 December 2001; pp. 681–687.
20. Zhang, M.L.; Zhou, Z.H. ML‑KNN: A lazy learning approach to multi‑label learning. Pattern Recognit. 2007, 40, 2038–2048.
[CrossRef]
21. Zhang, M.; Zhou, Z. A Review on Multi‑Label Learning Algorithms. IEEE Trans. Knowl. Data Eng. 2014, 26, 1819–1837. [CrossRef]
22. Li, J.; Li, P.; Hu, X.; Yu, K. Learning common and label‑specific features for multi‑Label classification with correlation information.
Pattern Recognit. 2022, 121, 108259. [CrossRef]
23. Younes, Z.; Abdallah, F.; Denoeux, T. Multi‑label classification algorithm derived from k‑nearest neighbor rule with label de‑
pendencies. In Proceedings of the 2008 16th European Signal Processing Conference, Lausanne, Switzerland, 25–29 August 2008;
pp. 1–5.
24. Cheng, W.; Hüllermeier, E. Combining instance‑based learning and logistic regression for multilabel classification. Mach. Learn.
2009, 76, 211–225. [CrossRef]
25. Xu, J. Multi‑label weighted k‑nearest neighbor classifier with adaptive weight estimation. In Proceedings of the 18th International
Conference on Neural Information Processing, Shanghai, China, 13–17 November 2011; pp. 79–88.
26. Zhang, M. An Improved Multi‑Label Lazy Learning Approach. J. Comput. Res. Dev. 2012, 49, 2271–2282.
27. Reyes, O.; Morell, C.; Ventura, S. Evolutionary feature weighting to improve the performance of multi‑label lazy algorithms.
Integr. Comput. Aided Eng. 2014, 21, 339–354. [CrossRef]
28. Zeng, Y.; Fu, H.M.; Zhang, Y.P.; Zhao, X.Y. An Improved ML‑kNN Algorithm by Fusing Nearest Neighbor Classification.
DEStech Trans. Comput. Sci. Eng. 2017, 1, 193–198. [CrossRef]
29. Vluymans, S.; Cornelis, C.; Herrera, F.; Saeys, Y. Multi‑label classification using a fuzzy rough neighborhood consensus. Inf. Sci.
2018, 433–434, 96–114. [CrossRef]
30. Wang, D.; Wang, J.; Hu, F.; Li, L.; Zhang, X. A Locally Adaptive Multi‑Label k‑Nearest Neighbor Algorithm. In Proceedings of
the Pacific‑Asia Conference on Knowledge Discovery and Data Mining, Melbourne, Australia, 3–6 June 2018; pp. 81–93.
31. Charte, F.; Rivera, A.; Del Jesus, M.J. Addressing imbalance in multilabel classification: Measures and random resampling algo‑
rithms. Neurocomputing 2015, 163, 3–16. [CrossRef]
32. Madjarov, G.; Kocev, D.; Gjorgjevikj, D.; Dzeroski, S. An extensive experimental comparison of methods for multi‑label learning.
Pattern Recognit. 2012, 45, 3084–3104. [CrossRef]
33. Charte, F.; Rivera, A.; Del Jesus, M.J.; Herrera, F. Dealing with difficult minority labels in imbalanced multilabel data sets. Neu‑
rocomputing 2019, 326, 39–53. [CrossRef]
34. Tsoumakas, G.; Spyromitros‑Xioufis, E.; Vilcek, J.; Vlahavas, I. Mulan: A java library for multi‑label learning. J. Mach. Learn. Res.
2011, 12, 2411–2414.
35. Zhou, S.; Li, X.; Dong, Y.; Xu, H. A Decoupling and Bidirectional Resampling Method for Multilabel Classification of Imbalanced
Data with Label Concurrence. Sci. Program. 2020, 2020, 8829432. [CrossRef]
36. Charte, F.; Rivera, A.; Del Jesus, M.J.; Herrera, F. Concurrence among imbalanced labels and its influence on multilabel resam‑
pling algorithms. In Proceedings of the International Conference on Hybrid Artificial Intelligence Systems, Salamanca, Spain,
11–13 June 2014; pp. 110–121.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual au‑
thor(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.