Effective Detection of Alzheimer's Disease by Optimizing Fuzzy K-Nearest Neighbors Based On Salp Swarm Algorithm
ARTICLE INFO

Keywords: Feature selection; Salp swarm algorithm; Alzheimer's disease; Medical diagnosis; Swarm intelligence algorithm

ABSTRACT

Alzheimer's disease (AD) is a typical senile degenerative disease that has received increasing attention worldwide. Many artificial intelligence methods have been used in the diagnosis of AD. In this paper, a fuzzy k-nearest neighbor method based on the improved binary salp swarm algorithm (IBSSA-FKNN) is proposed for the early diagnosis of AD, so as to distinguish between patients with mild cognitive impairment (MCI), patients with Alzheimer's disease (AD), and normal controls (NC). First, the performance and feature selection accuracy of the method are validated on 5 different benchmark datasets. Secondly, using a structural magnetic resonance imaging (sMRI) dataset, the effectiveness of the method on AD data is verified in terms of classification accuracy, sensitivity, specificity, and other indicators. The simulation results show that the classification accuracies of this method for AD vs. MCI, AD vs. NC, and MCI vs. NC are 95.37%, 100%, and 93.95%, respectively. These accuracies are better than those of the five comparison methods. The method proposed in this paper can learn better feature subsets from serial multimodal features, so as to improve the performance of early AD diagnosis. It has a good application prospect and will bring great convenience for clinicians to make better decisions in clinical diagnosis.
Velliangiri et al. combined deep feature reduction technology with a gradient-descent-optimized twin support vector machine classifier (TSVM-GDO) to classify AD, which greatly improved the classification accuracy and greatly shortened the execution time [7]. Seo Jungryul et al. used a deep learning model combining a multi-layer perceptron, SVM, and RNN and achieved an experimental accuracy rate of 70.97% [8].

The advantages of KNN are that it is user-friendly, easy to understand, interpretable, and has a high accuracy rate. The KNN method weighs the selected neighbors equally, without considering their distances to the query point [9]. Usually, a more advanced form of the KNN method is used. Keller introduced fuzzy sets to improve KNN and proposed the fuzzy K-nearest neighbor (FKNN) classifier, which applies fuzzy logic by assigning a degree of membership to each class based on the distances of the k nearest neighbors [10]. Since FKNN was proposed, it has been widely used in various classification tasks and applied in many fields, such as biological and image data classification [11], face recognition [12], Parkinson's disease diagnosis [13], tracking moving targets in videos [14], etc. Meanwhile, some researchers use meta-heuristics to solve practical problems, such as medical diagnosis [15,16], financial distress prediction [17], parameter extraction of solar cells [18], engineering design problems [19,20], feature selection [21,22], education prediction [23], PID control [24], wind speed prediction [25], rolling bearing fault diagnosis [26], gate resource allocation [27], and scheduling problems [28]. When using FKNN to solve practical problems, there are two issues to deal with. On the one hand, proper parameter settings play an important role in designing an effective FKNN model: the first parameter is the neighborhood size k, and the second is the fuzzy strength parameter m. On the other hand, choosing the optimal subset of input features also greatly affects the performance of the FKNN model.

Feature selection is a commonly used dimensionality reduction method that refers to selecting a subset of attributes from the original set of attributes. Its main purpose is to identify important features, eliminate irrelevant or unnecessary features, and build a good learning model. Feature selection greatly reduces the computational time of the induction algorithm and improves the accuracy of the resulting model. Feature selection can be divided into two categories: correlation-based filtered feature selection and search-based heuristic feature selection [29]. In recent years, algorithms inspired by nature have become very popular for solving various optimization problems. Some recently proposed metaheuristic algorithms, for example, monarch butterfly optimization (MBO) [30], the slime mould algorithm (SMA) [31], the moth search algorithm (MSA) [32], hunger games search (HGS) [33], the Runge Kutta method (RUN) [34], the colony predation algorithm (CPA) [35], the weighted mean of vectors (INFO) [36], and Harris hawks optimization (HHO) [37], have also attracted the attention of many scholars. In this paper, the binary salp swarm algorithm (BSSA) is used to optimize the FKNN classifier and perform feature selection at the same time. To further study the role of this method in dealing with practical problems, this paper discretizes the improved SSA into a binary ISSA (IBSSA) and applies it to feature selection with the goal of finding the optimal feature subset. On the one hand, on the BreastCancer, glass, hepatitisfulldata, Lymphography, and WDBC datasets obtained from the UCI Machine Learning Repository, the effectiveness of this method is tested in terms of classification accuracy, sensitivity, specificity, and other indicators. On the other hand, in order to verify the effectiveness of this method in the diagnosis of early AD, we used MRI, PET, and CSF multimodal feature data from the international Alzheimer's Disease Neuroimaging Initiative (ADNI) and compared the proposed method with other methods that combine swarm intelligence algorithms with an FKNN classifier. The experimental results show that the IBSSA-FKNN method can effectively improve the classification performance and the performance of early AD diagnosis. It has a good application prospect and will bring great convenience for clinicians to make better decisions in clinical diagnosis.

The rest of this paper is organized as follows: Section 2 introduces the FKNN classifier and the salp swarm algorithm; Section 3 introduces the improved binary salp swarm algorithm; Section 4 introduces the classification method proposed in this paper, namely IBSSA-FKNN; Section 5 presents experiments and result analysis on traditional datasets; and Section 6 presents experiments and result analysis on the sMRI dataset. Finally, Section 7 gives the conclusions and prospects for future work.

2. Background materials

2.1. Fuzzy K-nearest neighbors (FKNN)

KNN is one of the simplest classifiers. For a sample to be classified, KNN determines the sample's class as the mode of the classes of its k nearest neighbors. However, this method implicitly assumes that each sample has the same weight and belongs to only one class, which is not the case in reality. To address these two problems, Keller introduced fuzzy set theory into KNN and proposed the FKNN algorithm. In FKNN, each sample belongs to multiple classes with different membership degrees, rather than to only one class. Furthermore, FKNN assigns different weights to the neighbors according to their distances to the sample: closer neighbors carry more weight in determining the class than farther ones. In FKNN, the fuzzy membership of a sample x in class i is computed from its k nearest neighbors according to the following formula:

u_i(x) = \frac{\sum_{j=1}^{k} u_{ij}\left(1/\|x - x_j\|^{2/(m-1)}\right)}{\sum_{j=1}^{k} \left(1/\|x - x_j\|^{2/(m-1)}\right)}    (1)

where x_j is the j-th nearest neighbor, u_{ij} is the membership of x_j in class i, k is the number of neighbors, and m is the fuzzy strength parameter.
After the memberships for all classes are computed, the sample x is assigned to the class with the largest membership value:

C(x) = \arg\max_{i=1,\dots,C}\big(u_i(x)\big)

where C is the number of classes.
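To make the decision rule concrete, the following Python sketch computes the memberships of Eq. (1) and the subsequent argmax decision for a single query point. Crisp (one-hot) training memberships and all names are illustrative assumptions of this sketch, not the authors' implementation (which is in MATLAB).

```python
import numpy as np

def fknn_predict(X_train, y_train, x, k=5, m=2.0, eps=1e-12):
    """Fuzzy KNN prediction for one query point x.

    Memberships follow Eq. (1): u_i(x) = sum_j u_ij * w_j / sum_j w_j with
    w_j = 1 / ||x - x_j||^(2/(m-1)); the class is then chosen by argmax.
    Crisp (one-hot) memberships u_ij are assumed for the training samples.
    """
    classes = np.unique(y_train)
    d = np.linalg.norm(X_train - x, axis=1)          # distances to all training samples
    nn = np.argsort(d)[:k]                           # indices of the k nearest neighbours
    w = 1.0 / (d[nn] ** (2.0 / (m - 1.0)) + eps)     # distance-based weights
    u = np.array([(y_train[nn] == c) @ w for c in classes]) / w.sum()
    return classes[np.argmax(u)], dict(zip(classes, u))

# toy usage
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4)); y = (X[:, 0] > 0).astype(int)
label, memberships = fknn_predict(X, y, rng.normal(size=4), k=7, m=2.0)
```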
2.2. Salp swarm algorithm (SSA)

The salp swarm algorithm (SSA) is a global optimization algorithm based on swarm intelligence that was proposed by Mirjalili et al. in 2017 [38]. The salp is a kind of marine creature with a transparent, barrel-shaped body whose tissue is similar to that of a jellyfish. In SSA, the salp chain is divided into a leader (the first salp) and followers. The leader's position is updated with respect to the food source F as follows:

X_d^1 = \begin{cases} F_d + c_1\big((ub_d - lb_d)c_2 + lb_d\big), & c_3 \ge 0.5 \\ F_d - c_1\big((ub_d - lb_d)c_2 + lb_d\big), & c_3 < 0.5 \end{cases}    (2)

where X_d^1 and F_d are the positions of the leader and of the food in the d-th dimension, respectively; ub_d and lb_d are the corresponding upper and lower bounds, respectively; and c_1, c_2, and c_3 are control parameters. Equation (2) shows that the leader's location update is only related to the location of the food. c_1 is the convergence factor of the optimization algorithm; it balances global exploration and local exploitation and is the most important control parameter in SSA. The expression of c_1 is:

c_1 = 2e^{-\left(\frac{4l}{L}\right)^2}    (3)

where l is the current iteration and L is the maximum number of iterations. The followers update their positions according to

X_d^{i\prime} = \frac{1}{2}\big(X_d^i + X_d^{i-1}\big), \quad i \ge 2    (4)

where X_d^{i\prime} and X_d^i are the updated follower's position and the pre-update follower's position in the d-th dimension, respectively.
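As a compact illustration of these update rules, the sketch below performs one continuous SSA iteration. The bound handling by clipping and the per-dimension random draws are assumptions of this sketch rather than details taken from the paper.

```python
import numpy as np

def ssa_step(X, food, lb, ub, t, T):
    """One iteration of the continuous Salp Swarm Algorithm.

    X: (n, d) population, food: (d,) best position found so far,
    lb/ub: (d,) bounds, t: current iteration, T: maximum iterations.
    """
    n, d = X.shape
    c1 = 2.0 * np.exp(-(4.0 * t / T) ** 2)            # Eq. (3): convergence factor
    Xn = X.copy()
    for j in range(d):                                 # leader update, Eq. (2)
        c2, c3 = np.random.rand(), np.random.rand()
        step = c1 * ((ub[j] - lb[j]) * c2 + lb[j])
        Xn[0, j] = food[j] + step if c3 >= 0.5 else food[j] - step
    for i in range(1, n):                              # follower update, Eq. (4)
        Xn[i] = 0.5 * (X[i] + X[i - 1])
    return np.clip(Xn, lb, ub)                         # keep positions inside the bounds
```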
3. Improved binary salp swarm algorithm (IBSSA)

3.1. Population initialization strategy based on the cubic chaotic map

Chaotic sequences have the advantages of easy implementation, short execution time, and the ability to jump out of local optima, so they are widely used in random-based optimization algorithms. The Lyapunov exponent is often used to judge the dynamic behavior of a system: the larger its value, the higher the degree of chaos. Feng et al. analyzed the chaotic sequences generated by 16 common chaotic maps [41], and the results showed that the running time of the cubic chaotic map is short and its Lyapunov exponent is close to the optimal value. In this paper, the cubic chaotic map is used to optimize the initial solutions and improve the search efficiency.

The expression of the standard cubic chaotic mapping function is:

x_{n+1} = \alpha x_n^3 - \beta x_n    (11)

where \alpha and \beta are chaos-influencing factors, and the range of the cubic map differs for different values of \alpha and \beta. Generally, when \beta \in (2.3, 3), the sequence generated by the cubic map is chaotic. In addition, when \alpha = 1, x_n \in (-2, 2), and when \alpha = 4, x_n \in (-1, 1). To make x_n \in (0, 1), the cubic map used in the improved algorithm takes the following form:

x_{n+1} = \rho x_n\left(1 - x_n^2\right), \quad x_n \in (0, 1)    (12)

where \rho is the control parameter; the chaotic behavior of the cubic map is closely related to its value. Here, the initial value is x_0 = 0.3 and the number of iterations is 10000. The simulation results of the cubic map are shown in Fig. 1. It can be seen from the figure that when \rho = 2.59, the cubic map fully covers (0, 1) and has the best chaotic ergodicity.
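A minimal sketch of this initialization step is given below, assuming ρ = 2.59 and x0 = 0.3 as stated above; thresholding at 0.5 to obtain an initial 0/1 feature mask is an illustrative choice, not necessarily the authors' exact procedure.

```python
import numpy as np

def cubic_chaos_init(n, d, rho=2.59, x0=0.3):
    """Initialise an (n, d) population in (0, 1) with the cubic map of Eq. (12):
    x_{n+1} = rho * x_n * (1 - x_n^2)."""
    seq = np.empty(n * d)
    x = x0
    for i in range(n * d):
        x = rho * x * (1.0 - x * x)
        seq[i] = x
    return seq.reshape(n, d)

pop = cubic_chaos_init(20, 10)        # e.g. 20 salps, 10 candidate features
binary_pop = (pop > 0.5).astype(int)  # one possible way to obtain an initial 0/1 feature mask
```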
3.2. Binary position updating based on the sigmoid transfer function

In the original SSA, each salp uses a position vector in the continuous domain to move around the search space. The transformation between the continuous solution space and the discrete solution space can be achieved through a specific transfer function, generally a sigmoid transfer function. At the same time, the positions of the salps may stay at some local points and remain unchanged when the values are large. To avoid this weakness, the sigmoid transfer function is used here [42], and the transformed value is used as a probability to change the value of each element of the position:

S\big(x_{ij}(t)\big) = \frac{1}{1 + \exp\big(-x_{ij}(t)\big)}    (13)

where x_{ij}(t) is the velocity of the i-th individual in the j-th dimension at time t, and S(x_{ij}(t)) is the probability that the position x_{ij}(t) takes the value 1 or 0. After calculating the transition probability, Eq. (14) is used to update the position of the salps:

x_{ij}(t) = \begin{cases} 1, & rand \ge S\big(x_{ij}(t)\big) \\ 0, & rand < S\big(x_{ij}(t)\big) \end{cases}    (14)

where x_{ij}(t) is the binary position of the i-th salp in the j-th dimension at time t. A maximum position value X_{max} is selected to limit the range of x_{ij}(t), that is, x_{ij}(t) \in [-X_{max}, X_{max}], which also limits the probability that the position x_{ij}(t) is converted to 1 or 0.
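The following sketch transcribes the transfer rule of Eqs. (13)–(14) as printed above (a bit becomes 1 when the random number is greater than or equal to S); the input range and array shapes are illustrative.

```python
import numpy as np

def binarize(X_cont):
    """Map a continuous position matrix to 0/1 using Eqs. (13)-(14).

    S(x) = 1 / (1 + exp(-x)); a bit is set to 1 when rand >= S(x) and to 0
    otherwise, following the rule exactly as printed above.
    """
    S = 1.0 / (1.0 + np.exp(-X_cont))           # Eq. (13): transfer probability
    r = np.random.rand(*X_cont.shape)
    return (r >= S).astype(int)                  # Eq. (14)

mask = binarize(np.random.uniform(-6, 6, size=(20, 30)))  # e.g. 20 salps, 30 candidate features
```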
3.3. Follower position updating strategy based on the variable helix mechanism

In the salp swarm algorithm, the position update of the i-th follower is determined by the position coordinates of the i-th and (i−1)-th salps, i.e., only by the positions of the previous individual and the current individual in the salp chain. Therefore, the updated followers are highly dependent on the leading individuals of the previous update, which easily limits the global search ability and the local search speed of the algorithm. To solve this problem, a variable helix factor is introduced, which makes full use of the individual's opposite solution about the origin, reduces the number of individuals beyond the boundary, and ensures that the algorithm has a detailed and flexible search ability.
The variable helix factor is calculated as follows:

H = a \cdot \cos(k \cdot l \cdot \pi)    (15)

a = \begin{cases} 1, & t < \frac{M}{2} \\ e^{5l}, & \text{otherwise} \end{cases}    (16)

l = 1 - 2\,\frac{t}{M}    (17)

where t is the current iteration and M is the maximum number of iterations. This mechanism enables the follower to make full use of the entire search space, more easily escape the attraction of local optima, strengthen the search of the whole space, maintain the diversity of the population, enhance the exploration ability of the algorithm in the early stage, and improve its exploitation ability in the later stage. Based on this, the follower position update formula becomes:

x_d^{i\prime} = \begin{cases} \frac{1}{2}\cos(a \cdot l \cdot \pi)\big(x_d^i + x_d^{i-1}\big), & t < \frac{M}{2} \\ \frac{1}{2}\,e^{5l}\cos(a \cdot l \cdot \pi)\big(x_d^i + x_d^{i-1}\big), & t > \frac{M}{2} \end{cases}    (18)
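A direct transcription of Eqs. (16)–(18) into code is sketched below. How this helix update is interleaved with the standard SSA update is not fully specified in the text, so the function only performs the follower update itself.

```python
import numpy as np

def helix_follower_update(X, t, M):
    """Variable-helix follower update, Eqs. (16)-(18).

    X: (n, d) continuous positions, t: current iteration, M: max iterations.
    Followers (rows 1..n-1) are updated from the mean of the i-th and
    (i-1)-th salps, modulated by the helix factor.
    """
    l = 1.0 - 2.0 * t / M                         # Eq. (17)
    a = 1.0 if t < M / 2 else np.exp(5.0 * l)     # Eq. (16)
    Xn = X.copy()
    for i in range(1, X.shape[0]):
        pair = X[i] + X[i - 1]
        if t < M / 2:                             # Eq. (18), first branch
            Xn[i] = 0.5 * np.cos(a * l * np.pi) * pair
        else:                                     # Eq. (18), second branch
            Xn[i] = 0.5 * np.exp(5.0 * l) * np.cos(a * l * np.pi) * pair
    return Xn
```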
3.4. Dimensional random difference mutation

Random difference mutation is used to mutate individuals dimension by dimension, and a new value for each dimension is obtained through this mutation. The specific formula is as follows:

x_j^i = r_1 \times \big(F_j - x_j^i\big) + r_2 \times \big(x_j^{\prime} - x_j^i\big)    (19)

where x_j^i is the j-th dimension of the i-th individual in the salp swarm, F_j is the j-th dimension of the food source location, x_j^{\prime} is the j-th dimension of a random individual in the population, and r_1 and r_2 are random numbers in [0, 1]. After the population location update is completed, the dimension-by-dimension random difference mutation is applied to each dimension of the individual, and each dimension is evaluated after it mutates. If the result is better, the mutated solution is retained; if the evaluation result becomes worse after the mutation, the poor dimension information is discarded. This reduces the interference between dimensions and increases the search scope. Due to the blindness of the mutation operation, the search efficiency of the algorithm would be reduced and the amount of computation would be greatly increased if all individuals were subjected to dimensional random difference mutation. Therefore, only the best and the worst individuals in the population are selected for mutation. Mutating the best individual improves the search efficiency, while mutating the worst individual enlarges the search range and helps the algorithm jump out of local optima.
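The sketch below applies the dimension-by-dimension mutation of Eq. (19) to a single individual, keeping a mutated dimension only if it improves the objective. Treating the objective as something to minimise (e.g. 1 − CV accuracy) is an assumption of this illustration.

```python
import numpy as np

def dimension_mutation(x, food, rand_ind, objective):
    """Dimension-by-dimension random difference mutation, Eq. (19).

    x: (d,) individual (the best or the worst one), food: (d,) food source,
    rand_ind: (d,) a randomly chosen individual, objective: callable to be
    minimised (e.g. 1 - cross-validated accuracy). Each dimension is mutated
    in turn and the change is kept only if the objective improves.
    """
    best = x.copy()
    best_val = objective(best)
    for j in range(len(x)):
        trial = best.copy()
        r1, r2 = np.random.rand(), np.random.rand()
        trial[j] = r1 * (food[j] - best[j]) + r2 * (rand_ind[j] - best[j])
        val = objective(trial)
        if val < best_val:                 # keep the mutated dimension only if it helps
            best, best_val = trial, val
    return best
```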
4. Proposed IBSSA-FKNN model

On the one hand, in FKNN the distance weights of the k neighbors are calculated based on distance measures alone, without distinguishing the importance of features and without taking into account the impact of different distances from different neighbors to the center of the sample class. FKNN also needs to compute the distance to all training samples to obtain the k neighbors, resulting in a large amount of computation. In view of these problems, many scholars have studied and improved FKNN, mainly by selecting or optimizing some of the parameters involved in the algorithm. These improved methods have increased accuracy and reliability to a certain extent, but still have shortcomings. On the other hand, SSA has strong optimization ability and high optimization accuracy, but for complex problems it can still fall into local extrema. Therefore, first, the cubic mapping method is used to initialize the population, so that the initial salp population covers the feasible region more evenly; secondly, the variable helix factor is introduced, which makes full use of the individual's opposite solution about the origin, reduces the number of individuals beyond the boundary, and ensures the algorithm has a detailed and flexible search ability; finally, the best and the worst of the updated individuals are selected for dimensional random difference mutation. In order to further study the role of this method in dealing with practical problems, this paper discretizes it into a binary ISSA (IBSSA) and applies it to feature selection with the goal of finding the optimal feature subset.

In this section, we apply the IBSSA algorithm to feature selection for the original FKNN and create a model called IBSSA-FKNN. The main goals of this model are to optimize the FKNN classifier, i.e., (1) determine the number of nearest neighbors k and the fuzzy strength parameter m, and (2) identify the best subset of discriminative features through feature selection. The resulting feature subset is used as input to the optimized FKNN model for classification. The IBSSA-FKNN method takes the diagnostic accuracy as the fitness of the feature selection. The flowchart of the overall architecture of the proposed IBSSA-FKNN model is shown in Fig. 2.

Fig. 2. Flowchart of the proposed IBSSA-FKNN diagnostic system.
A flag vector for feature selection is shown in Fig. 3. The vector, consisting of a series of binary values 0 and 1, represents a subset of features, that is, an actual feature vector, which has been normalized. For a problem with D dimensions, there are D bits in the vector. The i-th feature is selected if the value of the i-th bit equals one; otherwise, this feature is not selected (i = 1, 2, …, D). The size of a feature subset is the number of bits whose values are one in the vector. The pseudocode of the IBSSA feature selection procedure is presented in Algorithm 2.

Fig. 3. A flag vector for feature selection.

Algorithm 2. Pseudo-code for the feature selection procedure
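As a rough companion to Algorithm 2, the condensed sketch below outlines one plausible way the components of Sections 3.1–3.4 can be chained into a feature-selection loop. The helper routines are the illustrative sketches given earlier, the position bounds are assumed, and the chaining is a guess rather than the authors' exact pseudo-code.

```python
import numpy as np

def ibssa_feature_selection(cv_accuracy, d, n_salps=20, max_iter=1000):
    """Condensed sketch of the IBSSA feature-selection loop (cf. Algorithm 2).

    cv_accuracy(mask) should return the 10-fold CV accuracy of the FKNN model
    trained on the features selected by the 0/1 mask (the fitness function).
    Reuses cubic_chaos_init, ssa_step, helix_follower_update and binarize
    from the earlier sketches.
    """
    lb, ub = -6.0 * np.ones(d), 6.0 * np.ones(d)          # assumed search bounds
    pos = lb + cubic_chaos_init(n_salps, d) * (ub - lb)    # chaotic initial positions
    masks = binarize(pos)
    fits = np.array([cv_accuracy(m) for m in masks])
    i = int(np.argmax(fits))
    food, best_mask, best_fit = pos[i].copy(), masks[i].copy(), fits[i]
    for t in range(1, max_iter + 1):
        pos = ssa_step(pos, food, lb, ub, t, max_iter)     # leader/follower update
        pos = helix_follower_update(pos, t, max_iter)      # variable helix mechanism
        masks = binarize(pos)                              # sigmoid transfer, Eqs. (13)-(14)
        fits = np.array([cv_accuracy(m) for m in masks])
        # (dimensional random difference mutation of the best/worst salps would go here)
        i = int(np.argmax(fits))
        if fits[i] > best_fit:
            food, best_mask, best_fit = pos[i].copy(), masks[i].copy(), fits[i]
    return best_mask, best_fit
```

In this sketch, cv_accuracy could be supplied by the classification-phase routine outlined after Algorithm 3 below.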
After the parameter pair and the feature subset are obtained, the FKNN model performs the classification tasks. First, the FKNN is trained on the reduced training feature space using the parameter pair to evolve an optimal model, and then the optimal FKNN model is employed to predict the new samples on the reduced testing feature space. The whole process is done via 10-fold CV analysis, and finally the average results over the 10 folds are computed. The detailed pseudo-code for the classification phase is as follows.

Algorithm 3. Pseudo-code for the classification procedure
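A small sketch of this classification phase (cf. Algorithm 3) is given below. It reuses the fknn_predict sketch from Section 2.1 and scikit-learn's StratifiedKFold; stratified splitting and the parameter defaults are assumptions of this illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def classify_with_selected_features(X, y, mask, k=5, m=2.0, n_folds=10, seed=0):
    """Evaluate the FKNN model on the reduced feature space via 10-fold CV
    and return the mean accuracy (cf. Algorithm 3)."""
    Xr = X[:, mask.astype(bool)]                        # keep only the selected features
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    accs = []
    for tr, te in skf.split(Xr, y):
        preds = [fknn_predict(Xr[tr], y[tr], x, k=k, m=m)[0] for x in Xr[te]]
        accs.append(np.mean(np.array(preds) == y[te]))
    return float(np.mean(accs))
```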
5. Traditional dataset experiments

5.1. Dataset description

In order to verify the effectiveness of the proposed method, this section conducts experiments with the IBSSA-FKNN method on 5 classification datasets: BreastCancer, glass, hepatitisfulldata, Lymphography, and WDBC. The datasets are from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/datasets). Among them, the BreastCancer dataset has 699 samples with 9 features and 2 categories; the glass dataset has 214 samples with 10 features and 2 categories; the hepatitisfulldata dataset has 155 samples with 20 features and 2 categories; the Lymphography dataset has 148 samples with 18 features and 4 categories; and the WDBC dataset has 569 samples with 30 features and 2 categories. The detailed description of the datasets is shown in Table 1.

Before the experiments, the data need to be preprocessed. Since the BreastCancer dataset contains missing feature values, the missing entries are filled with the mean value of the corresponding feature in this experiment in order to ensure the integrity of the sample data.
At the same time, in order to reduce the difference between feature values and prevent features with larger values from excessively affecting those with smaller values, each feature is normalized to the [−1, 1] interval. The normalization formula is:

x^{\prime} = \frac{x - \min_a}{\max_a - \min_a} \times 2 - 1    (20)

where x is the original value of the data, x^{\prime} is the normalized value, and \max_a and \min_a are the maximum and minimum values in feature a, respectively.
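A one-line, column-wise implementation of Eq. (20), with a small worked example (all names are illustrative):

```python
import numpy as np

def scale_to_minus1_1(X):
    """Column-wise min-max scaling to [-1, 1], Eq. (20):
    x' = (x - min_a) / (max_a - min_a) * 2 - 1 for each feature a."""
    mins, maxs = X.min(axis=0), X.max(axis=0)
    return (X - mins) / (maxs - mins) * 2.0 - 1.0

X_scaled = scale_to_minus1_1(np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]))
# -> [[-1., -1.], [0., 0.], [1., 1.]]
```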
Table 1
Detailed description of the datasets.
NO.  dataset            number of categories  number of samples  number of features  missing values
1    BreastCancer       2                     699                9                   yes
2    glass              2                     214                10                  no
3    hepatitisfulldata  2                     155                20                  no
4    Lymphography       4                     148                18                  no
5    WDBC               2                     569                30                  no

5.2. Experimental setup and description

The proposed IBSSA-FKNN method is implemented on the MATLAB 2018b platform. The experiments are performed on an NVIDIA GeForce GTX 1660 with Windows 10 as the operating system. The detailed parameters of IBSSA-FKNN are set as follows: the population size is 20 and the maximum number of iterations is set to 1000. In order to verify the effectiveness of the improved IBSSA algorithm in feature selection, a total of 5 comparison algorithms are set up: the Binary Bat Algorithm (BBA) [43], the Binary Moth-Flame Optimizer (BMFO) [44], the Quantum Gaussian Dragonfly Algorithm (QGDA) [45], the Binary Quantum Grey Wolf Optimization Algorithm (BQGWO) [46], and the Binary Spread Strategy with Chaotic Local Search Grey Wolf Optimization (BSCGWO) [47]. The parameter settings of the comparison swarm intelligence optimization algorithms are shown in Table 2.

Table 2
Parameter settings of the swarm intelligence optimization algorithms.
Algorithm  Parameters
BBA        [f_min, f_max] = [0, 2]; A = 0.5; r = 0.5; α = 0.95; γ = 0.05
BMFO       b = 1
BQGWO      a = 2 − FEs × (2/MaxFEs); r1 = r2 = rand(0, 1); A = 2 × a × r1; C = 2 × r2; β = ω = 10
BSCGWO     a = 2 − FEs × (2/MaxFEs); r1 = r2 = rand(0, 1); A = 2 × a × r1; C = 2 × r2; β = a × rand(0, 1)
IBSSA      c1 = 2 × e^{−(4×FEs/MaxFEs)^2}; c2 = c3 = rand(0, 1); ρ = 2.59; x0 = 0.3

The experiments are carried out using the wrapper feature selection approach. During the experiments, the IBSSA algorithm generates feature subsets, and each subset is evaluated by the results obtained with the FKNN classifier. In the feature selection process, the IBSSA algorithm realizes the search through a ten-fold cross-validation strategy applied to the FKNN model. K-fold cross-validation is mainly used to obtain an unbiased estimate of the generalization accuracy. With K set to 10, the dataset is divided into 10 subsets, one of which is taken as the test set while the remaining part is taken as the training set; the average error over all 10 tests is then calculated. During the implementation of the K-fold cross-validation strategy, all test sets are independent, so relatively stable and reliable results can be obtained. In addition, this section uses the IBSSA algorithm to generate the optimal feature subset on the training set and then uses the validation dataset filtered by the optimal feature subset for classification with the FKNN classifier to obtain the final result. In the subsequent experiments, the best results of the evaluation indicators are shown in bold in the tables.

5.3. Evaluation criteria

The classification performance of the method is evaluated with the classification accuracy (ACC), sensitivity (SEN), specificity (SPE), precision (PRE), and F-measure, defined as follows. Accuracy is the proportion of correct predictions in the total number of predictions:

ACC = \frac{TP + TN}{TP + TN + FN + FP} \times 100\%    (21)

Sensitivity measures the classifier's recognition of abnormal records and is also often called the TP rate:

SEN = \frac{TP}{TP + FN} \times 100\%    (22)

Specificity estimates the ability of a classification model to identify normal examples and is also often called the TN rate:

SPE = \frac{TN}{TN + FP} \times 100\%    (23)

Precision is the proportion of predicted positive instances that are correct, calculated as:
PRE = \frac{TP}{TP + FP} \times 100\%    (24)

where TP (true positive), FP (false positive), TN (true negative), and FN (false negative) denote the numbers of true positives, false positives, true negatives, and false negatives, respectively.
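For reference, the sketch below computes the four criteria of Eqs. (21)–(24) together with the F-measure from a confusion matrix. Taking the F-measure as the usual harmonic mean of precision and sensitivity is an assumption; the paper does not spell out its formula.

```python
def binary_metrics(tp, fp, tn, fn):
    """Evaluation criteria of Eqs. (21)-(24) plus the F-measure."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    sen = tp / (tp + fn)            # sensitivity / TP rate
    spe = tn / (tn + fp)            # specificity / TN rate
    pre = tp / (tp + fp)            # precision
    f1 = 2 * pre * sen / (pre + sen)
    return {"ACC": acc, "SEN": sen, "SPE": spe, "PRE": pre, "F": f1}

# e.g. binary_metrics(tp=45, fp=2, tn=48, fn=5) -> ACC 0.93, SEN 0.90, SPE 0.96, ...
```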
Table 3
The detailed results of the IBSSA-FKNN model on the BreastCancer dataset.
Runs of 10-fold CV  ACC      SEN      SPE      PRE      No. of selected features
#1                  1        1        1        1        4
#2                  1        1        1        1        4
#3                  1        1        1        1        4
#4                  0.98571  1        1        0.97872  4
#5                  1        1        0.95833  1        4
#6                  1        1        1        1        4
#7                  1        1        1        1        3
#8                  0.98571  0.97826  1        1        3
#9                  0.98551  0.97778  1        1        4
#10                 0.97183  0.95652  1        1        5
Mean                0.9928   0.9913   0.9958   0.9979   3.9

Table 4
The detailed results of the IBSSA-FKNN model on the glass dataset.
Runs of 10-fold CV  ACC      SEN  SPE  PRE      No. of selected features
#1                  0.95238  0    0    0.95238  3
#2                  0.85     0    0    0.85     3
#3                  0.90476  0    0    0.90476  4
#4                  0.80952  0    0    0.80952  3
#5                  0.90909  0    0    0.90909  3
#6                  0.95     0    0    0.95     4
#7                  0.86957  0    0    0.86957  0
#8                  0.90909  0    0    0.90909  0
#9                  0.78261  0    0    0.78261  4
#10                 0.90476  0    0    0.90476  6
Mean                0.8842   0    0    0.8842   3

Table 5
The detailed results of the IBSSA-FKNN model on the hepatitisfulldata dataset.
Runs of 10-fold CV  ACC  SEN  SPE  PRE  No. of selected features
#1                  1    1    1    1    2
#2                  1    1    1    1    5
#3                  1    1    1    1    4
#4                  1    1    1    1    2
#5                  1    1    1    1    2
#6                  1    1    1    1    3
#7                  1    1    1    1    3
#8                  1    1    1    1    5
#9                  1    1    1    1    4
#10                 1    1    1    1    1
Mean                1    1    1    1    3.1

Table 6
The detailed results of the IBSSA-FKNN model on the Lymphography dataset.
Runs of 10-fold CV  ACC      SEN  SPE  PRE      No. of selected features
#1                  1        0    0    1        4
#2                  0.9375   0    0    0.9375   3
#3                  1        0    0    1        3
#4                  1        0    0    1        7
#5                  1        0    0    1        7
#6                  1        0    0    1        7
#7                  1        0    0    1        3
#8                  1        0    0    1        6
#9                  0.92857  0    0    0.92857  3
#10                 1        0    0    1        8
Mean                0.9866   1    1    0.9866   5.1

Table 7
The detailed results of the IBSSA-FKNN model on the WDBC dataset.
Runs of 10-fold CV  ACC      SEN      SPE  PRE  No. of selected features
#1                  0.98276  0.95455  1    1    13
#2                  1        1        1    1    5
#3                  1        1        1    1    9
#4                  1        1        1    1    2
#5                  0.98246  1        1    1    2
#6                  1        1        1    1    6
#7                  1        1        1    1    8
#8                  1        1        1    1    3
#9                  1        1        1    1    4
#10                 1        1        1    1    3
Mean                0.9965   0.9955   1    1    5.5

Table 8
Experimental results of six methods on the BreastCancer dataset.
Algorithm     Features' size  ACC (%)  SEN (%)  SPE (%)  PRE (%)  F-measure (%)
IBSSA-FKNN    3.3             0.9929   0.9913   0.9958   0.9979   0.9945
BBA-FKNN      3.8             0.9411   0.9453   0.9333   0.9650   0.9543
BMFO-FKNN     3.6             0.9871   0.9847   0.9917   0.9957   0.9901
QGDA-FKNN     3.6             0.9872   0.9847   0.9920   0.9957   0.9901
BQGWO-FKNN    4.1             0.9885   0.9913   0.9833   0.9914   0.9913
BSCGWO-FKNN   3.4             0.9857   0.9869   0.9833   0.9914   0.9890

Table 9
Experimental results of six methods on the glass dataset.
Algorithm     Features' size  ACC (%)  SEN (%)  SPE (%)  PRE (%)  F-measure (%)
IBSSA-FKNN    3               0.8842   0        0        0.8842   0
BBA-FKNN      4.2             0.6856   0        0        0.6856   0
BMFO-FKNN     3.7             0.8788   0        0        0.8788   0
QGDA-FKNN     4.3             0.8773   0        0        0.8773   0
BQGWO-FKNN    3.9             0.8595   0        0        0.8595   0
BSCGWO-FKNN   3.7             0.8744   0        0        0.8744   0

Table 10
Experimental results of six methods on the hepatitisfulldata dataset.
Algorithm     Features' size  ACC (%)  SEN (%)  SPE (%)  PRE (%)  F-measure (%)
IBSSA-FKNN    3.1             1        1        1        1        1
BBA-FKNN      8.2             0.8410   0.6417   0.8949   0.6683   0.6309
BMFO-FKNN     7.6             1        1        1        1        1
QGDA-FKNN     3.6             1        1        1        1        1
BQGWO-FKNN    3.9             0.9875   0.9750   0.9923   0.9750   0.9714
BSCGWO-FKNN   3.7             0.9933   0.9667   1        1        0.9800

Table 11
Experimental results of six methods on the Lymphography dataset.
Algorithm     Features' size  ACC (%)  SEN (%)  SPE (%)  PRE (%)  F-measure (%)
IBSSA-FKNN    5.1             0.9866   0        0        0.9866   0
BBA-FKNN      7.1             0.8268   0        0        0.8268   0
BMFO-FKNN     6.7             0.9749   0        0        0.9749   0
QGDA-FKNN     5.7             0.9799   0        0        0.9799   0
BQGWO-FKNN    4.5             0.9804   0        0        0.9804   0
BSCGWO-FKNN   4               0.9518   0        0        0.9518   0

Table 12
Experimental results of six methods on the WDBC dataset.
Algorithm     Features' size  ACC (%)  SEN (%)  SPE (%)  PRE (%)  F-measure (%)
IBSSA-FKNN    5.5             0.9965   0.9955   1        1        0.9953
BBA-FKNN      10              0.9474   0.9288   0.9583   0.9353   0.9297
BMFO-FKNN     12.5            0.9965   0.9952   0.9972   0.9955   0.9952
QGDA-FKNN     5.7             0.9982   0.9952   0.9972   0.9955   0.9977
BQGWO-FKNN    4.1             0.9948   0.9907   0.9972   0.9952   0.9930
BSCGWO-FKNN   4.3             0.9947   0.9859   1        1        0.9928
In order to test its performance on feature selection problems, the proposed IBSSA is compared with five other binary metaheuristic optimization algorithms on the 5 datasets. Tables 8–12 record the mean values of the selected feature length, classification accuracy, sensitivity, specificity, precision, and F-measure obtained by the BMFO, BBA, QGDA, BQGWO, BSCGWO, and IBSSA algorithms under 10-fold cross-validation.

It can be seen from the experimental results in Tables 8–12 that, for the IBSSA algorithm, only on the BreastCancer dataset is the sensitivity index slightly inferior to the other algorithms. On the other four datasets, including glass, the algorithm achieves the best selected feature length, classification accuracy, sensitivity, specificity, precision, and F-measure. For example, on the BreastCancer dataset, the IBSSA algorithm obtained the best average number of selected features (3.3), the best average classification accuracy (99.29%), the best average sensitivity (99.13%), the best average specificity (99.58%), the best average precision (99.79%), and the best average F-measure (99.45%). The experimental results show that the IBSSA algorithm improves the classification accuracy, sensitivity, specificity, precision, and F-measure of the feature subsets to a certain extent. It is worth noting that even where the algorithm does not strongly improve the classification accuracy, it performs well in reducing the data dimension.

In order to explore how many and which features are selected in the feature selection process, we further investigate the details of the feature selection mechanism of the salp swarm optimization algorithm on the 5 datasets. Figs. 4–8 show the number of times each feature is selected in the 10-fold cross-validation experiments of the IBSSA-FKNN method. From these figures, we can see that some features are selected more often, while others are selected less often.

The 10-fold selection features on the BreastCancer dataset are shown in Fig. 4. The original number of features in the BreastCancer dataset is 9; after feature selection, not all features are selected for classification. The average number of selected features for the IBSSA-FKNN method is 3.3, and its most important features are F1, F3, F4, and F7, i.e., clump thickness, uniformity of cell shape, marginal adhesion, and bland chromatin, which are each selected more than 5 times. We therefore consider that these features can be used as a reference for distinguishing breast cancer, as can be seen in the 10-fold CV feature selection results.

The 10-fold selection features on the glass dataset are shown in Fig. 5. The original number of features in the glass dataset is 9; after feature selection, not all features are selected for classification. The average number of selected features of the IBSSA-FKNN method is 3.6, and its most important features are F1, F4, F6, F7, and F8, all of which are selected 4 times or more, so we consider that these features can be used as a reference for distinguishing, as can be seen in the 10-fold CV feature selection results.

The 10-fold selection features on the hepatitisfulldata dataset are shown in Fig. 6. The original number of features in the hepatitisfulldata dataset is 19; after feature selection, not all features are selected for classification. The average number of selected features of the IBSSA-FKNN method is 3.3, and its most important features are F1, F2, F3, and F11, all of which are selected 3 times or more, so we consider that these features can be used as a reference for distinguishing, as can be seen in the 10-fold CV feature selection results.

The 10-fold selection features on the Lymphography dataset are shown in Fig. 7. The original number of features in the Lymphography dataset is 17; after feature selection, not all features are selected for classification. The average number of selected features of the IBSSA-FKNN method is 3.6, and its most important features are F2, F7, F11, F13, and F14, all of which are selected 4 times or more, so we consider these features can be used as a reference for distinguishing, as can be seen in the 10-fold CV feature selection results.

The 10-fold selection features on the WDBC dataset are shown in Fig. 8. The original number of features in the WDBC dataset is 29; after feature selection, not all features are selected for classification. The average number of selected features of the IBSSA-FKNN method is 4.3, and its most important features are F1, F2, F3, F14, F17, F21, and F24, which are all selected 3 times or more, so we consider these features can be used as a reference for distinguishing, as can be seen in the 10-fold CV feature selection results.

6. sMRI dataset experiment

6.1. Dataset description

The experimental data are obtained from the international Alzheimer's Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/).
ADNI was established in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical enterprises, and non-profit organizations. Its main goal is to test whether the progression of MCI and early AD can be measured by combining MRI, PET, other biomarkers, and clinical neuropsychological evaluation. The database contains several data modalities, including time-series MRI image data, PET image data, other types of biomarker values such as CSF, and clinical neuropsychological assessment scores such as the mini-mental state examination (MMSE) and the Alzheimer's disease assessment scale-cognitive subscale (ADAS-Cog). The data categories are mainly: patients with early AD, patients with mild cognitive impairment (MCI), and cognitively normal controls (NC). Mild cognitive impairment is usually considered an early stage of AD, a transition state from normal control (NC) to AD; late-stage MCI in particular is likely to develop into AD. Therefore, MCI is generally divided into MCI converted to AD (MCI patients who will convert to AD, MCIc) and MCI not converted to AD (MCI patients who will not convert to AD, MCInc). The subjects of the ADNI database were recruited from 50 sites across the United States and Canada. The initial goal was to recruit 800 adult volunteers ranging in age from 55 to 90 years: 200 elderly people with normal cognition followed up for three consecutive years, 400 patients with mild cognitive impairment followed up for three consecutive years, and 200 patients with AD followed up for two consecutive years. The basic personal information of these subjects can be obtained from the official ADNI website.

In this paper, the sample data of subjects with MRI, PET, and CSF modalities are selected for the experiment, and only the data collected at the baseline time point of these subjects are used. In the ADNI database, 202 subjects have all three modalities at the same time. Table 13 lists the demographic information of these subjects.

Table 13
Subject information (mean ± std).
category  Number of subjects  Age         Years of Education  MMSE        ADAS-Cog
AD        51                  75.2 ± 7.4  14.7 ± 3.6          23.8 ± 2.0  18.3 ± 6.0
NC        52                  75.3 ± 5.2  15.8 ± 3.2          29.0 ± 1.2  7.4 ± 3.2
MCI-C     43                  75.8 ± 6.8  16.1 ± 2.6          26.6 ± 1.7  12.9 ± 3.9
MCI-NC    56                  74.7 ± 7.7  16.1 ± 3.0          27.5 ± 1.5  10.2 ± 4.3

6.2. Experimental setup and description

This paper adopts a 10-fold cross-validation strategy to evaluate the classification performance of the proposed method. Specifically, the sample set is divided evenly into 10 pieces; each piece is selected in turn as the test set, and the remaining 9 pieces are used as the training set. The features' size, average accuracy, sensitivity, specificity, and F-measure of these 10 experiments are calculated as the experimental results of one division. Then the order of the samples is randomly shuffled, the 10-fold cross-validation is performed once more, and the features' size, average accuracy, sensitivity, specificity, and F-measure are calculated again. The division is repeated 10 times, and the features' size, average accuracy, sensitivity, specificity, and F-measure are calculated over these 10 divisions. The experiments adopt the binary classification setting (AD/MCI, AD/NC, and MCI/NC) to fully verify the influence of the different classification tasks on the experimental results.
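The evaluation protocol described above can be sketched as follows; the use of stratified splits and the helper name evaluate_fold are assumptions of this illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def repeated_cv_scores(X, y, evaluate_fold, n_repeats=10, n_folds=10, seed=0):
    """Repeat a 10-fold split 10 times with reshuffled samples and average the
    per-fold results. evaluate_fold(train_idx, test_idx) should return a dict
    of indicators (features' size, ACC, SEN, SPE, F-measure, ...)."""
    all_runs = []
    for r in range(n_repeats):
        skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed + r)
        fold_results = [evaluate_fold(tr, te) for tr, te in skf.split(X, y)]
        keys = fold_results[0].keys()
        all_runs.append({k: np.mean([f[k] for f in fold_results]) for k in keys})
    return {k: np.mean([run[k] for run in all_runs]) for k in all_runs[0]}
```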
In order to verify the performance of the IBSSA-FKNN method proposed in this paper for early AD diagnosis, it is compared with five classification methods that likewise combine a swarm intelligence optimization algorithm with the FKNN classifier: feature selection based on the binary bat algorithm (BBA-FKNN); feature selection based on the binary moth-flame optimizer (BMFO-FKNN); feature selection based on the quantum Gaussian dragonfly algorithm (QGDA-FKNN); and two feature selection methods based on binary improved grey wolf optimization algorithms (BQGWO-FKNN and BSCGWO-FKNN).

Table 14
Different methods classify AD/MCI, AD/NC, and MCI/NC on multimodal data.

AD vs. MCI
Algorithm     Features' size  ACC (%)  SEN (%)  SPE (%)  PRE (%)  F-measure (%)
IBSSA-FKNN    11.5            0.9537   0.99     0.9233   0.9627   0.9657
BBA-FKNN      73.2            0.5608   0.6656   0.36     0.6696   0.6638
BMFO-FKNN     117             0.8803   0.9289   0.79     0.8999   0.9109
QGDA-FKNN     44.7            0.9533   0.98     0.86     0.9409   0.9574
BQGWO-FKNN    8.5             0.9667   0.9889   0.92     0.9642   0.9761
BSCGWO-FKNN   5.6             0.9667   0.97     0.94     0.9727   0.9747

AD vs. NC
Algorithm     Features' size  ACC (%)  SEN (%)  SPE (%)  PRE (%)  F-measure (%)
IBSSA-FKNN    11.4            1        1        1        1        1
BBA-FKNN      76.5            0.8145   0.8      0.83     0.8223   0.8037
BMFO-FKNN     103.6           0.96     0.94     0.98     0.98     0.9578
QGDA-FKNN     32.3            0.97     0.98     0.96     0.9667   0.9707
BQGWO-FKNN    20.8            0.9909   0.98     1        1        0.9889
BSCGWO-FKNN   2.6             0.99     0.98     1        0.9817   0.9889

MCI vs. NC
Algorithm     Features' size  ACC (%)  SEN (%)  SPE (%)  PRE (%)  F-measure (%)
IBSSA-FKNN    27.3            0.9395   0.9789   0.94     0.9718   0.9555
BBA-FKNN      76.5            0.7432   0.8      0.6367   0.8100   0.8024
BMFO-FKNN     103.6           0.8686   0.9089   0.79     0.8994   0.8970
QGDA-FKNN     39.3            0.9252   0.9478   0.88     0.9436   0.9437
BQGWO-FKNN    21.6            0.9354   0.9589   0.8567   0.9292   0.9527
BSCGWO-FKNN   5.8             0.9137   0.92     0.9067   0.9496   0.9314
Table 14 shows the experimental results of the performance comparison between the IBSSA-FKNN method and the other five methods on the concatenated multimodal data for classifying AD/NC, AD/MCI, and MCI/NC. In Table 14, IBSSA-FKNN indicates that the improved binary salp swarm algorithm is first used for feature selection and the FKNN classification model is then used for the classification experiments; the other methods are constructed in the same way. All experimental results listed in Table 14 are the averages of each indicator over 10 repetitions of 10-fold cross-validation.

The experimental results in Table 14 show that employing the feature selection step can improve the performance of the classification model in diagnosing early AD. The classification accuracies of the IBSSA-FKNN method for AD vs. MCI, AD vs. NC, and MCI vs. NC are 95.37%, 100%, and 93.95%, respectively. Across the six indicators of the three groups of classification results, the method proposed in this paper is better than the other five methods. At the same time, the advantages of the BQGWO-FKNN and BSCGWO-FKNN methods are also obvious, second only to the IBSSA-FKNN method. In the AD/NC and MCI/NC classification experiments, the BSCGWO-FKNN method selects smaller feature subsets than IBSSA-FKNN, but IBSSA-FKNN ranks first on the other indicators. In the AD/MCI classification experiment, the IBSSA-FKNN method is slightly lower than the BQGWO-FKNN or BSCGWO-FKNN method on certain indicators. Overall, the experimental results show that IBSSA-FKNN is still better than the other methods in the three groups of classification experiments.

Based on the experimental analysis of Table 14, the following conclusions can be drawn. The FKNN feature selection method based on the salp swarm algorithm proposed in this paper can significantly improve the classification performance compared with using the FKNN classifier alone. Compared with other swarm intelligence methods combined with the FKNN classifier, the method proposed in this paper improves various indicators such as classification accuracy, sensitivity, and specificity. Among them, the improvement of the AD/MCI classification performance is particularly significant, and the combination with the FKNN classifier achieves higher classification performance, so the IBSSA-FKNN method proposed in this paper can be well applied to the diagnosis of early AD.

From another point of view, this paper analyzes the time consumption of the different algorithms on AD classification, as shown in Fig. 9. It can be seen from the figure that the classification method proposed in this paper takes a relatively long time; on the other hand, as shown by the indicators above, it is superior to the other methods in classification accuracy and the other indicators. Next, we will continue to explore how to ensure accuracy while saving time. The method takes longer than the other classification techniques because of the fourth improvement, namely the dimensional random difference mutation, which requires mutating each dimension of an individual and then evaluating and screening the outcomes. The blindness of the mutation process inevitably reduces the algorithm's search efficiency and considerably increases the amount of computation. This mechanism, however, helps the algorithm escape local optima, boosting the accuracy of AD classification.

In recent years, scholars have proposed many diagnostic algorithms for AD. Since these algorithms use different databases and different preprocessing methods, it is difficult to conduct direct comparative experiments. Therefore, relevant algorithms that perform well with different sample sizes are selected for comparison. Table 15 lists the sample size and classification results of each algorithm. Janoušová E et al. combined penalized regression with data resampling to extract features and classified the data using an SVM with a Gaussian kernel [48]. Batmanghelich N et al. used a Bagging algorithm with SVM for AD/NC classification, and a logistic regression model with a Boosting algorithm for MCI/NC classification [49]. Liu S et al. realized the diagnosis and prediction of AD by using a deep fully connected network and stacked auto-encoders [50]. Tong T et al. used graph-based multi-instance learning techniques to classify samples by extracting local patches as features [51]. Yang W et al. used independent component analysis (ICA) for feature extraction and combined it with an SVM for AD prediction [52].

Table 15
Sample size and classification results of AD prediction and diagnosis methods.
references       methods                                            sample size                                            Accuracy (%)
Literature [48]  SVM with Gaussian kernel                           Baseline MRI: 198 AD, 409 MCI (pMCI and sMCI), 231 NC  Baseline MRI: AD vs. NC 87.9%; pMCI vs. NC 83.2%; pMCI vs. sMCI 70.4%
Literature [49]  Bagging algorithm and SVM                          56 AD, 60 MCI, 60 NC                                   AD vs. NC 89%; MCI vs. NC 72%
Literature [50]  deep fully connected network and stacked auto-encoder  65 AD, 67 cMCI, 102 ncMCI, 77 HC                   AD vs. HC 87.76%; MCI vs. HC 76.92%
Literature [51]  graph-based multi-instance learning                198 AD, 238 sMCI, 167 pMCI, 234 NC                     AD vs. NC 88.8%; pMCI vs. sMCI 69.6%
Literature [52]  Independent Component Analysis (ICA) and SVM       202 AD, 410 MCI, 236 NC                                75% of data in training set: AD vs. NC 78.4%, MCI vs. NC 71.2%; 90% of data in training set: AD vs. NC 85.7%, MCI vs. NC 79.2%

7. Conclusion and future work

In this study, we propose an FKNN feature selection method based on the improved binary salp swarm algorithm and apply it to the early diagnosis of AD. First, on the BreastCancer, glass, hepatitisfulldata, Lymphography, and WDBC datasets obtained from the UCI Machine Learning Repository, the effectiveness of the method is tested in terms of classification accuracy, sensitivity, specificity, and other aspects. Second, in order to verify the effectiveness of the method in the diagnosis of early AD, the multimodal feature data of MRI, PET, and CSF from the Alzheimer's Disease Neuroimaging Initiative (ADNI) are used, and the method is compared with other swarm intelligence algorithms combined with the FKNN classifier. The experimental results show that the proposed IBSSA-FKNN method is superior to the other five FKNN models based on swarm intelligence algorithms on the various performance indicators and can effectively improve the classification performance and the performance of early AD diagnosis. The promising application prospect will bring great convenience for clinicians to make better decisions in clinical diagnosis.

On the one hand, future research will extend the proposed strategy to considerably bigger datasets, and the approach may be developed further to improve the AD classification performance; we aim to integrate deep learning with swarm intelligence optimization and apply it to the early detection of Alzheimer's disease. On the other hand, this work only uses a small quantity of labeled training data, although a large amount of unlabeled multimodal data is accessible in the clinic, along with a considerable amount of incomplete multimodal data. Making full use of such incomplete multimodal labeled data may not only enlarge the set of training samples but also lead to learning methods for incomplete multimodal data, which can improve the generalization performance of the model. In a nutshell, this work provides a useful research idea and algorithm for the study of Alzheimer's disease, and it demonstrates that swarm intelligence optimization algorithms have a positive influence on the early detection of Alzheimer's disease.

Data availability

The data used to support the findings of this study are included in the article.

Declaration of competing interest

The authors declare that they have no conflicts of interest.
Acknowledgments

Dongwan Lu and Yinggao Yue contributed equally to this work and should be considered as co-first authors. This work was supported in part by the Natural Science Foundation of Zhejiang Province under Grant LY23F010002, in part by the Wenzhou basic scientific research project under Grant R20210030 and the Service science and technology innovation project of the Wenzhou Science and Technology Association under Grant kjfw36, the general scientific research projects of the Zhejiang Provincial Department of Education under Grant Y202250103, in part by the Major scientific and technological innovation projects of the Wenzhou Science and Technology Plan under Grant ZG2021021, the School Level Scientific Research Projects of Wenzhou University of Technology under Grants ky202201 and ky202209, the Teaching Reform Research Project of Wenzhou University of Technology under Grant 2022JG12, the Major Project of the Zhejiang Natural Science Foundation under Grants LD21F020001 and LSZ19F020001, the National Natural Science Foundation of China under Grant U1809209, and the Wenzhou Intelligent Image Processing and Analysis Key Laboratory Construction Project under Grant 2021HZSY007105.

References

[1] J. Weller, A. Budson, Current understanding of Alzheimer's disease diagnosis and treatment, F1000Research 7 (7) (2018) 1-9.
[2] J. Rasmussen, H. Langerman, Alzheimer's disease - why we need early diagnosis, Degener. Neurol. Neuromuscul. Dis. 9 (12) (2019) 123-130.
[3] D. Lu, K. Popuri, G.W. Ding, et al., Multimodal and multiscale deep neural networks for the early diagnosis of Alzheimer's disease using structural MR and FDG-PET images, Sci. Rep. 8 (1) (2018) 1-13.
[4] M. Gharaibeh, M. Almahmoud, M.Z. Ali, et al., Early diagnosis of Alzheimer's disease using cerebral catheter angiogram neuroimaging: a novel model based on deep learning approaches, Big Data and Cognitive Computing 6 (1) (2022) 2.
[5] S. Singh, R.R. Janghel, Early diagnosis of Alzheimer's disease using ACO optimized deep CNN classifier, in: Ubiquitous Intelligent Systems, Proceedings of ICUIS 2021, Springer, Singapore, 2022, pp. 15-31.
[6] D. Pan, G. Luo, A. Zeng, et al., Adaptive 3DCNN-based interpretable ensemble model for early diagnosis of Alzheimer's disease, IEEE Transactions on Computational Social Systems, 2022.
[7] S. Velliangiri, S. Pandiaraj, S. Muthubalaji, Multiclass recognition of AD neurological diseases using a bag of deep reduced features coupled with gradient descent optimized twin support vector machine classifier for early diagnosis, Concurrency Comput. Pract. Ex. 34 (21) (2022) e7099.
[8] J. Seo, T.H. Laine, G. Oh, et al., EEG-based emotion classification for Alzheimer's disease patients using conventional machine learning and recurrent neural network models, Sensors 20 (24) (2020) 7212-7225.
[9] T. Zheng, Y. Yu, H. Lei, et al., Compositionally graded KNN-based multilayer composite with excellent piezoelectric temperature stability, Adv. Mater. 34 (8) (2022) 2109175.
[10] A. Shokrzade, M. Ramezani, F.A. Tab, et al., A novel extreme learning machine based kNN classification method for dealing with big data, Expert Syst. Appl. 183 (11) (2021) 115293.
[11] Y. Huang, Y. Li, Prediction of protein subcellular locations using fuzzy k-NN method, Bioinformatics 20 (1) (2004) 21-28.
[12] K.C. Kwak, W. Pedrycz, Face recognition using a fuzzy fisherface classifier, Pattern Recogn. 38 (10) (2005) 1717-1732.
[13] H.L. Chen, C.C. Huang, X.G. Yu, et al., An efficient diagnosis system for detection of Parkinson's disease using fuzzy k-nearest neighbor approach, Expert Syst. Appl. 40 (1) (2013) 263-271.
[14] A. Mondal, S. Ghosh, A. Ghosh, Efficient silhouette-based contour tracking using local information, Soft Comput. 20 (2) (2016) 785-805.
[15] W. Shan, Z. Qiao, A.A. Heidari, et al., Double adaptive weights for stabilization of moth flame optimizer: balance analysis, engineering cases, and medical diagnosis, Knowl. Base Syst. 214 (2) (2021) 106728.
[16] S. Wang, Y. Zhao, J. Li, et al., Neurostructural correlates of hope: dispositional hope mediates the impact of the SMA gray matter volume on subjective well-being in late adolescence, Soc. Cognit. Affect Neurosci. 15 (4) (2020) 395-404.
[17] Y. Zhang, R. Liu, A.A. Heidari, et al., Towards augmented kernel extreme learning models for bankruptcy prediction: algorithmic behavior and comprehensive analysis, Neurocomputing 430 (3) (2021) 185-212.
[18] S. Jiao, G. Chong, C. Huang, et al., Orthogonally adapted Harris hawks optimization for parameter estimation of photovoltaic models, Energy 203 (7) (2020) 117804.
[19] J. Tu, H. Chen, J. Liu, et al., Evolutionary biogeography-based whale optimization methods with communication structure: towards measuring the balance, Knowl. Base Syst. 212 (1) (2021) 106642.
[20] S. Song, P. Wang, A.A. Heidari, et al., Dimension decided Harris hawks optimization with Gaussian mutation: balance analysis and diversity patterns, Knowl. Base Syst. 215 (3) (2021) 106425.
[21] Y. Fan, P. Wang, M. Mafarja, et al., A bioinformatic variant fruit fly optimizer for tackling optimization problems, Knowl. Base Syst. 213 (2) (2021) 106704.
[22] X. Zhang, Y. Xu, C. Yu, et al., Gaussian mutational chaotic fruit fly-built optimization and feature selection, Expert Syst. Appl. 141 (3) (2020) 112976.
[23] W. Zhu, C. Ma, X. Zhao, et al., Evaluation of sino foreign cooperative education project using orthogonal sine cosine optimized kernel extreme learning machine, IEEE Access 8 (3) (2020) 61107-61123.
[24] G.Q. Zeng, X.Q. Xie, M.R. Chen, et al., Adaptive population extremal optimization-based PID neural network for multivariable nonlinear control systems, Swarm Evol. Comput. 44 (2) (2019) 320-334.
[25] M.R. Chen, G.Q. Zeng, K.D. Lu, et al., A two-layer nonlinear combination method for short-term wind speed prediction based on ELM, ENN, and LSTM, IEEE Internet Things J. 6 (4) (2019) 6997-7010.
[26] W. Deng, H. Liu, J. Xu, et al., An improved quantum-inspired differential evolution algorithm for deep belief network, IEEE Trans. Instrum. Meas. 69 (10) (2020) 7319-7327.
[27] W. Deng, J. Xu, H. Zhao, et al., A novel gate resource allocation method using improved PSO-based QEA, IEEE Trans. Intell. Transport. Syst. 23 (3) (2020) 1737-1745.
[28] J. Pang, H. Zhou, Y.C. Tsai, et al., A scatter simulated annealing algorithm for the bi-objective scheduling problem for the wet station of semiconductor manufacturing, Comput. Ind. Eng. 123 (9) (2018) 54-66.
[29] B. Venkatesh, J. Anuradha, A review of feature selection and its methods, Cybern. Inf. Technol. 19 (1) (2019) 3-26.
[30] G.G. Wang, S. Deb, Z. Cui, Monarch butterfly optimization, Neural Comput. Appl. 31 (7) (2019) 1995-2014.
[31] S. Li, H. Chen, M. Wang, et al., Slime mould algorithm: a new method for stochastic optimization, Future Generat. Comput. Syst. 111 (2020) 300-323.
[32] G.G. Wang, Moth search algorithm: a bio-inspired metaheuristic algorithm for global optimization problems, Memetic Computing 10 (2) (2018) 151-164.
[33] Y. Yang, H. Chen, A.A. Heidari, et al., Hunger games search: visions, conception, implementation, deep analysis, perspectives, and towards performance shifts, Expert Syst. Appl. 177 (2021) 114864.
[34] J.C. Butcher, A history of Runge-Kutta methods, Appl. Numer. Math. 20 (3) (1996) 247-260.
[35] J. Tu, H. Chen, M. Wang, et al., The colony predation algorithm, J. Bionic Eng. 18 (3) (2021) 674-710.
[36] I. Ahmadianfar, A.A. Heidari, S. Noshadian, et al., INFO: an efficient optimization algorithm based on weighted mean of vectors, Expert Syst. Appl. 195 (2022) 116516.
[37] A.A. Heidari, S. Mirjalili, H. Faris, et al., Harris hawks optimization: algorithm and applications, Future Generat. Comput. Syst. 97 (2019) 849-872.
[38] S. Mirjalili, A.H. Gandomi, S.Z. Mirjalili, et al., Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems, Adv. Eng. Software 114 (12) (2017) 163-191.
[39] M. Emmanuel, J. Jabez, An enhanced fuzzy based KNN classification method for Alzheimer's disease identification from sMRI images, Journal of Algebraic Statistics 13 (3) (2022) 89-103.
[40] H. Abbad Ur Rehman, C.Y. Lin, Z. Mushtaq, et al., Performance analysis of machine learning algorithms for thyroid disease, Arabian J. Sci. Eng. 46 (10) (2021) 9437-9449.
[41] J. Feng, J. Zhang, X. Zhu, et al., A novel chaos optimization algorithm, Multimed. Tool. Appl. 76 (16) (2017) 17405-17436.
[42] S.A. Mirjalili, S.Z.M. Hashim, BMOA: binary magnetic optimization algorithm, International Journal of Machine Learning and Computing 2 (3) (2012) 204.
[43] R.Y.M. Nakamura, L.A.M. Pereira, K.A. Costa, et al., BBA: a binary bat algorithm for feature selection, in: 2012 25th SIBGRAPI Conference on Graphics, Patterns and Images, IEEE, 2012, pp. 291-297.
[44] A. Patil, G. Soni, A. Prakash, A BMFO-KNN based intelligent fault detection approach for reciprocating compressor, International Journal of System Assurance Engineering and Management 13 (2) (2022) 797-809.
[45] C. Yu, Z. Cai, X. Ye, et al., Quantum-like mutation-induced dragonfly-inspired optimization approach, Math. Comput. Simulat. 178 (2020) 259-289.
[46] E. Emary, H.M. Zawbaa, A.E. Hassanien, Binary grey wolf optimization approaches for feature selection, Neurocomputing 172 (2016) 371-381.
[47] J. Hu, A.A. Heidari, L. Zhang, et al., Chaotic diffusion-limited aggregation enhanced grey wolf optimizer: insights, analysis, binarization, and feature selection, Int. J. Intell. Syst. 37 (8) (2022) 4864-4927.
[48] E. Janoušová, M. Vounou, R. Wolz, et al., Biomarker discovery for sparse classification of brain images in Alzheimer's disease, Annals of the BMVA (2) (2012) 1-11.
[49] N. Batmanghelich, B. Taskar, C. Davatzikos, A general and unifying framework for feature construction, in image-based pattern classification, in: International Conference on Information Processing in Medical Imaging, vol. 5636, Springer, Berlin, Heidelberg, 2009, pp. 423-434.
[50] S. Liu, S. Liu, W. Cai, et al., Early diagnosis of Alzheimer's disease with deep learning, in: 2014 IEEE 11th International Symposium on Biomedical Imaging (ISBI), IEEE, 2014, pp. 1015-1018.
[51] T. Tong, R. Wolz, Q. Gao, et al., Multiple instance learning for classification of dementia in brain MRI, Med. Image Anal. 18 (5) (2014) 808-818.
[52] W. Yang, R.L.M. Lui, J.H. Gao, et al., Independent component analysis-based classification of Alzheimer's disease MRI data, J. Alzheim. Dis. 24 (4) (2011) 775-783.