
2nd Reading

June 14, 2018 18:31 1850022

International Journal of Neural Systems, Vol. 28, No. 0 (2018) 1850022 (23 pages)
© World Scientific Publishing Company
DOI: 10.1142/S0129065718500223

Multi-Objective Genetic Algorithms to Find Most Relevant Volumes of the Brain Related to Alzheimer’s Disease and Mild Cognitive Impairment
Olga Valenzuela
Department of Applied Mathematics, University of Granada, Spain
by THE UNIVERSITY OF NEW SOUTH WALES LIBRARY on 06/23/18. For personal use only.

Xiaoyi Jiang∗
Department of Computer Science, University of Munster, Germany
[email protected]
Antonio Carrillo
Int. J. Neur. Syst. Downloaded from www.worldscientific.com

Department of Computer Architecture and Computer Technology


University of Granada, Spain
Ignacio Rojas
Department of Computer Architecture and Computer Technology
CITIC-UGR, University of Granada, Spain

Accepted 28 April 2018


Published Online 18 June 2018

Computer-Aided Diagnosis (CAD) is a relevant instrument for automatically classifying patients with and without Alzheimer’s Disease (AD) using several current imaging techniques. This study analyzes the optimization of volumes of interest (VOIs) to extract three-dimensional (3D) textures from Magnetic Resonance Images (MRI) in order to diagnose AD, Mild Cognitive Impairment converter (MCIc), Mild Cognitive Impairment nonconverter (MCInc) and Normal subjects. A relevant feature of the proposed approach is the use of 3D features instead of traditional two-dimensional (2D) features, obtained by applying a 3D discrete wavelet transform (3D-DWT) to T1-weighted MRI. Because applying the 3D-DWT to each of the VOIs yields a high number of coefficients, a feature selection algorithm based on mutual information is used, namely the minimum Redundancy Maximum Relevance (mRMR) algorithm. Region optimization has been performed with Multi-Objective Genetic Algorithms in order to discover the most relevant regions (VOIs) in the brain, one of the objectives to be optimized being the accuracy of the system. The error index of the system is computed from the confusion matrix obtained by the multi-class support vector machine (SVM) classifier. Principal Component Analysis (PCA) is used to reduce the number of features passed to the classifier. The cohort of subjects used in the study consisted of 296 different patients. A first group of 206 patients was used to optimize VOI selection and another group of 90 independent subjects (that did not belong to the first group) was used to test the solutions yielded by the genetic algorithm. The proposed methodology obtains excellent results in multi-class classification, achieving an accuracy of 94.4%, and also extracts significant information on the location of the most relevant points of the brain. This suggests that the proposed method could aid in the research of other neurodegenerative diseases, improving the accuracy of the diagnosis and finding the most relevant regions of the brain associated with them.

Keywords: Alzheimer’s disease (AD); multi-class classification; feature selection; multi-objective genetic
optimization; relevant regions in the brain.


∗Corresponding author.


O. Valenzuela et al.

1. Introduction

The World Alzheimer Report 2015 states that 46.8 million people worldwide are living with dementia. It also estimates that this number will increase to 74.7 million people in 2030 and 131.5 million people in 2050. More than half of this increase, specifically 68%, will take place in low- and middle-income countries, and the total estimated worldwide cost of dementia will rise to US$ 2 trillion by 2030. The cost of treatment and diagnosis, both in time and money, and the relevance of a prodromal diagnosis, make the development of a tool for early diagnosis a main goal for scientists. Detecting Mild Cognitive Impairment (MCI) patients, and distinguishing between converters and nonconverters, could allow people with dementia to delay cognitive damage for years, enhancing their quality of life.1

Alzheimer’s disease (AD) is the leading cause of dementia, and diagnosis of AD is currently made using clinical criteria.2–4 The American Academy of Neurology (AAN) recommends the use of an imaging test, either a Computed Axial Tomography (CAT) scan or a cranial Magnetic Resonance Image (MRI), when studying dementia. Although a CAT scan is quick and cheap, MRI provides far better resolution and contrast between tissues and does not use ionizing radiation.5 In recent years, Fluorodeoxyglucose Positron Emission Tomography (FDG-PET) and the imaging of beta-amyloid protein deposition in the brain have proven to be very relevant techniques in helping neuroimaging-assisted diagnosis.6–8

MCI is defined as an intermediate stage between normal cognition and dementia. Subjective cognitive complaints, verified by a reliable observer (family, friends, etc.) and subsequently confirmed by objective neuropsychological tests, are the standard means to diagnose this condition. Yet, many disorders can cause MCI, including systemic or psychiatric disorders, neurological diseases, or others.9 Since several epidemiological studies have shown that between 10% and 15% of all patients with MCI develop dementia within a year, it is key to discern whether a particular patient exhibiting signs of MCI will develop dementia or not.10

Each dementia has its own characteristics and atrophy distribution. Particularly for AD, atrophy of the brain is strong in the medial temporal lobe (the hippocampus, posterior cingulate, entorhinal cortex, temporo-parietal cortex, and precuneus). However, since these atrophies are shared with other dementias, measuring general shrinking of the matter alone is not useful to make an AD diagnosis; determining the individual regions affected by AD is key.

Thus, some studies have researched atrophy in the temporal lobe involving AD patients.11 These studies have led to the development of computer-based methods that analytically measure the degree of temporal atrophy, yielding the same discriminatory performance as the visual scales used by medical experts,12 but with a standardized and computerized method which is, potentially, more objective.

One of the dementias often confused with AD is frontotemporal dementia (FTD), since both have been shown to develop a striking atrophy in the posterior cortex and temporal lobe. In order to avoid the confusion, some studies have proposed visual scales to help in a differential diagnosis.13 Regarding this issue, it is also known (and helpful) that the symmetric atrophy of the hippocampus is a useful biomarker to differentiate between AD and FTD.

The paper presented by Romero-Garcia et al.14 proposes an analysis of structural and metabolic cortical networks derived from correlations of cortical thickness and regional glucose utilization in healthy old adults, amnestic MCI subjects and mild AD patients.

Early diagnosis is a key factor for those patients who will suffer from dementia. Given that shrinking of brain matter in MCI patients is halfway between AD and healthy patients15 and that atrophy of the hippocampus and amygdala are well-known biomarkers of AD in healthy and MCI patients,16,17 it is essential to further investigate the impact of Alzheimer’s in the brain and to reach a deeper understanding of the mechanisms that differentiate those MCI patients who will convert to AD from those who will not.

For the stated reasons, this study has two main goals. The first one is to develop a classifier able to distinguish between AD, MCI converter (MCIc), MCI nonconverter (MCInc), and normal subjects. The second one is to find out which regions of the brain are the most relevant ones in terms of accuracy in order to classify patients belonging to these four classes. In order to do so, the procedure


Genetic Algorithms to Find Relevant Volumes of the Brain in AD

implements three-dimensional discrete wavelet transform (3D-DWT) as a tool to extract 3D features from the brain, and an optimization algorithm to find which regions are the most relevant ones.

2. State-of-the-Art

Diagnosis and research tools can be developed to better understand dementia and other conditions using information both from medical signals (such as EEG,18–25 Magnetoencephalography (MEG)26,27) and from medical images (such as MRI,28 FDG-PET,6 etc). In the last decades, a wide variety of research papers have focused on developing a procedure that could systematically analyze MRI images (or different sources of information for that matter) to determine the given condition of a patient.29,30 Most of the references are studies that aim to build a classifier that could help in the diagnosis of AD, classifying AD patients and healthy patients.31–33

Regarding the normalization of MRI images and sources utilized for feature extraction, a relevant contribution was presented by Cuingnet et al.34 in which different DARTEL35 and standard SPM normalizations36 are tested in terms of the accuracy achieved by the classifiers. Although both the DARTEL and the standard SPM pipelines were tested, the latter was finally chosen to be implemented in the procedure.

For the extraction of features, several approaches have given good results: plain voxel values,6,37 Probability Density Function (PDF) of volumes of interest (VOIs),32 one approach inspired by the Fourier transform (FT) called Fractional Fourier Entropy,38 Co-occurrence Matrices,39,40 Hyper Analytic Wavelet Transform,41 3D volumetric Square Centroid Lines Gray Level Distribution Method (SCLGM),42 region of interest (ROI) variance–covariance vector,43 Displacement Field (DF),33 Eigenbrain,44 and others. DWT was chosen for this study because of its ease of implementation, its power both in information preservation and in computational resources reduction, and the results yielded in several research papers.31,33,45–47

Feature selection is commonly performed by statistical algorithms, though there is some research using wrapper methods, which carry out the selection by classifying the images and then selecting the best performing features. In this area, Fisher’s Discriminant Ratio31,48 or Fisher Criterion,37 Principal Component Analysis (PCA),33,37,43,44 minimum Redundancy Maximum Relevance (mRMR),31,49 and others, are available.

Regarding the classification or machine learning method, many methods and algorithms are available and we are not listing them here, but refer the reader to the reference section. The choice of Support Vector Machine (SVM) comes from its robustness, high performance, and capability to deal with high-dimensional data, as can be seen in comparison tables from different papers,32–34,43,44 and others.

Finally, on the subject of optimization, there are many optimization algorithms available. The contribution presented by Tekin et al.50 uses an enhanced version of Ant Colony Optimization (ACO) called Improved ACO (IACO). The papers presented by Nachimuthu et al.40 and Raja et al.41 propose Particle Swarm Optimization (PSO), and Sharma et al.42 suggest a modification of Gravitational Search Algorithm (GSA) called Hybrid Refined GSA. Nonetheless, our choice of a Multi-Objective Genetic Algorithm (precisely, the NSGA-II algorithm51,52) comes from the fact that two different functions will need to be optimized in order to obtain a Pareto front and find the best combination of ROIs or VOIs to aid in the early diagnosis of AD, MCIc, and MCInc, with a sound and solid background of research. In multi-objective optimization, in which several objectives can be opposed or complex to reach simultaneously, a set of solutions is defined that forms the Pareto front when these solutions are nondominated optimal solutions (that is, no objective can be improved without sacrificing at least another objective).52

The results obtained by Dukart et al.7 are noteworthy. In this contribution, the authors presented relevant findings. The first one is that the differential age-related glucose hypometabolism and atrophy patterns between clinical groups (AD, MCIc, MCInc, and control patients) have relevant consequences in the use of neuroimaging biomarkers for obtaining an accurate diagnosis and forecasting conversion to AD. Also relevant is the relationship between symptom severities (measured by FDG-PET and/or MRI). The authors confirmed for MCIc and AD patients a strong association between symptom seriousness and the quantity of atrophy, which is not evident in MCInc. It is worth remarking that no such pattern was presented in glucose metabolism with


any of the patient cohorts. The authors suggest that FDG-PET could be more closely linked to future cognitive deterioration, whereas MRI is more strongly related to the present-day cognitive state.

Most papers in the references run traditional binary or two-class classifications, for AD versus Normal6,31,32,53 or AD versus MCI/MCI versus Normal.54–57 There are also contributions presented in the references for three-class classification33,58 or four-class classification.59–61

A useful procedure developed in this proposed methodology is adding an optimization phase to the traditional procedure of classification (or, more specifically, wrapping an optimization algorithm around the classification) so that the most relevant regions of the brain related to AD could be found. In this study, an SVM performing four-class classification is presented, obtaining a global accuracy of about 94.4%.

3. Materials and Methodology

The block scheme for the proposed methodology, which aims to discover and optimize the most relevant VOIs for the multi-class classification, is shown in Fig. 1(a). The main functional blocks are described in the following sections. Figure 1(b) shows the test phase and the results are described in Sec. 4.

3.1. Subjects cohort

Data used in the preparation of this study were obtained from the AD Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public–private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial MRI, PET, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and

(a) Training phase using multi-objective genetic algorithm (206 patients)

Fig. 1. Pipeline of proposed method to find the most relevant VOIs of the brain for multi-classification purpose, using
Multi-Objective Genetic Algorithms: (a) training phase and (b) classification phase with test samples.


(b) Test phase using solutions from the Pareto front

Fig. 1. (Continued)

early AD. For up-to-date information, see www.adni-info.org.

More than 400 different brain images in Neuroimaging Informatics Technology Initiative (NIfTI) format were downloaded. After deleting corrupt or duplicated images, the database consisted of 296 T1-weighted MRI images from different types of patients. The AD group contained 87 subjects ranging in age from 62 to 88 (75.2 ± 7.6) years. For this group, the Mini Mental State Examination (MMSE), Global Deterioration Scale (GDS), and Clinical Dementia Rating (CDR) scores are 23.18 ± 2.59, 1.75 ± 1.42, and 0.73 ± 0.31 (mean ± std), respectively (it is important to remember that a CDR of zero represents no dementia). The healthy group has a total of 59 subjects, the MCIc group 76, and the MCInc group 74. Details of the demographics and clinical characteristics of the subjects used in this research are presented in Tables 1 and 2. These images amount to around 15 GB of information (40 GB after the images were segmented and separately stored). Finally, we want to highlight the error that can be induced in the system if a patient has been initially erroneously labeled (especially in the case of MCIc and MCInc).

The database of 296 patients is divided into two parts: 206 patients from the whole database are used to optimize the VOI selection using the novel optimization procedure presented in this paper, and 90 patients are later used for testing the accuracy of the solutions obtained by the proposed optimization methodology.

Table 1. Characteristics of the cohort of Alzheimer’s patients and normal subjects used in this contribution.

Measures             AD (87)                         Normal (59)
                     Mean    Std    Max    Min       Mean    Std    Max    Min
Sex (male–female)    42–45                           30–29
Weight (mean–std)    71.1    17.3                    83.5    14.1
Age                  75.2    7.6    88     62        76.5    3.8    83     61
MMSE                 23.18   2.59   26     13        29.45   0.98   31     25
GDS                  1.75    1.42   5      0         1.25    1.25   5      0
CDR                  0.73    0.31   2      0.5       0.009   0.07   0.5    0


Table 2. Characteristics of the cohort of MCIc and MCInc used in this contribution.

Measures             MCInc (74)                      MCIc (76)
                     Mean    Std    Max    Min       Mean    Std    Max    Min
Sex (male–female)    38–36                           37–39
Weight (mean–std)    75.8    13.9                    77.3    13.3
Age                  75.2    7.9    86     61        74.6    7.54   88     62
MMSE                 26.35   2.47   31     18        27.2    1.89   30     21
GDS                  1.94    1.82   9      0         1.82    1.47   5      0
CDR                  0.50    0.12   1      0         0.48    0.08   0.52   0

3.2. Segmentation and normalization

The pre-processing of the information from the sample data, which requires segmentation, bias correction and spatial normalization, has been performed using the Segmentation routine implemented in SPM12. Statistical Parametric Mapping (SPM)36,62 refers to the construction and assessment of spatially extended statistical processes used to test hypotheses about functional imaging data. The Segmentation routine is an extension of the unified segmentation algorithm proposed by Ashburner et al.,63 in which a probabilistic framework was described that allows image registration, tissue classification, and bias correction to be combined in a single generative model. The model is based on a mixture of Gaussians, with the option to include smooth intensity variation and nonlinear registration with tissue probability maps.

The methodology used in this contribution for segmentation, bias correction, and spatial normalization is essentially the same as that presented by Ashburner et al.,63 incorporating the following improvements (as suggested by Penny et al.36): (i) a slightly different treatment of the mixing proportions, (ii) the use of an enhanced registration model, (iii) the capacity to use multi-spectral data, (iv) an extended set of tissue probability maps, allowing a different treatment of voxels outside the brain.

The information of each patient in the database is classified according to different tissue types. The tissue types are defined according to tissue probability maps, which define the prior probability of finding a tissue type at a particular location (usually, the order of tissues is gray matter, white matter, cerebrospinal fluid (CSF), bone, soft tissue and air/background).

It is important to mention that MR images are frequently corrupted by a smooth, spatially varying artifact or noise that modifies the intensity of the image (bias). Although this noise is not usually a problem for visual inspection by a human expert, it can negatively affect the automatic pre-processing of the images, and therefore its elimination is necessary.

These ideas have been instantiated in software called SPM (written by the Wellcome Department of Imaging Neuroscience at University College London to aid in the analysis of functional neuroimaging data). The SPM software package has been designed for the analysis of brain imaging data sequences. The sequences can be a series of images from different cohorts, or time-series from the same subject. Afterwards, they are normalized to MNI space, using 1 × 1 × 1 mm voxels (recall that one voxel is a 3D pixel) and a Bounding Box of limits: [−78 −112 −60; 78 76 85]. Brain images have been segmented so that the whole matter (also called W images), gray matter (C1 images) and white matter (C2 images) are available in different files. In this study, the complete volumetric information of the brain is used (that is, 3D information is used instead of two-dimensional (2D) information).

3.3. Feature extraction

In wavelet analysis, a fully scalable modulated window is shifted along the signal for every position, and this process is repeated with shorter or longer windows, yielding a multi-resolution analysis. As a result, decomposing an image into its wavelet coefficients (calculated as explained below) will produce a series of images with different scales.

When applied to 2D images, one-dimensional (1D) DWT is applied separately to each dimension, resulting in four sub-bands (Low–Low, Low–High,


High–High, and High–Low, or LL, LH, HH, HL). In this 2D case, LL is regarded as the approximation coefficients of the image, while LH, HH, and HL remain as the detail coefficients. To compute the next level of decomposition, the same process is applied to the approximation coefficients. As the level of decomposition is increased, further sub-bands are computed and more compact yet coarser approximation coefficients are obtained. This whole process computes a hierarchical framework to work with images.

Figure 2 shows a block diagram of 2D wavelet decomposition with application to images, where the HH-k matrices correspond to high horizontal-high vertical filtering, the HL-k matrices correspond to high horizontal-low vertical filtering, the LH-k matrices correspond to low horizontal-high vertical filtering and the LL-k matrices correspond to low horizontal-low vertical filtering, k being the level of decomposition.

Fig. 2. Block diagram of 2D wavelet decomposition (2D-DWT) with application to images.

The size of the first level approximation coefficients of an N by N image is N/2 by N/2, the size of the second level will be N/4 by N/4, and so forth. As the level of decomposition is increased, a more compact and coarser image is obtained. Traditionally, slices of the brain are fetched and used to extract features.33,34,43,44,64

However, the methodology presented in this paper aims to extend these traditional methods in order to use 3D volumes instead of 2D slices, so that small VOIs of the brain could be found (the whole matter images and gray matter images are each divided into 1170 VOIs). Using 3D volumes could mean a significant reduction of the number of features extracted, since it allows the researcher to extract features just from a closed region related to the disease, and ignore all the adjacent information that could be unrelated. Therefore, the 3D-DWT65 is applied to each VOI; its implementation is an extension of the existing 2D algorithms.

The 3D wavelets, 3D-DWT, can be formed as a concatenation of separable applications of 1D wavelets in three spatial directions x, y, z, decomposing data in row, column, and slice direction. Figure 3 shows a block diagram of 3D-DWT for a specific VOI. The first step is the application of the 1D wavelet to perform a decomposition along the columns (the x-dimension), obtaining the low-pass volume L(x, y, z) and a high-pass volume H(x, y, z), taking into account that the number of data in these volumes is reduced (if the original VOI has Nx, Ny, and Nz voxels along each axis, the low-pass volume L(x, y, z) has Nx/2 × Ny × Nz). The next step is

Fig. 3. Block diagram of 3D DWT wavelet methodology used in this contribution for each of the VOI in which the brain
is divided.


executing a decomposition along the rows (y-axis), obtaining four sub-volumes: LL, LH, HL, and HH. Again, the number of data in these volumes is reduced (in this case to Nx/2 × Ny/2 × Nz). Finally, a decomposition along the slices is carried out (z-axis), obtaining eight sub-volumes or sub-bands: LLL, LLH, LHL, LHH, HLL, HLH, HHL, and HHH, in which case the number of data is Nx/2 × Ny/2 × Nz/2. In this contribution, 3D-DWT was performed using the biorthogonal 3.3 wavelet up to level 2 on each of the VOIs and the approximation coefficients were stored. For further details, one can refer to Ref. 66.

3.4. Feature selection

It is important to indicate that when performing the 3D-DWT on each of the VOIs of a certain patient, the number of coefficients obtained is very high (thousands of coefficients can be obtained). If we take into account that the brain will be divided into 2340 VOIs, the total number of coefficients could be excessive. In a first phase, the most relevant 3D-DWT coefficients are selected from each VOI (approximately 100 features), and for this purpose, the mRMR algorithm is used (Fig. 4 shows this methodology for the training subjects).

In this study, we have used mRMR by Peng et al.49 in order to select the most relevant features to use in the classification step. It is based on mutual information (MI) criteria from Information Theory,67,68 that is, a mutual dependency measurement between random variables. Given two variables x and y, MI is calculated as shown in Eqs. (1) and (2), where summations and integrals are used according to whether x and y are discrete or continuous, respectively. In these equations, the term p(x, y) is the joint probability of x and y, and the terms p(x) and p(y) are the marginal probabilities:

\[ I(x, y) = \sum_{x} \sum_{y} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)}, \tag{1} \]

\[ I(x, y) = \int_{x} \int_{y} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} \, dy \, dx. \tag{2} \]

mRMR is an iterative algorithm that starts with an empty set S, and in each iteration searches for an input feature that maximizes MI with respect to the output feature and minimizes MI with respect to the rest of the input features, that is, maximizes the relevance and minimizes the redundancy. This feature is added to S. The algorithm finishes when all features have been added, and the final ranking is the order in which they have been added to S.

It is worth recalling that in order to reduce noise and improve the performance of the classifier later on, it is useful (and recommended) to discretize the data into categorical data with zero mean value and unit variance.67 Finally, for most of the simulations performed in this contribution, the mRMR algorithm has selected the 100 most relevant coefficients, as a compromise between computational time cost and information reliability.

Fig. 4. Feature selection for each of the VOIs of each patient. After the application of the 3D-DWT in each VOI, M features are obtained. Due to this large number, mRMR feature selection is applied to each VOI in order to select the most relevant ones (n features, with n < M).
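
To make the number of coefficients per VOI concrete, the separable decomposition of Sec. 3.3 can be sketched as follows. This is an illustrative sketch only: it uses the Haar wavelet instead of the biorthogonal 3.3 wavelet used in the study, a single decomposition level, and a hypothetical 4 × 4 × 4 VOI stored as a dictionary; the function names split_axis and dwt3d_haar are assumptions.

```python
import itertools
import math


def split_axis(vol, shape, axis):
    """One Haar analysis step along one axis of a volume {(x, y, z): value}.

    Returns the low-pass volume, the high-pass volume and their common shape,
    which is halved along the chosen axis.
    """
    new_shape = list(shape)
    new_shape[axis] //= 2
    scale = 1.0 / math.sqrt(2.0)
    low, high = {}, {}
    for idx in itertools.product(*(range(n) for n in new_shape)):
        even = list(idx)
        odd = list(idx)
        even[axis] = 2 * idx[axis]
        odd[axis] = 2 * idx[axis] + 1
        a, b = vol[tuple(even)], vol[tuple(odd)]
        low[idx] = (a + b) * scale
        high[idx] = (a - b) * scale
    return low, high, tuple(new_shape)


def dwt3d_haar(vol, shape):
    """One level of separable 3D DWT: filter along x, then y, then z."""
    bands = {"": (vol, tuple(shape))}
    for axis in range(3):
        split = {}
        for name, (data, shp) in bands.items():
            low, high, new_shape = split_axis(data, shp, axis)
            split[name + "L"] = (low, new_shape)
            split[name + "H"] = (high, new_shape)
        bands = split
    return bands  # keys: LLL, LLH, LHL, LHH, HLL, HLH, HHL, HHH


# Hypothetical constant VOI of 4 x 4 x 4 voxels.
shape = (4, 4, 4)
voi = {idx: 1.0 for idx in itertools.product(range(4), range(4), range(4))}
bands = dwt3d_haar(voi, shape)
for name, (data, shp) in sorted(bands.items()):
    print(name, shp, len(data))
```

Each of the eight sub-bands holds Nx/2 × Ny/2 × Nz/2 values, as stated in Sec. 3.3. A second level would be obtained by calling dwt3d_haar again on the LLL sub-band.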


3.5. VOIs optimization using NSGA-II

In this study, we use a controlled elitist genetic algorithm,51 a variant of NSGA-II, to search for small regions in the brain that are related to AD. The aim of this search is to find the most relevant regions related to the disease, so that better classifications can be performed, and unknown regions of the brain (in terms of the disease) could be researched (selecting which are the most relevant VOIs).

NSGA-II is one of the most popular multi-objective optimization algorithms. Three outstanding features account for this: a fast nondominated sorting approach, a fast crowded-distance estimation procedure, and finally, a simple crowded comparison operator.52 The algorithm is often summarized as performing the following steps:

(1) Random population initialization based on the problem ranges and constraints. In the simulation carried out in this contribution, the population size of the genetic algorithm is 100.
(2) Nondominated sorting of the population.
(3) Crowding distance computation and assignment, so that individuals in the population are selected based on their rank.
(4) Selection of individuals using a binary tournament.
(5) Genetic operators are applied: simulated binary crossover and polynomial mutation.
(6) Recombination and selection, so that the next generation is a combination of the offspring and the current population.
(7) Fitness functions evaluation. In the proposed methodology, the fitness functions have two objectives: (a) complexity (number of VOIs) (b) accuracy of the classifier (a one-versus-all SVM classifier is used).

In the NSGA-II algorithm used in this contribution, a binary coding has been used, the number of genes being equal to the number of VOIs that a patient has. Figure 5 shows a codification example for one individual, in which the number “1” means that the VOI is taken into account, while zero means that it is not selected. The coding of the entire population is represented in the right side of Fig. 5, taking into account that 100 individuals (the size of the population) are simultaneously optimized, the maximum number of generations being 300.

Fig. 5. Binary coding used by the multi-objective algorithm for the selection of the most relevant VOIs to obtain a precise multi-classification.

Unlike single objective optimization techniques, NSGA-II simultaneously optimizes several objectives so that the yielded solutions are nondominated by other solutions. In this contribution, the fitness function has two objectives that are simultaneously optimized: one related to the number of regions used to perform a classification between AD, MCIc, MCInc, and normal subjects, and another one related to the accuracy achieved using certain regions (this accuracy score is computed with a one-versus-all SVM classifier). In this way, the algorithm finds a set of solutions that represent a trade-off between the number of VOIs and the accuracy achieved. Thus, several choices are available to the researcher.

Once a set of VOIs has been selected for a given solution of the Multi-Objective Genetic Algorithm NSGA-II, a feature matrix is obtained, in which each row corresponds to one individual of the training set and the columns contain the features of the selected VOIs. Figure 6 shows a diagram of the matrix obtained by the algorithm NSGA-II. Due to the high number of features, before performing the classification required by the fitness function of the genetic algorithm, a feature reduction is performed using PCA (more detail is presented in the following subsection).

The final solution of the NSGA-II is the so-called Pareto front. A Pareto front is a set of nondominated solutions, each regarded as optimal in the sense that no objective can be improved without sacrificing at least another objective; in our case, the two objectives are the number of VOIs and the accuracy of the classifier. In the following section, more detail about the classifier used is presented. Each solution of the Pareto front represents a possible solution, with a given set of


selected VOIs and their corresponding trained classifier (Trained SVM, Fig. 7).

Fig. 6. Matrix of features for the ith solution obtained by NSGA-II. This solution comes from the Pareto front and therefore consists of a set of selected VOIs (represented by the value “1”) and a trained SVM classifier.

Fig. 7. Pareto front obtained (selected VOIs and one-versus-all SVM trained classifier for each of the multiple solutions within the Pareto front).

3.6. Feature reduction
Once a subset of relevant VOIs has been selected during the evolution of the NSGA-II, the next step is to unify the information of all the selected VOIs for a given patient, so that this global information can later be used by the classifier. Due to the high number of features, this number is reduced using PCA. PCA finds a mapping to a lower-dimensional space that still retains a large portion of the information in the samples analyzed.
Thus, PCA is implemented as a final feature reduction algorithm prior to the classification step, so that the SVM classifier has to handle fewer features. The reduction of the number of features entails a reduction in the memory required to train the SVM classifier, thus reducing its complexity and computation time. Figure 8 shows a block diagram of the use of PCA for the reduction of the number of features. Dimensionality can be reduced by roughly two orders of magnitude. For example, in the problem presented in this contribution, the number of features in the feature matrix is 27,200 (after the mRMR reduction and adding both W and C1 sources, Table 4) and, preserving around 95% of the variance, 175 principal components were selected using PCA. Thus, the classifier handles a 175-dimensional space instead of a 27,200-dimensional space.

3.7. Classification
Different machine learning techniques have been used in the references for the diagnosis/classification of AD (Ref. 69 presents a complete study of the state of the art in this field). In order to increase the precision and simultaneously reduce the computation time of the hybrid system proposed in this paper, SVM was selected as the classifier. It is relevant to note that the SVM classifier is one of the functional blocks of the Multi-objective Genetic


Algorithm (in this contribution, NSGA-II is used), and therefore the relevant features of the ROIs of the brain selected by the NSGA-II (the selected VOIs) are the input to the classifier.

Fig. 8. Block diagram of the use of PCA to reduce the number of features used by the SVM classifier when computing the fitness function of the NSGA-II.

SVM was first proposed in Refs. 70 and 71 and is frequently used in computer-aided diagnosis (CAD) systems for AD (Ref. 72). Intuitively, SVM maps vectors to a higher (possibly infinite)-dimensional space, in which the mapped vectors are potentially clustered, and then computes a number of hyperplanes that separate the different mapped data points, trying to maximize the distance between them and the hyperplanes. This distance is the so-called functional margin; the larger the functional margin, the better the classification and the lower the probability of error.
In order to create nonlinear classifiers, Aizerman et al. (Ref. 73) proposed the so-called kernel trick, which Boser et al. (Ref. 71) suggested as a way to create nonlinear classifiers. In this contribution, the Radial Basis Function (RBF) kernel is used, since it maps the features nonlinearly to a higher-dimensional space and can therefore handle nonlinear relationships between samples and target labels. It also has fewer hyperparameters than the polynomial kernel, which reduces the complexity of the model. Many studies report that it performs well compared with other kernels. After deciding which kernel to use, the next step is to compute the parameters C and γ that optimize the classifier, using k-fold cross-validation and a grid search. In this study, a one-versus-all multi-class SVM approach is used and 10-fold cross-validation was carried out. Grid search is a common technique in SVM that evaluates different combinations of parameter values and selects the set of parameters that provides the most accurate model. The process is simply an exhaustive search through a set of (C, γ) pairs.

3.8. Pipeline description
The whole database of subjects is divided into two parts, and the study involves two phases. In the first phase, 206 patients from the whole database are used to optimize the VOI selection using the optimization algorithm described. The remaining 90 subjects are later used to test the Pareto solutions yielded by the optimization algorithm. The aim of this partition is to allow checking whether overfitting has occurred, since the cohort of subjects used to optimize the VOI selection and the subjects used to test those VOIs are completely different and independent. Due to the large computational time required for the optimization of the most representative VOIs by the NSGA-II, k-fold cross-validation has not been performed for this partition.
To locate the most relevant regions of the brain, and also to find out whether gray matter-segmented images could improve the performance of this kind of system, the MR images belonging to the 296 patients were divided into 1170 different, nonoverlapping VOIs, each with a dimension of 15 × 15 × 15 voxels. This adds up to 2340 VOIs in total (1170 belonging to whole matter images and 1170 belonging to gray matter images).
For the 296 subjects of this contribution, 3D-DWT was performed using the biorthogonal 3.3 wavelet up to level 2 on each of the VOIs, and the approximation coefficients were stored. Due to the large number of coefficients for an individual VOI, mRMR is used to select the 100 most relevant features of each volume (as a compromise between computational cost and information reliability).
This information, obtained for the 206 subjects (training sample), was fed into the


optimization algorithm (NSGA-II) to find the most relevant regions. Each individual in the population of the algorithm (a potential solution) has a binary genetic code (with a length of 2340 genes, the same as the number of VOIs) that states which regions of the brain are to be used to perform the classification.
The next stage is the unification of all the coefficients of the VOIs selected by the NSGA-II to obtain global information about the patient. The number of coefficients obtained is very high (it is the sum of the coefficients of the selected VOIs), so it is necessary to reduce this number using PCA. The final features (the PCA components) are the input to the classifier (a one-versus-all SVM classifier is used).
A population of 100 individuals and a maximum of 300 generations were used by the NSGA-II. After around 150 generations, the optimization stalled and was stopped, meaning that no improvement in the fitness function was achieved across the last generations. The number of individuals and generations is small, since computational complexity and running times increase steeply in this kind of task.
For the first phase, the training of the classifier with the selected VOIs, a classic SVM training is performed, in which a group of patients is used for validation. It is important to note that the SVM classifier is included as a functional block of the Genetic Algorithm, specifically in the fitness function; therefore, in order to obtain the optimal classifier, a 10-fold cross-validation is carried out.
Once the Pareto front has been obtained, different solutions can be evaluated or tested (solutions with different numbers of VOIs and their associated SVM classifiers). For this test phase, 90 different patients are used. The trained SVM (which already has fixed parameters) is used in the test phase to classify the test patients. Three solutions from the Pareto front have been analyzed, and the results are presented in Sec. 4.

4. Results
4.1. Performance comparison
Some important measures of the performance of a classifier are the sensitivity and specificity, which are defined for a two-class classification problem as
\[ \mathrm{SEN} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}; \qquad \mathrm{SPE} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}, \tag{3} \]
where TP, TN, FP, and FN are true positives, true negatives, false positives, and false negatives, respectively. Because the proposed method is a multi-class classifier, these definitions do not apply directly, and confusion matrices are used instead. A confusion matrix is a representation of the performance of a multi-class classifier in which rows represent the output class given by the classifier and columns represent the target class, i.e. the actual class. A summarized list of the results presented in the references is shown in Table 7, with a graphical representation in Fig. 9.

Fig. 9. Bar chart giving a graphical representation of Table 7. Note that the number of subjects per class is shown as a percentage, not in absolute numbers.

4.2. Classification results
The optimization algorithm yields a Pareto front of nondominated solutions, as shown in Table 3. The solutions are ordered by complexity, the first being the solution with the highest number of VOIs. For example, the third Pareto front solution is the third most complex optimal solution found by the Multi-Objective Genetic Algorithm; it has 115 VOIs.
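The notion of a nondominated solution used for the Pareto front can be illustrated with a minimal sketch. It is not the authors' NSGA-II implementation: it only filters a list of (number of VOIs, accuracy) pairs, assuming that fewer VOIs and higher accuracy are both preferred. The names `dominates` and `pareto_front` are hypothetical, and the two dominated candidates are invented for illustration.

```python
from typing import List, Tuple

# Each candidate is (number_of_vois, accuracy); fewer VOIs and higher accuracy are better.
Solution = Tuple[int, float]


def dominates(a: Solution, b: Solution) -> bool:
    """True if `a` is no worse than `b` on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Solution]) -> List[Solution]:
    """Keep the candidates that no other candidate dominates, most complex first."""
    front = [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
    return sorted(set(front), key=lambda s: s[0], reverse=True)


# The first three entries are the (Total VOIs, Accuracy) pairs of Table 3;
# the last two are made-up dominated candidates, added only for illustration.
candidates = [(136, 97.5), (120, 96.66), (115, 95.0), (140, 96.0), (120, 94.0)]
front = pareto_front(candidates)
print(front)
```

In this toy input the three solutions of Table 3 survive because each one trades VOIs against accuracy, while the two invented candidates are discarded since another solution is at least as good on both objectives.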


Table 3. Pareto front for VOIs experiment during training phase.

Pareto solution | Accuracy | Total VOIs | W VOIs | C1 VOIs
First (1) | 97.5% | 136 | 59 | 77
Second (2) | 96.66% | 120 | 49 | 77
Third (3) | 95% | 115 | 48 | 67
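The W and C1 counts in Table 3 follow from the binary coding of each individual: genes 1 to 1170 refer to whole brain VOIs and genes 1171 to 2340 to gray matter VOIs. A minimal sketch of this decoding is given below. The function name `decode_genome` and the positions of the selected genes are hypothetical; only the counts (59 W and 77 C1 VOIs) mirror the first row of Table 3.

```python
from typing import Dict, List

N_VOIS_PER_SOURCE = 1170  # VOIs per image type, as stated in the text


def decode_genome(genome: List[int]) -> Dict[str, List[int]]:
    """Split the selected gene positions (1-based) into W and C1 sources."""
    if len(genome) != 2 * N_VOIS_PER_SOURCE:
        raise ValueError("expected one gene per VOI (2340 genes)")
    selected = [i + 1 for i, gene in enumerate(genome) if gene == 1]
    return {
        "W": [i for i in selected if i <= N_VOIS_PER_SOURCE],
        "C1": [i for i in selected if i > N_VOIS_PER_SOURCE],
    }


# Hypothetical genome: which positions are set is arbitrary; only the counts
# (59 W VOIs and 77 C1 VOIs) mirror the first row of Table 3.
genome = [0] * (2 * N_VOIS_PER_SOURCE)
for position in range(59):
    genome[position * 10] = 1                       # 59 whole brain VOIs
for position in range(77):
    genome[N_VOIS_PER_SOURCE + position * 10] = 1   # 77 gray matter VOIs

sources = decode_genome(genome)
n_selected = len(sources["W"]) + len(sources["C1"])
print(len(sources["W"]), len(sources["C1"]), n_selected)
```

The complexity objective of the fitness function is then simply `n_selected`, the number of genes set to 1.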


It is important to note that the accuracy in Table 3 is related to the training phase (these are the solutions found by the evolutionary algorithm using the 206 training patients, and they therefore reflect the accuracy obtained by the SVM classifier during the training phase). The whole population of individuals of the Multi-Objective Genetic Algorithm is shown in Fig. 10, and the Pareto genome in Fig. 11.
A relevant conclusion can be drawn from Fig. 11, in which the x-axis of the Pareto genome represents the VOIs of the brain selected and the y-axis represents the accuracy of the set of VOIs used: more sources are selected from gray matter images (sources 1 to 1170 correspond to whole brain images and sources 1171 to 2340 correspond to gray matter images). This suggests that, in classification tasks, gray matter-segmented images are more suitable than whole brain images.

Fig. 11. Pareto front genome, which gives the relevance of the set of VOIs used to obtain a certain accuracy.

These results are not yet solid enough, since they would be worthless if overfitting had occurred, in which case the classifier would be fitting noise instead of the underlying relationships between the features and the disease. Phase two (the testing phase) of the experiment is needed to verify the accuracy and reliability of these results. For this purpose, the results obtained in the optimization phase were tested on the remaining 90 subjects of the whole database, which had not been part of phase one. It must be kept in mind that the 206 subjects used in the optimization phase are different and independent from the 90 subjects used in the testing phase. In this way, the results (in terms of accuracy) obtained in the testing phase can be considered solid.
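The independence of the two cohorts can be made explicit with a short sketch. The paper does not state how the 206/90 partition was drawn, so the seeded random shuffle, the function name `split_cohort`, and the integer subject identifiers below are all assumptions made for illustration.

```python
import random
from typing import List, Tuple


def split_cohort(subject_ids: List[int], n_train: int, seed: int) -> Tuple[List[int], List[int]]:
    """Shuffle the subject identifiers once and cut them into two disjoint groups."""
    shuffled = list(subject_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


subjects = list(range(296))  # placeholder identifiers for the 296 subjects
optimization, testing = split_cohort(subjects, n_train=206, seed=0)

# No subject may appear in both groups, otherwise the test accuracy is optimistic.
assert set(optimization).isdisjoint(testing)
print(len(optimization), len(testing))
```

Fixing the partition before the optimization starts is what allows the test accuracy to serve as a check for overfitting.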
Fig. 10. Whole populations of solutions (including the solution from the Pareto front) during the execution of the Multi-Objective Genetic Algorithm (NSGA-II).

Table 4. Results on independent cohort of subjects for first Pareto individual (Std stands for standard deviation).

Source | Features | PC | Accuracy | Std. | Confusion matrix
W | 11,800 | 175 | 56.67% | 7.24% | Fig. 12(a)
C1 | 15,400 | 175 | 92.2% | 4.96% | Fig. 12(c)
W + C1 | 27,200 | 175 | 93.3% | 4.64% | Fig. 12(e)

For the testing phase, different configurations regarding the parameters involved in the process were tested, and the results of the most relevant ones are shown in Table 4. All these tests were performed


extracting biorthogonal 3.3 approximation coefficients using 3D-DWT up to level 2 in the group of 90 independent subjects, and selecting the most relevant features with the mRMR methodology (the threshold was 100, which means a maximum of 100 features per VOI). Once all the information of the selected VOIs was unified, PCA was applied to keep around 95% of the variance of the features. Each classification was run five times and the results were averaged in order to reduce noise (because the subjects belonging to the training set and testing set are randomly selected to avoid bias).

Fig. 12. Results of the testing phase for the first Pareto solution (left) and the third Pareto solution (right). For the third Pareto solution, only VOIs belonging to gray matter images (67 regions) are used. (a) Confusion matrix for W sources from the first Pareto individual. (b) Confusion matrix for the third Pareto solution, 80 selected features per VOI. (c) Confusion matrix for C1 sources from the first Pareto individual. (d) Confusion matrix for the third Pareto solution, 100 selected features per VOI. (e) Confusion matrix for W + C1 sources from the first Pareto individual. (f) Confusion matrix for the third Pareto solution, 130 selected features per VOI.

The confusion matrix for each configuration is shown in Figs. 12(a), 12(c), and 12(e) (W, C1, and


W + C1 sources, respectively). It is noteworthy that W and C1 sources involve different VOIs.
For the sake of curiosity, the same testing phase was performed using the regions belonging to gray matter-segmented images from the third Pareto solution, which comprises fewer VOIs (the third Pareto solution has a total of 115 VOIs: 48 W VOIs and 67 C1 VOIs). This time, the effect of the number of mRMR features on the accuracy achieved was also tested.
The procedure was the same as the one already described (using the independent cohort of 90 subjects, extracting bior3.3 level 2 approximation coefficients, etc.), except that only the VOIs belonging to gray matter images were used, that is, 67 VOIs of 15 × 15 × 15 voxels, which add up to 226,125 voxels from a total of 4,035,528.
Three classifications were performed: one in which only 80 mRMR features per VOI were selected, another with 100 features per VOI, and the last one with 130 features per VOI. Selecting more features per VOI led to lower accuracy: the best result (94.4%) was obtained with 80 features per VOI. Confusion matrices are shown in Figs. 12(b), 12(d), and 12(f), and the results are summarized in Table 5.

Table 5. Summarized results for the three configurations tested using the third Pareto solution.

Maximum feats. per VOI | Total feats. | PCA | Accuracy | Std.
80 | 6160 | 175 | 94.4% | 5.55
100 | 7700 | 175 | 92.2% | 6.33
130 | 10,010 | 175 | 91.1% | 3.04

4.3. Regions examination
After obtaining these promising results in the independent cohort (93.3% accuracy for W + C1 sources from the first Pareto solution and 94.4% accuracy for C1 regions from the third Pareto solution), an interesting goal was to label the regions used in order to find a link to Alzheimer pathology. Note that W and C1 sources involve different VOIs; the centers of the 15 × 15 × 15 voxel boxes are listed in Table 6.
In order to do this, a graphical representation of the regions used and the MRIcron tool were needed. For example, Fig. 13 shows the VOIs forming the first Pareto solution. On the left, a sliced brain is shown with each VOI of 15 × 15 × 15 voxels rendered in its location; it can be seen that some of the VOIs (specifically eight) lie on the cortical surface, which might suggest that cortical thickness could be of interest when assessing AD and its intermediate states. The rest of the VOIs are located in specific regions of the brain, and will be addressed in the following sections. On the right of the figure, the three slices used to plot the 3D view are shown, together with the VOIs that share voxels with the highlighted slices. The corresponding plot for C1 VOIs is shown in Fig. 14.

Table 6. Location of the W and C1 VOIs centers for first Pareto solution obtained with the multi-objective genetic algorithm.

Source | VOI centers (x × y × z, separated by semicolons)
W | 10 38 82;10 66 26;10 80 96;10 94 82;10 136 68;25 66 82;25 94 96;25 150 54;25 164 82;25 178 12;25 178 40;40 24 40;40 136 82;40 136 110;40 150 68;55 38 124;55 66 110;55 94 26;55 136 54;55 150 26;55 150 110;55 164 54;55 164 96;55 178 26;70 24 82;70 52 54;70 66 12;70 80 54;70 94 26;70 150 68;70 164 110;70 178 124;85 94 124;85 122 54;85 136 26;85 150 40;100 38 40;100 52 12;100 80 68;100 150 82;100 164 124;115 24 54;115 52 54;115 66 96;115 94 68;115 94 124;115 150 40;130 24 68;130 80 110;130 122 54;130 122 82;130 164 82;145 24 40;145 52 110;145 66 12;145 66 68;145 122 96;145 164 12;145 178 110
C1 | 10 38 82;10 66 82;10 80 12;10 80 40;10 122 12;10 136 12;25 10 40;25 10 96;25 24 124;25 164 26;40 52 54;40 52 96;40 66 12;40 66 54;40 66 68;40 94 68;40 136 96;40 150 26;40 178 68;55 52 110;55 108 110;55 122 12;55 122 124;55 164 40;70 38 68;70 52 110;70 80 40;70 80 110;70 94 82;70 94 110;70 108 40;70 108 110;70 108 124;70 122 54;70 150 68;70 164 68;70 178 96;85 10 26;85 10 54;85 10 96;85 24 54;85 38 26;85 52 68;85 66 54;85 80 26;85 80 96;85 94 110;85 108 82;85 122 40;85 164 96;85 178 40;85 178 54;100 10 26;100 24 82;100 24 96;100 66 82;100 94 12;100 94 110;100 108 82;115 38 40;115 38 82;115 66 26;115 66 96;115 66 110;115 122 12;115 164 40;130 24 96;130 38 54;130 52 54;130 52 68;130 66 68;130 136 68;130 164 40;130 178 68;145 38 12;145 66 54;145 164 12
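To work with the centers of Table 6 programmatically, each semicolon-separated entry can be parsed into a coordinate triple and expanded into the voxel range of its 15 × 15 × 15 box. The sketch below assumes that the listed coordinate is the middle voxel of the cube, which the paper does not state explicitly; `parse_centers` and `voi_bounds` are hypothetical helper names.

```python
from typing import List, Tuple

VOI_SIDE = 15  # voxels per edge, as stated in the text

Center = Tuple[int, int, int]


def parse_centers(cell: str) -> List[Center]:
    """Parse 'x y z;x y z;...' into a list of integer triples."""
    centers = []
    for item in cell.split(";"):
        x, y, z = (int(value) for value in item.split())
        centers.append((x, y, z))
    return centers


def voi_bounds(center: Center, side: int = VOI_SIDE) -> Tuple[Center, Center]:
    """Inclusive lower and upper voxel indices of a cube centered on `center`.

    Assumes the listed coordinate is the middle voxel of an odd-sized cube.
    """
    half = side // 2
    lower = (center[0] - half, center[1] - half, center[2] - half)
    upper = (center[0] + half, center[1] + half, center[2] + half)
    return lower, upper


# First three W centers of Table 6.
centers = parse_centers("10 38 82;10 66 26;10 80 96")
lower, upper = voi_bounds(centers[0])
print(lower, upper)

# Voxels covered by the 67 gray matter VOIs of the third Pareto solution.
voxels = 67 * VOI_SIDE ** 3
print(voxels)
```

The last line reproduces the 226,125 voxels quoted in Sec. 4.2 for the 67 gray matter VOIs.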


Fig. 13. Whole matter regions forming the first Pareto solution.

Fig. 14. Gray matter regions forming the first Pareto solution.

The MRIcron tool was used to label each region belonging to the first Pareto solution. The corresponding information is shown in Table 8 for the regions located in the temporal lobe, Table 9 for the regions located in the frontal lobe, Table 10 for the regions located in the parietal lobe, Table 11 for the regions located in the occipital lobe, Table 12 for the regions located in the limbic lobe, and Table 13 for other regions.
Some examples (but not all, for the sake of conciseness) of the most relevant regions found using this procedure are shown in Figs. 15(a), 15(b), and 15(c).

5. Discussion and Conclusions
This study has two main goals. The first one is to develop a procedure that could help researchers to explore the regions of the brain related to AD (and that can be scaled to explore other diseases). In order to do so, an optimization algorithm wrapped around a classifier was implemented, with the purpose of finding the most relevant regions of the brain when classifying patients. The second goal was to implement a classifier that could be part of a CAD tool, being able to classify between AD, MCIc, MCInc, and healthy patients.
For the optimization phase of the study, 2340 different VOIs (1170 from whole matter images and 1170 from gray matter-segmented images) were extracted from the MRI images of 206 different subjects. The optimization was accomplished using a Multi-objective Genetic Algorithm, and a Pareto front was obtained.
For the testing phase, two solutions from the Pareto front were tested with an independent subject cohort comprising 90 subjects. The first tested solution was the first Pareto individual, comprising 136 VOIs. Using these regions, the classifier achieved 93.3% accuracy. The second solution tested was the set of gray matter regions of the third Pareto solution. Using these 67 volumes, which represent just 5.69% of the total volume of the brain, the classifier achieved an accuracy of 94.4%.
Analyzing the Pareto front, we also drew the conclusion that gray matter images were more relevant than whole matter images when it comes to classifying and diagnosing AD.
There are very few references, if any, that perform 3D feature extraction for the optimization of VOIs in a multi-class classification (AD, MCIc, MCInc, and Normal). The fact that these methods were implemented in the proposed methodology, along with the high accuracies achieved during the testing phase, suggests that 3D-DWT for feature extraction and mRMR for feature selection, in conjunction with NSGA-II and PCA, could improve the accuracy achieved by classifiers (SVM). Besides achieving high accuracy, the proposed methodology (optimization and testing) could be a promising procedure that should be improved and tested on other diseases to help researchers extract further knowledge from MR images.
Another important accomplishment of the proposed methodology is the information extracted from the Pareto solutions: the regions of the brain involved in AD and MCI. Most of the regions, listed and described in Sec. 4.3, are already stated as solid


Table 7. Comparison results of different classifiers presented in the references. If different configurations were carried out for the experiment, the most similar ones to the presented methodology or those that achieved best accuracies are shown.

Authors | Imaging | Cohort | Classification | Method | Acc (%) | Sen (%) | Spe (%)
Aggarwal et al., 2015 (Ref. 31) | MRI | OASIS | 99 AD versus 99 Normal | 3DWT, FDR, mRMR, SVM | 68.94 | 81.13 | 75.02
Beheshti et al., 2015 (Ref. 32) | MRI | ADNI | 130 AD versus 130 Normal | VBM, DARTEL, PDF, SVM | 90.76 | 90 | 91.53
 | | | 68 AD versus 68 Normal | VBM, DARTEL, PCA, SVM | 78.23 | 77.34 | 79.11
Dukart et al., 2013 (Ref. 6) | MRI PET | ADNI | 28 AD versus 28 Normal | VOIs, SVM | 85.7 | 89.3 | 82.1
Zhang et al., 2015 (Ref. 44) | MRI | OASIS | 28 AD versus 98 Normal | ICV, Eigenbrain, WTT, SVM | 91.47 | 90.17 | 91.84
 | | | 28 AD versus 98 Normal | DF, PCA, TSVM | 92.36 | 90.56 | 93.37
Olfa Ben Ahmend et al., 2017 (Ref. 56) | MRI | ADNI | 45 AD versus 52 Normal | MKL (MD, SMR, CSF) | 90.2 | 82.98 | 97.2
 | | | 58 MCI versus 52 Normal | | 79.42 | 71.58 | 86.05
 | | | 45 AD versus 58 MCI | | 76.63 | 65.62 | 81.33
Ahmend et al., 2015 (Ref. 55) | MRI | ADNI | 137 AD versus 162 Normal | SVM-RBF, Hipp-PCC features | 83.77 | 79.09 | 88.2
 | | | 210 MCI versus 162 Normal | | 69.45 | 74.8 | 62.52
 | | | 137 AD versus 210 MCI | | 62.07 | 49.02 | 75.15
Ahmend et al., 2015 (Ref. 55) | MRI | Bordeaux-3City dataset | 16 AD versus 21 Normal | SVM-RBF, Hipp-PCC features | 78 | 74.7 | 80.04
Liu et al. (Ref. 59) | MRI | ADNI | 180 AD versus 204 Normal | SVM, SAE | 82.59 | 86.83 | 77.78
 | | | 374 MCI versus 204 Normal | | 71.98 | 49.52 | 84.3
Liu et al. 2015 (Ref. 59) | MRI | ADNI | 180 AD versus 204 Normal versus 160 MCIc versus 214 MCInc | SVM, SAE | 46.3 | 66.14 | 77.78
Liu et al. 2015 (Ref. 59) | MRI and PET | ADNI | 85 AD versus 77 Normal versus 102 MCIc versus 67 MCInc | SAE-ZEROMASK | 53.79 | 52.14 | 86.98
Martinez-Murcia et al. 2016 (Ref. 53) | MRI | ADNI | 180 AD versus 180 Normal | HMM, DARTEL, SVM | 82.8 | 79.4 | 86.1
Ortiz et al. 2016 (Ref. 57) | MRI and PET | ADNI | 70 AD versus 68 Normal | Ensemble Deep Learning (FEDBN-SVM) | 90 | 86 | 94
 | | | 111 MCI versus 70 AD | | 84 | 79 | 89
 | | | 68 Normal versus 111 MCI | | 83 | 67 | 95
Tong et al. 2017 (Ref. 58) | MRI and PET | ADNI | 37 AD versus 35 Normal | Nonlinear graph fusion | 91.8 | 88.9 | 94.7
 | | | 75 MCI versus 35 Normal | | 79.5 | 85.1 | 67.1
 | | | 37 AD versus 75 MCI versus 35 Normal | | 60.2 | — | —
Zhu et al. 2016 (Ref. 61) | MRI | ADNI | 51 AD versus 52 Normal versus 99 MCI | PCA, LDA, LPP, SVM | 68.31 | — | —
 | | | 51 AD versus 52 Normal versus 43 MCIc versus 56 MCInc | | 59.74 | — | —
Zhu et al. 2016 (Ref. 61) | MRI and PET | ADNI | 51 AD versus 52 Normal versus 99 MCI | PCA, LDA, LPP, SVM | 73.35 | — | —
 | | | 51 AD versus 52 Normal versus 43 MCIc versus 56 MCInc | | 61.06 | — | —
Zhang et al. 2015(b) (Ref. 33) | MRI | OASIS | 24 AD versus 97 Normal versus 57 MCI | 3D-DWT, PCA, KSVM + PSOTVA | 81.5 | — | —
Proposed method | MRI | ADNI | 87 AD versus 76 MCIc versus 74 MCInc versus 59 Normal | DWT–mRMR–PCA–SVM | 93.3 | Fig. 12(e) |
Proposed method | MRI | ADNI | 87 AD versus 76 MCIc versus 74 MCInc versus 59 Normal | DWT–mRMR–PCA–SVM | 94.4 | Fig. 12(b) |
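Table 7 reports sensitivity and specificity for the two-class studies, whereas the multi-class results of the proposed method are given as confusion matrices. One common way to relate the two, shown here only as an illustrative sketch and not as a computation performed in the paper, is to treat each class in turn as "positive" against all the others. The function name `one_vs_rest_metrics` is hypothetical and the counts below are invented; rows are predicted classes and columns are target classes, following the convention of Sec. 4.1.

```python
from typing import Dict, List


def one_vs_rest_metrics(confusion: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Per-class sensitivity and specificity from a multi-class confusion matrix.

    Rows are the classes predicted by the classifier and columns are the
    target classes. Every class is assumed to have at least one sample.
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fp = sum(confusion[k]) - tp                  # predicted k, actually another class
        fn = sum(row[k] for row in confusion) - tp   # actually k, predicted another class
        tn = total - tp - fp - fn
        metrics[label] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    return metrics


# Made-up counts for illustration only; they are not results from the paper.
labels = ["AD", "MCIc", "MCInc", "Normal"]
confusion = [
    [25, 1, 0, 0],
    [1, 20, 2, 0],
    [0, 2, 21, 1],
    [0, 0, 1, 16],
]
metrics = one_vs_rest_metrics(confusion, labels)
accuracy = sum(confusion[k][k] for k in range(len(labels))) / sum(map(sum, confusion))
print(round(accuracy, 3), metrics["AD"])
```

The overall accuracy is the trace of the matrix divided by the number of subjects, which is the quantity listed in the Acc column for the proposed method.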

Table 8. Temporal lobe: VOIs for the first Pareto individual, each VOI being represented by its center (x × y × z).
Whole matter Gray matter
Temporal lobe Left hemisphere Right hemisphere Left hemisphere Right hemisphere
(130 38 54)
(10 80 40)
(130 52 54)
(145 66 68) (40 52 54)
Medial/Inferior (10 80 96) (130 52 68)
(145 66 12) (40 66 54)
(130 66 68)
(40 66 68)
(145 66 54)
Superolateral (145 66 68)

Table 9. Frontal lobe: VOIs for the first Pareto individual, each VOI being represented by its center (x × y × z). Pt = Pars triangularis; Pop = Pars opercularis; Pg = Precentral gyrus; Mg = Medial gyrus; Stg = Straight gyrus; R = Rectus; Or = Orbital.

Whole matter Gray matter


Frontal lobe Left hemisphere Right hemisphere Left hemisphere Right hemisphere
(130 136 68) (Pt)
(25 150 54) (Pt) (130 122 54) (Pop) (55 108 110) (Pg)
(130 136 68) (Pop)
Superolateral (40 136 82) (Pt) (115 94 124) (Pg) (55 122 124)
(85 164 96) (Mg)
(40 150 68) (Pt) (130 122 82) (Pg) (70 164 68) (Mg)
(100 94 110) (Pg)
(40 136 110)
(40 136 96) (85 178 40) (Or)
(55 164 54)
(85 136 26) (Stg, R) (55 164 40) (85 178 54) (Or)
Medial/Inferior (55 164 96)
(115 150 40) (Or) (25 164 26) (Or) (115 164 40) (Or)
(55 150 26) (Or)
(40 150 26) (Or) (130 164 40) (Or)
(55 178 26) (Or)
Supplementary (70 108 110)
(85 94 110)
motor area (70 108 124)

Table 10. Parietal lobe: VOIs for the first Pareto individual, each VOI being represented by its center (x × y × z). Pr = Precuneus; Po = Postcentral; Pa = Paracentral; Sg = Supramarginal gyrus; Ag = Angular gyrus; Pl = Paracentral lobule.
Whole matter Gray matter
Parietal lobe Left hemisphere Right hemisphere Left hemisphere Right hemisphere
(10 94 82) (115 66 96)
(115 66 96)
(25 66 82) (70 52 110 ) (Pr) (100 94 110) (Po)
Medial/Inferior (130 80 110) (Po)
(25 94 96) (70 80 40) (Pr) (85 66 54 ) (Pr)
(85 94 124) (Pa)
(55 38 124) (Pr) (100 66 82) (Pr)
(10 66 82) (Sg)
(40 94 68) (Sg)
(115 66 96) (Ag)
(40 52 96) (Ag)
Superolateral (115 66 110)
(55 52 110)
(85 94 110) (Pl)
(70 80 110) (Pl)
(70 94 110) (Pl)


Table 11. Occipital lobe: VOIs for the first Pareto individual, each VOI being represented by its center (x × y × z). Lg = Lingual gyrus; C = Cuneus; Cs = Calcarine sulcus.
Whole matter Gray matter
Occipital lobe Left hemisphere Right hemisphere Left hemisphere Right hemisphere
(85 66 54) (Lg)
(100 10 26) (Lg)
(115 38 40) (Lg)
(100 24 82) (C)
(100 38 40) (Lg)
(40 24 40) (70 38 68) (C) (85 10 26) (Cs)
Medial/Inferior (115 52 54) (Lg)
(70 52 54) (Lg) (70 38 68) (Cs) (85 10 54) (Cs)
(115 24 54)
(85 24 54) (Cs)
(85 52 68) (Cs)

(100 10 26) (Cs)


(115 38 40) (Cs)
Superolateral (70 24 82 ) (115 38 82 )

Table 12. Limbic lobe: VOIs for the first Pareto individual, each VOI being represented by its center (x × y × z). Ac = Anterior cingulum; Mc = Medial cingulum; Pc = Posterior cingulum.
Whole matter Gray matter
Limbic lobe Left hemisphere Right hemisphere Left hemisphere Right hemisphere
(100 94 12)
(55 94 26) (55 122 12)
Hippocampus (100 80 68) (115 66 26)
(70 80 54) (70 80 40)
(115 122 12)
(85 80 96) (Mc)
(85 108 82) (Mc)
(85 150 40) (Ac) (70 150 68) (Ac) (100 66 82) (Mc)
Cingulate gyrus
(100 150 82) (Mc) (70 80 110) (Mc) (100 108 82) (Mc)
(85 66 54) (Pc)
(100 66 82) (Pc)

Table 13. Other regions: VOIs for the first Pareto individual, each VOI being represented by its center (x × y × z).
Whole matter Gray matter
Other regions Left hemisphere Right hemisphere Left hemisphere Right hemisphere
(55 136 54)
Caudate (Basal ganglia) (70 122 54) (85 122 40)
(70 150 68)
Ventral Pallidum (Basal ganglia) (70 108 40)
(85 38 26)
(85 80 26)
Cerebellum (Hindbrain) (70 66 12) (100 52 12) (40 66 12)
(85 66 54) (Vermis)
(85 80 26) (Vermis)
Thalamus (Diencephalon) (85 122 54) (70 108 40)
(115 94 68)
Insular cortex
(115 150 40)


Fig. 15. Analysis of ROIs in Caudate, Pars triangularis, and Precuneus. (a) Caudate: The caudate nucleus is deeply related to goal-directed action, memory, learning, emotion, and language. In recent years, it has been strongly related to AD (Refs. 74 and 75). (b) Pars triangularis: Also known as Brodmann area 45 and part of the frontal cortex. Together with Brodmann area 44, it forms Broca’s area, which is strongly related to semantic tasks. (c) Precuneus: Part of the superior parietal lobule. Regarding memory tasks, the precuneus plays a key role in attention, episodic memory retrieval, working memory, and conscious perception (like self-awareness). Regarding visuospatial abilities, it has been suggested to be involved in directing attention in space, motor imagery, visuospatial mental operations, and even in modeling other people’s views (thus related to empathy and judging). It has also been suggested that, together with the posterior cingulate, the precuneus is pivotal for conscious information processing. Finally, it has been proposed that it works as a hub between parietal and prefrontal regions (Ref. 76).

sources of information to assess the diagnosis of AD and MCI. A few of them have recently been the object of new studies regarding this condition and its consequences. It could be the case that the remaining regions, for which no references were found, are relevant in the progression of cognitive impairment and the development of AD.

Acknowledgments
The publication of this paper was supported by Projects TIN2015-71873-R and P12-TIC 2082. Data used in preparation of this paper were obtained from the ADNI database (adni.loni.usc.edu).

References
1. R. Wolz, V. Julkunen, J. Koikkalainen, E. Niskanen, D. P. Zhang, D. Rueckert, H. Soininen and J. Lotjonen, Multi-method analysis of MRI images in early diagnostics of Alzheimer’s disease, PLoS One 6(10) (2011) e25446.
2. G. B. Frisoni, N. C. Fox, C. R. Jack, P. Scheltens and P. M. Thompson, The clinical use of structural MRI in Alzheimer disease, Nat. Rev. Neurol. 6(2) (2010) 67.

3. American Psychiatric Association, Diagnostic and Statistical Manual of Mental Disorders, 5th edn. (American Psychiatric Association, Arlington, VA, 2013), p. 280.
4. S. T. DeKosky, M. C. Carrillo, C. Phelps, D. Knopman, R. C. Petersen, R. Frank, D. Schenk, D. Masterman, E. R. Siemers, J. M. Cedarbaum, M. Gold, D. S. Miller, B. H. Morimoto, A. S. Khachaturian and R. C. Mohs, Revision of the criteria for Alzheimer’s disease: A symposium, Alzheimers Dement. 7(1) (2011) e1–e12.
5. S. Kloeppel, C. M. Stonnington, C. Chu, B. Draganski, R. I. Scahill, J. D. Rohrer, N. C. Fox, C. R. Jack, J. Ashburner and R. S. J. Frackowiak, Automatic classification of MR scans in Alzheimer’s disease, Brain 131(7) (2008) 681–689.
6. J. Dukart, K. Mueller, H. Barthel, A. Villringer, O. Sabri and M. L. Schroeter, Meta-analysis based SVM classification enables accurate detection of Alzheimer’s disease across different clinical centers using FDG-PET and MRI, Psychiatry Res. Neuroim. 212(3) (2013) 230–236.
7. J. Dukart, K. Mueller, A. Villringer, F. Kherif, B. Draganski, R. Frackowiak and M. L. Schroeter, Relationship between imaging biomarkers, age, progression and symptom severity in Alzheimer’s disease, NeuroImage Clin. 3 (2013) 84–94.
8. H. I. Suk and D. Shen, Subclass-based multi-task learning for Alzheimer’s disease diagnosis, Front. Aging Neurosci. 6 (2014) 1–20.
9. C. DeCarli, G. B. Frisoni, C. M. Clark, D. Harvey, M. Grundman, R. C. Petersen, L. J. Thal, S. Jin, C. R. Jack and P. Scheltens, Qualitative estimates of medial temporal atrophy as a predictor of progression from mild cognitive impairment to dementia, Arch. Neurol. 64(1) (2007) 108–115.
10. D. López-Sanz, P. Garcés, B. Álvarez, M. L. Delgado-Losada, R. López-Higes and F. Maestú, Network disruption in the preclinical stages of Alzheimer’s Disease: From subjective cognitive decline to mild cognitive impairment, Int. J. Neural Syst. 27(8) (2017) 1750041.
11. G. B. Frisoni, P. H. Scheltens, S. Galluzzi, F. M. Nobili, N. C. Fox, P. H. Robert, H. Soininen, L.-O. Wahlund, G. Waldemar and E. Salmon, Neuroimaging tools to rate regional atrophy, subcortical cerebrovascular disease, and regional cerebral blood flow and metabolism: Consensus paper of the EADC, J. Neurol. Neurosurg. Psychiatry 74(10) (2003) 1371–1381.
12. E. Westman, L. Cavallin, J. S. Muehlboeck, Y. Zhang, P. Mecocci, B. Vellas, M. Tsolaki, I. Koszewska, H. Soininen, C. Spenger, S. Lovestone, A. Simmons and L. O. Wahlund, Sensitivity and specificity of medial temporal lobe visual ratings and multivariate regional MRI classification in Alzheimer’s disease, PLoS One 6(7) (2011) e22506.
13. E. L. G. E. Koedam, M. Lehmann, W. M. Van Der Flier, P. Scheltens, Y. A. L. Pijnenburg, N. Fox, F. Barkhof and M. P. Wattjes, Visual assessment of posterior atrophy development of a MRI rating scale, Eur. Radiol. 21(12) (2011) 2618–2625.
14. R. Romero-Garcia, M. Atienza and J. L. Cantero, Different scales of cortical organization are selectively targeted during the progression to Alzheimer’s disease, Int. J. Neural Syst. 26 (2016) 165003.
15. M. R. Farlow, Y. He, S. Tekin, J. Xu, R. Lane and H. C. Charles, Impact of APOE in mild cognitive impairment, Neurology 63(10) (2004) 1898–1901.
16. C. R. Jack, R. C. Petersen, Y. C. Xu, P. C. O’Brien, G. E. Smith, R. J. Ivnik, B. F. Boeve, S. C. Waring, E. G. Tangalos and E. Kokmen, Prediction of AD with MRI-based hippocampal volume in mild cognitive impairment, Neurology 52(7) (1999) 1397–1403.
17. C. R. Jack, D. W. Dickson, J. E. Parisi, Y. C. Xu, R. H. Cha, P. C. O’Brien, S. D. Edland, G. Smith, B. F. Boeve, E. Tangalos, E. Kokmen and R. C. Petersen, Antemortem MRI findings correlate with hippocampal neuropathology in typical aging and dementia, Neurology 58(5) (2002) 750–757.
18. H. Adeli, S. Ghosh-Dastidar and N. Dadmehr, Alzheimer’s disease: Models of computation and analysis of EEGs, Clin. EEG Neurosci. 36(3) (2005) 131–140.
19. H. Adeli, S. Ghosh-Dastidar and N. Dadmehr, A spatio-temporal wavelet-chaos methodology for EEG-based diagnosis of Alzheimer’s disease, Neurosci. Lett. 444(2) (2008) 190–194.
20. Z. Sankari, H. Adeli and A. Adeli, Wavelet coherence model for diagnosis of Alzheimer disease, Clin. EEG Neurosci. 43(4) (2012) 268–278.
21. F. C. Morabito, M. Campolo, D. Labate, G. Morabito, L. Bonanno, A. Bramanti, S. de Salvo, A. Marra and P. Bramanti, A longitudinal EEG study of Alzheimer’s disease progression based on a complex network approach, Int. J. Neural Syst. 25(2) (2015) 1550005.
22. N. Mammone, L. Bonanno, S. D. Salvo, S. Marino, P. Bramanti, A. Bramanti and F. C. Morabito, Permutation disalignment index as an indirect, EEG-based, measure of brain connectivity in MCI and AD patients, Int. J. Neural Syst. 27 (2017) 1750020.
23. A. H. Ahmadlou, H. Adeli and A. Adeli, Fractality and a wavelet-chaos methodology for EEG-based diagnosis of Alzheimer’s disease, Alzheimer Dis. Assoc. Disord. 25 (2011) 85–92.
24. Z. Sankari and H. Adeli, Probabilistic neural networks for EEG-based diagnosis of Alzheimer’s disease using conventional and wavelet coherence, J. Neurosci. Methods 197 (2011) 165–170.
25. N. Mammone, C. Leracitano, H. Adeli, A. Bramanti and F. C. Morabito, Permutation Jaccard distance-based hierarchical clustering to estimate
EEG network density modifications in MCI subjects, IEEE Trans. Neural Netw. Learn. Syst. (2018).
26. J. Amezquita-Sanchez, A. Adeli and H. Adeli, A new methodology for automated diagnosis of mild cognitive impairment (MCI) using magnetoencephalography (MEG), Behav. Brain Res. 305 (2016) 174–180.
27. M. Ahmadlou, A. Adeli, R. Bajo and H. Adeli, Complexity of functional connectivity networks in mild cognitive impairment patients during a working memory task, Clin. Neurophysiol. 125 (2013) 694–702.
28. D. Zhang, Y. Wang, L. Zhou, H. Yuan and D. Shen, Multimodal classification of Alzheimer’s disease and mild cognitive impairment, NeuroImage 55(3) (2011) 856–867.
29. J. P. Lerch, J. Pruessner, A. P. Zijdenbos, D. L. Collins, S. J. Teipel and A. C. Evans, Automated cortical thickness measurements from MRI can accurately separate Alzheimer’s patients from normal elderly controls, Neurobiol. Aging 29(2) (2008) 23–30.
30. R. S. Desikan, H. J. Cabral, C. P. Hess, W. P. Dillon, C. M. Glastonbury, M. W. Weiner, N. J. Schmansky, D. N. Greve, D. H. Salat, R. L. Buckner and B. Fischl, Automated MRI measures identify individuals with mild cognitive impairment and Alzheimer’s disease, Brain 132 (2009) 2048–2057.
31. N. Aggarwal, B. Rana and R. Agrawal, 3D discrete wavelet transform for computer-aided diagnosis of Alzheimer’s disease using T1-weighted brain MRI, Int. J. Imag. Syst. Technol. 25(2) (2015) 179–190.
32. I. Beheshti and H. Demirel, Probability distribution function-based classification of structural MRI for the detection of Alzheimer’s disease, Comput. Biol. Med. 64 (2015) 208–216.
33. Y. Zhang, S. Wang, P. Phillips, Z. Dong, G. Ji and J. Yang, Detection of Alzheimer’s disease and mild cognitive impairment based on structural volumetric MR images using 3D-DWT and WTA-KSVM trained by PSOTVAC, Biomed. Signal Process. Control 21 (2015) 58–73.
34. R. Cuingnet, E. Gerardin, J. Tessieras, G. Auzias, S. Lehéricy, M. O. Habert, M. Chupin, H. Benali and O. Colliot, Automatic classification of patients with Alzheimer’s disease from structural MRI: A comparison of ten methods using the ADNI database, NeuroImage 56(2) (2011) 766–781.
35. J. Ashburner, A fast diffeomorphic image registration algorithm, NeuroImage 38 (2007) 95–113.
36. W. Penny, K. J. Friston, J. T. Ashburner, S. J. Kiebel and T. E. Nichols, Statistical Parametric Mapping: The Analysis of Functional Brain Images (Academic Press, San Diego, CA, USA, 2007).
37. I. Beheshti and H. Demirel, Feature-ranking-based Alzheimer’s disease classification from structural MRI, Magn. Reson. Imag. 34(3) (2016) 252–263.
38. S. Wang, Y. Zhang, X. Yang, P. Sun, Z. Dong, A. Liu and T. F. Yuan, Pathological brain detection by a novel image feature-fractional Fourier entropy, Entropy 17(12) (2015) 8278–8296.
39. M. S. M. Rahim, T. Saba, F. Nayer and A. Z. Syed, 3D texture features mining for MRI brain tumor identification, 3D Res. 5(1) (2014) 1–8.
40. D. S. Nachimuthu and A. Baladhandapani, Multidimensional texture characterization: On analysis for brain tumor tissues using MRS and MRI, J. Digit. Imaging 27(4) (2014) 496–506.
41. C. Raja and N. Gangatharan, A hybrid swarm algorithm for optimizing glaucoma diagnosis, Comput. Biol. Med. 63 (2015) 196–207.
42. R. R. Sharma and P. Marikkannu, Hybrid RGSA and support vector machine framework for three-dimensional magnetic resonance brain tumor classification, The Scientific World Journal 2015 (2015) 184350-1–14.
43. E. Challis, P. Hurley, L. Serra, M. Bozzali, S. Oliver and M. Cercignani, Gaussian process classification of Alzheimer’s disease and mild cognitive impairment from resting-state fMRI, NeuroImage 112 (2015) 232–243.
44. Y. Zhang, Z. Dong, P. Phillips, S. Wang, G. Ji, J. Yang and T.-F. Yuan, Detection of subjects and brain regions related to Alzheimer’s disease using 3D MRI scans based on eigenbrain and machine learning, Front. Comput. Neurosci. 9 (2015) 1–15.
45. H. Kalbkhani, M. G. Shayesteh and B. Zali-Vargahan, Robust algorithm for brain magnetic resonance image (MRI) classification based on GARCH variances series, Biomed. Signal Process. Control 8(6) (2013) 909–919.
46. E. A. El-Dahshan, A. B. M. Salem and T. H. Younis, A hybrid technique for automatic MRI brain images classification, Stud. Univ. Babes-Bolyai Inform. 54(1) (2009) 55–67.
47. S. Chaplot, L. M. Patnaik and N. R. Jagannathan, Classification of magnetic resonance brain images using wavelets as input to support vector machine and neural network, Biomed. Signal Process. Control 1(1) (2006) 86–92.
48. J. Zhang, C. Yu, G. Jiang, W. Liu and L. Tong, 3D texture analysis on MRI images of Alzheimer’s disease, Brain Imaging Behav. 6(1) (2012) 61–69.
49. H. C. Peng, F. H. Long and C. Ding, Feature selection based on mutual information: Criteria of max-dependency, max-relevance and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell. 27(8) (2005) 1226–1238.
50. T. Tekin Erguzel, C. Tas and M. Cebi, A wrapper-based approach for feature selection and classification of major depressive disorder-bipolar disorders, Comput. Biol. Med. 64 (2015) 127–137.
51. K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms (John Wiley & Sons, New York, NY, USA, 2001).
52. K. Deb, A. Pratap, S. Agarwal and T. Meyarivan, A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Trans. Evol. Comput. 6(2) (2002) 182–197.
53. F. J. Martinez-Murcia, J. M. Gorriz, J. Ramirez and A. Ortiz, A structural parametrization of the brain using hidden Markov models-based paths in Alzheimer’s disease, Int. J. Neural Syst. 26(7) (2016) 1650024.
54. B. S. Wade, S. H. Joshi, B. A. Gutman and P. M. Thompson, Machine learning on high dimensional shape data from subcortical brain surfaces: A comparison of feature selection and classification methods, Pattern Recogn. 63 (2017) 731–739.
55. O. B. Ahmed, M. Mizotin, J. Benois-Pineau, M. Allard, G. G. Catheline and C. B. Amar, Alzheimer’s disease diagnosis on structural MR images using circular harmonic functions descriptors on hippocampus and posterior cingulate cortex, Comput. Med. Imaging Graph. 44 (2015) 13–25.
56. O. B. Ahmed, J. Benois-Pineau, M. Allard, G. Catheline and C. B. Amar, Recognition of Alzheimer’s disease and mild cognitive impairment with multimodal image-derived biomarkers and multiple kernel learning, Neurocomputing 220 (2017) 98–110.
57. A. Ortiz, J. Munilla, J. M. Gorriz and J. Ramirez, Ensembles of deep learning architectures for the early diagnosis of the Alzheimer’s disease, Int. J. Neural Syst. 26(7) (2016).
58. T. Tong, K. Gray, Q. Gao, L. Chen and D. Rueckert, Multi-modal classification of Alzheimer’s disease using nonlinear graph fusion, Pattern Recogn. 63 (2017) 171–181.
59. S. Liu, S. Liu, W. Cai, H. Che, S. Pujol, R. Kikinis, D. Feng and M. J. Fulham, Multimodal neuroimaging feature learning for multiclass diagnosis of Alzheimer’s disease, IEEE Trans. Biomed. Eng. 62(4) (2015) 1132–1140.
60. A. Bin Tufail, A. Abidi, A. M. Siddiqui and M. S. Younis, Multiclass classification of initial stages of Alzheimer’s disease using structural MRI phase images, 2012 IEEE Int. Conf. Control System, Computing and Engineering (Malaysia, 2012), pp. 317–321.
61. X. Zhu, S.-W. Suk, H.-I. Lee and D. Shen, Subspace regularized sparse multitask learning for multiclass neurodegenerative disease identification, IEEE Trans. Biomed. Eng. 63(3) (2016) 607–618.
62. G. Flandin and K. Friston, Analysis of family-wise error rates in statistical parametric mapping using random field theory, Hum. Brain Mapp. (Nov. 1, 2017) 1–3.
63. J. Ashburner and K. Friston, Unified segmentation, NeuroImage 26 (2005) 839–851.
64. L. Khedher, J. Ramírez, J. M. Górriz, A. Brahim and F. Segovia, Early diagnosis of Alzheimer’s disease based on partial least squares, principal component analysis and support vector machine using segmented MRI images, Neurocomputing 151(P1) (2015) 139–150.
65. N. Sriraam and R. Shyamsunder, 3D medical image compression using 3D wavelet coders, Digit. Signal Process. 21(1) (2011) 100–109.
66. C. He, J. Dong, Y. Zheng and Z. Gao, Optimal 3D coefficient tree structure for 3D wavelet video coding, IEEE Trans. Circuits Syst. Video Technol. 13(10) (2003) 961–972.
67. C. Ding and H. Peng, Minimum redundancy feature selection from microarray gene expression data, Bioinform. Comput. Biol. 3(2) (2005) 185–206.
68. T. M. Cover, J. A. Thomas et al., Elements of Information Theory (1991), 2nd Edition, Wiley, USA.
69. G. Mirzaei, A. Adeli and H. Adeli, Imaging and machine learning techniques for diagnosis of Alzheimer disease, Rev. Neurosci. 27(8) (2016) 857–870.
70. C. Cortes and V. Vapnik, Support vector machine, Machine Learn. 20(3) (1995) 1303–1308.
71. B. E. Boser, I. M. Guyon and V. N. Vapnik, A training algorithm for optimal margin classifiers, in Proc. Fifth Annual ACM Workshop on Computational Learning Theory, Pittsburgh, Pennsylvania, USA, ACM (1992), pp. 144–152.
72. L. Khedher, I. A. Illan, J. M. Gorriz, J. Ramirez, A. Brahim and A. Meyer-Baese, Independent component analysis-support vector machine-based computer-aided diagnosis system for Alzheimer’s with visual support, Int. J. Neural Syst. 27 (2017) 1650050.
73. M. A. Aizerman, E. A. Braverman and L. Rozonoer, Theoretical foundations of the potential function method in pattern recognition learning, Autom. Remote Control 25 (1964) 821–837.
74. S. Jiji, K. A. Smitha, A. K. Gupta, V. P. M. Pillai and R. S. Jayasree, Segmentation and volumetric analysis of the caudate nucleus in Alzheimer’s disease, Eur. J. Radiol. 82(9) (2013) 1525–1530.
75. S. K. Madsen, A. J. Ho, X. Hua, P. S. Saharan, A. W. Toga, C. R. Jack, M. W. Weiner and P. M. Thompson, 3D maps localize caudate nucleus atrophy in 400 Alzheimer’s disease, mild cognitive impairment and healthy elderly subjects, Neurobiol. Aging 31(8) (2010) 1312–1325.
76. E. Bullmore and O. Sporns, Complex brain networks: Graph theoretical analysis of structural and functional systems, Nat. Rev. Neurosci. 10 (2009) 186–198.
