Brain Tumour Detection Using MRI Images

A R T I C L E  I N F O

Keywords:
Artificial intelligence
Cancer detection
Machine learning
Magnetic resonance imaging
Transformers
Tumors

A B S T R A C T

Early detection and diagnosis of brain tumors are crucial to taking adequate preventive measures, as with most cancers. On the other hand, artificial intelligence (AI) has grown exponentially, even in such complex environments as medicine. Here, we propose a framework to explore state-of-the-art deep learning architectures for brain tumor classification and detection. A development of our own, called Cross-Transformer, is also included, which consists of three scalar products that combine the keys, queries, and values of self-attention models. Initially, we focused on the classification of three types of tumors: glioma, meningioma, and pituitary. The InceptionResNetV2, InceptionV3, DenseNet121, Xception, ResNet50V2, VGG19, and EfficientNetB7 networks were trained with the Figshare brain tumor dataset. Over 97 % of classifications were accurate in this experiment, which provided an overview of the networks' performance. Subsequently, we focused on tumor detection using the Brain MRI Images for Brain Tumor Detection and The Cancer Genome Atlas Low-Grade Glioma databases. The development encompasses transfer learning, data augmentation, and image acquisition sequences: T1-weighted images (T1WI), T1-weighted post-gadolinium (T1-Gd), and Fluid-Attenuated Inversion Recovery (FLAIR). Based on the results, using transfer learning and data augmentation increased accuracy by up to 6 %, with a p-value below the significance level of 0.05. As well, the FLAIR sequence was the most efficient for detection. In addition, our proposed model proved to be the most efficient in terms of training time, using approximately half the time of the second-fastest network.
1. Introduction

Cancer is one of the most common diseases worldwide, with an estimated 1.8 million new cases and more than 600,000 deaths in 2020 in the United States alone [1,2]. Cancer is a disease characterized by the uncontrolled growth of abnormal cells in the body. It is caused by mutations or changes in the function of cells [3], which leads to the loss of the cell's ability to undergo programmed cell death [4]. This results in the formation of tumors and affects various organs and tissues [5,6]. Depending on the affected organ, cancer can be difficult to detect or can cause treatment complications [7,8]. For example, brain cancer involves parts of the central nervous system (CNS), making it difficult to perform surgery or radiotherapy to remove the affected regions [9].

Brain tumors, while they rarely spread to other parts of the body, can still be dangerous, as they can grow quickly and damage brain tissue as they diffuse to nearby areas. The growth can press on brain tissue, causing high-impact complications even if the tumors are benign [10,11]. Brain tumors account for approximately 2.17 % of all cancer deaths, and the five-year survival rate is low, at around 5.6 % for glioblastoma [12]. The impact of brain tumors and the concerning statistics have motivated ongoing research in the field [13], with physicians and scientists searching for ways to prevent tumors, more efficient treatments, better diagnostic tests, and better ways to study and classify tumors [14,15]. This research includes new methods for exploring brain anatomy and the development of AI systems [16].

Several tools can be used to detect brain abnormalities; computed tomography (CT), positron emission tomography (PET), magnetoencephalography (MEG), and magnetic resonance imaging (MRI) are among the most used [17,18]. MRI is considered the most popular and effective method for detecting brain abnormalities because
* Corresponding author.
E-mail address: [email protected] (L. Mera-Jiménez).
https://doi.org/10.1016/j.ejro.2023.100484
Received 25 August 2022; Received in revised form 28 February 2023; Accepted 1 March 2023
Available online 14 March 2023
2352-0477/© 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
A. Anaya-Isaza et al. European Journal of Radiology Open 10 (2023) 100484
it can distinguish between different structures and tissues and it does not use ionizing radiation, making it safe for patients [19]. AI has been applied in the field of brain tumor detection, classification, segmentation, diagnosis, and evolution [15,20–22]. The application of AI, especially DL-based methods, has demonstrated high levels of accuracy, comparable to that of expert radiologists [23–25].

Various developments in AI have led to multiple models or architectures being developed to handle various tasks, but on natural images [26–29]. However, these results cannot be fully extrapolated to medical imaging because of the particular physical and physiological characteristics it records [30]. In line with this, machine learning approaches have been developed for the detection of neoplasms, such as the use of basic algorithms like k-Nearest Neighbor (kNN) with an accuracy of 98.2 % [31], or the use of principal component analysis (PCA) with a sensitivity, specificity, and accuracy of 97.36 %, 100 %, and 95.0 % [32]. Another example is the use of a support vector machine (SVM) to differentiate between benign and malignant tumors, with an accuracy of 99.24 %, precision of 95.83 %, and recall of 95.30 % [33]. Lastly, an ensemble method comprising Bagging Classifier, Random Forest, Extra Trees Classifier, Gradient Boosting, Extreme Gradient Boosting, and Adaptive Boosting algorithms has been used to achieve an accuracy of 94.07 %, precision of 90.78 %, recall of 93.33 %, specificity of 94.44 %, and F1 score of 91.52 % [34].

In the field of Deep Learning, advancements have been made in detecting physiological anomalies using Deep Belief Networks (DBN) with an accuracy of over 94.11 % [35]. Other developments include the detection of brain metastasis using single-shot detection models on CT scans with a sensitivity of 88.7 % [36] and the use of Wasserstein generative adversarial networks (WGAN) for cancer diagnosis [37]. Additionally, feature-based artificial neural networks (ANN) and the Extended Set-Membership Filter (ESMF) have been used to diagnose brain tumors with accuracies of 97.14 % and 88.24 %, respectively. Future research should focus on classifying abnormalities into benign and malignant tumors [38]. In that vein, some research utilizes more robust DL-oriented approaches, such as a convolutional network with a new regularization method called Mixed-Pooling-Dropout, which results in a classification accuracy of 92.6 % compared to 86.8 % for traditional clustering methods [39]. Another study employed DarkNet for brain tumor classification and segmentation, incorporating data augmentation and transfer learning with a Figshare database [40] of 708 meningiomas, 1426 gliomas, and 930 pituitary tumors, achieving an accuracy of 98.54 % [41].

Tandel et al. used a combination of five DL networks (AlexNet, VGG16, ResNet18, GoogleNet, and ResNet50) and five conventional machine learning algorithms (Support Vector Machine, K-Neighbors, Naive Bayes, Decision Tree, and Linear Discrimination) to classify by majority vote using a five-fold cross-validation scheme. They also used data augmentation methods such as scaling and rotation and incorporated transfer learning. They achieved average scores over 97.10 % in accuracy, sensitivity, specificity, area under the receiver operating characteristic curve, positive predictive value, and negative predictive value [42].

The results of using deep learning networks for the classification and detection of brain tumors on magnetic resonance images are promising and demonstrate the high effectiveness of these networks. However, the potential of these strategies has only recently been explored, and many configurations may affect the performance of AI [43]. As a result of this experimental framework, this work makes the following contributions:

• We developed a new architecture based on attention models, like the Transformer network, which we call Cross-Transformer.
• An overview of artificial intelligence systems in detection and classification was performed.
• Seven novel deep-learning networks were compared for brain tumor classification.
• Seven novel deep-learning networks were compared for brain tumor detection. Additionally, an assessment of the influence of data augmentation and transfer learning was carried out.
• The experiment was repeated for all seven novel networks, and a comparative analysis of the three most common acquisition sequences was performed. Moreover, we included the novel architecture we named Cross-Transformer together with the seven networks.

2. Materials and methods

2.1. Dataset

As previously mentioned, Magnetic Resonance Imaging (MRI) is a widely accepted and reliable method for identifying brain abnormalities due to its ability to differentiate between various structures and tissues within the brain. Therefore, this paper addresses only three principal sequences: T1WI, T1-Gd, and FLAIR. Images were collected from multiple centers and institutions to ensure that the MRI data used is diverse. The datasets used for this study are described in detail in Table 1 and Fig. 1, which also shows examples of various images obtained from the three datasets: the Brain Tumor Dataset (BTD), the Magnetic Resonance Imaging Dataset (MRI-D), and The Cancer Genome Atlas Low-Grade Glioma database (TCGA-LGG). For details regarding the pre-processing techniques implemented on the images, please refer to the supplementary material.

Table 1
Datasets used for training convolutional neural networks for brain tumor detection.

Dataset    Subjects   Sequences            Slices                        Classes       Images per class
BTD        233        T1-Gd                Axial, coronal and sagittal   Meningioma    708
                                                                         Glioma        1426
                                                                         Pituitary     930
MRI-D      253        T1WI                 Axial                         Tumors        155
                                                                         Not tumors    98
TCGA-LGG   110        T1WI, T1-Gd, FLAIR   Axial                         Tumors        1373
                                                                         Not tumors    2556

2.2. Performance evaluation metrics

The performance of the networks in detecting or classifying brain tumors was evaluated by computing F1 score, accuracy, sensitivity, specificity, and precision. All these metrics are expressed mathematically as Eqs. (1) to (5) in Table 2.

2.3. Experimental design

The initial datasets were divided into two sets, with 80 % and 20 % proportions for training and testing, respectively. For an understanding of the neural networks utilized in this study and their key attributes, please refer to the supplementary material. All networks were trained using the following hyperparameters:

• Loss function: Categorical cross-entropy.
• Optimizer: Adadelta.
• Epochs: 50.
• Validation: 10-fold cross-validation.
• Number of repeated runs per fold: 3.
• Batch size: 4.
• Initialization of weights: Glorot uniform.
• Bias initialization: Zeros.

Three experiments were conducted to classify and detect brain tumors using MRI data and the seven most novel CNNs described before. In the first experiment, the BTD dataset was used (see Table 1) to classify
Fig. 1. Samples of A) the three types of brain tumors in the BTD database, B) the MRI-D database with the two classes: tumors and non-tumors, and C) the TCGA-LGG database with the three types of sequences and the two classes: tumors and non-tumors.
the three tumor types; in the second experiment, tumor detection was performed with the MRI-D dataset (see Table 1). Finally, the detection experiment was repeated on the TCGA-LGG database, using three types of acquisition sequences: T1WI, FLAIR, and T1-Gd, and the proposed new network called Cross-Transformer (see supplementary material) was included. The performance of each training was evaluated using metrics such as accuracy, sensitivity, specificity, and F1 score. The results were compared using the nonparametric Kruskal-Wallis test. The architectures were modeled in Python using libraries such as Keras and TensorFlow. The experiments were run on the Colab platform using a Tesla T4 GPU and 25 GB of RAM.

Table 2
Performance evaluation metrics.

Metric                          Equation
Accuracy [44]                   ACC = (TP + TN) / (TP + TN + FP + FN)   (1)
F1 score [45]                   F1 = 2TP / (2TP + FP + FN)              (2)
Sensitivity or Recall [45,46]   SE = TP / (TP + FN)                     (3)
Specificity [44,46]             SP = TN / (TN + FP)                     (4)
Precision [45]                  Pr = TP / (TP + FP)                     (5)

Where the terms TP, TN, FP, and FN are the true positives, true negatives, false positives, and false negatives, respectively.

3. Results

3.1. Tumor classification – BTD dataset

Table 3
Maximum scores achieved by the seven DL neural networks on the test data. BTD data.

Table 4
Maximum scores achieved by the seven DL neural networks as a function of the three classes.

Class        F1_score   Accuracy   Sensitivity   Specificity   Precision
Pituitary    95.39      97.22      97.81         100.00        100.00
Glioma       93.59      93.94      96.15         100.00        97.67
Meningioma   82.71      92.14      85.92         100.00        100.00
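The five metrics defined in Table 2 can be computed directly from the confusion-matrix counts. The following Python sketch is our illustration of Eqs. (1)–(5), not code from the paper:

```python
# Sketch (not from the paper): the five metrics of Table 2 computed
# from confusion-matrix counts TP, TN, FP, FN.

def metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Return accuracy, F1, sensitivity, specificity, precision (Eqs. (1)-(5))."""
    acc = (tp + tn) / (tp + tn + fp + fn)   # Eq. (1)
    f1 = 2 * tp / (2 * tp + fp + fn)        # Eq. (2)
    se = tp / (tp + fn)                     # Eq. (3), sensitivity / recall
    sp = tn / (tn + fp)                     # Eq. (4), specificity
    pr = tp / (tp + fp)                     # Eq. (5), precision
    return {"accuracy": acc, "f1": f1, "sensitivity": se,
            "specificity": sp, "precision": pr}

# Illustrative counts, not the study's data.
print(metrics(tp=90, tn=80, fp=10, fn=20))
```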
Table 5
P-value evaluated between the seven different neural networks using the Kruskal-Wallis test. Classification of the three types of tumors.
Fig. 2. Score distributions generated by the different trainings, evaluated with the test data. A) accuracy, B) sensitivity, and C) specificity. D) metrics as a function of tumor type for the best performing network (InceptionResNetv2).
The networks were capable of identifying all three types of tumors, with high accuracy, specificity, and precision values close to 100 %. Pituitary tumors achieved the highest scores, with an accuracy approximately 3 % higher than gliomas and 5 % higher than meningiomas.

As shown in Table 5, the Kruskal-Wallis test generates a p-value between the different networks' F1 scores. In most cases, the p-value is below the significance level (α = 0.05), meaning the networks have statistically different distributions. Notably, InceptionResNetV2 and DenseNet121 have a p-value of 0.77; this is consistent with the box-and-whisker plots in Fig. 2 for accuracy, sensitivity, and specificity, which show similar ranges for both networks, particularly in the interquartile ranges.
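Pairwise comparisons like those in Table 5 can be reproduced with SciPy's implementation of the Kruskal-Wallis H-test. The per-fold F1 scores below are invented for illustration, not the study's data:

```python
# Illustrative only: comparing two networks' per-fold F1 scores with the
# Kruskal-Wallis H-test, as done throughout the paper. The score lists
# here are made up, not the study's data.
from scipy.stats import kruskal

f1_net_a = [0.95, 0.96, 0.94, 0.95, 0.97, 0.96, 0.95, 0.94, 0.96, 0.95]
f1_net_b = [0.91, 0.92, 0.90, 0.93, 0.91, 0.92, 0.90, 0.91, 0.93, 0.92]

stat, p_value = kruskal(f1_net_a, f1_net_b)
# Reject the null hypothesis of equal distributions when p < 0.05.
print(f"H = {stat:.2f}, p = {p_value:.4f}")
```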
The performance of InceptionResNetV2 in tumor classification is
evaluated in Fig. 2D using the accuracy, sensitivity, and specificity
metrics. The plots show that all metrics are close to 90 % for pituitary
tumors, except for sensitivity for meningioma, which is significantly
lower compared to the accuracy and specificity metrics. Additionally,
pituitary tumors exhibit more homogeneous distributions compared to
glioma and meningioma.
Fig. 3. Average training curves and 95 % error bands for the best performing network (InceptionResNetv2). A) accuracy as a function of epochs and B) loss as a function of epochs, with training and validation data.

The training and validation results for InceptionResNetv2 are presented in Fig. 3, which includes the loss function and accuracy metric. The curves display the average of multiple runs with a 95 % error band. The validation and training curves demonstrate similar patterns, indicating good generalization ability and low or no overtraining. In Fig. 3A, the accuracy values are around 0.95, consistent with the findings in Fig. 2A. Fig. 3B shows loss values close to 0.1. It's worth noting that the training and validation curves intersect at epoch 35, suggesting that
Table 6
Maximum scores achieved by the seven DL neural networks in detection under the four training conditions.
Fig. 4. Score distributions generated by the different trainings, evaluated with the test data. A) Accuracy, B) Sensitivity, C) Specificity, and D) F1 score. The distributions are shown for the four training conditions, i.e., without any strategy, transfer learning, data augmentation, and combined transfer learning and data augmentation. MRI-D dataset.
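The four training conditions named in the caption above can be sketched in Keras. The snippet below illustrates the strongest condition (transfer learning combined with data augmentation) with an InceptionResNetV2 base; the input size, augmentation ranges, and classifier head are our assumptions, not the authors' exact configuration:

```python
# Sketch of the "T&D" condition (transfer learning + data augmentation).
# Not the authors' exact pipeline: input size, augmentation ranges, and
# the classifier head are assumptions for illustration.
from tensorflow import keras
from tensorflow.keras import layers

# Transfer learning: ImageNet weights, convolutional base frozen.
base = keras.applications.InceptionResNetV2(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
base.trainable = False

# Data augmentation applied on the fly during training.
augment = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

inputs = keras.Input(shape=(224, 224, 3))
x = augment(inputs)
x = base(x, training=False)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(2, activation="softmax")(x)  # tumor / not tumor

model = keras.Model(inputs, outputs)
model.compile(optimizer="adadelta", loss="categorical_crossentropy",
              metrics=["accuracy"])
```

The other conditions follow by dropping the `augment` stage (T), setting `weights=None` (D), or both (N).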
the model could achieve high performance with fewer epochs.

3.2. Tumor detection – MRI-D dataset

Table 6 reports the accuracy and F1 score of seven networks trained under four different conditions (see supplementary material): training from scratch (N), transfer learning (T), data augmentation (D), and both transfer learning and data augmentation (T&D). The table shows that all networks achieved high performance, with scores over 90 %. The F1 score improved by 3.4 % over training from scratch and by 1 % over data augmentation, while transfer learning led to a 6 % increase in accuracy compared to training from scratch. The networks with the highest peak performance were DenseNet121, InceptionV3, and VGG19, while Xception had the lowest scores (Fig. 4).

In Table 7, the F1 score and precision metrics are presented for the two classes, tumor and not-tumor. The results demonstrate that the combination of data augmentation and transfer learning significantly enhances network performance, with transfer learning being the more effective of the two in terms of the F1 score. Specifically, the fourth training condition (T&D) resulted in the highest scores for the tumor class, achieving a 3 % improvement in class differentiation compared to training without any strategy, which resulted in an 11 % difference
Table 7
Maximum scores achieved by the seven DL neural networks as a function of the two classes and the four training conditions.

Table 8
P-value evaluated among the seven neural networks using the Kruskal-Wallis test in tumor detection. The statistic was calculated with the scores of all training conditions.

Table 9
P-value evaluated between the four training conditions for each of the seven convolutional neural networks using the Kruskal-Wallis test. Tumor detection.
Table 10
Maximum scores achieved by the eight DL neural networks in detection, with the three different image acquisition sequences.

Sequence   Metric     InceptionResNetV2   Cross-Transformer   Xception   DenseNet121   InceptionV3   ResNet50V2   EfficientNetB7   VGG19
T1WI       F1_score   89.72               82.89               89.30      87.94         86.14         86.35        79.63            79.69
FLAIR      F1_score   93.45               84.76               91.95      91.99         88.85         87.29        80.13            79.44
T1-Gd      F1_score   89.42               82.84               88.45      86.83         84.80         85.85        79.26            78.83
T1WI       Accuracy   86.53               88.06               85.90      84.88         82.21         81.83        71.03            70.65
FLAIR      Accuracy   91.36               89.58               89.33      89.20         84.63         83.61        70.01            68.36
T1-Gd      Accuracy   85.90               88.31               85.13      83.35         80.81         81.07        68.74            65.06
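The Cross-Transformer in Table 10 recombines the queries, keys, and values of attention models through three scalar products (details in the supplementary material). For orientation, the standard scaled dot-product attention from which those terms come can be sketched as follows; this is the generic operation, not the authors' architecture:

```python
# Generic scaled dot-product attention (the building block that defines
# queries, keys, and values). This is NOT the Cross-Transformer itself,
# whose exact combination of scalar products is in the supplementary
# material.
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """q, k, v: arrays of shape (seq_len, d)."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)  # similarity of queries and keys
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v             # weighted sum of values

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 8)) for _ in range(3))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (4, 8)
```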
These results establish the FLAIR sequence, followed by T1WI and T1-Gd, as the preferred imaging technique.

As previously stated, the InceptionResNetV2 model demonstrated the highest efficacy in identifying brain tumors. Furthermore, the network showed statistically significant differences compared to most of the other networks. The p-value generated by the Kruskal-Wallis test suggests that InceptionResNetV2 is similar to Xception, with a significance level of 0.07 (see Table 12).

Table 13 displays the Kruskal-Wallis p-value for each network in comparison to the various imaging sequences. In most networks, the FLAIR sequence demonstrated a significant difference, except for the EfficientNetB7 and VGG19 networks. However, for the T1WI and T1-Gd sequences, no significant differences were observed concerning the other networks. Notably, only the InceptionResNetV2 and DenseNet121 networks exhibited p-values below the significance level.

Fig. 7 demonstrates the distribution of scores by network and image acquisition sequence. The box-and-whisker plots in Fig. 7A reveal that the FLAIR sequence had a higher distribution than the other two sequences in each network, signifying higher accuracy in detecting brain tumors. Moreover, the sensitivity, specificity, and F1 score metrics displayed a pattern similar to the accuracy.

In Fig. 8, for InceptionResNetV2, the accuracy of both the training and validation curves increased with time, converging above 0.9. Meanwhile, the loss curves (see Fig. 8B) decreased for training and validation, converging to approximately 0.5. In this case, the validation curves were better than the training curves; therefore, a larger number of epochs would be required to improve the model's performance.

Similarly, Fig. 9 depicts the training accuracy and model loss for Cross-Transformer, where the training and validation curves are illustrated. The proposed model shows excellent performance, with the accuracy metric increasing to above 0.9 as the number of epochs increases. However, the validation score was lower than the training score after epoch 40, indicating model overtraining.

Although the metric reached higher values during training, the proposed model's distributions in Fig. 9 were partially lower than those of the InceptionResNetv2 network. Additionally, the model loss reached low values; however, the validation curve returned values higher than the training curve after epoch 25, again showing partial overtraining.

As a final point, Fig. 10 shows the average training time of the eight architectures implemented in this study. The results indicate that the proposed model (Cross-Transformer) is highly efficient. On average, the proposed model took approximately 18 min to train 50 epochs with almost three thousand images. In comparison, training InceptionResNetV2 took 5× longer and EfficientNetB7 took 9× longer under the same conditions.

The training time was checked with the Kruskal-Wallis test. The p-values are shown in Table 14, which demonstrates that the proposed model (Cross-Transformer) showed statistically significant differences from the other seven models. The model with the highest p-value was the ResNet50V2 network, which reached a significance level of 0.09.

Table 11
Maximum scores achieved by the eight DL neural networks as a function of the two classes and the three image acquisition sequences.

Sequence   Class       F1_score   Accuracy
T1WI       Tumor       81.46      86.53
           Not tumor   89.72      88.06
FLAIR      Tumor       87.31      91.36
           Not tumor   93.45      91.36
T1-Gd      Tumor       79.93      85.90
           Not tumor   89.42      88.31

Table 12
P-value evaluated among the eight neural networks using the Kruskal-Wallis test in tumor detection. The statistic was calculated with all the scores of the three image acquisition sequences.

4. Discussion

The primary focus of this research is to investigate the aggressiveness of cancer, particularly in the context of brain tumors, and their potential to cause severe complications, regardless of whether they are malignant or benign. To achieve this goal, various datasets and deep learning neural networks were employed to detect and classify different types of brain tumors, such as meningioma, glioma, and pituitary tumors. The study utilized several classification networks, including ResNet50V2, EfficientNetB7, InceptionResNetV2, InceptionV3, VGG19, Xception, and DenseNet121. Additionally, the study evaluated the acquisition sequences of magnetic resonance imaging (MRI) scans, including T1WI, FLAIR, and T1-Gd, in order to improve the accuracy of the classification process. The ultimate objective of this research is to detect brain tumors quickly and effectively to prevent potentially serious complications.

In this study, the InceptionResNetV2 network was used to predict pituitary tumors and was found to be highly accurate, with accuracy scores above 97 %. The results were surprising for two main reasons: firstly, the
dataset of pituitary tumors was smaller compared to that of glioma tumors, and secondly, pituitary tumors are typically smaller in size and harder to detect visually (see Fig. 1). However, the pituitary tumors exhibited homogeneous behavior, which led to high classification scores. The study used a dataset of 1426 gliomas, 930 pituitary tumors, and 708 meningiomas. Despite the large dataset, the models performed better in detecting pituitary tumors than gliomas. The results suggest that gliomas may be harder to detect due to their physiological characteristics in MRI images. The study also indicates that networks can identify some pathologies more easily even with smaller dataset sizes. Further studies are needed to understand why pituitary tumors are easier for DL networks to detect.

In the case of meningiomas, they presented performance metrics below the other two types of tumors (low sensitivity and F1 score), which is in line with the research conducted by Swati et al., where it was established that meningioma presented an F1 score of 88.88 %, in contrast to 94.52 % and 91.80 % for glioma and pituitary, respectively [47]. Those scores showed the high effectiveness of DL models and were the highest in the state of the art at the time; our research achieved F1 scores of 82.71, 93.59, and 95.39 in the classification of meningiomas, gliomas, and pituitary tumors, respectively (see Table 4). Furthermore, this confirms the need to explore new DL networks, because Swati et al. only focused on the AlexNet and VGG19 networks.

Secondly, we investigated the diagnosis of brain tumors, i.e., whether the MRI images revealed any physiological features characteristic of brain tumors. The experiment was performed on the MRI-D database. The results confirmed that data augmentation strategies combined with transfer learning significantly improved the model's performance, increasing accuracy by up to 6 % (InceptionResNetV2 network). On the other hand, training with data augmentation also improved the network's performance, but to a lesser extent than with transfer learning. However, this does not imply that data augmentation alone is sufficient to improve network performance, since, as Sugimori et al. showed, data augmentation is effective on smaller data sets [48]. Moreover, the scores indicate that the difference between the metrics

Table 13
P-value evaluated between the three image acquisition sequences for each of the eight ANNs using the Kruskal-Wallis test. Tumor detection.

Network             FLAIR vs. T1WI   FLAIR vs. T1-Gd   T1WI vs. T1-Gd
InceptionResNetV2   0.00             0.00              0.04
Cross-Transformer   0.00             0.00              0.67
Xception            0.00             0.00              0.25
DenseNet121         0.00             0.00              0.04
InceptionV3         0.00             0.00              0.13
ResNet50V2          0.03             0.00              0.35
EfficientNetB7      0.94             0.22              0.24
VGG19               0.17             0.60              0.07
Fig. 7. Score distributions generated by the different trainings, evaluated with the test data. A) Accuracy, B) Sensitivity, C) Specificity, and D) F1 score. The distributions are shown for the three image acquisition sequences, i.e., T1WI, FLAIR, and T1-Gd. TCGA-LGG dataset.
Fig. 10. The average training time of the 8 architectures, with 95 % confidence intervals (black lines).
Table 14
P-value evaluated between the eight neural networks using the Kruskal-Wallis test on training time for tumor detection. The statistic was estimated with all scores from the three image acquisition sequences.
5. Conclusions

An analysis of brain tumor classification and detection on magnetic resonance imaging has been performed using different datasets. Initially, we evaluated the seven most recent neural networks for the classification of meningioma-, glioma-, and pituitary-type tumors (BTD database). The results indicate that these neural networks are excellent detection and classification algorithms. Firstly, the InceptionResNetV2 network achieved up to 97 % accuracy, outperforming the InceptionV3, DenseNet121, Xception, ResNet50V2, VGG19, and EfficientNetB7 networks with a significance level of 0.07 in the Kruskal-Wallis test. Additionally, InceptionResNetV2 provided the most homogeneous distribution, ensuring its high effectiveness.

Moreover, it was noted that pituitary tumors are distinguished from meningiomas and gliomas, even though the former has a lower number of images. After this, the MRI-D dataset was used to detect brain tumors by incorporating transfer learning and data augmentation. Together, these two strategies increased the accuracy of InceptionResNetV2 by up to 6 % over the model trained from scratch. Further, such a combination was statistically different from networks trained under the other conditions, such as training with only transfer learning or only data augmentation. In fact, of the 7 networks cited above, only InceptionV3 and Xception were statistically significant at the 0.05 level. Finally, the detection was replicated on TCGA-LGG data by examining the T1WI, FLAIR, and T1-Gd acquisition sequences. A new network was introduced to the experiment, referred to as the Cross-Transformer. Results showed that the FLAIR sequence is more suitable for brain tumor detection, with a significance level of less than 0.03 in six of the eight networks, the exceptions being EfficientNetB7 and VGG19. Additionally, it was shown that the Cross-Transformer achieved accuracy values close to 90 % while using a fraction of the training time of the second-fastest network, ResNet50V2.

Funding

No funding was received for this work.

CRediT authorship contribution statement

Conceptualization, Anaya-Isaza and Mera-Jiménez; Methodology, Anaya-Isaza and Mera-Jiménez; Software, Anaya-Isaza and Mera-Jiménez; Validation, Mera-Jiménez; Formal analysis, Anaya-Isaza; Investigation, Anaya-Isaza; Resources, Anaya-Isaza and Mera-Jiménez; Writing – original draft, Anaya-Isaza, Mera-Jiménez, Verdugo-Alejo and Sarasti-Ramírez; Writing – review & editing, Anaya-Isaza, Verdugo-Alejo, Mera-Jiménez and Sarasti-Ramírez; Visualization, Anaya-Isaza and Mera-Jiménez; Supervision, Anaya-Isaza and Mera-Jiménez; Project administration, Anaya-Isaza and Mera-Jiménez; Funding acquisition, Anaya-Isaza. All authors have read and agreed to the published version of the manuscript.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This research was supported by the research division of INDIGO Technologies (https://indigo.tech/). The results published here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.
Appendix
A. Hyperparameters
Adadelta: A stochastic gradient descent method based on an adaptive learning rate per dimension for the optimization of training parameters, i.e., to adjust model parameters or weights during training [49].
Batch size: The number of samples processed before updating the model weights [50]. The larger the batch size, the faster the training, but it
requires more RAM.
Bias initialization: Values taken by the bias (b_j) of the model before model training is started (see Eq. (1)).
Categorical cross-entropy: Loss function based on the logarithmic difference (see Eq. (6)) between two probability distributions of random data or sets of events. Its use focuses on the classification of set elements [51]. In the case of images, this principle can be applied to image pixels, where each element is cataloged into two possible categories: background and object of interest.

L_BCE(y, ŷ) = −(y log(ŷ) + (1 − y) log(1 − ŷ))   (6)

Here y are the actual labels and ŷ are the values predicted by the model.
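Eq. (6) can be checked numerically; the predictions below are illustrative values, not data from the paper:

```python
# Numerical check of Eq. (6): binary cross-entropy for one prediction.
# The label/prediction values are illustrative.
import math

def bce(y: float, y_hat: float) -> float:
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

print(round(bce(1.0, 0.9), 4))  # confident correct prediction -> 0.1054
print(round(bce(1.0, 0.1), 4))  # confident wrong prediction -> 2.3026
```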
A. Anaya-Isaza et al., European Journal of Radiology Open 10 (2023) 100484
Cross-validation: A technique used to evaluate the performance of artificial intelligence models while guaranteeing independence between the training and validation partitions. The data set is divided into a given number of subsets; one subset is held out for validation and the model is trained on the remaining subsets. The process is repeated so that a different subset is used for validation in each run [52].
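The splitting scheme can be sketched as follows (illustrative NumPy code, not the pipeline used in the study; `k_fold_indices` is our own helper name):

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once, then yield (train, val) index pairs
    so that each of the k subsets serves as the validation set exactly once."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)           # k near-equal subsets
    for i in range(k):
        val = folds[i]
        train = np.concatenate(folds[:i] + folds[i + 1:])
        yield train, val

# 5-fold split of 100 samples: every sample is validated exactly once
splits = list(k_fold_indices(100, k=5))
```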
Epochs: Number of times the model training is repeated with the whole data set.
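A toy loop (our own illustration, not the study's training code) showing how epochs and batch size together determine the number of weight updates:

```python
import numpy as np

X = np.zeros((320, 8))          # toy data set of 320 samples
batch_size, epochs = 32, 3
updates = 0
for epoch in range(epochs):                  # one epoch = one full pass over X
    for start in range(0, len(X), batch_size):
        batch = X[start:start + batch_size]  # samples processed before one update
        updates += 1                         # the optimizer would update weights here

# 320 samples / 32 per batch = 10 updates per epoch, 30 updates in total
```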
Loss function: A function that determines the difference between the actual data and the data predicted by the network or model [53].
Optimizer: The method by which the gradient (or a gradient variant) with respect to the training parameters is computed and used to adjust those parameters towards values that reduce the loss function [54].
Performance metrics: Functions to monitor and measure model performance based on actual and predicted model values [55].
Training parameters: Coefficients of the model's mathematical operations that are iteratively adjusted during the training process (e.g., weights and biases).
Initialization of weights: Values taken by the training model parameters or weights before the model training is started. In the case of a convolutional network, the weights are those that make up the convolutional filter K_ij (see Eq. (1)).
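For illustration, one common scheme is the He initialization of [25] for ReLU networks (a sketch under our own assumptions, not necessarily the scheme used in this study):

```python
import numpy as np

def he_init(shape, seed=0):
    """Draw initial filter weights K_ij from N(0, 2 / fan_in), where fan_in is
    the number of inputs feeding each output unit [25]."""
    fan_in = int(np.prod(shape[:-1]))
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=shape)

K = he_init((3, 3, 64, 128))   # 3x3 filters, 64 input / 128 output channels
b = np.zeros(128)              # biases b_j are commonly initialized to zero
```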
B. Glossary
Supplementary data associated with this article can be found in the online version at doi:10.1016/j.ejro.2023.100484.
References
[1] American Cancer Society, Cancer Facts & Figures 2020, 2020, pp. 1–76.
[2] Cancer, https://www.who.int/en/news-room/fact-sheets/detail/cancer (accessed 17 November 2020).
[3] T.M. Mack, What a cancer is. Cancers in the Urban Environment, Elsevier, 2021, pp. 5–8, https://doi.org/10.1016/B978-0-12-811745-3.00003-3.
[4] S.D. Ray, N. Yang, S. Pandey, N.T. Bello, Y.J.P. Gray, Apoptosis. Reference Module in Biomedical Sciences, Elsevier, 2019, https://doi.org/10.1016/B978-0-12-801238-3.62145-1.
[5] J.R. Foster, Introduction to Neoplasia. Comprehensive Toxicology, Elsevier, 2018, pp. 1–10, https://doi.org/10.1016/B978-0-12-801238-3.02217-0.
[6] J. Yokota, Tumor progression and metastasis, Carcinogenesis 21 (3) (2000) 497–503, https://doi.org/10.1093/carcin/21.3.497.
[7] D.E. Ost, Y.M.K. Gould, Decision making in patients with pulmonary nodules, Am. J. Respir. Crit. Care Med. 185 (4) (2012) 363–372, https://doi.org/10.1164/rccm.201104-0679CI.
[8] A. Auvinen, Y.M. Hakama, Cancer screening: theory and applications. International Encyclopedia of Public Health, Elsevier, 2017, pp. 389–405, https://doi.org/10.1016/B978-0-12-803678-5.00050-3.
[9] R. Huang, J. Boltze, Y.S. Li, Strategies for improved intra-arterial treatments targeting brain tumors: a systematic review, Front. Oncol. 10 (2020), https://doi.org/10.3389/fonc.2020.01443.
[10] S.-J. Moon, D.T. Ginat, R.S. Tubbs, Y.M.D. Moisi, Tumors of the brain. Central Nervous System Cancer Rehabilitation, Elsevier, 2019, pp. 27–34, https://doi.org/10.1016/B978-0-323-54829-8.00004-4.
[11] H. Sontheimer, Brain tumors. Diseases of the Nervous System, Elsevier, 2021, pp. 207–233, https://doi.org/10.1016/B978-0-12-821228-8.00009-3.
[12] N. Reynoso-Noverón, A. Mohar-Betancourt, Y.J. Ortiz-Rafael, Epidemiology of brain tumor. Principles of Neuro-Oncology, Springer International Publishing, Cham, 2021, pp. 15–25, https://doi.org/10.1007/978-3-030-54879-7_2.
[13] N. Turner, Y.N. Vidovic, Cancer health concerns. Reference Module in Food Science, Elsevier, 2018, https://doi.org/10.1016/B978-0-08-100596-5.22577-8.
[14] O. Troyanskaya, Z. Trajanoski, A. Carpenter, S. Thrun, N. Razavian, Y.N. Oliver, Artificial intelligence and cancer, Nat. Cancer 1 (2) (2020) 149–152, https://doi.org/10.1038/s43018-020-0034-6.
[15] W.L. Bi, et al., Artificial intelligence in cancer imaging: clinical challenges and applications, CA Cancer J. Clin. (2019), caac.21552, https://doi.org/10.3322/caac.21552.
[16] A. Hosny, C. Parmar, J. Quackenbush, L.H. Schwartz, Y.H.J.W.L. Aerts, Artificial intelligence in radiology, Nat. Rev. Cancer 18 (8) (2018) 500–510, https://doi.org/10.1038/s41568-018-0016-5.
[17] M.I. Sharif, J.P. Li, J. Naz, Y.I. Rashid, A comprehensive review on multi-organs tumor detection based on machine learning, Pattern Recognit. Lett. 131 (2020) 30–37, https://doi.org/10.1016/j.patrec.2019.12.006.
[18] K.R. Bhatele, Y.S.S. Bhadauria, Brain structural disorders detection and classification approaches: a review, Artif. Intell. Rev. 53 (5) (2020) 3349–3401, https://doi.org/10.1007/s10462-019-09766-9.
[19] R. Pauli, Y.M. Wilson, The basic principles of magnetic resonance imaging. Encyclopedia of Behavioral Neuroscience, second ed., Elsevier, 2022, pp. 105–113, https://doi.org/10.1016/B978-0-12-819641-0.00108-0.
[20] M.T. Duong, A.M. Rauschecker, Y.S. Mohan, Diverse applications of artificial intelligence in neuroradiology, Neuroimaging Clin. N. Am. 30 (4) (2020) 505–516, https://doi.org/10.1016/j.nic.2020.07.003.
[21] M. Nazir, S. Shakil, Y.K. Khurshid, Role of deep learning in brain tumor detection and classification (2015 to 2020): a review, Comput. Med. Imaging Graph. 91 (2021), 101940, https://doi.org/10.1016/j.compmedimag.2021.101940.
[22] A. Işın, C. Direkoğlu, Y.M. Şah, Review of MRI-based brain tumor image segmentation using deep learning methods, Procedia Comput. Sci. 102 (2016) 317–324, https://doi.org/10.1016/j.procs.2016.09.407.
[23] R. Aggarwal, et al., Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis, Npj Digit. Med. 4 (1) (2021) 65, https://doi.org/10.1038/s41746-021-00438-z.
[24] S. Serte, A. Serener, Y.F. Al-Turjman, Deep learning in medical imaging: a brief review, Trans. Emerg. Telecommun. Technol. (2020), https://doi.org/10.1002/ett.4080.
[25] K. He, X. Zhang, S. Ren, Y.J. Sun, Delving deep into rectifiers: surpassing human-level performance on ImageNet classification, Proc. IEEE Int. Conf. Comput. Vis. 2015 (2015) 1026–1034, https://doi.org/10.1109/ICCV.2015.123.
[26] J. Schmidhuber, Deep learning in neural networks: an overview, Neural Netw. 61 (2015) 85–117, https://doi.org/10.1016/j.neunet.2014.09.003.
[27] D. Ravi, et al., Deep learning for health informatics, IEEE J. Biomed. Health Inform. 21 (1) (2017) 4–21, https://doi.org/10.1109/JBHI.2016.2636665.
[28] W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, Y.F.E. Alsaadi, A survey of deep neural network architectures and their applications, Neurocomputing 234 (2017) 11–26, https://doi.org/10.1016/j.neucom.2016.12.038.
[29] S. Dong, P. Wang, Y.K. Abbas, A survey on deep learning and its applications, Comput. Sci. Rev. 40 (2021), 100379, https://doi.org/10.1016/j.cosrev.2021.100379.
[30] X. Zhang, N. Smith, Y.A. Webb, Medical imaging. Biomedical Information Technology, Elsevier, 2020, pp. 3–49, https://doi.org/10.1016/B978-0-12-816034-3.00001-8.
[31] G. Deepa, G.L.R. Mary, A. Karthikeyan, P. Rajalakshmi, K. Hemavathi, Y.M. Dharanisri, Detection of brain tumor using modified particle swarm optimization (MPSO) segmentation via Haralick features extraction and subsequent classification by KNN algorithm, Mater. Today Proc. (2021), https://doi.org/10.1016/j.matpr.2021.10.475.
[32] M.K. Islam, M.S. Ali, M.S. Miah, M.M. Rahman, M.S. Alam, Y.M.A. Hossain, Brain tumor detection in MR image using superpixels, principal component analysis and template based K-means clustering algorithm, Mach. Learn. Appl. 5 (2021), 100044, https://doi.org/10.1016/j.mlwa.2021.100044.
[33] N. Bhagat, Y.G. Kaur, MRI brain tumor image classification with support vector machine, Mater. Today Proc. (2021), https://doi.org/10.1016/j.matpr.2021.11.368.
[34] R. Chandra Joshi, R. Mishra, P. Gandhi, V.K. Pathak, R. Burget, Y.M.K. Dutta, Ensemble based machine learning approach for prediction of glioma and multi-grade classification, Comput. Biol. Med. 137 (2021), 104829, https://doi.org/10.1016/j.compbiomed.2021.104829.
[35] T. Sathies Kumar, C. Arun, Y.P. Ezhumalai, An approach for brain tumor detection using optimal feature selection and optimized deep belief network, Biomed. Signal Process. Control 73 (2022), 103440, https://doi.org/10.1016/j.bspc.2021.103440.
[36] H. Takao, S. Amemiya, S. Kato, H. Yamashita, N. Sakamoto, Y.O. Abe, Deep-learning single-shot detector for automatic detection of brain metastases with the combined use of contrast-enhanced and non-enhanced computed tomography images, Eur. J. Radiol. 144 (2021), 110015, https://doi.org/10.1016/j.ejrad.2021.110015.
[37] Y. Xiao, J. Wu, Y.Z. Lin, Cancer diagnosis using generative adversarial networks based on deep learning from imbalanced data, Comput. Biol. Med. 135 (2021), 104540, https://doi.org/10.1016/j.compbiomed.2021.104540.
[38] G. Song, T. Shan, M. Bao, Y. Liu, Y. Zhao, Y.B. Chen, Automatic brain tumour diagnostic method based on a back propagation neural network and an extended set-membership filter, Comput. Methods Prog. Biomed. 208 (2021), 106188, https://doi.org/10.1016/j.cmpb.2021.106188.
[39] B. Ait Skourt, A. El Hassani, Y.A. Majda, Mixed-pooling-dropout for convolutional neural network regularization, J. King Saud Univ. Comput. Inf. Sci. (2021), https://doi.org/10.1016/j.jksuci.2021.05.001.
[40] J. Cheng, Brain tumor dataset, Figshare (2017), https://doi.org/10.6084/m9.figshare.1512427.v5.
[41] S. Ahuja, B.K. Panigrahi, Y.T.K. Gandhi, Enhanced performance of Dark-Nets for brain tumor classification and segmentation using colormap-based superpixel techniques, Mach. Learn. Appl. 7 (2022), 100212, https://doi.org/10.1016/j.mlwa.2021.100212.
[42] G.S. Tandel, A. Tiwari, Y.O.G. Kakde, Performance optimisation of deep learning models using majority voting algorithm for brain tumour classification, Comput. Biol. Med. 135 (2021), 104564, https://doi.org/10.1016/j.compbiomed.2021.104564.
[43] J. Waring, C. Lindvall, Y.R. Umeton, Automated machine learning: review of the state-of-the-art and opportunities for healthcare, Artif. Intell. Med. 104 (2020), 101822, https://doi.org/10.1016/j.artmed.2020.101822.
[44] A.-M. Šimundić, Measures of diagnostic accuracy: basic definitions, EJIFCC 19 (4) (2009) 203–211.
[45] D.M.W. Powers, Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation, Oct. 2020.
[46] R. Trevethan, Sensitivity, specificity, and predictive values: foundations, pliabilities, and pitfalls in research and practice, Front. Public Health 5 (2017) 307, https://doi.org/10.3389/fpubh.2017.00307.
[47] Z.N.K. Swati, et al., Brain tumor classification for MR images using transfer learning and fine-tuning, Comput. Med. Imaging Graph. 75 (2019) 34–46, https://doi.org/10.1016/j.compmedimag.2019.05.001.
[48] H. Sugimori, H. Hamaguchi, T. Fujiwara, Y.K. Ishizaka, Classification of type of brain magnetic resonance images with deep learning technique, Magn. Reson. Imaging 77 (2021) 180–185, https://doi.org/10.1016/j.mri.2020.12.017.
[49] M.D. Zeiler, ADADELTA: an Adaptive Learning Rate Method, Dec. 2012.
[50] M. Li, T. Zhang, Y. Chen, A.J. Smola, Efficient mini-batch training for stochastic optimization, in: Proceedings of the Twentieth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug. 2014, pp. 661–670, https://doi.org/10.1145/2623330.2623612.
[51] Yi-de Ma, Qing Liu, Zhi-bai Quan, Automated image segmentation using improved PCNN model based on cross-entropy, in: Proceedings of the International Symposium on Intelligent Multimedia, Video and Speech Processing, 2004, pp. 743–746, https://doi.org/10.1109/ISIMP.2004.1434171.
[52] H. Belyadi, Y.A. Haghighat, Model evaluation, Mach. Learn. Guide Oil Gas Using Python (2021) 349–380, https://doi.org/10.1016/B978-0-12-821929-4.00009-3.
[53] Q. Wang, Y. Ma, K. Zhao, Y.Y. Tian, A comprehensive survey of loss functions in machine learning, Ann. Data Sci. (2020), https://doi.org/10.1007/s40745-020-00253-5.
[54] S. Ruder, An Overview of Gradient Descent Optimization Algorithms, Sep. 2016.
[55] V. Kotu, Y.B. Deshpande, Model evaluation, Data Sci. (2019) 263–279, https://doi.org/10.1016/B978-0-12-814761-0.00008-3.