
Received April 10, 2022, accepted May 25, 2022, date of publication May 30, 2022, date of current version June 8, 2022.

Digital Object Identifier 10.1109/ACCESS.2022.3179376

On the Performance of Deep Transfer Learning Networks for Brain Tumor Detection Using MR Images

SAIF AHMAD AND PALLAB K. CHOUDHURY, (Member, IEEE)
Department of Electronics and Communication Engineering, Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
Corresponding author: Pallab K. Choudhury ([email protected])

ABSTRACT A brain tumor needs to be identified at an early stage; otherwise, it may lead to a severe condition that cannot be cured once it has progressed. A precise diagnosis of a brain tumor plays an important role in starting the proper treatment, which eventually improves the patient's survival rate. Recently, deep learning based classification methods have become popular for brain tumor detection from 2D Magnetic Resonance (MR) images. In this article, several transfer learning based deep learning methods are analyzed in combination with a number of traditional classifiers to detect brain tumors. The investigation results are based on a labeled dataset containing images of both normal and abnormal brains. For transfer learning, seven methods are used: VGG-16, VGG-19, ResNet50, InceptionResNetV2, InceptionV3, Xception, and DenseNet201. Each of them is followed by five traditional classifiers: Support Vector Machine, Random Forest, Decision Tree, AdaBoost, and Gradient Boosting. All combinations of deep learning based feature extractor and classifier are investigated to evaluate their relative performance in terms of accuracy, precision, recall, F1-score, Cohen's kappa, AUC, Jaccard, and specificity. Learning curves are also presented for the combinations that achieved the highest accuracies. The presented results show that the best model achieved an accuracy of 99.39% with 10-fold cross validation. The results presented in this article are expected to be useful for the selection of a suitable method in deep transfer learning based brain tumor detection.

INDEX TERMS Brain tumor, magnetic resonance imaging (MRI), transfer learning, deep learning, VGG-19,
support vector machine (SVM), DCNN.

I. INTRODUCTION
A tumor is caused by an abnormal growth of cells that serves no purpose. Benign tumors do not invade surrounding tissues and thus grow in a contained area; however, if such a tumor grows near a vital area, it can still cause trouble. Malignant tumors, on the other hand, grow and spread in a way that can cause life-threatening cancerous disease. When the majority of cells are damaged or old, they are removed and replaced with new cells, and problems may arise if the damaged or old cells are not removed. The development of a mass of tissue, referred to as a growth or tumor, is often the result of this creation of additional cells. Because of the size, shape, position, and form of a tumor in the brain, identifying a brain tumor is a challenging task. In particular, early-stage brain tumor diagnosis is quite difficult due to the lack of precise information about the tumor's size, which results from low-resolution images of the tumor areas. Patients can be treated effectively if the tumor is detected and treated early in the tumor formation process. As a result, tumor treatment is highly dependent on timely diagnosis together with proper classification. Several medical imaging technologies are used to diagnose brain tumors, for example, Magnetic Resonance Imaging (MRI), Computerized Tomography (CT), Ultrasound, Single Photon Emission Computed Tomography (SPECT), Positron Emission Tomography (PET), and X-ray. Among these, MRI is the most commonly used technique, as it offers better-contrast images of brain tumors compared with the other modalities. Recently, machine learning (ML) based approaches have gained much popularity for identifying brain tumors from MR images, as they give quite accurate and precise detection results. In particular, the transfer learning technique has been demonstrated in several investigations,


where the knowledge learned from one task can be reused for another similar task to achieve improved classification performance on the target dataset [1], [2]. Conventionally, the computational complexity of training a deep convolutional neural network (DCNN) model on a massive dataset is quite high. Such a learning procedure can therefore be simplified by reusing the model weights from previously trained models. The trained model's layers are then employed in a new model that is trained on the new dataset of interest. As a result, the training time and the generalization error are significantly reduced. However, a detailed study of different traditional deep transfer learning models followed by well-known classifiers is necessary to select the best performing model for the target application.

In this article, combinations of several transfer learning based deep learning methods with different classifiers are investigated to detect brain tumors from MR images, and their relative performance is compared. An effective deep transfer learning system is identified that detects and classifies brain tumors with high accuracy even with a small dataset. In particular, seven transfer learning methods are used as feature extractors: VGG-16, VGG-19, ResNet50, InceptionResNetV2, InceptionV3, Xception, and DenseNet201. Moreover, each CNN model is followed by five traditional classifiers, namely SVM, Random Forest, Decision Tree, Adaptive Boosting (AdaBoost), and Gradient Boosting. The pre-trained deep CNNs extract the necessary features from the MR brain images, which are then categorized by the classifiers with 10-fold cross-validation. Finally, detailed comparative results are computed for the different performance metrics.

The rest of the paper is organized as follows. Section II covers recent works on brain tumor detection from MR images using CNN models. Section III presents the investigation framework, including a detailed description of the brain image dataset and data augmentation, image pre-processing, and the CNN models and classifiers used in this research. The evaluation metrics and the corresponding results are discussed in Section IV. Finally, the conclusion of this work is presented in Section V.

II. RELATED WORKS
In [3], a hybrid technique is introduced using wavelet transform, principal component analysis, and supervised learning algorithms, where the brain tumor detection accuracy reaches 98.6%. However, the proposed system requires retraining each time the image database changes. Besides this, a novel combination of methods comprising discrete wavelet packet transform (DWPT), Shannon entropy (SE), Tsallis entropy (TE), and generalized eigenvalue proximate support vector machine (GEPSVM) is also utilized to classify brain images [4]. In [5], TensorFlow is used to implement a 5-layer convolutional neural network for MRI-based brain tumor detection; however, only a limited amount of training data is used. In [6], a spatial gray level dependency (SGLD) matrix is applied to MR images to extract the necessary brain tumor features, which are finally fed to an ANN model for classification. The proposed method shows an accuracy of 99% and a sensitivity of 97.9%; however, the system increases the computational complexity due to its long processing time. In [7], three multi-resolution techniques, namely wavelet transform, curvelet transform, and shearlet transform, are used to detect brain abnormality. Using only fifteen shearlet features, the SVM classifier with the radial basis function (RBF) kernel achieved a maximum classification accuracy of 97.38%. In [8], a CNN model named BrainMRNet is proposed, combining residual blocks, an attention module, and the hypercolumn technique followed by a dense layer and softmax to detect brain tumors; the study reports an accuracy of 96.05%. In [9], a Support Vector Machine together with a Fully Automatic Heterogeneous Segmentation (FAHS-SVM) process is utilized to locate tumor areas, where the model accuracy reaches 98.51%. A modified ResNet50 model is constructed in [10], where 5 layers are removed from the existing structure and 10 new layers are added at the end. Even though the modified model shows a classification accuracy of 97.01%, the system complexity increases due to the additional layers. In [11], the Le-Net and U-Net models are combined to develop a new model, LU-Net, which uses fewer layers to reduce the system complexity. A detailed comparative analysis of Le-Net, VGG-16, and the proposed LU-Net shows that the new model achieves the highest accuracy of 98.00%; however, its performance on a large dataset remains uncertain. In [12], the authors use superpixels and Principal Component Analysis (PCA) for feature extraction, followed by a filter to enhance the images; TK-means clustering is then added to the model for image segmentation and brain tumor detection. However, the study is carried out with a small image dataset. By incorporating clinical presentations and traditional MRI analysis, a deep learning based paradigm is proposed in [13]. Here, backward propagation of the gradients is used to increase the depth of the network, which eventually improves the model accuracy; however, the model suffers from long computational time and increased development complexity due to the additional layers. In [14], ensemble features and ensemble classifiers are demonstrated. The DenseNet-169 model achieved an average accuracy of 92.37% on a small dataset, whereas the ResNeXt-101 model achieved an average accuracy of 96.13% on a large dataset. However, the model size is unsuitable for a real-time medical diagnostic system based on knowledge distillation techniques, and a single classifier shows better results in some cases than the ensemble configuration with averaged results. A brief summary of related works on ML based brain tumor detection is presented in Table 1.


TABLE 1. Brief summary of related works on ML-based brain tumor detection.


FIGURE 1. Overview of the investigation framework.

In summary, the aforementioned ML approaches mostly use standard CNN models for brain tumor detection. In contrast, pre-trained models used with the transfer learning technique result in less computational time and higher accuracy, and they remove the constraint of maintaining a large dataset for training. Moreover, the classification performance of traditional classifiers is better than that of the softmax or fully connected layers used in previous investigations. Overall, the major contributions of this study can be summarized as follows:

– To provide an in-depth analysis of seven pre-trained models, namely VGG-16, VGG-19, ResNet50, InceptionResNetV2, InceptionV3, Xception, and DenseNet201. The transfer learning techniques are used to extract deep features from the target dataset of MR brain images.
– To provide an in-depth analysis of five classifiers, namely SVM, Random Forest, Decision Tree, Adaptive Boosting (AdaBoost), and Gradient Boosting. The different classifiers are used to classify the brain MR images into benign and malignant.
– To conduct an extensive analysis of the seven pre-trained models followed by the five classifiers, considering all the combinations, and finally to compare the effectiveness of all the CNN models and ML classifiers on the target dataset.
– To propose the best-performing model, which achieves the highest accuracy and optimal computational time among all the models. Moreover, the corresponding parameter settings are also explored.
– To provide a comparison with state-of-the-art models that justifies the use of the best performing model for classifying brain tumor MR images with the highest accuracy.

III. INVESTIGATION FRAMEWORK
The investigation framework used for this study is presented in Fig. 1. The process starts with the MR brain image dataset, which is first used for data augmentation. The dataset is split in three ways, namely a train set, a test set, and a validation set. The MR brain images are then further processed to reduce noise and prepared for feature extraction. In the feature extraction part, several CNN models are tested: VGG-16, VGG-19, ResNet50, InceptionResNetV2, InceptionV3, Xception, and DenseNet201. The pre-processed images are fed into the transfer learning models with a batch size of 32. Finally, the classification stages are prepared using different classifiers, namely SVM, Random Forest, Decision Tree, AdaBoost, and Gradient Boosting. Based on these feature extractors and classifiers, the relative brain tumor detection performance is evaluated to select the best performing machine learning model for brain MR images.

A. BRAIN IMAGE DATASET AND DATA AUGMENTATION
In this investigation, a publicly accessible MRI dataset from Kaggle [https://www.kaggle.com/navoneel/brain-mri-images-for-brain-tumor-detection] is used to analyze and evaluate the developed framework. The images are in two folders labeled 'yes' and 'no', corresponding to the abnormal and normal brain images as shown in Fig. 2. Originally, it contains 152 abnormal brain images and 98 normal brain images, for a total of 250 images of varying dimensions. The images are grayscale in JPG format. Later on, an augmentation technique is applied to increase the size of the dataset.
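As a rough illustration of this step, the sketch below loads the two labeled folders into arrays. The folder names 'yes' and 'no' follow the dataset description above, while the local base path and the use of OpenCV are assumptions for illustration only.

    import os
    import cv2
    import numpy as np

    def load_brain_mri_dataset(base_dir="brain_mri"):  # base_dir is a hypothetical local path
        """Load the Kaggle brain MRI images from the 'yes'/'no' folders described above."""
        images, labels = [], []
        for label_name, label in (("yes", 1), ("no", 0)):  # 1 = abnormal (tumor), 0 = normal
            folder = os.path.join(base_dir, label_name)
            for fname in os.listdir(folder):
                img = cv2.imread(os.path.join(folder, fname), cv2.IMREAD_GRAYSCALE)
                if img is None:  # skip unreadable files
                    continue
                images.append(img)
                labels.append(label)
        return images, np.array(labels)

    images, labels = load_brain_mri_dataset()
    print(len(images), "images loaded;", int(labels.sum()), "abnormal /", int((labels == 0).sum()), "normal")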


FIGURE 2. Brain MR images a) Without tumor b) With tumor.

FIGURE 3. Overview of data augmentation process.

Data augmentation is the process of adding slightly changed copies of existing data, or newly created synthetic data derived from existing data, to expand the size of the present dataset. By generating new and varied samples, data augmentation can help improve the performance of machine learning models: when a model's dataset is large and diverse, the model performs better and produces more accurate results. Several methods can be used for augmentation; the present article uses width shifting, height shifting, shear intensity, brightness adjustment, horizontal flip, and vertical flip to enlarge the dataset, as shown in Fig. 3. After applying the augmentation process, the dataset contains 1240 abnormal and 1078 normal brain images.
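A minimal sketch of this augmentation step is shown below, assuming the Keras ImageDataGenerator API; the specific parameter values are illustrative assumptions, not the exact settings used in the paper.

    import numpy as np
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Augmentation operations named above: shifts, shear, brightness, and flips.
    augmenter = ImageDataGenerator(
        width_shift_range=0.1,        # width shifting (illustrative value)
        height_shift_range=0.1,       # height shifting
        shear_range=0.1,              # shear intensity
        brightness_range=(0.8, 1.2),  # brightness variation
        horizontal_flip=True,
        vertical_flip=True,
    )

    # Example: generate augmented copies of a single image.
    image = np.random.rand(1, 224, 224, 1)  # placeholder for one pre-processed MR image
    batches = augmenter.flow(image, batch_size=1)
    augmented = [next(batches)[0] for _ in range(5)]  # five augmented variants
    print(len(augmented), "augmented images generated")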


From this dataset, 5 images of each category are used as the test set, and the remaining data is divided into 80% as the train set and 20% as the validation set. Based on this distribution, the train set has 987 abnormal and 858 normal brain images; the test set has 5 abnormal and 5 normal brain images; and the validation set has 248 abnormal and 215 normal brain images. Fig. 4 shows the dataset distribution as a bar graph.

FIGURE 4. Data distribution based on train, validation and test set.

B. IMAGE PRE-PROCESSING
In machine learning, the dataset used is typically not well organized, as it comes from different sources. Therefore, the dataset needs to be standardized and processed before being fed to the ML model. Moreover, MR images may contain defects such as inhomogeneity distortions and motion heterogeneity due to the person's body motion during image acquisition or instability of the scanning hardware. These distortions add unwanted intensity variations to the acquired images and can produce false positives. Image pre-processing is commonly used to reduce this unwanted noise while retaining the useful information in the images, and hence the process improves the classification performance.

In this research, the image pre-processing stage comprises a number of steps, as shown in Fig. 5. Firstly, the original grayscale MR images of varying sizes are loaded for pre-processing. In step 2, an active contour-based segmentation technique is used to select the region of interest by defining the biggest contour. A contour is a set of points that are interpolated together using different interpolation methods, such as linear, spline, or polynomial, to describe a curve in an image [15]. In step 3, the extreme points are selected by a thresholding technique. Thresholding is a basic non-contextual segmentation technique that converts a greyscale or color image into a binary image, creating a binary area map with a single threshold [16]. The binary map has two potentially disjoint domains, one containing pixels with input values less than the threshold and the other containing pixels with values equal to or greater than the threshold. In steps 4 and 5, the images are cropped to retain the useful portion and resized to 224 × 224 pixels in RGB format to fit the input layer dimension of the feature extractors. Moreover, small patches of unnecessary noise are also removed by applying erosion and dilation operations.
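The sketch below outlines these pre-processing steps with OpenCV. It is a simplified stand-in: it uses OpenCV contour detection rather than the active-contour method of [15], and the threshold value and kernel size are illustrative assumptions.

    import cv2
    import numpy as np

    def preprocess_mr_image(gray):
        """Crop an 8-bit grayscale MR image to the brain region and resize to 224x224 RGB."""
        # Step 3-style thresholding: build a binary map with a single threshold (value assumed).
        _, binary = cv2.threshold(gray, 45, 255, cv2.THRESH_BINARY)
        # Remove small noise patches with erosion followed by dilation.
        kernel = np.ones((3, 3), np.uint8)
        binary = cv2.erode(binary, kernel, iterations=2)
        binary = cv2.dilate(binary, kernel, iterations=2)
        # Step 2-style segmentation: keep the biggest contour as the region of interest.
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if not contours:  # fall back to the full image if nothing is found
            return cv2.cvtColor(cv2.resize(gray, (224, 224)), cv2.COLOR_GRAY2RGB)
        biggest = max(contours, key=cv2.contourArea)
        # Extreme points of the biggest contour define the crop box.
        x, y, w, h = cv2.boundingRect(biggest)
        cropped = gray[y:y + h, x:x + w]
        # Steps 4-5: resize to the feature extractors' input size and convert to 3 channels.
        resized = cv2.resize(cropped, (224, 224))
        return cv2.cvtColor(resized, cv2.COLOR_GRAY2RGB)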
C. FEATURE EXTRACTION USING DCNN
Deep learning has proved to be an essential tool in various applications due to its feature learning ability, and its potential has been highlighted in many research articles, including a review published in Nature [17]. In particular, the convolutional neural network, a popular member of the deep learning family, has attracted many researchers since the results published at the ILSVRC-2012 (ImageNet Large Scale Visual Recognition Challenge) image classification competition using the AlexNet model [18]. Even though such deep CNNs show good performance when a large labeled dataset like ImageNet is available, their application to medical imaging tasks such as MR image classification is limited by the small available sample sizes. Especially for small-dataset applications, a well-investigated and good alternative is to train the deep CNN using a pre-trained model with transfer learning. Pre-trained models are proven to be easier and faster to build, with improved accuracy for the target application [2]. In recent years, various CNN architectures using transfer learning have outperformed classical machine learning models and have shown considerable success in improving image classification performance. In image classification, extracting the key features of the images is an important part of the process, and thanks to the concept of deep learning the models are trained to distinguish multiple levels of visual representation. Conventionally, there are two ways to use pre-trained models: firstly, the off-the-shelf pre-trained models are applied to the image dataset to extract features, and a separate classifier is trained to classify those features; secondly, the pre-trained models are fine-tuned in selected or all layers to get the desired results [19]. Here, the first approach is adopted, combining a number of pre-trained models with traditional classifiers. In this article, seven pre-trained CNN models are utilized for feature extraction from the MR brain image dataset. The pre-trained CNN models are trained on the large ImageNet dataset [20]. The models used in this study are VGG-16 [21], InceptionResNetV2 [22], ResNet50 [23], VGG-19 [21], Xception [24], InceptionV3 [25], and DenseNet201 [26]. A summary of these models is presented in Table 2, and more details are available in the cited references. The performance results of each model are presented in a later section to show their relative efficiency for the detection of brain tumors from MR images.


FIGURE 5. Process of image pre-processing.

D. CLASSIFIERS
Classifiers are used to divide a batch of data into categories; a classifier is a method that maps the input data to a certain category using an algorithm. In this study, the features extracted by the deep CNN models are classified using five classifiers, namely Support Vector Machine [27], [28], Random Forest [29], [30], Decision Tree [31], AdaBoost [32], and Gradient Boosting [33]. Brief details of these classifiers are presented in Table 3. The performance results of each CNN model followed by the classifiers are discussed in a later section.
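A minimal sketch of instantiating these five classifiers with scikit-learn is given below; the hyper-parameters are left at library defaults and are assumptions rather than the exact settings listed in Table 7.

    from sklearn.svm import SVC
    from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier
    from sklearn.tree import DecisionTreeClassifier

    # The five traditional classifiers applied to the deep features (default settings assumed).
    classifiers = {
        "SVM": SVC(kernel="rbf"),
        "Random Forest": RandomForestClassifier(),
        "Decision Tree": DecisionTreeClassifier(),
        "AdaBoost": AdaBoostClassifier(),
        "Gradient Boosting": GradientBoostingClassifier(),
    }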
IV. RESULTS ANALYSIS AND DISCUSSION
This section mainly highlights the performance analysis of the several transfer learning based CNN models used for feature extraction from brain MR images, with the extracted features further classified using a number of classifiers. All combinations of feature extractors and classifiers are evaluated in terms of computational time and accuracy with 10-fold cross validation, as shown in Table 4. Moreover, the presented deep learning frameworks are also tested using different evaluation metrics, namely accuracy, precision, recall, F1-score, Cohen's kappa, AUC (Area Under the ROC (Receiver Operating Characteristic) Curve), Jaccard, and specificity scores, as shown in Table 5. Based on the evaluation results, the best performing model is identified for effective classification of brain tumors into benign and malignant using brain MR images. The main parameter settings of the best pre-trained model and of the different classifiers are highlighted in Table 6 and Table 7, respectively. Moreover, the best performing model is compared with state-of-the-art methods, as shown in Table 8.

A. EVALUATION METRICS
The efficiency of the proposed deep transfer learning framework is measured using four key outcomes: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The following performance metrics are used to evaluate the proposed ML framework.

The accuracy can be considered as the capacity for successful detection of brain tumors from the target image dataset. The fraction of true positives and true negatives among all the cases under investigation is used to estimate the accuracy as follows [34]:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision is a true positive measure, which is calculated as [34]:

Precision = TP / (TP + FP)

Recall (sensitivity) is a metric that evaluates the system's capacity to correctly classify brain tumors, and it is determined by the fraction of true positives as [34]:

Recall = TP / (TP + FN)

The F1-score takes the harmonic mean of a classifier's precision and recall to create a single statistic. The F1-score is given by [34]:

F1-score = 2 × (Recall × Precision) / (Recall + Precision) = TP / (TP + (FP + FN)/2)

Cohen's kappa is a statistical measure that determines how often two raters agree on the same quantity and is measured as [34]:

K = (po − pe) / (1 − pe)

where po is the overall accuracy of the model and pe is the measure of the degree of agreement between the model predictions and the actual class values (i.e., the agreement expected by chance).
to evaluate the proposed ML framework: predictions and actual class values.


TABLE 2. Brief description of different pre-trained models used in this research.


TABLE 3. Summary of classifiers used in this research.

The ROC curve is an evaluation metric for binary classification tasks. It is a probability curve that plots the true positive rate (TPR) against the false positive rate (FPR) at various threshold levels, effectively separating the signal from the noise. The AUC is a summary of the ROC curve that measures a classifier's ability to distinguish between classes and is given by [35]:

AUC = ∫_{−∞}^{∞} TPR(T) FPR′(T) dT

where FPR′(T) is the first derivative of FPR with respect to T, and T is the threshold applied to the sample data.

The Jaccard similarity coefficient is measured to address the similarity between sample sets. The mathematical formula is [36]:

J(A, B) = |A ∩ B| / |A ∪ B| = |A ∩ B| / (|A| + |B| − |A ∩ B|)

The fraction of real negatives that are predicted as negatives, also known as true negatives, is defined as specificity. In other words, specificity is the True Negative Rate (TNR). The mathematical formula is [37]:

Specificity = TN / (TN + FP)
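As a rough sketch, these eight scores can be computed with scikit-learn as follows; y_true and y_pred are placeholders for the actual and predicted labels, and y_score for the decision scores used by the ROC/AUC.

    from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score,
                                 cohen_kappa_score, roc_auc_score, jaccard_score, confusion_matrix)

    def evaluation_metrics(y_true, y_pred, y_score):
        """Compute the eight evaluation metrics described in Section IV-A."""
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        return {
            "accuracy": accuracy_score(y_true, y_pred),
            "precision": precision_score(y_true, y_pred),
            "recall": recall_score(y_true, y_pred),
            "f1": f1_score(y_true, y_pred),
            "kappa": cohen_kappa_score(y_true, y_pred),
            "auc": roc_auc_score(y_true, y_score),
            "jaccard": jaccard_score(y_true, y_pred),
            "specificity": tn / (tn + fp),  # TNR = TN / (TN + FP)
        }

    # Example with dummy binary labels and scores.
    print(evaluation_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 0], [0.9, 0.2, 0.4, 0.8, 0.1]))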


TABLE 4. Ten-fold cross validation results for accuracy and computational time.

B. COMPARATIVE ANALYSIS
This section highlights the performance of the seven pre-trained CNN models, i.e., VGG-16, InceptionResNetV2, ResNet50, VGG-19, Xception, InceptionV3, and DenseNet201, each further followed by the five classifiers, i.e., Support Vector Machine, Random Forest, Decision Tree, AdaBoost, and Gradient Boosting.


FIGURE 6. Graphical representation of accuracy results for different pre-trained models with classifiers.

The relative performance of each feature extractor and classifier pair is tested to identify the best performing model. In this investigation, the transfer learning models are used as standalone feature extractors, and the traditional classifiers are then used to classify those features to detect the tumor in the brain images. As a standalone feature extractor, the pre-trained network is used to process the images and extract features while the fully connected (classification) layers are kept inactive. Conventionally, the CNN networks are used up to the last pooling layer, and the 'include_top' argument is set to 'False' to unload the fully connected (classification) layers. After the last pooling layer, an additional flatten layer is added, and the networks are incorporated with the different traditional classifiers. The flatten layer works as a dimensionality reduction function, as it reduces the number of parameters: it converts the feature map pooled from the last pooling layer into a one-dimensional array and forwards the output to the classifiers in the next step. All extractor-classifier pairs are analyzed using the performance parameters of accuracy and time under 10-fold cross validation.
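A minimal sketch of this standalone feature-extraction setup with Keras is shown below, assuming VGG-19 as the backbone; the same pattern applies to the other six pre-trained models.

    import numpy as np
    from tensorflow.keras.applications import VGG19
    from tensorflow.keras.applications.vgg19 import preprocess_input
    from tensorflow.keras.layers import Flatten
    from tensorflow.keras.models import Model

    # Load the ImageNet-pretrained network up to the last pooling layer (include_top=False)
    # and append a flatten layer so each image becomes a one-dimensional feature vector.
    backbone = VGG19(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    extractor = Model(inputs=backbone.input, outputs=Flatten()(backbone.output))

    def extract_features(images):
        """images: array of shape (N, 224, 224, 3); returns (N, 25088) deep features for VGG-19."""
        return extractor.predict(preprocess_input(images.astype("float32")), batch_size=32)

    features = extract_features(np.random.rand(4, 224, 224, 3))  # placeholder batch
    print(features.shape)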


TABLE 5. Ten-fold classification results for precision, recall, F1-score, Cohen's kappa and AUC.

Cross-validation is used to estimate the skill of a model on unseen data. In the 10-fold cross-validation method, the dataset is shuffled randomly and split into 10 groups of equal size. First, the data from one group is held out as the validation test and the data from the other nine groups is used for training; the system evaluates the validation test based on the training set and stores the result. The process is repeated 10 times (a total of 10 observations), each time taking the data from a different group as the validation test and the data from the remaining nine groups as the training set. The final result is the average over all 10 runs.
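A brief sketch of this 10-fold evaluation with scikit-learn is shown below, reusing the feature extractor and classifier set sketched earlier; the shuffling seed is an arbitrary assumption.

    from sklearn.model_selection import StratifiedKFold, cross_val_score

    # 10 shuffled folds of equal size, as described above (random seed assumed).
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

    # `features` are the deep features from a pre-trained extractor and `labels` the 0/1 classes.
    for name, clf in classifiers.items():
        scores = cross_val_score(clf, features, labels, cv=cv, scoring="accuracy")
        print(f"{name}: mean accuracy = {scores.mean():.4f}")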


FIGURE 7. Learning curves (a) VGG16 - SVM pair (b) InceptionResNetV2 - Gradient Boosting pair (c) ResNet50 - SVM pair (d) VGG19 - SVM pair
(e) Xception - SVM pair (f) InceptionV3 - SVM pair (g) DenseNet201 - SVM pair.

According to Table 4, the numbers of features extracted by VGG16, InceptionResNetV2, ResNet50, VGG19, Xception, InceptionV3, and DenseNet201 are 25088, 38400, 100352, 25088, 100352, 51200, and 94080, respectively. The features are extracted from the final pooling layer and are then used by the five classifiers to classify the MR brain images into benign and malignant. The highest accuracy values are marked in bold for all the combinations of pre-trained models and classifiers shown in Table 4. Table 4 clearly illustrates that the SVM classifier shows the best accuracy compared to the other classifiers when working with the VGG-16, ResNet50, VGG-19, Xception, InceptionV3, and DenseNet201 models, whereas for Inception-ResNet-V2 the Gradient Boosting classifier shows the best accuracy for classifying the MR images.


TABLE 6. Hyper-parameter settings of VGG-19.
TABLE 7. Main parameter set of classifiers.

In particular, the accuracy results of the SVM classifier are 99.31%, 99.22%, 99.39%, 96.38%, 95.51%, and 98.83% using VGG-16, ResNet50, VGG-19, Xception, InceptionV3, and DenseNet201, respectively. On the other hand, the Inception-ResNet-V2-Gradient Boosting model shows an accuracy of 90.47%. Based on the accuracy performance mentioned above, the VGG-19-SVM model achieves the highest accuracy, i.e., 99.39%, among all the models investigated in this study. Two other models also demonstrate accuracies above 99%: VGG-16-SVM and ResNet50-SVM, with values of 99.31% and 99.22%, respectively. Besides this, InceptionV3-Decision Tree shows the lowest accuracy score of 75.67%. Fig. 6 summarizes the accuracy results graphically for all the combinations of ML models followed by classifiers.

Besides accuracy, the computational time of each feature extractor-classifier pair is also estimated, as shown in Table 4, where the lowest values are marked in bold. The presented results clearly indicate that the Random Forest classifier performs the classification operation faster than the other classifiers, with the lowest value of 7.691 seconds when working with the VGG-19 model. Even though the Random Forest classifier performs better in classification time, it shows an accuracy of around 90%, which indicates degraded performance in accurately classifying the brain MR images into benign and malignant. Overall, the performance of the different pairs of feature extractors and classifiers shows a tradeoff between computational time and accuracy.

The presented deep learning frameworks are also tested using the different evaluation metrics formulated in Section IV-A, and the corresponding results appear in Table 5. All combinations of deep learning based feature extractors with different classifiers are analyzed using precision, recall, F1-score, Cohen's kappa, AUC, Jaccard, and specificity. The metric values with the highest scores are marked in bold in Table 5. For precision, VGG-16-SVM, ResNet50-SVM, and VGG-19-SVM achieve the highest value of 99.51%; for recall, VGG-19-SVM and DenseNet201-SVM achieve the best value of 99.27%; and for F1-score, Cohen's kappa, AUC, Jaccard, and specificity, the highest values are measured for the VGG-19-SVM model as 99.39%, 98.70%, 99.36%, 98.63%, and 99.63%, respectively. Additionally, the learning curves for the pairs that achieved the highest accuracies compared to the other pairs are provided in Fig. 7. In summary, VGG-19-SVM is considered to be the best performing deep learning system with respect to all the measured performance metrics reported in Table 4 and Table 5. Based on this evaluation, the best performing CNN model is shown in Fig. 8. Moreover, Table 6 shows the hyper-parameter settings of the best performing model, VGG-19, and the main parameter settings of all the classifiers are shown in Table 7.

Finally, Table 8 shows a comparison of the best performing model presented here with the state-of-the-art architectures proposed in [8]–[13]. Toğaçar et al. [8] used a combination of residual blocks, an attention module, and the hypercolumn technique with a claimed accuracy of 96.05%. Jia et al. [9] utilized the FAHS-SVM technique with a reported accuracy of 98.51%. Besides this, Çinar et al. [10] achieved 97.01% accuracy with the improved ResNet50 model. Moreover, Rai et al. [11] combined Le-Net and U-Net to form the LU-Net model, which achieved an accuracy of 98.00%. Islam et al. [12] utilized superpixels and Principal Component Analysis (PCA) followed by TK-means clustering, achieving an accuracy of 95.00%. The Deep-CNN study by Das et al. [13] achieved an accuracy of 98.00%.
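Putting the pieces together, a sketch of the best performing configuration (VGG-19 features followed by an SVM) is given below, reusing the extractor sketched earlier; the SVM settings are illustrative assumptions rather than the exact values of Table 6 and Table 7.

    from sklearn.svm import SVC

    # VGG-19 deep features (from the extractor sketched earlier) classified by an SVM.
    best_model = SVC(kernel="rbf", probability=True)  # kernel/probability settings assumed
    best_model.fit(features, labels)  # features: (N, 25088) VGG-19 vectors; labels: 0/1

    def predict_tumor(image_batch):
        """image_batch: (N, 224, 224, 3) pre-processed MR images; returns 1 for abnormal, 0 for normal."""
        return best_model.predict(extract_features(image_batch))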


FIGURE 8. Architecture of best performing CNN model using VGG-19-SVM.

TABLE 8. Results comparison with state-of-the-art methods.

By comparison with all the aforementioned results, the presented VGG-19-SVM model shows the highest classification accuracy of 99.39% and is thus expected to perform well for detecting brain tumors from MR images.

V. CONCLUSION
In this article, several transfer learning based deep learning methods are analyzed and the corresponding results are compared to select the best performing CNN model for the detection of brain tumors from MR images. Seven classical feature extractors are used to develop the deep learning framework, where the features extracted from each pre-trained model are classified using five traditional classifiers. The performance metrics of accuracy, computational time, precision, recall, F1-score, Cohen's kappa, AUC, Jaccard, and specificity are computed for all combinations of feature extractors and classifiers with 10-fold cross validation. The best performing model, i.e., VGG-19-SVM, shows the highest accuracy of 99.39% among all the models presented in this investigation. Moreover, the VGG-19-SVM model also performs better in comparison with recent works on brain tumor detection using ML models. However, the presented model was not tested on different brain MRI modalities or other imaging techniques. The proposed technique can also be extended to the classification of tumor types such as Glioma, Meningioma, and Pituitary using the MR image dataset. Above all, the use of a larger dataset and better GPU-based processing could further improve the accuracy as well as the computational speed of the presented models. We aim to address these issues as part of future work.

REFERENCES
[1] P. K. Chahal, S. Pandey, and S. Goel, ''A survey on brain tumor detection techniques for MR images,'' Multimedia Tools Appl., vol. 79, nos. 29–30, pp. 21771–21814, May 2020.
[2] K. Muhammad, S. Khan, J. D. Ser, and V. H. C. D. Albuquerque, ''Deep learning for multigrade brain tumor classification in smart healthcare systems: A prospective survey,'' IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 2, pp. 507–522, Feb. 2021.
[3] E. S. A. El-Dahshan, T. Hosny, and A. B. M. Salem, ''Hybrid intelligent techniques for MRI brain images classification,'' Digit. Signal Process., vol. 20, pp. 433–441, Mar. 2010.
[4] Z. Dong, G. Ji, J. Yang, Y. Zhang, and S. Wang, ''Preclinical diagnosis of magnetic resonance (MR) brain images via discrete wavelet packet transform with Tsallis entropy and generalized eigenvalue proximal support vector machine (GEPSVM),'' Entropy, vol. 17, no. 4, pp. 1795–1813, Mar. 2015.
[5] A. Sawant, M. Bhandari, R. Yadav, R. Yele, and M. S. Bendale, ''Brain cancer detection from MRI: A machine learning approach (tensor flow),'' Int. Res. J. Eng. Technol., vol. 5, no. 4, pp. 2089–2094, Apr. 2018.
[6] H. E. M. Abdalla and M. Y. Esmail, ''Brain tumor detection by using artificial neural network,'' in Proc. Int. Conf. Comput., Control, Electr., Electron. Eng. (ICCCEEE), Khartoum, Sudan, Aug. 2018, pp. 1–6.
[7] A. Gudigar, U. Raghavendra, T. R. San, E. J. Ciaccio, and U. R. Acharya, ''Application of multiresolution analysis for automated detection of brain abnormality using MR images: A comparative study,'' Future Gener. Comput. Syst., vol. 90, pp. 359–367, Jan. 2019.
[8] M. Toğaçar, B. Ergen, and Z. Cömert, ''BrainMRNet: Brain tumor detection using magnetic resonance images with a novel convolutional neural network model,'' Med. Hypotheses, vol. 134, Jan. 2020, Art. no. 109531.
[9] Z. Jia and D. Chen, ''Brain tumor identification and classification of MRI images using deep learning techniques,'' IEEE Access, early access, Aug. 13, 2020, doi: 10.1109/ACCESS.2020.3016319.


[10] A. Çinar and M. Yildirim, ''Detection of tumors on brain MRI images using the hybrid convolutional neural network architecture,'' Med. Hypotheses, vol. 139, Jun. 2020, Art. no. 109684.
[11] H. M. Rai and K. Chatterjee, ''Detection of brain abnormality by a novel Lu-Net deep neural CNN model from MR images,'' Mach. Learn. Appl., vol. 2, Dec. 2020, Art. no. 100004.
[12] M. K. Islam, M. S. Ali, M. S. Miah, M. M. Rahman, M. S. Alam, and M. A. Hossain, ''Brain tumor detection in MR image using superpixels, principal component analysis and template based K-means clustering algorithm,'' Mach. Learn. Appl., vol. 5, Sep. 2021, Art. no. 100044.
[13] T. K. Das, P. K. Roy, M. Uddin, K. Srinivasan, C.-Y. Chang, and S. Syed-Abdul, ''Early tumor diagnosis in brain MR images via deep convolutional neural network model,'' Comput., Mater. Continua, vol. 68, no. 2, pp. 2413–2429, 2021.
[14] K. Jaeyong, U. Zahid, and G. Jeonghwan, ''MRI-based brain tumor classification using ensemble of deep features and machine learning classifiers,'' Sensors, vol. 21, no. 6, p. 2222, Mar. 2021.
[15] T. F. Chan and L. A. Vese, ''Active contours without edges,'' IEEE Trans. Image Process., vol. 10, no. 2, pp. 266–277, Feb. 2001.
[16] O. Tarkhaneh and H. Shen, ''An adaptive differential evolution algorithm to optimal multi-level thresholding for MRI brain image segmentation,'' Expert Syst. Appl., vol. 138, Dec. 2019, Art. no. 112820.
[17] Y. LeCun, Y. Bengio, and G. Hinton, ''Deep learning,'' Nature, vol. 521, no. 7553, pp. 436–444, Nov. 2015.
[18] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ''ImageNet classification with deep convolutional neural networks,'' in Proc. Adv. Neural Inf. Process. Syst., Stateline, NV, USA, 2012, pp. 1097–1105.
[19] F. Zhuang, Z. Qi, K. Duan, D. Xi, Y. Zhu, H. Zhu, H. Xiong, and Q. He, ''A comprehensive survey on transfer learning,'' Proc. IEEE, vol. 109, no. 1, pp. 43–76, Jul. 2020.
[20] D. Garcia-Gasulla, F. Parés, A. Vilalta, J. Moreno, E. Ayguadé, J. Labarta, U. Cortés, and T. Suzumura, ''On the behavior of convolutional nets for feature extraction,'' J. Artif. Intell. Res., vol. 61, pp. 563–592, Mar. 2018.
[21] K. Simonyan and A. Zisserman, ''Very deep convolutional networks for large-scale image recognition,'' Sep. 2014, arXiv:1409.1556.
[22] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, ''Inception-v4, Inception-ResNet and the impact of residual connections on learning,'' in Proc. 31st AAAI Conf. Artif. Intell., San Francisco, CA, USA, 2017, pp. 4278–4284.
[23] K. He, X. Zhang, S. Ren, and J. Sun, ''Deep residual learning for image recognition,'' in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Las Vegas, NV, USA, Jun. 2016, pp. 770–778.
[24] F. Chollet, ''Xception: Deep learning with depthwise separable convolutions,'' in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Honolulu, HI, USA, Jul. 2017, pp. 1800–1807.
[25] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, ''Rethinking the inception architecture for computer vision,'' in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Las Vegas, NV, USA, Jun. 2016, pp. 2818–2826.
[26] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, ''Densely connected convolutional networks,'' in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Honolulu, HI, USA, Jul. 2017, pp. 2261–2269.
[27] S. R. Telrandhe, A. Pimpalkar, and A. Kendhe, ''Detection of brain tumor from MRI images by using segmentation & SVM,'' in Proc. World Conf. Futuristic Trends Res. Innov. Social Welfare (Startup Conclave), Coimbatore, India, 2016, pp. 1–6.
[28] S. Abe, Support Vector Machines for Pattern Classification. London, U.K.: Springer, 2010, pp. 163–226.
[29] M. Murty and R. Raghava, Support Vector Machines and Perceptrons: Learning, Optimization, Classification, and Application to Social Networks, 1st ed. India: Springer, 2016.
[30] C. J. Mantas, J. G. Castellano, S. Moral-García, and J. Abellán, ''A comparison of random forest based algorithms: Random credal random forest versus oblique random forest,'' Soft Comput., vol. 23, no. 21, pp. 10739–10754, Nov. 2019.
[31] A. V. Shichkin, A. G. Buevich, and A. P. Sergeev, ''Comparison of artificial neural network, random forest and random perceptron forest for forecasting the spatial impurity distribution,'' AIP Conf. Proc., vol. 1982, no. 1, 2018, Art. no. 020005.
[32] K. Grąbczewski, Meta-Learning in Decision Tree Induction. Cham, Switzerland: Springer, 2014.
[33] R. Sonavane and P. Sonar, ''Classification and segmentation of brain tumor using AdaBoost classifier,'' in Proc. Int. Conf. Global Trends Signal Process., Inf. Comput. Commun. (ICGTSPICC), Dec. 2016, pp. 396–403.
[34] M. A. B. Siddique, S. Sakib, M. M. R. Khan, A. K. Tanzeem, M. Chowdhury, and N. Yasmin, ''Deep convolutional neural networks model-based brain tumor detection in brain MRI images,'' in Proc. 4th Int. Conf. I-SMAC (IoT Social, Mobile, Analytics Cloud) (I-SMAC), Palladam, India, Oct. 2020, pp. 909–914.
[35] C. Calì and M. Longobardi, ''Some mathematical properties of the ROC curve and their applications,'' Ricerche di Matematica, vol. 64, no. 2, pp. 391–402, Oct. 2015.
[36] K. Usman and K. Rajpoot, ''Brain tumor classification from multi-modality MRI using wavelets and machine learning,'' Pattern Anal. Appl., vol. 20, no. 3, pp. 871–881, Aug. 2017.
[37] Z. N. K. Swati, Q. Zhao, M. Kabir, F. Ali, Z. Ali, S. Ahmed, and J. Lu, ''Brain tumor classification for MR images using transfer learning and fine-tuning,'' Comput. Med. Imag. Graph., vol. 75, pp. 34–46, Jul. 2019.

SAIF AHMAD received the B.Sc. degree in electronics and communication engineering from the Khulna University of Engineering & Technology (KUET), Khulna, Bangladesh, in 2022. He is currently doing an Internship at the BRAC Learning Division, Learning and Leadership Development Unit. His research interests include medical image analysis, deep learning, machine learning, transfer learning, and pattern recognition.

PALLAB K. CHOUDHURY (Member, IEEE) received the B.Sc. degree in electrical and electronic engineering from the Khulna University of Engineering & Technology (KUET), Khulna, Bangladesh, in 2003, the M.S. degree in information and communication technologies from the Asian Institute of Technology, Thailand, in 2007, and the Ph.D. degree in optical communication engineering from the Integrated Research Center for Photonics Networks and Technologies (IRCPhoNET), Scuola Superiore Sant'anna, Pisa, Italy, in 2012. From 2012 to 2013, he was a Research Assistant with the IRCPhoNET, Research Group of Optical System Design. During 2018–2019, he worked as a Postdoctoral Fellow with the LiDAR and Intelligent Optical Node (LION) Research Laboratory, jointly formed by the Chongqing University of Technology (CQUT), Chongqing, China, and the Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea. He is currently a Professor with the Department of Electronics and Communication Engineering, KUET. He is the author of more than 50 international journals and conferences. He also holds one U.S. patent. His research interests include the design and development of LiDAR for autonomous vehicle, machine learning, medical image analysis, and optical wireless transmission systems. He is an Associate Editor of IEEE ACCESS.
