Deep CNN Model Based On VGG16 For Breast Cancer Classification
Abstract— Deep learning (DL) technologies are now widely used for breast histopathology image tasks, such as diagnosis, because of their high performance in image classification. Among DL models, Convolutional Neural Networks (CNNs) are the most commonly used for medical image diagnosis and analysis. However, CNNs are computationally expensive to train and may require tuning a huge number of parameters. To address this issue, several pre-trained models with predefined network architectures have been established. In this study, a transfer learning model based on the 16-layer Visual Geometry Group architecture (VGG16) is used to extract high-level features from the BreaKHis benchmark histopathological image dataset. Then, multiple machine learning models (classifiers) are used to handle two Breast Cancer (BC) histopathological image classification tasks: binary classification and multiclass classification with eight classes. The experimental results on the public BreaKHis benchmark dataset demonstrate that the proposed models outperform previous works on the same dataset. In addition, the results show that the proposed models outperform recent classical machine learning algorithms.

Keywords— Deep learning, VGG16, BreaKHis histopathology dataset, feature extraction, pre-trained model.

I. INTRODUCTION

Breast cancer remains one of the most severe public health concerns, and it is the leading cause of cancer-related deaths in women around the world [1]. For example, in Jordan, breast cancer constitutes 19.7% of all diagnosed cancers [2]. Therefore, early diagnosis of this disease is vital to avoid the consequences of its progression and to reduce its morbidity rates in women. Breast cancer comprises numerous entities with distinctive clinical and histological attributes, which indicates that the disease is heterogeneous. Figure 1 shows two rows of images; the top row shows benign cell images, while the bottom row shows malignant ones. This malignancy arises from the development of abnormal breast cells and may invade the nearby healthy tissue.

Clinical diagnosis of breast cancer involves several techniques. The first is clinical screening, which is performed using radiology images, e.g., Magnetic Resonance Imaging (MRI), mammography, and others. However, these non-invasive imaging techniques may not be able to determine the cancerous area efficiently [3]. Thus, to analyze the malignancy reliably, biopsy samples are used [4], prepared with different stains to produce histopathology images. However, manual investigation of histopathology images is tedious, time-consuming, and dependent on the physician's skills. Thus, manual diagnosis is subjective. To this end, Computer-Aided Diagnosis (CAD) plays a significant role in assisting pathologists in examining histopathology images and finding the suspected area. Typically, it increases the diagnostic performance for BC by reducing the inter- and intra-pathologist variation in making a final decision [5].

Numerous machine learning techniques are often utilized to analyze malignant pathology images [4]. CAD systems are the most common analysis models for digital histopathological image analysis. Previous studies on breast CADs can be divided into shallow machine learning and deep learning. Shallow machine learning CADs mainly rely on feature extraction from the input training images; these features are then used to train a classifier (e.g., SVM) [6]. Therefore, these models are sometimes categorized as handcrafted-feature-based models [7], [8]. For example, the authors in [9] extracted six different types of features from the input training images: Gray Level Co-Occurrence Matrices (GLCM), Parameter-Free Threshold Adjacency Statistics (PFTAS), Local Binary Patterns (LBP), Local Phase Quantization (LPQ), Completed Local Binary Pattern (CLBP), and Oriented FAST and Rotated BRIEF (ORB). They achieved a classification accuracy of around 81% on the 40x subset of BreaKHis. The authors in [10] extracted different types of features from the input images, such as GLCM, LBP, TWT, and PWT. They evaluated these features with a k-nearest neighbors classifier and achieved accuracies ranging from 83% to 86%. In [11], the authors used an L1-norm sparse SVM (SSVM) as a feature selection method to select the essential handcrafted features from BreaKHis images.

Most of the previous studies on BreaKHis used conventional handcrafted features that achieved acceptable results. Some drawbacks of these conventional CADs are that: first, the quality of the CADs depends on the extracted
features, while acquiring representative features from the image is a very complex issue. Second, the acquired features may not be suitable for handling inter- and intra-class variation in the histopathological images [4]. Third, most of the extracted features are based on class label information (supervised); thus, they can lead to biased results [12].

In contrast to conventional CADs, which are based on extracted handcrafted features, DL plays a significant role in multiple classification tasks; it can achieve high performance and extract high-level features from histopathology images automatically. To mitigate the limitations of conventional CAD practices, numerous recent scholars have entrusted classification tasks to deep learning models, which can be adapted to select the most powerful features based on convolutional and pooling layers.

Among DL models, CNN-based feature extraction models have received considerable attention from scholars for the classification of breast histopathology images [13], [14], [15]. For example, the authors in [13] employed ResNet50 as a CNN feature extractor for the BreaKHis dataset. Then, they evaluated the extracted features using a linear SVM and reported results of around 88% for 40x.

The researchers of [14] introduced hybrid AlexNet and VGG16 models as feature extractor methods. In these hybrid methods, the BreaKHis dataset was classified, and the maximum accuracy reported was 90.96%, while the VGG16 model achieved 90.96% for 40x magnification. It has been observed that AlexNet was better than VGG16 for feature extraction in this dataset. Besides, the performance increased as the number of training samples increased. Most of the above studies utilized different pre-trained models with a preprocessing step for the BreaKHis images. However, little investigation has been done in previous studies on solving binary and multiclass tasks for the BreaKHis dataset using different types of classifiers based on the features extracted from the VGG16 pre-trained model.

Figure 1: Breast histopathology images from the BreaKHis dataset with 40× magnification factor; top row: benign, bottom row: malignant. All images include their class label for the multiclass task [16].

and Material) are presented in Section 3, the results and discussion are detailed in Section 4, and finally, the conclusions and future work are drawn in Section 5.

II. LITERATURE REVIEW

Deep learning models have been used for breast cancer either as feature extraction models or as classification models. In the former, transfer learning approaches use DL models as feature extractors for images. Various studies have leveraged pre-trained DL models to classify breast cancer in the BreaKHis histopathology images. For instance, VGG16 and AlexNet were used as pre-trained DL models by the authors in [14] to classify the BreaKHis dataset. The authors in [17] introduced a CNN as a feature extractor for the BreaKHis dataset. Then, they utilized data augmentation to reduce the imbalance between the class labels. The reported results for their method varied from 88.3% to 94.1%, depending on the data augmentation. In [18], the authors investigated the pre-trained Inception-v3 with BLSTM to classify breast cancer histopathology images into three different classes: benign, normal, and carcinoma. Their experimental results showed that the proposed model achieved an accuracy of around 91.35%. In another study, the authors in [19] used AlexNet followed by an SVM to classify breast cancer images into four classes. The authors used the BreaKHis dataset with 200x magnification and achieved an accuracy of 77% for four classes and 83.3% for binary classification.

The authors in [20] discriminated between benign and malignant cases in the BreaKHis dataset at various magnifications. They adapted AlexNet by changing the last layer to include two classes, and then used it as a feature extractor and classifier at the same time. They achieved an improved F-measure of 94.6% for binary classification at 40x magnification. However, they ignored the multiclass task, which is a challenging task in this domain.

VGG16 was used by [21] for BreaKHis classification at 40x magnification. They achieved 89.6% on multiclass classification. One reason for these low results is that both studies used a CNN as the classifier model, which may require careful fine-tuning of its parameters. Thus, feeding the extracted features to separate classifiers may produce better results.

In view of the previous studies, some points are observed: first, using a pre-trained CNN as a feature extraction method is better than merging the extraction and classification in one model. Second, the multiclass classification task in BreaKHis is challenging and needs more attention from scholars. Third, to reduce the imbalance in the dataset, a data augmentation strategy is needed.
Authorized licensed use limited to: NANJING UNIVERSITY OF AERONAUTICS AND ASTRONAUTICS. Downloaded on April 08,2024 at 09:46:29 UTC from IEEE Xplore. Restrictions apply.
2021 International Conference on Information Technology (ICIT)
Breast histopathological images carry many textures, shapes, and histological structures, such as nuclei and cytoplasm. Thus, the proposed method utilizes VGG16 to extract deep representative features from the input histopathological images. The main steps of the proposed method, as presented in Algorithm 1, are:

First, the input training images are resized to match the size of the input layer of VGG16. Second, VGG16 is adapted for the histopathological images by removing the last three fully connected (FC) layers (the top layers of the stack), which represent the classifier, as illustrated in Figure 3. Thus, the remaining layers are convolutional and pooling layers. The input image is passed through the model to extract 4096 features.

After preparing the data by extracting the features from all input images, we divide the data into training (90%) and testing (10%) sets to be used with a set of classifiers. Then, we train five diverse classifiers, namely, RBF Support Vector Machine (SVM), Logistic Regression (LR), Poly SVM, K-Nearest Neighbors (KNN), and Neural Network (NN).

Figure 4: The proposed model for breast histopathological image classification

4) Material

In the implementation of the present study, the BreaKHis [9] benchmark dataset for breast cancer histopathology images with 40x magnification is used. The 40x dataset is divided into benign lesions, consisting of 625 images, and malignant lesions, consisting of 1370 images. Each of the malignant and benign cases is divided into four subclasses. Thus, in multiclass classification, eight different classes are handled. We balanced the dataset to reduce the bias in the classifiers.

IV. RESULTS AND DISCUSSION

1) Experiment Setup

This study conducted two main experiments with images of 40x magnification from the BreaKHis dataset. For each image, VGG16 is used to extract 4096 features. These features, together with the class labels of the images, are used to construct the dataset. This dataset is divided into 90% training and 10% testing, as in [24]. Then, we trained five diverse classifiers: RBF SVM, LR, Poly SVM, KNN with k=1, and NN with 300 iterations.

2) Evaluation Methods

In this study, several metrics, namely accuracy, sensitivity, and specificity, are applied to measure the image diagnosis performance for binary and multiclass classification tasks [25]. All of these metrics are averaged over 30 runs, as in [20].

3) Results

Two major experiments on the 40x magnification BreaKHis images have been conducted to evaluate the performance of VGG16 for feature extraction. The first experiment addresses the binary classification task (benign vs. malignant), as shown in Table 1, while the second experiment handles the multiclass task (eight subclasses), as shown in Table 2. In each experiment, the results of the five different classifiers are collected. It is worth mentioning that all of the reported results are for the test dataset.

Table 1: Average performance of the proposed VGG16 model (%) for the binary classification task, with standard deviation

Classifier   Accuracy        Sensitivity     Specificity
RBF-SVM      95.1 ± 4.47     95.1 ± 4.40     93.9 ± 4.42
LR           88.8 ± 4.42     88.8 ± 4.44     84.8 ± 2.20
Poly-SVM     96.0 ± 2.20     96.0 ± 2.22     93.9 ± 4.41
KNN          87.9 ± 2.24     87.9 ± 2.21     77.2 ± 2.26
NN           90.4 ± 0.0      90.4 ± 0.0      86.36 ± 1.11

As illustrated in Table 1, the proposed model obtained accuracies of 96.0% and 95.1% using the polynomial SVM and the RBF SVM, respectively. These values confirm that the SVMs have the highest average accuracy scores for 40x magnification. In comparison, the proposed model outperforms the previous studies on 40x. For example, in [9], the authors achieved 88% for 40x magnification using a combination of CNNs trained from scratch. It is worth mentioning that the proposed method also outperforms the work in [14], which combines the features of VGG16 and AlexNet. We can see from Table 1 that the worst classifier was KNN, which achieved around 87.9% accuracy for binary classification. One possible reason for this result is the choice of the k value for this classifier.

Table 2: Average performance of the proposed VGG16 model (%) for the multiclass classification task, with standard deviation

Classifier   Accuracy         Sensitivity      Specificity
RBF SVM      89.83 ± 0.005    89.83 ± 0.005    100 ± 0.0
LR           87.2 ± 2.27      87.2 ± 2.23      100 ± 0.0
Poly SVM     88.1 ± 0.007     88.1 ± 0.007     100 ± 0.0
KNN          87.2 ± 2.24      87.2 ± 2.28      100 ± 0.0
NN           87.2 ± 2.32      87.2 ± 2.83      100 ± 0.0

The results for the eight-class classification are reported in Table 2. As stated in Table 2, the proposed model based on the RBF SVM gained the highest score, with an average accuracy, sensitivity, and specificity of 89.83%, 89.83%, and 100%, respectively. Specificity (true negative rate) in the experiments means the number of negative samples that were correctly classified divided by the number of all negative samples. The high specificity in this study implies that the proposed method is able to predict the true negative cases accurately.
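The evaluation metrics described in this section can be illustrated with a small sketch. The paper does not state how sensitivity and specificity are aggregated over the eight classes, so the code below assumes one-vs-rest counting followed by macro-averaging; the sample standard deviation in `summarize`, the function names, and the toy labels are likewise illustrative assumptions and are not taken from the study.

```python
from statistics import mean, stdev


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def one_vs_rest_rates(y_true, y_pred, positive):
    """Sensitivity and specificity when `positive` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def macro_rates(y_true, y_pred):
    """Average the one-vs-rest rates over all classes (macro-averaging is an assumption)."""
    classes = sorted(set(y_true))
    rates = [one_vs_rest_rates(y_true, y_pred, c) for c in classes]
    return mean(r[0] for r in rates), mean(r[1] for r in rates)


def summarize(values):
    """Mean and sample standard deviation of one metric over repeated runs."""
    return mean(values), stdev(values)


# Toy four-class example with hypothetical labels (not BreaKHis results).
y_true = ["A", "A", "F", "F", "DC", "DC", "LC", "LC"]
y_pred = ["A", "F", "F", "F", "DC", "DC", "LC", "DC"]

acc = accuracy(y_true, y_pred)
sens, spec = macro_rates(y_true, y_pred)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

In this toy example the macro-averaged specificity (about 0.92) is higher than the sensitivity (0.75). Under one-vs-rest counting, each class has many more negative samples than positive ones, so a few misclassifications reduce specificity only slightly. This may partly explain why multiclass specificity tends to be high.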
Table 3: Comparison of the performance of the proposed model with previous deep learning approaches on the same dataset with 40x magnification.

Previous studies           Classification type        Type of classifier   Accuracy
VGG16 [26]                 Binary                     VGG16                86.2 ± 2.0
AlexNet [20]               Binary                     AlexNet              94.6%
AlexNet and VGG16 [14]     Binary                     SVM                  84.87 ± 1.14
VGG16 [21]                 Multi-class (8 classes)    SVM + NN + KNN       89.6 ± 4.0
Proposed for binary        Binary                     Poly-SVM             96.0 ± 2.20
Proposed for multiclass    Multi-class (8 classes)    RBF-SVM              89.83 ± 0.005

Overall, from the results in Table 1, Table 2, and Table 3, we noticed that the proposed model with SVM classifiers provides superior performance in binary and multiclass classification tasks. We also observed that specificity increases remarkably in multiclass classification, which suggests that the features extracted from VGG16 are able to discriminate between the complex cases in the breast cancer domain.

V. CONCLUSIONS AND FUTURE WORK

Extracting high-level features from breast histopathological images assists in improving the effectiveness of the diagnostic process. Thus, the main objective of this study is to utilize VGG16, a pre-trained CNN deep learning model, to extract high-level features from breast images. To do that, we removed the last fully connected layers in VGG16. Then, the obtained features were classified using a set of heterogeneous classifiers. Extensive experiments on the public BreaKHis dataset were carried out, and a set of performance metrics was calculated for performance evaluation (on the test data portion). The experimental results outperformed various state-of-the-art techniques. This demonstrates the effectiveness of the features extracted using VGG16 with polynomial and RBF SVM classifiers. In the future, we will investigate ensembles of different classifiers and pre-trained models to deliver high performance in this complex domain.

VI. REFERENCES

[1] M. A. Rahman, R. chandren Muniyandi, D. Albashish, M. M. Rahman, and O. L. Usman, "Artificial neural network with Taguchi method for robust classification model to improve classification accuracy of breast cancer," PeerJ Computer Science, vol. 7, p. e344, 2021.
[2] M. Al-Masri, B. Aljalabneh, H. Al-Najjar, and T. Al-Shamaileh, "Effect of time to breast cancer surgery after neoadjuvant chemotherapy on survival outcomes," Breast Cancer Research and Treatment, pp. 1-7, 2021.
[3] O. Dorgham, M. Ryalat, and M. A. Naser, "Automatic body segmentation for accelerated rendering of digitally reconstructed radiograph images," Informatics in Medicine Unlocked, vol. 20, p. 100375, 2020.
[4] S. Sahran, D. Albashish, A. Abdullah, N. Abd Shukor, and S. H. M. Pauzi, "Absolute cosine-based SVM-RFE feature selection method for prostate histopathological grading," Artificial Intelligence in Medicine, vol. 87, pp. 78-90, 2018.
[5] M. Aubreville, C. A. Bertram, C. Marzahl, C. Gurtner, M. Dettwiler, A. Schmidt, et al., "Deep learning algorithms out-perform veterinary pathologists in detecting the mitotically most active tumor region," Scientific Reports, vol. 10, pp. 1-11, 2020.
[6] D. Albashish, A. I. Hammouri, M. Braik, J. Atwan, and S. Sahran, "Binary biogeography-based optimization based SVM-RFE for feature selection," Applied Soft Computing, vol. 101, p. 107026, 2021.
[7] A. Adam, A. A. I. Mudjahidin, B. Hasan, and D. Albashish, "Injecting Tissue Texture and Morphology Comprehension into Algorithm for Cancer Grading," 2020.
[8] O. Dorgham, M. Alweshah, M. Ryalat, J. Alshaer, M. Khader, and S. Alkhalaileh, "Monarch butterfly optimization algorithm for computed tomography image segmentation," Multimedia Tools and Applications, pp. 1-34, 2021.
[9] F. A. Spanhol, L. S. Oliveira, C. Petitjean, and L. Heutte, "A dataset for breast cancer histopathological image classification," IEEE Transactions on Biomedical Engineering, vol. 63, pp. 1455-1462, 2015.
[10] A. A. Samah, M. F. A. Fauzi, and S. Mansor, "Classification of benign and malignant tumors in histopathology images," in 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA), 2017, pp. 102-106.
[11] M. A. Kahya, W. Al-Hayani, and Z. Y. Algamal, "Classification of breast cancer histopathology images based on adaptive sparse support vector machine," Journal of Applied Mathematics and Bioinformatics, vol. 7, p. 49, 2017.
[12] Z. Mohammad, "Cryptanalysis and improvement of the YAK protocol with formal security proof and security verification via Scyther," International Journal of Communication Systems, vol. 33, p. e4386, 2020.
[13] S. Saxena, S. Shukla, and M. Gyanchandani, "Pre-trained convolutional neural networks as feature extractors for diagnosis of breast cancer using histopathology," International Journal of Imaging Systems and Technology, vol. 30, pp. 577-591, 2020.
[14] E. Deniz, A. Şengür, Z. Kadiroğlu, Y. Guo, V. Bajaj, and Ü. Budak, "Transfer learning based histopathologic image classification for breast cancer detection," Health Information Science and Systems, vol. 6, pp. 1-7, 2018.
[15] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.