VGGIN-Net Deep Transfer Network For Imbalanced Breast Cancer Dataset

This article proposes a deep learning model called VGGIN-Net for classifying images in an imbalanced breast cancer dataset. VGGIN-Net uses transfer learning by combining layers from a pre-trained VGG16 model with an Inception module and additional layers. Fine-tuning and data augmentation help reduce overfitting. Experiments show VGGIN-Net achieves higher classification accuracy than other methods for images of different magnifications, demonstrating its effectiveness in handling class imbalance.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCBB.2022.3163277, IEEE/ACM Transactions on Computational Biology and Bioinformatics

IEEE TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, TCBB-2020-10-0577.R1 1

VGGIN-Net: Deep Transfer Network for Imbalanced Breast Cancer Dataset
Manisha Saini, and Seba Susan, Member, IEEE.

Delhi Technological University, Delhi, India

Abstract— In this paper, we present a novel deep neural network architecture involving a transfer learning approach, formed
by freezing and concatenating all the layers till the block4 pool layer of the VGG16 pre-trained model (at the lower level) with the layers
of a randomly initialized naïve Inception block module (at the higher level). Further, we have added batch normalization, flatten,
dropout and dense layers to the proposed architecture. Our transfer network, called VGGIN-Net, facilitates the transfer of domain
knowledge from the larger ImageNet object dataset to the smaller imbalanced breast cancer dataset. To improve the performance of
the proposed model, regularization was used in the form of dropout and data augmentation. A detailed block-wise fine-tuning has
been conducted on the proposed deep transfer network for images of different magnification factors. The results of extensive
experiments indicate a significant improvement in classification performance after the application of fine-tuning. The proposed deep
learning architecture with transfer learning and fine-tuning yields the highest accuracies in comparison to other state-of-the-art
approaches for the classification of the BreakHis breast cancer dataset. The architecture is designed in such a way that it can be
effectively transfer learned on other breast cancer datasets.

Index Terms—VGG16, pre-trained model, Inception module, transfer learning, deep learning, convolutional neural networks, fine
tuning.
————————————————
● Manisha Saini is a research scholar in the Department of Computer science and Engineering, Delhi Technological University, Delhi, India, E-
mail: [email protected]
● Seba Susan is a Professor in the Department of Information technology, Delhi Technological University, Delhi, India, E-mail:
[email protected]


1 INTRODUCTION
Cancer is a grievous health problem that affects both developed and developing countries [1]. Breast cancer is one of the most commonly found cancers in women. Manual diagnosis of cancer from histopathological biomedical images is challenging because of the high probability of inaccurate detection caused by human error. To avoid the time-consuming manual task of cancer diagnosis, a computer-aided diagnosis system that automates the process efficiently is required. Various automated processes are already available to distinguish Malignant and Benign cancer images [2]. Deep learning has further improved the performance of the automated process for the diagnosis of cancer at an early stage. The role of deep learning in various applications has increased tremendously with the recent advancements in computational hardware accelerators and parallel processing. Health care and biomedicine are popular fields where deep learning has shown remarkable improvements [3].

The core of the deep learning architecture is the convolutional neural network (CNN) [4]. A basic CNN architecture consists of various convolutional building blocks stacked together to automate feature extraction and classification, unlike traditional machine learning where the two steps are performed separately. Broadly, CNNs are categorized into (i) networks which are trained from scratch, and (ii) pre-trained networks [5] which are trained beforehand on a large dataset, facilitating the transfer of knowledge from one domain into another.

A common problem associated with real-world datasets is the class imbalance problem. It is challenging to tackle an imbalanced dataset due to the disproportionate distribution of samples of different classes. The BreakHis breast cancer dataset is an instance of an imbalanced dataset with the Malignant samples outnumbering the Benign samples. The imbalanced nature of this dataset [6] brings in several challenges, as the class imbalance problem causes several incorrect predictions. So there is a need for a model which can handle the class imbalance situation effectively and correctly detect cancerous patterns from biomedical data. In order to tackle this problem, we propose in this paper a novel deep learning based architecture incorporating transfer learning. The transfer learning approach using pre-trained networks has the advantage of transferring the learned weights from an architecture trained on another domain to the biomedical domain, which ultimately reduces the training time and computational cost required to train the model from scratch. The contributions of the paper can be summarized as: (a) Successful design of a novel deep network architecture using a transfer learning approach to solve the class imbalance problem in breast cancer datasets. The proposed architecture is created by combining the relevant layers from the VGG16 pre-trained network (layers till block4 pool layer) with the naïve Inception module, in combination with flatten, batch normalization and dense layers. Also, certain regularization techniques have been infused in the proposed model such as data augmentation and dropout
which overall helps to reduce the overfitting to a great extent. (b) The proposed network has been successfully tried and tested on images of different magnification factors: 40X, 100X, 200X and 400X, establishing the network's invariance to size and scale of the image. The results indicate an overall improvement in the classification performance in comparison to other state-of-the-art approaches. (c) We have analyzed the effect and significance of block-wise fine-tuning on the proposed deep transfer neural network architecture, which has substantial impact on the classification performance. (d) The proposed 24-layer network architecture has been articulated by integrating the right combination of layers in a well-ordered way to reduce the computational complexity involved. The formulated network architecture is designed in such a way that it can be successfully transfer learned on other breast cancer datasets.

The various sections of the paper are summarized as follows. Section 2 describes the state of the art in the fields of machine learning and deep learning for breast cancer diagnosis. In Section 3, the architecture and learning methodology used in the proposed work are discussed in detail. Further, in Section 4, the datasets used in the experimental task along with the implementation details are presented. In Section 5, the final analysis and discussion of results related to the classification task performed and comparison with other state-of-the-art networks is presented. The final conclusion is given in the last section of the paper.

2 RELATED WORK
Various machine learning and deep learning based networks have been proposed for detecting cancerous patterns from cell images [7]. In [46], the authors proposed an appropriate feature-classifier combination for multi-class imbalanced datasets by extracting the deep features using visual codebook generation along with the non-linear Chi2 SVM classifier. A number of studies have been reported in the literature using CNNs for classification of medical images. Spanhol et al. (2016) used patches, extracted by applying the sliding window mechanism, in order to efficiently train CNNs [8]. Feng et al. (2018) also focused on patch-based learning from images, with emphasis on reducing the complexity of the network by reducing the number of parameters [9]. Bayramoglu et al. (2016) proposed single-task and multi-task CNN models for the classification of the BreakHis histopathological dataset [10]. Bardou et al. (2018) compared various configurations of CNN with other traditional approaches. They also emphasized the role of data augmentation operations on the performance of the model. Their results validated that the deep features extracted using CNN outperform the handcrafted features [11].

In order to tackle the class-imbalance problem which occurs due to an unequal distribution of samples in the dataset, several methods have been suggested in data mining such as undersampling, oversampling and hybrid sampling strategies [12],[13]. Data augmentation operations can be applied on both the minority class and the majority class, or separately on the minority class [14] [43] in order to make the number of samples of the minority class equivalent to that of the majority class. In [43], the authors increased the number of samples in the minority class using the Deep Convolutional Generative Adversarial Network (DCGAN) by synthetically generating fake samples, and the augmented dataset was learned using a modified VGG16 based architecture. In previous work, a comparison of BOVW, deep neural networks and data augmentation for the imbalanced computer vision dataset was presented in [15]. It was observed from the study that the deep neural network mitigates the effect of class imbalance in the most effective manner. Traditionally, the methods that are used to tackle the class-imbalance problem modify the data distribution, which can be challenging for medical datasets as it may cause loss of relevant information. Ding et al. (2017) demonstrate that the use of very deep networks (models with more than 10 layers) improves the network training process and achieves a better rate of convergence for imbalanced datasets [45]. The authors experimentally validate the claim by using deep network architectures which are trained for 100 epochs in comparison to shallower networks. Other relevant approaches have also been proposed to deal with the imbalance situation; for example, Abbas et al. [44] proposed a Decompose, Transfer and Compose (DeTraC) model using the concept of class decomposition within the CNN approach, which helps to learn the class boundaries effectively. The classification task is performed after applying the error correction criteria to the softmax layers of the network, which improved the classification performance.

Different pre-trained networks have been used by many researchers to detect cancer patterns. Rakhlin et al. (2018) worked on the ICIAR 2018 Grand Challenge on Breast Cancer Histology Images by extracting deep features from the pre-trained networks, which were then used to train a gradient boosted trees classifier [16]. Deniz et al. (2018) concatenated the features extracted from various layers of AlexNet and VGG16 pre-trained models. The deep features extracted from these layers were used in conjunction with a support vector machine classifier for the classification of Benign and Malignant cancer images [17]. Gupta and Bhavsar (2018) proposed a sequential framework which consists of deep multi-layer features extracted from a fine-tuned DenseNet on the BreakHis dataset. The deep features extracted from multiple layers were given to an XGBoost classifier [18]. In our proposed transfer learning approach, we have created a new deep architecture by concatenating pre-trained layers at the lower level with trainable layers at the higher level. The higher layer features are learned specific to the breast cancer dataset. This makes our approach distinctive from previous works, since we focus on transferring general features from the lower layers of large pre-trained convolutional models while the higher-level features are learned specifically from the cancer dataset.
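
As a point of reference for the data-level strategies surveyed in this section, the following minimal sketch illustrates plain random oversampling of the minority class. It is not the method proposed in this paper, which relies on transfer learning, augmentation and fine-tuning instead of resampling. The function name `random_oversample` and the toy label list are hypothetical and introduced only for illustration; the class counts mirror the 40X BreakHis figures reported in Table 1.

```python
import random
from collections import Counter


def random_oversample(labels, seed=0):
    """Return sample indices in which every class is as large as the largest.

    Hypothetical helper for illustration only. All original indices are kept;
    minority-class indices are drawn again (with replacement) until each
    class reaches the size of the majority class.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    target = max(len(indices) for indices in by_class.values())

    balanced = []
    for indices in by_class.values():
        balanced.extend(indices)                      # keep every original sample
        extra = target - len(indices)                 # how many duplicates are needed
        balanced.extend(rng.choice(indices) for _ in range(extra))

    rng.shuffle(balanced)
    return balanced


# Toy labels mirroring the 40X BreakHis counts (625 Benign, 1,370 Malignant).
labels = ["benign"] * 625 + ["malignant"] * 1370
balanced_indices = random_oversample(labels)
balanced_counts = Counter(labels[i] for i in balanced_indices)
print(balanced_counts)
```

Because the duplicated samples carry no new information, this kind of resampling changes the data distribution without adding variety, which is the limitation for medical datasets noted above.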

1545-5963 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: Valliammai Engineering College - Chennai. Downloaded on May 09,2022 at 09:54:13 UTC from IEEE Xplore. Restrictions apply.

3 PROPOSED DEEP TRANSFER NETWORK: VGGIN-NET
A novel deep neural network architecture is proposed by concatenating the VGG16 architecture layers [19] with a single naïve Inception block [20], as shown in Figure 1. The Inception block is a module which is popularly used in the Inception architecture (a.k.a. GoogLeNet) [20]. We have presented a transfer learning approach involving the novel deep neural network architecture, named VGGIN-Net, which is formed by freezing and concatenating all the layers till the block4 pool layer of the VGG16 pre-trained model (at the lower level) with the layers of a randomly initialized (using Glorot uniform distribution) naïve Inception module (at the higher level). Further, several randomly initialized higher layers are added, such as the dense layer along with batch normalization, flatten, and dropout layers. The proposed architecture is created by concatenating the layers as shown in Figure 2. The new deep network is then trained on the breast cancer dataset.

Figure 1. Proposed deep network architecture VGGIN-Net showing the lower layers till block4 pool layer of VGG16 pre-trained network and the higher layers comprising the naïve Inception module and the dense layers.

The 24-layer architecture is constructed as shown in Figure 2 by first stacking the VGG16 layers, starting from the VGG16 pre-process layer till the block4 pool layer. The 224 x 224 x 3 image is given to the VGG16 pre-trained network as input. The reason for considering the features till the block4 pool layer from the VGG16 pre-trained model is to extract the most relevant bottleneck features. Also, the consideration of features beyond the block4 pool layer would only increase the computational difficulties without improvement in the performance of the model, as validated by our experimental results. Further, the obtained relevant bottleneck features till the block4 pool layer of the VGG16 pre-trained network are concatenated with the layers of the naïve Inception block. The naïve Inception block consists of convolutional layers having filters of sizes 1x1, 5x5 and 3x3, with each layer having stride of size 3x3 and ReLU (Rectified Linear Unit) activation function [21], with the addition of a max pooling 2D layer.

In this paper, we have considered the goodness of both the pre-trained models (VGG16 and Inception) to create a more robust architecture which effectively resolves the class imbalance problem. The use of VGG16 pre-trained layers in the initial stage of our network was motivated by the fact that the VGG16 architecture achieves good accuracy for most image classification problems and this network is efficient in dealing with images of different scales and magnification factors. This is particularly useful for our study as one of the target datasets that we experiment with is the BreakHis dataset, which consists of images of varying magnification factors. Throughout our study, we do not introduce skip connections or depth separable convolutions as they are beyond the scope of this study. The idea is to build on the strengths of popularly used existing architectures through a sequentially stacked layer model and the use of multi-level features that adapt better to histopathological datasets.

There are certain crucial deployment challenges faced in the VGG based architecture which motivated us to modify the VGG16 architecture. Inspired by previous works [22],[23],[24],[25], we have modified the VGG16 architecture so as to overcome the deployment challenges that come with VGG-Nets. Such computational challenges are prevalent even on powerful single-GPU (Graphics Processing Unit) systems due to the network's large memory footprint. The sequential ConvNet VGG16 bears a large number of parameters (140 million) due to the presence of multiple convolutional layers of varying receptive fields; hence, it can become inefficient for inference at test time. Due to the large number of parameters, the VGG16 network is also prone to the vanishing gradient problem. The presence of three fully connected layers in the original VGG16 architecture is primarily responsible for the major bulkiness of the model. So we have extracted the relevant deep features up to block4 of the pre-trained model and removed all the layers after that, which added extra complexity and computation in the proposed architecture. Further, to address the shortcomings seen in the VGG16 architecture, we have added the naïve Inception block as an additional block in the proposed architecture. The GoogLeNet incarnation of the Inception architecture uses multiple auxiliary classifiers connected to intermediate layers to tackle the vanishing gradient problem. In our case, no auxiliary loss has been used to train the Inception block because of the smaller number of images in the currently used dataset in comparison to the large-scale ImageNet. The reason for the addition of a single Inception block in the higher layers of the proposed architecture is that it would not require auxiliary classifiers and the model can converge by itself using a single loss only. Also, it would be less computationally expensive to add a single Inception module instead of multiple similar modules. The main idea behind adding the naïve Inception module in the higher end of our network is to cover a larger region of a convolved image while preserving the finer details. The naïve Inception block is specifically engineered to convolve in parallel such that accurate detailing is possible through 1x1, 3x3, 5x5 convolutions (64, 128, 32 filters were used respectively). The goal of adding the naïve module is to increase the CNN's learning ability and abstraction of complex filters, which was also found to be a drawback in VGG-based architectures. Moreover, the advantage of the Inception architecture is that it is able to perform well even with a single fully
connected layer [20].

Our consideration of the naïve Inception block as a suitable choice along with a single dense layer makes the architecture less computationally expensive. A single dense (fully connected) layer with softmax activation function is present at the output to enable the proposed network architecture to deal with the higher end linear features and to find the probability of the image belonging to each class for the two-class classification problem (i.e. Benign and Malignant). Further, batch normalization, flatten, and dropout layers are added to enhance the network performance. We refrain from using multiple batch normalization layers (one batch normalization per convolutional layer), as inspired by [26], since a single batch normalization should suffice when layers are being concatenated as in the case of the Inception module. The flatten layer is added after batch normalization to reduce the features into one dimension of size 144256. Dropout regularization [27] with a rate of 0.4 is added after the flatten layer, which in turn helps to avoid overfitting problems.

Figure 2. Proposed model architecture VGGIN-Net formed by concatenating VGG16 layers up to the block4 layer with the naïve Inception module along with the dense layer used for classification.

Our transfer network thus facilitates the transfer of domain knowledge from the larger ImageNet object dataset [28] to the smaller breast cancer dataset in the lower layers itself, and learns the higher layer features specific to cancer images. In doing so, we follow the guidelines of Yosinski et al. (2014) [29] who propounded the theory that when the base and target datasets are dissimilar, the use of pre-trained weights alone may degrade the performance. It is essential that the higher level features should be specific to the target dataset instead of the base dataset. Another regularization technique involved in the proposed architecture, specifically at data-level, is data augmentation, which is applied on the training dataset in order to synthetically increase the number of samples and also to improve the overall performance of the network by reducing the overfitting problem [30]. The typical CNN training process employing data augmentation would include on-the-fly generation of random image samples across training mini-batches by use of affine transformations. In the proposed network, we have applied certain data augmentation operations, as inspired by [31], on both the classes (Benign and Malignant). Figure 3 displays some randomly generated samples after applying data augmentation operations. The data augmentation operations applied on images include: (a) random rotation within a range of 20 degrees, (b) random width and height shift operation with range 0.2, i.e. translation of the images both horizontally as well as vertically by a number of pixels less than or equal to 20% of the actual image dimensions, (c) random horizontal and vertical flip, combined with (d) random shear and random zoom operations with the same range as that of translation. These values were determined after many experimental trials in order to maximize performance. To improve the regularization of our network further, we make use of random crops [31], which helps the network to learn even better due to the translation invariance property of convolutional networks. The image patches are resized to 224 x 340 using bilinear resizing and then randomly cropped into patches of 224 x 224. At inference (test) time, a central crop was used.

Figure 3. Illustration of various data augmentation operations applied on the BreakHis dataset.

A fine-tuning approach has further been adopted on the proposed deep transfer learning architecture, inspired by several works in the literature. In the case of the DeTraC approach, Abbas et al. [43] focused on the relevance of applying fine-tuning on different architectural blocks of a pretrained CNN. Similarly, Sharma and Mehra (2018) emphasized the role of transfer learning, full training, and fine-tuning of several pre-trained networks for the medical image dataset [32]. From the analysis it was found that VGG16 pre-trained network features with logistic regression as classifier give the best performance amongst other combinations of VGG19 and ResNet-50 pre-trained networks with regression. The same authors further extended their work in [33], elaborated on the role of layer-wise fine-tuning, and presented an in-depth study of layer-wise fine-tuning on AlexNet for the BreakHis dataset. From the study they showed that different magnification factors had influence in selecting
the appropriate layers to be fine-tuned in the network architecture. Kandel and Castelli (2020) [34] also conducted an extensive comparative analysis of block-wise and complete fine-tuning of various pre-trained networks. The pre-trained networks used were VGG16, VGG19 and Inception on a histopathological image dataset. From the analysis it was found that fine-tuning of the complete pre-trained network might not be the ideal choice in all situations while considering images of different magnification factors. All these previous works motivated us to apply a block-wise fine-tuning approach on the proposed network.

Our research work proposes a novel approach by combining a modified VGG16 architecture with a naïve Inception block to tackle the imbalance problem in breast cancer classification. This has been empirically validated by conducting extensive experimentation along with an ablation study to support our claim that the proposed architectural combination is able to solve the breast cancer classification task effectively with an explainable and less computationally costly architecture. By adjusting the sequence and the right combination of appropriate layers in the proposed architecture, we obtain a competent architecture that enhances the performance of the classifier and can be utilized for transfer learning on any other breast cancer dataset. The modified VGG16 architecture is chosen so as to resolve the deployment issues associated with the original VGG-Nets by reducing the number of dense layers to one and extracting the appropriate features from suitable layers. In addition, we have introduced the naïve Inception block with the batch normalization layer to address the vanishing gradient problem, and data augmentation, regularization and fine-tuning techniques have been used for improving the prediction performance.

4 EXPERIMENTAL SETUP

4.1 Datasets
Two different datasets are considered in the experimental task: the BreakHis and Breast-Histopathological-Images datasets. The BreakHis dataset is used to train the proposed network. The trained network also supports transfer learning, as validated by the classification task performed on the Breast-Histopathological-Images dataset.

4.1.1 BreakHis Histopathological dataset
The BreakHis dataset [35] used for the current study is a publicly available dataset consisting of 7909 high resolution breast cancer images belonging to two different classes, i.e. Benign and Malignant, and composed of different magnification factors, 40X, 100X, 200X and 400X, as illustrated in Table 1. The dataset also contains labels comprising eight sub-types: four benign tumor types, adenosis (A), fibroadenoma (F), phyllodes tumor (PT), and tubular adenoma (TA); and four malignant tumor types, ductal carcinoma (DC), lobular carcinoma (LC), mucinous carcinoma (MC) and papillary carcinoma (PC). Skewed classes are observed in scenarios where the number of observations belonging to one class is significantly lower than the other class. This is usually referred to as the class imbalance problem; this problem is prevalent in the BreakHis dataset, as evident from Table 1, where the number of Malignant samples is far more than the number of Benign samples. For conducting the experimental task, a 70:30 split is selected: 60% of the samples are selected for training, 10% are kept for validation and the remaining 30% are for testing. The original size of images present in the BreakHis dataset is 700 x 460.

TABLE 1
SAMPLE DISTRIBUTION OF BENIGN AND MALIGNANT IMAGES IN THE BREAKHIS DATASET WITH RESPECT TO THE DIFFERENT MAGNIFICATION FACTORS

Class                         Magnification Factor
                              40X      100X     200X     400X
Benign                        625      644      623      588
Malignant                     1,370    1,437    1,390    1,232
Total No. of Images (7,909)   1,995    2,081    2,013    1,820

4.1.2 Breast-Histopathological-Images dataset
We have also validated our deep transfer network on another breast cancer dataset. For another set of experiments we have considered the Breast-Histopathological-Images dataset [36]. The Breast-Histopathological-Images dataset comprises 277524 image samples having microscopic views of breast cell specimens at 40X magnification factor. Each image patch extracted from whole slides is of size 50 x 50, and all experiments are evaluated using a similar stratified split such that 30% of the total samples are kept for testing, 60% for training and 10% for validation. The dataset tries to address growing challenges in detecting invasive ductal carcinoma (IDC), which is the most common type of breast cancer. A highly imbalanced class distribution is observed in this dataset with 198738 IDC -ve images and 78786 IDC +ve images. It is interesting to note that in this particular dataset, images from the IDC -ve class are in the majority in comparison to the number of images present in the IDC +ve class.

4.2 Experimental setup
All the experiments were conducted on Google Cloud Platform using a single Compute Engine VM instance with a dual-core Intel Xeon CPU (2.00 GHz) and 8 GB RAM, and an NVIDIA Tesla T4 GPU accelerator with 16 GB memory. For performing all our experiments, we used the TensorFlow v2.3.0 framework with the help of the Keras API [37] using Python v3.8.9. The source code for our implementations is available in our GitHub repository (https://github.com/SainiManisha/VGGIN-Net). The different hyperparameters were selected so as to maximize the performance. (i) Adam optimizer, used with learning rate initially assigned as 0.001; the

Adaptive Moment Estimation (Adam) method automatically adapts the learning rate during further training. (ii) The loss function used was categorical cross entropy. (iii) The training batch size was set to 128 with a budget of 100 epochs. The proposed architecture is fine-tuned separately for each of the four magnification factors. While fine tuning the network, the learning rate is significantly reduced so that large gradient updates do not abruptly change the pre-trained weights. Fine tuning runs for a total of 50 epochs under a simple learning rate schedule: over the first 15 warmup epochs the learning rate is increased linearly from 1e-5 to 5e-5, after which it is decayed exponentially by a factor of 0.8.
To verify our claim that the proposed network architecture supports transfer learning on other target histopathological image datasets, we use the weights of our VGGIN-Net trained on the BreakHis 40X dataset, with some fine tuning, to classify IDC +ve and -ve images from the Breast-Histopathology-Images dataset. The weights from the 40X magnification factor were chosen because the target dataset also consists of images scanned at 40X. The hyperparameters for training the terminal dense layer for the new classification task are the same as in our other experiments, except that we use the Adam optimizer with a learning rate of 0.001, as any higher learning rate would adversely affect the performance of the classifier.

5 RESULTS AND DISCUSSION

5.1 Performance Analysis
In our study, we compare the proposed architecture with a few state-of-the-art deep learning approaches as well as popularly used CNN architectures. For this curated set of architectures we primarily apply a transfer learning approach based on weights pre-trained on the ImageNet dataset. This is because our target imbalanced classification dataset contains far fewer samples than the large-scale datasets (a few million images) that are almost always required to effectively train a large ConvNet from scratch. Initially, we experiment with the VGG architecture using the well-known VGG16 network proposed by Simonyan and Zisserman [19] (2014). We also used the GoogLeNet incarnation of the Inception architecture (as per the work of Szegedy et al. [20] (2015)) as well as the deep residual network ResNet-50 [26] proposed by He et al. (2016) to classify histopathological images from the imbalanced BreakHis dataset. These methods, when applied using transfer learning, serve as effective baselines. The performance of VGG16, GoogLeNet and ResNet-50 is reported in Table 2. We have also compared these methods with our proposed deep transfer network, named VGGIN-Net, obtained by taking the features extracted up to the block4 pool layer of VGG16 with a single dense layer, and also with the addition of an Inception block and a single dense layer, which is the proposed architecture. Adding a naïve Inception block to the VGG16 architecture modified with a single dense layer improves the accuracy at every magnification factor, for example from 0.9387 to 0.9628 at 40X (Table 2). From the results tabulated in Tables 2 to 9, it is observed that VGGIN-Net shows improvement in terms of accuracy, F1 score, IBA and GMean. ROC curve analysis with its AUC is also shown in Figure 5 to validate the proposed approach.
In Table 3, a comparative analysis with state-of-the-art methods is shown for the BreakHis dataset using accuracy as the evaluation parameter, with scores reported across several runs. The proposed network, with and even without fine tuning, attains the highest mean accuracy at 40X, 100X and 400X; at 200X the fine-tuned network attains the highest mean accuracy.
To support our claim that the proposed network architecture helps to tackle the class imbalance problem, further experiments were conducted. A comparative analysis covers well-known approaches intended to solve the class imbalance problem, i.e., undersampling and oversampling techniques. We observe from the comparison with the sampling experiments in Table 4 that the proposed architecture is able to tackle the class imbalance problem by itself, without requiring any sampling technique.
Extensive experiments were conducted on the block-wise fine-tuning technique applied to the proposed network. The analysis shows that block-wise fine tuning improves the performance, as depicted in Table 5. Different fine tuning combinations are found suitable for different magnification factors. For 40X, fine tuning of block3, block4 and the Inception block is the best choice, whereas for 400X, fine tuning of only block4 and the Inception block was found to be the best fit. Fine tuning of the complete network was found to be best for the 100X and 200X magnification factor images. It can be inferred from the results that complete fine-tuning of the network is not always the best choice, as different block-wise fine tuning combinations can be more suitable for particular magnification factors. Figure 4 depicts the validation accuracy and loss corresponding to the four magnification factors. It is clearly visible that with suitable block-wise fine tuning, the network training improves its anytime performance and the learning curves become more stable. For the 40X, 100X, 200X and 400X magnification factors of the BreakHis dataset, the best obtained accuracies are 98.51%, 97.53%, 96.88% and 95.28% respectively. In a nutshell, our work demonstrates that single-branch models can converge quite well; it is not intended to deal with the training of increasingly complex residual models. Rather, we aim at building a simple model with reasonable depth and favorable accuracy that

can be simply implemented using basic architecture blocks (like convolution, ReLU, max pooling, etc.) on a single branch while tackling imbalanced biomedical datasets.
To validate that the proposed deep transfer architecture supports further transfer learning on other breast cancer biomedical datasets, we performed the experiments illustrated in Table 6 using the Breast-Histopathological-Images dataset. The analysis validates that the VGGIN-Net architecture also supports transfer learning when tested on another breast cancer dataset.

5.2 Ablation Study
An ablation study has been conducted for the proposed VGGIN-Net architecture. In Table 7, we show experimental results on the 40X magnification factor that compare the use of block 3, 4 and 5 as the backbone features for our network. We inferred that feature extraction up to the block4 pool layer is the best choice. Another set of experiments was conducted to justify the selection of the naïve Inception block in the proposed architecture. The comparison with another variant of the Inception block, tabulated in Table 8, shows that the naïve Inception block is apt for our model. Although the dimensionality reduction variant of the Inception module is less computationally expensive than the naïve Inception block, the model performance obtained after combining the dimensionality reduction block with the proposed architecture is lower than that obtained with the naïve Inception block, as validated by the experimental results in Table 8. The naïve Inception block was initially used with 64, 128 and 32 filters for the 1x1, 3x3 and 5x5 conv layers respectively. In Table 9, we show experiments with different widening factors (K), where each value of K indicates the factor by which the number of filters is multiplied. It was found that a K value of 1 is the ideal choice instead of K values of 2, 3, 5 and 10 in the naïve Inception block. This also helps to validate our choice of the number of filters while keeping the computational complexity low, since K=1 is the lowest value considered. More results in the supplementary file highlight the significance of data augmentation for enhancing the performance of the various deep networks including VGGIN-Net. Table 10 reports the experiments with the proposed VGGIN-Net with and without data augmentation for the different magnification factors of the BreakHis dataset. VGGIN-Net with data augmentation achieves higher accuracy than VGGIN-Net without data augmentation at every magnification factor. The incorporation of data augmentation in the training pipeline helps reduce over-fitting by imparting the necessary regularization, allowing the models to learn continually across several epochs. In our case, we have applied random transformations, including random cropping of samples to a fixed crop size.
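
The effect of the widening factor K on the size of the naïve Inception block can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: it assumes the block consists of three parallel 1x1, 3x3 and 5x5 convolutions (64, 128 and 32 filters at K=1, as stated in the text) plus a 3x3 max-pooling branch that passes the input channels through unchanged, with all branches concatenated, and that the input is the 512-channel block4_pool output of VGG16. The function name `naive_inception_stats` is hypothetical, and batch normalization parameters are ignored.

```python
def naive_inception_stats(in_channels, k=1, base_filters=(64, 128, 32)):
    """Output channels and convolution parameters of a naive Inception block.

    Assumed layout (not confirmed by the paper): parallel 1x1, 3x3 and 5x5
    convolutions plus a 3x3 max-pooling branch, concatenated channel-wise.
    """
    kernel_sizes = (1, 3, 5)
    filters = [k * f for f in base_filters]
    # Each convolution has kernel*kernel*in_channels weights per filter plus one bias.
    params = sum(ks * ks * in_channels * f + f
                 for ks, f in zip(kernel_sizes, filters))
    # The pooling branch has no weights and keeps the input channel count.
    out_channels = sum(filters) + in_channels
    return out_channels, params


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        out_c, n_params = naive_inception_stats(in_channels=512, k=k)
        print(f"K={k:>2}: output channels={out_c}, conv parameters={n_params:,}")
```

Under these assumptions the number of convolution parameters grows linearly with K, which is consistent with the remark that K=1 keeps the computational cost lowest among the values considered.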

TABLE 2
PERFORMANCE EVALUATION OF VGG16, GOOGLENET, AND RESNET-50 WITH THE MODIFIED VGG16 ARCHITECTURE AND THE PROPOSED APPROACH ON BREAKHIS DATASET (I) 40X, (II) 100X, (III) 200X, (IV) 400X

                                                         40X                                  100X
Technique                                                Accuracy  F1           IBA   GMean   Accuracy  F1           IBA   GMean
VGG16 [19]                                               0.9294    0.87 / 0.95  0.79  0.89    0.9240    0.86 / 0.95  0.80  0.89
GoogLeNet [20]                                           0.8682    0.78 / 0.91  0.71  0.84    0.8674    0.78 / 0.91  0.72  0.85
ResNet50 [26]                                            0.9350    0.89 / 0.95  0.85  0.92    0.9381    0.89 / 0.96  0.86  0.93
Modified VGG16 w/ Single Dense Layer                     0.9387    0.89 / 0.96  0.84  0.91    0.9522    0.92 / 0.97  0.90  0.95
Modified VGG16 w/ Inception Block w/ Single Dense Layer  0.9628    0.93 / 0.97  0.93  0.96    0.9681    0.95 / 0.98  0.93  0.96

                                                         200X                                 400X
Technique                                                Accuracy  F1           IBA   GMean   Accuracy  F1           IBA   GMean
VGG16 [19]                                               0.9119    0.86 / 0.94  0.83  0.91    0.8913    0.81 / 0.92  0.72  0.85
GoogLeNet [20]                                           0.8880    0.82 / 0.92  0.78  0.88    0.8668    0.77 / 0.91  0.69  0.83
ResNet50 [26]                                            0.9431    0.90 / 0.96  0.87  0.93    0.9221    0.87 / 0.94  0.81  0.90
Modified VGG16 w/ Single Dense Layer                     0.9357    0.88 / 0.96  0.82  0.91    0.8893    0.82 / 0.92  0.77  0.88
Modified VGG16 w/ Inception Block w/ Single Dense Layer  0.9651    0.88 / 0.96  0.80  0.89    0.9364    0.89 / 0.95  0.86  0.93
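
For readers who want to reproduce the kind of metrics listed in the table above, the following is a minimal sketch that derives accuracy, the two per-class F1 scores, GMean and IBA from a binary confusion matrix. It is an illustration, not the authors' evaluation code: the IBA formula used here is the common definition (1 + alpha * (TPR - TNR)) * TPR * TNR, the weighting alpha = 0.1 is an assumption, and the function name and the example counts are hypothetical.

```python
import math


def imbalance_metrics(tp, fn, fp, tn, alpha=0.1):
    """Accuracy, per-class F1, G-mean and IBA from binary confusion counts."""
    tpr = tp / (tp + fn)  # sensitivity: recall of the positive class
    tnr = tn / (tn + fp)  # specificity: recall of the negative class
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "f1_pos": 2 * tp / (2 * tp + fp + fn),
        "f1_neg": 2 * tn / (2 * tn + fn + fp),
        "gmean": math.sqrt(tpr * tnr),
        # Index of Balanced Accuracy: squared G-mean weighted by the dominance TPR - TNR.
        "iba": (1 + alpha * (tpr - tnr)) * tpr * tnr,
    }


if __name__ == "__main__":
    # Hypothetical counts: 100 positive and 50 negative test samples.
    for name, value in imbalance_metrics(tp=90, fn=10, fp=5, tn=45).items():
        print(f"{name}: {value:.4f}")
```

Because GMean and IBA are built from the per-class recalls rather than from the overall hit rate, they penalize a classifier that favors the majority class even when its accuracy looks high.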


TABLE 3
COMPARISON OF THE PROPOSED APPROACH WITH THE STATE-OF-THE-ART APPROACHES ON BREAKHIS DATASET BASED ON MEAN ACCURACY ACROSS DIFFERENT MAGNIFICATION FACTORS

Technique                 40X               100X              200X              400X
Spanhol et al. [8]        0.8960 ± 0.0650   0.8500 ± 0.0480   0.8400 ± 0.0320   0.8080 ± 0.0310
Spanhol et al. [39]       0.8460 ± 0.0290   0.8480 ± 0.0420   0.8420 ± 0.0170   0.8160 ± 0.0370
Bayramoglu et al. [10]    0.8300 ± 0.0300   0.8310 ± 0.0350   0.8460 ± 0.0270   0.8210 ± 0.0440
Zhu et al. [40]           0.8570 ± 0.0190   0.8420 ± 0.0320   0.8490 ± 0.0220   0.8010 ± 0.0440
Gupta et al. [41]         0.8674 ± 0.0237   0.8856 ± 0.0273   0.9031 ± 0.0376   0.8831 ± 0.0301
Deniz et al. [17]         0.9096 ± 0.0159   0.9058 ± 0.0196   0.9137 ± 0.0172   0.9130 ± 0.0740
Song et al. [42]          0.9002 ± 0.0302   0.9120 ± 0.0440   0.8780 ± 0.0530   0.8740 ± 0.0720
Gupta et al. [18]         0.9471 ± 0.0088   0.9590 ± 0.0420   0.9676 ± 0.0109   0.8911 ± 0.0012
Ours                      0.9588 ± 0.0033   0.9657 ± 0.0087   0.9500 ± 0.0122   0.9315 ± 0.0034
Ours (with fine tuning)   0.9710 ± 0.0046   0.9667 ± 0.0022   0.9716 ± 0.0033   0.9368 ± 0.0053

TABLE 4
PERFORMANCE EVALUATION OF THE PROPOSED APPROACH WITH UNDERSAMPLING AND OVERSAMPLING TECHNIQUES ON BREAKHIS DATASET (I) 40X, (II) 100X, (III) 200X, (IV) 400X

                    40X                                  100X
Sampling Technique  Accuracy  F1           IBA   GMean   Accuracy  F1           IBA   GMean
Undersampling       0.9591    0.93 / 0.97  0.89  0.94    0.9381    0.90 / 0.96  0.89  0.94
Oversampling        0.9406    0.90 / 0.96  0.89  0.94    0.9593    0.93 / 0.97  0.90  0.95
None                0.9628    0.93 / 0.97  0.93  0.96    0.9681    0.95 / 0.98  0.93  0.96

                    200X                                 400X
Sampling Technique  Accuracy  F1           IBA   GMean   Accuracy  F1           IBA   GMean
Undersampling       0.9540    0.91 / 0.97  0.86  0.92    0.9262    0.88 / 0.95  0.84  0.91
Oversampling        0.9669    0.94 / 0.98  0.92  0.96    0.9303    0.88 / 0.95  0.83  0.91
None                0.9651    0.88 / 0.96  0.80  0.89    0.9364    0.89 / 0.95  0.86  0.93
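
The sampling baselines in the table above can be pictured with a small sketch of random resampling. The text does not say which undersampling and oversampling variants were used, so plain random undersampling (dropping surplus majority samples) and random oversampling (duplicating minority samples) are assumed here purely for illustration; the function name `balance_indices` and the toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def balance_indices(labels, mode, seed=0):
    """Return shuffled sample indices with every class resampled to the same size."""
    if mode not in ("under", "over"):
        raise ValueError("mode must be 'under' or 'over'")
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    sizes = [len(indices) for indices in by_class.values()]
    target = min(sizes) if mode == "under" else max(sizes)
    picked = []
    for indices in by_class.values():
        if len(indices) >= target:
            # Undersampling: keep a random subset of the larger class.
            picked.extend(rng.sample(indices, target))
        else:
            # Oversampling: keep every sample and add random duplicates.
            picked.extend(indices + rng.choices(indices, k=target - len(indices)))
    rng.shuffle(picked)
    return picked


if __name__ == "__main__":
    toy_labels = ["minority"] * 30 + ["majority"] * 70
    for mode in ("under", "over"):
        chosen = balance_indices(toy_labels, mode)
        print(mode, Counter(toy_labels[i] for i in chosen))
```

Resampling should be applied to the training split only, so that the validation and test splits keep the original class distribution.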

TABLE 5
PERFORMANCE EVALUATION OF BLOCK WISE FINE-TUNING ON PROPOSED VGGIN-NET FOR 40X, 100X, 200X AND 400X MAGNIFICATION FACTORS ON BREAKHIS DATASET

                                40X                                  100X
Fine Tuning                     Accuracy  F1           IBA   GMean   Accuracy  F1           IBA   GMean
Complete Network                0.9666    0.99 / 0.97  0.89  0.94    0.9753    0.96 / 0.98  0.96  0.98
Block 2, 3, 4, Inception block  0.9777    0.96 / 0.98  0.95  0.97    0.8674    0.71 / 0.91  0.56  0.74
Block 3, 4, Inception block     0.9851    0.97 / 0.99  0.96  0.98    0.9646    0.94 / 0.98  0.89  0.94
Block 4, Inception block        0.9610    0.93 / 0.97  0.88  0.94    0.9700    0.95 / 0.98  0.92  0.96
No Fine Tuning                  0.9628    0.93 / 0.97  0.93  0.96    0.9752    0.95 / 0.98  0.93  0.97

                                200X                                 400X
Fine Tuning                     Accuracy  F1           IBA   GMean   Accuracy  F1           IBA   GMean
Complete Network                0.9688    0.95 / 0.98  0.92  0.96    0.9077    0.86 / 0.93  0.85  0.93
Block 2, 3, 4, Inception block  0.9467    0.91 / 0.96  0.91  0.95    0.9323    0.89 / 0.95  0.83  0.91
Block 3, 4, Inception block     0.9651    0.94 / 0.98  0.89  0.94    0.9426    0.90 / 0.96  0.85  0.92
Block 4, Inception block        0.9614    0.93 / 0.97  0.92  0.96    0.9528    0.92 / 0.97  0.91  0.95
No Fine Tuning                  0.9651    0.88 / 0.96  0.80  0.89    0.9364    0.89 / 0.95  0.86  0.93
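
The block-wise fine-tuning configurations compared in the table above amount to choosing which groups of layers are trainable. The sketch below shows one way to express that choice by layer-name prefix. The `blockN_convM` names follow the layer naming of the Keras VGG16 application; the `inception_*` and `classifier_dense` names, and the helper `trainable_flags` itself, are hypothetical placeholders rather than names taken from the authors' code.

```python
def trainable_flags(layer_names, blocks_to_tune):
    """Map each layer name to True if it belongs to a block selected for fine tuning."""
    prefixes = tuple(blocks_to_tune)
    return {name: name.startswith(prefixes) for name in layer_names}


# Convolution layers of VGG16 up to block4, using the Keras naming convention.
VGG16_CONV_LAYERS = [
    f"block{block}_conv{i}"
    for block, n_convs in ((1, 2), (2, 2), (3, 3), (4, 3))
    for i in range(1, n_convs + 1)
]
# Hypothetical names for the layers added on top of the VGG16 backbone.
HEAD_LAYERS = ["inception_conv1x1", "inception_conv3x3",
               "inception_conv5x5", "classifier_dense"]

if __name__ == "__main__":
    # "Block 3, 4, Inception block" configuration; the classifier stays trainable.
    flags = trainable_flags(VGG16_CONV_LAYERS + HEAD_LAYERS,
                            ("block3", "block4", "inception", "classifier"))
    for name, trainable in flags.items():
        print(f"{name:<20} trainable={trainable}")
```

With a Keras model the flags could then be applied by setting `layer.trainable = flags[layer.name]` for each layer in `model.layers` and recompiling the model before training resumes.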

TABLE 6
TRANSFER LEARNING OF PROPOSED VGGIN-NET ON BREAST HISTOPATHOLOGICAL DATASET WITH AND WITHOUT FINE-TUNING

Transfer Learning                           Accuracy  F1           IBA   GMean
VGGIN-Net as Fixed Feature Extractor        0.8470    0.89 / 0.73  0.66  0.81
Fine Tuning the VGGIN-Net Inception Block   0.8678    0.91 / 0.75  0.67  0.82

TABLE 7
ANALYSIS OF FEATURES EXTRACTED FROM DIFFERENT BLOCKS OF VGG16 ARCHITECTURE TO FIND THE APPROPRIATE FEATURES IN THE PROPOSED ARCHITECTURE FOR 40X MAGNIFICATION FACTOR ON BREAKHIS DATASET

                                     40X
Technique                            Accuracy  F1           IBA   GMean
Proposed Network using block3_pool   0.9536    0.92 / 0.97  0.89  0.94
Proposed Network using block4_pool   0.9628    0.93 / 0.97  0.93  0.96
Proposed Network using block5_pool   0.9536    0.92 / 0.97  0.89  0.94

TABLE 8
ANALYSIS OF PROPOSED ARCHITECTURE WITH THE INCEPTION BLOCK AND DIMENSIONALITY REDUCTION INCEPTION BLOCK FOR 40X MAGNIFICATION FACTOR ON BREAKHIS DATASET

                                                               40X
Technique                                                      Accuracy  F1           IBA   GMean
Proposed Network w/ Naïve Inception Block                      0.9628    0.93 / 0.97  0.93  0.96
Proposed Network w/ Dimensionality Reduction Inception Block   0.9443    0.90 / 0.96  0.82  0.90

TABLE 9
ANALYSIS OF APPROPRIATE NUMBER OF FILTERS IN INCEPTION BLOCK TO BE USED IN THE PROPOSED ARCHITECTURE FOR 40X MAGNIFICATION FACTOR ON BREAKHIS DATASET

                  40X
Widening Factor   Accuracy  F1           IBA   GMean
k=1               0.9628    0.93 / 0.97  0.93  0.96
k=2               0.9443    0.90 / 0.96  0.83  0.91
k=5               0.9684    0.95 / 0.98  0.93  0.96
k=10              0.9684    0.94 / 0.98  0.90  0.95

TABLE 10
PROPOSED VGGIN-NET WITH AND WITHOUT DATA AUGMENTATION FOR 40X, 100X, 200X AND 400X MAGNIFICATION FACTORS ON BREAKHIS DATASET

                                  40X                                  100X
Technique                         Accuracy  F1           IBA   GMean   Accuracy  F1           IBA   GMean
VGGIN-Net w/ Data Augmentation    0.9628    0.93 / 0.97  0.93  0.96    0.9681    0.95 / 0.98  0.93  0.96
VGGIN-Net w/o Data Augmentation   0.9239    0.86 / 0.95  0.80  0.89    0.9134    0.85 / 0.94  0.78  0.88

                                  200X                                 400X
Technique                         Accuracy  F1           IBA   GMean   Accuracy  F1           IBA   GMean
VGGIN-Net w/ Data Augmentation    0.9651    0.88 / 0.96  0.80  0.89    0.9364    0.89 / 0.95  0.86  0.93
VGGIN-Net w/o Data Augmentation   0.9155    0.85 / 0.94  0.78  0.88    0.8852    0.79 / 0.92  0.68  0.82
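
Random cropping to a fixed crop size, the augmentation named in Section 5.2, can be sketched in a few lines of NumPy. This is an illustrative sketch only: the image and crop sizes below are arbitrary placeholders because the crop size is not given in the text, and the function name `random_crop` is hypothetical.

```python
import numpy as np


def random_crop(image, crop_size, rng):
    """Cut a randomly positioned window of shape crop_size out of an H x W x C image."""
    height, width = image.shape[:2]
    crop_h, crop_w = crop_size
    if crop_h > height or crop_w > width:
        raise ValueError("crop size must not exceed the image size")
    # Upper-left corner drawn uniformly so that the window stays inside the image.
    top = int(rng.integers(0, height - crop_h + 1))
    left = int(rng.integers(0, width - crop_w + 1))
    return image[top:top + crop_h, left:left + crop_w]


if __name__ == "__main__":
    rng = np.random.default_rng(seed=0)
    image = rng.random((300, 400, 3))       # placeholder image, not a BreakHis sample
    patch = random_crop(image, (224, 224), rng)
    print(patch.shape)
```

Because a different window is drawn each time a sample is seen, the network rarely sees exactly the same input twice, which is one way such augmentation acts as a regularizer.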

Figure 4. Validation accuracy and loss plots for the proposed VGGIN-Net architecture at different magnification factors (40X, 100X, 200X and 400X). The purple line indicates the start of fine tuning.
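
The fine-tuning phase marked in Figure 4 uses the warmup-then-decay learning rate schedule described in Section 4.2: a linear increase from 1e-5 to 5e-5 over 15 warmup epochs, followed by exponential decay with factor 0.8. A minimal sketch of such a schedule is given below. It assumes that the decay is applied once per epoch and that the peak value is reached at the end of warmup, neither of which is stated explicitly; the function name `fine_tune_lr` is hypothetical.

```python
def fine_tune_lr(epoch, warmup_epochs=15, lr_start=1e-5, lr_peak=5e-5, decay=0.8):
    """Learning rate for a zero-based epoch index: linear warmup, then exponential decay."""
    if epoch < warmup_epochs:
        # Linear ramp from lr_start at epoch 0 towards lr_peak at the end of warmup.
        return lr_start + (lr_peak - lr_start) * epoch / warmup_epochs
    # Assumed per-epoch decay: multiply by `decay` for every epoch after warmup.
    return lr_peak * decay ** (epoch - warmup_epochs)


if __name__ == "__main__":
    for epoch in (0, 5, 10, 15, 16, 20, 49):
        print(f"epoch {epoch:>2}: lr = {fine_tune_lr(epoch):.3e}")
```

In a Keras training loop a function like this could be wrapped as `lambda epoch, lr: fine_tune_lr(epoch)` and passed to `tf.keras.callbacks.LearningRateScheduler`, so that the callback's second argument does not overwrite the warmup length.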

Figure 5. ROC curve comparison of proposed approach with state-of-the-art networks in case of (i) 40X, (ii) 100X, (iii) 200X and (iv) 400X magnification
factors for the BreakHis dataset


6 CONCLUSION AND FUTURE SCOPE
In this paper, a novel deep learning based network, VGGIN-Net, has been proposed using layers from the pre-trained deep network VGG-16 at the lower level, and a trainable Inception module and dense layers at the higher level. The proposed transfer network has been compared with different state-of-the-art approaches on the basis of various performance evaluation metrics. The experiments validate that VGGIN-Net is able to deal with the imbalanced breast cancer dataset and helps improve the robustness and generalizability of the approach. The proposed deep transfer network with fine tuning has achieved accuracies of 97.10%, 96.67%, 97.16% and 93.68% for the 40X, 100X, 200X and 400X magnification factors respectively, for the BreakHis dataset. The proposed network was able to classify both the minority and majority classes effectively. We also validated through experiments that the trained VGGIN-Net model supports transfer learning on other breast cancer datasets.
In future, we shall explore the use of skip connections inside deep neural nets. We would also look into constructing other deep network architectures for the classification of multi-class imbalanced biomedical datasets. We would seek to create a hybrid approach based on techniques similar to DeTraC [44], combined with popular pre-trained CNN models, to deal with multi-class imbalanced datasets.

REFERENCES
[1] Lukong, Kiven Erique. "Understanding breast cancer–The long and winding road." BBA clinical 7 (2017): 64-77.
[2] Cheng, Heng-Da, Juan Shan, Wen Ju, Yanhui Guo, and Ling Zhang. "Automated breast cancer detection and classification using ultrasound images: A survey." Pattern recognition 43, no. 1 (2010): 299-317.
[3] Eklund, Anders, Paul Dufort, Daniel Forsberg, and Stephen M. LaConte. "Medical image processing on the GPU–Past, present and future." Medical image analysis 17, no. 8 (2013): 1073-1094.
[4] LeCun, Yann, Léon Bottou, Yoshua Bengio, and Patrick Haffner. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86, no. 11 (1998): 2278-2324.
[5] Yanai, Keiji, and Yoshiyuki Kawano. "Food image recognition using deep convolutional network with pre-training and fine-tuning." In 2015 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp. 1-6. IEEE, 2015.
[6] Parvin, Hamid, Behrouz Minaei-Bidgoli, and Hamid Alinejad-Rokny. "A new imbalanced learning and dictions tree method for breast cancer diagnosis." Journal of Bionanoscience 7, no. 6 (2013): 673-678.
[7] Hamidinekoo, Azam, Erika Denton, Andrik Rampun, Kate Honnor, and Reyer Zwiggelaar. "Deep learning in mammography and breast histology, an overview and future trends." Medical image analysis 47 (2018): 45-67.
[8] Spanhol, Fabio Alexandre, Luiz S. Oliveira, Caroline Petitjean, and Laurent Heutte. "Breast cancer histopathological image classification using convolutional neural networks." In 2016 international joint conference on neural networks (IJCNN), pp. 2560-2567. IEEE, 2016.
[9] Feng, Yangqin, Lei Zhang, and Juan Mo. "Deep manifold preserving autoencoder for classifying breast cancer histopathological images." IEEE/ACM Transactions on Computational Biology and Bioinformatics (2018).
[10] Bayramoglu, Neslihan, Juho Kannala, and Janne Heikkilä. "Deep learning for magnification independent breast cancer histopathology image classification." In 2016 23rd International conference on pattern recognition (ICPR), pp. 2440-2445. IEEE, 2016.
[11] Bardou, Dalal, Kun Zhang, and Sayed Mohammad Ahmad. "Classification of breast cancer based on histology images using convolutional neural networks." IEEE Access 6 (2018): 24680-24693.
[12] Susan, Seba, and Amitesh Kumar. "SSOMaj-SMOTE-SSOMin: Three-step intelligent pruning of majority and minority samples for learning from imbalanced datasets." Applied Soft Computing 78 (2019): 141-149.
[13] Susan, Seba, and Amitesh Kumar. "Hybrid of Intelligent Minority Oversampling and PSO-Based Intelligent Majority Undersampling for Learning from Imbalanced Datasets." In International Conference on Intelligent Systems Design and Applications, pp. 760-769. Springer, Cham, 2018.
[14] Saini, Manisha, and Seba Susan. "Data Augmentation of Minority Class with Transfer Learning for Classification of Imbalanced Breast Cancer Dataset Using Inception-V3." In Iberian Conference on Pattern Recognition and Image Analysis, pp. 409-420. Springer, Cham, 2019.
[15] Saini, Manisha, and Seba Susan. "Comparison of Deep Learning, Data Augmentation and Bag of-Visual-Words for Classification of Imbalanced Image Datasets." In International Conference on Recent Trends in Image Processing and Pattern Recognition, pp. 561-571. Springer, Singapore, 2018.
[16] Rakhlin, Alexander, Alexey Shvets, Vladimir Iglovikov, and Alexandr A. Kalinin. "Deep convolutional neural networks for breast cancer histology image analysis." In International Conference Image Analysis and Recognition, pp. 737-744. Springer, Cham, 2018.
[17] Deniz, Erkan, Abdulkadir Şengür, Zehra Kadiroğlu, Yanhui Guo, Varun Bajaj, and Ümit Budak. "Transfer learning based histopathologic image classification for breast cancer detection." Health information science and systems 6, no. 1 (2018): 18.
[18] Gupta, Vibha, and Arnav Bhavsar. "Sequential modeling of deep features for breast cancer histopathological image classification." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 2254-2261. 2018.
[19] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
[20] Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. "Going deeper with convolutions." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1-9. 2015.
[21] Agostinelli, Forest, Matthew Hoffman, Peter Sadowski, and Pierre Baldi. "Learning activation functions to improve deep neural networks." arXiv preprint arXiv:1412.6830 (2014).
[22] Perdana, Anugrah Bintang, and Adhi Prahara. "Face Recognition Using Light-Convolutional Neural Networks Based On Modified Vgg16 Model." In 2019 International Conference of Computer Science and Information Technology (ICoSNIKOM), pp. 1-4. IEEE, 2019.
[23] Kumar, Abhinav, Sanjay Kumar Singh, Sonal Saxena, K. Lakshmanan, Arun Kumar Sangaiah, Himanshu Chauhan, Sameer Shrivastava, and Raj Kumar Singh. "Deep feature learning for histopathological image classification of canine mammary tumors and human breast cancer." Information Sciences 508 (2020): 405-421.
[24] Aravind, Krishnaswamy R., Purushothaman Raja, Rajendran Ashiwin, and Konnaiyar V. Mukesh. "Disease classification in Solanum melongena using deep learning." Spanish Journal of Agricultural Research 17, no. 3 (2019): 0204.
[25] Baheti, Bhakti, Suhas Gajre, and Sanjay Talbar. "Detection of distracted driver using convolutional neural network."

In Proceedings of the IEEE Conference on Computer Vision and Assisted Intervention, pp. 99-106. Springer, Cham, 2017.
Pattern Recognition Workshops, pp. 1032-1038. 2018. [43] Saini, Manisha, and Seba Susan. "Deep transfer with minority
[26] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. data augmentation for imbalanced breast cancer
"Identity mappings in deep residual networks." In European
dataset." Applied Soft Computing 97 (2020): 106759.
conference on computer vision, pp. 630-645. Springer, Cham,
2016. [44] Abbas, Asmaa, Mohammed M. Abdelsamea, and Mohamed
[27] Srivastava, Nitish, Geoffrey Hinton, Alex Krizhevsky, Ilya Medhat Gaber. "Detrac: Transfer learning of class decomposed
Sutskever, and Ruslan Salakhutdinov. "Dropout: a simple way medical images in convolutional neural networks." IEEE
to prevent neural networks from overfitting." The journal of Access 8 (2020): 74901-74913.
machine learning research 15, no. 1 (2014): 1929-1958. [45] Ding W, Huang D, Chen Z, Yu X, Lin W. Facial action
[28] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. recognition using very deep networks for highly imbalanced
"Imagenet classification with deep convolutional neural class distribution. In: 2017 Asia-Pacific signal and information
processing association annual summit and conference (APSIPA
ASC). 2017. p. 1368–72.
https://doi.org/10.1109/APSIPA.2017.8282246.
[46] Saini, Manisha, and Seba Susan. "Bag-of-Visual-Words codebook
generation using deep features for effective classification of
imbalanced multi-class image datasets." Multimedia Tools and
Applications (2021): 1-27.

Ms. Manisha Saini is currently pursuing a Ph.D. in the Computer
Science and Engineering Department at Delhi Technological
University, Delhi, India, and is also working as an Assistant
Professor in the Computer Science Engineering Department at the
Faculty of Engineering and Technology, Manav Rachna International
Institute of Research and Studies. Her research interests include
Computer Vision, Neural Networks, Machine Learning, and Deep
Learning.

Dr. Seba Susan (M’17) is a Professor in the Department of
Information Technology, Delhi Technological University, Delhi,
India. She completed her Ph.D. at the Indian Institute of
Technology (IIT) Delhi in 2014. Her current area of research is
the development of soft computing tools for computer vision,
speech and language processing.

networks." In Advances in neural information processing
systems, pp. 1097-1105. 2012.
[29] Yosinski, Jason, Jeff Clune, Yoshua Bengio, and Hod Lipson.
"How transferable are features in deep neural networks?"
In Advances in neural information processing systems, pp. 3320-3328.
2014.
[30] Shorten, Connor, and Taghi M. Khoshgoftaar. "A survey
on image data augmentation for deep learning." Journal of Big
Data 6, no. 1 (2019): 60.
[31] Howard, Andrew G. "Some improvements on deep
convolutional neural network based image classification." arXiv
preprint arXiv:1312.5402 (2013).
[32] Sharma, Shallu, and Rajesh Mehra. "Conventional Machine
Learning and Deep Learning Approach for Multi-Classification
of Breast Cancer Histopathology Images—a Comparative
Insight." Journal of Digital Imaging (2020): 1-23.
[33] Sharma, Shallu, and Rajesh Mehra. "Effect of layer-wise
fine-tuning in magnification-dependent classification of breast cancer
histopathological image." The Visual Computer 36, no. 9 (2020):
1755-1769.
[34] Kandel, Ibrahem, and Mauro Castelli. "How Deeply to
Fine-Tune a Convolutional Neural Network: A Case Study Using a
Histopathology Dataset." Applied Sciences 10, no. 10 (2020):
3359.
[35] Databases – Laboratório Visão Robótica e Imagem, 2019,
https://web.inf.ufpr.br/vri/databases/. (Accessed 28
November 2019).
November 2019).
[36] Mooney, Paul. "Breast Histopathology Images." Kaggle.
December 19, 2017. Accessed October 09, 2020.
https://www.kaggle.com/paultimothymooney/breast-histopathology-images.
[37] Géron, Aurélien. Hands-On Machine Learning with Scikit-
Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques
to Build Intelligent Systems. O'Reilly Media, 2019.
[38] Sokolova, Marina, and Guy Lapalme. "A systematic analysis of
performance measures for classification tasks." Information
Processing & Management 45, no. 4 (2009): 427-437.
[39] F. A. Spanhol, L. S. Oliveira, P. R. Cavalin, C. Petitjean, and L.
Heutte, “Deep features for breast cancer histopathological image
classification,” 2017 IEEE International Conference on Systems,
Man, and Cybernetics (SMC), 2017.
[40] Zhu, Chuang, Fangzhou Song, Ying Wang, Huihui Dong, Yao
Guo, and Jun Liu. "Breast cancer histopathology image
classification through assembling multiple compact
CNNs." BMC medical informatics and decision making 19, no. 1
(2019): 198.
[41] Gupta, Vibha, and Arnav Bhavsar. "Breast cancer
histopathological image classification: is magnification
important?" In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition Workshops, pp. 17-24. 2017.
[42] Song, Yang, Hang Chang, Heng Huang, and Weidong Cai.
"Supervised intra-embedding of fisher vectors for
histopathology image classification." In International
Conference on Medical Image Computing and Computer
