Automated Identification of Breast Cancer Type Using Novel Multipath Transfer Learning and Ensemble of Classifier
ABSTRACT Breast cancer, a global health concern, requires innovative diagnostic approaches. The potential
of Artificial Intelligence and Machine Learning in breast cancer diagnosis warrants exploration along with
conventional methods. Our method partitions breast cancer images into four regions, employing transfer
learning with ResNet50 and VGG16 for feature extraction in each region. The extracted features are
consolidated and fed into an Extra Tree Classifier. In addition, an ensemble learning framework combines
logistic regression, SVM (Support Vector Machine), Extra Tree Classifier, and Ridge Classifier outputs,
harnessing the strengths of each for robust breast cancer image classification. Among the five machine
learning classification models (Extra Tree Classifier, Logistic Regression, Ridge Classifier, SVM, and
Voting Classifier), the goal was to determine the most effective in terms of accuracy. Surprisingly,
the Voting Classifier emerged as the top performer, with an impressive accuracy of 96.86% across these
carcinoma classes, validating the effectiveness of the approach. The Extra Tree Classifier followed with
an accuracy of 89.66%, whereas the Ridge Classifier trailed closely at 88.74%. Additionally, Logistic
Regression exhibited a notable accuracy rate of 91.42%, and the SVM model achieved a reasonable accuracy
of 91.44%. This approach integrates the feature extraction power of deep learning with the interpretability of
the traditional models. The results demonstrate the efficacy of our method in classifying ductal, lobular, and
papillary cancers. The proposed method offers a variety of advantages, including early-stage identification,
increased precision, customized medical advice, and simplified analysis, by combining feature extraction
with ensemble learning. Ongoing research aims to refine these algorithms, leading to earlier detection and
improved outcomes. This innovative approach has the potential to revolutionize breast cancer care and
fundamentally reshape treatment strategies.
INDEX TERMS Breast cancer, artificial intelligence, deep learning, transfer learning, ResNet50, VGG16,
ensemble classifier, machine learning, extra tree classifier, logistic regression, ridge classifier, SVM, voting
classifier.
ducts of the breast, breast cancer is of significant concern, necessitating meticulous attention. The conventional mammography method, which uses low-dose X-rays, remains a prevalent technique for detecting breast anomalies. Clinical breast examinations conducted by healthcare professionals were employed for identification. Despite the presence of numerous computerized diagnostic techniques, the integration of Artificial Intelligence (AI) [4] and related methodologies remains relatively underexplored in the context of breast cancer. Even when automated, the diagnostic process still entails a burdensome manual component because of the requirement for insightful decision-making. A potential solution is to apply AI and Machine Learning (ML) [5], obviating the need for manual disease assessment by medical practitioners. Technological advancements have yielded an array of cancer detection techniques, including liquid biopsy, genome profiling, imaging methodologies, metabolic and optical techniques, and AI-ML approaches. Despite their transformative impact, the adoption of these advanced techniques varies based on the disease type, stage, resource availability, and healthcare system status. Ongoing research and technological progress promise to refine AI-ML methodologies, potentially leading to timelier cancer detection, enhanced treatment efficacy, and improved patient survival rates. Herein, we present a novel approach that combines traditional image processing and machine learning techniques in a hybrid manner to create a robust detection module. This approach offers a multitude of merits, including early-stage identification, heightened accuracy, personalized medicine recommendations, expeditious and efficient analysis, amalgamation of diverse data sources, and continual learning and enhancement. The advantages of deploying AI-ML algorithms for cancer diagnosis, as outlined above, particularly resonate with the intricate landscape of breast cancer.
Recent advancements in computer vision and image processing have led to the development of innovative methodologies to address various challenges. Bao et al. [44] proposed a multi-objective optimization algorithm, analyzing its convergence under different conditions, which holds significance in theoretical research on optimization algorithms. Tang and Hu [45] introduced a semi-supervised image classification method that leverages adversarial networks, enhancing classification accuracy even with limited labeled samples, which is particularly beneficial in medical image classification tasks. Shi et al. [46] devised a scene categorization model integrating deep, visually sensitive features and utilizing context relationships and Convolutional Neural Networks to improve scene understanding. Li et al. [47] presented a high-resolution video frame generation network that incorporated non-local modules and multi-scale feature fusion for superior performance compared with existing methods. Wu et al. [48] proposed a Spatial Attention-Guided Upsampling network for real-time stereo matching, achieving high accuracy by leveraging spatial attention and gradient loss functions. These research contributions collectively highlight the ongoing progress in leveraging artificial intelligence and machine learning techniques to address complex challenges in image processing, optimization, and computer vision applications.
These approaches can potentially enhance the precision, efficacy, and customization of breast cancer diagnosis and treatment, ultimately leading to enhanced patient outcomes and providing valuable support for medical practitioners' decision-making processes. Our methodology involves recognizing the inherent difficulty of extracting meaningful features from cancer images, and the proposed method employs a multi-step approach. First, preprocessing techniques such as scaling, denoising, and histogram equalization ensure consistency and clarity. The images were then divided into four parts. Transfer learning plays a central role here: pre-trained deep learning models, such as ResNet50 [6] and VGG16 [7], which are known for their exceptional feature extraction capabilities, were leveraged to analyze each image quadrant. An Extra Tree Classifier performs feature selection on each extracted feature set to refine the data further. This ensures that only the most informative features are used for classification. Finally, the study incorporates the strength of ensemble learning. The features from all four image parts were combined, and this comprehensive feature vector was fed into four different classification models: Logistic Regression [8], Support Vector Machine [9], Extra Tree Classifier [10], and Ridge Classifier [11]. By combining the predictions from these models, the system achieves robust and accurate classification of breast cancer images. This ensemble approach seeks to capitalize on the unique strengths of each classifier, fostering robust and comprehensive breast cancer image classification. The proposed method not only leverages the power of deep learning for feature extraction but also integrates the interpretability of traditional machine learning models through ensemble learning. In this study, we focused on classifying breast cancer histopathological images into three different classes: papillary, ductal, and lobular carcinomas, as shown in Figure 1. The results highlight how well our approach classifies the three types of cancer (ductal, lobular, and papillary) into each class. For these carcinoma classes, the ensemble learning model, called the Voting Classifier, produces outstanding accuracy, precision, recall, and F1 scores, among other metrics.
This study aimed to develop a novel architecture for automatically identifying breast cancer types using transfer learning, multi-level feature reduction, and ensemble learning.
• Patch-wise processing of high-resolution input images was performed to avoid structural loss during downscaling.
• Multi-path transfer learning was used for feature extraction.
• Feature reduction using information-gain and ensemble classification approaches highlights the novelty of this study.
The present study aimed to fight against cancer by developing an automated tool to assist radiologists in classifying various cancers. This study addresses this challenge by developing a system for classifying breast cancer images. The innovative approach of this method holds promise for aiding in the early detection of breast cancer, potentially saving lives.
The forthcoming sections of this paper are dedicated to pivotal preliminaries, a comprehensive exposition of the methodology, the presentation of the achieved results, extensive discussions, and a conclusive summary.
FIGURE 1. Breast cancer presents with diverse types.
II. LITERATURE REVIEW
Arshad et al. [49] compared the performance of five pre-trained Convolutional Neural Network (CNN) models, InceptionV3, ResNet152V2, MobileNetV2, VGG-16, and DenseNet-121, with the goal of determining which model was the most accurate. VGG-16 stood out with an impressive 98% accuracy rate, whereas DenseNet-121 performed exceptionally well, with a remarkable accuracy of 99% for identifying invasive ductal carcinoma. Alqudah and Alqudah [12] introduced an inventive sliding window technique for localized feature extraction, employing 25 sliding windows per image. Their approach employed the Local Binary Pattern (LBP) for feature extraction within each window, coupled with Support Vector Machines (SVM) for window classification, eventually determining the final class through majority voting. Gour et al. [13] devised ResHist, a 152-layer Convolutional Neural Network (CNN) based on residual learning, to classify breast cancer. Their model leveraged histopathological images to derive discriminative features and harnessed data augmentation to enhance performance. Gandomkar et al. [14] proposed MuDeRN, a framework encompassing deep residual networks (ResNet) with 152 layers, to classify hematoxylin-eosin-stained breast digital slides. This entailed two stages: utilizing ResNet to classify patches and classifying malignant and benign images into subtypes. Employing a meta-decision tree, the authors integrated the ResNet outputs from various magnification factors. Beltran-Perez et al. [15] advocated multi-scale generalized radial basis function (MSRBF) neural networks for image feature extraction and classification. Their architecture spanned an input-output model, high-level image feature extraction, and a classification module for breast cancer prediction. Kumar et al. [16] proposed a VGGNet-16-based method augmented with classifiers, such as Support Vector Machines and Random Forests, while utilizing data augmentation to expand the dataset. Li et al. [17] evaluated histological images using a Convolutional Neural Network (CNN) architecture, employing DenseNet and a squeeze-and-excitation module to amplify feature information. Vo et al. [18] improved biopsy tissue diagnosis via data augmentation, relying on an ensemble of DCNNs and gradient-boosting tree classifiers for accurate classification. Saxena et al. [19] proposed a hybrid ML model for addressing class imbalance involving a pre-trained ResNet50 and the kernelized weighted extreme learning machine. Through diverse analyses, Alom et al. [20] evaluated breast cancer classification using the Inception Recurrent Residual Convolutional Neural Network (IRRCNN). Boumaraf et al. [21] utilized ResNet-18 with transfer learning and global contrast normalization for histopathological images, coupled with three-fold data augmentation. Burçak et al. [22] presented a deep CNN for cancerous region detection, employing various algorithms for weight computation and a parallel computing architecture. Xie et al. [23] explored expressive feature extraction from histopathological images using the Inception_V3 and Inception_ResNet_V2 CNNs with transfer learning. Jiang et al. [24] introduced the Breast Cancer Histopathology Image Classification Network (BHCNet) with a small SE-ResNet module and a Gauss error scheduler SGD algorithm. Han et al. [25] addressed class imbalance using structured deep learning and data augmentation. Kumar et al. [26] employed contrast-limited adaptive histogram equalization and k-means clustering for biopsy images, testing diverse classifiers. Sheikh et al. [27] proposed MSI-MFNet for tissue texture feature extraction and disease probability prediction. Nahid et al. [28] introduced novel DNN techniques combining CNN, LSTM, and SVM for breast cancer image classification. Zhu et al. [29] employed multiple compact CNNs with channel pruning and data partition-based models. Gong et al. [30] utilized the node-attention graph transfer network (NaGTN) for graph convolutional network-based knowledge transfer. George et al. [31] devised nucleus-guided transfer learning (NucTraL) for breast tumor classification, involving local nuclei feature extraction and SVM, along with belief theory-based classifier fusion (BCF) to enhance accuracy. However, these methods predominantly focus on the binary classification of whole-slide images, overlooking variations in color distribution in histopathological breast cancer images and the challenges of multiclass classification.
Most classification models using transfer learning compromise on structural loss by downscaling the image to make it compact for pre-trained models. In addition, the features extracted using transfer learning have high noise, and to reduce it, other works have used simple feature reduction techniques such as PCA and LDA. This motivated the efficient improvement of these architectures by applying patch-wise processing, multi-path transfer learning, and ensemble classification.
layer to gather spatial information from the feature maps and a fully connected layer for classification.
B. ENSEMBLE LEARNING
Ensemble learning [39] is a powerful technique in the realm of machine learning that leverages the collective wisdom of multiple models to enhance predictive accuracy and robustness. It operates on the premise that combining the outputs of various individual models yields more reliable and superior results than relying on a single model. These ensemble methods provide a structured approach to mitigate overfitting and reduce errors, making them valuable assets in the data scientist's toolkit. This research paper delves into the world of ensemble learning and its practical applications, mainly focusing on voting classifiers [40]. A Voting Classifier is an ensemble method that combines predictions from multiple machine learning models, each with unique strengths and abilities. It essentially allows different algorithms to "vote" on the final classification, contributing their expertise to arrive at a more comprehensive and accurate prediction. It is important to note that the Voting Classifier can be implemented in different variations, with two primary approaches: hard voting and soft voting. Hard voting involves each model casting a single "vote" for the class it predicts as the output. By contrast, soft voting incorporates the probability scores from each model, resulting in a weighted average to make the final decision. These variations offer flexibility and can be tailored to specific problems.
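As a concrete illustration, both voting variants can be expressed with scikit-learn's VotingClassifier. The following is a minimal sketch; the base estimators, their settings, and the toy data are illustrative and are not the exact configuration used later in this paper.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, random_state=0)  # toy data
    estimators = [
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC(probability=True)),  # probability=True enables soft voting
    ]
    # Hard voting: each base model casts one vote for its predicted class
    hard = VotingClassifier(estimators, voting="hard").fit(X, y)
    # Soft voting: per-class probabilities are averaged before taking the argmax
    soft = VotingClassifier(estimators, voting="soft").fit(X, y)

Note that soft voting requires every member to expose predict_proba; a Ridge Classifier, for example, does not, which is one practical reason to prefer hard voting when such models are part of the ensemble.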
IV. MATERIAL AND METHODOLOGY
In the methodology section, we discuss the proposed architecture, the dataset used, the preprocessing techniques applied to the dataset, feature extraction using transfer learning, the fusion of the extracted features, and the ensemble of classifiers. To compress fine-grained images without losing detail, we propose a novel method that divides the image into a grid of smaller images and extracts features from each grid cell using a pre-trained deep learning model. The extracted features are then concatenated to obtain the features of the entire image.
A. PROPOSED ARCHITECTURE
In Figure 5, we can see that from the dataset, we iterate through each image, which is preprocessed. The preprocessing step includes scaling, denoising, and histogram equalization. In the next step, the image is divided into four parts: (W, H) is changed to (W/2, H/2), each part is subjected to a pre-trained model, and the features of the four parts are extracted. The setup is a fusion of the four features of a single image. Finally, we feed the labels and fused features to the ensemble learning stage.
B. DATASET
The Breast Cancer Histopathological Image Classification (BreakHis) dataset [41] contains 7,909 painstakingly taken microscopic images of breast tumor tissue specimens. These photos were conscientiously gathered from a diverse cohort of 82 patients, and a range of magnifications distinguished them, including 40X, 100X, 200X, and 400X. The collection was meticulously organized and contained 2,480 benign and 5,429 malignant tissue samples, which included important information. Each image adheres to a 3-channel RGB color system with an 8-bit depth and is standardized to 700 × 460 pixels in size. The standardized Portable Network Graphics (PNG) format of the dataset makes it compatible and easy to retrieve. It is crucial to mention that the prestigious P&D Laboratory - Pathological Anatomy and Cytopathology, located in Parana, Brazil (http://www.prevencaoediagnose.com.br), was a key collaborator in establishing this priceless resource. The development of diagnostic and therapeutic approaches is possible using this dataset, which is a vital resource for academics, practitioners, and researchers in the histopathological image analysis of breast cancer.
C. PREPROCESSING
In the preprocessing phase of our study, a multi-step approach, as shown in Figure 6, was employed to enhance the quality of breast cancer images for subsequent classification. The first step involved scaling the images and resizing them to a standardized dimension for consistent analysis. These processes are interconnected and depend on the preprocessing stage, as illustrated in Figure 6. Subsequently, a Gaussian filter [42] was applied for denoising, utilizing the GaussianBlur function from the OpenCV [43] library. This process aims to reduce image noise and enhance overall image clarity. The Gaussian filter operates by convolving the image with a Gaussian kernel, a two-dimensional bell-shaped curve.
Let I be the input image and I_filtered the filtered image. The Gaussian filter operation can be mathematically expressed as shown in Equation (1):

I_{\text{filtered}}(x, y) = \sum_{i=-k}^{k} \sum_{j=-k}^{k} \frac{1}{2\pi\sigma^{2}} e^{-\frac{i^{2}+j^{2}}{2\sigma^{2}}} \, I(x - i, y - j) \quad (1)

where (x, y) are the coordinates of the pixel being processed in the output image I_filtered, k is the kernel size, σ is the standard deviation of the Gaussian distribution, and I(x, y) is the intensity of the pixel at coordinates (x, y) in the input image. The equation calculates the weighted sum of neighboring pixel intensities in the input image, where the weights are determined by the Gaussian distribution centered at the current pixel location. The term e^{-\frac{i^{2}+j^{2}}{2\sigma^{2}}} represents the Gaussian kernel, which assigns higher weights to central pixels and gradually decreases the weights as pixels move away from the center. The filtered pixel value I_filtered(x, y) is obtained by summing the products of the neighboring pixel intensities and the corresponding Gaussian kernel weights for all pixels within the kernel window centered at (x, y). The denoised images underwent further preprocessing steps, including histogram equalization, to refine their features for subsequent classification tasks.
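A minimal sketch of this preprocessing chain with OpenCV follows. The 250 × 250 target size comes from Section V; the 5 × 5 kernel, the σ value, and the choice to equalize only the luminance channel are illustrative assumptions, since the paper does not report these exact settings.

    import cv2

    def preprocess(path, size=(250, 250), sigma=1.0):
        img = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)  # standardize to RGB
        img = cv2.resize(img, size)                               # scale to a uniform size
        img = cv2.GaussianBlur(img, (5, 5), sigma)                # denoise, Equation (1)
        # Equalize the luminance channel only, preserving color information
        ycrcb = cv2.cvtColor(img, cv2.COLOR_RGB2YCrCb)
        ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
        return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2RGB)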
FIGURE 3. Feature extraction using VGG16: An in-depth examination of the design and configuration of a model.
FIGURE 4. Feature extraction using ResNet50: An in-depth examination of the design and configuration of a model.
This comprehensive preprocessing strategy aims to mitigate noise, standardize dimensions, and enhance discriminative features within the breast cancer images. The results of these preprocessing steps are shown in Figure 7.
D. FEATURE ENGINEERING AND FEATURE EXTRACTION
The main idea behind this study is presented in this section, with particular emphasis on the fundamental strategy used for feature engineering and feature extraction from a wide range of image datasets. The following rigorous preparation steps were applied to each image in these class folders, and each image was processed during the conversion and loading phases to ensure uniformity in the color channels. This was achieved by employing various libraries for image manipulation and standardization. The images were transformed to adhere to the RGB color standard and, to maintain consistent input dimensions for the subsequent processing stages, were resized to a standard size. In the successive cropping and feature extraction steps, which are vital, we improved the spatial information in our photographs. Each image was divided into four cropped variants to extract the valuable features. By splitting an image into four smaller sub-images, this function effectively divides the input image into four equal parts for each cancer type, as shown in Figure 8. Subsequent processing iterates repeatedly over our dataset, which has been carefully arranged into three class folders, each representing a particular category or class. This enhances feature diversity and reduces the image size from (W, H) to (W/2, H/2).
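The four-way split that halves (W, H) to (W/2, H/2) reduces to simple array slicing. A sketch, assuming a NumPy image of shape H × W × 3:

    def split_into_quadrants(img):
        # Returns [top-left, top-right, bottom-left, bottom-right] sub-images
        h, w = img.shape[0] // 2, img.shape[1] // 2
        return [img[:h, :w], img[:h, w:], img[h:, :w], img[h:, w:]]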
This study used the power of pre-trained deep learning models—particularly VGG16 and ResNet50—which were rigorously trained on massive datasets. The ability to recognize detailed patterns and complex structures inside images makes these pre-trained models essential feature extractors. Next, we loaded the necessary libraries and two pre-trained convolutional neural network (CNN) models known for their image processing skills: VGG16 and ResNet50. These pre-trained models were prepared for our image dataset to operate as feature extractors. To capture essential high-level representations of the image content, the pre-trained ResNet50 and VGG16 models were utilized for feature extraction from these cropped images.
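Feature extraction from each quadrant can be sketched as follows with Keras. The pooling="avg" global-average-pooling head is an assumption; the paper does not state which layer's activations are flattened.

    import numpy as np
    from tensorflow.keras.applications import VGG16, ResNet50
    from tensorflow.keras.applications.vgg16 import preprocess_input as vgg_pre
    from tensorflow.keras.applications.resnet50 import preprocess_input as res_pre

    # Frozen ImageNet backbones; include_top=False drops the classification head
    vgg = VGG16(weights="imagenet", include_top=False, pooling="avg")
    res = ResNet50(weights="imagenet", include_top=False, pooling="avg")

    def extract_features(quadrant):
        # quadrant: (H/2, W/2, 3) uint8 array; returns one flat feature vector
        batch = np.expand_dims(quadrant.astype("float32"), axis=0)
        f_vgg = vgg.predict(vgg_pre(batch.copy()), verbose=0).ravel()
        f_res = res.predict(res_pre(batch.copy()), verbose=0).ravel()
        return np.concatenate([f_vgg, f_res])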
FIGURE 5. The architectural design of the proposed methodology, employing machine learning, transfer learning models, and an ensemble classifier.
Subsequently, the features gathered from the cropped images were flattened into one-dimensional arrays. This flattening process ensures that the feature representations are appropriately structured for further analysis and model training.
The flattened feature sets for each image were saved as arrays and organized sequentially within a list. Simultaneously, the respective class labels that designated each image category were recorded in a separate list. This linkage between feature sets and class labels is crucial for subsequent supervised machine learning tasks. Our research tracked the number of classes throughout this careful preprocessing pathway, effectively tracking the overall number of unique categories within the dataset. These rigorous preparation stages combine to convert the raw image data into a structured format with features. The retrieved features of the ResNet50 and VGG16 models were positioned to provide helpful input for further machine learning tasks, such as image identification and classification. This thorough preprocessing method improves the dataset's quality and suitability for testing and training machine learning models, eventually making it easier to accurately recognize and classify objects or patterns within the dataset.
FIGURE 6. Visualizing the resizing of the raw image to a standardized dimension during the initial preprocessing stage.
FIGURE 7. Visualizing the image scaling, denoising, and histogram equalization that form integral parts of the preprocessing steps.
E. FEATURE SELECTION AND FUSION
In this section, we elucidate the essential feature selection and fusion process, which is the cornerstone of our image analysis approach. This process involves dividing an image into four distinct regions, extracting features using VGG16 and ResNet50, and employing an Extra Tree Classifier to extract pertinent features from each part. These regional features were fused harmoniously to represent the entire image comprehensively. This innovative approach optimizes the feature selection, enabling our model to capture nuanced details and patterns across an image. The resulting fused features serve as the foundation for the subsequent phases of our analysis, offering a richer and more discriminative input for our machine learning algorithms. The process was initiated by dividing the input image into four distinct parts, employing a systematic partitioning strategy to ensure comprehensive coverage of the image content. This division was designed to facilitate localized feature extraction while preserving the spatial relationships within each region. The partitioned images were independently processed to obtain region-specific information. The Extra Tree Classifier performs feature selection for each of the four image regions. The choice of the Extra Tree Classifier was motivated by its robustness in identifying informative features while minimizing the risk of overfitting. By independently processing each region, we capture unique patterns and attributes that may not be discernible in the entire image context. The result is a set of region-specific feature vectors that collectively encapsulate the diverse characteristics present within the image. The regional feature vectors obtained from the Extra Tree Classifier were harmoniously fused to comprehensively represent the entire image. This fusion process integrates the localized knowledge gathered from each partition, enabling the model to consider both fine-grained and holistic information during subsequent analyses.
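A sketch of this per-region selection and fusion with scikit-learn follows. The selection threshold (SelectFromModel's default) and the tree count are illustrative; criterion="entropy" makes the importance scores information-gain based, matching the feature-reduction goal stated in the introduction.

    import numpy as np
    from sklearn.ensemble import ExtraTreesClassifier
    from sklearn.feature_selection import SelectFromModel

    def select_and_fuse(region_features, labels):
        # region_features: four (n_samples, n_features) arrays, one per quadrant
        selected = []
        for X_region in region_features:
            selector = SelectFromModel(
                ExtraTreesClassifier(n_estimators=100, criterion="entropy",
                                     random_state=0)
            ).fit(X_region, labels)
            selected.append(selector.transform(X_region))
        # Fuse the region-specific vectors into one representation per image
        return np.hstack(selected)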
FIGURE 8. The images of various cancers are partitioned into four smaller
sub-images.
F. ENSEMBLE OF CLASSIFIERS
Machine learning techniques have been successfully used in medical applications. It is difficult to determine which ML classifiers are better than others because their applicability and performance depend on the application and the characteristics of the dataset. For instance, simpler ML algorithms learn more effectively from small datasets and prevent overfitting; they also have high bias and low variance. Therefore, to evaluate the effectiveness of our feature extraction techniques objectively, we selected a collection of ML classifiers to perform the task using the acquired integrated features. Figure 9 shows the selected ensemble of ML classifiers. In the context of our research, we employed various machine learning techniques, each with unique strengths and applications.
FIGURE 9. The representation of ensemble learning methods illustrates their function, which relies on combining predictions from multiple classifiers.
Moreover, this method uses an ensemble of classifiers to prevent overfitting, which is a common issue in machine learning. Ensemble techniques gather the judgments of several classifiers, enhance the generalization performance of the model, and reduce the possibility of retaining noise in the training set. This is particularly important for categorizing histopathology images, where the datasets may be small and noisy. Another issue in this area is class imbalance, in which some carcinoma subtypes have fewer samples than others. When combined with appropriate methods, such as weighted voting, the fusion technique may address these problems with class imbalance by ensuring that the model's performance is not biased in favor of the majority class, which can be crucial for efficient clinical diagnosis.
Ensemble learning is a complex strategy in machine learning that combines insights from four base learners by aggregating their outputs using a weighted voting technique. This synergy between base-level models and their consensus-based decision-making process often demonstrates superior classification accuracy, underlining its critical role in the effectiveness and robustness of our model.
G. ALGORITHM
The algorithm for breast cancer classification from histopathological images begins with a preprocessing phase, in which each image is resized to a standardized dimension of 250 × 250 pixels. This resizing ensures uniformity across the dataset, thereby facilitating consistent feature extraction. The resized images then undergo denoising using a Gaussian filter, which helps reduce noise and enhance image quality. Following denoising, histogram equalization was applied to improve the contrast of the images, making the features more distinct and accessible for analysis. In the feature engineering and extraction phase, each preprocessed image is first converted to the RGB color standard and then divided into four sub-images to capture more localized features. These sub-images were processed using two transfer learning models, ResNet50 and VGG16, to extract deep features representing essential image patterns. The extracted features from both models were then fused and refined using an Extra Tree Classifier, which selected the most relevant features for classification. Finally, machine learning models were trained on these selected features, and their predictions were combined using an ensemble voting method to improve the classification accuracy. The performance of the models was evaluated using a range of metrics, including accuracy, precision, recall, F1-score, sensitivity, specificity, loss, IoU, MCR, AUC-ROC, and the confusion matrix, to ensure a comprehensive assessment of the effectiveness of the algorithm.
V. IMPLEMENTATION
The implementation for this study was developed and executed in the Google Colab environment, with Python serving as the primary programming language. Several essential libraries, including NumPy, PIL (Python Imaging Library), Matplotlib, Keras, scikit-learn, and Yellowbrick, were employed to accomplish various tasks and functions within the work.
Google Colab was used to run the code, which gives users access to a virtual machine with hardware acceleration capabilities. The specific hardware includes an Intel Core i9 CPU and an NVIDIA GeForce RTX 3070 GPU. The code employs the Python Imaging Library (PIL) to process and manipulate the images. After converting all the loaded images to RGB format, they were uniformly resized to dimensions of 250 × 250 pixels.
The code utilizes two pre-trained deep learning models, VGG16 and ResNet50, for feature extraction from the photos. An algorithm further divides each image into four quadrants, and the pre-trained models are employed to extract feature vectors from these cropped regions. The extracted features from the image quadrants were stored in arrays X1, X2, X3, and X4. Each collection of these features is associated with a corresponding set of class labels denoted as Y. An Extra Tree Classifier was employed to perform the feature selection. This classifier was individually trained on each set of features to identify the most critical traits in the data. Input data (X) were generated for model training by concatenating the selected features. Class labels (Y) were transformed into numeric values using the LabelEncoder.
Algorithm 1 Breast Cancer Classification From Histopathological Images
Let D be the dataset of histopathological images, where each image is denoted by Xi and its corresponding class label by Yi, where i ranges from 1 to the total number of images in the dataset.
Input: Histopathological images Xi
Output: ClassificationResult[i] = predict(Xi)
1. Preprocessing phase:
   1.1 Scale or resize images: Xi = resize(Xi, 250 × 250)
   1.2 Denoise the resized image using the Gaussian filter: Xi = GaussianBlur(Xi, σ)
   1.3 Histogram equalization of the denoised image: Xi = histogramEqualization(Xi)
2. Feature engineering and feature extraction phase:
   2.1 Process each image: convert it to the RGB color standard and resize: Xi = convertToRGBAndResize(Xi)
   2.2 Split images into four sub-images: let Xij represent the jth sub-image of Xi, where j ranges from 1 to 4
   2.3 Extract features using the ResNet50 and VGG16 transfer learning models: let Fij represent the extracted features of Xij: Fij = ResNet50(Xij) and VGG16(Xij)
   2.4 Save feature arrays and record class labels: features[i, j] = Fij; labels[i, j] = Yi
3. Feature selection and feature fusion:
   3.1 Feature selection using the Extra Tree Classifier: Fij_new = ExtraTreeClassifier(Fij); fuse the features as Fi for each image Xi
4. Machine learning classification:
   4.1 Train machine learning models: Model_k = trainModel(Fi, Yi, k), where k indexes the different models
   4.2 Utilize ensemble learning techniques: let Prediction_k represent the predictions of Model_k; combine the Prediction_k of each Model_k using the voting ensemble method
5. Evaluate model performance:
   5.1 Use benchmark evaluation metrics: Accuracy, Precision, Recall, F1-score, Sensitivity, Specificity, loss, IoU, MCR, AUC-ROC, confusion matrix
The scikit-learn function was then used to split the dataset into separate training and testing sets for further analysis. Four distinct models—Logistic Regression, SVM, Extra Tree Classifier, and Ridge Classifier—were chosen to compare the performances of the various classifiers. These four base classifiers were collectively employed to construct an ensemble model, known as the Voting Classifier. This ensemble approach comprehensively evaluates the effectiveness of the models in classification tasks. The code employs cross-validation to evaluate the performance of each classifier, with accuracy as the primary assessment metric. This metric provides essential insights into how effectively the models classify data and offers valuable information regarding their capabilities. The code utilizes the Yellowbrick package to generate a confusion matrix for Logistic Regression. This visualization tool offers a graphical representation of the model's performance on the test dataset, aiding the assessment of its classification accuracy and error rates.
VI. RESULTS AND DISCUSSIONS
This section presents the results of our customized ensemble model and various classifiers, using a publicly accessible dataset of breast cancer cases. A variety of metrics from Equations (2) to (9), including the F1-score, precision, sensitivity, specificity, accuracy, misclassification rate, and intersection over union (IoU) [62], can be used to assess the proposed customized model.

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100 \quad (2)
\text{MCR} = \frac{FP + FN}{TP + TN + FP + FN} \times 100 \quad (3)
\text{Precision} = \frac{TP}{TP + FP} \times 100 \quad (4)
\text{Sensitivity} = \frac{TP}{TP + FN} \times 100 \quad (5)
\text{Specificity} = \frac{TN}{TN + FP} \times 100 \quad (6)
\text{Recall} = \frac{TP}{TP + FN} \times 100 \quad (7)
\text{F1-score} = \frac{2TP}{2TP + FP + FN} \times 100 \quad (8)
\text{IoU} = \frac{TP}{TP + FN + FP} \times 100 \quad (9)

A. RESULTS
The preprocessing stage of the data pipeline is essential because it lays the groundwork for reliable and efficient machine learning. This section summarizes the outcomes and conclusions of the several pre-processing methods. In the dataset, the images underwent essential pre-processing steps. First, they were converted to the RGB color standard to guarantee data uniformity across the various color formats. Additionally, the images were resized to standard dimensions of 250 × 250 pixels. This resizing simplifies the algorithm and ensures uniform input sizes for the subsequent processing. A pivotal step in the preprocessing stage involves feature extraction, a process in which crucial information is derived from the images. This was achieved using pre-trained deep learning models.
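For concreteness, Equations (2)-(9) reduce to a few lines once the one-vs-rest confusion counts TP, FP, TN, and FN are available for a class. A sketch; all returned values are percentages:

    def class_metrics(tp, fp, tn, fn):
        total = tp + tn + fp + fn
        return {
            "accuracy":    100 * (tp + tn) / total,             # Eq. (2)
            "mcr":         100 * (fp + fn) / total,             # Eq. (3)
            "precision":   100 * tp / (tp + fp),                # Eq. (4)
            "sensitivity": 100 * tp / (tp + fn),                # Eq. (5), same as recall (7)
            "specificity": 100 * tn / (tn + fp),                # Eq. (6)
            "f1_score":    100 * 2 * tp / (2 * tp + fp + fn),   # Eq. (8)
            "iou":         100 * tp / (tp + fn + fp),           # Eq. (9)
        }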
TABLE 1. Class report of different classifiers on various cancer classes.
TABLE 3. Comparison of different performance measures on different classifiers.
FIGURE 20. Evaluating the performance of the ridge classifier using the precision-recall curve.
FIGURE 23. Evaluating the performance of the extra tree classifier using the class prediction error chart of different carcinoma classes.
goodness of the pre-trained models, which also requires more explainability.
FIGURE 25. Evaluating the performance of the ridge classifier using the class prediction error chart of different carcinoma classes.
In Table 5, each row represents a different classifier, and each column corresponds to a fold in the cross-validation process, which involves training the model on k−1 folds and testing on the remaining fold, repeating this process k times. Hence, each fold served as the test set. The average accuracy across all folds estimates the overall performance of the model, allowing us to measure the consistency and stability of the performance of each classifier. The Voting Classifier consistently achieved higher accuracy than the others, demonstrating its robustness. At the same time, Logistic Regression and the Ridge Classifier also showed relatively stable performance, while the Extra Tree Classifier and SVM exhibited more variation in accuracy across the folds. This cross-validation table offers a comprehensive view of the classifiers' performance across multiple iterations and different data subsets, helping to assess their generalizability and providing a more reliable estimate of their performance on unseen data.
FIGURE 26. Evaluating the performance of the SVM classifier using the class prediction error chart of different carcinoma classes.
Table 6 presents the performance evaluation results of the different ensemble methods using a publicly available dataset of images related to breast cancer. The table displays the accuracies of the various approaches. With an accuracy of 96.8%, the proposed voting classifier outperformed the bagging (92.4%) and stacking (80.9%) classifiers.
TABLE 4. Comparison of average accuracy, recall, precision, and F1 score of different classifiers.
TABLE 6. Performance evaluation of various ensemble methods.
This study proposes a unique method that painstakingly splits each input image into four quadrants to improve comprehension. These procedures include data preparation, such as image standardization and scaling. These preprocessing steps result in feature sets that contain the extracted data. This study used label encoding techniques to simplify categorization by converting class names into binary labels that are suitable for training machine learning models. The development of a transfer learning model forms the core of this study. Notably, this work goes beyond transfer learning and incorporates several machine learning models, such as the Voting Classifier, Logistic Regression, SVM, Extra Tree Classifier, and Ridge Classifier. This all-encompassing strategy enables the complete evaluation of several machine learning models.
image classification. Each model was trained using three optimizers—Adam, RMSProp, and SGD—across varying epochs. The findings revealed that the Adam optimizer consistently yielded the highest accuracy and minimal model loss across the training and validation sets. Notably, the hybrid CNN-LSTM model emerged as the top performer, achieving an impressive accuracy of 92.5% for the multiclass classification of cancer subtypes.
Rana and Bhushan [51] highlight the efficacy of transfer learning models in automating tumor classification, without the need for augmentation or preprocessing techniques. Seven transfer learning models (LeNet, VGG16, DarkNet53, DarkNet19, ResNet50, Inception, and Xception) were deployed on the BreakHis dataset for tumor classification. Among these models, Xception emerged as the top performer, achieving an accuracy of 83.07%. This study also introduced a novel metric, Balanced Accuracy (BAC), to address the challenges of unbalanced datasets. DarkNet53 was particularly effective in computing BAC, achieving a notable accuracy rate of 87.17%.
Wang et al. [52] proposed a method to overcome the challenges of a limited sample size, time-consuming feature design, and low accuracy in breast cancer pathology image detection and classification. This algorithm integrates deep and transfer learning techniques to categorize breast cancer pathology images from the BreakHis dataset effectively. Rooted in the DenseNet structure of deep neural networks, the model underwent multi-level transfer learning training, leveraging pre-existing knowledge from large-scale datasets. The experimental results demonstrate the efficiency of the algorithm, achieving an accuracy of over 84.0% on the test set.
Prakash et al. [53] explored the effectiveness of utilizing transfer learning with four network architectures—AlexNet, VGG16, Inception v3, and DenseNet121—for classifying histopathological images. By removing the top dense layers and incorporating custom dense layers tailored for classification, the models were fine-tuned using a histopathology input image dataset to adapt to the specific task of breast cancer classification. The pre-trained DenseNet121 model emerged as the top performer, achieving a remarkable classification accuracy of 0.9520.
Folorunso et al. [54] focused on employing transfer learning methods to investigate breast cancer image classification and detection, covering aspects such as pre-processing, pre-trained models, and various machine learning (ML) models. Through practical experimentation, this research constructs pre-trained neural network architectures using EfficientNets along with ML classification models. Specifically, support vector machine (SVM) and eXtreme Gradient Boosting (XGBoost) algorithms were trained on breast cancer datasets. The outcomes revealed comparable yet satisfactory performance for both EfficientNetB4 and XGBoost. Notably, XGBoost achieved an accuracy rate of 84%, accompanied by recall, precision, and F1-score values of 0.80, 0.83, and 0.81, respectively.
Joshi et al. [55] presented a framework based on a deep convolutional neural network (CNN). Pre-trained CNN models, such as EfficientNetB0, ResNet50, and Xception, were tested, with custom layers replacing the top layers to tailor the architecture for breast cancer detection. The customized Xception model performed better, achieving 93.33% accuracy on BreakHis images. Training involved 70% of the BreakHis images, with 30% reserved for testing and validation. Data augmentation, dropout, and batch normalization were used for regularization.
Mani and Kamalakannan [56] proposed a novel deep convolutional neural network (CNN)-based transfer learning model to accurately and effectively classify breast cancer in women. The model utilizes a pre-trained Inception-V3Net architecture. Initially, the model was configured for binary classification and subsequently extended to classify breast cancer histopathological images on a multiclass basis. The proposed model achieved the highest average accuracy of 94.8% across various magnification factors.
Nakach et al. [57] introduced a novel approach that combines transfer and ensemble learning to classify histological images of breast cancer across various magnification factors. The proposed method employs bagging ensembles, utilizing hybrid architectures that merge pre-trained deep learning techniques for feature extraction with machine learning classifiers, such as MLP, SVM, and KNN, as base learners. This study systematically evaluated and compared different aspects, including the performance of bagging ensembles with their base learners, varying numbers of base learners, single classifiers against the best bagging ensembles, and top-performing bagging ensembles across different feature extractors and magnification factors. The best-performing bagging ensemble was identified through statistical tests, such as the Scott Knott (SK) test and the Borda Count voting system, achieving a mean accuracy value of 93.98%. Notably, this ensemble comprises three base learners, uses a 200× magnification factor, employs MLP as a classifier, and uses DenseNet201 as a feature extractor.
Histopathological images generated by optical microscopes often contain noise, which can cause significant performance fluctuations in well-trained convolutional neural networks (CNNs) used for image classification. This highlights the critical role of image quality in determining classification accuracy. Although the wavelet transform is a popular choice for denoising, selecting the appropriate
threshold presents challenges, and conventional methods struggle to achieve accurate and efficient threshold selection. To overcome this obstacle, Liu et al. [58] proposed an innovative adaptive threshold selection method that combines the threshold selection steps with deep learning techniques. By incorporating the threshold as a parameter in the CNN model during training, this approach establishes a connection between the threshold and the model's classification outcome, enabling the determination of an optimal threshold value through back-propagation during training. The results demonstrated substantial enhancements in classification accuracy compared with conventional threshold selection methods, with accuracy rates of 94.37%, 93.85%, 91.63%, and 93.31% for BreakHis images at magnifications of 40×, 100×, 200×, and 400×, respectively.
TABLE 7. Accuracy of the state-of-the-art study comparison.
Maleki et al. [59] introduced methodologies to improve the speed and accuracy of histopathological image classification, which is a crucial aspect of effective therapeutic interventions. Three different classifiers and six pre-trained networks were evaluated. Initially, a pre-trained model extracts features from the images, which are then fed into the extreme gradient boosting (XGBoost) method, which is selected as the final classifier. The methodology is grounded in transfer learning, leveraging histopathological images as inputs. The evaluation used the BreakHis dataset, encompassing histopathological images across four magnification levels: 40X, 100X, 200X, and 400X. The proposed method achieved accuracies of 93.6%, 91.3%, 93.8%, and 89.1%, respectively. Following an analysis of the attained accuracy rates, the final proposed approach combines the DenseNet201 model as a feature extractor with XGBoost as the classifier.
Burrai et al. [60] employed three distinct methodologies,
each integrating a convolutional neural network (CNN) as
a feature extractor, specifically VGG16, Inception v3, and
EfficientNet, in conjunction with a classifier (either sup-
port vector machines (SVM) or stochastic gradient boosting
(SGB)) positioned atop the neural network. Initially trained
on a human breast cancer dataset (BreakHis) with accuracies ranging from 0.86 to 0.91, these models were applied to the CMT dataset, resulting in accuracies from 0.63 to 0.85 across all architectures. Particularly noteworthy is the performance of the combination of the EfficientNet framework with the SVM, which demonstrated the most promising results, achieving accuracies ranging from 0.82 to 0.85.
Wang et al. [61] introduced a novel classification approach aimed at improving the accuracy of diagnosing benign and malignant breast tumors. Their method employs various data augmentation techniques, including staining normalization, image patch generation, and spatial geometry transformation, to effectively augment the training set. They leverage BCMNet (Breast Classification Fusion Multi-Scale Feature Network), which integrates VGG16 and CBAM, to combine spatial, channel, and multi-scale features extracted from input images. Evaluated on the BreakHis dataset, their approach achieved average accuracies of 91.91%, 91.14%, 92.65%, and 87.56% across four different magnifications for breast histopathology image classification. In addition, at the patient level, the proposed method achieved an impressive F1 score of 95.02% and an average recognition accuracy of 92.20%.
In Table 7, we analyze the performance of various state-of-the-art architectural models alongside our proposed model and present a thorough comparison of the different techniques. Among these, the proposed method has emerged as a prominent approach, demonstrating its effectiveness with an accuracy of 96.84% on the BreakHis dataset, highlighting its capability to classify histological images of breast cancer. The earlier sections provided a detailed description and results of our proposed model, but the comparisons focused on accuracy and methods. Our proposed model achieved superior results, attaining an accuracy of 96.84%, surpassing all other reviewed methodologies.
VII. CONCLUSION AND FUTURE WORK
Finally, regarding the problem of breast cancer image classification, our research presents a solution that carefully
integrates the benefits of transfer learning, ensemble learning, [14] Z. Gandomkar, P. C. Brennan, and C. Mello-Thoms, ‘‘MuDeRN: Multi-
and conventional machine learning techniques. We extract category classification of breast histopathological image using deep
residual networks,’’ Artif. Intell. Med., vol. 88, pp. 14–24, Jun. 2018.
complex features by using the representational capability of [15] C. Beltran-Perez, H.-L. Wei, and A. Rubio-Solis, ‘‘Generalized multiscale
deep learning by segmenting images into discrete regions and RBF networks and the DCT for breast cancer detection,’’ Int. J. Autom.
using transfer learning with ResNet50 and VGG16. Subse- Comput., vol. 17, no. 1, pp. 55–70, Feb. 2020.
quently, an Extra Tree Classifier was applied to enable more [16] A. Kumar, S. K. Singh, S. Saxena, K. Lakshmanan, A. K. Sangaiah,
H. Chauhan, S. Shrivastava, and R. K. Singh, ‘‘Deep feature learning for
detailed analysis at the regional level. The ensemble learning histopathological image classification of canine mammary tumors and
architecture, which includes SVM, Ridge Classifier, Extra human breast cancer,’’ Inf. Sci., vol. 508, pp. 405–421, Jan. 2020.
Tree Classifier, and logistic regression, guarantees a more [17] X. Li, X. Shen, Y. Zhou, X. Wang, and T.-Q. Li, ‘‘Classification of breast
cancer histopathological images using interleaved DenseNet with SENet
robust and interpretable model and improves overall classi- (IDSNet),’’ PLoS ONE, vol. 15, no. 5, May 2020, Art. no. e0232127.
fication accuracy. Our extensive experimental assessments [18] D. M. Vo, N.-Q. Nguyen, and S.-W. Lee, ‘‘Classification of breast cancer
indicated the potential of our method and produced better histology images using incremental boosting convolution networks,’’ Inf.
results than the single classifiers. This study addresses the Sci., vol. 482, pp. 123–138, May 2019.
[19] S. Saxena, S. Shukla, and M. Gyanchandani, ‘‘Breast cancer histopathol-
requirement for increased accuracy in medical image analysis ogy image classification using kernelized weighted extreme learning
and interpretability issues, contributing to the advancement machine,’’ Int. J. Imag. Syst. Technol., vol. 31, no. 1, pp. 168–179,
of breast cancer image classification by providing a uni- Mar. 2021.
fied technique. Subsequent investigations should examine [20] M. Z. Alom, C. Yakopcic, M. S. Nasrin, T. M. Taha, and V. K. Asari,
‘‘Breast cancer classification from histopathological images with inception
the scalability and generalizability of this methodology on recurrent residual convolutional neural network,’’ J. Digit. Imag., vol. 32,
various datasets, thereby facilitating the development of more no. 4, pp. 605–617, Aug. 2019.
efficient and reliable diagnostic instruments for breast cancer. [21] S. Boumaraf, X. Liu, Z. Zheng, X. Ma, and C. Ferkous, ‘‘A new transfer
learning based approach to magnification dependent and independent clas-
sification of breast cancer in histopathological images,’’ Biomed. Signal
REFERENCES Process. Control, vol. 63, Jan. 2021, Art. no. 102192.
[1] S. P. Power, F. Moloney, M. Twomey, K. James, O. J. O’Connor, and [22] K. C. Burçak, Ö. K. Baykan, and H. Uğuz, ‘‘A new deep convolutional neu-
M. M. Maher, ‘‘Computed tomography and patient risk: Facts, perceptions ral network model for classifying breast cancer histopathological images
and uncertainties,’’ World J. Radiol., vol. 8, no. 12, pp. 902–915, 2016. and the hyperparameter optimisation of the proposed model,’’ J. Supercom-
[2] V. P. Grover, J. M. Tognarelli, M. M. Crossey, I. J. Cox, put., vol. 77, no. 1, pp. 973–989, Jan. 2021.
S. D. Taylor-Robinson, and M. J. McPhail, ‘‘Magnetic resonance [23] J. Xie, R. Liu, J. Luttrell, and C. Zhang, ‘‘Deep learning based analysis
imaging: Principles and techniques: Lessons for clinicians,’’ J. Clin. Exp. Hepatol., vol. 5, no. 3, pp. 246–255, Sep. 2015.
[3] M. Kapoor and A. Kasi, PET Scanning. Treasure Island, FL, USA: StatPearls, 2022. Accessed: Oct. 3, 2022. [Online]. Available: https://www.ncbi.nlm.nih.gov/books/NBK559089
[4] P. Rajpurkar, E. Chen, O. Banerjee, and E. J. Topol, ‘‘AI in health and medicine,’’ Nature Med., vol. 28, no. 1, pp. 31–38, Jan. 2022.
[5] A. Singh. Foundations of Machine Learning. Accessed: Jun. 6, 2019. [Online]. Available: https://ssrn.com/abstract=3399990
[6] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image recognition,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Las Vegas, NV, USA, Jun. 2016, pp. 770–778.
[7] K. Simonyan and A. Zisserman, ‘‘Very deep convolutional networks for large-scale image recognition,’’ 2014, arXiv:1409.1556.
[8] C.-Y.-J. Peng, K. L. Lee, and G. M. Ingersoll, ‘‘An introduction to logistic regression analysis and reporting,’’ J. Educ. Res., vol. 96, no. 1, pp. 3–14, Sep. 2002.
[9] T. Evgeniou and M. Pontil, ‘‘Support vector machines: Theory and applications,’’ in Machine Learning and Its Applications (Lecture Notes in Computer Science), vol. 2049, G. Paliouras, V. Karkaletsis, and C. D. Spyropoulos, Eds. Berlin, Germany: Springer, 2001, pp. 249–257.
[10] B. S. Bhati and C. S. Rai, ‘‘Ensemble based approach for intrusion detection using extra tree classifier,’’ in Intelligent Computing in Engineering (Advances in Intelligent Systems and Computing), vol. 1125, V. Solanki, M. Hoang, Z. Lu, and P. Pattnaik, Eds. Singapore: Springer, 2020, pp. 213–220.
[11] M. Grüning and S. Kropf, ‘‘A ridge classification method for high-dimensional observations,’’ in From Data and Information Analysis to Knowledge Engineering (Studies in Classification, Data Analysis, and Knowledge Organization), M. Spiliopoulou, R. Kruse, C. Borgelt, A. Nürnberger, and W. Gaul, Eds. Berlin, Germany: Springer, 2006, pp. 684–691.
[12] A. Alqudah and A. M. Alqudah, ‘‘Sliding window based support vector machine system for classification of breast cancer using histopathological microscopic images,’’ IETE J. Res., vol. 68, no. 1, pp. 59–67, Jan. 2022.
[13] M. Gour, S. Jain, and T. S. Kumar, ‘‘Residual learning based CNN for breast cancer histopathological image classification,’’ Int. J. Imag. Syst. Technol., vol. 30, no. 3, pp. 621–635, Sep. 2020.
[23] J. Xie, R. Liu, J. Luttrell, and C. Zhang, ‘‘Deep learning based analysis of histopathological images of breast cancer,’’ Frontiers Genet., vol. 10, pp. 1–19, Feb. 2019.
[24] Y. Jiang, L. Chen, H. Zhang, and X. Xiao, ‘‘Breast cancer histopathological image classification using convolutional neural networks with small SE-ResNet module,’’ PLoS ONE, vol. 14, no. 3, Mar. 2019, Art. no. e0214587.
[25] Z. Han, B. Wei, Y. Zheng, Y. Yin, K. Li, and S. Li, ‘‘Breast cancer multi-classification from histopathological images with structured deep learning model,’’ Sci. Rep., vol. 7, no. 1, p. 4172, Jun. 2017.
[26] R. Kumar, R. Srivastava, and S. Srivastava, ‘‘Detection and classification of cancer from microscopic biopsy images using clinically significant and biologically interpretable features,’’ J. Med. Eng., vol. 2015, Aug. 2015, Art. no. 457906.
[27] T. S. Sheikh, Y. Lee, and M. Cho, ‘‘Histopathological classification of breast cancer images using a multi-scale input and multi-feature network,’’ Cancers, vol. 12, no. 8, p. 2031, Jul. 2020.
[28] A.-A. Nahid, M. A. Mehrabi, and Y. Kong, ‘‘Histopathological breast cancer image classification by deep neural network techniques guided by local clustering,’’ BioMed Res. Int., vol. 2018, Mar. 2018, Art. no. 2362108.
[29] C. Zhu, F. Song, Y. Wang, H. Dong, Y. Guo, and J. Liu, ‘‘Breast cancer histopathology image classification through assembling multiple compact CNNs,’’ BMC Med. Informat. Decis. Making, vol. 19, no. 1, p. 198, Oct. 2019.
[30] L. Gong, J. Yang, and X. Zhang, ‘‘Semi-supervised breast histological image classification by node-attention graph transfer network,’’ IEEE Access, vol. 8, pp. 158335–158345, 2020.
[31] K. George, S. Faziludeen, P. Sankaran, and K. P. Joseph, ‘‘Breast cancer detection from biopsy images using nucleus guided transfer learning and belief based fusion,’’ Comput. Biol. Med., vol. 124, Sep. 2020, Art. no. 103954.
[32] K. O’Shea and R. Nash, ‘‘An introduction to convolutional neural networks,’’ 2015, arXiv:1511.08458.
[33] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, ‘‘Densely connected convolutional networks,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Honolulu, HI, USA, Jul. 2017, pp. 2261–2269.
[34] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, ‘‘Rethinking the inception architecture for computer vision,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Las Vegas, NV, USA, Jun. 2016, pp. 2818–2826.
[35] S. A. Billings, H.-L. Wei, and M. A. Balikhin, ‘‘Generalized multiscale radial basis function networks,’’ Neural Netw., vol. 20, no. 10, pp. 1081–1094, Dec. 2007.
[36] A. Hosna, E. Merry, J. Gyalmo, Z. Alom, Z. Aung, and M. A. Azim, ‘‘Transfer learning: A friendly introduction,’’ J. Big Data, vol. 9, no. 1, pp. 1–19, Oct. 2022.
[37] V. Wiley and T. Lucas, ‘‘Computer vision and image processing: A paper review,’’ Int. J. Artif. Intell. Res., vol. 2, no. 1, pp. 28–36, Jun. 2018.
[38] D. Khurana, A. Koli, K. Khatter, and S. Singh, ‘‘Natural language processing: State of the art, current trends and challenges,’’ Multimedia Tools Appl., vol. 82, no. 3, pp. 3713–3744, Jan. 2023.
[39] A. Mohammed and R. Kora, ‘‘A comprehensive review on ensemble deep learning: Opportunities and challenges,’’ J. King Saud Univ., Comput. Inf. Sci., vol. 35, no. 2, pp. 757–774, Feb. 2023.
[40] F. Leon, S.-A. Floria, and C. Badica, ‘‘Evaluating the effect of voting methods on ensemble-based classification,’’ in Proc. IEEE Int. Conf. Innov. Intell. Syst. Appl. (INISTA), Gdynia, Poland, Jul. 2017, pp. 1–6.
[41] F. A. Spanhol, L. S. Oliveira, C. Petitjean, and L. Heutte, ‘‘A dataset for breast cancer histopathological image classification,’’ IEEE Trans. Biomed. Eng., vol. 63, no. 7, pp. 1455–1462, Jul. 2016.
[42] G. Deng and L. W. Cahill, ‘‘An adaptive Gaussian filter for noise reduction and edge detection,’’ in Proc. IEEE Conf. Rec. Nucl. Sci. Symp. Med. Imag. Conf., San Francisco, CA, USA, Oct. 1993, pp. 1615–1619.
[43] H. Yi, S. Shiyu, D. Xiusheng, and C. Zhigang, ‘‘A study on deep neural networks framework,’’ in Proc. IEEE Adv. Inf. Manage., Commun., Electron. Autom. Control Conf. (IMCEC), Xi’an, China, Oct. 2016, pp. 1519–1522.
[44] J. Bao, X. Liu, Z. Xiang, and G. Wei, ‘‘Multi-objective optimization algorithm and preference multi-objective decision-making based on artificial intelligence biological immune system,’’ IEEE Access, vol. 8, pp. 160221–160230, 2020.
[45] H. Tang and Z. Hu, ‘‘Research on medical image classification based on machine learning,’’ IEEE Access, vol. 8, pp. 93145–93154, 2020.
[46] J. Shi, H. Zhu, S. Yu, W. Wu, and H. Shi, ‘‘Scene categorization model using deep visually sensitive features,’’ IEEE Access, vol. 7, pp. 45230–45239, 2019.
[47] Y. Li, H. Zhu, Q. Hou, J. Wang, and W. Wu, ‘‘Video super-resolution using multi-scale and non-local feature fusion,’’ Electronics, vol. 11, no. 9, p. 1499, May 2022.
[48] Z. Wu, H. Zhu, L. He, Q. Zhao, J. Shi, and W. Wu, ‘‘Real-time stereo matching with high accuracy via spatial attention-guided upsampling,’’ Appl. Intell., vol. 53, no. 20, pp. 24253–24274, Oct. 2023.
[49] W. Arshad, T. Masood, T. Mahmood, A. Jaffar, F. S. Alamri, S. A. O. Bahaj, and A. R. Khan, ‘‘Cancer unveiled: A deep dive into breast tumor detection using cutting-edge deep learning models,’’ IEEE Access, vol. 11, pp. 133804–133824, 2023.
[50] M. M. Srikantamurthy, V. P. S. Rallabandi, D. B. Dudekula, S. Natarajan, and J. Park, ‘‘Classification of benign and malignant subtypes of breast cancer histopathology imaging using hybrid CNN-LSTM based transfer learning,’’ BMC Med. Imag., vol. 23, no. 1, Jan. 2023, Art. no. 19.
[51] M. Rana and M. Bhushan, ‘‘Classifying breast cancer using transfer learning models based on histopathological images,’’ Neural Comput. Appl., vol. 35, no. 19, pp. 14243–14257, Jul. 2023.
[52] W. Wang, M. Gao, M. Xiao, X. Yan, and Y. Li, ‘‘Breast cancer image classification method based on deep transfer learning,’’ 2024, arXiv:2404.09226.
[53] R. Meena Prakash, K. Ramalakshmi, S. Thayammal, R. S. S. Kumari, and H. Selvaraj, ‘‘Classification of breast tumor from histopathological images with transfer learning,’’ in Computational Intelligence in Medical Decision Making and Diagnosis. Boca Raton, FL, USA: CRC Press, 2023, pp. 113–129.
[54] S. O. Folorunso, J. B. Awotunde, Y. P. Rangaiah, and R. O. Ogundokun, ‘‘EfficientNets transfer learning strategies for histopathological breast cancer image analysis,’’ Int. J. Model., Simul., Sci. Comput., vol. 15, no. 2, Apr. 2024, Art. no. 2441009.
[55] S. A. Joshi, A. M. Bongale, P. O. Olsson, S. Urolagin, D. Dharrao, and A. Bongale, ‘‘Enhanced pre-trained Xception model transfer learned for breast cancer detection,’’ Computation, vol. 11, no. 3, p. 59, Mar. 2023.
[56] R. K. Mani and J. Kamalakannan, ‘‘Technique for breast cancer classification using semi-supervised deep convolutional neural networks with transfer learning models,’’ Current Sci., vol. 125, no. 9, 2023.
[57] F.-Z. Nakach, A. Idri, and H. Zerouaoui, ‘‘Deep hybrid bagging ensembles for classifying histopathological breast cancer images,’’ in Proc. 15th Int. Conf. Agents Artif. Intell., vol. 2, 2023, pp. 289–300.
[58] Y. Liu, X. Liu, and Y. Qi, ‘‘Adaptive threshold learning in frequency domain for classification of breast cancer histopathological images,’’ Int. J. Intell. Syst., vol. 2024, pp. 1–13, Mar. 2024.
[59] A. Maleki, M. Raahemi, and H. Nasiri, ‘‘Breast cancer diagnosis from histopathology images using deep neural network and XGBoost,’’ Biomed. Signal Process. Control, vol. 86, Sep. 2023, Art. no. 105152.
[60] G. P. Burrai, A. Gabrieli, M. Polinas, C. Murgia, M. P. Becchere, P. Demontis, and E. Antuofermo, ‘‘Canine mammary tumor histopathological image classification via computer-aided pathology: An available dataset for imaging analysis,’’ Animals, vol. 13, no. 9, p. 1563, May 2023.
[61] Y. Wang, X. Deng, H. Shao, and Y. Jiang, ‘‘Multi-scale feature fusion for histopathological image categorisation in breast cancer,’’ Comput. Methods Biomech. Biomed. Eng., Imag. Vis., vol. 11, no. 6, pp. 2350–2362, Nov. 2023.
[62] H. M. T. Khushi, T. Masood, A. Jaffar, M. Rashid, and S. Akram, ‘‘Improved multiclass brain tumor detection via customized pretrained EfficientNetB7 model,’’ IEEE Access, vol. 11, pp. 117210–117230, 2023.

SALINI SASIDHARAN NAIR received the master’s degree in computer and information technology from Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu, India, in 2013. She is currently a Research Scholar with the School of Computer Science and Engineering, VIT University, Vellore. Her research interests include data mining, artificial intelligence, and digital image processing.

MOHAN SUBAJI received the bachelor’s degree in mechanical engineering from Manonmaniam Sundaranar University, the master’s degree in business administration (marketing) from the University of Madras, and the Ph.D. degree in management information systems from Kookmin University, South Korea. He has been a Professor with VIT University since September 2000. His current research interests include model integrated computing, distributed systems, data mining, artificial intelligence, and big data analytics.