Multi-class Breast Cancer Classification Using CNN Features Hybridization
https://doi.org/10.1007/s44196-024-00593-7
RESEARCH ARTICLE
Abstract
Breast cancer has become the leading cause of cancer mortality among women worldwide, and its timely diagnosis remains a pressing goal for researchers. This research sheds light on improving the design of computer-aided detection (CAD) tools for earlier breast cancer classification. The design of CAD tools using deep learning has become popular and robust in biomedical classification systems; however, deep learning gives inadequate performance on multi-class classification problems, especially when the dataset has an uneven distribution of output targets, a problem prevalent in publicly available breast cancer datasets. To overcome this, the paper integrates the learning and discrimination ability of multiple convolutional neural networks, namely the VGG16, VGG19, ResNet50, and DenseNet121 architectures, for breast cancer classification. Accordingly, the fusion of hybrid deep features (FHDF) approach is proposed to capture more potential information and attain improved classification performance. The research utilizes digital mammogram images for earlier breast tumor detection. The proposed approach is evaluated on three public breast cancer datasets: the mammographic image analysis society (MIAS), the curated breast imaging subset of the digital database for screening mammography (CBIS-DDSM), and the INbreast databases. The attained results are compared with base convolutional neural network (CNN) architectures and a late fusion approach. For the MIAS, CBIS-DDSM, and INbreast datasets, the proposed FHDF approach provides maximum accuracies of 98.706%, 97.734%, and 98.834% in classifying the three classes of breast cancer severity.
Keywords Breast cancer · Deep neural networks · Mammogram images · Feature fusion · Late fusion · Transfer learning
T. R. Mahesh (*corresponding author) · Sannasi Chakravarthy · N. Bharanidharan · Surbhi Bhatia Khan · V. Vinoth Kumar · Ahlam Almusharraf · Eid Albalawi

Affiliations:
1 Department of ECE, Bannari Amman Institute of Technology, Sathyamangalam 638401, India
2 School of Computer Science Engineering and Information Systems (SCORE), Vellore Institute of Technology, Vellore 632014, India
3 School of Science, Engineering and Environment, University of Salford, Manchester, UK
4 Department of Electrical and Computer Engineering, Lebanese American University Byblos, Byblos, Lebanon
5 Department of Computer Science and Engineering, Faculty of Engineering and Technology, JAIN (Deemed-to-Be University), Bangalore 562112, India
6 Department of Business Administration, College of Business and Administration, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, 11671 Riyadh, Saudi Arabia
7 Department of Computer Science, School of Computer Sciences and Information Technology, King Faisal University, Al Hofuf 400-31982, Al Ahsa, Saudi Arabia
2.1 Deep Learning for Breast Cancer Classification

In recent years, several machine learning (ML) and deep learning (DL) techniques have emerged for classifying breast tumors using different input datasets. In 2017, Dhungel et al. [7] developed a CAD system for breast mass detection and classification in mammograms. For the classification part, they used a DL architecture pretrained with hand-crafted features, and they used mammograms from the INbreast database for implementation. The work revealed that the model provided a classification accuracy of 90%; the study addresses a binary classification task (benign vs malignant). In the same year, Thijs et al. [8] presented a large-scale DL design for breast cancer classification. The authors compared a recently evolved mammographic CAD tool relying on manually extracted features against a CNN, training both systems on a privately obtained mammogram dataset of around 45,000 images, and concluded that the deep CNN architectures performed well, obtaining 85.2% accuracy; this study also investigates a binary classification task.

In 2018, Xiaofei et al. [9] evaluated ten distinct deep CNN models and revealed that integrating image augmentation with CNN-based transfer learning is the most efficient way to improve classification performance on breast cancer problems; the authors used privately obtained mammogram datasets and analysed a binary classification problem. In the same year, Yemini et al. [10] developed a CAD tool using a CNN-based transfer learning approach with Google Inception-V3 as the base model. They evaluated it on digital mammograms from the INbreast dataset and obtained an AUC of 0.78, again for a binary classification task (normal vs abnormal).

Chougrad et al. [11], in 2020, proposed a CAD system intended to capture label correlation relationships for mammogram classification. They used pretrained CNN models to exploit transfer learning and fine-tuned the models with stochastic gradient descent (SGD) under a decaying learning rate. The work attained F1 scores of 0.687 and 0.617 on the INbreast and MIAS databases, respectively, addressing a multi-label classification task using transfer learning. In the same year, Shu et al. [12] presented a CNN-based CAD system for breast cancer classification with two pooling structures that differ from the conventional one. Features are extracted, and the pooling structures divide the mammogram input regions with higher malignant probabilities in accordance with the extracted features. The researchers used the DenseNet169 architecture for feature learning and modified its last layer in accordance with the pooling structure for classifying the input feature vectors. The work was tested on the INbreast dataset and attained a classification accuracy of 92.2%, addressing a multi-label classification task.

In addition to the above works, we, the authors, performed several experiments using transfer learning approaches for binary and multi-class classification problems. In 2021 [13], deep features from mammograms were extracted using AlexNet, DarkNet19, GoogleNet, VGG16, and ResNet CNN models, with classification performed by typical ML algorithms such as K-nearest neighbour (KNN), Naïve Bayes (NB), Ensemble, and support vector machine (SVM) classifiers; the hyperparameters were tuned automatically using Bayesian optimization. In 2022 [14], ResNet18-based deep feature extraction was performed, and classification proceeded with an extreme learning machine (ELM) model optimized by an enhanced crow-search algorithm. In the same year [15], experiments with transfer learning approaches covered different strategies for deep feature extraction, feature selection, feature fusion, and feature classification. All these works were carried out on the MIAS, CBIS-DDSM, and INbreast datasets, with a maximum classification accuracy of 95%. In the same spirit, a new feature fusion approach (FHDF) is proposed in this paper to further enhance the performance of multi-class classification.

From the literature [7–14], it is inferred that most researchers have focused on binary classification; however, multi-class classification is significant in real-time scenarios. It is also noted that some researchers employed pre-segmented image inputs for their classification tasks. Furthermore, a transferred architecture alone is often incapable of capturing the representations of image inputs, and conventional feature vectors cannot deliver the optimality of CAD systems in a promising manner. Thus, this work examines a hybrid fusion approach to address these problems.

2.2 Related Works in Computer Vision Tasks Using Feature Fusion Approach

Several research works employ the fusion of extracted features; some are summarized below. In [16], the researchers developed a hybrid fusion CAD model based on the integration of early and late fusion for glaucoma classification: central and Hu moments and gray level co-occurrence matrix (GLCM) features are fused with CNN features, and classification is performed with the SVM algorithm. In [17], the authors employed multi-structure fusion of CNN features for the classification of satellite remote sensing scenes; GoogLeNet, VGG-16, and CaffeNet are adopted for extracting the feature vectors, which are fused using a fusion network. In [18], an ensemble of multiple deep architectures is fused for classifying medical image inputs; the results revealed that the ensemble technique provides better classification when combined with fused features. In [19], the authors developed a CAD model for the classification of skin lesions using features fused from the VGG16 and AlexNet models, and found that the fused features provide better accuracy than the individual feature vectors.

The research works of [16–19] reveal that deep learning using CNNs has emerged as one of the most substantial machine learning tools in medical classification problems, outpacing the classification performance of conventional classification models and of human recognition. The convolution operation in CNNs simplifies an input image from several thousands of pixels to smaller feature maps, reducing the input dimension while retaining significant representations. It is also noted that transfer learning is especially helpful for extracting deep features: it is a machine learning approach in which a CNN architecture trained to solve one problem is re-used on another, related problem. Moreover, the mentioned research works utilized feature fusion for improved feature representation of the applied images. As a result, these deep feature fusion-based approaches provide superior classification results compared with conventional hand-crafted and individual deep features.

The significant contributions of the proposed work are:

1. To the extent of our knowledge, this paper is the first to use the FHDF approach for the three-class classification of breast cancer.
2. A better preprocessing approach is employed for pectoral muscle removal in mammograms.
3. Deep learning models with improved architecture, namely VGG16, VGG19, ResNet50, and DenseNet121, are presented for extricating complementary feature vectors pertaining to the different depths of the CNN models.
4. An enhanced FHDF approach is proposed to adaptively fuse the CNN features through a dense layer combined with softmax, batch normalization, and dropout layers.
3 Proposed Framework

This section presents how the mammogram inputs are preprocessed for the further stages, how the resultant mammograms are augmented, how deep features are extracted from the augmented data, and how the proposed fusion of hybrid deep features network is constituted.

3.1 Preprocessing of Mammograms

In the MIAS and CBIS-DDSM databases, the dark and thickened borders on either side of the mammogram images are cropped manually. In these datasets, the mammograms are obtained with the medio-lateral oblique (MLO) viewpoint. Herein, a significant part of preprocessing lies in the removal of the pectoral muscle (PM). The PM is the region located on either the top right or top left side of the breast, opposite to the direction of the nipple. For successful PM removal, the left-view mammogram images are flipped in a uniform manner so that all inputs become right-MLO view images and the PM is located uniformly in the upper-left portion. A rudimentary idea for automatic mammogram flipping is to detect the image orientation; this is straightforward since the background pixel areas of the inputs are totally black and consequently reveal the breast orientation on either half of the mammogram.

Before proceeding further, the impulse noise present in the images is filtered with an adaptive median filter [23] without disturbing the non-affected pixels. In addition, the contrast of the mammograms is adaptively enhanced using the adaptive histogram equalization (AHE) [24] technique. After noise removal and contrast enhancement, a Sobel [25] filter with Canny edge detection [26] is employed with a threshold value of 1.8 for better detection of edges. The Hough transform [27] is then applied to obtain a list of output lines. Every detected line is characterized by three parameters: the distance (dist), i.e., the perpendicular distance of the line from the origin; the angle (degrees), i.e., the angle made by the perpendicular with the x-axis on the positive side (nearer to the origin); and the two points (point1 and point2) on the detected line. Candidate lines for PM segmentation are then shortlisted by checking whether the dist and angle of each line lie inside the intervals

MIN_ANGLE <= angle <= MAX_ANGLE and MIN_DIST <= dist <= MAX_DIST.

If more than one line is obtained by the above procedure, the line that provides the least loss of information is selected. Finally, the pixels covered by the shortlisted line are set to zero (black), and thus the PM is removed. A sample illustration of PM removal on the mdb021 mammogram of the MIAS dataset is presented in Fig. 2. Furthermore, the mammogram images of the INbreast database are FFDM, so every finding and its details are substantial for the further classification stage; hence, only the above adaptive median filtering [23] is adopted for impulse noise removal in the mammograms of the INbreast dataset.
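The following is a minimal sketch of this preprocessing chain using OpenCV. The interval bounds, the Hough threshold, the CLAHE settings, and the plain median blur (standing in for the adaptive median filter of [23]) are illustrative assumptions, as the exact values are not reported here.

```python
import cv2
import numpy as np

# Illustrative bounds only; the paper does not report its exact
# MIN_ANGLE/MAX_ANGLE and MIN_DIST/MAX_DIST interval values.
MIN_ANGLE, MAX_ANGLE = np.deg2rad(10), np.deg2rad(70)
MIN_DIST, MAX_DIST = 20, 300

def remove_pectoral_muscle(gray):
    """Rough pectoral-muscle suppression for a right-MLO mammogram."""
    # Impulse-noise removal (plain median blur stands in for the
    # adaptive median filter of [23]) and adaptive contrast enhancement.
    denoised = cv2.medianBlur(gray, 3)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(denoised)

    # Edge detection followed by the standard Hough line transform;
    # each detected line is returned as (rho, theta) = (dist, angle).
    edges = cv2.Canny(enhanced, 50, 150)
    lines = cv2.HoughLines(edges, rho=1, theta=np.pi / 180, threshold=120)

    out = enhanced.copy()
    if lines is not None:
        # Shortlist lines whose (dist, angle) fall inside the intervals.
        candidates = [(r, t) for r, t in lines[:, 0]
                      if MIN_DIST <= r <= MAX_DIST
                      and MIN_ANGLE <= t <= MAX_ANGLE]
        if candidates:
            rho, theta = candidates[0]  # e.g. keep the least-lossy line
            # Zero out the triangular region above-left of the line.
            h, w = out.shape
            ys, xs = np.mgrid[0:h, 0:w]
            out[xs * np.cos(theta) + ys * np.sin(theta) < rho] = 0
    return out
```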
3.2 Data Augmentation

Deep learning models work well if they are trained on a larger sample of input images [28]. However, the adopted mammogram databases comprise only a few hundred samples, owing to limited patient availability. Moreover, the overfitting problem of the employed classification task needs to be addressed. These issues are handled through image augmentation, which increases the number of mammograms using the existing samples; the newly generated mammograms are distinct variants of the original ones. The proposed work employs augmentation using rotations of the mammograms by 45, 90, 135, 180, 235, and 270 degrees and through horizontal and vertical flipping of the inputs. In this way, each input sample of every class is augmented eight times, as illustrated graphically in Fig. 3.
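A minimal sketch of this eightfold augmentation, assuming grayscale images stored as NumPy arrays (the resampling mode is an assumption):

```python
import numpy as np
from scipy import ndimage

def augment_mammogram(img):
    """Return the eight augmented variants described in Sect. 3.2:
    six rotations plus horizontal and vertical flips."""
    angles = [45, 90, 135, 180, 235, 270]  # degrees, as listed in the paper
    variants = [ndimage.rotate(img, angle, reshape=False, mode="nearest")
                for angle in angles]
    variants.append(np.fliplr(img))  # horizontal flip
    variants.append(np.flipud(img))  # vertical flip
    return variants
```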
3.3 Feature Extraction

3.3.1 Transfer Learning Approach

In recent years, DL has been the emerging approach for solving several real-time classification and recognition problems. Here, CNNs are vital in providing real-time solutions for biomedical allied fields [29]. CNNs are the key networks of deep learning and are prevalent in research across wide areas. Compared with conventional machine learning (ML) algorithms, CNNs are much more robust to noise and uneven transformations, which makes them popular for biomedical image analysis [30]. CNNs are composed of tens or hundreds of layers, each of which can learn to detect distinct features of an input image. Filters play a major role: they are applied to every training image at a distinct resolution, and the obtained output is passed to the further layers [31]. In this way, a CNN architecture is composed of convolution layers (learning low- and high-level features), pooling layers (reducing the size of the convolved feature maps through average or max-pooling), and a fully connected (fc) layer that connects each neuron of every layer to its successors for image analysis based on the multilayer perceptron [32].

Training a CNN from scratch always claims more time, higher computing power, and more data. In the biomedical field, imaging databases are generally on the order of 10^2 to 10^4 images, since sorting a larger annotated database is quite impractical; in addition, image quality may be substandard. The solution uses an interesting part of DL, the transfer learning (TL) approach, which utilises knowledge gained while solving one task and employs it on another, related task [34]. In place of learning from scratch, TL uses patterns already trained on the related task. The approach has two phases: the first involves selecting a network pretrained on a large volume of a standard database that is necessarily related to the task to be solved; the second is fine-tuning the selected model in accordance with the size and similarity of the considered problem (image inputs) [35]. Since the input datasets differ from those of the pretrained models, the work fine-tuned some layers and froze others in the employed deep CNN models, as given in Fig. 1.

The work involved training and testing several advanced pretrained DL architectures, namely the VGGNet, InceptionNet, ResNet, ResNet-V2, Inception-ResNet-V2, NasNet, XceptionNet, and DenseNet models, and noted that the combination of VGG16, VGG19, ResNet50, and DenseNet121 gives the superior performance for this breast cancer classification problem in the ablation analysis presented in Sect. 4.2. The principle of the VGG models is the use of small convolutional filter kernels, which allows the networks to possess a larger number of weight layers [30]; more layers result in enhanced performance. The concepts of VGG16 and VGG19 are the same, except that VGG16 has three fewer convolution layers than VGG19. To reduce errors, ResNet models use shortcut or skip connections that merely perform identity mapping [31]. ResNet50 is one variant, with 48 convolutional layers, one max-pooling layer, and one average pooling layer. The skip connections in ResNet50 bypass some layers and send the output as an input to the subsequent layers, thus providing an alternate path for the gradient during backpropagation. Rather than deriving representational power from very wide or deep models, DenseNet architectures exploit the potential of the network through feature reuse [33]. The layers in the DenseNet121 model spread their weights across several inputs and thus make use of deep layers to reuse features extracted earlier. The degradation problem [30] encountered in deep learning is alleviated by the skip connections in ResNet50 and the feature reusability in DenseNet121. The structure of the four transfer learning models is illustrated in Fig. 1.

The work employed the VGG series, ResNet50, and DenseNet121 models in the transfer learning approach, with weights pre-trained originally on the ImageNet database [33]. This database comprises a training set of about 1.2 million images, a validation set of about 50,000 images, and a testing set of about 100,000 images, all corresponding to 1000 class labels. As illustrated in step 3 of Fig. 1, the early layers of each DL architecture, which capture more generic features, are frozen, and the successive layers are retrained by fine-tuning on the digital mammogram inputs to acquire more database-specific features. In the end, the work fine-tuned its own FC classifier, as shown in step 3 of Fig. 1. For example, as illustrated in Fig. 4, in the VGG16 model the first few convolutional blocks utilize the parameters (W1, W2, ..., Wk) that are already pretrained on the ImageNet database.
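The freezing and fine-tuning step can be sketched in Keras as below; the number of frozen layers and the pooling/head layout are illustrative assumptions, with the 1024-neuron dense layer matching the feature layer used later for fusion.

```python
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG16

def build_tl_backbone(n_frozen_layers=11, num_classes=3):
    """Freeze the early (generic) VGG16 layers and attach a new FC head.
    The split point n_frozen_layers is an assumption for illustration."""
    base = VGG16(weights="imagenet", include_top=False,
                 input_shape=(224, 224, 3))
    for layer in base.layers[:n_frozen_layers]:
        layer.trainable = False      # keep ImageNet weights W1..Wk fixed
    x = layers.GlobalAveragePooling2D()(base.output)
    x = layers.Dense(1024, activation="relu", name="feature_dense")(x)
    out = layers.Dense(num_classes, activation="softmax")(x)
    return Model(base.input, out)
```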
The size of the preprocessed mammograms for all four TL models is 224 × 224 × 3, as shown in Fig. 4b, and the convolutional block layers followed by FC layers are fine-tuned in the proposed work. The learning rate is tuned to 10⁻³ for the first fifty epochs, and training is then continued for another fifty epochs with a learning rate of 10⁻⁵. The batch size for the training data is kept at 32, whereas for the testing data the batch size is 1, and the adaptive moment estimation (Adam) approach [36] is used for optimization. Figure 4a illustrates the entire transfer learning approach using the VGG TL model, where the first few layers are frozen, i.e., pre-trained on the ImageNet database. Figure 4b shows a sample feature-map visualization of the VGG16 model, where the output of the first convolutional layer (224 × 224 × 64) is visualized: the 64 feature maps are plotted as an 8 × 8 square of images. These feature maps illustrate how the mammogram's interior parts, edges, and other fine details are learned for the further classification. For better visualization of the feature maps, the 'hot' colormap of the matplotlib library is used, as given in Fig. 4b.

Fig. 4 a Visualization of the transfer learning approach where parameters are transferred from a pre-trained CNN and fine-tuned on digital mammogram databases [fully connected layer (FC), pooling (P), convolution (C)]; b visualization of the feature maps of the first convolutional layer (224 × 224 × 64) of VGG16 as an 8 × 8 image matrix
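A sketch of this two-phase training schedule and of the feature-map plot follows; model, the training arrays, and sample_batch are assumed to exist (e.g., from the build_tl_backbone sketch above).

```python
import matplotlib.pyplot as plt
from tensorflow.keras import Model
from tensorflow.keras.optimizers import Adam

# Phase 1: 50 epochs at learning rate 1e-3 (Sect. 3.3.1).
model.compile(optimizer=Adam(learning_rate=1e-3),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=50, batch_size=32,
          validation_data=(x_val, y_val))

# Phase 2: continue for 50 more epochs at learning rate 1e-5.
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=50, batch_size=32,
          validation_data=(x_val, y_val))

# Visualize the 64 maps of the first convolutional layer (Fig. 4b).
probe = Model(model.input, model.layers[1].output)  # first conv layer
maps = probe.predict(sample_batch)                  # (1, 224, 224, 64)
fig, axes = plt.subplots(8, 8, figsize=(12, 12))
for k, ax in enumerate(axes.flat):
    ax.imshow(maps[0, :, :, k], cmap="hot")  # 'hot' cmap as in Fig. 4b
    ax.axis("off")
plt.show()
```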
3.3.2 Late Fusion (LF) Approach

The late fusion technique is one of the ensemble methods of classification, where the final output is based on the maximum number of decisions made by the individual classifiers and their weights. This approach is generally used in ML problems to improve classification performance. In the proposed work, the final classification results obtained from the four distinct TL networks (VGG16, VGG19, ResNet50, and DenseNet121) are integrated by adopting a majority voting approach: each output class is scored according to the number of votes obtained for that particular class target. If m = 1, 2, 3, ..., X and n = 1, 2, 3, ..., Y, then the decision of the mth classifier for class n can be given as E(m, n) ∈ {0, 1}. Thus, the LF approach for majority voting is illustrated as

$$\sum_{m=1}^{X} E(m, n) = \max_{n=1}^{Y} \sum_{m=1}^{X} E(m, n), \quad (1)$$

where m and n index the classifiers used and the output classes, and X and Y represent the maximum available classifiers and output classes, respectively.
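A small NumPy sketch of this majority voting rule over the per-model class probabilities:

```python
import numpy as np

def late_fusion_majority_vote(prob_list):
    """Majority voting over the four TL models, following Eq. (1):
    the class n maximizing the vote count sum_m E(m, n) is selected.
    prob_list: list of arrays of shape (num_samples, num_classes)."""
    votes = np.stack([p.argmax(axis=1) for p in prob_list])  # (X, N)
    num_classes = prob_list[0].shape[1]
    counts = np.apply_along_axis(
        lambda v: np.bincount(v, minlength=num_classes), 0, votes)
    return counts.argmax(axis=0)  # winning class per sample
```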
3.3.3 Proposed Fusion of Hybrid Deep Features (FHDF) Network

In problems of image analysis and classification, the role of feature representation is significant in improving classification performance. As seen from the literature [16–19], feature fusion (FF) is a noteworthy and efficient approach in biomedical image classification. It integrates multiple related feature vectors into a single one that carries rich information and provides more representation compared with the initial feature inputs. In the literature, two techniques are followed for feature fusion, namely the serial and parallel approaches [18]. In the first approach, the idea is to concatenate two feature sets into a union vector: for feature dimensions x and y, if $F_1$ and $F_2$ are the two extracted feature sets, then the serially fused vector has dimension $F_S = (x + y)$. In the latter approach, the idea is to combine the feature sets into a complex vector: for the above example, the parallel feature fusion with imaginary unit i can be represented as $F_P = F_1 + iF_2$.

These two feature fusion approaches have the limitation of being unable to utilize the original feature inputs, because both methods aim at creating a new feature set, either $F_S$ or $F_P$, and they suffer from the naive concatenation of multiple feature vectors. In the proposed work, the fusion of hybrid deep features (FHDF) is employed by combining feature inputs extracted from multiple deep-TL models. Figure 5 illustrates the outline of the proposed FHDF network. In this figure, $F_{V16}$, $F_{V19}$, $F_{Res}$, and $F_{Den}$ represent the normalized features extracted from the dense layer (FCL) with 1024 neurons of the four employed TL models: VGG16, VGG19, ResNet50, and DenseNet121. The proposed network is composed of a concatenation layer and a fully connected layer with a softmax activation function for integrating the distinct features. Furthermore, batch normalization and dropout layers are utilized between these two layers to avoid overfitting and to optimize performance during training. Herein, the concatenation layer provides the fused feature vector with a size of 4096. This effective feature fusion can be represented as

$$F(i) = \bigcup_{n=1}^{4} F_n(i), \quad (2)$$

where $\bigcup$ indicates the concatenation operation, $F_n(i)$ represents the nth feature vector, and $F(i)$ denotes the output vector of the ith fused features.

Fig. 5 Proposed framework of fusion of hybrid deep features (FHDF) network
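A minimal Keras sketch of the FHDF head is given below, operating on the four pre-extracted 1024-dimensional feature vectors; the dropout rate is an assumed value not reported here.

```python
from tensorflow.keras import Input, Model, layers

def build_fhdf_head(num_models=4, feat_dim=1024, num_classes=3,
                    dropout_rate=0.5):
    """Sketch of the FHDF network in Fig. 5. Each input is a normalized
    1024-d feature vector from one fine-tuned TL model; concatenation
    yields the 4096-d fused vector of Eq. (2)."""
    inputs = [Input(shape=(feat_dim,), name=f"feat_{i}")
              for i in range(num_models)]
    fused = layers.Concatenate()(inputs)    # 4 x 1024 -> 4096
    x = layers.BatchNormalization()(fused)  # stabilize the fused features
    x = layers.Dropout(dropout_rate)(x)     # regularize against overfitting
    out = layers.Dense(num_classes, activation="softmax")(x)
    return Model(inputs, out)

# Usage sketch: extract feats_k = extractor_k.predict(x) per TL model,
# then fhdf = build_fhdf_head(); fhdf.fit([f16, f19, fres, fden], y).
```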
4 Experiments and Analysis

4.1 Preparation of Input Data for Evaluation

The research evaluation considers three different mammogram datasets, namely the mammographic image analysis society (MIAS) [20], the curated breast imaging subset of the digital database for screening mammography (CBIS-DDSM) [21], and the INbreast [22] databases. The MIAS database was constituted by a UK research crew. The digital mammograms in this dataset are publicly accessible in the PEIPA archive of Essex University [20] and are downloaded in .pgm format. During acquisition, the digitization of films was done with a fifty-micrometer pixel edge, creating mammogram outputs of 1024 × 1024. The image corpus consists of a total of 322 digital mammogram images corresponding to both breast sides. The dataset is
composed of mediolateral oblique (MLO) view acquisitions. These mammograms are separated in the dataset as either normal or abnormal samples; well-defined, spiculated, and ill-defined masses, architectural distortion, calcification, and asymmetry are characterized as abnormal lesions, and the abnormal samples are further characterized by benign and malignant severities.

The second dataset taken for evaluation is the DDSM database, constituted by the University of South Florida [21]. The database was acquired from approximately 2500 cases in forty-three volumes. The dataset is constituted using two basic viewing angles, craniocaudal (CC) and MLO, for every patient; the work adopts the MLO view of the acquired images, as for the MIAS dataset. Moreover, the research employs the mammogram images from the updated DDSM, i.e., the CBIS-DDSM database. The last one is the INbreast dataset [22], where the acquisition device used was a MammoNovation Siemens system employing amorphous selenium-based solid-state detectors supporting 14-bit resolution with 70-μm pixel sizes. Here, the breast images are available in DICOM format and were obtained at an imaging center in association with the National Committee of Data Protection from 2008 to 2010.

In the above-said datasets, standard and good-quality digital mammograms are available. However, the INbreast dataset contains breast images in the form of full-field digital mammograms (FFDM), which provide better recognition of microcalcification than digitized mammograms [22]. The MIAS and CBIS-DDSM datasets are commonly used benchmark databases useful for evaluating many research methods. In this work, we chose the INbreast database because it contains high-quality FFDM images; furthermore, it is the only publicly available dataset comprising FFDM images that give precise and accurate information about every detail. Using these three datasets, the paper aims to classify the mammogram inputs as normal, benign, or malignant tumors. The number of mammogram inputs taken for evaluating the proposed CAD system is given in Table 1.

Table 1 Digital mammograms for evaluating the proposed work

Database     Output class   Mammogram inputs
MIAS         Normal         207
             Benign         64
             Malignant      51
CBIS-DDSM    Normal         250
             Benign         200
             Malignant      120
INbreast     Normal         66
             Benign         56
             Malignant      57
After preprocessing and augmentation, the MIAS, CBIS-DDSM, and INbreast databases comprise a total of 2576, 4560, and 1432 digital mammograms, respectively. The proposed work prepares the data in a stratified fashion, where the training and testing sets take 70% and 30% of the inputs from each dataset, and the testing set is further subdivided for validation of the work. In addition, the work employs a fivefold cross-validation strategy that uses stratified partitioning for its splits. This ensures that every mammogram input is tested in an equal manner, thus avoiding any bias error.
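With scikit-learn, this stratified preparation can be sketched as follows (X and y denote the augmented images and their labels; the random seed is an assumption):

```python
from sklearn.model_selection import StratifiedKFold, train_test_split

# 70/30 stratified split, as in Sect. 4.1.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)

# Stratified five-fold cross-validation so that every mammogram is
# tested once and class proportions are preserved in each fold.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (tr_idx, va_idx) in enumerate(skf.split(X, y)):
    print(f"fold {fold}: {len(tr_idx)} train / {len(va_idx)} validation")
```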
4.2 Experimental Setup and Ablation Analysis

The proposed work is carried out on a computer system having 16 GB RAM, a 1 TB hard disk, and an Intel Core i7 processor running the Windows 10 operating system; the system was also equipped with a 2 GB NVIDIA GPU. The work utilized a Jupyter-based Python IDE for implementation and evaluation, configured with machine learning libraries such as Pandas, OpenCV, scikit-learn, Matplotlib, Keras, TensorFlow, and PyTorch. For the evaluation of the work, the research adopted the standard overall accuracy and the total misclassification cost as performance metrics; further, the results are validated using Cohen's kappa (κ) measurement [37]. The above metrics are calculated from the elements of the confusion matrix: TP and FP (true and false positives), and TN and FN (true and false negatives).

With the above experimental setup, an ablation study is carried out to demonstrate the effectiveness of selecting the best combination of deep features. This is done by considering the fusion of different features, as illustrated in Step 4 of Fig. 1. The ablation results on the MIAS dataset are summarized in Table 2. The results reveal that every deep feature we consider plays a key role in the classification performance, especially the fusion of all four features; they also reveal that even with only one combination, the proposed approach can be very competitive compared with others. In this way, the work utilizes the fusion of the appropriate deep features for the remaining two datasets, which has raised the classification performance to its best.

Table 2 Ablation experimentation on fusion of different features (MIAS dataset)

VGG16                 ✓       ✓       ✓       ✓
VGG19                 ✓       ✓       ✓       ✓
ResNet50                      ✓               ✓
DenseNet121                           ✓       ✓
Overall accuracy (%)  92.755  96.507  95.213  98.706
Kappa (κ)             0.861   0.933   0.909   0.975

4.3 Results and Analysis

4.3.1 Overall Performance Analysis

The overall performance of the classifiers for the three datasets, along with the existing ones, is listed in Table 3. This performance is calculated for the three-class breast cancer problem with normal, benign, and malignant targets; the analysis is also graphically illustrated in Fig. 6. Here, the total misclassification represents how often the classification model is incorrect in predicting the actual negative and positive output targets, i.e., the classification error. This metric is calculated as a concatenated result of the normal vs benign and malignant, benign vs normal and malignant, and malignant vs normal and benign cases. The overall classification accuracy is calculated as a percentage (%), which gives the number of correct outcomes in predicting the actual negative and positive targets. As noted in the literature [7–14], overall classification accuracy can be very misleading, since the metric does not consider the class imbalance of the input datasets. To overcome this, a robust statistical metric, Cohen's kappa (κ), ranging over (0 → 1), is considered in this work; it assesses the degree of agreement among the employed classification models by calculating inter-rater reliability.

In Fig. 6, the overall accuracy (%) is plotted on the primary axis and the total misclassification on the secondary axis, and the obtained range (0 → 1) of the kappa statistic is rescaled to (0 → 100) for better visual comparison of the results. As seen from the figure, VGG16 performs well compared with the VGG19 model for all three datasets: VGG16 provides better accuracies of 92.367% (MIAS), 89.839% (CBIS-DDSM), and 92.308% (INbreast) than VGG19. The skip connections used in ResNet50 enable it to provide better classification accuracies of 94.049% (MIAS), 93.202% (CBIS-DDSM), and 94.172% (INbreast) compared with the above two models. Due to improved feature propagation and a reduced vanishing-gradient effect, the DenseNet121 model provides better classification accuracies of 94.825% (MIAS), 94.363% (CBIS-DDSM), and 95.338% (INbreast) compared with the above three models. In addition, the ensemble-based LF approach provides higher classification accuracies of 96.378% (MIAS), 96.199% (CBIS-DDSM), and 97.203% (INbreast) over the above-discussed models. Consequently, the proposed FHDF technique yields supreme classification accuracies of 98.706% (MIAS), 97.734% (CBIS-DDSM), and 98.834% (INbreast) over the others. These results are validated further using the kappa coefficient, where the highest agreement is obtained for the proposed FHDF method, i.e., 0.975 (MIAS), 0.965 (CBIS-DDSM), and 0.982 (INbreast). In addition, the graph in Fig. 6 shows that whenever the accuracy values are higher, the misclassification rate becomes lower.

Table 3 Performance analysis of the proposed work (columns: Database; Classification models; Total misclassification; Overall classification accuracy (%); Kappa (κ))
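These overall metrics can be computed from the confusion matrix with scikit-learn, as in the sketch below:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             confusion_matrix)

def report_overall_metrics(y_true, y_pred):
    """Overall accuracy, misclassification count, and Cohen's kappa,
    computed from the three-class confusion matrix."""
    cm = confusion_matrix(y_true, y_pred)       # rows: actual classes
    acc = accuracy_score(y_true, y_pred) * 100  # percentage accuracy
    misclassified = cm.sum() - np.trace(cm)     # off-diagonal entries
    kappa = cohen_kappa_score(y_true, y_pred)   # chance-corrected score
    return acc, misclassified, kappa
```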
Accordingly, the proposed method has the least misclassification rate for all three datasets.

4.3.2 Insight Performance Analysis

The above discussion based on Table 3 and Fig. 6 focussed on the overall performance analysis. However, the research targets both detection and classification of severities: detecting the disease as either normal or abnormal, and further classifying the abnormal severities as either benign or malignant. This formulates a three-class classification problem in which the mammogram inputs need to be classified into three output targets, namely normal, benign, and malignant. Hence, an individual, or insight, performance analysis of all classification models needs to be done for each output target. Furthermore, such insightful analysis of classifier performance is significant because of the unavoidable class imbalance in the employed input datasets.

The accuracy metric highlights how well the model correctly discriminates normal, benign, and malignant cases with respect to the total inputs [13]. The precision metric indicates what fraction of the predicted positive cases is actually positive [13]. The recall metric calculates how well the model predicts the positive cases with respect to the total actual positives [13]. The F1 score is calculated as the harmonic mean of recall and precision [14]. Accuracy is reliable only if the input dataset is symmetric, i.e., the numbers of false negatives and false positives are almost the same [14]; but the research employed three different asymmetric datasets. When the numbers of false negatives and false positives are not the same, precision and recall measures should be used. By definition, the two cannot both be pushed arbitrarily high: for a given model, if recall is increased, then precision will be lower, and vice versa. The F1 score gives the harmonic mean of the two measures, which is more suitable for ratios such as precision and recall; the F1 score is high only if both recall and precision are high. Thus, the research work utilizes all of the above metrics for assessing the employed models.

Table 4 illustrates the confusion matrices obtained on the test data of the MIAS, CBIS-DDSM, and INbreast datasets using the proposed FHDF technique. The individual performance analysis of the classification models for the three-class classification is tabulated in Tables 5, 6, and 7, respectively. The third column (no. of classified outputs) reports the overall classified samples for each output class, as shown in Table 4. Figure 7 illustrates the performance of the LF and proposed FHDF approaches for each class of the MIAS, CBIS-DDSM, and INbreast databases.

From Tables 5, 6, and 7, for the employed mammogram databases, the VGG16, VGG19, ResNet50, DenseNet121, and LF models give their maximum classification performance when classifying the normal cases. Hence, the substantial difficulty lies in discriminating the abnormal severities (benign/malignant), which is why these models provide poorer overall performance, as portrayed before. Accordingly, for the MIAS dataset, the VGG16 model yields its highest classification performance of accuracy (95.08%), precision (96.36%), recall (95.97%), and F1 score (96.12%) in discriminating the normal cases. The VGG19 model yields its highest performance of accuracy (93.01%), precision (95.48%), recall (93.56%), and F1 score (95.44%) for the normal cases. The ResNet50 model yields its highest performance of accuracy (96.25%), precision (97.36%), recall (96.78%), and F1 score (97.49%) for the normal cases, and the DenseNet121 model yields its highest performance of accuracy (96.77%), precision (97.77%), recall (97.18%), and F1 score (97.57%) for the normal cases. In the case of the LF and FHDF models, however, the obtained classification results are good irrespective of the database type; in particular, the proposed FHDF approach provides superior classification accuracy in the range of 98.17–99.3%.
Table 5 Individual performance analysis of the classification models for MIAS dataset

Target label  Mammogram inputs  No. of classified outputs  Acc (%)  Pre (%)  Recall (%)  F1 score (%)

VGG16 model
Normal      497  495  95.08  96.36  95.97  96.12
Benign      154  159  94.70  85.53  88.31  87.43
Malignant   122  119  94.95  84.87  82.78  84.59
VGG19 model
Normal      497  487  93.01  95.48  93.56  95.44
Benign      154  161  92.88  80.74  84.41  83.16
Malignant   122  125  93.4   78.40  80.32  79.28
ResNet50 model
Normal      497  494  96.25  97.36  96.78  97.49
Benign      154  158  95.60  87.97  90.26  88.96
Malignant   122  121  96.25  88.43  87.71  88.17
DenseNet121 model
Normal      497  494  96.77  97.77  97.18  97.57
Benign      154  155  96.25  90.32  90.90  91.42
Malignant   122  124  96.64  88.71  90.16  89.76
LF technique
Normal      497  498  97.54  97.99  98.18  98.31
Benign      154  155  97.54  93.54  94.15  94.29
Malignant   122  120  97.67  93.33  91.80  93.68
FHDF technique
Normal      497  497  99.22  99.39  99.39  99.56
Benign      154  155  99.09  97.41  98.05  98.49
Malignant   122  121  99.09  97.52  96.72  97.34
Furthermore, compared with the four transfer learning models, Fig. 7 reveals that the proposed approach performs better on the FFDM images taken from the INbreast data. In addition, the proposed approach provides superior results in discriminating both normal and abnormal severity cases for all data inputs. This makes the proposed FHDF classification approach yield the supreme overall classification performance illustrated in Fig. 6 and Table 3. Hence, from Tables 5, 6, and 7 and Figs. 6 and 7, the proposed methodology outperforms in discriminating whether a mammogram is normal or abnormal and, if there is any abnormality, in further discriminating the severity as either benign or malignant. The above results are attained not only through the use of the FHDF model but also because of the suitable preprocessing approach (Fig. 2) applied with the appropriate fusion of deep features. In addition to the above performance and comparative analysis, an ANOVA test is performed on the employed classification models for further statistical validation. Table 8 lists the analysis of variance (ANOVA) results and their statistical examination for the employed problem. As listed, the high F value (42.06386) and the very small P value (3.38E−07) illustrate the significance of the proposed methodology for multiclass breast cancer classification.

4.4 Performance Comparison of Proposed CAD Model with the Existing Research Models

Compared with other biomedical research problems, researchers working on breast cancer classification are actively endeavoring to give new solutions for early breast cancer diagnosis. However, comparison among the research works is implicitly difficult due to several factors, such as the distinct mammogram databases employed, the amount of input data, the input samples chosen for assessment, the approach to extracting and selecting feature vectors, parameter tuning, the classification strategy, and the way performance is evaluated. The performance comparison of the proposed approach with several findings is listed and summarized in Table 9.
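Such a one-way ANOVA can be reproduced with SciPy as sketched below; the per-model score arrays are hypothetical placeholders (e.g., per-fold accuracies of each classifier), since the paper reports only the resulting F and P values.

```python
from scipy.stats import f_oneway

# One-way ANOVA across the per-model score groups (hypothetical arrays;
# the paper reports F = 42.06386 and P = 3.38E-07 for its own data).
f_stat, p_value = f_oneway(acc_vgg16, acc_vgg19, acc_resnet50,
                           acc_densenet121, acc_lf, acc_fhdf)
print(f"F = {f_stat:.5f}, p = {p_value:.2e}")
```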
Table 6 Individual performance analysis of the classification models for CBIS-DDSM dataset

Target label  Mammogram inputs  No. of classified outputs  Acc (%)  Pre (%)  Recall (%)  F1 score (%)

VGG16 model
Normal      600  578  93.13  93.77  90.33  91.96
Benign      480  484  93.27  90.08  90.83  90.44
Malignant   288  306  93.27  82.02  87.15  85.28
VGG19 model
Normal      600  571  91.45  92.29  87.83  90.39
Benign      480  477  91.74  88.47  87.91  88.27
Malignant   288  320  92.11  78.12  86.80  82.43
ResNet50 model
Normal      600  593  95.25  95.11  94.00  95.24
Benign      480  481  95.39  93.34  93.54  93.44
Malignant   288  294  95.76  89.11  90.97  90.57
DenseNet121 model
Normal      600  594  96.19  95.96  95.31  95.87
Benign      480  483  95.97  96.99  94.58  94.51
Malignant   288  289  96.56  91.69  92.01  92.39
LF technique
Normal      600  596  97.37  97.31  96.67  97.49
Benign      480  478  97.08  96.02  95.62  96.34
Malignant   288  294  97.95  94.21  96.18  95.41
FHDF technique
Normal      600  596  98.39  98.49  97.83  98.19
Benign      480  483  98.17  97.10  97.70  97.46
Malignant   288  289  98.90  97.23  97.56  97.33
5 Discussion on the Findings

In recent years, the evolution of DL algorithms has helped greatly in solving real-time problems in the biomedical field. Breast cancer classification using digital mammograms can support physicians in identifying tumors at earlier stages, which is crucial to preventing cancer deaths.

The proposed three-class classification work is evaluated using three different mammogram datasets, MIAS, CBIS-DDSM, and INbreast, all publicly available for research purposes. In the preprocessing stage, the unwanted noise is removed using a simple adaptive median filter. In the literature [38–44], a few works did not employ any filtering technique for noise removal, whereas some employed filters such as simple median filters; the point to note is that the noise has to be removed without disturbing the unaffected pixels, which is why this work utilizes an adaptive median filtering approach. In the next step, the mammograms are enhanced using an adaptive histogram technique to improve the contrast of the microcalcification and pectoral muscle regions without overexposure. As a result, the Hough transform with Canny edge detection provides clean pectoral-removed mammograms with better enhancement of microcalcifications, as shown in Fig. 2.

Then, the challenge is to detect whether the input is normal or abnormal and, if found to be abnormal, to classify it as benign or malignant. For this, the research proposes the FHDF approach using transfer learning to detect and classify breast cancer. Here, the important issue is the selection of the deep CNNs used for feature extraction. The research performed extensive ablation experiments and found that the fusion of VGG16, VGG19, ResNet50, and DenseNet121 gives a very competitive classification performance, as illustrated in Tables 2 and 3. When assessing the overall performance, the results need to be validated through a consistent validation metric; the works of [38–44] note that research findings should be properly validated. The attained results were therefore validated using Cohen's kappa (κ). The value for the proposed approach is very close to 1, indicating that the proposed approach provides supreme classification performance for breast cancer problems. Since this is a multiclass classification, the insight performance analysis is presented in Tables 5, 6, and 7. The findings of these tables illustrate that the utilized classification architectures are good at discriminating the normal and abnormal mammograms but lag in further classifying the benign and malignant samples.
Table 7 Individual performance analysis of the classification models for INbreast dataset

Target label  Mammogram inputs  No. of classified outputs  Acc (%)  Pre (%)  Recall (%)  F1 score (%)

VGG16 model
Normal      158  156  94.87  93.59  92.40  93.38
Benign      134  134  95.34  92.53  92.53  93.29
Malignant   137  139  94.41  90.64  91.97  91.63
VGG19 model
Normal      158  158  92.54  89.87  89.87  90.22
Benign      134  134  93.47  89.55  89.55  90.47
Malignant   137  137  94.41  91.24  91.24  91.50
ResNet50 model
Normal      158  158  95.80  94.30  94.30  94.19
Benign      134  134  96.74  94.77  94.77  95.36
Malignant   137  137  95.80  93.43  93.43  93.58
DenseNet121 model
Normal      158  157  96.97  96.17  95.57  96.46
Benign      134  136  97.20  94.85  96.26  96.39
Malignant   137  136  96.50  94.85  94.16  95.27
LF technique
Normal      158  157  98.37  98.08  97.46  98.39
Benign      134  135  98.37  97.03  97.76  97.18
Malignant   137  137  97.67  96.35  96.35  96.44
FHDF technique
Normal      158  157  99.30  99.36  98.73  98.96
Benign      134  135  99.30  98.51  99.25  99.47
Malignant   137  137  99.07  98.54  98.54  98.83
Table 9 Performance comparison of the proposed CAD model with the existing research models for breast cancer classification

Reference works                       Techniques used                                                        Target problem             Accuracy (%)
Prathibha and Mohan (2018) [38]       Multi-resolution transform with a CNN model                            Multiclass classification  85.4 (DDSM)
Safdarian and Hedyezadeh (2019) [39]  Support vector machine (SVM) with boundary descriptor feature inputs   Multiclass classification  97 (DDSM)
Akila et al. (2019) [40]              Multiscale all-CNN model                                               Multiclass classification  96 (MIAS)
Figlu et al. (2020) [41]              Optimized kernel ELM architecture                                      Multiclass classification  97.4 (MIAS), 92.6 (DDSM)
Abeer et al. (2021) [42]              Transfer learning using VGG16 model                                    Multiclass classification  96.8 (MIAS)
Karthiga et al. (2022) [43]           Deep CNN with pretrained models                                        Binary classification      95.9 (MIAS), 96.5 (INbreast)
Khaoula et al. (2022) [44]            Apriori dynamic selection with SVM                                     Multiclass classification  96.4 (DDSM), 75.8 (INbreast)
Proposed work                         Fusion of hybrid deep features (FHDF) approach                         Multiclass classification  98.7 (MIAS), 97.7 (CBIS-DDSM), 98.8 (INbreast)
However, the proposed FHDF approach provides superior classification performance compared with the existing works. Finally, the paper compared the classification performance of the proposed method with standard pretrained models, the late fusion technique, and other existing approaches; the comparison revealed that the proposed FHDF approach outperforms them, thus establishing the novelty of the framework. The potential limitations of the proposed work involve the computational complexity incurred during the fusion of deep features obtained from distinct models. In addition, as seen from Tables 5, 6, and 7, the proposed approach struggled modestly to recognize the malignant mammograms compared with the other cases. These limitations will be addressed in our future proposals.

6 Conclusion and Future Work

The proposed study discusses the design of a robust CAD model for enhancing the multiclass classification of breast cancer data. For this, the work employed the recently emerging deep learning strategy: four distinct pre-trained convolutional neural networks. After freezing and fine-tuning the pretrained models, each model's deep features are extracted. Before this task, the mammogram images are appropriately pre-processed to remove noise, the pectoral muscle, and unwanted regions. In addition, the pre-processed mammograms are amply augmented and partitioned in a stratified manner to overcome the problems of overfitting and bias error. The work is evaluated using the digital mammograms of the MIAS, CBIS-DDSM, and INbreast databases with the VGG16, VGG19, ResNet50, DenseNet121, late fusion, and fusion of hybrid deep features models. For evaluation, both overall and insight performance analyses are done for a better analysis of the classification models. Accordingly, the proposed FHDF approach provides supreme results of 98.70% (MIAS), 97.73% (CBIS-DDSM), and 98.83% (INbreast) classification accuracy compared with the standalone and existing classification models. Moreover, these results are properly validated through kappa analysis: 0.975 (MIAS), 0.965 (CBIS-DDSM), and 0.982 (INbreast). The proposed approach involves an effective way of fusing deep features extracted across different mammogram datasets; future directions will involve extending the FHDF approach to clinical mammograms with different preprocessing methods and applying it to the same breast cancer problem with multimodal datasets.

Acknowledgements This work was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R432), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. This work was supported by the Deanship of Scientific Research, Vice President for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [KFU241404].

Author Contributions S.C and B.N took care of the review of literature and methodology. S.B.K and M.T.R have done the formal analysis,
data collection and investigation. A.A has done the initial drafting and statistical analysis. V.K.V and E.A have supervised the overall project. All the authors of the article have read and approved the final article.

Funding Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R432), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Data Availability Statement Data used for the findings will be shared by the corresponding author upon request.

Declarations

Conflict of interest The authors declare that they have no conflict of interest.

Ethics approval and consent to participate Not applicable.

Consent for publication Not applicable as the work is carried out on publicly available datasets.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References

1. Wilkinson, L., Gathani, T.: Understanding breast cancer as a global health concern. Br. J. Radiol. 95(1130), 20211033 (2022)
2. Zemni, I., Kacem, M., Dhouib, W., Bennasrallah, C., Hadhri, R., Abroug, H., Ben Fredj, M., Mokni, M., Bouanene, I., Belguith, A.S.: Breast cancer incidence and predictions (Monastir, Tunisia: 2002–2030): a registry-based study. PLoS ONE 17(5), e0268035 (2022)
3. Yari, Y., Nguyen, T.V., Nguyen, H.T.: Deep learning applied for histological diagnosis of breast cancer. IEEE Access 8, 162432–162448 (2020)
4. Abirami, C., Harikumar, R., Chakravarthy, S.S.: Performance analysis and detection of micro calcification in digital mammograms using wavelet features. In: 2016 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), pp. 2327–2331. IEEE (2016)
5. Yu, Z., Song, M., Chouchane, L., Ma, X.: Functional genomic analysis of breast cancer metastasis: implications for diagnosis and therapy. Cancers 13(13), 3276 (2021)
6. SR, S.C., Rajaguru, H.: A systematic review on screening, examining and classification of breast cancer. In: 2021 Smart Technologies, Communication and Robotics (STCR), pp. 1–4 (2021)
7. Dhungel, N., Carneiro, G., Bradley, A.P.: A deep learning approach for the analysis of masses in mammograms with minimal user intervention. Med. Image Anal. 37, 114–128 (2017). https://doi.org/10.1016/j.media.2017.01.009
8. Thijs, K., et al.: Large scale deep learning for computer aided detection of mammographic lesions. Med. Image Anal. 35, 303–312 (2017)
9. Xiaofei, Z., et al.: Classification of whole mammogram and tomosynthesis images using deep convolutional neural networks. IEEE Trans. Nanobiosci. 17(3), 237–242 (2018)
10. Yemini, M., Zigel, Y., Lederman, D.: Detecting masses in mammograms using convolutional neural networks and transfer learning. In: 2018 IEEE International Conference on the Science of Electrical Engineering in Israel (ICSEE), pp. 1–4. IEEE (2018)
11. Chougrad, H., Zouaki, H., Alheyane, O.: Multi-label transfer learning for the early diagnosis of breast cancer. Neurocomputing 392, 168–180 (2020)
12. Shu, X., Zhang, L., Wang, Z., Lv, Q., Yi, Z.: Deep neural networks with region-based pooling structures for mammographic image classification. IEEE Trans. Med. Imaging 39, 2246–2255 (2020)
13. Sannasi Chakravarthy, S.R., Rajaguru, H.: Deep-features with Bayesian optimized classifiers for the breast cancer diagnosis. Int. J. Imaging Syst. Technol. 31(4), 1861–1881 (2021)
14. Chakravarthy, S.S., Rajaguru, H.: Automatic detection and classification of mammograms using improved extreme learning machine with deep learning. IRBM 43(1), 49–61 (2022)
15. Sannasi Chakravarthy, S.R., Bharanidharan, N., Rajaguru, H.: Multi-deep CNN based experimentations for early diagnosis of breast cancer. IETE J. Res. 69(10), 7326–7341 (2022)
16. Benzebouchi, N.E., Azizi, N., Ashour, A.S., Dey, N., Sherratt, R.S.: Multi-modal classifier fusion with feature cooperation for glaucoma diagnosis. J. Exp. Theor. Artif. Intell. 31(6), 841–874 (2019). https://doi.org/10.1080/0952813X.2019.1653383
17. Xue, W., Dai, X., Liu, L.: Remote sensing scene classification based on multi-structure deep features fusion. IEEE Access 8(1), 28746–28755 (2020). https://doi.org/10.1109/ACCESS.2020.2968771
18. Kumar, A., Kim, J., Lyndon, D., Fulham, M., Feng, D.: An ensemble of fine-tuned convolutional neural networks for medical image classification. IEEE J. Biomed. Health Inf. 21(1), 31–40 (2016). https://doi.org/10.1109/JBHI.2016.2635663
19. Amin, J., Sharif, A., Gul, N., Anjum, M.A., Nisar, M.W., Azam, F., Bukhari, S.A.C.: Integrated design of deep features fusion for localization and classification of skin cancer. Pattern Recogn. Lett. 131, 63–70 (2020). https://doi.org/10.1016/j.patrec.2019.11.042
20. Suckling, J., Parker, J., Dance, D., Astley, S., Hutt, I.: Mammographic image analysis society (MIAS) database v1.21 (2015). https://www.repository.cam.ac.uk/handle/1810/250394. Accessed 28 Mar 2021
21. Heath, M., Bowyer, K., Kopans, D., Moore, R., Kegelmeyer, P.: The digital database for screening mammography. In: Yaffe, M.J. (ed.) Proceedings of the Fifth International Workshop on Digital Mammography, pp. 212–218. Medical Physics Publishing (2001)
22. Moreira, I.C., Amaral, I., Domingues, I., Cardoso, A., João Cardoso, M., Cardoso, J.S.: INbreast: toward a full-field digital mammographic database. Acad. Radiol. 19, 236–248 (2012). https://doi.org/10.1016/j.acra.2011.09.014
23. Sannasi Chakravarthy, S.R., Rajaguru, H.: Detection and classification of microcalcification from digital mammograms with firefly algorithm, extreme learning machine and non-linear regression models: a comparison. Int. J. Imaging Syst. Technol. 30(1), 126–146 (2020). https://doi.org/10.1002/ima.22364
24. Rao, B.S.: Dynamic histogram equalization for contrast enhancement for digital images. Appl. Soft Comput. 89, 106114 (2020)
25. Yaman, S., Karakaya, B., Erol, Y.: Real time edge detection via IP-core based Sobel filter on FPGA. In: 2019 International Conference on Applied Automation and Industrial Diagnostics (ICAAID), vol. 1, pp. 1–4. IEEE (2019)
26. Gong, L.H., Tian, C., Zou, W.P., Zhou, N.R.: Robust and imperceptible watermarking scheme based on Canny edge detection and SVD in the contourlet domain. Multimed. Tools Appl. 80(1), 439–461 (2021)
27. Iqbal, B., Iqbal, W., Khan, N., Mahmood, A., Erradi, A.: Canny edge detection and Hough transform for high resolution video streams using Hadoop and Spark. Clust. Comput. 23(1), 397–408 (2020)
28. Zubair Rahman, A.M.J., Gupta, M., Aarathi, S., et al.: Advanced AI-driven approach for enhanced brain tumor detection from MRI images utilizing EfficientNetB2 with equalization and homomorphic filtering. BMC Med. Inform. Decis. Mak. 24, 113 (2024). https://doi.org/10.1186/s12911-024-02519-x
29. Satheesh Kumar, J., Vinoth Kumar, V., Mahesh, T.R., et al.: Detection of Marchiafava Bignami disease using distinct deep learning techniques in medical diagnostics. BMC Med. Imaging 24, 100 (2024). https://doi.org/10.1186/s12880-024-01283-8
30. Ahmed, S.T., et al.: PrEGAN: privacy enhanced clinical EMR generation: leveraging GAN model for customer de-identification. IEEE Trans. Consum. Electron. (2024). https://doi.org/10.1109/TCE.2024.3386222
31. Fourcade, A., Khonsari, R.H.: Deep learning in medical image analysis: a third eye for doctors. J. Stomatol. Oral Maxillofac. Surg. 120(4), 279–288 (2019)
32. Chlap, P., Min, H., Vandenberg, N., Dowling, J., Holloway, L., Haworth, A.: A review of medical image data augmentation techniques for deep learning applications. J. Med. Imaging Radiat. Oncol. 65(5), 545–563 (2021)
33. Morid, M.A., Borjali, A., Del Fiol, G.: A scoping review of transfer learning research on medical image analysis using ImageNet. Comput. Biol. Med. 128, 104115 (2021)
34. Wan, Z., Yang, R., Huang, M., Zeng, N., Liu, X.: A review on transfer learning in EEG signal analysis. Neurocomputing 421, 1–14 (2021)
35. Li, W., Huang, R., Li, J., Liao, Y., Chen, Z., He, G., Yan, R., Gryllias, K.: A perspective survey on deep transfer learning for fault diagnosis in industrial scenarios: theories, applications and challenges. Mech. Syst. Signal Process. 167, 108487 (2022)
36. Mehta, S., Paunwala, C., Vaidya, B.: CNN based traffic sign classification using Adam optimizer. In: 2019 International Conference on Intelligent Computing and Control Systems (ICCS), pp. 1293–1298. IEEE (2019)
37. Bing, P., Liu, Y., Liu, W., Zhou, J., Zhu, L.: Electrocardiogram classification using TSST-based spectrogram and ConViT. Front. Cardiovasc. Med. (2022). https://doi.org/10.3389/fcvm.2022.983543
38. Xue, X., Zhao, S., Xu, M., Li, Y., Liu, W., Qin, H.: Circular RNA_0000326 accelerates breast cancer development via modulation of the miR-9-3p/YAP1 axis. Neoplasma 70(3), 430–442 (2023). https://doi.org/10.4149/neo_2023_220904N894
39. Jiang, Z., Yang, L., Jin, L., Yi, L., Bing, P., Zhou, J., Yang, J.: Identification of novel cuproptosis-related lncRNA signatures to predict the prognosis and immune microenvironment of breast cancer patients. Front. Oncol. (2022). https://doi.org/10.3389/fonc.2022.988680
40. Yang, C., Sheng, D., Yang, B., Zheng, W., Liu, C.: A dual-domain diffusion model for sparse-view CT reconstruction. IEEE Signal Process. Lett. (2024). https://doi.org/10.1109/LSP.2024.3392690
41. Zheng, W., Lu, S., Yang, Y., Yin, Z., Yin, L., Ali, H.: Lightweight transformer image feature extraction network. PeerJ Comput. Sci. 10, e1755 (2024). https://doi.org/10.7717/peerj-cs.1755
42. Saber, A., Sakr, M., Abou-Seida, O., Keshk, A.: A novel transfer-learning model for automatic detection and classification of breast cancer based deep CNN. Kafrelsheikh J. Inf. Sci. 2(1), 1–9 (2021)
43. Karthiga, R., Narasimhan, K., Amirtharajan, R.: Diagnosis of breast cancer for modern mammography using artificial intelligence. Math. Comput. Simul. 202, 316–330 (2022)
44. Soulami, K.B., Kaabouch, N., Saidi, M.N.: Breast cancer: three-class masses classification in mammograms using apriori dynamic selection. Concurr. Comput. Pract. Exp. 34(24), e7233 (2022)

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
36. Mehta, S., Paunwala, C., Vaidya, B.: CNN based traffic sign classi-
fication using Adam optimizer. In: 2019 International Conference