Project Report on Breast Cancer Segmentation and Development of Web App
Project Report on Breast Cancer Segmentation and Development of Web App
ABSTRACT
Breast cancer, characterized by the uncontrolled growth of breast cells, remains a major health threat
globally, particularly to women. Early detection and accurate classification are critical for improving
patient outcomes. In this study, we propose a novel hybrid deep learning model for breast cancer image
segmentation, utilizing transformer-based architectures combined with ResNet for precise tumor detection
in breast ultrasound images. The model captures long-range dependencies through the transformer
mechanism, while ResNet’s residual connections allow the network to focus on fine-grained details,
making it well-suited for segmenting subtle features such as tumors.Our approach involves training the
model using a dataset of breast ultrasound images, where we apply various preprocessing techniques
including image resizing and normalization to enhance the model’s generalization capabilities. The model
achieved an impressive training accuracy of 97.80% and a training loss of 0.0236 after 60 epochs,
although the validation accuracy was 69.82% with a validation loss of 1.309, indicating room for
improvement in generalization.Furthermore, we incorporate U-Net based segmentation techniques, which
isolate regions of interest, helping to improve the accuracy of detecting and classifying abnormal growths.
The model was optimized using transfer learning and parallel processing techniques, which significantly
reduced training time while improving performance. The results demonstrate that transformer-based
models offer promising potential in breast cancer detection, and the proposed method provides a solid
foundation for future advancements in medical image analysis. This work contributes to the ongoing
efforts to enhance breast cancer diagnostic tools, aiming for more accurate and efficient clinical
applications.
1
CHAPTER 1
INTRODUCTION
1.1. BACKGROUND
Breast cancer is a leading cause of cancer-related mortality among women worldwide, with over 2.3
million cases reported in 2020, according to the World Health Organization (WHO)[1]. Early detection
and accurate diagnosis are critical for effective treatment and improved survival rates. However,
traditional diagnostic methods, such as mammography and ultrasound, often rely on subjective
interpretations by radiologists, which can lead to inconsistencies and delays. The classification of breast
lesions into categories such as benign, malignant, or normal is crucial for guiding appropriate medical
interventions and reducing unnecessary biopsies.
Recent advancements in artificial intelligence (AI) and machine learning (ML) are transforming the field
of medical imaging. Specifically, convolutional neural networks (CNNs) like ResNet have demonstrated
exceptional performance in image classification tasks, while architectures such as UNet have become the
gold standard for medical image segmentation due to their ability to delineate intricate structures in
biomedical images. By leveraging these state-of-the-art models, breast cancer diagnosis can become more
precise and less dependent on human expertise.
Integrating these models into a web application interface further enhances accessibility and usability,
enabling healthcare professionals to perform real-time segmentation and classification of breast ultrasound
images. This approach not only facilitates early detection but also supports remote diagnostics, especially
in underserved regions.
Despite advancements in imaging technologies, breast cancer diagnosis remains a challenge due to the
variability in imaging quality, inter-observer discrepancies, and the intricate nature of distinguishing
benign, malignant, and normal breast tissues. Traditional approaches are limited by their reliance on
handcrafted features and their inability to capture the complex spatial and temporal patterns in breast
imaging data.
There is an urgent need for automated tools capable of segmenting breast lesions and classifying them into
benign, malignant, or normal categories with high accuracy. Existing methods often lack robustness and
generalizability across diverse datasets, limiting their utility in real-world clinical settings. This research
addresses these challenges by employing advanced ML models—ResNet for classification and UNet for
segmentation—combined with a user-friendly web application interface for image processing and result
visualization.
2
1.3. OBJECTIVES
1. To develop a machine learning framework for the segmentation and classification of breast
ultrasound images into benign, malignant, and normal categories.
2. To utilize ResNet for accurate classification of breast cancer types based on ultrasound images.
3. To implement UNet for precise segmentation of breast lesions, enabling better visualization and
analysis of tumor boundaries.
4. To integrate these models into a web application interface for seamless data input, processing, and
result output, enhancing accessibility for healthcare professionals.
5. To evaluate accuracy of the proposed framework.
1.4. SIGNIFICANCE
The global burden of breast cancer necessitates innovative solutions for its early detection and
management. Traditional diagnostic approaches are limited by their subjectivity and dependence on
manual interpretation, which can lead to diagnostic delays and inconsistencies. By leveraging
state-of-the-art ML models like ResNet and UNet, this project aims to address these limitations and
advance the field of breast cancer diagnostics.
The integration of ML-powered segmentation and classification into a web application offers significant
benefits:
● Accessibility: Enables healthcare providers, including those in remote areas, to utilize advanced
diagnostic tools.
● Precision: Enhances diagnostic accuracy through automated analysis, reducing human error.
● Efficiency: Streamlines the diagnostic workflow, allowing for quicker decision-making and
interventions.
● Scalability: Facilitates widespread adoption in clinical settings without the need for extensive
computational resources.
This research contributes to the growing field of AI in healthcare, demonstrating the potential of ML
models to revolutionize breast cancer diagnostics. By combining clinical data with advanced
computational techniques, this work paves the way for more personalized, timely, and effective care for
breast cancer patients.
3
CHAPTER 2
LITERATURE REVIEW
4
[4] In the paper titled "Attention-based U-Net model for breast cancer
segmentation using BUSI dataset" by Sulaiman et al. (2024), explores the integration of an
attention mechanism into the U-Net model to improve breast cancer segmentation in
ultrasound images. The attention-driven U-Net model achieved impressive results, with a
precision of 0.96, accuracy of 0.98, and specificity of 0.99. The study emphasizes the importance
of the attention mechanism in focusing on relevant features of ultrasound images, thereby
enhancing tumor boundary delineation. Through an architecture ablation analysis, the authors
determined that the best performance is achieved with four encoders and a 3x3 kernel size in the
attention block. Additionally, the research demonstrates how optimizing the number of encoders
and kernel size can improve segmentation accuracy while balancing computational efficiency.
The study’s results suggest that this attention-based U-Net model holds significant promise for
clinical applications in breast cancer diagnosis, offering reliable and efficient segmentation.
[5] In this paper titled "Enhancing deep convolutional neural network scheme for
breast cancer diagnosis with unlabeled data" by Sun et al. (2016) propose a semi-supervised
learning (SSL) framework for improving breast cancer diagnosis using deep convolutional
neural networks (CNN). Given the difficulty of acquiring large labeled datasets in medical
imaging, the authors introduce a method that combines a small portion of labeled data with a
larger set of unlabeled data to enhance CNN's performance. The framework includes modules for
data weighing, feature selection, co-training data labeling, and CNN training. The authors
evaluated their approach using 3,158 regions of interest (ROIs) extracted from mammograms,
with only 100 labeled ROIs, achieving an area under the curve (AUC) of 0.8818. Their findings
demonstrate that incorporating unlabeled data significantly improves the model’s accuracy,
making it a viable method for improving medical image classification when labeled data is
limited.
[6] In this paper titled "ResNet and its application to medical image processing:
Research progress and challenges" by Wanni Xu, You-Lei Fu, and Dongmei Zhu (2023),
explore the advancements and challenges of applying Residual Neural Networks (ResNet)
in medical image processing. The paper first introduces the fundamental concepts of ResNet,
explaining its architecture and residual units that enable efficient learning even in very deep
networks. The authors then review various applications of ResNet in medical imaging,
highlighting its use in diagnosing diseases such as lung tumors, breast cancer, skin diseases, and
brain disorders. The paper also discusses how ResNet's deep learning capabilities improve
diagnostic accuracy by extracting complex features from medical images. Furthermore, the
challenges faced in utilizing ResNet, including dataset imbalance and the need for large-scale
annotated data, are addressed. The authors conclude by suggesting potential future directions for
enhancing ResNet models in medical image processing, including further improvements in
network optimization and model generalization.
5
[7] In this paper titled Segmentation and Classification of Breast Tumor Using
Dynamic Contrast-Enhanced MR Images by Yuanjie Zheng et al. (2008), a novel framework
is proposed for accurately characterizing breast tumors using contrast-enhanced magnetic
resonance imaging (DCE-MRI). The authors introduce a graph-cut based segmentation
algorithm that refines manual tumor segmentation, improving accuracy in identifying tumor
regions. Additionally, they present a Spatio-Temporal Enhancement Pattern (STEP) model,
which integrates dynamic contrast enhancement and spatial variations to better distinguish
between malignant and benign tumors. The framework was validated using a dataset of 31
subjects, demonstrating high classification performance with an area under the curve (AUC) of
0.97. The study concludes that combining temporal enhancement, architectural features, and
spatial variations significantly improves tumor classification. Future work aims to expand the
evaluation using a larger database. This work highlights the importance of refining segmentation
and extracting advanced features for enhanced tumor detection.
6
CHAPTER 3
METHODOLOGY
The data[9] collected at baseline are breast ultrasound images among women in ages between
25 and 75 years old from Baheya Hospital for Early Detection & Treatment of Women's Cancer,
Cairo, Egypt. This data was collected in 2018 using LOGIQ E9 ultrasound and LOGIQ E9 Agile
ultrasound system. The number of patients is 600 female patients. The dataset consists of 780
images with an average image size of 500*500 pixels. The images are in PNG format. The
ground truth images are presented with original images. The images are categorized into
three classes, which are normal, benign, and malignant.
TABLE 1. The three classes of breast cases and the number of images in each case.
7
3.2 LIBRARIES
Libraries are essential tools that provide pre-written code modules, functions, and utilities, enabling
developers and researchers to implement, train, evaluate, and deploy machine learning models efficiently.
These libraries reduce the need for repetitive coding by offering high-level abstractions for common
machine learning tasks. The libraries used for this project are:
● OS: A standard library in Python used for interacting with the operating system. It facilitates file and
directory manipulation, such as creating paths, navigating file structures, and managing file
permissions.
● Glob: This library allows for file pattern matching, helping in locating files and directories based on
wildcard patterns.
● RE (Regular Expressions): A powerful library used for pattern matching and string manipulation,
aiding in tasks like data cleaning, validation, and feature extraction.
OPENCV
An open-source computer vision library widely used for image and video processing. It provides
functionalities for tasks such as image filtering, transformations, feature extraction, and object
detection, making it essential for image-based machine learning tasks.
NUMPY
A core library for numerical computing in Python. It is optimized for handling large multi-dimensional
arrays and matrices, along with a comprehensive collection of mathematical functions to perform
operations on these arrays efficiently.
PANDAS
A versatile data analysis library used for data manipulation and exploration. It provides powerful data
structures like DataFrames, which simplify handling, cleaning, and preprocessing structured datasets.
MATPLOTLIB
A foundational plotting library in Python for creating static, interactive, and animated visualizations. It
supports a wide variety of plots, including line graphs, scatter plots, and histograms, aiding in data
analysis and presentation.
SEABORN
A Python library built on top of Matplotlib, designed to simplify statistical data visualization. It offers
high-level interfaces for creating attractive and informative charts, such as heatmaps and pair plots,
essential for exploratory data analysis.
8
SCIKIT-LEARN (SKLEARN)
A widely-used library for machine learning tasks, providing tools for data preprocessing, feature
selection, model training, and evaluation. The “train_test_split” function from Scikit-learn is
particularly useful for splitting datasets into training and testing subsets.
TensorFlow: A robust open-source framework developed by Google for building, training, and
deploying machine learning models. TensorFlow supports both deep learning and general machine
learning applications, offering scalable and high-performance solutions.
Keras: A high-level API within TensorFlow, simplifying the creation and training of neural networks.
It includes modules for layers, activations, optimizers, losses, and metrics, allowing users to build
complex deep learning models with ease.
These libraries collectively streamline tasks such as data processing, visualization, model creation, and
evaluation, forming the backbone of this machine learning project.
Loading a dataset is one of the fundamental steps in machine learning, as it provides the raw data on
which models are trained, validated, and tested. The process of loading a dataset involves accessing the
data from its source, reading it into memory, and preparing it for analysis and modeling.
In this project, breast ultrasound images and their corresponding masks are utilized to develop a machine
learning model. The dataset consists of two categories: benign and malignant samples. To manage and
preprocess this data efficiently, a frame object “framObjTrain” is defined.
A custom function, “LoadData”, is employed to automate the loading and preprocessing of the dataset.
9
The function performs the following key tasks:
● Parameters:
1. “frameObj”: A dictionary used to store the images and masks.
2. “imgPath”: Path to the directory containing the ultrasound images.
3. “maskPath”: Path to the directory containing the corresponding masks.
4. “shape”: The desired dimensions (default: 256x256 pixels) for resizing images and masks.
● Functionality:
1. The function retrieves the filenames from the “imgPath” and identifies unique images based
on a predefined naming convention.
2. It constructs the file paths for the corresponding images and masks, ensuring alignment
between them.
3. Each image and mask is loaded, resized to the target dimensions, and appended to the
respective lists (“img” and “mask”) in “frameObj”.
By centralizing the data into “framObjTrain”, the dataset is prepared for subsequent analysis and model
development, ensuring uniformity in image dimensions and structure. This preprocessing step is vital for
enabling the machine learning model to learn effectively from the data.
10
3.4 DATASET PREPROCESSING
The dataset consists of a total of 1578 images of breast ultrasound collected from 600 women patients.
The images are divided into three classes: Benign, Malignant and Normal. The Benign class consists of
the most number of images (891 images, 56.46% of the total images), followed by the Malignant (421
images, 26.67% of total images) and Normal class (266 images, 16.85% of total images). Since the data
was already pre-processed and annotated at the source, there weren’t any duplicate images in the data.
11
CHAPTER 4
IMPLEMENTATION
The Breast Cancer Image Segmentation project aims to implement a deep learning-based approach for
segmenting breast cancer regions in medical images using a ResNet architecture. The project is structured
to ensure modularity and ease of development. The dataset will be organized into separate directories for
training, testing, and validation, facilitating efficient preprocessing and model evaluation. A dedicated src/
folder will house all essential scripts, including data loading, model definition, training procedures, and
utility functions. Saved models and logs will be systematically stored in models/ and logs/ directories,
respectively, ensuring reproducibility and tracking of results. Additionally, outputs such as segmented
images and evaluation metrics will be saved in a results/ folder for analysis. A README.md file will
provide comprehensive project documentation, and dependencies will be listed in a requirements.txt file to
streamline environment setup. This structured approach ensures a clean workflow, making the
implementation process efficient and scalable.
12
4.1 DATA LOADING
The dataset for breast cancer segmentation consists of ultrasound images and corresponding masks, which
are used to identify regions of interest, such as benign or malignant tissues. A structured approach was
implemented to preprocess and load the data. A dictionary, framObjTrain, was defined to store image and
mask data separately in the form of lists. The LoadData function was developed to automate the process of
loading images and masks from their respective directories, resizing them to a uniform shape of 256x256
pixels for consistency, and appending them to the framObjTrain dictionary.
The function begins by retrieving image names from the specified directory and extracting unique
identifiers to match images with their respective masks. These pairs are loaded using Matplotlib's image
reader and resized using OpenCV. By iterating over the dataset, the function ensures that all image-mask
pairs are systematically added to the data frame.
Once the data was loaded, a visualization step was performed to confirm the alignment of ultrasound
images and their corresponding segmentation masks. A benign or malignant image and its mask were
displayed side by side, demonstrating the dataset's structure and verifying the correctness of the loading
process. This visualization serves as an initial validation step and provides insights into the dataset's quality
and format, which is crucial for building an effective segmentation mode.
13
4.2 TESTING AND EVALUATION
To assess the model's performance, a structured testing pipeline was developed for predicting and
evaluating segmentation masks on a subset of validation images.
Testing Pipeline
● Prediction Process:
● The predict16 function processes a batch of 16 validation images:
● Resizes the images to the defined shape (256x256).
● Feeds the resized images into the trained U-Net model for segmentation predictions.
● Outputs include:
● Predicted segmentation masks.
● Preprocessed input images.
● Corresponding ground truth masks for comparison.
● Visualization:
● The Plotter function generates a side-by-side comparison of:
1. Input image.
2. Predicted segmentation mask.
3. Ground truth mask.
● Visualization is organized using subplots, with each row representing a complete
comparison for one image.
● Qualitative Insights:
● Visualization highlights the model's accuracy in segmenting cancerous regions.
● Key examples from the dataset, ranging from straightforward to complex cases, were
displayed for evaluation.
● Performance Highlights:
● The model demonstrated a strong ability to segment cancerous regions accurately.
● Visualization validated the effectiveness of the U-Net architecture for breast cancer image
segmentation tasks.
● Opportunities for Improvement:
● Cases with discrepancies between predictions and ground truth highlighted areas where the
model could be further refined.
14
Fig. 4: Images, Predicted Masks and Actual Masks
15
4.3 MODEL SAVING, IMAGE PREPARATION, AND DATASET SPLITTING
The project includes essential steps to save the trained segmentation model, preprocess the dataset for
EfficientNet training, and split it into training and testing subsets.
Model Saving
The trained U-Net model, named BreastCancerSegmentor, was saved in the HDF5 format (.h5) for future
use. This ensures the model's weights and architecture are preserved, allowing for easy reuse and inference
without retraining.
Image Preparation
A preprocessing pipeline was created to prepare breast ultrasound images for training with EfficientNet.
The prepare_image function resizes each image to a uniform shape of 224x224 pixels and preprocesses it
using EfficientNet’s preprocessing function to normalize pixel values. The dataset directory was scanned,
and all images, excluding mask files, were selected. Labels were assigned based on their folder names
(benign, malignant, or normal).
The images and their corresponding labels were stored in dictionaries for structured data management.
After preprocessing, the images were converted into NumPy arrays for compatibility with machine
learning models. The target labels were encoded into numerical values using LabelEncoder, with the target
classes being: benign, malignant, and normal.
Data Splitting
The processed dataset was split into training and testing subsets using an 90-10 split ratio. This ensured a
balanced distribution of data, enabling the model to learn effectively during training and be evaluated on
unseen examples during testing. The resulting shapes of the training and testing sets were as follows:
● Training set: 702 images with shape (224, 224, 3)
● Testing set: 78 images with shape (224, 224, 3)
This structured approach to saving the model and preparing the dataset ensures robustness and scalability
for further training and evaluation tasks, particularly with EfficientNet for classification tasks.
16
4.4 MODEL BUILDING AND TRAINING
This section focuses on building and training a transfer learning model based on ResNet152V2 for breast
cancer classification.
● Base Model:
○ The ResNet152V2 architecture, pre-trained on the ImageNet dataset, is used as the feature
extractor.
○ The include_top=False argument excludes the classification layers, retaining only the
convolutional base for feature extraction.
○ The weights of the base model are frozen (trainable = False), ensuring the pre-trained
features remain unchanged during training.
● Custom Layers for Transfer Learning:
○ The output of the base model is processed through a series of custom layers to adapt it to the
current task:
■ Convolutional Blocks: Two convolutional layers with 256 and 128 filters, followed
by Batch Normalization and ReLU activation, are used to refine feature maps.
Global Average Pooling is applied to reduce spatial dimensions.
■ Fully Connected (FC) Layers:
● A flattening layer prepares the pooled features for dense layers.
● Dense layers with 64 and 32 neurons, Batch Normalization, and ReLU
activation refine the features. Dropout regularization prevents overfitting.
■ Output Layer: The final layer is a softmax classifier with 3 units, corresponding to
the three classes: benign, malignant, and normal.
● Compilation:
The model is compiled with:
○ Optimizer: RMSprop with a learning rate of 0.001, suitable for handling non-stationary
objectives.
○ Loss Function: Sparse Categorical Crossentropy, ideal for multi-class classification with
integer labels.
○ Metrics: Sparse Categorical Accuracy to measure prediction accuracy.
17
Training the Model
Early Stopping:
Training is monitored using an EarlyStopping callback to halt training if no improvement in validation loss
is observed for 60 epochs, saving computational resources.
Best Model Checkpointing:
The ModelCheckpoint callback saves the model with the best validation accuracy to the file
best_model.h5.
Training Process:
● The model is trained using a batch size of 32 for a predefined number of epochs (EPOCHS), with
training and validation data split as per the earlier data preparation.
● GPU acceleration (/gpu:0) is utilized to speed up the training process.
Outcome:
● The training results, including metrics and loss trends, are captured in the history object for further
analysis.
● The best-performing model is saved for future evaluation and deployment.
● This transfer learning approach leverages the power of a pre-trained model while tailoring it to the
specific classification task, ensuring high efficiency and performance.
18
4.5 MODEL EVALUATION AND FINE-TUNING
Example plots:
Initial Evaluation:
● The model is evaluated on the test dataset using incept_model.evaluate(), yielding the following
results:
○ Loss: 1.1878
○ Sparse Categorical Accuracy: 0.6154 (61.54%)
● These metrics indicate baseline performance and suggest room for improvement.
21
4.7 WEB APP INTERFACE
The Breast Cancer Segmentation Web Application was developed using Streamlit to create an
interactive and user-friendly interface for medical image segmentation. The application integrates various
libraries, including TensorFlow, NumPy, PIL (Python Imaging Library), and OpenCV, to facilitate deep
learning-based segmentation of ultrasound images. The page layout and title are configured using
Streamlit's “st.set_page_config” to ensure an intuitive experience for users. This app aims to simplify the
complex process of segmentation by providing an easy-to-use interface where users can upload images and
receive segmentation results in real time.
The segmentation model, trained on breast cancer ultrasound images, is loaded using Streamlit’s
“@st.cache_resource” decorator. This ensures the model is cached and loaded only once per session,
optimizing performance and reducing redundancy. The model is stored in the file
“BreastCancerSegmentor.h5” and predicts segmentation masks for identifying regions of interest. The
preprocessing pipeline includes converting the uploaded image to RGB format, resizing it to 256×256256
\times 256256×256 pixels (the model's expected input size), normalizing pixel values to the range [0, 1],
and adding a batch dimension. These steps ensure compatibility with the pre-trained model and improve
segmentation accuracy.
The application processes the uploaded image using the “segment_image” function. This function predicts
the segmentation mask, thresholds it to create a binary mask for better visualization, and resizes it to the
original dimensions of the uploaded image. The results are displayed alongside the original image,
enabling users to clearly view the segmented areas. The app provides a seamless user experience by
displaying a spinner during the processing phase and delivering appropriate success or error messages
based on the outcome of the operation.
The interface design includes a sidebar for navigation, where users can explore options like uploading an
image, learning about the app, or contacting the developers. The main content area allows users to upload
images in JPG, JPEG, or PNG formats and initiate segmentation by clicking a button labeled "Segment
Image." The segmentation results are then presented on the screen, ensuring a smooth and efficient
workflow. This app demonstrates the power of combining deep learning with interactive design to provide
real-time breast cancer segmentation, making it a valuable tool for medical diagnostics.
22
Fig. 5. Web-App interface
23
CHAPTER 5
However, the validation accuracy showed signs of overfitting, where the model performed well on the
training set but struggled to generalize to unseen data. The peak validation accuracy was 73.08% at
Epoch 4, but by the final Epoch 60, it dropped to 69.23%. This difference between training and
validation accuracy indicates that while the model became very good at memorizing the training data, it
failed to generalize well to the validation set, leading to overfitting.
The training loss consistently decreased, starting from 0.1923 at Epoch 4 and reaching 0.0942 at Epoch
10, reflecting the model's continuous learning and reduction in error. The training loss reduction
demonstrates the model's ability to improve over time as it adjusted to the training data. In contrast, the
validation loss fluctuated throughout training. Initially, it was 1.0499 at Epoch 4 and increased to 1.3091
by Epoch 60, further indicating that the model did not generalize well to the validation data.
24
Fig. 7. Training history for inception model (prediction) across 60 epochs
These results suggest that the model, although successful at learning from the training set, may require
adjustments, such as regularization techniques or data augmentation, to improve its ability to generalize
unseen data and reduce overfitting.
25
5.1.1 WEB APP-RESULTS
The Breast Cancer Segmentation Web Application effectively allows users to upload ultrasound
images and receive real-time segmentation results. The application processes the uploaded image using
the pre-trained deep learning model which is in .h5 file. After the image is preprocessed, the model
performs segmentation, predicting a mask that highlights regions of interest in the uploaded image. This
mask is then post-processed and resized to match the original dimensions of the image, ensuring a
seamless display of the segmentation output. Streamlit is used to display the segmented regions,
highlighting areas that may indicate cancer. The user-friendly interface ensures easy navigation and
interaction, with a clear display of the original image and segmentation mask for comparison. The app is
efficient, using TensorFlow and caching to optimize performance.
While the app works well for single-image uploads, future improvements could include batch processing,
zoom features for detailed inspection, and expanded model capabilities for different cancer types. Overall,
the app provides a valuable tool for breast cancer detection, aiding in early diagnosis and medical
decision-making.
Fig. 8. Uploaded Ultrasound Image (Left) and the Segmented Image (Right)
26
CHAPTER 6
CONCLUSION
Compared to traditional methods such as standalone U-Net or conventional CNNs, the proposed
approach demonstrates superior segmentation accuracy and adaptability. Earlier techniques often
struggled with generalizing across diverse datasets due to limited capacity for capturing global
context or handling noisy, low-contrast ultrasound images. The hybrid model addresses these
shortcomings by combining transformers' global attention mechanisms with ResNet's localized
feature extraction, delivering improved performance in detecting small or subtle tumors.
Preprocessing techniques, such as resizing and normalization, further enhance the model's
robustness, enabling it to handle variability in input data. This hybrid architecture not only
achieves more reliable segmentation but also reduces manual parameter tuning, overcoming
significant limitations of earlier models.
The model achieved promising results, with a training accuracy of 97.80% and a training loss of
0.0809 after 60 epochs, underscoring its learning capability. However, the validation accuracy of
62.82% and a validation loss of 0.9853 highlight areas for improvement in generalization to
unseen data. The integration of U-Net techniques for isolating regions of interest contributed to
enhanced detection and classification of abnormal growths, particularly in challenging breast
ultrasound images. By addressing these issues and refining the model further, this framework
could achieve even greater clinical impact. The performance metrics validate the effectiveness of
combining transformers and ResNet, showcasing their potential to overcome limitations of
conventional segmentation approaches in medical imaging.
27
To extend its practical usability, a Streamlit-based web application was developed, enabling
real-time segmentation of breast ultrasound images. This application provides a user-friendly
interface for clinicians and researchers, bridging advanced AI models and routine clinical
workflows. It ensures accessibility and usability, especially in resource-limited settings, making
it a valuable tool for early breast cancer detection. Future work should aim to address
generalization challenges by incorporating larger and more diverse datasets, as well as exploring
advanced augmentation and domain adaptation techniques. Additionally, incorporating
explainability mechanisms can enhance trust in the model, facilitating broader adoption in
clinical environments. This hybrid approach marks a significant step forward in breast cancer
diagnostics, combining cutting-edge technology with practical implementation to improve
patient outcomes and advance the field of medical image analysis.
28
CHAPTER 7
FUTURE WORKS
Future work for this research will focus on several key areas to enhance the model's applicability and
performance. Firstly, efforts will be directed toward developing a more comprehensive application that
integrates both classification and segmentation functionalities. By incorporating classification capabilities,
the system can not only segment regions of interest but also provide insights into the nature of the detected
abnormalities, such as differentiating between benign and malignant tumors. This would significantly
improve the diagnostic utility of the application, offering a unified platform for clinicians to analyze breast
ultrasound images more effectively.
Secondly, improving the model's validation accuracy will be a priority to ensure robust generalization to
unseen data. This can be achieved by using larger and more diverse datasets, including ultrasound images
from different populations and imaging conditions. Data augmentation techniques, such as elastic
transformations, random rotations, and contrast adjustments, will be explored to simulate real-world
variability and improve the model's adaptability. Additionally, advanced optimization techniques like
learning rate schedulers, fine-tuning deeper layers, and incorporating ensemble learning strategies will be
implemented to enhance performance further.
Another focus will be on refining the segmentation model by exploring hybrid architectures and
incorporating advanced transformer-based techniques such as Swin Transformers or Vision Transformers
(ViT) for improved contextual understanding. Domain adaptation methods will also be considered to
handle variations across imaging devices and datasets. Addressing these challenges can significantly
improve the segmentation quality, particularly for complex cases with subtle or ambiguous tumor features.
Lastly, the user interface of the Streamlit-based web application will be expanded to include additional
functionalities, such as real-time visualization of classification probabilities, automated reporting, and
integration with electronic medical records (EMRs). This will ensure the application is not only accurate
but also practical and scalable for deployment in clinical settings. With these enhancements, the research
will contribute to a more comprehensive, efficient, and user-friendly tool for early breast cancer detection
and diagnosis.
29
REFERENCES
[2] Shamshad F, Khan S, Zamir SW, Khan MH, Hayat M, Khan FS, Fu H. "Transformers in
Medical Imaging: A Survey." Med Image Anal 2023;88:102802.
DOI:10.1016/j.media.2023.102802.
[3] Islam, M. R., Rahman, M. M., Ali, M. S., Nafi, A. A. N., Alam, M. S., Godder, T. K.,
Miah, M. S., & Islam, M. K. (2024). Enhancing breast cancer segmentation and classification:
An Ensemble Deep Convolutional Neural Network and U-Net approach on ultrasound images.
Medical Image Analysis, 102802. DOI:10.1016/j.media.2023.102802
[4] Sulaiman, A., Anand, V., Gupta, S., Rajab, A., Alshahrani, H., Al Reshan, M. S., Shaikh,
A., & Hamdi, M. (2024). Attention-based U-Net model for breast cancer segmentation using
BUSI dataset. Scientific Reports, 14, 22422. DOI:10.1038/s41598-024-21764-3
[5] Sun, W., Tseng, T.-L., Zhang, J., & Qian, W. (2016). Enhancing deep convolutional
neural network scheme for breast cancer diagnosis with unlabeled data. Computers in Biology
and Medicine, 74, 82-91. DOI:10.1016/j.compmedimag.2016.07.004
[6] Xu, W., Fu, Y.-L., & Zhu, D. (2023). ResNet and its application to medical image
processing: Research progress and challenges. Computers in Biology and Medicine, 161,
107660.DOI:10.1016/j.cmpb.2023.107660
[7] Zheng, Y., Baloch, S., Englander, S., Schnall, M. D., & Shen, D. (2008). Segmentation
and Classification of Breast Tumor Using Dynamic Contrast-Enhanced MR Images. Journal of
Magnetic Resonance Imaging, 28(4), 934–944.
DOI:10.1002/jmri.21419.
[8] Iqbal, A., & Sharif, M. (2023). Memory-efficient transformer network with feature fusion
for breast tumor segmentation and classification task. Engineering Applications of Artificial
Intelligence, 114, 107292. DOI:10.1016/j.engappai.2023.107292
30