
PROJECT REPORT On Lung Cancer Detection Using CNN

The project report details the development of a lung cancer detection system using a Convolutional Neural Network (CNN) based on the ResNet-18 model. It addresses the limitations of traditional diagnostic methods and aims to provide a fast, automated, and accurate solution for early lung cancer detection, achieving 95% accuracy. The report includes methodology, literature review, implementation steps, and future scope for the project.

Uploaded by

rahadeom0
Copyright
© All Rights Reserved

PROJECT REPORT ON LUNG CANCER DETECTION USING CNN

Project Title: LUNG CANCER DETECTION USING CNN

• Guide/Supervisor Name: Mr. Vinodkumar Bhutnal.


• Institution Name: Pimpri Chinchwad University
• Department: School Of Engineering & Technology
• Submission Date:

• Group Member Name(s):


1. Omkar Somnath Gorde H-15.
2. Ameya Sanjay Chopade H-11.
3. Soham Yogesh Alai H-02.
4. Pratik Keshavrao Baste H-07.

Page 1 of 21
TABLE OF CONTENTS

Abstract
• Problem Statement
• Objectives
• Methodology
• Key Results
• Conclusion

Introduction
• Introduction
• Problem Statement
• Objectives
• Scope of the Project

Literature Review
• Introduction
• Previous Research and Studies
• Comparison with Existing Solutions
• Identified Research Gaps
• Conclusion

Methodology
• Introduction
• Dataset used
• Tools, Technologies, and Frameworks Used
• Algorithms and Techniques Applied
• System Architecture and Flow Diagram
• Implementation Steps

System Architecture
• Implementation Steps

Project Implementation
• Software and Hardware Requirements
• Model Training and Optimization
• User Input and Output Flow
• Screenshot

Conclusion
Future Scope
References
Appendix

ABSTRACT
Lung cancer is one of the leading causes of cancer-related deaths worldwide, primarily due to
late-stage detection and limited accessibility to early screening methods. Traditional
diagnostic techniques, such as radiologist-based manual examination of X-ray and CT scans,
are time-consuming, prone to human error, and often unavailable in remote areas. Artificial
Intelligence (AI) and Deep Learning (DL) offer a revolutionary solution by enabling fast,
automated, and highly accurate detection of lung cancer.

Problem Statement:
Existing diagnostic approaches rely heavily on human expertise, making them subjective,
slow, and susceptible to misdiagnosis. Many regions lack access to trained radiologists,
causing delays in diagnosis and treatment. A CNN-based lung cancer detection system can
help automate the process, ensuring faster, more reliable, and accessible healthcare solutions.

Objectives:
• Develop a deep learning-based model for lung cancer detection.
• Utilize a pre-trained ResNet-18 CNN model for high-accuracy classification of lung
cancer from CT scan images.
• Design a Flask-based web interface for real-time analysis and easy image uploads.

Methodology:
• Dataset Preparation: Use the IQ-OTH/NCCD dataset with 1,190 images for model
training and testing.
• Model Training: Fine-tune ResNet-18 to improve lung cancer classification accuracy.
• System Implementation: Deploy a Flask-based web application for real-time detection
and result display.

Key Results:
• Achieved 95% accuracy in detecting lung cancer.
• Reduced diagnostic time, improving patient care.
• Successfully deployed an interactive and user-friendly web interface for X-ray/CT
Scan analysis.

1. Introduction
Lung cancer remains a major global health challenge, with high mortality rates due to
late-stage diagnosis. Traditional diagnostic methods, such as radiologist interpretation
of CT scans, are time-intensive and subject to human error, often leading to delayed
treatment. Early detection of lung cancer is crucial in improving survival rates, as
timely medical intervention can significantly enhance patient outcomes. However, the
lack of automated, efficient, and accurate diagnostic tools poses a significant
challenge.

This project aims to develop an AI-driven lung cancer detection system using
Convolutional Neural Networks (CNNs), specifically leveraging the ResNet-18
model. CNNs have revolutionized medical imaging by automating feature extraction
and classification, reducing the dependency on manual analysis. By using a pre-
trained ResNet-18 model, the system classifies lung CT scan images as cancerous or
non-cancerous with high accuracy and efficiency.

2. Problem Statement
Lung cancer detection through traditional methods is time-consuming, prone to
diagnostic errors, and highly dependent on expert radiologists. The high false-
negative rates in early detection stages lead to delayed diagnosis and treatment,
reducing survival rates. There is a need for a fast, accurate, and automated deep-
learning-based approach to assist medical professionals in detecting lung cancer at an
early stage.

3. Objectives
• To develop an AI-based system using CNN (ResNet-18) for lung cancer
detection from CT scan images.
• To enhance the accuracy of lung cancer classification through transfer
learning and hyperparameter tuning.
• To reduce dependency on human analysis by automating the detection
process.
• To evaluate model performance using metrics such as accuracy.
• To develop a web-based interface for real-time image upload and
classification.

4. Scope of Work
• Dataset Selection: Using the IQ-OTH/NCCD lung cancer dataset, obtained from Kaggle, for
training and testing.
• Data Preprocessing: Applying image resizing, noise reduction, and data
augmentation to improve model robustness.
• Model Development: Training and optimizing ResNet-18 through transfer
learning and hyperparameter tuning.
• Performance Evaluation: Assessing model efficiency using metrics such as
accuracy.
• Deployment: Creating a user-friendly web-based interface for real-world
application.

Literature Review
Introduction
Lung cancer remains one of the most deadly diseases worldwide, with a high mortality rate
due to late detection. Early diagnosis is crucial for improving survival rates, and
advancements in medical imaging and deep learning have shown promise in this area. This
literature review examines previous research and studies related to lung cancer detection
using CT scan images, compares existing solutions, and identifies gaps in the current
research.

Previous Research and Studies


1. LCDctCNN: Lung Cancer Diagnosis of CT Scan Images Using CNN-
Based Model (Research Paper 1)
o This study proposed a Convolutional Neural Network (CNN) framework for
early lung cancer detection using CT scan images. The model achieved an
accuracy of 92%, an AUC of 98.21%, and a recall of 91.72%. The study
compared CNN with other models like Inception V3, Xception, and ResNet-
50, demonstrating that CNN outperformed these models in terms of accuracy
and AUC. The dataset used was from Kaggle, consisting of 967 CT scan
images categorized into adenocarcinoma, large cell carcinoma, squamous cell
carcinoma, and normal cases.

2. Precise Lung Cancer Prediction using ResNet-50 Deep Neural Network
Architecture (Research Paper 2)
o This research focused on improving lung cancer classification using CT
images by comparing a modified ResNet50 architecture with EfficientNetB1
and Inception V3. The modified ResNet50 achieved an accuracy of 98.1% and
an AUC of 0.97, outperforming the other models. The study emphasized the
importance of preprocessing and hyperparameter optimization in enhancing
model performance. The dataset included annotated lung nodule
classifications from a public CT image collection.

3. A Comparative Analysis for Early Diagnosis of Lung Cancer Detection
and Classification by CT Images Processing Using ResNet-50 Model of
CNN (Research Paper 3)
o This research compared various automated methods for lung cancer detection
using CT images, focusing on the ResNet-50 model. The study discussed the
use of different datasets like LIDC, ELCAP, and LUNA-16, and highlighted
the importance of preprocessing, segmentation, and feature extraction. The
ResNet-50 model achieved an accuracy of 66.92% in lung cancer detection,
with higher accuracies in other applications like breast cancer detection
(99.10%) and COVID-19 detection (96.23%).

4. VER-Net: A Hybrid Transfer Learning Model for Lung Cancer Detection
Using CT Scan Images (Research Paper 4)
o This paper introduced a hybrid transfer learning model (VER-Net) by stacking
VGG19, EfficientNetB0, and ResNet101. The model achieved an accuracy of
91%, precision of 92%, recall of 91%, and an F1-score of 91.3%. The study
highlighted the advantages of transfer learning, such as reduced data
dependency and improved feature extraction. The dataset used was from
Kaggle, containing 1653 CT images of adenocarcinoma, large cell carcinoma,
squamous cell carcinoma, and normal cases.

5. Lung Cancer Detection using CT Scan Images (Research Paper 5)


o This study utilized a modified ResNet50 architecture for lung cancer
detection, achieving an accuracy of 93.33%, sensitivity of 92.75%, and
precision of 93.75%. The model was trained on a dataset of 1000 CT scans
and outperformed other architectures like EfficientNetB1 and AlexNet. The
study emphasized the role of data augmentation and transfer learning in
improving model performance.
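
The studies above report accuracy, precision, recall (sensitivity), and F1-score. For reference, the usual binary definitions are sketched below, where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives. This is a general reminder rather than the exact formulation used in any of the cited papers; for multi-class problems such as those in Research Papers 1 and 4, these quantities are typically computed per class and then averaged.

```latex
\begin{align*}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
F_1              &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align*}
```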

Comparison with Existing Solutions

• Accuracy and Performance: The CNN-based model in Research Paper 1 achieved an
accuracy of 92%, while the modified ResNet50 in Research Paper 2 achieved a higher
accuracy of 98.1%. The hybrid VER-Net model in Research Paper 4 achieved an
accuracy of 91%, and the ResNet-50 model in Research Paper 3 achieved 66.92%
accuracy. The modified ResNet50 in Research Paper 5 achieved an accuracy of
93.33%. These results suggest that modified ResNet-50 variants can outperform
simpler CNN models, but the evidence is mixed: the ResNet-50 in Research Paper 3
scored lowest, Research Paper 1 reported its CNN outperforming ResNet-50, and the
studies used different datasets, so the figures are not directly comparable.

• Dataset and Preprocessing: Research Papers 1, 2, 4, and 5 used datasets from Kaggle,
while Research Paper 3 used multiple datasets like LIDC and LUNA-16.
Preprocessing techniques such as noise reduction, image normalization, and data
augmentation were common across all studies, with Research Paper 4 emphasizing
the importance of random oversampling and data augmentation to handle class
imbalance.

• Model Complexity: The hybrid VER-Net model in Research Paper 4 combined three
transfer learning models (VGG19, EfficientNetB0, and ResNet101), which increased
computational complexity but improved accuracy. In contrast, Research Paper 1 used
a simpler CNN architecture, which was easier to implement but had lower accuracy
compared to ResNet-50 variants.

Gaps Identified in Existing Research

1. Dataset Limitations: Most studies used publicly available datasets like Kaggle, LIDC,
and LUNA-16, which may not fully represent the diversity of lung cancer cases in
real-world clinical settings. There is a need for larger and more diverse datasets to
improve model generalizability.

2. Model Interpretability: While deep learning models like ResNet-50 and CNN achieve
high accuracy, they often lack interpretability. Clinicians need models that not only
predict accurately but also provide insights into the decision-making process, which is
currently a gap in existing research.

3. Class Imbalance: Many studies, including Research Papers 1 and 4, faced challenges
with class imbalance, where certain types of lung cancer (e.g., adenocarcinoma) were
overrepresented compared to others (e.g., large cell carcinoma). Techniques like
random oversampling and data augmentation were used, but more robust methods are
needed to handle this issue.

4. Computational Efficiency: Hybrid models like VER-Net, while accurate, are
computationally expensive and may not be feasible for real-time clinical applications.
There is a need for more efficient models that balance accuracy and computational
cost.

5. Generalization to Other Diseases: While most studies focused on lung cancer, there is
potential to apply these models to other diseases using CT scan images. Research
Paper 4 suggested that VER-Net could be useful for other diseases, but this has not
been extensively explored.

Conclusion
The reviewed studies demonstrate significant advancements in lung cancer detection using
CT scan images and deep learning models. Modified ResNet-50 variants reported the highest
accuracies (98.1% and 93.33%), and the hybrid VER-Net model reached 91%, although one
ResNet-50 study reported only 66.92%.
However, challenges related to dataset diversity, model interpretability, class imbalance, and
computational efficiency remain. Future research should focus on addressing these gaps to
develop more robust and clinically applicable models for early lung cancer detection.

Methodology

Introduction
Lung cancer remains one of the leading causes of mortality worldwide, and early detection is
crucial for improving survival rates. Traditional diagnostic methods, such as manual
examination of CT scans and X-rays by radiologists, are time-consuming and prone to human
error. This project employs deep learning techniques, specifically ResNet-18, a pre-trained
Convolutional Neural Network (CNN) model, to automate and enhance lung cancer
detection. The model takes X-ray/CT Scan images as input and classifies them as either
positive (cancerous) or negative (non-cancerous).

Dataset Used
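The dataset utilized for this project is the IQ-OTH/NCCD Lung Cancer Dataset, which contains
1,190 CT scan images. The dataset groups its cases into normal, benign, and malignant classes
(see Appendix); in this project the images are labelled as either cancerous or non-cancerous.
This dataset is used for training and evaluating the deep learning model.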

Tools, Technologies, and Frameworks Used
To build and deploy the model efficiently, we used the following:

• Programming Language: Python

• Deep Learning Framework: PyTorch

• Pre-trained Model: ResNet-18

• Libraries Used: NumPy, OpenCV, Pandas, Matplotlib, Scikit-learn

• Development Environment: Jupyter Notebook / Google Colab/VS Code

Algorithms and Techniques Applied


• Pre-trained Model (ResNet-18): A deep CNN used for feature extraction and
classification.

• Image Preprocessing: Resizing, normalization, noise reduction, and augmentation to
improve model robustness.

• Transfer Learning: Utilizing pre-trained weights to enhance model accuracy and
reduce training time.

• Loss Function: Cross-Entropy Loss over the two output classes (cancerous and non-cancerous).

• Optimization Algorithm: Adam optimizer for faster convergence.
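
To make the loss function and optimizer listed above more concrete, the following is a minimal pure-Python sketch of cross-entropy loss for one sample and a single Adam update for one scalar parameter. It is an illustration only: the project itself relies on PyTorch's built-in implementations, and the logits, gradient, learning rate, and function names used here are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(probability of the true class)."""
    return -math.log(softmax(logits)[target])

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter (t is the 1-based step)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Hypothetical logits for the classes [non-cancerous, cancerous]
loss_confident = cross_entropy([0.2, 3.0], target=1)  # correct and confident
loss_wrong = cross_entropy([3.0, 0.2], target=1)      # confident but wrong

# Hypothetical first optimizer step on one parameter
new_param, m, v = adam_step(param=0.5, grad=2.0, m=0.0, v=0.0, t=1)
```

The loss is small when the model assigns high probability to the true class and large when it is confidently wrong. On the first Adam step the update has a magnitude close to the learning rate regardless of the gradient's scale.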

System Architecture and Flow Diagram

The system follows a structured workflow:

1. User Input: X-ray/CT Scan image is uploaded.

2. Preprocessing: Image resizing, normalization, and augmentation.

3. Feature Extraction: ResNet-18 extracts relevant features from the image.

4. Classification: The model predicts whether the image is positive or negative for lung
cancer.
5. Output: The result is displayed to the user.

Implementation Steps
1. Dataset Collection: IQ-OTH/NCCD dataset with 1,190 images is used.

2. Preprocessing: Images undergo noise removal, resizing, and normalization.

3. Model Training: ResNet-18 is fine-tuned using the dataset with optimized
hyperparameters.

4. Validation and Testing: The model’s performance is evaluated using accuracy.

5. Deployment: A web-based interface is developed for real-time testing.
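
The evaluation step above can be sketched as follows. This is a minimal illustration with hypothetical labels, not the project's actual evaluation code. It also counts false negatives and false positives, since the problem statement names false negatives as a concern and accuracy alone does not show them.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = Counter(tp=0, tn=0, fp=0, fn=0)
    for t, p in zip(y_true, y_pred):
        if t == positive:
            counts["tp" if p == positive else "fn"] += 1
        else:
            counts["fp" if p == positive else "tn"] += 1
    return counts

# Hypothetical test labels: 1 = cancerous, 0 = non-cancerous
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

acc = accuracy(y_true, y_pred)
counts = confusion_counts(y_true, y_pred)
```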

Project Implementation

Design and Development Process


The implementation of the Lung Cancer Detection System using CNN involved multiple
stages, from data preprocessing to model training and evaluation. The core of this project is
built around ResNet-18, a pre-trained deep learning model, which was fine-tuned to classify
lung cancer from CT scan images.

Dataset Preparation:
• The IQ-OTH/NCCD - Lung Cancer Dataset consisting of 1190 images was used for
training and testing.
• Images were resized and normalized for better model performance.
• Data augmentation techniques such as rotation, flipping, and contrast adjustments
were applied to improve generalization.
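
The preparation steps above can be illustrated with a small NumPy sketch. It is a simplified stand-in, not the project's pipeline: in practice OpenCV or torchvision would normally handle resizing and augmentation. The function names, the contrast range, and the use of ImageNet channel statistics for normalization are assumptions, and rotation is omitted for brevity.

```python
import numpy as np

# Commonly used ImageNet channel statistics (assumed here, not stated in the report)
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def resize_nearest(image, size=224):
    """Resize an HxWxC image to size x size by nearest-neighbour sampling."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows[:, None], cols[None, :]]

def augment(image_uint8, rng):
    """Random horizontal flip plus a random contrast adjustment."""
    image = image_uint8.astype(np.float32)
    if rng.random() < 0.5:
        image = image[:, ::-1]              # horizontal flip
    factor = rng.uniform(0.8, 1.2)          # hypothetical contrast range
    mean = image.mean()
    image = (image - mean) * factor + mean  # stretch or compress around the mean
    return np.clip(image, 0, 255).astype(np.uint8)

def normalize(image_uint8):
    """Scale pixels to [0, 1], then standardize each channel."""
    scaled = image_uint8.astype(np.float32) / 255.0
    return (scaled - IMAGENET_MEAN) / IMAGENET_STD

rng = np.random.default_rng(0)
# Random array standing in for a loaded 512x512 RGB scan image
scan = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)
x = normalize(resize_nearest(augment(scan, rng)))
```

Augmentation is normally applied only to training images, while resizing and normalization are applied to every image, including those uploaded at inference time.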

Model Selection and Training:


• ResNet-18, a pre-trained convolutional neural network (CNN), was selected for its
efficiency in image classification.
• The model was trained using transfer learning, leveraging pre-learned weights to
improve accuracy.
• The dataset was split into training (80%) and testing (20%) subsets.
• The Adam optimizer and cross-entropy loss function were used for model
optimization.
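
The 80/20 split mentioned above can be sketched in plain Python. The report does not state how the split was drawn, so the stratified approach, the function name, and the label counts below are assumptions for illustration.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_indices, test_indices) with similar class proportions in both."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

# Hypothetical labels: 60 non-cancerous (0) and 40 cancerous (1) images
labels = [0] * 60 + [1] * 40
train_idx, test_idx = stratified_split(labels)
```

Because the dataset contains many slices per patient case, splitting by case rather than by individual image would keep slices from the same patient out of both subsets.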

System Development:
• A user-friendly interface was designed to allow users to upload X-ray/CT Scan
images.
• The system processes the input image and classifies it as positive (cancer detected) or
negative (no cancer detected).
• The Flask framework was used to integrate the model into a web-based interface.
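
A minimal sketch of such a Flask upload endpoint is shown below. It is not the project's actual application: the route name `/predict`, the form field `image`, and the `classify_image` helper are hypothetical, and the helper returns a placeholder instead of running the trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def classify_image(image_bytes):
    """Hypothetical placeholder: a real system would decode and preprocess the
    image here and run it through the trained ResNet-18."""
    return {"label": "placeholder", "note": "model not loaded"}

@app.route("/predict", methods=["POST"])
def predict():
    # The uploaded file is expected in the multipart form field "image"
    uploaded = request.files.get("image")
    if uploaded is None or uploaded.filename == "":
        return jsonify({"error": "no image uploaded"}), 400
    result = classify_image(uploaded.read())
    return jsonify(result)
```

Saved in a file such as `app.py`, the sketch can be served locally with the `flask run` command and tested by posting an image file to `/predict`.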

Challenges Faced and Solutions
• Imbalanced Dataset: The dataset had more normal cases than cancer cases.
Solution: Data augmentation techniques were used.
• Overfitting: Initial training resulted in overfitting. Solution: Dropout layers
and L2 regularization were applied.
• Processing Speed: Large images slowed down inference time. Solution:
Images were resized to 224×224 pixels for faster processing.
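
The report handles class imbalance with data augmentation. A related technique mentioned in the literature review, random oversampling, can be sketched as follows. This is an illustration with hypothetical file names, not the method used in the project.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels

# Hypothetical file names: 6 normal images and 3 cancer images
samples = [f"normal_{i}.jpg" for i in range(6)] + [f"cancer_{i}.jpg" for i in range(3)]
labels = ["normal"] * 6 + ["cancer"] * 3
balanced_samples, balanced_labels = random_oversample(samples, labels)
```

Oversampling is usually applied to the training subset only, so that duplicated images do not appear in the test set.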

Screenshots of the Working System

Conclusion

The implementation of a CNN-based lung cancer detection system marks a significant
advancement in medical imaging and early diagnosis. Lung cancer remains one of the leading
causes of mortality worldwide, primarily due to delayed detection and reliance on manual
screening methods. The proposed deep learning-based approach enhances the accuracy and
efficiency of detecting cancerous lung nodules from medical scans, thereby reducing the
dependency on human analysis and minimizing diagnostic errors.

By utilizing the IQ-OTH/NCCD dataset, which contains 1,190 images, the model has been
trained to classify lung scans as cancerous or non-cancerous. The methodology involved
image preprocessing, CNN model training, hyperparameter tuning, and performance
evaluation. The model reached 95% accuracy, which suggests that deep learning-based
approaches can offer a fast and reliable aid to traditional diagnostic techniques.

Despite its success, the model has room for further improvements, including multi-class
classification, integration with hospital systems, real-time detection through mobile
applications, and enhanced interpretability using Explainable AI (XAI). Future advancements
in transfer learning, dataset expansion, and multimodal diagnostic approaches can further
refine the system, making it more robust and clinically applicable.

In conclusion, this project establishes a strong foundation for AI-driven lung cancer
diagnostics. By integrating such intelligent automated detection systems into healthcare
infrastructure, we can significantly improve early detection rates, enhance patient outcomes,
and ultimately contribute to reducing lung cancer-related mortality. With continuous research,
collaboration, and technological advancements, this system can become a vital tool in modern
medical diagnostics, paving the way for a future where AI plays a crucial role in cancer
detection and treatment planning.

Future Scope of Lung Cancer Detection using CNN
The implementation of deep learning-based lung cancer detection using Convolutional Neural
Networks (CNN) has demonstrated significant potential in assisting early diagnosis.
However, there is vast scope for further development and improvement in various aspects of
the project.

1. Enhanced Model Accuracy and Efficiency


Future advancements in model optimization can improve accuracy and efficiency. By
integrating more advanced CNN architectures such as EfficientNet, DenseNet, or hybrid deep
learning models, the detection system can achieve higher precision in differentiating
cancerous and non-cancerous lung scans.

2. Large-Scale and Diverse Dataset


Currently, the system uses the IQ-OTH/NCCD dataset, which consists of 1,190 images.
Expanding the dataset by incorporating more diverse images from various sources, including
real-time clinical data, can enhance the model's robustness and generalization capabilities.

3. Multi-Class Classification
The current model focuses on binary classification (cancerous or non-cancerous). Future
improvements could include multi-class classification, differentiating between various stages
of lung cancer, tumor types, and severity levels, allowing for more detailed diagnosis.

4. Integration with Clinical Workflows


To enhance real-world applicability, the CNN-based system can be integrated with hospital
diagnostic workflows. This includes the deployment of the model in Picture Archiving and
Communication Systems (PACS) used by radiologists, enabling seamless automated analysis
of CT scan images.

5. Real-Time Detection with Mobile and Web Applications


Developing a mobile or web-based application can allow healthcare professionals to access
lung cancer detection models remotely. AI-powered cloud-based services can enable real-
time lung cancer screening, especially for regions with limited access to radiologists.

6. Explainable AI and Interpretability
A significant challenge in deep learning is model interpretability. Future research can focus
on integrating explainable AI (XAI) techniques such as Grad-CAM or SHAP to provide
insights into how the model makes predictions, increasing trust and usability among medical
professionals.

7. Combination with Other Diagnostic Methods


Lung cancer detection accuracy can be improved by integrating the CNN-based system with
other diagnostic methods, such as blood biomarkers, genetic data, and patient history
analysis. A multi-modal approach would enhance diagnostic precision and provide a holistic
view of a patient's condition.

8. Continuous Learning and Model Adaptation


Future iterations of the system can incorporate continual learning mechanisms where the
model updates itself with new data over time, improving performance and adapting to new
patterns in lung cancer detection.

9. Global Healthcare Adoption and Research Collaborations


Collaborations with global medical institutions and AI research centers can further refine the
CNN-based model. Establishing AI-driven lung cancer screening programs in
underdeveloped regions can significantly impact early diagnosis and reduce lung cancer
mortality rates worldwide.

References

[1] M. Mamun, M. I. Mahmud, M. Meherin, and A. Abdelgawad, “LCDctCNN: Lung Cancer
Diagnosis of CT Scan Images Using CNN-Based Model,” University of South Dakota,
Central Michigan University, American International University-Bangladesh, 2023.
Available: [Research Paper 1].

[2] V. Lakide and V. Ganesan, “Precise Lung Cancer Prediction using ResNet-50 Deep
Neural Network Architecture,” J. Electron. Electromed. Eng. Med. Inform., vol. 7, no. 1, pp.
38–46, Jan. 2025, DOI: 10.35882/jeeemi.v7i1.518.

[3] M. Beldar, P. S. Chavan, T. B. Patil, S. Rajguru, S. Suman, and S. Pandey, “A
Comparative Analysis for Early Diagnosis of Lung Cancer Detection and Classification by
CT Images Processing Using ResNet-50 Model of CNN,” Eur. Chem. Bull., vol. 12, no. S3,
pp. 191–201, Mar. 2023, DOI: 10.31838/ecb/2023.12.s3.027.

[4] A. Saha, S. M. Ganie, P. K. D. Pramanik, R. K. Yadav, S. Mallik, and Z. Zhao, “VER-Net:
A Hybrid Transfer Learning Model for Lung Cancer Detection Using CT Scan Images,” BMC
Med. Imaging, vol. 24, p. 120, 2024, DOI: 10.1186/s12880-024-01238-z.

[5] T. M., M. S. Koti, B. A. Nagashree, V. Geetha, K. P. Shreyas, S. K. Mathivanan, and G. T.
Dalu, “Lung Cancer Diagnosis Based on Weighted Convolutional Neural Network Using
Gene Data Expression,” Sci. Rep., vol. 14, p. 3656, 2024, DOI: 10.1038/s41598-024-54124-7.

Appendix

Dataset Information
The Iraq-Oncology Teaching Hospital/National Center for Cancer Diseases (IQ-OTH/NCCD)
lung cancer dataset was collected in the above-mentioned specialist hospitals over a period of
three months in fall 2019. It includes CT scans of patients diagnosed with lung cancer in
different stages, as well as healthy subjects. IQ-OTH/NCCD slides were marked by
oncologists and radiologists in these two centers. The dataset contains a total of 1190 images
representing CT scan slices of 110 cases. These cases are grouped into three classes: normal,
benign, and malignant. Of these, 40 cases are diagnosed as malignant, 15 cases are diagnosed
as benign, and 55 cases are classified as normal. The CT scans were originally collected
in DICOM format. The scanner used is SOMATOM from Siemens. The CT protocol includes
120 kV, a slice thickness of 1 mm, and breath hold at full inspiration; a window width ranging
from 350 to 1200 HU and a window center from 50 to 600 HU were used for reading. All
images were de-identified before performing analysis. Written consent was waived by the
oversight review board. The study was approved by the institutional review board of
participating medical centers. Each scan contains several slices. The number of these slices
ranges from 80 to 200, each of them representing an image of the human chest from
different sides and angles. The 110 cases vary in gender, age, educational attainment, area of
residence and living status. Some of them are employees of the Iraqi ministries of Transport
and Oil, others are farmers and gainers. Most of them come from places in the middle region
of Iraq, particularly the provinces of Baghdad, Wasit, Diyala, Salahuddin, and Babylon.
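
The window width and window center mentioned above determine how Hounsfield units (HU) are mapped to displayed grey levels. The sketch below illustrates that mapping under stated assumptions: the function name is hypothetical, the chosen center (50) and width (350) are taken from within the ranges quoted above, and the sample HU values are rough illustrative figures. It applies to raw DICOM intensities, not to images that have already been exported to a picture format.

```python
import numpy as np

def apply_window(hu, center, width):
    """Map Hounsfield units to [0, 1] using a display window (center, width)."""
    lower = center - width / 2.0
    upper = center + width / 2.0
    clipped = np.clip(np.asarray(hu, dtype=np.float32), lower, upper)
    return (clipped - lower) / (upper - lower)

# Illustrative HU values, roughly: air, lung tissue, water, soft tissue, bone
hu_values = [-1000, -700, 0, 50, 1000]
windowed = apply_window(hu_values, center=50, width=350)
```

Values below the window are mapped to 0 (black), values above it to 1 (white), and the window center maps to mid-grey.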
