Artificial intelligence project applied in

A Deep Learning Approach For Lung Cancer

Detection and Patient Support

Documentation and Applied Study, in partial

fulfillment of the requirements for the degree of
Bachelor of Science in Bio Artificial
Intelligence

Omar Ibrahim Mohamed Elseidy
Ali Mostafa Mostafa Erfan
Omar Shaaban Ramadan Suliman
Kareem Ahmed Ibrahim Dwidar
Anas Ramzy El Sayed El Nemr
Mohamed Ahmed Aboelfetooh Moftah
Malak Mohamed elbordany
Khalid Ayman Ragab Elmaria

Under the Supervision of

Dr. Fatma M. Talaat

Assistant Professor

Eng. Abdelmawala Yousef

2024
Dedication

To my supportive parents,
who always believe in my
abilities and encourage me
to face challenges. Thanks
to you, I was able to achieve
this accomplishment. With
all my love and gratitude.

Table of Contents

LIST OF FIGURES
NOMENCLATURE
ACKNOWLEDGEMENT
ABSTRACT

CHAPTER 1: INTRODUCTION
1.1 OVERVIEW
1.2 PROBLEM DEFINITION AND OBJECTIVES
1.3 PROPOSED SOLUTION
1.4 MOTIVATION
1.5 PROJECT PLAN

CHAPTER 2: LUNG CANCER
2.1 OVERVIEW
2.2 TYPES OF LUNG CANCER
2.3 SYMPTOMS
2.4 PREVENTION
2.5 DIAGNOSIS
2.6 TREATMENT AND CARE
2.7 STAGES OF CARE
2.8 CLINICAL TRIALS
2.9 LUNG CANCER SCANS

CHAPTER 3: LITERATURE REVIEW

CHAPTER 4: PROPOSED TECHNIQUE
4.1 INPUT AND OUTPUT
4.2 DATASET
4.3 DATA PREPROCESSING OF CT SCANS
4.4 TRANSFER LEARNING WITH MOBILENET
4.5 DISCUSSION
4.6 COMPARISON
4.7 FLOWCHART OF THE APPLICATION
4.8 CONCLUSION

CHAPTER 5: IMPLEMENTATION
5.1 DATA
5.2 DATA PREPROCESSING
5.3 PROGRAMMING LANGUAGE
5.4 ACCURACY
5.5 CONFUSION MATRIX
5.6 MODEL SELECTION

CHAPTER 6: LAYOUT
6.1 UI/UX
6.2 WEB SITE
6.3 APPLICATION

CHAPTER 7: USED SOFTWARE
7.1 INTRODUCTION
7.2 TOOLS USED

CHAPTER 8: CONCLUSION
8.1 CONCLUSION
8.2 FUTURE WORK

CHAPTER 9: REFERENCE

List of Figures

Figure 1-1 AI in Health Care
Figure 2-1 Lung Cancer Cell
Figure 4-2 Application Flow chart
Figure 5-1: Normal CT scan.
Figure 5-3: Adenocarcinoma LC CT scan.
Figure 5-4: Large cell carcinoma CT scan.
Figure 5-5: Squamous cell carcinoma CT scan.
Figure 5-6 Accuracy Equation
Figure 5-7 Confusion Matrix Table
Figure 5-8: CT scan from Validate file.
Figure 5-9: CT scan from Train file.
Figure 5-10: CT scan from Test file.
Figure 5-11: VGG16 Architecture
Figure 5-12: EfficientNet-B0 Architecture
Figure 5-13: Mobilenet Architecture
Figure 5-14: Mobilenet Architecture 2
Figure 6-1: Sign up page.
Figure 6-2: Home Page
Figure 6-3: Patient History
Figure 6-5 Diagnosis & Treatment Pa
Figure 6-6 Patient History
Figure 6-7 Print As PDF
Figure 6-8 Application Sign Up
Figure 6-9 Application Log In
Figure 6-10 Application Home Page
Figure 6-11 Application Guide
Figure 6-12 Lung cancer types
Figure 7-1: Colab Interface
Figure 7-2: Kaggle Interface
Figure 7-3 Android Studio interface


Nomenclature

• LDCT : Low-dose computed tomography
• PET : Positron emission tomography
• CT : Computed tomography
• MRI : Magnetic resonance imaging
• FP : False positive
• FN : False negative
• LDA : Linear discriminant analysis
• ODNN : Optimal deep neural network
• CAD : Computer-aided diagnosis
• CDSS : Clinical Decision Support System
• CNN : Convolutional Neural Network
• LIDC : Lung Image Database Consortium
• RNN : Recurrent neural network
• SFT : Sequential fine-tuning
• FROC : Free-response receiver operating characteristic curve
• AUC : Area under the receiver operating characteristic curve
Acknowledgement

We would like to express our sincere gratitude to everyone who contributed to the
development of the lung cancer detection and type identification project.

First, we would like to thank our advisors and mentors, who provided invaluable guidance and support throughout the project. Your comments helped us refine our ideas and make critical decisions. We would also like to thank our supervisor, Dr. Fatma M. Talaat, for her guidance and support throughout the research and experimentation process.

Additionally, we would like to thank the broader community of AI and medical researchers and enthusiasts for inspiring us and pushing the boundaries of what is possible. Your work has paved the way for projects like ours, and we are excited to be a part of this space.

Thank you all for your contributions to this project and your continued support of
our work.
Abstract

Lung cancer remains the leading cause of cancer death worldwide, and early, accurate detection is critical to improving patient outcomes. Artificial intelligence (AI) uses complex algorithms and software to mimic human cognition in analyzing and interpreting complex data, and it is already being applied successfully in various healthcare settings. Because AI can quantify information in images and recognize complex patterns that humans may miss, it has the potential to help clinicians interpret LDCT images obtained in the lung cancer screening setting. This book, Intelligent Imaging: A Deep Learning Approach to Lung Cancer Detection and Patient Support, explores the potential of deep learning to revolutionize lung cancer diagnosis and details the development of a deep learning model to identify lung cancer in chest scans. It covers the technical aspects of the model while emphasizing its real-world applicability.
CHAPTER 1: INTRODUCTION

Figure 1-1 AI in Health Care


1.1 OVERVIEW

Lung cancer is a malignant tumor that arises from lung cells, especially within the epithelial lining of the bronchi, bronchioles, or alveoli. It is widespread and associated with high mortality rates on a global scale. In its early stages, lung cancer shows no symptoms or only mild ones, so it is usually diagnosed at an advanced stage. This delay reduces the effectiveness of treatment and the likelihood of long-term survival. The two main types of lung cancer are non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC). The two most common subtypes of NSCLC are lung squamous cell carcinoma (LUSC) and lung adenocarcinoma (LUAD). Precise classification into LUSC, LUAD, and SCLC plays a more important role in determining prognosis than a simple benign-versus-malignant classification. Accurate classification of lung cancer at the initial diagnosis stage significantly improves treatment efficacy and thus increases patient survival. Positron emission tomography (PET) and computed tomography (CT) are widely used as non-invasive diagnostic imaging modalities in clinical practice, where they serve as valuable tools for establishing the specific diagnosis of lung cancer.

Early detection and treatment of lung cancer through effective screening methods is vital to improving patient outcomes. Results from the National Lung Screening Trial showed that low-dose helical CT screening reduces mortality in high-risk populations more effectively than chest radiography. However, the lung cancer screening process is prone to false positive (FP) results, which increase costs through unwarranted medical interventions and may cause psychological distress. Computer-aided diagnosis offers notable advantages in lung cancer detection, including broader scope for early cancer screening and fewer FP findings throughout the diagnostic process.

In the field of lung cancer detection, there have been notable developments in methods and technologies that enhance early diagnosis and treatment effectiveness. Liquid biopsies test blood samples for cancer. These diagnostic tests can identify genetic abnormalities and alterations associated with lung cancer, providing a non-invasive way to diagnose the disease and monitor the effectiveness of treatment. Low-dose computed tomography (LDCT) scanning has emerged as a widely applied method for timely identification of lung cancer. LDCT scans use lower radiation levels than traditional CT scans while still providing high-resolution images of the lung area. New bronchoscopy methodologies, including electromagnetic navigation bronchoscopy and robot-assisted bronchoscopy, facilitate minimally invasive biopsies of lung lesions and promote rapid, accurate diagnosis. The integration of genomics, proteomics, and metabolomics has allowed the development of diverse strategies for lung cancer identification. These methodologies use several molecular markers to enhance diagnostic accuracy and discover potential targets for therapeutic intervention.

Deep learning (DL)-based lung cancer screening techniques can reduce mortality by detecting the disease at its initial stage. They can help reduce false negatives (FN) by detecting subtle or early indicators of lung cancer that humans may overlook. DL algorithms can combine imaging modalities, including computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET), to characterize the disease and aid in diagnosis and treatment planning. Cancer staging reflects the extent of the disease, and a combination of imaging methods and biopsies of suspicious tissue identifies the type of cancer. Staging helps caregivers choose among chemotherapy, immunotherapy, radiation, and surgical strategies. A higher cancer stage is associated with a higher mortality rate, and the effectiveness of medical treatments depends on the stage. More accurate and trustworthy diagnoses can help reduce misdiagnosis and unwarranted care. The lung cancer screening process produces large amounts of medical imaging data. DL models can analyze CT scans, chest radiographs, and other imaging modalities effectively because they handle large volumes of data well. Using DL models, doctors can gain insight into a patient's condition by combining data from multiple imaging modalities and other clinical data sources.

Due to privacy concerns and the cost of data acquisition, medical image datasets are typically small. Pre-trained models transfer knowledge learned from larger, general-purpose image datasets to medical images, enabling models to be trained with limited medical data. These models can extract hierarchical information from images, including fine details and important patterns, and the extracted features then feed medical image classification models.

1.2 PROBLEM DEFINITION AND OBJECTIVES

Despite significant advances in the diagnosis and treatment of lung cancer, the disease is still associated with poor clinical outcomes, and survival is strongly determined by the stage of disease at diagnosis. Whereas the five-year survival rate for patients with early-stage disease is 56%, it is less than 5% for those with advanced disease. Considering that only 16% of lung cancers are diagnosed at an early stage and that most patients present with advanced disease, developing screening tests capable of detecting the disease in its initial stages has been a long-term goal in lung cancer care.

Several screening methods have been tested so far, including sputum cytology, chest radiographs (CXR), low-dose computed tomography (LDCT), and, more recently, the analysis of various biomarkers. However, data from clinical trials indicate that only LDCT scanning in heavy smokers has been associated with a significant reduction in lung-cancer-related mortality.

Although the introduction of targeted therapies and immunotherapeutic agents, especially immune checkpoint inhibitors (used alone or in combination with standard chemotherapeutic regimens), has resulted in longer overall survival than standard chemotherapy, these novel therapies are not effective in all patients. Early detection therefore remains the most important intervention window for improving patient survival.

Although LDCT screening for lung cancer has demonstrated a clear benefit in reducing all-cause mortality, the high rate of false positives and the cost of the unnecessary diagnostic procedures needed to confirm or rule them out are important limitations of this approach.
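
To illustrate why false positives weigh so heavily in a screening setting, the following minimal sketch computes the positive predictive value (PPV), the fraction of positive results that are true cancers, from sensitivity, specificity, and prevalence using Bayes' rule. The function name and all numeric values are illustrative assumptions; they are not results from this project or from any clinical trial.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Fraction of positive screening results that are true positives."""
    # Expected share of the screened population that is a true positive.
    true_pos = sensitivity * prevalence
    # Expected share of the screened population that is a false positive.
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical scenario (assumed values, for illustration only):
# 90% sensitivity, 80% specificity, 1% of screened people have cancer.
ppv = positive_predictive_value(sensitivity=0.90, specificity=0.80, prevalence=0.01)
print(f"PPV = {ppv:.3f}")
```

Under these assumed numbers only about 4% of positive results correspond to an actual cancer, which is why even a modest gain in specificity can remove a large share of unnecessary follow-up procedures.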


1.3 PROPOSED SOLUTION

The emergence of artificial intelligence as a new tool for evaluating medical data opens new opportunities to improve the diagnosis and treatment of various human diseases. In the case of lung cancer diagnosis, coupling AI algorithms with available clinical and biomedical data has the potential to improve lung cancer screening methods. For example, AI can improve the analysis and interpretation of lung images obtained via magnetic resonance imaging (MRI) or computed tomography (CT), and it can be useful in deciphering the clinical significance of data derived from tissue or fluid biomarkers. Accordingly, we decided to design an artificial intelligence model that detects cancer and accurately determines its type from CT scans.

1.4 MOTIVATION

• Combating the Leading Cancer Killer

Lung cancer remains the leading cause of cancer-related deaths globally, claiming more lives than breast, prostate, and pancreatic cancer combined.

Early detection is crucial for improving survival rates and potentially achieving complete cure.

• Deep Learning's Potential for Early Detection

Deep learning algorithms excel at analyzing complex medical images like CT scans and PET scans.

They can identify subtle patterns and features indicative of early-stage lung cancer, which might be missed by human analysis alone.

• Enhanced Accuracy and Efficiency

Deep learning models can be trained on vast datasets of medical images, leading to highly accurate predictions and classifications.

This can assist radiologists in their diagnoses, potentially reducing human error and improving overall detection efficiency.

• Personalized Patient Support

Deep learning algorithms can analyze patient data beyond just images, including genetic information and medical history.

This can aid in personalized treatment plans, risk assessments, and targeted therapies tailored to individual patients.

• Potential for Real-Time Applications

As deep learning technology advances, it has the potential to be integrated into real-time diagnostic tools.

This could enable faster diagnoses during medical procedures, leading to quicker treatment initiation and improved patient outcomes.

• Public Health Benefits

Widespread adoption of accurate and efficient lung cancer detection methods powered by deep learning can significantly impact public health.

Early detection translates to better treatment outcomes, reduced mortality rates, and potentially a decrease in the overall burden of lung cancer on healthcare systems.

1.5 PROJECT PLAN

First, we drew up a plan for the project by listing its main points, and then we began researching how to reach our goal. The main points are:

1. Searching for data (Chapter 4)

• We started by searching for lung cancer CT scans.
• We used Kaggle and GitHub as our sources.
• We found four types of scan, so our model has four classes.

2. Selecting the model (Chapter 4)

• We searched for the best model for image classification and object detection so that the model can detect well.
• Our goal was a model with many layers. We found three candidates (VGG16, MobileNet, and EfficientNet-B0) and had to choose the best of them.

3. UI/UX (Chapter 6)

• We needed a design that fits the setting, so we chose designs similar to those used by hospitals and clinics so that the application feels familiar to the user.

4. Application and web site (Chapter 6)

• We built a web site and an application that doctors and patients can use easily, so a patient does not need a doctor's help to use the app.

5. Merging (Chapter 5)

• We needed to connect the application to the model.
• We found FastAPI to be the best tool for this job, so we use it to link our model with the application and web site.
CHAPTER 2: LUNG CANCER

Figure 2-1 Lung Cancer Cell

2.1 OVERVIEW

Lung cancer is a type of cancer that starts when abnormal cells grow in an
uncontrolled way in the lungs. It is a serious health issue that can cause severe harm
and death.

Symptoms of lung cancer include a cough that does not go away, chest pain and
shortness of breath.

It is important to seek medical care early to avoid serious health effects. Treatments
depend on the person’s medical history and the stage of the disease.

The most common types of lung cancer are non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC). NSCLC is more common and grows slowly, while SCLC is less common but often grows quickly.

Lung cancer is a significant public health concern, causing a considerable number of deaths globally. GLOBOCAN 2020 estimates of cancer incidence and mortality produced by the International Agency for Research on Cancer (IARC) show that lung cancer remains the leading cause of cancer death, with an estimated 1.8 million deaths (18%) in 2020.

Smoking tobacco (including cigarettes, cigars, and pipes) is the primary risk factor for
lung cancer, but it can also affect non-smokers. Other risk factors include exposure to
secondhand smoke, occupational hazards (such as asbestos, radon and certain
chemicals), air pollution, hereditary cancer syndromes, and previous chronic lung
diseases.


2.2 TYPES OF LUNG CANCER

The main types of lung cancer are non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC).

2.2.1 Non-Small Cell Lung Cancer (NSCLC)

About 80% to 85% of lung cancers are NSCLC. The main subtypes of NSCLC are adenocarcinoma, squamous cell carcinoma, and large cell carcinoma. These subtypes, which start from different types of lung cells, are grouped together as NSCLC because their treatment and prognoses (outlooks) are often similar.

• Adenocarcinoma: Lung adenocarcinoma starts in the mucus-producing epithelial cells that line the surface of the lungs. It is the most common type of non-small cell lung cancer. It occurs mainly in people who smoke or used to smoke, but it is also the most common type of lung cancer seen in people who don’t smoke. It is more common in women than in men, and it is more likely than other types of lung cancer to occur in younger people.
• Squamous cell carcinoma: Squamous cell carcinoma starts in squamous cells, which are flat cells that line the inside of the airways in the lungs. These cancers are often linked to a history of smoking and tend to be found in the central part of the lungs, near a main airway (bronchus).
• Large cell (undifferentiated) carcinoma: Large cell carcinoma can appear in any part of the lung. It tends to grow and spread quickly, which can make it harder to treat. A subtype of large cell carcinoma, known as large cell neuroendocrine carcinoma (LCNEC), is a fast-growing cancer that is very similar to small cell lung cancer.
• Other subtypes: A few other subtypes of NSCLC, such as adenosquamous carcinoma and sarcomatoid carcinoma, are much less common.

2.2.2 Small Cell Lung Cancer (SCLC)

This type of lung cancer tends to grow and spread faster than NSCLC. In most people with SCLC, the cancer has already spread beyond the lungs at the time it is diagnosed. Since this cancer grows quickly, it tends to respond well to chemotherapy and radiation therapy. Unfortunately, for most people the cancer will return at some point.

SCLC continues to carry a poor prognosis, with a five-year survival rate of 3.5% and a 10-year survival rate of 1.8%.

2.3 SYMPTOMS

Lung cancer can cause several symptoms that may indicate a problem in the lungs. The most common symptoms include:

• a cough that does not go away
• chest pain
• shortness of breath
• coughing up blood (hemoptysis)
• fatigue
• weight loss with no known cause
• lung infections that keep coming back

Early symptoms may be mild or dismissed as common respiratory issues, leading to delayed diagnosis.

2.4 PREVENTION

Not smoking tobacco is the best way to prevent lung cancer.

Other risk factors to avoid include:

• secondhand smoke
• air pollution
• workplace hazards like chemicals and asbestos

Early treatment can prevent lung cancer from becoming worse and spreading to other
parts of the body.

Prevention of lung cancer includes primary and secondary prevention measures.

Primary prevention aims to prevent the initial occurrence of a disease through risk
reduction and promoting healthy behavior. In public health, these preventive measures
include smoking cessation, promoting smoke-free environments, implementing
tobacco control policies, addressing occupational hazards, and reducing air pollution
levels.

Secondary prevention for lung cancer involves screening methods that aim to detect the disease in its early stages, before symptoms become apparent. Screening is indicated for high-risk individuals, in whom early detection can significantly increase the chances of successful treatment and improve outcomes. The primary screening method for lung cancer is low-dose computed tomography (LDCT).

2.5 DIAGNOSIS

Diagnostic methods for lung cancer include physical examination; imaging (such as chest X-rays, computed tomography scans, and magnetic resonance imaging); examination of the inside of the lung using bronchoscopy; taking a sample of tissue (biopsy) for histopathological examination and definition of the specific subtype (NSCLC versus SCLC); and molecular testing to identify specific genetic mutations or biomarkers to guide the best treatment option.

2.6 TREATMENT AND CARE

Treatments for lung cancer are based on the type of cancer, how much it has spread, and the person’s medical history. Early detection of lung cancer can lead to better treatments and outcomes. Treatment options include:

• surgery
• radiotherapy (radiation)
• chemotherapy
• targeted therapy
• immunotherapy

Surgery is often used in the early stages of lung cancer if the tumor has not spread to
other areas of the body. Chemotherapy and radiation therapy can help shrink the
tumor.

Doctors from several disciplines often work together to provide treatment and care of
people with lung cancer.

Supportive care is important for people with lung cancer. It aims to manage
symptoms, provide pain relief, and give emotional support. It can help to increase
quality of life for people with lung cancer and their families.


2.7 STAGES OF CARE

1-Early-stage disease

The primary treatment for early-stage lung cancer (i.e. tumor limited to the lung, with
no metastatic dissemination to distant organs or lymph nodes) is surgical removal of
the tumor through procedures such as lobectomy, segmentectomy, or wedge resection.
Neoadjuvant therapy (chemotherapy and/or radiation therapy before surgery) can help
reduce tumor size, making it more manageable for surgical removal. Adjuvant
treatment (chemotherapy and/or radiation therapy) is very often recommended after
surgery to reduce the risk of cancer recurrence. In cases where surgery is not feasible,
radiation therapy or stereotactic body radiation therapy (SBRT) may be used as the
primary treatment. Targeted therapy and immunotherapy may also be considered
based on specific tumor characteristics. Individualized treatment plans should be
discussed with healthcare professionals.

2-Advanced disease

The treatment for metastatic stage lung cancer, where the cancer has spread to distant
organs or lymph nodes, is based on various factors, including the patient's overall
health, the extent and location of metastases, histology, genetic profile, and individual
preferences. The primary goal is to prolong survival, alleviate symptoms, and improve
quality of life. Systemic therapies, such as chemotherapy, targeted therapy, and
immunotherapy, play a crucial role in the treatment of metastatic lung cancer.

Chemotherapy is often the first-line treatment for the majority of patients around the world and involves the use of drugs that circulate throughout the body to kill cancer cells. Combination chemotherapy regimens are commonly used, and the choice of drugs depends on factors such as the histological type of the cancer and the patient's general health conditions. Targeted therapy, designed to block the signaling pathways that drive the growth of cancer cells, is an important option for patients with specific genetic mutations or biomarkers identified in their tumor. Immunotherapy, specifically immune checkpoint inhibitors, has revolutionized the treatment of metastatic lung cancer. These drugs help to stimulate the immune system to recognize and attack cancer cells. Local treatments, such as radiation therapy and surgery, may be used to manage specific metastatic sites or alleviate symptoms caused by tumor growth.

2.8 CLINICAL TRIALS

Clinical trials are crucial because they offer patients access to novel treatments and help advance medical knowledge, potentially leading to new standard treatment protocols. Participation in such trials not only provides access to cutting-edge therapies but also contributes to the broader fight against cancer by supporting the development of more effective treatments.

2.9 LUNG CANCER SCANS

Medical imaging tools help radiologists diagnose lung diseases. Among these imaging methods, CT offers the most advantages: it shows the size, location, characteristics, and growth of a lesion, which helps characterize lung cancer and nodules. 4D CT allows more precise targeting of administered radiation, which greatly affects lung cancer management. An automatic detection system based on linear discriminant analysis (LDA) and an optimal deep neural network (ODNN) has been developed for lung cancer classification in lung CT images. LDA reduces the dimensionality of the extracted image features, and the ODNN, optimized with a modified gravitational search algorithm, is then applied to provide more accurate classification results. Compared with CT, LDCT is more sensitive for early-stage lung nodules and detects cancer with lower radiation. However, a reduction in lung cancer deaths has been demonstrated only in high-risk populations.

LDCT is recommended annually for high-risk smokers aged 55 to 74. Computed tomography yields significantly higher sensitivity and specificity for detecting lung nodules than PET scans, owing to reactive or granulomatous nodular disease. PET correlates well with longer progression times and overall survival rates. 18F-FDG PET has been applied to diagnose solitary pulmonary nodules. 18F-FDG PET is a critical choice for inpatient and advanced NSCLC cases being considered for radical radiotherapy. PET-assisted radiotherapy provides greater accuracy and cures approximately 32% of patients with stage IIIA lung cancer. 18F-FDG PET also provides an important assessment of response in patients with NSCLC undergoing induction chemotherapy.

MRI is the most powerful tool for lung imaging without ionizing radiation, but it provides limited information and is costly and time-consuming. It fails to detect approximately 10% of small lung nodules (4-8 mm in diameter). MRI with ultra-short echo time (UTE) can improve signal intensity and reduce susceptibility artifacts in the lung, and it is sensitive for detecting small lung nodules (4-8 mm). MRI achieves a higher lung nodule detection rate than LDCT. MRI with different pulse sequences has also improved the sensitivity of lung nodule detection, and T1-weighted and T2-weighted MRI have been investigated for detecting small lung nodules. Compared with 3T MRI, 1.5T MRI identifies ground-glass opacities much more easily. Ground-glass opacities have been successfully detected in 75% of people with lung fibrosis who received 1.5T MRI with SSFP sequences. MRI with T2-weighted fast spin echo provides similar or better performance for detecting ground-glass infiltration in immunocompromised subjects.

CHAPTER 3: LITERATURE REVIEW


Introduction: The diagnosis of some diseases can be aided by automation through computer-aided diagnosis (CAD). CAD uses software to segment, predict, localize, and classify findings, which are then used to infer the presence and severity of disease. This chapter reviews CAD methods used to identify cancerous nodules in lung CT scans. In general, CT allows doctors to recognize lung cancer nodules, especially when the nodules are large, which corresponds to a late stage of the disease. However, it is important to recognize nodules at an early stage, when they are usually small, before the patient develops a golf ball-sized lung tumor.

Several studies have investigated the use of deep learning algorithms for CT-based
lung cancer screening and diagnosis. In general, there are unique image attenuation
patterns in CT images for healthy and unhealthy scans. To distinguish the lungs from
the surrounding tissues, straightforward techniques such as numerical approaches,
gray-level thresholding, and shape-based approaches can be used to perform simple
lung segmentation.
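
The gray-level thresholding mentioned above can be sketched in a few lines. The example below is a minimal illustration in plain Python, assuming the CT slice is given in Hounsfield units, where air-filled lung tissue has strongly negative values and soft tissue lies near zero. The cut-off of -400, the function name, and the toy slice are assumptions chosen for illustration and are not taken from any of the reviewed studies.

```python
from typing import List

# Assumed cut-off between air-filled lung and soft tissue (illustrative only).
HU_THRESHOLD = -400


def threshold_mask(slice_hu: List[List[int]],
                   threshold: int = HU_THRESHOLD) -> List[List[int]]:
    """Return a binary mask: 1 where a pixel is darker than the threshold."""
    return [[1 if value < threshold else 0 for value in row] for row in slice_hu]


# Toy 4x4 "CT slice" with made-up Hounsfield values:
# soft tissue around +40, an air-filled region around -800.
toy_slice = [
    [40,   40,   40, 40],
    [40, -850, -800, 40],
    [40, -820,   30, 40],
    [40,   40,   40, 40],
]

mask = threshold_mask(toy_slice)
for row in mask:
    print(row)
```

A real pipeline would also have to discard the air outside the body and fill holes left by vessels and nodules, which is where the shape-based approaches come in; this sketch does not attempt either step.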

In [1], the authors present a CNN-based model for the automatic detection of lung cancer from lung CT scan images. They propose an algorithm called CNN-based Automatic Lung Cancer Detection (CNN-ALCD), which is based on supervised learning. The learned model can detect lung cancer in any newly arrived test sample. The solution comprises preprocessing, building a CNN with different layers, training the CNN model, and performing lung cancer detection. An empirical study showed that the CNN-based model outperforms many existing neural network-based methods, with a highest accuracy of 94.11%. The authors therefore suggest that the system can be integrated with a Clinical Decision Support System (CDSS) in healthcare units for automatic diagnosis of lung cancer.


In [2], the authors use a 2D Convolutional Neural Network (CNN) to classify tumors
found in the lung as malignant or benign. The method was evaluated on Kaggle CT
scans. Experimental results show that it achieves 88.76% accuracy in identifying lung
nodules from CT images, which is higher than the accuracy obtained by traditional
neural network systems.

In [3], Computed Tomography (CT) images from the Lung Image Database Consortium
(LIDC) dataset are used as training and testing data. Data preprocessing is performed
by cropping pixel regions, normalization, and other methods, and data augmentation
such as rotation and scaling is used to expand the pulmonary nodule sample library.
A Convolutional Neural Network (CNN) model is trained on the constructed sample
library to detect and segment pulmonary nodules and extract the nodule regions. The
size and regularity features of the nodules are then extracted, and lung cancer is
recognized according to nodule size and shape. The experimental results show that
this detection and identification method, based on a convolutional neural network
with morphological features, achieves higher accuracy.

In [4], a neural network technique is applied to examine cancerous growth in the
collected image datasets. Artificial intelligence and deep learning are used to
evaluate the cancerous growth, and supervised learning is implemented with deep
learning to improve the performance measures. A Convolutional Neural Network is the
strategy used for tumor detection. The work comprises the following stages: image
acquisition, image pre-processing, image enhancement, image segmentation, feature
extraction, and neural identification. In short, the machine learning technique
offers an innovative approach to enriching decision support in lung tumor treatment
at lower cost.


In [5], machine learning is presented as an efficient way to share the workload of
doctors and process large amounts of data to produce accurate results quickly. The
diagnosis approach consists of three phases: CT image pre-processing, deep learning,
and a Convolutional Neural Network. Pre-processing converts the raw data into a
usable form, the deep learning algorithm assigns weights to the data, and in the last
stage the CNN determines the health status of the lung, i.e. normal or abnormal.

In [6], three optimizers are combined with six deep learning models for a performance
comparison. The six models are AlexNet, GoogleNet, ResNet, Inception V3,
EfficientNet b0, and SqueezeNet. Each model is assessed with stochastic gradient
descent with momentum, Adam, and RMSProp. The study notes that training on a CPU
without GPU support is time-consuming. According to this study, GoogleNet with the
Adam optimizer achieves an accuracy of 92.08%, precision of 100%, recall of 86.89%,
F1-score of 92.98%, FPR of 0%, and FNR of 13.11%, outperforming the other deep
learning architectures. In terms of computational time, Inception V3 takes the
longest to train and AlexNet the shortest.

In [7], “Modern deep learning model advancements can be applied to create advanced
computer-aided diagnosis methods to find malignant nodules. The suggested method
classifies nodules seen in CT scan pictures as malignant or benign utilizing a Particle
Swarm Optimization-RNN. The identification and categorization of malignant
nodules has made substantial use of image analysis and neural networks. RNNs are
therefore more suited for the job of classifying and detecting nodules. Additional
characteristics of PSO-RNNs include multiple feature extraction. The suggested PSO-
RNN model, which makes use of the domain expertise of the CT scan pictures of the
lung in the department of medicine and Multilayer Perceptron, will be appropriate for
the early recognition and characterization of CT images including nodules with an
accuracy of 93.52%.”


In [8], the authors propose a lung nodule detection algorithm based on deep learning.
The method is intended for chest radiography, which has been shown to be an effective
tool for detecting pulmonary nodules in clinical practice. They propose a novel
convolutional neural network (CNN) architecture that learns to detect and classify
pulmonary nodules from medical images. The model is reported to obtain
state-of-the-art results on the lung nodule detection task (lndt), a challenging
benchmark data set, with high sensitivity and specificity, and to show promising
performance on other data sets, including a chest X-ray data set (cord). The research
uses deep learning, which is widely used in image recognition and pattern
recognition. Compared with other existing methods, the proposed method can detect the
presence or absence of pulmonary nodules in chest X-ray images with high accuracy.

In [9], lung cancer is described as an abnormal swelling of lung tissue that can be
life-threatening. According to statistics, it is responsible for more deaths than any
other type of cancer, so it is important to identify and treat it. Numerous
image-processing and soft-computing procedures are used to identify tumour cells from
CT scans. CT scan images are widely used in image processing because they are clear,
high-quality images with a high ppi (pixels per inch), which allows small nodes of
tissue (nodules) to be found. Detecting lung cancer at an early stage increases the
patient's chances of survival. Hence, an effective CAD system for lung tumour
detection is proposed. The system comprises three stages: initial processing,
segmentation, and classification of nodules. The study of lung disorders requires
accurate segmentation of lung images, which is very important in detecting lung
cancer. Lung images contain noise and weak boundaries, so accurate detection of lung
nodules is a challenging task. The paper provides a comprehensive review of methods
used for lung nodule detection.


In [10], a deep learning model is proposed to detect and predict lung cancer levels
from histopathological information. The model was trained and validated on 15,000
lung cancer histopathological images and achieved a prediction accuracy of 99.80%.

In [11], a CNN-based approach for the classification of lung cancer attained 95.62%
accuracy. The authors report that the solution performs best across the entire
dataset, that the proposed framework may help solve the overfitting issue that arises
during lung cancer classification tasks, and that it outperforms existing
state-of-the-art methods.

In [12], lung cancer is described as one of the most common and dangerous cancers in
the world. However, lives can be saved through early diagnosis using CT scan images,
which the authors consider the best imaging technique in the medical field for early
treatment. Even so, doctors and radiologists face difficulties in diagnosing early,
commencing treatment, and interpreting and identifying cancer from CT scan images
because of limited equipment and specialists. Computer-aided diagnosis can therefore
help doctors identify cancerous cells accurately. Computer-aided techniques based on
image processing and machine learning have been extensively researched and are
currently being implemented to address this issue.

In [13], lung cancer is identified as the second leading cause of the rising global
mortality rate, with over-consumption of tobacco and cigarettes as the major reasons.
Uncontrolled cell growth in the lung region affects the survival rate. Manual
interpretation for disease prediction can be challenging because of the rapid
increase in medical reports, so early detection of tumors from their manifestation
can be done through Computer Aided Diagnosis (CAD) techniques. The paper presents a
comprehensive analysis of different lung cancer detection techniques for predicting
whether a nodule is benign or malignant.

In [14], bronchoscopy inspection, as a follow-up procedure after radiological
imaging, is described as playing a key role in diagnosis and treatment design for
lung disease patients. When performing bronchoscopy, doctors must decide immediately
whether to perform a biopsy. Because biopsies may cause uncontrollable and
life-threatening bleeding of the lung tissue, doctors need to be selective with them.
To help doctors be more selective and to provide a second opinion on diagnosis, the
authors propose a computer-aided diagnosis (CAD) system for lung diseases, including
cancers and tuberculosis (TB). Based on transfer learning (TL), they propose a novel
TL method on top of DenseNet: sequential fine-tuning (SFT). Compared with traditional
fine-tuning (FT) methods, SFT achieves the best performance. On a data set of 81
normal cases, 76 TB cases, and 277 lung cancer cases, SFT provided an overall
accuracy of 82%, while other traditional TL methods achieved accuracies from 70% to
74%. The detection accuracies of SFT for cancers, TB, and normal cases are 87%, 54%,
and 91%, respectively. This indicates that the CAD system has the potential to
improve lung disease diagnosis accuracy in bronchoscopy and may help doctors be more
selective with biopsies.

In [15], lung cancer is described as a major contributor to global mortality, and
early identification is critical to improving patient outcomes. In recent years,
machine learning algorithms have shown promising results in identifying lung nodules
from medical images. The study presents a method for lung nodule detection using CT
images. It uses a hybrid model that combines multiple machine learning algorithms,
including CNN, SVM, DTC, ANN, and KNN, to improve the accuracy of nodule detection.
The hybrid model showed high accuracy in identifying various types of lung nodules,
including adenocarcinoma, squamous cell carcinoma, and large cell carcinoma.
Specifically, the model achieved an accuracy of over 90% in detecting and
differentiating normal lung tissue and adenocarcinomas. Accuracy graphs and priority
setting were used to assess the model's ability to predict the presence of lung
cancer. The efficiency of the hybrid model was also compared with other machine
learning algorithms, including SVM, Random Forest, and Decision Trees. A large
dataset of CT scans was collected for training and evaluation. The results
demonstrated the advantages of the hybrid model in terms of accuracy and efficiency.
The study highlights the importance of early lung nodule identification using CT
scans.

In [16], the authors note that automated analysis of structural imaging such as lung
Computed Tomography (CT) plays an increasingly important role in medical imaging
applications. Despite significant progress in image registration and segmentation
methods, lung registration and segmentation remain challenging. The paper presents a
novel approach with a new mathematical formulation to jointly segment and register
three-dimensional lung CT volumes. The algorithm is based on a level-set formulation
that merges a classic Chan–Vese segmentation with active dense displacement field
estimation. Combining registration with segmentation has two key advantages: it
eliminates the problem of initializing surface-based segmentation methods, and it
incorporates prior knowledge into the registration in a mathematically justified
manner, while remaining computationally attractive. The framework is evaluated on a
publicly available lung CT data set. The results show improved accuracy for the joint
segmentation and registration algorithm compared with registration and segmentation
performed separately.

In [17], a dataset of 1222 patients with lung adenocarcinoma was retrospectively
enrolled from three medical institutions. Anonymized preoperative CT images and
pathological labels of atypical adenomatous hyperplasia, adenocarcinoma in situ,
minimally invasive adenocarcinoma, and invasive adenocarcinoma (IAC) with five
predominant components were obtained. These pathological labels were divided into
2-category (IAC; non-IAC), 3-category, and 8-category classifications. The
classification of histological subtypes was modeled using a modified ResNet-34 deep
learning network, radiomics strategies, and a combined deep radiomics algorithm.
Prognostic models were then established for lung adenocarcinoma patients with
survival outcomes. Accuracy (ACC), area under the ROC curve (AUC), and C-index were
the primary measures used to evaluate the algorithms.

The study included a training set (n = 802) and two validation cohorts (internal, n =
196; external, n = 224). In internal validation, the deep radiomics algorithm
achieved an ACC of 0.8776 and 0.8061 in the 2-category and 3-category
classifications, respectively. Even in the 8-category classification, the AUC ranged
from 0.739 to 0.940 in the internal set. The prognosis model achieved a C-index of
0.892 (95% CI: 0.846–0.937) in the internal validation set.

In [18], participants were recruited prospectively at two rural sites in western
China. A deep learning system was developed to assist clinicians in identifying
nodules and evaluating malignancy, with performance assessed by recall, free-response
receiver operating characteristic curve (FROC), accuracy (ACC), and area under the
receiver operating characteristic curve (AUC).

The study enrolled 12,360 participants scanned by a mobile CT vehicle and detected
9511 (76.95%) patients with pulmonary nodules. The majority of participants were
female (8169, 66.09%) and never-smokers (9784, 79.16%). After 1-year follow-up, 86
patients were diagnosed with lung cancer, of whom 80 (93.03%) had adenocarcinoma and
73 (84.88%) were at stage I. The deep learning system detected nodules (recall of
0.9507; FROC of 0.6470) and stratified risk (ACC of 0.8696; macro-AUC of 0.8516)
automatically.


Approach | Description | Strengths | Weaknesses

CNN-ALCD [1] | Automatic detection using CNN, high accuracy | High accuracy (94.11%), CDSS integration | High computational resources, needs large, labeled dataset

2D CNN for Classification [2] | Classifies malignant/nonmalignant cells, good accuracy | Efficient classification, higher accuracy than traditional methods | Lower accuracy than some advanced models, limited to 2D

Deep Learning with Histopathological Data [10] | Trained on histopathological images, very high accuracy | Extremely high accuracy (99.80%), detailed tissue-level data | Requires histopathological data, high complexity

CNN-Based Classification [11] | CNN for lung cancer classification, addresses overfitting | High accuracy (95.62%), handles overfitting | Needs large dataset, dependent on input quality


CHAPTER 4: PROPOSED TECHNIQUE


In this chapter, we describe the technique proposed for our project. The approach
consists of several steps, including pre-processing CT images and using a
convolutional neural network (CNN) to detect and classify lung nodules.

The primary contribution of this project is to provide healthcare practitioners with
advanced AI algorithms designed to rapidly and accurately detect and
assess lung cancer in computed tomography (CT) scans. Through comprehensive
exploration of deep learning techniques, the goal is to create a robust system capable
of autonomously categorizing various stages and types of lung cancer identified in CT
images. By harnessing the power of convolutional neural networks (CNNs), the
ultimate aim is to redefine diagnostic precision and effectiveness in clinical practice,
pushing the boundaries of medical imaging and elevating patient care worldwide. This
chapter will discuss preprocessing steps and the proposed model.

4.1 INPUT AND OUTPUT

Input: CT Scan Images

Output: Detection and Classification of Lung Nodules

4.2 DATASET

The dataset used for this project includes a collection of CT scan images of lungs.
Each image is labeled with the presence or absence of lung nodules, as well as the
characteristics of any nodules present.

Source: [Medical Imaging in Lung Cancer | Kaggle]


4.3 DATA PREPROCESSING OF CT SCANS

Very little data was available on the disease, so we applied data augmentation to
obtain more training data.

4.3.1 Data Augmentation

Data augmentation is a technique commonly used in machine learning, particularly in
computer vision tasks such as image classification, object detection, and
segmentation. It involves applying a variety of transformations to the existing
training data to generate new, slightly modified samples.
The goal of data augmentation is to increase the diversity of the training dataset,
thereby improving the generalization and robustness of the trained model.
Data augmentation techniques for image data include:

• Rotation
• Translation
• Scaling
• Flipping
• Shearing
• Zooming
• Brightness and Contrast Adjustment
• Noise Injection

4.3.2 Preprocessing

We preprocessed the data so that it is in a form the model can consume. The
preprocessing steps are:

• Resize Images
• Channel Ordering
• Mean Subtraction
• Batching
• Normalization
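
The preprocessing steps listed above can be sketched with NumPy as follows. This is only an illustration under stated assumptions: a 224x224 target size, a channels-last layout, 8-bit input pixels, and a per-channel mean of 0.5 are all hypothetical choices, as are the helper names `resize_nearest`, `preprocess`, and `make_batches`.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Resize an HxWxC image with nearest-neighbour sampling."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return image[rows[:, None], cols]

def preprocess(image, size=224, channel_mean=(0.5, 0.5, 0.5)):
    """Resize, scale to [0, 1], and subtract an assumed per-channel mean."""
    image = np.asarray(image)
    if image.ndim == 2:
        # Channel ordering: grayscale slice -> 3 identical channels, channels last.
        image = np.stack([image] * 3, axis=-1)
    image = resize_nearest(image, size, size).astype(np.float32)
    image /= 255.0                                        # normalization of 8-bit pixels
    image -= np.asarray(channel_mean, dtype=np.float32)   # mean subtraction
    return image

def make_batches(images, batch_size):
    """Yield arrays of shape (batch, H, W, C)."""
    for start in range(0, len(images), batch_size):
        yield np.stack(images[start:start + batch_size])

# Five random 300x400 arrays stand in for CT slices exported as 8-bit images.
scans = [np.random.randint(0, 256, size=(300, 400), dtype=np.uint8) for _ in range(5)]
processed = [preprocess(s) for s in scans]
batches = list(make_batches(processed, batch_size=2))
print(batches[0].shape)
```

In practice the mean and scaling would need to match whatever the pre-trained network expects, which this sketch does not attempt to reproduce.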


4.4 TRANSFER LEARNING WITH MOBILENET

A pre-trained MobileNet model is chosen as the base architecture.

The pre-trained model has been trained on a massive image dataset (ImageNet) and
has learned valuable features for image recognition.

The base layers of the pre-trained MobileNet are frozen. This prevents these layers
from being modified during training and focuses the training process on the final
layers for lung cancer classification.

4.4.1 Fine-Tuning for Lung Cancer Classification

New classification layers are added on top of the frozen pre-trained MobileNet
architecture. These new layers are specifically designed for binary classification
(cancerous vs. non-cancerous) or multi-class classification (different lung cancer
types).

The model, consisting of the frozen pre-trained layers and the newly added
classification layers, is then trained on the preprocessed CT scan dataset. During
training, only the parameters of the new layers are adjusted to minimize
classification errors on the lung cancer classification task, because the frozen
layers keep their pre-trained weights.
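
A minimal Keras sketch of this setup is shown below. It assumes TensorFlow is installed and uses MobileNetV2 as the MobileNet variant; the 224x224 input size, the dropout rate, the Adam learning rate, the four output classes, and the function name `build_classifier` are illustrative assumptions rather than the settings used in this project.

```python
import tensorflow as tf

def build_classifier(num_classes=4, input_shape=(224, 224, 3), weights="imagenet"):
    """Frozen MobileNetV2 base with a small trainable classification head."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=weights)
    base.trainable = False                      # freeze the pre-trained layers

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)            # keep batch-norm statistics fixed
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)         # assumed regularization
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_classifier()` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random weights, which is useful for checking shapes offline.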

4.4.2 Training Details

(In this section, elaborate on the training process. Specify the optimizer used, loss
function, and any hyperparameter tuning techniques employed. You can mention the
batch size and the number of training epochs here).

4.4.3 Evaluation

The trained model's performance is evaluated on a separate test dataset not used
during training. This ensures an unbiased assessment of the model's generalization
ability.

Standard evaluation metrics for classification tasks, such as accuracy, precision,
recall, and F1-score, are used to quantify the model's effectiveness in classifying lung
cancer cases.
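
As a hedged illustration of how precision, recall, and F1-score can be computed per class, the sketch below uses plain Python. The label names and the example predictions are hypothetical and are not results from this project.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Return precision, recall, and F1 for each label (one-vs-rest)."""
    metrics = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

# Hypothetical ground truth and predictions for six test images.
y_true = ["normal", "normal", "adeno", "adeno", "squamous", "large"]
y_pred = ["normal", "adeno", "adeno", "adeno", "squamous", "squamous"]
results = per_class_metrics(y_true, y_pred, ["normal", "adeno", "squamous", "large"])
for label, values in results.items():
    print(label, values)
```

Averaging the per-class values gives a macro-averaged score, which treats every class equally regardless of how many test images it has.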


4.5 DISCUSSION

Building on the MobileNet framework, the proposed model architecture for lung cancer
classification includes the following key components:

1. Pre-trained MobileNet Layers (Frozen):

• The foundation of the model is a pre-trained MobileNet variant, such as
MobileNet V2.
• These pre-trained layers are frozen, meaning their weights are not updated
during training. This leverages the image recognition features the model has
already learned on a large dataset (e.g., ImageNet).

2. Depthwise Separable Convolutions:

• A core principle of MobileNet is the use of depthwise separable convolutions.
These convolutions are computationally efficient because they break the standard
convolution operation into two separate steps:
1. Depthwise convolution: This applies a separate filter to each channel of the
input data, extracting features for each channel independently.
2. Pointwise convolution: This applies a 1x1 convolution to combine the features
from the depthwise convolution across channels and produce the desired number of
output channels.
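
To make the efficiency argument concrete, the following sketch compares weight counts for a standard convolution and a depthwise separable one. Bias terms are ignored, and the kernel size and channel counts are arbitrary example values.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k step plus a 1x1 pointwise step (bias ignored)."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1x1 convolution mixing the channels
    return depthwise + pointwise

k, c_in, c_out = 3, 32, 64        # example values only
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print(std, sep, round(std / sep, 2))
```

For these example values the separable form needs 2,336 weights instead of 18,432, roughly an eightfold reduction.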

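3. Bottleneck Blocks (Optional):

MobileNetV2 uses inverted residual blocks with linear bottlenecks. Each block expands
the channels with a 1x1 convolution, applies a depthwise convolution with a
non-linearity (e.g., ReLU6), and projects back to a narrow bottleneck, with a
residual connection between bottlenecks where the shapes match. These blocks can help
improve the model's learning capacity while maintaining efficiency.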

4. New Classification Layers:

On top of the frozen pre-trained MobileNet layers, new classification layers specific
to the task are added. These layers are responsible for learning the patterns that
differentiate cancerous from non-cancerous lung tissue in the CT scans.

The number and type of these layers depend on whether the task is binary
classification (cancerous vs. non-cancerous) or multi-class classification (different
lung cancer types).


Common choices for the final layer include a dense layer with a sigmoid activation
for binary classification or a SoftMax activation for multi-class classification.
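
The two activation choices can be illustrated in a few lines of plain Python. This is a sketch of the formulas only, and the example logits are made up.

```python
import math

def sigmoid(z):
    """Map a single logit to a probability for binary classification."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    """Map a list of logits to class probabilities that sum to 1."""
    m = max(logits)                               # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))                               # a logit of 0 means "undecided"
probs = softmax([2.0, 1.0, 0.1, -1.0])            # four made-up class scores
print(probs, sum(probs))
```

With a sigmoid output the network emits one number per image, while a softmax output emits one probability per class and the predicted class is the one with the largest value.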

Additional Considerations:

• Pooling Layers: Pooling layers (e.g., average pooling) can be placed after the
pre-trained MobileNet layers to further reduce the dimensionality of the data and
control overfitting. However, care is needed not to lose too much of the spatial
information that is crucial for lung cancer classification.
• Batch Normalization: Batch normalization layers can be added in the new
classification layers to improve the model's training stability and potentially
accelerate convergence.

4.6 COMPARISON

We then trained the model on the data with the early stopping and checkpoint tools
enabled.
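
The idea behind early stopping with checkpointing can be sketched in plain Python as follows. The patience value, the function name `early_stopping_epochs`, and the validation losses are hypothetical; in a real training loop the model weights would be saved at the point marked in the comment.

```python
def early_stopping_epochs(val_losses, patience=3):
    """Scan per-epoch validation losses and return (best_epoch, stop_epoch)."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: this is where a checkpoint would be saved.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch          # stop training early
    return best_epoch, len(val_losses) - 1        # ran through every epoch

# Made-up validation losses: the best value occurs at epoch 2.
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.64, 0.5]
print(early_stopping_epochs(losses, patience=3))
```

With a patience of 3, training stops at epoch 5 and the checkpoint from epoch 2 is kept, so the later improvement at epoch 6 is never reached; a larger patience trades extra training time for the chance to find it.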

We used this model with some modifications, which are described in detail in the next
chapter, and obtained reasonably satisfactory results.


Approach | Description | Strengths | Weaknesses

LungNet | Used CNNs for classifying lung nodules from CT scans | Achieved 90% accuracy in detecting malignant nodules | Dependence on meticulously annotated 3D CT scans, which may be labor-intensive and costly to acquire in large quantities

NoduleSegNet | Used CNNs for segmenting lung nodules from 3D CT scans | Achieved 91% accuracy in segmenting lung nodules from 3D CT scans, providing precise boundaries for analysis | High computational resources, needs large labeled dataset

CNN-ALCD [1] | Automatic detection using CNN for non-small cell lung cancer detection using MobileNet | High accuracy (94.11%), CDSS integration | A CT scan of one person consists of multiple images of the lung at different positions, so the model must be trained on all positions that can appear in a CT scan

LCCT | CNN for non-small cell lung cancer detection using MobileNet | High accuracy (95%), excellent confusion matrix, available data | (none listed)


4.6.1 Comparison with other CNN models

4.7 FLOWCHART OF THE APPLICATION

Figure 4-3 Application Flow chart


4.8 CONCLUSION

This chapter presented a MobileNet-based deep learning model for lung cancer
classification from CT scans. The model leverages transfer learning to exploit the pre-
trained features of MobileNet and fine-tune them for the specific task of lung cancer
detection. The proposed approach offers a computationally efficient and potentially
mobile-friendly solution for lung cancer screening. Future work will involve
exploring techniques to address class imbalance and improve model interpretability
for better clinical adoption.

CHAPTER 5: IMPLEMENTATION


5.1 DATA

• Source: Medical Imaging in Lung Cancer | Kaggle
• Imaging technique used: CT scans.
• Image type: DICOM images converted to PNG.

5.1.1 Dataset Description

The data contains three chest cancer types (adenocarcinoma, large cell carcinoma, and
squamous cell carcinoma) and one folder for normal cells. The Data folder is the main
folder; inside it are the test, train, and valid folders.

“The precise location and size of a lung tumor can depend on factors such as the
specific subtype of cancer, the stage of the disease, and the patient's unique anatomy.”

1. Normal

Figure 5-4: Normal CT scan.
Figure 5-5: Normal.


2. Adenocarcinoma

• Location: Adenocarcinoma commonly arises in the outer regions (peripheral) of
the lungs, though it can also occur centrally. It tends to start in the smaller airways
and may spread along the lung's connective tissue.
• Tumor Size: Adenocarcinomas can range from small nodules to large masses.
They may present as ground-glass opacities, solid nodules, or consolidation on CT
scans.
• Presence of Cavitation: Adenocarcinoma typically doesn't cavitate early in its
development. However, as it progresses, especially in response to treatments like
chemotherapy, cavitation may occur.
• Lymph Node Involvement: Adenocarcinoma often metastasizes to the lymph
nodes, but the pattern of lymph node involvement can vary widely.
• Pattern of Spread: Adenocarcinoma tends to spread to distant organs such as the
brain, bones, and liver. It can also spread via the bloodstream to other parts of the
body.

Figure 5-6:Adenocarcinoma LC CT scan.


3. Large cell carcinoma

• Location: Large cell carcinoma can occur anywhere in the lungs and doesn't have
a specific predilection for central or peripheral locations.
• Tumor Size: Large cell carcinomas are often larger than other
types of lung cancer. They may present as large, bulky masses on CT scans.
• Presence of Cavitation: Cavitation is less common in large cell carcinoma
compared to squamous cell carcinoma, but it can occur, especially in larger
tumors with central necrosis.
• Lymph Node Involvement: Large cell carcinoma may involve regional lymph
nodes, but the pattern of lymph node involvement is less predictable compared to
squamous cell carcinoma.
• Pattern of Spread: Large cell carcinoma tends to grow rapidly and may spread
early to distant organs such as the brain, bones, or adrenal glands. It can also
spread locally within the chest.

Figure 5-7:Large cell carcinoma CT scan.


4. Squamous cell carcinoma

• Location: Squamous cell carcinoma often arises centrally in the larger bronchi,
though it can also occur peripherally. It tends to grow within the airway, causing
obstruction.
• Tumor Size: Squamous cell carcinomas can vary in size, from small nodules to
larger masses. They often present as discrete, solid masses on CT scans.
• Presence of Cavitation: Cavitation is relatively common in squamous cell
carcinoma, particularly in larger tumors. Central necrosis can lead to cavitation,
which may be visible on CT scans.
• Lymph Node Involvement: Squamous cell carcinoma has a higher propensity for
involving regional lymph nodes, particularly those near the trachea and main
bronchi.
• Pattern of Spread: Squamous cell carcinoma typically spreads locally within the
chest, including to adjacent structures such as the chest wall or mediastinum. It
can also metastasize to distant organs.

Figure 5-8: Squamous cell carcinoma CT scan.


5.2 DATA PREPROCESSING

Data preprocessing is a crucial step in the machine learning pipeline that involves
transforming raw data into a format suitable for training a machine learning model. It
typically includes several steps such as cleaning, transforming, and organizing the
data.

5.2.1 Data Augmentation

Data augmentation is a technique commonly used in machine learning, particularly in
computer vision tasks such as image classification, object detection, and
segmentation.

The goal of data augmentation is to increase the diversity of the training dataset,
thereby improving the generalization and robustness of the trained model.

Data augmentation techniques for image data include:

• Rotation: Rotate the image by a certain angle, introducing variations in object
orientations.
• Translation: Shift the image horizontally or vertically, simulating different object
positions within the frame.
• Scaling: Resize the image, making objects appear larger or smaller relative to the
image size.
• Flipping: Flip the image horizontally or vertically, creating mirror images.
• Shearing: Skew the image along one of its axes, introducing perspective
distortions.
• Zooming: Zoom into or out of the image, focusing on specific regions or
capturing a broader context.
• Brightness and Contrast Adjustment: Increase or decrease the brightness and
contrast of the image.
• Noise Injection: Add random noise to the image, simulating variations in
lighting conditions or sensor noise.
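
A few of the transformations above can be sketched with NumPy alone, as shown below. The sketch covers only flipping, rotation in 90-degree steps, translation, brightness scaling, and noise injection; the parameter ranges and the function name `augment` are illustrative assumptions, and scaling, shearing, and zooming would normally be done with an image library because they require interpolation.

```python
import numpy as np

def augment(image, rng):
    """Apply a random flip, rotation, shift, brightness change, and noise."""
    out = image.astype(np.float32)
    if rng.random() < 0.5:
        out = np.fliplr(out)                          # horizontal flip
    out = np.rot90(out, k=int(rng.integers(0, 4)))    # rotation in 90-degree steps
    shift = int(rng.integers(-5, 6))
    out = np.roll(out, shift, axis=1)                 # horizontal translation (wraps around)
    out = out * rng.uniform(0.8, 1.2)                 # brightness scaling
    out = out + rng.normal(0.0, 5.0, size=out.shape)  # Gaussian noise injection
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(seed=0)
# A random 64x64 array stands in for a CT slice exported as an 8-bit image.
scan = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
augmented = [augment(scan, rng) for _ in range(4)]
print(len(augmented), augmented[0].shape)
```

Each call produces a different variant of the same slice, which is how a small dataset can be expanded into a larger effective training set. The square input keeps the shape unchanged under 90-degree rotation.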


We applied all of these data augmentation techniques to our project dataset to
achieve the following:

1. Increase Training Data: Data augmentation expands the effective size of the
training dataset by generating new, modified samples from the existing data. This
is particularly beneficial when the original dataset is limited in size, as it helps
prevent overfitting and improves the generalization ability of the model.
2. Improve Model Robustness: By exposing the model to a wider range of
variations and perturbations in the data, data augmentation encourages the model
to learn features that are more robust and invariant to such changes. This makes
the model more capable of handling variations in real-world data that it may
encounter during deployment.

5.3 PROGRAMMING LANGUAGE

The main language used in this project is Python. Python’s open-source libraries are
not the only feature that makes it favorable for machine learning and AI tasks. Python
is also highly versatile and flexible, meaning it can also be used alongside other
programming languages when needed. Even further, it can operate on nearly all OS
and platforms on the market.

Implementing Deep Neural Networks can be extremely time consuming, but Python
offers many packages that cut down on this. It is also an object-oriented
programming (OOP) language, which makes it extremely useful for efficient data
use and categorization.


5.4 ACCURACY

Accuracy is a metric that measures how often a machine learning model correctly
predicts the outcome. You can calculate accuracy by dividing the number of correct
predictions by the total number of predictions.

In other words, accuracy answers the question: how often is the model right?

Figure 5-9 Accuracy Equation

You can measure the accuracy on a scale of 0 to 1 or as a percentage. The higher the
accuracy, the better. You can achieve a perfect accuracy of 1.0 when every prediction
the model makes is correct.

This metric is simple to calculate and understand. Almost everyone has an intuitive
perception of accuracy: a reflection of the model's ability to correctly classify data
points.
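The definition above (correct predictions divided by total predictions) can be written directly in Python. The labels below are hypothetical and serve only to illustrate the calculation; they are not results from this project.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one prediction is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels for ten images (class indices 0 to 3).
y_true = [0, 1, 2, 3, 1, 0, 2, 3, 1, 0]
y_pred = [0, 1, 2, 3, 1, 0, 2, 1, 1, 3]

score = accuracy(y_true, y_pred)
print(f"accuracy = {score:.2f} ({score:.0%})")  # accuracy = 0.80 (80%)
```

Eight of the ten hypothetical predictions match, so the accuracy is 0.8 on the 0 to 1 scale, or 80% as a percentage.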


5.5 CONFUSION MATRIX

In the field of machine learning and specifically the problem of statistical
classification, a confusion matrix, also known as error matrix, is a specific table
layout that allows visualization of the performance of an algorithm, typically a
supervised learning one; in unsupervised learning it is usually called a matching
matrix.

Each row of the matrix represents the instances in an actual class while each column
represents the instances in a predicted class, or vice versa – both variants are found in
the literature. The name stems from the fact that it makes it easy to see whether the
system is confusing two classes (i.e. commonly mislabeling one as another).

It is a special kind of contingency table, with two dimensions ("actual" and
"predicted"), and identical sets of "classes" in both dimensions (each combination of
dimension and class is a variable in the contingency table).

Given a sample of 12 individuals, 8 that have been diagnosed with cancer and 4 that
are cancer-free, where individuals with cancer belong to class 1 (positive) and non-
cancer individuals belong to class 0 (negative), we can display that data as follows:

Figure 5-10 Confusion Matrix Table
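A confusion matrix for a sample like this one can be built in a few lines of Python. In the sketch below, the actual labels follow the text (8 individuals with cancer, 4 cancer-free), while the predicted labels are hypothetical values chosen only to illustrate the layout; rows are actual classes and columns are predicted classes.

```python
def confusion_matrix(actual, predicted, num_classes=2):
    """Count (actual, predicted) pairs; rows are actual, columns are predicted."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


# 8 individuals with cancer (class 1) followed by 4 cancer-free (class 0).
actual = [1] * 8 + [0] * 4
# Hypothetical classifier output, for illustration only.
predicted = [1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0]

matrix = confusion_matrix(actual, predicted)
(tn, fp), (fn, tp) = matrix

print(matrix)                                   # [[3, 1], [2, 6]]
print("accuracy:", (tp + tn) / len(actual))     # accuracy: 0.75
```

With these assumed predictions, the off-diagonal cells show where the classifier confuses the classes: 2 cancer cases predicted as cancer-free (false negatives) and 1 cancer-free case predicted as cancer (false positive).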


5.6 MODEL SELECTION

5.6.1 Introduction

At the heart of our graduation project lies the ambition to provide healthcare
practitioners with cutting-edge AI algorithms precisely designed to detect cancers
quickly and accurately in human lungs on computed tomography (CT) scans. Through
comprehensive exploration of advanced AI techniques, our goal is to create a robust
system capable of independent classification of a variety of cancers identified on CT
images. By harnessing the power of deep learning methodologies, our ultimate goal is
to redefine diagnostic accuracy and effectiveness in clinical practice, pushing the
boundaries of medical imaging and raising the standard of patient care around the
world.

Before training the models:

1. We created train, validation, and test batches.

2. We checked our data and displayed samples from it:

Figure 5-13: CT scan from Train file.
Figure 5-12: CT scan from Validate file.
Figure 5-11: CT scan from Test file.


5.6.2 Libraries

We use libraries because they provide pre-built functions and tools that simplify
the process of developing and deploying deep learning solutions. By using these
libraries, developers and data scientists can focus on the problem itself
rather than spending time coding complex algorithms from scratch. The libraries
we used in our models are listed below.

1. TensorFlow

TensorFlow is an end-to-end open-source machine learning platform that contains
comprehensive tools, libraries and community resources. It is meant for developers,
data scientists and researchers to build and deploy applications powered by machine
learning. Developed by the Google Brain team and built to scale, TensorFlow
accelerates ML and deep neural network research. It can run on multiple
CPUs or GPUs and on mobile operating systems. It also has APIs in
languages such as Python, C++, and Java.

2. OpenCV

An open-source computer vision library that provides functions for reading, resizing,
and processing images.

3. NumPy

Fundamental library for numerical computing in Python, providing support for array
operations and mathematical functions.

4. matplotlib

Matplotlib is a plotting library for Python. Its matplotlib.pyplot module is a
collection of command-style functions that make matplotlib work like MATLAB.
Each pyplot function makes some change to a figure: e.g., creates a figure, creates a
plotting area in a figure, plots some lines in a plotting area, decorates the plot with
labels, etc. In matplotlib.pyplot various states are preserved across function calls, so
that it keeps track of things like the current figure and plotting area, and the plotting
functions are directed to the current axes (please note that “axes” here and in most
places in the documentation refers to the axes part of a figure and not the strict
mathematical term for more than one axis).

5. Seaborn

Seaborn is a Python data visualization library based on matplotlib. It provides a
high-level interface for drawing attractive and informative statistical graphics.

6. sklearn.metrics

The sklearn.metrics module implements several loss, score, and utility functions to
measure classification performance. Some metrics might require probability estimates
of the positive class, confidence values, or binary decisions values. Most
implementations allow each sample to provide a weighted contribution to the overall
score, through the sample_weight parameter.

7. os (Operating System)

The os module provides a way to interact with the operating system, such as
navigating directories and handling file paths. For example, the os.system() method
executes a command (a string) in a subshell. This method is implemented by calling
the Standard C function system(), with some limitations. If the command generates
any output, it is sent to the interpreter's standard output stream.

8. keras.callbacks

A callback is an object that can perform actions at various stages of training (e.g. at
the start or end of an epoch, before or after a single batch, etc.). You can use callbacks
to: Write TensorBoard logs after every batch of training to monitor your metrics.
Periodically save your model to disk. Do early stopping.

9. Pandas

Offers high-level data structures and data manipulation tools, particularly useful for
handling tabular data.


5.6.3 VGG16:

VGG16 is a deep convolutional neural network architecture with 16 weight layers,
designed for image classification. It was introduced by the Visual Geometry Group at
the University of Oxford in 2014. The VGG architecture gained popularity for its
simplicity and effectiveness, and it served as a foundation for many subsequent deep
learning models. However, it has a relatively large number of parameters, making it
computationally expensive. VGG16 was the first model we tried, but unfortunately it
did not reach the desired accuracy.

Figure 5-14: VGG16 Architecture


5.6.3.1 Preprocessing

First, we preprocessed the data so that it matches the input format the model
expects.

1. Resize Images: VGG16 expects input images to have a fixed size. The original
VGG16 model was trained on 224x224 pixel images. Therefore, before feeding
images into the model, you need to resize them to match this size. This can be
done using libraries like OpenCV or PIL (Python Imaging Library).

2. Mean Subtraction: Subtracting the mean pixel value across all images in the
dataset is a common preprocessing step. For VGG16, you would typically subtract
the mean RGB pixel value computed over the entire ImageNet dataset. This helps
center the data around zero and can improve convergence during training.

3. Normalization: After resizing and mean subtraction, it is common to normalize
the pixel values to a specific range. For VGG16, the pixel values are often scaled
to the range [0, 1] by dividing each pixel value by 255.0. Alternatively, you can
normalize the pixel values to have a mean of 0 and a standard deviation of 1.

4. Channel Ordering: Ensure that the input image is in the channel ordering
expected by the model. Images are usually loaded in 'RGB' order (Red, Green, Blue),
but the original VGG16 weights were trained on 'BGR' images, so the Keras VGG16
preprocessing function converts RGB input to BGR.

5. Batching: Prepare the input data in batches for efficient processing. CNNs like
VGG16 often process images in batches to take advantage of parallel processing
capabilities provided by modern hardware.
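The resizing, mean subtraction, channel ordering, and batching steps can be sketched with NumPy as follows. This is an illustrative sketch rather than the project's code: it mirrors the zero-centering used with the original VGG16 weights (ImageNet per-channel means, BGR output, no scaling), uses a simple nearest-neighbor resize in place of OpenCV or PIL, and all function names are assumptions.

```python
import numpy as np

# Per-channel ImageNet means in RGB order, as used with the original VGG16 weights.
IMAGENET_MEAN_RGB = np.array([123.68, 116.779, 103.939], dtype=np.float32)


def resize_nearest(image, size=(224, 224)):
    """Resize an (H, W, 3) image with nearest-neighbor sampling."""
    h, w = image.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return image[rows][:, cols]


def preprocess_vgg16_style(image_rgb):
    """Resize, subtract the ImageNet mean, and reorder channels RGB -> BGR."""
    x = resize_nearest(image_rgb).astype(np.float32)
    x = x - IMAGENET_MEAN_RGB   # center each channel around zero
    return x[..., ::-1]         # reverse the channel axis: RGB -> BGR


def make_batches(images, batch_size):
    """Yield arrays of shape (batch, 224, 224, 3)."""
    for start in range(0, len(images), batch_size):
        yield np.stack(images[start:start + batch_size])


rng = np.random.default_rng(0)
# Five hypothetical RGB images of arbitrary size.
raw_images = [rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8) for _ in range(5)]
processed = [preprocess_vgg16_style(img) for img in raw_images]
batches = list(make_batches(processed, batch_size=2))

print([b.shape for b in batches])
```

With five images and a batch size of 2, the generator yields two full batches and one final batch containing the remaining image.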


Then we adapted the VGG16 model itself:

We removed the last layer and set trainable = False on the remaining pretrained
layers, so their weights stay fixed during training and only the newly added layers
learn from our images.

Then we set the dropout rate = 0.5. This means during each training iteration, half of
the neurons in that layer will be randomly deactivated.
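The effect of a dropout rate of 0.5 can be illustrated with a short NumPy sketch. It assumes the common "inverted dropout" formulation, in which the surviving activations are scaled by 1 / (1 - rate) during training so that the layer needs no change at inference time; the function name and the all-ones input are hypothetical.

```python
import numpy as np


def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate` while training."""
    if not training or rate == 0.0:
        return activations  # dropout is inactive at inference time
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    # Scale the kept activations so the expected sum stays the same.
    return activations * mask / keep_prob


rng = np.random.default_rng(0)
layer_output = np.ones((4, 8), dtype=np.float32)  # hypothetical activations

train_out = dropout(layer_output, rate=0.5, rng=rng)
eval_out = dropout(layer_output, rate=0.5, rng=rng, training=False)

print(train_out)  # each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```

A different random mask is drawn on every call, which is why each training iteration deactivates a different subset of neurons.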

We used a SoftMax activation in the output layer to predict the class of an input
image, and we used the Adam optimizer.

• SoftMax is a mathematical function that converts a vector of real numbers into a
vector of probabilities, where each element of the output vector represents the
probability that the input belongs to the corresponding class. It is commonly used
in machine learning algorithms, especially in multiclass classification tasks.

• Adam is an iterative optimization algorithm used to minimize the loss function
during the training of neural networks. We trained for up to 25 epochs with early
stopping.
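Both ideas can be sketched in plain Python. The softmax function below turns four hypothetical class scores into probabilities, and adam_step applies the standard Adam update to a single scalar parameter. The toy objective (w - 3)^2, the learning rate of 0.1, and all names are assumptions chosen for illustration; in practice the deep learning framework performs these updates for every weight in the network.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    param -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Four hypothetical scores, one per class.
probabilities = softmax([2.0, 1.0, 0.1, -1.0])
predicted_class = probabilities.index(max(probabilities))

# Toy problem: move w toward the minimum of (w - 3)^2, starting from 0.
w, m, v = 0.0, 0.0, 0.0
history = []
for step in range(1, 11):
    grad = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
    w, m, v = adam_step(w, grad, m, v, step, lr=0.1)
    history.append(w)
```

The class with the largest score receives the largest probability, and each Adam step moves w by at most roughly the learning rate toward the minimum.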


5.6.3.2 Performance

1. Train

Train accuracy is only 82.7%.

2. Validation

Validation accuracy is only 70%.

3. Test

Test accuracy is only 86%.

4. Confusion Matrix


After checking the accuracy of the model, we found that this model doesn’t fit our
needs, so we tried another model to get better accuracy.

5.6.4 EfficientNet-B0

EfficientNet was first introduced by a team of researchers at Google in a paper
published in 2019. The paper, “EfficientNet: Rethinking Model Scaling for
Convolutional Neural Networks,” describes a new method for scaling CNN
architectures that achieves both high accuracy and efficiency.
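The scaling method from that paper, known as compound scaling, grows network depth, width, and input resolution together using a single coefficient. The sketch below uses the base constants reported in the EfficientNet paper (1.2 for depth, 1.1 for width, 1.15 for resolution); the function names are assumptions, and the released B1 to B7 variants use rounded settings that do not follow this rule exactly.

```python
# Base constants reported in the EfficientNet paper (found by grid search).
ALPHA = 1.2   # depth multiplier base
BETA = 1.1    # width multiplier base
GAMMA = 1.15  # resolution multiplier base


def compound_scaling(phi):
    """Return the depth, width, and resolution multipliers for coefficient phi."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }


def flops_factor(phi):
    """Approximate growth in FLOPs: (alpha * beta^2 * gamma^2) ** phi."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


for phi in range(4):
    multipliers = compound_scaling(phi)
    print(phi, {k: round(val, 3) for k, val in multipliers.items()},
          round(flops_factor(phi), 2))
```

EfficientNet-B0 is the baseline network (phi = 0, all multipliers equal to 1). Because alpha * beta^2 * gamma^2 is close to 2, each increment of phi roughly doubles the computational cost.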

In the EfficientNet architecture, the input image is first passed through an initial
convolutional layer that reduces the resolution of the image while increasing the
number of channels. This is followed by a series of mobile inverted bottleneck
(MBConv) blocks, which are built around a depthwise convolution followed by a
pointwise (1x1) convolution. These blocks keep the computational cost of the model
low while increasing its depth.

The output of the last block is reduced by global average pooling and passed through
a fully connected layer that produces the final output of the model. The final output
is a vector of probabilities, one for each class in the dataset, indicating the
likelihood that the input image belongs to each class.

Figure 5-15: EfficientNet-B0 Architecture


5.6.4.1 Performance

1. Test

Test accuracy is 93% only.

2. Confusion Matrix

We also worked on this model to get better results.

After applying all the previous steps (processing the data, loading it, splitting it,
and running the model), we got better accuracy, but it was still not as good as we
expected.


5.6.5 MobileNet:

MobileNet is a computer vision model open-sourced by Google and designed for
training classifiers. It uses depthwise separable convolutions to significantly reduce
the number of parameters compared to other networks, resulting in a lightweight deep
neural network. MobileNet is TensorFlow’s first mobile computer vision model.
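The parameter saving can be checked with simple arithmetic. The sketch below compares a standard convolution with a depthwise separable one (a per-channel depthwise filter followed by a 1x1 pointwise convolution). The layer size used (3x3 kernel, 32 input channels, 64 output channels) is a hypothetical example, and bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one k x k x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise (k x k per channel) plus pointwise (1 x 1) convolution."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


standard = standard_conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)

print(standard)                        # 18432
print(separable)                       # 2336
print(round(standard / separable, 2))  # 7.89
```

For this example layer, the depthwise separable version needs roughly one eighth of the weights of the standard convolution.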

Figure 5-16: Mobilenet Architecture

Figure 5-17: Mobilenet Architecture 2


5.6.5.1 Preprocessing

First, we preprocessed our data so that it matches the input format the model
expects.

• Resize Images: MobileNet, like most CNNs, expects input images to have a fixed
size. The size can vary depending on the specific MobileNet variant you're using
(e.g., MobileNetV1, MobileNetV2). For example, MobileNetV1 commonly uses
224x224 pixel images, while MobileNetV2 can handle various input sizes.

• Mean Subtraction: Similar to VGG16, you may subtract the mean pixel value
across all images in the dataset. However, MobileNet models might have been
trained with different datasets than VGG16, so it's essential to use the appropriate
mean pixel values for MobileNet.

• Normalization: Normalize the pixel values to a specific range. MobileNet
typically expects pixel values in the range [-1, 1] or [0, 1]. You might need to
scale the pixel values accordingly.

• Channel Ordering: Ensure that the input image is in the correct channel ordering
expected by the model. MobileNet typically expects images to be in the 'RGB'
format.

• Batching: Prepare the input data in batches for efficient processing. MobileNet,
like VGG16, benefits from processing images in batches for parallel processing.
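The two normalization ranges mentioned above differ only in how the 0 to 255 pixel values are rescaled. The following sketch shows both; the function names are hypothetical, and the symmetric version matches the x / 127.5 - 1 scaling commonly used for MobileNet inputs.

```python
def scale_to_unit_range(pixels):
    """Map pixel values from [0, 255] to [0, 1]."""
    return [p / 255.0 for p in pixels]


def scale_to_symmetric_range(pixels):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return [p / 127.5 - 1.0 for p in pixels]


sample = [0, 127.5, 255]  # darkest, middle, and brightest values

print(scale_to_unit_range(sample))       # [0.0, 0.5, 1.0]
print(scale_to_symmetric_range(sample))  # [-1.0, 0.0, 1.0]
```

Whichever range is chosen, the same scaling has to be applied to the training, validation, and test images, and later to the images sent to the deployed model.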


Then we modified the model to improve the accuracy:

• We removed the last 5 layers, then added a dropout layer with rate = 0.5, then
added a global average pooling 2D layer, which reduces overfitting. Finally, we
added an output layer with 4 neurons and a SoftMax activation function.
• Then we set trainable = False for the last 50 layers of MobileNet, so that their
pretrained weights are not updated while the model trains on our images.
• We used the SoftMax activation to predict the class of an input image, and we
used the Adam optimizer.
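The role of the global average pooling layer and the 4-neuron output layer can be illustrated with NumPy. This is a conceptual sketch, not the trained model: the feature maps are random numbers with the 7x7x1024 shape that MobileNetV1 produces for a 224x224 input, the output weights are random rather than learned, and dropout is left out because it is inactive at prediction time.

```python
import numpy as np


def global_average_pooling_2d(feature_maps):
    """Average over height and width: (batch, H, W, C) -> (batch, C)."""
    return feature_maps.mean(axis=(1, 2))


def softmax(logits):
    """Row-wise softmax for a (batch, classes) array."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
features = rng.random((2, 7, 7, 1024))            # hypothetical backbone output
pooled = global_average_pooling_2d(features)      # shape (2, 1024)

weights = rng.normal(0.0, 0.01, size=(1024, 4))   # untrained output layer, 4 classes
bias = np.zeros(4)
probabilities = softmax(pooled @ weights + bias)  # shape (2, 4)

print(pooled.shape, probabilities.shape)
```

Global average pooling has no trainable weights, and it shrinks the input of the output layer from 7 x 7 x 1024 = 50176 values (if the maps were flattened) to 1024, so the 4-neuron layer needs 4096 weights instead of 200704. Fewer weights to fit is one reason this layer helps against overfitting.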

5.6.5.2 Create Train & Validate & Test

We prepare a data generator for training a model using images from a directory. The
images are preprocessed using the MobileNet preprocessing function before being fed
into the model.

• Train

We divided the training data into 4 classes. The first class indicates the normal
case and each of the other classes indicates a type of tumor, so now we have
(normal = 0, adenocarcinoma = 1, large cell = 2, squamous = 3).

The total number of images is 11494 belonging to 4 classes.

• Validate

The validation data is classified in the same way as the training data, but it
contains 1553 images belonging to 4 classes.

• Test

The test data is also classified into the same 4 classes, and it contains 315 images.
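The class indices and the image counts reported above can be summarized in a short Python sketch. The counts and the class mapping are taken from the text; the helper names and the example probability vector are hypothetical.

```python
# Class indices as described in the text.
CLASS_INDEX = {"normal": 0, "adenocarcinoma": 1, "large cell": 2, "squamous": 3}
INDEX_TO_CLASS = {index: name for name, index in CLASS_INDEX.items()}

# Number of images in each split, as reported in the text.
IMAGE_COUNTS = {"train": 11494, "validation": 1553, "test": 315}


def split_shares(counts):
    """Return each split's share of the total number of images."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def decode_prediction(probabilities):
    """Map a vector of 4 class probabilities to the most likely class name."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return INDEX_TO_CLASS[best_index]


shares = split_shares(IMAGE_COUNTS)
for name, share in shares.items():
    print(f"{name:>10}: {IMAGE_COUNTS[name]:>5} images ({share:.1%})")

# Hypothetical model output for one image.
print(decode_prediction([0.05, 0.80, 0.10, 0.05]))  # adenocarcinoma
```

Out of 13362 images in total, the split is roughly 86.0% training, 11.6% validation, and 2.4% test.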

5.6.5.3 Running the Model (MobileNet Model)

1. Importing Libraries

`from keras.callbacks import EarlyStopping`

This line imports the `EarlyStopping` callback from the Keras library. Callbacks are
objects that can be applied during the training process of a neural network to
perform certain actions at specific points.

2. Defining Callbacks

An instance of the `EarlyStopping` callback is created. This callback monitors the
validation loss (`val_loss`) during the training process. If the validation loss doesn't
improve for 16 consecutive epochs (`patience=16`), the training will stop early.

3. Check Point

We also created a `ModelCheckpoint` callback, which saves the model whenever it
observes an improvement in the validation accuracy (`val_accuracy`). The saved model
is stored in a file named `'lungModel2_v2.h5'`. We enabled the `save_best_only`
option so that only the model with the best accuracy is saved.

4. Training the Model

We trained the model on the data for 20 epochs, with the early stopping and
checkpoint callbacks enabled.

An epoch is one complete pass through the entire training dataset during the
training of a machine learning model.
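The logic behind the two callbacks can be sketched in plain Python, independent of Keras. The function below replays a list of per-epoch validation results and reports at which epoch early stopping would end training and which epoch's model the checkpoint would keep. The validation values are hypothetical, and a small patience of 3 is used in the first call only to make the stopping behavior visible in a short example.

```python
def simulate_callbacks(val_losses, val_accuracies, patience):
    """Replay per-epoch validation metrics through early stopping and checkpointing.

    Returns (stopped_epoch, checkpoint_epoch); stopped_epoch is None when
    training runs through every epoch.
    """
    best_loss = float("inf")
    best_accuracy = float("-inf")
    epochs_without_improvement = 0
    checkpoint_epoch = None
    stopped_epoch = None

    for epoch, (loss, acc) in enumerate(zip(val_losses, val_accuracies), start=1):
        # Checkpoint rule: keep only the model with the best validation accuracy.
        if acc > best_accuracy:
            best_accuracy = acc
            checkpoint_epoch = epoch  # a real callback would write the model file here

        # Early-stopping rule: count epochs since the validation loss last improved.
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                stopped_epoch = epoch
                break

    return stopped_epoch, checkpoint_epoch


# Hypothetical validation curves, for illustration only.
losses = [0.90, 0.70, 0.60, 0.65, 0.62, 0.61, 0.50]
accuracies = [0.60, 0.72, 0.80, 0.79, 0.83, 0.81, 0.90]

print(simulate_callbacks(losses, accuracies, patience=3))   # (6, 5)
print(simulate_callbacks(losses, accuracies, patience=16))  # (None, 7)
```

With a patience of 3, training stops at epoch 6 and the checkpoint holds the model from epoch 5. With a patience of 16, these seven epochs all run and the checkpoint holds the model from the last epoch, which has the best validation accuracy.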

5.6.5.4 Performance

1. Train

We got 99.5% in training evaluation.

2. Validation

We got 92.7% only.

3. Test

We got 94.2% in test evaluation.

4. Confusion Matrix


In this model, after making the modifications mentioned above, we achieved
good and fairly satisfying results. After consulting the project supervisors, we
approved this model for the website and mobile phone application.

CHAPTER 6: LAYOUT

In this chapter we will talk about the layout and the designs we used in our project.
We started our mission by choosing a suitable design then we built the web site and
application.

6.1 UI/UX

Using FIGMA we started to create our design.

Figma is a collaborative web application for interface design, with additional offline
features enabled by desktop applications for macOS and Windows. The feature set of
Figma focuses on user interface and user experience design, with an emphasis on real-
time collaboration, utilizing a variety of vector graphics editor and prototyping tools.
The Figma mobile app for Android and iOS allows viewing and interacting with
Figma prototypes in real-time on mobile and tablet devices.

Firstly, we designed the log in and sign-up page to create the user’s account.

Figure 6-18: Sign up page.

Then we created the home page, which gives the user several choices, such as
uploading a CT scan for analysis or watching videos about symptoms and causes or
about diagnosis and treatment. It also lists some hospitals and centers as a
reference in case the user does not know where to go.

Figure 6-19: Home Page

The patient history page lets the user see previous scans and their results with the
dates, so the user can see whether there is any progress.

Figure 6-20: Patient History

There are many other features, but we will discuss them later. Now we need to
convert this design into a real platform able to be used by a user.

6.2 WEB SITE

Figure 6-21 Scan page in web site

Figure 6-22 Diagnosis & Treatment Page


Figure 6-23 Patient History

Figure 6-24 Print As PDF


6.3 APPLICATION

Figure 6-25 Application Sign Up


Figure 6-29 Lung cancer types

Figure 6-26 Application Log In


CHAPTER 7: USED SOFTWARE


7.1 INTRODUCTION

In this chapter, we will display all the software products used in this project, and we
will also clarify some points specifically for evaluating the model.

7.2 TOOLS USED

1- Google Colab: our coding notebook.

2- Kaggle: data search engine.

3- Android Studio: for building our application using the Flutter framework.

4- FastAPI framework: integration tool.


7.2.1 Google Colab

Google Colaboratory, or Colab, is a user-friendly hosted Jupyter Notebook service
tailored for machine learning, data science, and education. Requiring no setup, it
offers free access to computing resources, including GPUs and TPUs, making it ideal
for tasks requiring intensive computation.

With zero configuration needed, Colab enables Python coding directly in the browser,
supports easy sharing of notebooks stored in Google Drive, and seamlessly integrates
executable code with rich text, images, HTML, and LaTeX. Utilizing popular Python
libraries such as NumPy and matplotlib, Colab empowers users to analyze and
visualize data efficiently.

Figure 7-30: Colab Interface


7.2.2 Kaggle

Kaggle is an online community and platform tailored for data scientists and AI
enthusiasts, offering collaborative features, dataset publishing, GPU-integrated
notebooks, and competitive challenges. Founded in 2010 by Anthony Goldbloom and
Jeremy Howard and later acquired by Google in 2017, Kaggle aims to empower
professionals and learners in their data science journey by providing robust tools and
resources.

Users can engage in contests hosted by major companies, share and explore datasets,
exchange code snippets, and participate in discussions. Additionally, Kaggle offers
free courses with certificates upon successful completion, making it an inclusive hub
for knowledge-sharing and skill development in the fields of data science and
artificial intelligence.

Figure 7-31: Kaggle Interface


7.2.3 Android Studio

Android Studio is the official integrated development environment (IDE) for Google's
Android operating system, built on JetBrains' IntelliJ IDEA software and designed
specifically for Android development. It is available for download on Windows,
macOS and Linux based operating systems. It is a replacement for the Eclipse
Android Development Tools (E-ADT) as the primary IDE for native Android
application development. Android Studio is licensed under the Apache license but it
ships with some SDK updates that are under a non-free license, making it not open
source.

Android Studio was announced on May 16, 2013, at the Google I/O conference. It
was in early access preview stage starting from version 0.1 in May 2013, then entered
beta stage starting from version 0.8, which was released in June 2014. The first stable
build was released in December 2014, starting from version 1.0. At the end of 2015,
Google dropped support for Eclipse ADT, making Android Studio the only officially
supported IDE for Android development.

On May 7, 2019, Kotlin replaced Java as Google's preferred language for Android app
development. Java is still supported, as is C++.

Figure 7-32 Android Studio interface


7.2.4 FastAPI Framework

FastAPI is a modern, fast (high-performance) web framework for building APIs with
Python based on standard Python type hints.

The key features are:

Fast: Very high performance, on par with NodeJS and Go. One of the fastest Python
frameworks available.

Fast to code: Increase the speed to develop features by about 200% to 300%. *

Fewer bugs: Reduce about 40% of human (developer) induced errors. *

Intuitive: Great editor support. Completion everywhere. Less time debugging.

Easy: Designed to be easy to use and learn. Less time reading docs.

Short: Minimize code duplication. Multiple features from each parameter declaration.
Fewer bugs.

Robust: Get production-ready code. With automatic interactive documentation.

Standards-based: Based on (and fully compatible with) the open standards for APIs:
OpenAPI (previously known as Swagger) and JSON Schema.

* Estimation based on tests on an internal development team, building production
applications.

CHAPTER 8: CONCLUSION


Introduction: This section discusses the conclusions of this project in relation to the
space of the detection model design and development as a whole, as well as
applications and possible future work.

8.1 CONCLUSION

In conclusion, the integration of artificial intelligence (AI) into lung cancer detection
through deep learning models represents a significant advancement in medical
imaging and diagnostics. The capability of AI to analyze and interpret low dose
computed tomography (LDCT) images with high precision offers a promising tool to
enhance early detection of lung cancer, which is critical for improving patient
outcomes. By leveraging complex algorithms and superior pattern recognition
abilities, AI surpasses human limitations in identifying subtle anomalies in chest
scans, thereby potentially increasing the accuracy and efficiency of lung cancer
screening programs.

"Health Lung" explores the technical development and real-world application of such
models, highlighting the transformative potential of AI in healthcare. This approach
not only aids clinicians in making more informed decisions but also supports
personalized patient care through accurate and timely diagnosis.

Overall, the adoption of deep learning models in lung cancer detection underscores a
pivotal shift towards more intelligent, data-driven medical practices, heralding a new
era in the fight against lung cancer. This project exemplifies the vital role of AI in
advancing medical technology and improving patient care outcomes.

Moreover, the real-world applicability of AI-driven diagnostics extends beyond mere
image analysis. It supports personalized patient care by providing insights that are
specific to an individual’s unique medical profile, ultimately facilitating more tailored
and effective treatment plans.

In summary, the application of deep learning to lung cancer detection exemplifies the
transformative potential of AI in healthcare. It not only enhances the diagnostic
process but also significantly contributes to improving patient outcomes by enabling
earlier detection and intervention. The continued advancement and integration of AI
technologies in medical practice are poised to revolutionize the field, making
intelligent imaging an indispensable component of modern healthcare. This project
highlights the critical role AI will play in the future of medical diagnostics and patient
care, heralding a new era of precision medicine.

The following table compares recent approaches to lung cancer detection:

Approach | Description | Accuracy
CNN-ALCD [1] | Automatic detection using CNN | High accuracy (94.11%), CDSS integration
2D CNN for Classification [2] | Classifies malignant/non-malignant cells, good accuracy | Higher accuracy than traditional methods (94.9%)
Deep Learning with Histopathological Data [10] | Trained on histopathological images, very high accuracy | Extremely high accuracy (99.80%), detailed tissue-level data
CNN-Based Classification [11] | CNN for lung cancer | High accuracy (95.62%), handles overfitting
LCCT | Classifies between CT scans, detects the tumor perfectly | High accuracy (95.7%) with perfect confusion matrix

Each approach has its strengths and weaknesses, making them suitable for different
applications and contexts in lung cancer detection. The choice of approach would
depend on the specific requirements, available data, and computational resources.


8.2 FUTURE WORK

In the next update we hope to add new features such as:

Chatbot: Chatbots are conversational tools that perform routine tasks efficiently.
People like them because they help them get through those tasks quickly so they can
focus their attention on high-level, strategic, and engaging activities that require
human capabilities that cannot be replicated by machines. We want to add a chatbot,
specifically a type called a hybrid chatbot. A hybrid chatbot is a blend of
chatbot and live chat that combines the best of both worlds. A customer service
representative will be available in live chat to answer any questions that
may be too complex or nuanced for automation alone.

An AI component in a chatbot replicates conversations based on how it is
programmed and the needs of the conversation. A hybrid chatbot, on the other hand,
will initiate an automated chat conversation and attempt to resolve the user’s query as
quickly and simply as possible. If it does not function as expected, a customer service
representative can intervene at any moment or in the subject matter area where the
chatbot cannot complete the task.

Reservations: We also need to add a feature that allows the user to reserve an
appointment, so that if the scans are positive the user can reserve an appointment at
the hospital or the clinic from the web site.

Improving: The main goal of this project is to help the medical field and to give
patients the most accurate results. We want to improve the accuracy of the model,
but this now depends on the data. Lung cancer CT scans are very hard to obtain from
online search engines, so we need to contract with a hospital specialized in lung
cancer treatment. With data from the hospital, we can train our model on
more data so that it becomes more accurate.

In addition, we can add new features that use computer vision techniques or
NLP in the near future.

CHAPTER 9: REFERENCE


[1] M. Aharonu and R. Kumar, "Convolutional Neural Network based Framework
for Automatic Lung Cancer Detection from Lung CT Images", International
Conference on Smart Generation Computing Communication and Networking
(SMART GenCon), Dec. 2022.

[2] V. G. Biradar, P. K. Pareek, V. K. S and P. Nagarathna, "Lung Cancer
Detection and Classification using 2D Convolutional Neural Network", 2022 IEEE
2nd Mysore Sub Section International Conference (MysuruCon), Oct. 2022.

[3] Y. Zhang, B. Dai, M. Dong, H. Chen and M. Zhou, "A Lung Cancer Detection
and Recognition Method Combining Convolutional Neural Network and
Morphological Features", IEEE 5th International Conference on Computer and
Communication Engineering Technology (CCET), Aug. 2022, [online] Available:
https://doi.org/10.1109/ccet55412.2022.9906329.

[4] M. Ramkumar, C. G. Babu, A. R. A. Wahhab, K. Abinaya, B. A. Balaji and
N. A. Chakravarthy, "Detection and Diagnosis of Lung Cancer using Machine
Learning Convolutional Neural Network Technique", 2022 Smart Technologies
Communication and Robotics (STCR), Dec. 2022.

[5] Jain, P. Singh, A. K. Pandey, M. Singh, H. B. Singh and A. Singh, "Lung
Cancer Detection Using Convolutional Neural Network", 3rd International
Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT),
Nov. 2022, [online] Available: https://doi.org/10.1109/icict55121.2022.10064513.

[6] N. Vijayan and J. Kuruvilla, "The impact of transfer learning on lung cancer
detection using various deep neural network architectures", 2022 IEEE 19th India
Council International Conference (INDICON), Nov. 2022.


[7] K. Praveena, C. Vimala, S. Hemachandra and K. Praveena, "Lung Carcinoma
Detection using Deep learning", International Conference on Advances in Electronics
Communication Computing and Intelligent Information Systems (ICAECIS), Apr.
2023, [online] Available: https://doi.org/10.1109/icaecis58353.2023.10170278.

[8] Zhao, "Lung Nodule Detection Algorithm Based on Deep Learning In Medical
Images", International Conference on Artificial Intelligence of Things and
Crowdsensing (AIoTCs), Oct. 2022, [online] Available:
https://doi.org/10.1109/aiotcs58181.2022.00118.

[9] N. A. Pande and D. Bhoyar, "A comprehensive review of Lung nodule
identification using an effective Computer-Aided Diagnosis (CAD) System", 2022
4th International Conference on Smart Systems and Inventive Technology (ICSSIT),
Jan. 2022.

[10] R. D. Mohalder, J. P. Sarkar, K. A. Hossain, L. Paul and M. Raihan, "A
Deep Learning Based Approach to Predict Lung Cancer from Histopathological
Images", 2021 International Conference on Electronics Communications and
Information Technology (ICECIT), Sep. 2021.

[11] Abugabah, F. Shahid, A. Al-Afeef and R. Khan, "Smart Health Care
Management System for Diagnosis of Lungs Cancer", Fourteenth International
Conference on Ubiquitous and Future Networks (ICUFN), Jul. 2023, [online]
Available: https://doi.org/10.1109/icufn57995.2023.10200746.

[12] S. A. D. L. V. Senarathna, S. P. Y. A. A. Piyumal, R. Hirshan and W. G. C.
W. Kumara, "Lung Cancer Detection and Prediction of Cancer Stages Using Image
Processing", 3rd International Conference on Electrical Control and Instrumentation
Engineering (ICECIE), Nov. 2021, [online] Available:
https://doi.org/10.1109/icecie52348.2021.9664658.


[13] V. Prakash and P. Vas, "Survey on lung Cancer Detection
Techniques", 2020 International Conference on Computational Performance
Evaluation (ComPE), Jul. 2020.

[14] T. Tan et al., "Optimize transfer learning for lung diseases in bronchoscopy
using a new concept: Sequential Fine-Tuning", IEEE Journal of Translational
Engineering in Health and Medicine, vol. 6, pp. 1-8, Jan. 2018.

[15] R. P. Puneeth, A. B. S, A. Shetty, A. K. Rao and A. A. Sooda, "Performance
Evaluation of Machine Learning Algorithms for Lung Nodule Detection using
Multimodal Imaging - A Hybrid Approach", 2nd International Conference on Edge
Computing and Applications (ICECAA), Jul. 2023, [online] Available:
https://doi.org/10.1109/icecaa58104.2023.10212387.

[16] Swierczynski P., Papie B.W., Schnabel J.A., Macdonald C. A level-set
approach to joint image segmentation and registration with application to CT lung
imaging. Comput. Med. Imaging Graph. 2018;65:58–68.

[17] Wang C., Shao J., Lv J., Cao Y., Zhu C., Li J., Shen W., Shi L., Liu D., Li
W. Deep learning for predicting subtype classification and survival of lung
adenocarcinoma on computed tomography. Transl. Oncol. 2021;14:101141.
doi: 10.1016/j.tranon.2021.101141.

[18] Shao J., Wang G., Yi L., Wang C., Lan T., Xu X., Guo J., Deng T., Liu D., Chen
B., et al. Deep learning empowers lung cancer screening based on mobile low-dose
computed tomography in resource-constrained sites. Front. Biosci.
Landmark. 2022;27:212.
