
LUNG CANCER DIAGNOSIS USING DEEP LEARNING
A Major Project Report submitted in partial fulfilment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
IN
INFORMATION TECHNOLOGY
Submitted by
KONERU KETHAN SAI (20071A12E5)
MALLELA KARTHIK REDDY (20071A12F2)
METLA SAI CHARAN (20071A12F3)
PAIDISETTY SAI AMRUTHA (20071A12F6)
PAPPULA DEVI PRASANNA (20071A12F7)

Under the Guidance of

Mr. I Pavan Kumar


Assistant Professor, Department of IT, VNR VJIET

Major Project - Phase II

DEPARTMENT OF INFORMATION TECHNOLOGY

VALLURUPALLI NAGESWARA RAO VIGNANA JYOTHI INSTITUTE OF ENGINEERING & TECHNOLOGY
An Autonomous Institute, NAAC Accredited with ‘A++’ Grade, NBA Accredited for CE, EEE, ME, ECE, CSE, EIE, IT B.Tech Courses, Approved by AICTE, New Delhi, Affiliated to JNTUH, Recognized as “College with Potential for Excellence” by UGC

ISO 9001:2015 Certified, QS I-GAUGE Diamond Rated

Vignana Jyothi Nagar, Pragathi Nagar, Nizampet (S.O), Hyderabad – 500 090, TS, India
APRIL 2024
VALLURUPALLI NAGESWARA RAO VIGNANA JYOTHI
INSTITUTE OF ENGINEERING AND TECHNOLOGY
An Autonomous Institute, NAAC Accredited with ‘A++’ Grade, NBA Accredited for CE, EEE, ME, ECE, CSE, EIE, IT B.Tech Courses, Approved by AICTE, New Delhi, Affiliated to JNTUH, Recognized as “College with Potential for Excellence” by UGC, ISO 9001:2015 Certified, QS I-GAUGE Diamond Rated

Vignana Jyothi Nagar, Pragathi Nagar, Nizampet(SO), Hyderabad-500090, TS, India

DEPARTMENT OF INFORMATION TECHNOLOGY

CERTIFICATE
This is to certify that the project report entitled “LUNG CANCER DIAGNOSIS USING DEEP LEARNING” is a bonafide work done under our supervision and is being submitted by Mr. Koneru Kethan Sai (20071A12E5), Mr. Mallela Karthik Reddy (20071A12F2), Mr. Metla Sai Charan (20071A12F3), Miss Paidisetty Sai Amrutha (20071A12F6), and Miss Pappula Devi Prasanna (20071A12F7) in partial fulfilment for the award of the degree of Bachelor of Technology in Information Technology of VNRVJIET, Hyderabad, during the academic year 2023-2024.

Certified further that to the best of our knowledge the work presented in this thesis has
not been submitted to any other University or Institute for the award of any Degree or
Diploma.

Mr. I Pavan Kumar
Assistant Professor & Project Guide
Department of IT

Dr. D Srinivasa Rao
Associate Professor & HOD
Department of IT
External Examiner

VALLURUPALLI NAGESWARA RAO VIGNANA JYOTHI


INSTITUTE OF ENGINEERING AND TECHNOLOGY
An Autonomous Institute, NAAC Accredited with ‘A++’ Grade, NBA Accredited for CE, EEE, ME, ECE, CSE, EIE, IT B.Tech Courses, Approved by AICTE, New Delhi, Affiliated to JNTUH, Recognized as “College with Potential for Excellence” by UGC, ISO 9001:2015 Certified, QS I-GAUGE Diamond Rated

Vignana Jyothi Nagar, Pragathi Nagar, Nizampet(SO), Hyderabad-500090, TS, India

DEPARTMENT OF INFORMATION TECHNOLOGY

DECLARATION
We declare that the major project work entitled “LUNG CANCER DIAGNOSIS USING DEEP LEARNING”, submitted to the Department of Information Technology, Vallurupalli Nageswara Rao Vignana Jyothi Institute of Engineering and Technology, Hyderabad, in partial fulfilment of the requirement for the award of the degree of Bachelor of Technology in Information Technology, is a bonafide record of our own work carried out under the supervision of Mr. I Pavan Kumar, Assistant Professor, Department of IT, VNRVJIET. We also declare that the matter embodied in this thesis has not previously been submitted by us, in full or in part, for the award of any degree or diploma of any other institution or university.

Place: Hyderabad.

Koneru Kethan Sai (20071A12E5)
Mallela Karthik Reddy (20071A12F2)
Metla Sai Charan (20071A12F3)
Paidisetty Sai Amrutha (20071A12F6)
Pappula Devi Prasanna (20071A12F7)
ACKNOWLEDGEMENT

We express our deep sense of gratitude to our beloved President, Sri D. Suresh Babu, VNR Vignana Jyothi Institute of Engineering & Technology, for the valuable guidance and for permitting us to carry out this project.

With immense pleasure, we record our deep sense of gratitude to our beloved Principal, Dr. C. D. Naidu, for permitting us to carry out this project.

We express our deep sense of gratitude to our beloved Professor Dr. Srinivasa Rao Dammavalam, Associate Professor and Head, Department of Information Technology, VNR Vignana Jyothi Institute of Engineering & Technology, Hyderabad - 500090, for the valuable guidance and suggestions, keen interest, and thorough encouragement extended throughout the period of project work.

We take immense pleasure in expressing our deep sense of gratitude to our beloved guide, Mr. I Pavan Kumar, Assistant Professor in Information Technology, VNR Vignana Jyothi Institute of Engineering & Technology, Hyderabad, for his valuable suggestions and rare insights, and for being a constant source of encouragement and inspiration throughout our project work.

We express our thanks to all those who contributed to the successful completion of our project work.

Mr. Koneru Kethan Sai (20071A12E5)

Mr. Mallela Karthik Reddy (20071A12F2)

Mr. Metla Sai Charan (20071A12F3)

Miss. Paidisetty Sai Amrutha (20071A12F6)

Miss. Pappula Devi Prasanna (20071A12F7)

ABSTRACT

Lung cancer is known as one of the deadliest cancers in the world and causes the largest number of cancer-related deaths. Detection at a late stage is what drives the high death rate; detecting lung cancer at an early stage can save many lives.

Lung cancer, a threatening disease that contributes significantly to the global burden of cancer-related illness, frequently escapes early-stage detection, resulting in considerable loss of life. Recognizing lung cancer in its early stages therefore presents a real opportunity to save lives. Computed tomography (CT) imaging has become a popular tool for identifying even very small tumors that may otherwise go unnoticed. However, differentiating between malignant and benign tumors remains a tough task, even for the most experienced medical practitioners.

In recent years, the arrival of deep learning techniques has opened new ground for research in the medical field. Convolutional neural networks (CNNs), known for their strength in image-analysis tasks, have demonstrated extraordinary potential in the domain of lung cancer detection from CT images. This review undertakes a comprehensive study of the various strategies for applying CNNs to the detection and classification of lung cancer. It also examines the crucial supporting processes, such as pre-processing, segmentation, and nodule extraction, which have gained broad acceptance and contribute considerably to augmenting diagnostic precision.

Combining these approaches with the capabilities of CNNs marks a central frontier in the ongoing campaign against lung cancer, offering new hope for early detection and improved outcomes in the battle against this deadly disease.
TABLE OF CONTENTS

Acknowledgements
Abstract
List of Figures
Chapter-1: Introduction
    1.1 Definition
    1.2 Scope of the Work
    1.3 Objective
    1.4 Thesis Organization
Chapter-2: Literature Survey
Chapter-3: Methodology
    3.1 Introduction
    3.2 Existing System
    3.3 Requirements
    3.4 Proposed System
    3.5 Workflow Diagram
    3.6 Case Study
Chapter-4: Design
    4.1 Algorithms
    4.2 Data Flow Diagrams/UML Diagrams
Chapter-5: Implementation
Chapter-6: Results
Chapter-7: Conclusion
Chapter-8: Future Scope
References

LIST OF FIGURES

Fig 3.5.1 Workflow diagram
Fig 4.2.1.1 Use case diagram
Fig 4.2.2.1 Sequence diagram
Fig 4.2.3.1 Activity diagram
Fig 4.2.4.1 Design Architecture

CHAPTER 1

INTRODUCTION

Lung cancer represents an extreme global health challenge: it stands at the frontline of cancer-related loss of life and carries the highest death rate among all types of cancer. Lung cancer is of two major types, small-cell lung cancer and non-small-cell lung carcinoma. While computed tomography (CT) imaging has demonstrated remarkable effectiveness in revealing even the tiniest lung tumors, the early detection of this disease remains an uphill battle. The main difficulty lies in differentiating cancerous growths from similar, benign tumors, which often share a similar appearance. Moreover, lung cancer can emerge without any early symptoms, or it may present symptoms that mimic those of respiratory infections. These factors compound the challenge of timely diagnosis and encourage an urgent search for alternative detection methods.

Within medical science, deep learning has emerged as a wide-ranging body of techniques that has captured the attention of researchers, who are now putting this transformative technology to use for lung cancer detection and classification. The typical workflow for identifying lung cancer from CT scan images involves a series of central stages: image processing, segmentation, feature extraction, and classification. Each of these phases assumes a critical role in the development of a robust and accurate classifier. Researchers have invested considerable effort in refining these stages, discovering innovative techniques to support the diagnostic process. Simultaneously, numerous systems and methodologies have been developed for the task of detecting and classifying lung cancer.

This review embarks on an exploration of recent advancements in the domain of lung cancer detection systems, detailing and examining the techniques and models that have emerged to achieve effective and efficient lung cancer diagnosis and classification. With the help of these developments, lung cancer can be treated in its early stages, thereby reducing the death rate. The advancement of these technologies, together with experienced medical teams, underscores the urgency and promise of this field.

1.1 DEFINITION

Deep learning has emerged as an encouraging development in the field of medicine, drawing considerable interest because of its potential applications in the domain of lung cancer detection and classification. The traditional approach to identifying lung cancer from CT scan images follows a well-defined sequence of stages: image processing, segmentation, feature extraction, and classification. Each of these phases plays a central part in developing a classification and detection model. Researchers have tried to tune and optimize these stages, inventing techniques whose main goal is to augment the overall diagnostic process. As a result, different systems and methodologies have evolved, each addressing the many complex challenges of detecting and classifying lung cancer.

1.2 SCOPE OF WORK

Our research focuses on addressing the complex challenges associated with lung cancer, a global health concern. We identified the need for more effective methods of detecting and classifying lung cancer, considering its status as a leading cause of cancer-related deaths. Key aspects of our work include understanding the different forms of lung cancer, recognizing the diagnostic struggles in early detection, and acknowledging the central part deep learning plays in medical applications. We also examined the standard diagnostic workflow, highlighting the importance of stages like image processing, segmentation, feature extraction, and classification. Innovations in these stages are a core focus, contributing to the development of various systems and methodologies appropriate for lung cancer detection.

1.3 OBJECTIVE

The aim of the project is to construct a deep learning model for the classification of lung cancer using lung images with pixel values in the range of -1024 to 2389. The objective is to create a highly accurate and efficient system capable of studying these images, identifying the presence of a tumor in a specific area of the lung, and classifying what type of lung cancer it is. In this way, early detection of lung cancer can help save many lives, and appropriate treatment can be provided at an early stage.

1.4 THESIS ORGANIZATION

The thesis is organized into eight chapters, detailed as follows:

 Chapter 1 gives the introduction to our project and its scope.

 Chapter 2 presents a comprehensive examination of research related to lung cancer detection and classification, highlighting key discoveries that form a foundation for the study outlined in this thesis.

 Chapter 3 delves into existing systems, highlighting their strengths and weaknesses while addressing existing deficiencies, and discusses the proposed system. Within this chapter, the prerequisites for the recommended system are defined and categorized into functional and non-functional components. The proposed system is described in terms of its structure, components, and technical specifications, rectifying identified issues and limitations. The workflow diagram is presented here, along with a case study of the filters we used in our project.

 Chapter 4 presents the algorithms we used to develop the project, together with the data flow and UML diagrams for the project.

 Chapter 5 explains how we implemented the overall project, with the code and its explanation.

 Chapter 6 discusses the results we obtained.

 Chapter 7 gives the project's concluding remarks on the application of deep learning to lung cancer detection and classification.

 Chapter 8 proposes the future scope of lung cancer detection and classification using deep learning.
CHAPTER 2

LITERATURE SURVEY

Anand Gudur et al. [1] proposed a four-step procedure for lung cancer detection: pre-processing, segmentation, feature extraction, and classification. In pre-processing, they aimed to remove noise and improve image quality, using a median filter on the input image to remove background noise. Next is image segmentation, where the input CT image is divided into regions based on various features; the techniques mentioned for this step were thresholding and watershed algorithms. They used a c-GAN (conditional Generative Adversarial Network) to extract the lung from the input image: the input slices are sent through encoders and converted into feature maps, which are used as input to a multi-scale feature extraction module and then fed into decoders to extract the lung segmentation. For segmenting nodules, a multi-level, single-click region-growing strategy is used; this is a technique for identifying regions of interest in an image, and it works at various levels of granularity. Blood vessels are removed by morphological operations. The third step, feature extraction, is the important step that produces features from the data which can be used for effective classification of the image; a ResNet-50 model is used to extract the required features, which include surface area, mean intensity, etc. The final classification step involves ascertaining whether the input image is cancerous or not. This is a supervised learning task that includes techniques like CNNs. They used an SVM to classify the deep learning features presented to it by ResNet-50, which then classifies the input image. Three different labels, tumors (T), nodes (N), and metastases (M), were classified using three distinct ResNet-50 networks and an SVM model; every image is provided as input to each of these three combinations. The study achieved a Dice similarity coefficient of 0.99 and a Jaccard index of 0.98, beating the other models under study: NMF (0.85, 0.80), U-Net (0.96, 0.93), and ResNet (0.96, 0.94).
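As a rough illustration of the first two stages described above, the following minimal OpenCV sketch performs median filtering followed by threshold-based segmentation. The input path is a placeholder, and the paper's c-GAN, watershed, and region-growing components are not reproduced here.

import cv2 as cv

# Median filtering (pre-processing) and Otsu thresholding (a simple
# segmentation baseline), as a sketch of the steps described in [1].
img = cv.imread('ct_slice.png', cv.IMREAD_GRAYSCALE)  # placeholder path
denoised = cv.medianBlur(img, 5)                      # median filter, 5x5 window
_, seg = cv.threshold(denoised, 0, 255,
                      cv.THRESH_BINARY + cv.THRESH_OTSU)  # foreground/background split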

Vani Rajasekar et al. [2] used different deep learning models for feature analysis to predict lung cancer: CNN with Gradient Descent (CNN GD), VGG-16, VGG-19, Inception V3, and ResNet-50. A CNN involves five kinds of layers. The input layer, as the name suggests, takes the input image before passing it further into the network. Next comes the convolutional layer, used for feature extraction; it uses ReLU as its activation function, which introduces non-linearity into the learning process. A filter runs over the input image with a certain step (stride), with padding applied to the input image if any; for each filter position it outputs a number representing the extracted feature, such as an edge. The ReLU activation function then converts all negative values to zero. To decrease the dimensions of the generated feature map, pooling layers are used, which take either the maximum or the average of the pixels in the window, depending on the type used. The resulting feature map is then mapped to a normal neural network layer, which carries out the remaining processing. The SoftMax layer calculates the probability that the image belongs to each of the available classes, and the output layer predicts the final output. All the models used are different CNN architectures built from these basic components. Gradient descent is an optimization algorithm that allows the model to learn weights so that the overall loss decreases, increasing predictive power: given a convex loss function, the algorithm adjusts the weights so as to move toward the minimum of that function. The next model, VGG-16, contains 16 layers in 6 blocks. The first and second blocks each consist of 2 convolutional layers with 1 pooling layer; the third, fourth, and fifth blocks each consist of 3 convolutional layers with 1 pooling layer; and the final block consists of 3 fully connected (dense) layers. The convolutional layers use 3x3 filters and the pooling layers use a 2x2 filter. VGG-19 uses 19 layers in 6 blocks; the difference from VGG-16 is that it uses 4 convolutional layers instead of 3 in the third, fourth, and fifth blocks. The Inception V3 model uses convolutional layers of different dimensions within a single module to extract features of different sizes, and is one of the state-of-the-art models. ResNet-50 consists of 50 layers with residual paths between layers, which ensures that features learnt in previous layers are forwarded and takes care of the vanishing and exploding gradient problems. CNN GD is reported to perform better than all the other models considered, scoring an accuracy of 97.86%. The other models' accuracy scores are VGG-16 (96.52%), Inception V3 (93.54%), VGG-19 (92.17%), and ResNet-50 (93.47%).
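To make these layer roles concrete, the following minimal Keras sketch (our own illustration, not the CNN GD architecture from [2]; layer sizes are assumptions) wires together the components just described: convolution with ReLU, pooling, a fully connected layer, a SoftMax output, and plain gradient descent as the optimizer.

import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative CNN only; not the architecture evaluated in [2].
model = models.Sequential([
    layers.Input(shape=(128, 128, 1)),            # input layer: grayscale image
    layers.Conv2D(32, (3, 3), strides=1, padding='same',
                  activation='relu'),             # convolution + ReLU non-linearity
    layers.MaxPooling2D((2, 2)),                  # pooling reduces feature-map size
    layers.Flatten(),
    layers.Dense(64, activation='relu'),          # fully connected layer
    layers.Dense(2, activation='softmax'),        # class probabilities
])
# SGD implements the gradient descent weight updates described above.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss='categorical_crossentropy', metrics=['accuracy'])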

Sudhir Reddy et al. [3] proposed an algorithm for forecasting the growth of malignant cells in CT images; the forecast can help anticipate the area and size of the tumor. First, CT images are obtained from their respective sources and pre-processed accordingly. The CT images are paired to gain spatial information along another axis as well. CNNs are used to extract the required features from these images for further use. A CNN is a feed-forward neural network whose architecture consists of convolutional layers that can extract 2D information from images. This is one of the main advantages of CNNs, since other methodologies, such as classical ML algorithms (SVM, etc.) and ordinary artificial neural networks (ANNs), ignore the spatial information of the pixels in the image. These networks also use pooling layers to highlight the important extracted features, removing unwanted ones and reducing the size of the feature map. The convolutional and pooling layers make up the architecture that draws the required features out of the images, which are then provided to a fully connected network or other ML models for classification. The proposed algorithm contains four stages: locating the tumor cells; computer vision to differentiate the tumor by its features, such as intensity variations and edges; AI-enabled bio-informatics to generate information about the CT image under study; and clinical picture determination to point the way toward clinical assistance for the patient. This algorithm outperformed ML approaches like SVM, Naive Bayes, decision trees, and random forests: the proposed method achieved an accuracy of 92.81%, against random forests (81.3%), decision trees (83.6%), logistic regression (84.06%), Naive Bayes (86.25%), and SVM (88.94%).

Pandian et al. [4] found that textural features of the images efficiently separate normal images from malignant ones. They therefore used a CNN and GoogLeNet, with VGG-16 as a base network. GoogLeNet and VGG-16 were first applied to the dataset. VGG-16 consists of 16 layers in 6 blocks and is one of the famous pretrained architectures. GoogLeNet, on the other hand, consists of 28 layers, including 9 inception blocks, to extract features of various sizes in the image. An inception block consists of two (1x1) convolutional layers connected to (3x3) and (5x5) layers respectively, a (1x1) layer that gets its input from a pooling layer, and another (1x1) layer directly connected to the output of the block. These outputs are concatenated by a concat layer, with occasional normalization layers. The output of the overall network is connected to a linear layer followed by a softmax layer. The CNN they experimented with consists of two blocks of convolution and normalization layers, each followed by a pooling layer, with VGG-16 used as the base network for this CNN architecture.

Bushara et al. [5] wanted to harness the power of data augmentation for classifying lung cancer, so they developed an augmented CNN for this task. It mainly consists of three stages: image data acquisition, data augmentation, and classification using a CNN. A total of 2,066 images were obtained from the LIDC-IDRI repository, of which 80% were used as training data and 20% for testing; 20% of the training data was set aside for validation. Next comes the data augmentation step. Data augmentation is the process of generating more data from the existing data through various operations like rotation, scaling, shearing, zooming, and flipping. Finally comes the classification step, which involves training the CNN model to classify the data as benign or malignant. Before the data is sent to the model, it undergoes the necessary preprocessing steps. The CNN consists of convolutional layers, pooling layers, fully connected layers, and a SoftMax activation layer. ReLU is used as the activation for the convolutional layers to introduce non-linearity into the learned feature maps, which helps in learning non-linearly separable data. The proposed augmented CNN outperformed the U-Net and ResNet combination and other models, which shows the importance of data augmentation, or of a large amount of training data, for the models at hand. The proposed method achieved an accuracy of 95%, while U-Net with ResNet, CNN, and KNG-CNN achieved 84%, 84.15%, and 87.3% respectively.

Tehnan I. A. Mohamed et al. [6] used the Ebola optimization search algorithm (EOSA) to obtain the best weights and biases for a given CNN architecture. They proposed a hybrid metaheuristic-CNN algorithm in which a solution vector for the CNN architecture is first obtained and then used to search for the best weights and biases. The EOSA algorithm is based on an improved SIR (susceptible, infectious, recovered) model of the disease; the model consists of compartments for Susceptible (S), Exposed (E), Infected (I), Hospitalized (H), Recovered (R), Vaccinated (V), Quarantined (Q), and Dead (D) individuals. This creates a search space that can be used to obtain optimal weights and biases. Before the model is trained, the input images are processed: a grayscale filter converts the images to grayscale, a Gaussian blur filter smooths them, and Otsu's thresholding technique selects a threshold value that divides each image into foreground and background for segmentation. Image normalization is then performed to adjust the range of pixel intensity values, and morphological operations like erosion and dilation are applied. They then used a contrast-limited adaptive histogram equalization (CLAHE) filter to remove noise, applied a wavelet transform, and finally split the dataset into training and test sets with an 80/20 split. The CNN consists of 4 blocks, each containing 2 convolutional layers, a zero-padding layer, a pool helper (a custom layer for preselecting features), and a max pooling layer, followed by linear and dropout layers with SoftMax at the end. Based on the CNN model thus developed and the input data, a solution vector is obtained and sent to EOSA for the optimal weights on which the model is trained. They observed that EOSA-CNN performed better than the bare CNN and the other optimization algorithms under study, such as genetic algorithms and the Multiverse Optimizer. The accuracies for GA-CNN, LCBO-CNN, MVO-CNN, SBO-CNN, WOA-CNN, and EOSA-CNN were 0.81, 0.81, 0.79, 0.81, 0.81, and 0.82 respectively, with EOSA-CNN clearly outperforming the others.
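For reference, the pre-processing chain described in [6] can be approximated with standard OpenCV calls. The following sketch uses assumed parameter values and a placeholder file name, and omits the wavelet transform.

import cv2 as cv

img = cv.imread('ct_slice.png')                          # placeholder path
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)               # grayscale filter
smooth = cv.GaussianBlur(gray, (5, 5), 0)                # Gaussian blur smoothing
# Otsu automatically selects the foreground/background threshold
_, mask = cv.threshold(smooth, 0, 255, cv.THRESH_BINARY + cv.THRESH_OTSU)
norm = cv.normalize(smooth, None, 0, 255, cv.NORM_MINMAX)  # intensity normalization
kernel = cv.getStructuringElement(cv.MORPH_ELLIPSE, (3, 3))
opened = cv.dilate(cv.erode(norm, kernel), kernel)       # erosion then dilation
clahe = cv.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(opened)                           # CLAHE contrast/noise step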

Venkatesh et al. [7] proposed a five-phase methodology for lung cancer detection: pre-processing, segmentation, nodule detection, feature extraction, and classification. In the pre-processing step, they used median filtering to remove unwanted noise from the images. For image segmentation, they used Otsu thresholding, a technique that selects a threshold value for separating foreground and background pixels. The cuckoo search algorithm, inspired by nature, is used to extract regions of interest from the segmented images. The algorithm is:

 Initialize a population of cuckoo eggs.

 Randomly place each cuckoo egg in a nest.

 For each cuckoo egg:

o If the nest is empty, place the cuckoo egg in the nest.

o Else if the nest already contains an egg, then with probability pa (the probability that a cuckoo egg will replace a host bird's egg), replace the host bird's egg with the cuckoo egg.

o Else, abandon the cuckoo egg.

 Generate new cuckoo eggs using a Levy flight distribution.

 Repeat steps 2-4 until a termination criterion is met.

 The best cuckoo egg is the solution to the optimization problem.

A Levy flight is a random walk in which the step lengths follow a stable probability distribution. LBP (Local Binary Pattern) is used to extract the required features from the images. The LBP operator is a texture descriptor that compares the center pixel of a 3x3 neighborhood to its 8 neighbors, assigning a binary value to each neighbor depending on the value of the center pixel (a sketch follows below). For classification, a model containing 2 convolutional layers and 2 pooling layers is used, with fully connected, dropout, and SoftMax layers at the end of the architecture. This methodology outperformed those based on particle swarm optimization and genetic algorithms: the proposed model obtained an accuracy of 96.97%, while the PSO and GA methodologies obtained 96.93% and 90.49% respectively.
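The LBP sketch referenced above: each 3x3 neighborhood's center is compared with its 8 neighbors and the comparison bits are packed into an 8-bit code. This is one common LBP variant; [7] does not specify the exact bit ordering, so the ordering here is an assumption.

import numpy as np

def lbp_image(img):
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # the 8 neighbors, clockwise from the top-left (ordering is an assumption)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = img[1:h - 1, 1:w - 1]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= (neighbor >= center).astype(np.uint8) << bit  # 1 if neighbor >= center
    return out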

Chaunzwa et al. [8] used the VGG-16 model for lung cancer classification. As preprocessing steps, they performed isotropic rescaling and density normalization. They used two different pretrained VGG-16 architectures. Model A is fine-tuned using data from 172 patients affected by adenocarcinoma or squamous cell carcinoma; this model is also used to extract features for machine learning classifiers such as KNN, SVM, linear SVM, and random forest. The features are derived from the last pooling layer and the first fully connected layer, corresponding to 512-D and 4096-D vectors respectively. Model B is the fully connected VGG-16 network tuned with data from 228 cases covering all histology types. A KNN trained on the 4096-D features clearly outperforms all the other models developed, with Model A coming next, and the same KNN trained on the 512-D features stands at the top among the models trained on those features. This clearly shows that features produced by a CNN are more valuable than handcrafted ones.
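In the same spirit, and as an assumption about the exact setup (since [8] fine-tuned its own networks), CNN features can be pulled from a pretrained Keras VGG-16 and handed to a classical classifier; 'fc1' is the first fully connected layer, giving 4096-D vectors. The arrays below are placeholders.

import numpy as np
import tensorflow as tf
from sklearn.neighbors import KNeighborsClassifier

base = tf.keras.applications.VGG16(weights='imagenet')   # pretrained VGG-16
feat_model = tf.keras.Model(inputs=base.input,
                            outputs=base.get_layer('fc1').output)  # 4096-D features

x = np.random.rand(10, 224, 224, 3).astype('float32') * 255  # placeholder images
y = np.random.randint(0, 2, size=10)                         # placeholder labels
feats = feat_model.predict(tf.keras.applications.vgg16.preprocess_input(x))
knn = KNeighborsClassifier(n_neighbors=3).fit(feats, y)      # KNN on CNN features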

Sori et al. [9] proposed a model called DFD-Net (Denoising First Detection Network). There are three main phases in this methodology: preprocessing, which consists of lung segmentation and suspicious-nodule detection; denoising of the obtained result; and classification. First, during pre-processing, the input images' pixel values are converted into Hounsfield units, and thresholding is applied to remove unnecessary tissue from the image; a Gaussian filter is then applied to the resulting image. Suspicious nodules are extracted by a U-Net model, which consists of an encoder for feature extraction and a decoder for image reconstruction and segmentation. The images thus obtained are used to train the DFD-Net. First, the images pass through a denoising model named DR-Net (a deep neural network with multi-layer residual blocks), which removes noise from the images supplied to it. The output of this model is sent to the detection part of the DFD-Net. Detection is done by a two-path CNN: the first path uses 3x3 filters and the second 5x5 filters to capture features at those scales; the first path consists of 4 blocks and the second of 3 blocks (conv + pooling). The outputs of the two paths are concatenated and the final output of the whole model is calculated. The proposed approach beats 3D CNNs and other approaches proposed by other researchers, with an obtained accuracy of 87.8%, outperforming the other models in their study.
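A minimal sketch of such a two-path CNN with feature concatenation, using the Keras functional API; the filter counts and input size are assumptions, not the exact DFD-Net configuration from [9].

import tensorflow as tf
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(64, 64, 1))
p1 = inp                                   # path 1: four 3x3 conv + pool blocks
for filters in (32, 64, 128, 256):
    p1 = layers.Conv2D(filters, (3, 3), padding='same', activation='relu')(p1)
    p1 = layers.MaxPooling2D((2, 2))(p1)
p2 = inp                                   # path 2: three 5x5 conv + pool blocks
for filters in (32, 64, 128):
    p2 = layers.Conv2D(filters, (5, 5), padding='same', activation='relu')(p2)
    p2 = layers.MaxPooling2D((2, 2))(p2)
# concatenate the flattened outputs of both paths
merged = layers.Concatenate()([layers.Flatten()(p1), layers.Flatten()(p2)])
out = layers.Dense(2, activation='softmax')(merged)   # benign vs malignant
model = Model(inp, out)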

Ahmed et al. [10] proposed a 3D approach to lung cancer detection. They wanted to take advantage of the spatial information in the third dimension so that detection would be more accurate. To do that, they used a 3D CNN to extract features from 3D image slices. A 3D CNN, too, consists of the basic components: input layer, convolutional layers, pooling layers, and a fully connected layer. They first performed some pre-processing on the data: pixel values were converted into Hounsfield units, twenty 2D slices were stacked, and resizing and averaging were performed on the result. Segmentation by manual thresholding then removed unwanted portions of the image. The processed data was fed to a 3D CNN architecture consisting of 2 convolutional layers (with 3x3x3 filters), 2 max pooling layers (with 2x2x2 kernels), and a dense layer for predicting the presence of cancer. They obtained an accuracy of 80%, which is a fair result.
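A minimal Keras sketch of such a 3D CNN, following the description above (two Conv3D layers with 3x3x3 filters, two 2x2x2 max pooling layers, and a dense output). The slice count matches the twenty stacked slices mentioned; the in-plane size and filter counts are assumptions.

import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(20, 64, 64, 1)),               # 20 stacked 2D slices
    layers.Conv3D(16, (3, 3, 3), padding='same', activation='relu'),
    layers.MaxPooling3D((2, 2, 2)),
    layers.Conv3D(32, (3, 3, 3), padding='same', activation='relu'),
    layers.MaxPooling3D((2, 2, 2)),
    layers.Flatten(),
    layers.Dense(1, activation='sigmoid'),             # probability of cancer
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])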

CHAPTER 3
METHODOLOGY

3.1 INTRODUCTION
Lung cancer is one of the deadliest cancers that affects humans; it is caused by the growth of malignant tumors in the lungs. The disease can be cured if it is detected in its early stages, thereby reducing the deaths it causes. Researchers are therefore trying to develop a system that detects lung cancer in its early stages. In developing such a system, we need data that accurately displays tumors at the early stage of lung cancer. Early-stage tumor cells may be very small, so revealing such tumors can be challenging. CT (computed tomography) images are said to reveal even such small cells, which is exactly what is required to develop a system for early detection of lung cancer.

Several steps are involved in developing a system to detect and classify lung cancer: pre-processing, lung segmentation, feature extraction, and classification. Pre-processing mainly deals with removing noise in the data and making it suitable for further processing. Segmentation is the process of finding the tumor region and separating it from the rest of the image. Feature extraction deals with extracting the required features from the segmented image, and finally classification deals with classifying the image.

Different techniques are used at different stages of the above pipeline to detect lung cancer. In pre-processing, techniques like Gaussian filtering and the discrete wavelet transform can be used to clean the noise and improve the image features. In segmentation, image-processing techniques like thresholding or deep learning models like CNNs can be used. Generally, CNNs are used for feature extraction. Finally, for classification, we can use ML algorithms or deep learning to classify lung cancer.

3.2 EXISTING SYSTEM

Many existing systems use ML algorithms for detection and classification. ML algorithms normally rely on hand-engineered features for the classification process, which may not be helpful, as handpicked features are not very efficient at detecting and classifying lung cancer. ML algorithms also tend to be less effective on high-dimensional data such as CT images, do not clearly model the non-linear relationships in the data, and are not well suited to unstructured data.

3.3 REQUIREMENTS

Software Requirements

1. Programming Language: Python
2. Deep Learning Frameworks: TensorFlow, Keras

Hardware Requirements

1. RAM: 8 GB or higher
2. System type: 64-bit Windows
3. Cloud-based GPU

3.4 PROPOSED SYSTEM

We propose to use CNNs (convolutional neural networks), a type of neural network that considers the spatial information of the pixels in an image. CNNs are used to extract features from images, which can then be used to detect and classify them. A CNN uses filters as weights, which extract important features from the image provided to it.

Various CNN architectures have been proposed by researchers and are used for various purposes.

3.4.1 ALEXNET

We used a modified AlexNet model that takes a 128x128x1 input image. It consists of a convolutional layer with 96 filters of size 11x11 and a stride of 4, followed by a max pooling layer of size 3x3 with a stride of 2x2. The next set of layers consists of 256 filters of size 5x5 with the same max pooling layer. The output is then given to two convolutional layers with 384 filters of size 3x3 each. Finally, it passes through a convolutional layer containing 256 filters with the same filter size as the previous one and a max pooling layer of size 3x3 with a stride of 2x2. The output is flattened and given to two dense layers of 4,096 nodes with dropout regularization, and finally to a softmax output layer. The full implementation appears in Chapter 5.

3.5 WORKFLOW DIAGRAM

Fig 3.5.1 Workflow diagram

First, the input images are sent for pre-processing, where noise is removed and the images are enhanced; every system includes this step, as removing unwanted noise is a must for accurate results. The enhanced images are then sent to a segmentation model to segment the tumor regions; the segmentation task can be handled by a model like a CNN or by an image-processing technique like thresholding. The segmented output is then sent to the classification model for lung cancer classification.

3.6 CASE STUDY

3.6.1 PREWITT OPERATOR

Let us consider a 3x3 image,

$$\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$$

The Prewitt operator is one of the basic operators used for extracting features from images. It considers only the contrast difference between pixels, so the gradient components are

$$G_x = (c + f + i) - (a + d + g), \qquad G_y = (a + b + c) - (g + h + i)$$

The overall Prewitt filters are then

$$\begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix} \text{ (horizontal edge)}, \qquad \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix} \text{ (vertical edge)}$$
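Since OpenCV has no built-in Prewitt function, the kernels above can be applied with its generic 2-D filter; a small sketch with a placeholder input path:

import cv2 as cv
import numpy as np

img = cv.imread('ct_slice.png', cv.IMREAD_GRAYSCALE).astype('float32')
kx = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=np.float32)
ky = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], dtype=np.float32)
gx = cv.filter2D(img, -1, kx)           # horizontal-edge response
gy = cv.filter2D(img, -1, ky)           # vertical-edge response
magnitude = np.sqrt(gx ** 2 + gy ** 2)  # combined edge strength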

3.6.2 SOBEL OPERATOR

Let us consider a 3x3 image in which we need to determine whether the central pixel contains an edge:

$$\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$$

The distance between horizontally or vertically adjacent pixels is 1, and between diagonal pixels it is $\sqrt{2}$. The directional derivative between the central pixel $e$ and the pixel to its right $f$ can be written as $(f - e)\,\hat{i}$, where $\hat{i}$ is a unit vector in the positive x-direction and $e$ and $f$ represent the intensities of the pixels at their positions. The directional derivative between the central pixel and the pixel to its left $d$ can be written as $-(d - e)\,\hat{i}$. The diagonal derivative toward $c$ is

$$\frac{1}{\sqrt{2}}\left[(c - e)\cos 45^\circ\,\hat{i} + (c - e)\sin 45^\circ\,\hat{j}\right] = \frac{1}{2}(c - e)(\hat{i} + \hat{j})$$

Repeating the same process for all eight neighbors and adding the results gives

$$\left[(f - d) + \frac{1}{2}(-a + c - g + i)\right]\hat{i} + \left[(b - h) + \frac{1}{2}(a + c - g - i)\right]\hat{j}$$

which corresponds to the kernels

$$\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \text{ (x-derivative)}, \qquad \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} \text{ (y-derivative)}$$

Averaging multiplies each term by $\frac{1}{8}$; combined with the factor $\frac{1}{2}$ above, this gives a coefficient of $\frac{1}{16}$ per pixel, which, being constant, is dropped to obtain the required operator.

3.6.3 LAPLACIAN OPERATOR

The Laplacian operator is the result of second-order differentiation. The finite differences of the second-order derivatives look like

$$f''_x(x, y) = f(x + 2, y) - 2f(x + 1, y) + f(x, y)$$

$$f''_y(x, y) = f(x, y + 2) - 2f(x, y + 1) + f(x, y)$$

The corresponding filters are

$$\begin{bmatrix} 0 & 0 & 0 \\ 1 & -2 & 1 \\ 0 & 0 & 0 \end{bmatrix} \text{ (x-derivative)}, \qquad \begin{bmatrix} 0 & 1 & 0 \\ 0 & -2 & 0 \\ 0 & 1 & 0 \end{bmatrix} \text{ (y-derivative)}$$

and the combined result is

$$\begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$
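Unlike Prewitt, the Sobel and Laplacian kernels derived above are available as OpenCV built-ins; a brief usage sketch (placeholder input path):

import cv2 as cv

img = cv.imread('ct_slice.png', cv.IMREAD_GRAYSCALE)
sobel_x = cv.Sobel(img, cv.CV_64F, 1, 0, ksize=3)   # 3x3 Sobel x-derivative
sobel_y = cv.Sobel(img, cv.CV_64F, 0, 1, ksize=3)   # 3x3 Sobel y-derivative
laplacian = cv.Laplacian(img, cv.CV_64F)            # second-order derivative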

CHAPTER 4
DESIGN

4.1 Algorithms Used

The algorithm used is the convolutional neural network (CNN), a type of neural network widely used for image data.

Convolutional Neural Network: A convolutional neural network (CNN) is a special type of deep learning model used for the analysis of grid-structured data, such as images. It uses convolutional layers to extract patterns and features from the data. These layers use filters that scan across the input, capturing spatial information. Additionally, pooling layers are used to reduce the dimensionality of the representations and to select the best features from those available. CNNs use their learned parameters to classify or make predictions about complex data, exploiting spatial information for pattern recognition. CNNs work well in the domains of image and signal processing thanks to their ability to learn and extract features that can be used for various purposes. They can even serve as basic components in tasks like object identification, image classification, and other computer vision applications.

4.2 Design and Workflow Diagrams

4.2.1 Use Case Diagram


Use-case diagrams are commonly used UML diagrams that show the use cases of a project. They help show the behavioral attributes of a system or a class, providing an insightful representation of how various elements interact within a given context. These diagrams consist mainly of a group of actors, use cases, and their interrelationships. In a use-case diagram there is a boundary, the system's boundary, enclosing the various use cases, with the actors involved in the system relying on them. Actors are external entities that work with the system through the provided use cases. The use-case diagram serves as a visual tool for understanding, visualizing, and documenting the behavioral aspects of a particular system or component. Such diagrams are particularly useful for clearly explaining and clarifying the operational requirements of an entire system or of specific constituent parts. They provide a detailed overview of how different actors (which can be individuals, other systems, or even hardware devices) interact with the system's functionality to achieve specific goals or complete given tasks.

Fig 4.2.1.1 Use case diagram

4.2.2 Sequence Diagram


A sequence diagram is an interaction diagram mainly used to show the sequential order of the messages exchanged between objects. Within a specific time period, it shows the objects participating in the interaction through their lifelines and the exact sequence of messages they send. Sequence diagrams are particularly useful in scenarios involving real-time specifications and processes because of their ability to provide a clear and detailed representation of the sequential flow of information between the various components of the system.
Fig 4.2.2.1 Sequence diagram

4.2.3 Activity Diagram


Activity diagrams, as the name suggests, depict the activity of the system; they are a visual representation of its dynamic behavior. Activity diagrams show the flow from one action to another, and the actions within these diagrams describe specific system functions. Activity diagrams are not only helpful in constructing executable systems through various engineering methods, but can also be used to show the dynamic characteristics of the system under study.
Fig 4.2.3.1 Activity diagram

4.2.4 DESIGN ARCHITECTURE

Data Collection → Preparing and preprocessing the dataset → Segmentation → Segmented Images → Classification Model Building → Model Training → Model Evaluation → Classification
Fig 4.2.4.1 Design Architecture

First, we start with data collection, where we collect the required data (CT images). We then preprocess the data to remove noise from the dataset. A segmentation model is built for the segmentation task; the preprocessed data is sent to it to segment the images into tumor and background portions. A classification model is built for classifying the cancer type, and the segmented images are sent to it. Finally, the system is evaluated on several metrics and classification is carried out.

CHAPTER 5
IMPLEMENTATION

LIBRARIES
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import pydicom
from pydicom.pixel_data_handlers.util import apply_modality_lut
from pathlib import Path
import os
from tqdm import tqdm
import cv2 as cv
import skimage
from skimage.segmentation import clear_border
from skimage.measure import label, regionprops, perimeter
from skimage.morphology import ball, disk, dilation, binary_erosion, erosion, binary_closing
from skimage import exposure, filters, util
import tensorflow as tf  # tf is used directly below (tf.keras, tf.data, tf.one_hot)
from tensorflow import keras
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, Reshape, Input
# CategoricalAccuracy (rather than Accuracy) is the appropriate metric for
# one-hot labels compared against softmax outputs
from tensorflow.keras.metrics import CategoricalAccuracy, Recall, Precision, F1Score, AUC

We used several libraries to build the model. NumPy is a popular library for working with arrays and mathematical operations. Pandas is used to work with dataframes and spreadsheet-style data. Matplotlib is a well-known plotting library, and Seaborn is another plotting library that extends Matplotlib's features. Pydicom is a library used specifically for working with medical images in DICOM format. Pathlib and os are used for path and operating-system-related operations. cv2 and skimage are used to perform image-related tasks easily, and TensorFlow is used to build and train models. Tqdm displays a progress bar for iterations that might take a long time to run.

SEGMENTATION
First we set some constants, and then we segment the images to obtain the portion of the lungs that we need later.

DATA_PATH_1 = Path("../Data/manifest-1703080534832/")
DATA_PATH_2 = Path("../Data/manifest-1703166770855/")
OUTPUT_PATH = Path("../segmented/")
ANNOTATION_PATH = Path("../Annotation/")

We installed the data in two different folders, so DATA_PATH_1 and DATA_PATH_2 hold those two paths. OUTPUT_PATH is the path where the segmented images are stored, and ANNOTATION_PATH is the path containing the annotations of the data.

df = pd.read_csv(DATA_PATH_1.joinpath('metadata.csv'))

Each data folder contains a metadata CSV file that we use to perform segmentation easily. We change the data path after segmentation is done on the first folder.

def transform_to_hu(dicom_img, img):
    intercept = dicom_img.RescaleIntercept
    slope = dicom_img.RescaleSlope
    hu_img = img * slope + intercept
    return hu_img

def windowing_img(img, level, width):
    img = apply_modality_lut(img.pixel_array, img)
    img_min = level - width // 2
    img_max = level + width // 2
    window_image = img
    window_image[window_image < img_min] = img_min
    window_image[window_image > img_max] = img_max
    return window_image

Next, we define two functions. The first, transform_to_hu(), converts the image to Hounsfield units based on the values in the image's DICOM file. The second, windowing_img(), performs windowing, the process of transforming the image so that abnormalities become clearly visible.

def border_removal(img):
    rescaled_img = exposure.rescale_intensity(img, in_range='image', out_range=(0, 255)).astype('i')
    thresh = rescaled_img < 100
    cleared = clear_border(thresh)
    return rescaled_img, cleared

The border_removal() function above removes the borders that remain after windowing. Since we only need the lungs in the image, we rescale the image, threshold it, and then clear its borders.

def extract_lung(img):
    labeled_img = label(img)
    areas = [r.area for r in regionprops(labeled_img)]
    areas.sort()
    if len(areas) > 2:
        for region in regionprops(labeled_img):
            if region.area < areas[-2]:
                for coordinates in region.coords:
                    labeled_img[coordinates[0], coordinates[1]] = 0
    binary = labeled_img > 0
    return binary

We then extract the lungs from the borderless images using extract_lung(). We first label the image, keep the two regions with the largest areas from the labeled regions (the required lungs), and convert them into a binary image.

def reconstruct_img(img, binary):
    get_high_vals = (binary == 0)
    result = img.copy()
    result[get_high_vals] = 255
    return result

Having obtained the binary mask, we reconstruct the image with the reconstruct_img() function above.

def dilate_res(img):
res_er = dilation(img, disk(0.9))
return res_er

We then apply a morphological operation called dilation to the resulting image so that small parts of the image become clearer.

def overall_extraction(img):
rescaled_img, rmborder_img = border_removal(img)
lung_img = extract_lung(rmborder_img)
morph_img = reconstruct_img(rescaled_img, lung_img)
temp = dilate_res(morph_img)
return temp

This function simply applies all the functions created previously, one after the other. Instead of calling each function separately, we call them all through a single function, which is more readable and understandable.

folder_name = ''
for r in tqdm(range(0, df.shape[0])):
    p = df.loc[r, "File Location"][2:]
    t = p.split("\\")[1]
    folder_name = t
    if t.find("A") != -1:
        typ = 'adenocarcinoma'
    elif t.find("B") != -1:
        typ = 'smallcell'
    elif t.find("E") != -1:
        typ = 'largecell'
    else:
        typ = 'squamouscell'
    dicom_path = Path(os.path.join(DATA_PATH_2, p))
    c = 0
    for dimg in dicom_path.iterdir():
        img = pydicom.dcmread(dimg)      # dcmread is the current name for read_file
        slices = img[0x28, 0x02].value   # Samples per Pixel tag
        wd_img = windowing_img(img, -220, 1300)
        if slices > 1:
            wd_img = wd_img[:, :, -1]
        result = overall_extraction(wd_img)
        instance_id = img[0x08, 0x18].value  # SOP Instance UID tag
        instance_path = ANNOTATION_PATH.joinpath(f"{t[8:]}/{instance_id}.xml")
        if os.path.exists(instance_path):
            np.save(OUTPUT_PATH.joinpath(f"{typ}/{instance_id}.npy"), result)
        else:
            np.save(OUTPUT_PATH.joinpath(f"safe/{instance_id}.npy"), result)
        c += 1

In the above code snippet, we segment all the images and store each as a NumPy array at the required path. We iterate over all the rows of the metadata file, which lists all the downloaded CT slices. We obtain the location where each slice was downloaded and determine its type, and based on that we place it in one of the available folders. Since a single CT scan contains more than one slice, we iterate over all the available CT slices, performing windowing and the other extraction functions defined previously. Then, based on the presence of an annotation file, we move each result into the required folder.

DATA AUGMENTATION

Data augmentation is the process of generating new images from the images already present, which helps increase the number of images and balance the overall dataset.

TYPE = "largecell"
DATA = Path(f"../segmented/{TYPE}/")
AUGMENT_DATA = Path(f"../segmented/augment_{TYPE[0:2]}/")
COUNT = 200

These are some constants that we set before starting data augmentation.

def augment_data(img):
    data_augmentation = tf.keras.Sequential([
        tf.keras.layers.RandomContrast(0.2),
        tf.keras.layers.RandomZoom(0.1, fill_mode='constant', fill_value=255),
        tf.keras.layers.RandomTranslation(height_factor=0.2, width_factor=0.2,
                                          fill_mode='constant', fill_value=255),
        tf.keras.layers.RandomBrightness(0.2)
    ])
    return data_augmentation(img)

This function returns an augmented image when a normal image is given. We randomly vary the contrast, zoom, and brightness, and also translate the image, as part of the data augmentation process.

for filename in tqdm(os.listdir(DATA)):
    img = np.load(DATA.joinpath(filename))
    for i in range(1, COUNT + 1):
        result = augment_data(img[:, :, tf.newaxis])
        np.save(AUGMENT_DATA.joinpath(f"{filename[:-4]}_{i}.npy"), result)

In this code snippet, we generate and save the augmented images at a specific location. For each image, we generate a number of new images based on the COUNT constant set previously. Data augmentation improves the balance of the dataset.

ALEXNET

SEGMENTED_DATA = Path('../segmented/')
MODEL_PATH = Path("../models/")
TRAIN_SPLIT = 0.8
VAL_SPLIT = 0.1
CT_SLICES = 20293

These are the constants that are set to train the model.

def datagen():
    for dirpath, subdirs, files in os.walk(SEGMENTED_DATA):
        if len(files) > 0:
            for i in range(len(files)):
                data = np.load(os.path.join(dirpath, files[i]))
                depth = 5
                cancer_type = dirpath.split('\\')[2]
                if cancer_type == 'adenocarcinoma':
                    label = tf.one_hot(1, depth)
                elif cancer_type == 'largecell':
                    label = tf.one_hot(2, depth)
                elif cancer_type == "smallcell":
                    label = tf.one_hot(3, depth)
                elif cancer_type == "squamouscell":
                    label = tf.one_hot(4, depth)
                else:
                    label = tf.one_hot(0, depth)  # 'safe' (no cancer)
                if data.ndim == 2:
                    data = data[:, :, np.newaxis]  # add a channel axis
                yield data, label

The datagen() function above is a data generator that yields data from the directory mentioned, as (data, label) pairs. This generator function is given as input to the model for training.

def resize_images(img, label):
    img = tf.image.resize(img, (128, 128))
    return img, label

The above function resizes the images from 512x512 to 128x128, which helps with training.

dataset = tf.data.Dataset.from_generator(
    datagen,
    output_signature=(tf.TensorSpec(shape=(512, 512, 1), dtype=tf.int32),
                      tf.TensorSpec(shape=(5,), dtype=tf.int32)))
dataset = dataset.map(resize_images).map(
    lambda x, y: (tf.cast(x, tf.int32), tf.cast(y, tf.float32)))

In this code, we apply the resize function after creating the dataset from the data generator function written previously. This dataset is given to the model to train on.

dataset = dataset.shuffle(buffer_size=1000)  # shuffle must be assigned back
train_size = int(TRAIN_SPLIT * CT_SLICES)
val_size = int(VAL_SPLIT * CT_SLICES)
train_ds = dataset.take(train_size)
val_ds = dataset.skip(train_size).take(val_size)
test_ds = dataset.skip(train_size).skip(val_size)

Here we shuffle the dataset and split it into training, validation, and test sets according to the sizes derived from the constants defined earlier.

def alexnet(input_shape=(128, 128, 1), num_classes=5):
    input_layer = Input(shape=input_shape)

    x = Conv2D(96, (11, 11), strides=(4, 4), activation='relu')(input_layer)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(x)

    x = Conv2D(256, (5, 5), padding='same', activation='relu')(x)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(x)

    x = Conv2D(384, (3, 3), padding='same', activation='relu')(x)
    x = Conv2D(384, (3, 3), padding='same', activation='relu')(x)
    x = Conv2D(256, (3, 3), padding='same', activation='relu')(x)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(x)

    x = Flatten()(x)
    x = Dense(4096, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(4096, activation='relu')(x)
    x = Dropout(0.5)(x)

    output_layer = Dense(num_classes, activation='softmax')(x)

    model = Model(inputs=input_layer, outputs=output_layer)
    return model

model = alexnet()

# CategoricalAccuracy is used because the labels are one-hot vectors and the
# model outputs softmax probabilities.
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=[CategoricalAccuracy(), Recall(), Precision(), F1Score(), AUC()])

history = model.fit(train_ds.batch(32), epochs=20)

This is the overall AlexNet code for building, compiling, and training the model.
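A possible follow-up, not part of the original listing: evaluating on the held-out test split and saving the trained model under the MODEL_PATH constant defined earlier (the file name is a placeholder).

# Hypothetical evaluation/save step; 'alexnet_lung.keras' is a made-up name.
results = model.evaluate(test_ds.batch(32), return_dict=True)
print(results)
model.save(MODEL_PATH.joinpath("alexnet_lung.keras"))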

CHAPTER 6
RESULTS

We were able to obtain around 95% accuracy, precision and recall of 0.96, and an F1 score of 0.91, which are decent results.

CHAPTER 7
CONCLUSION

The application of deep learning to lung cancer diagnosis can be very helpful in changing the way we detect, classify, and manage this deadly disease. Deep learning models can be improved in many ways, such as better accuracy, earlier detection, and personalized treatment for lung cancer patients, all of which support early detection of lung cancer. These models can also be integrated with various medical data sources, such as medical images and electronic health records, giving them a more detailed understanding of the disease.

The future of lung cancer diagnosis using deep learning holds numerous opportunities, from early detection to continuous monitoring, medicine recommendation, and more. With the help of AI, healthcare professionals can make more informed decisions, saving lives and improving patient conditions.

We also need to take care of various concerns, such as addressing ethical, privacy, and regulatory considerations, to make sure that deep learning in healthcare is safe and effective. Collaboration between researchers, healthcare institutions, and other stakeholders is essential to improve the field as a whole, which ultimately benefits patients and improves the living conditions of society.
CHAPTER 8
FUTURE SCOPE

Drug Discovery and Treatment Optimization: AI can help identify potential avenues for lung cancer treatment and optimize treatment plans based on patient-specific data, reducing adverse effects and improving efficacy.

Personalized Medicine: Deep learning can provide appropriate treatment plans based on the specific characteristics of a patient's lung cancer, extracting various features of tumors and identifying the treatment plan that needs to be followed.

Remote Diagnosis: Deep learning models can be used in medical applications that allow healthcare professionals to remotely diagnose, treat, and monitor lung cancer patients.

Risk Assessment and Prediction: Deep learning can be used to develop models that assess the risk of developing lung cancer based on factors such as smoking history, genes, and the patient's environmental exposure.

Integration with Electronic Health Records (EHRs): EHR data can be collected and used for training and enhancing deep learning models; integration with such patient records can enable more accurate diagnosis.
REFERENCES

[1] Gudur, A., Sivaraman, H., & Vimal, V. (2023). Deep learning-based detection of lung nodules in CT scans for cancer screening. International Journal of Intelligent Systems and Applications in Engineering, 11(7s), 20-28.

[2] Rajasekar, V., Vaishnnave, M. P., Premkumar, S., Sarveshwaran, V., & Rangaraaj, V. (2023). Lung cancer disease prediction with CT scan and histopathological images feature analysis using deep learning techniques. Results in Engineering, 18, 101111.

[3] Reddy, N. S., & Khanaa, V. (2023). Intelligent deep learning algorithm for lung cancer detection and classification. Bulletin of Electrical Engineering and Informatics, 12(3), 1747-1754.

[4] Pandian, R., Vedanarayanan, V., Kumar, D. R., & Rajakumar, R. (2022). Detection and classification of lung cancer using CNN and Google net. Measurement: Sensors, 24, 100588.

[5] Bushara, A. R. (2022). A deep learning-based lung cancer classification of CT images using augmented convolutional neural networks. ELCVIA Electronic Letters on Computer Vision and Image Analysis, 21(1).

[6] Mohamed, T. I., Oyelade, O. N., & Ezugwu, A. E. (2023). Automatic detection and classification of lung cancer CT scans based on deep learning and Ebola optimization search algorithm. PLOS ONE, 18(8), e0285796.

[7] Venkatesh, C., Ramana, K., Lakkisetty, S. Y., Band, S. S., Agarwal, S., & Mosavi, A. (2022). A neural network and optimization-based lung cancer detection system in CT images. Frontiers in Public Health, 10, 769692.

[8] Chaunzwa, T. L., Hosny, A., Xu, Y., Shafer, A., Diao, N., Lanuti, M., ... & Aerts, H. J. (2021). Deep learning classification of lung cancer histology using CT images. Scientific Reports, 11(1), 5471.

[9] Sori, W. J., Feng, J., Godana, A. W., Liu, S., & Gelmecha, D. J. (2021). DFD-Net: lung cancer detection from denoised CT scan image using deep learning. Frontiers of Computer Science, 15, 1-13.

[10] Ahmed, T., Parvin, M. S., Haque, M. R., & Uddin, M. S. (2020). Lung cancer detection using CT image based on 3D convolutional neural network. Journal of Computer and Communications, 8(03), 35.
