C4 - Project Report Phase 2
LUNG CANCER DIAGNOSIS USING DEEP LEARNING
A Major Project Report Submitted in partial fulfilment of the requirements for the award
of the degree of
BACHELOR OF TECHNOLOGY
IN
INFORMATION TECHNOLOGY
Submitted by
KONERU KETHAN SAI (20071A12E5)
MALLELA KARTHIK REDDY (20071A12F2)
METLA SAI CHARAN (20071A12F3)
PAIDISETTY SAI AMRUTHA (20071A12F6)
PAPPULA DEVI PRASANNA (20071A12F7)
Vignana Jyothi Nagar, Pragathi Nagar, Nizampet (S.O), Hyderabad – 500 090, TS, India
APRIL 2024
VALLURUPALLI NAGESWARA RAO VIGNANA JYOTHI
INSTITUTE OF ENGINEERING AND TECHNOLOGY
An Autonomous Institute, NAAC Accredited with ‘A++’ Grade, NBA Accredited for CE, EEE, ME, ECE,
CSE, EIE, IT B. Tech Courses, Approved by AICTE, New Delhi, Affiliated to JNTUH, Recognized as
“College with Potential for Excellence” by UGC, ISO 9001:2015 Certified, QS I-GAUGE Diamond Rated
CERTIFICATE
This is to certify that the project report entitled “LUNG CANCER DIAGNOSIS
USING DEEP LEARNING” is a bonafide work done under our supervision and is
being submitted by Mr. Koneru Kethan Sai (20071A12E5), Mr. Mallela Karthik
Reddy (20071A12F2), Mr. Metla Sai Charan (20071A12F3), Miss. Paidisetty Sai
Amrutha (20071A12F6), Miss. Pappula Devi Prasanna (20071A12F7) in partial
fulfilment for the award of the degree of Bachelor of Technology in Information
Technology, of the VNRVJIET, Hyderabad during the academic year 2023-2024.
Certified further that to the best of our knowledge the work presented in this thesis has
not been submitted to any other University or Institute for the award of any Degree or
Diploma.
Department of IT Department of IT
External Examiner
DECLARATION
We declare that the major project work entitled “LUNG CANCER DIAGNOSIS
USING DEEP LEARNING” submitted in the Department of Information Technology,
Vallurupalli Nageswara Rao Vignana Jyothi Institute of Engineering and
Technology, Hyderabad, in partial fulfilment of the requirement for the award of the
degree of Bachelor of Technology in Information Technology is a bonafide record of
our own work carried out under the supervision of Mr. I Pavan Kumar, Assistant
Professor, Department of IT, VNRVJIET. Also, we declare that the matter embodied
in this thesis has not been submitted by us in full or in any part thereof for the award of
any degree/diploma of any other institution or university previously.
Place: Hyderabad.
Pappula Devi
Prasanna
(20071A12F7)
ACKNOWLEDGEMENT
We express our deep sense of gratitude to our beloved President, Sri. D. Suresh Babu,
VNR Vignana Jyothi Institute of Engineering & Technology for the valuable guidance
and for permitting us to carry out this project.
With immense pleasure, we record our deep sense of gratitude to our beloved Principal,
Dr. C. D. Naidu, for permitting us to carry out this project.
We express our deep sense of gratitude to our beloved Professor Dr. SRINIVASA RAO
DAMMAVALAM, Associate Professor and Head, Department of Information
Technology, VNR Vignana Jyothi Institute of Engineering & Technology, Hyderabad-
500090, for the valuable guidance and suggestions, keen interest and thorough
encouragement extended throughout the period of the project work.
We take immense pleasure to express our deep sense of gratitude to our beloved Guide,
Mr. I Pavan Kumar, Assistant Professor in Information Technology, VNR Vignana
Jyothi Institute of Engineering & Technology, Hyderabad, for his valuable suggestions
and rare insights, and for being a constant source of encouragement and inspiration
throughout our project work.
We express our thanks to all those who contributed for the successful completion of our
project work.
ABSTRACT
Lung cancer is known as one of the deadliest cancers around the world and causes the
largest number of cancer-related deaths. Detecting lung cancer only in its last stage is
what leads to deaths at such a high rate. This can be stopped by detecting lung cancer
at an early stage, which can save many lives.
Lung cancer, a threatening disease that contributes significantly to the global burden
of cancer-related illness, frequently escapes early-stage detection, resulting in
considerable loss of life. Recognizing lung cancer in its early stages therefore presents
a real opportunity to save lives. Computed tomography (CT) imaging has become a
popular tool for identifying even very small tumors that often go unnoticed. However,
differentiating between malignant and benign tumors remains a tough task, even for
the most experienced medical practitioners.
In recent years, the arrival of deep learning techniques has opened new ground for
research in the medical field. Convolutional neural networks (CNNs), known for their
expertise in tasks related to image analysis, have demonstrated extraordinary potential
in the domain of lung cancer detection from CT images. This review undertakes a
comprehensive study of the various strategies involved in applying CNNs to the
detection and classification of lung cancer. Moreover, it examines the crucial
processes, such as pre-processing, segmentation, and nodule extraction, which have
gained improved acceptance and contributed a considerable amount of work to
augmenting diagnostic precision.
CONTENTS
Abstract ii
List of Figures iv
Chapter-1: Introduction 1
1.1 Definition 2
1.3 Objective 3
Chapter-3: Methodology 16
3.1 Introduction 16
3.3 Requirements 17
Chapter-4: Design 28
4.1 Algorithms 28
Chapter-5: Implementation 34
Chapter-6: Results
Chapter-7: Conclusion 35
References 36
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
Lung cancer represents an extreme global health challenge, standing at the frontline of
cancer-related loss of life and carrying the highest death rate among all types of cancer.
Lung cancer is of two major types: small-cell lung cancer and non-small-cell lung
carcinoma. While computed tomography (CT) imaging can reveal even the tiniest kind
of lung tumors, the early detection of this disease remains an uphill battle. The main
difficulty lies in differentiating cancerous growths from benign tumors, which often
share a similar appearance. Moreover, lung cancer can emerge without any early
symptoms, or it may present symptoms that mimic those of respiratory infections.
These factors compound the challenge of timely diagnosis and encourage an urgent
search for alternative detection methods.
Within the dimension of medical science, deep learning has emerged as a wide-ranging
field, capturing the attention of researchers who are now putting this transformative
technology to use for lung cancer detection and classification. The typical workflow
for identifying lung cancer from CT scan images involves a series of stages:
pre-processing, segmentation, feature extraction, and classification.
Each of these phases assumes a critical role in the development of a robust and accurate
classifier. Researchers have invested considerable effort in refining these stages, and
numerous systems and methodologies have been developed to tackle the task of lung
cancer detection.
This review embarks on an exploration of recent advancements in the domain of lung
cancer detection systems, detailing and examining the techniques and models that have
come to light to achieve effective and efficient lung cancer diagnosis and classification.
With the help of these developments, we may be able to find a way to treat lung cancer
in its early stages, thereby reducing the death rate. The pace of these technological
advances, together with experienced medical teams, underscores the urgency of this work.
1.1 DEFINITION
Deep learning has attracted considerable interest because of its potential applications
in the domain of lung cancer detection and classification. The traditional approach to
identifying lung cancer from CT scan images follows a well-defined sequence of stages: image processing,
segmentation, feature extraction, and classification. Each of these phases plays a central
part in developing a classification and detection model. Researchers have tried to tune
and optimize these stages, innovating inventive techniques whose main goal is to
augment the overall diagnostic process. As a result, a variety of systems and
methodologies have emerged.
Our research focuses on addressing the complex challenges associated with lung
cancer, a global health concern. We recognized the need for more effective methods
of detecting and classifying lung cancer, considering its status as a leading cause of
cancer-related deaths. Some of the key aspects of our work include understanding the
different forms of lung cancer, recognizing the diagnostic struggles in early detection,
and acknowledging the central part deep learning plays in medical applications. We
also examined the standard diagnostic workflow, highlighting its key stages.
1.3 OBJECTIVE
The aim of the project is to use lung CT images, with pixel values in the range −1024
to 2389, to construct a deep learning model for the classification of lung cancer. The
objective is to create a highly accurate and efficient system capable of studying these
images, identifying the presence of a tumor in a specific area of the lung, and
classifying which type of lung cancer it is. In this way, early detection of lung cancer
can help save many lives.
The thesis is organized into the following chapters, which are detailed below.
In Chapter 1, we have specified the introduction of our project and its scope.
In Chapter 2, we survey the existing literature, noting the strengths and weaknesses of
prior systems while addressing any existing deficiencies, and discuss some proposed
systems.
In Chapter 3, the prerequisites for the recommended methodology are given; the
workflow diagram is presented here, along with the case study we included.
In Chapter 4, we present the algorithms that we used in our project to develop it. The
dataflow diagram of the model and the UML diagrams for our project are included.
In Chapter 5, we have mentioned the way we implemented the overall project.
In Chapter 6, the results obtained are presented.
In Chapter 7, we present the concluding remarks of the project, relating to lung cancer
detection and classification.
Finally, the future scope of lung cancer classification and detection is discussed.
CHAPTER 2
LITERATURE SURVEY
Anand Gudur et al.[1] proposed a four-step procedure for lung cancer detection. These
four steps include processing, segmentation, feature extraction and classification. In pre-
processing, they wanted to remove noise and improve the image quality. They used
a median filter in pre-processing the input image to remove the background noise.
Next is image segmentation where the input CT image is divided into regions based on
various features. The techniques that were mentioned as part of this step were
thresholding and watershed algorithms. They used c-GAN (conditional Generative
Adversarial Network) to extract the lung from the input image. The input slices are sent
through encoders and converted into feature maps. These feature maps are then used as
input to a multi-scale feature extraction module. Then they are fed into decoders to
extract the lung segmentation. For segmenting nodules, a multi-level, single-click
region-growing technique is used, followed by morphological operations. The third
step, feature extraction, is the important step which
produces important features from the data which can be used for effective classification
of the image. ResNet-50 model is used to extract the required features. Some of these
features include surface area, mean intensity, etc. Then comes the classification step,
the final step, which involves ascertaining whether the input image is cancerous or not.
This is a supervised learning task that includes techniques like CNNs, etc. They used an SVM to
classify the deep learning features that are presented to it by ResNet-50, which then
classifies the input image. There are 3 different labels, tumors (T), nodes (N), and
metastases (M), which were classified using three distinct ResNet-50 networks, each
paired with an SVM model. Every image is provided as input to these three
combinations. The present study got a dice similarity coefficient of 0.99 and a Jaccard
index of 0.98, which beats the other models under study: NMF (0.85, 0.80), U-Net
(0.96, 0.93), and ResNet (0.96, 0.94).
Vani Rajasekar et al.[2] used different deep learning models for feature analysis to
predict lung cancer. The models they used were CNN Gradient Descent (CNN GD),
VGG-16, VGG-19, Inception V3 and Resnet-50. When it comes to CNN, there are five
different kinds of layers involved in it. The input layer as the name suggest takes the
input image that is provided to it before it passes to further into the network. Next, comes
the convolutional layer which is used for the feature extraction. It uses ReLU as its
activation function which introduces non-linearity into learning process. A filter which is
runs over the input image with a certain step (stride) and padding if any to the input
image. Then for each filter position it outputs an integer which represents the extracted
feature like edges. Then the ReLU activation function takes the values and converts all
the negative numbers to zero. To decrease the dimension of the generated feature map,
pooling layers are used which either take the maximum or the average of the pixels in the
window based on the type used. Then the obtained feature map is mapped to a normal
neural network layer which carries out the remaining process. The SoftMax layer
calculates the probability that the image belongs to each of the available classes. Then
the output layer predicts the final output. All the models used are different CNN
architectures. In CNN Gradient Descent, the loss decreases as training proceeds,
increasing the predictive power: we obtain a convex function, and the algorithm tries
to adjust the weights so as to move to the minimum of that convex function and
decrease the loss. The next model, VGG-16, contains 16 layers in 6
blocks. The first and second blocks consist of 2 conv layers with 1 pooling layer each,
while the third, fourth and fifth blocks consist of 3 conv layers with 1 pooling layer
each, and the final block consists of 3 fully connected (Dense) layers. The conv layers
use 3*3 filters and the pooling layers use 2*2 filters. VGG-19 uses 19 layers in 6 blocks. The
difference between VGG-16 and 19 is that the latter one uses 4 conv layers instead of 3 in
third, fourth and fifth blocks. The Inception V3 model uses conv layers of different
dimensions in a single module to extract features of different sizes, which is one of its
distinguishing ideas. ResNet-50 adds skip connections between every two layers. This
ensures that the features learnt in the previous layers are forwarded and that the
problem of vanishing and exploding gradients is taken care of. CNN GD is said to
perform better than all the other models considered, scoring an accuracy of 97.86;
the remaining models (VGG-16, VGG-19, Inception V3, ResNet-50) scored lower.
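The generic layer stack described in this entry (convolution with ReLU, pooling, flatten, dense, SoftMax) can be sketched in Keras. This is a minimal illustration of the architecture pattern, not the exact CNN GD model of [2]; the 64*64 input size, filter counts, and two-class output are assumptions:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Minimal CNN sketch: conv + ReLU -> pooling -> flatten -> dense -> softmax.
model = keras.Sequential([
    layers.Input(shape=(64, 64, 1)),               # grayscale CT patch (illustrative size)
    layers.Conv2D(32, (3, 3), activation="relu"),  # feature extraction with 3x3 filters
    layers.MaxPooling2D((2, 2)),                   # downsample the feature maps
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),         # e.g. benign vs malignant
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
probs = model.predict(np.zeros((1, 64, 64, 1)), verbose=0)  # class probabilities, sum to 1
```

Gradient-descent training as described above would then be a call to `model.fit` on labeled image batches.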
Sudhir Reddy et al.[3] proposed an algorithm for forecasting the growth of malignant
cells in the CT image. The forecasting can help us to anticipate the area and size of the
tumor. Firstly, CT images are obtained from the respective sources which are then pre-
processed accordingly. The CT images are paired to gain spatial information along
another axis too. CNN are used to extract the required features from these images which
can be used for further purposes. A CNN is a feed-forward neural network that
preserves the spatial information from the images. This is one of the main advantages
of CNNs, since other methodologies like ML algorithms (SVM, etc.) and normal ANNs (artificial
neural networks) will ignore the spatial information of the pixels in the image. These
networks also use pooling layers to highlight the important features thus extracted,
removing the unwanted ones and reducing the size of the feature map. These conv and
pooling layers make up the architecture which draws out the required features from the
images which are then provided to a fully connected network or other ML models for
classification purposes. The proposed algorithm contains four stages, namely extortion
location for locating the tumor cells, PC vision to differentiate the tumor by its features,
a stage that gathers information regarding the CT image under study, and clinical
picture determination to provide the right direction for clinical assistance to the patient.
This algorithm outperformed other
ML ones like SVM, Naïve Bayes, Decision Tree and Random Forest. The proposed
method got an accuracy of 92.81, against Random Forest (81.3), Decision Tree (83.6),
and Logistic Regression.
Pandian et al.[4] found that the textural features of the images proved to efficiently
classify the normal and the malignant ones. So, they used a CNN and GoogleNet with
VGG-16 as the base network. GoogleNet and VGG-16 were applied on the dataset
initially. GoogleNet is built from inception blocks; each inception block consists of 2
(1*1) conv layers that are connected to (3*3) and (5*5) conv layers
respectively. There is a (1*1) layer which gets its input from a pooling layer and there is
another (1*1) layer that is directly connected to the output of the block. These outputs are
concatenated by a concat layer and there are occasional normalization layers. The output
of the overall network is connected to a linear layer which is connected to softmax layer.
The CNN network that they experimented on consists of two blocks of layers, each
with convolution and normalization layers followed by a pooling layer. The VGG-16
is used as a base network for GoogleNet.
Bushara et al.[5] wanted to harness the power of data augmentation in classifying the
lung cancer. So, they developed Augmented CNN for this task. It mainly consists of three
stages, image data acquisition, data augmentation and classification using CNN. Total of
2066 images were obtained from LIDC-IDRI repository of which 80% were used as
training data and 20% for testing. 20% of validation data was taken from the training
data. Next comes the data augmentation step. Data augmentation is the process of
generating more data from the existing data by various operations like rotation, scaling,
shearing, zooming, flipping, etc. Finally, there comes the classification step, which involves
training the CNN model to classify the data as benign or malignant. Before the data is
sent to the model, it goes through the necessary preprocessing steps. The CNN consists of
convolutional layers, pooling layers, fully connected and SoftMax activation layers.
ReLU is used as activation for the convolutional layers to introduce non-linearity into the
learned feature maps which can help us to learn non-linearly separable data. The
proposed augmented CNN outperformed the U-Net and ResNet combination and the
other models, which shows the importance of data augmentation, and of a large amount
of training data in general, for the models at hand. The proposed method got an
accuracy of 95, while the other models, UNet+ResNet, CNN, and KNG-CNN, got 84,
84.15, and 87.3 respectively.
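The augmentation operations listed above (rotation, shifts, shear, zoom, flips) can be sketched with Keras' `ImageDataGenerator`. The parameter values here are illustrative, not those used by Bushara et al.:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation sketch: each generated batch applies random rotation, shifts,
# shear, zoom and horizontal flips to the input images (illustrative ranges).
augmenter = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
)

images = np.random.rand(8, 64, 64, 1).astype("float32")  # stand-in CT patches
batch = next(augmenter.flow(images, batch_size=8, shuffle=False))  # one augmented batch
```

In training, `augmenter.flow(x_train, y_train, ...)` would feed the CNN a stream of such randomized batches, effectively enlarging the dataset.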
The authors of [6] used a metaheuristic optimization algorithm for obtaining the best
weights and biases for a given CNN model architecture. They proposed a hybrid
metaheuristic-and-CNN algorithm, where they first obtained a solution vector from the
CNN architecture which is then used to search for the best weights and biases. The
metaheuristic used is the Ebola Optimization Search Algorithm (EOSA), which is
inspired by a propagation model of the Ebola disease. The model consists of
compartments, which are Susceptible (S),
Exposed (E), Infected (I), Hospitalized (H), Recovered (R), Vaccinated (V), Quarantine
(Q), and Death (D). This creates a search space that can be used to obtain optimal weights
and bias. Before the model is trained, the input images are processed. They used
grayscale filter to convert the images to grayscale, gaussian blur filter for smoothing,
Otsu’s thresholding technique is used where a threshold value is selected that divides the
image into foreground and background for image segmentation. Image normalization is
then performed that changes the range of pixel intensity values. Morphological operations
like erosion and dilation are used. Then they used contrast-limited adaptive histogram
equalization (CLAHE) filter to remove the noise, applied wavelet transform and finally
split the dataset into training and test set in 80%, 20% split. The CNN consists of 4
blocks that contains 2 conv layers, zero padding layer, pool helper (custom layer for
preselecting features) and max pooling layer respectively. These are followed by Linear,
Dropout layers with SoftMax at last. Based on the CNN model thus developed and input
data, a solution vector is obtained which is sent to the EOSA for optimal weights on
which the model is trained. By this study, they observed that the EOSA-CNN performed
well then bare CNN and other optimization algorithms under study like, Genetic
SBO-CNN, WOA-CNN, and EOSA-CNN were 0.81, 0.81, 0.79, 0.81, 0.81, and 0.82,
Venkatesh et al.[7] proposed a five-phase methodology for lung cancer detection. In
the pre-processing step, they used median filtering to remove any unwanted noise in
the images. For image segmentation, they used Otsu thresholding. Otsu thresholding
is a thresholding technique that selects a threshold value for segmenting foreground and
background pixels. The cuckoo search algorithm is used for the extraction of regions
of interest from the segmented images. This algorithm is inspired by nature: each
cuckoo drops its egg into a randomly chosen host nest, and if the nest already contains
an egg, then with probability pa (the probability that a cuckoo egg will replace a host
bird's egg) the host bird's egg is replaced. New candidate solutions are generated using
Levy flights.
A Levy flight is a random walk in which the step lengths have a stable probability
distribution. The LBP (Local Binary Pattern) is used to extract required features from the
images. The LBP operator is a texture descriptor that works by comparing the center
pixel of a 3x3 neighborhood to its 8 neighbors and assigning a binary value to each
neighbor depending on the value of the center pixel. Then, for classification, a model
containing 2 conv layers and 2 pooling layers is used, with fully connected, dropout
and softmax layers. The proposed approach is compared with methods based on
particle swarm optimization and the genetic algorithm. The proposed model obtained
an accuracy of 96.97, while the PSO and GA methodologies obtained 96.93 and 90.49
respectively.
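The LBP computation described above can be illustrated from scratch for a single 3*3 neighbourhood (full-image implementations exist, e.g. `skimage.feature.local_binary_pattern`); the sample patch values are made up:

```python
import numpy as np

def lbp_code(patch):
    """Basic LBP for one 3x3 patch: compare the 8 neighbours to the centre
    pixel and pack the resulting bits into a single byte (clockwise order)."""
    center = patch[1, 1]
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    bits = [1 if n >= center else 0 for n in neighbors]     # threshold at centre
    return sum(b << i for i, b in enumerate(bits))          # pack bits into 0..255

patch = np.array([[10, 20, 30],
                  [40, 25, 60],
                  [70, 80, 90]])
code = lbp_code(patch)   # -> 252: only the two neighbours below 25 give 0-bits
```

A histogram of such codes over all pixels forms the texture feature vector fed to the classifier.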
Chaunzwa et al.[8] used VGG-16 model for lung cancer classification. As the
preprocessing steps, they performed isotropic rescaling and density normalization. They
used two different VGG-16 pretrained architectures for this process. Model A is fine-
tuned using the data of 172 patients who are affected by adenocarcinoma or squamous
cell cell carcinoma. This model is also used for extracting features for the machine learning classifiers
like, KNN, SVM, Linear-SVM, RF. The features are derived from the last pooling and
first fully connected layers, corresponding to 512-D and 4096-D vectors. Model B is the
fully connected VGG-16 network tuned with data from 228 cases covering all histology
types. The KNN trained on the 4096-D features clearly outperforms all the other
models that were developed; Model A comes next. The same KNN trained on the
512-D features stands top among the models trained on those features. This clearly
shows that the features given by the fully connected layer are highly discriminative.
Sori et al.[9] proposed a model called DFD-Net (which stands for Denoising First
Detection Network). There are mainly three phases involved in this methodology:
pre-processing, denoising the obtained result, and then classification. Firstly, under the pre-processing of
the images, the input images pixel values are converted into Hounsfield Unit, on which
thresholding is applied to remove unnecessary tissues from the image. Then Gaussian
filter is applied to the image thus obtained. The nodules that are suspicious are removed
by U-Net model which consists of encoder for feature extraction and decoder for image
reconstruction and segmentation. The images thus obtained are used to train the DFD-Net.
First, the images pass through a denoising model named DR-Net (Deep Neural Network
with Multi-Layer Residual Blocks), which is used to remove the noise from the images
that are supplied to it. The output of this model is sent to the detection part of the DFD-
Net. The detection process is done by a 2-path CNN: the first path uses 3*3 filters and
the second uses 5*5 filters to capture features in those dimensions. The first path
consists of 4 and the
second one consists of 3 blocks (conv + pooling). The outputs of each of the paths are
concatenated and the final output of the whole model is calculated. The proposed
approach beats the 3D CNN and other approaches proposed by other researchers.
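The Hounsfield-unit conversion and thresholding step mentioned above can be sketched as follows. The linear rescale uses the DICOM RescaleSlope/RescaleIntercept convention; the slope, intercept, threshold window, and sample values are illustrative, not figures from [9]:

```python
import numpy as np

def to_hounsfield(raw, slope=1.0, intercept=-1024.0):
    """Convert raw CT pixel values to Hounsfield Units via the DICOM
    RescaleSlope / RescaleIntercept (typical values, assumed here)."""
    return raw.astype(np.float32) * slope + intercept

raw = np.array([[0, 300, 1024, 2000]], dtype=np.int16)   # made-up raw values
hu = to_hounsfield(raw)                  # -> [[-1024., -724., 0., 976.]]

# Keep only voxels in a crude lung window; air is ~-1000 HU, water is 0 HU.
lung_mask = (hu > -1000) & (hu < -400)   # True only for the -724 HU voxel here
```

Thresholding the HU image this way discards bone and dense tissue before the denoising and detection stages.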
Ahmed et al.[10] proposed 3D approach for lung cancer detection. They wanted to take
the advantage of spatial information in the third dimension so that the detection of lung
cancer would be more accurate. To do that, they used 3D CNNs to extract features
from 3D image slices. 3D CNNs too consist of basic components like an input layer,
convolutional layers, pooling layers and fully connected layers in their architecture. First,
they performed some pre-processing on the data. They converted the pixel values into
Hounsfield units, and twenty of these 2D slices are stacked. On the obtained result,
they removed unwanted portions of the image by manual thresholding. The processed
data is then fed to the 3D CNN architecture, which consists of 2 convolutional layers (with filter
dimensions as 3*3*3) and 2 max pooling layers (with kernel of 2*2*2) and a dense layer
for predicting the presence of cancer. They obtained an accuracy of 80%, which is a fair result.
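The 3D architecture described here (two 3*3*3 conv layers, two 2*2*2 max-pooling layers, a dense output over a 20-slice stack) can be sketched in Keras. The in-plane resolution and filter counts are assumptions, since the report does not state them:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# 3D CNN sketch in the spirit of [10]: two Conv3D blocks with 3x3x3 filters
# and 2x2x2 max pooling over a stack of 20 slices (64x64 in-plane is assumed).
model = keras.Sequential([
    layers.Input(shape=(20, 64, 64, 1)),          # depth x height x width x channels
    layers.Conv3D(16, (3, 3, 3), activation="relu"),
    layers.MaxPooling3D((2, 2, 2)),
    layers.Conv3D(32, (3, 3, 3), activation="relu"),
    layers.MaxPooling3D((2, 2, 2)),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),        # probability that cancer is present
])
pred = model.predict(np.zeros((1, 20, 64, 64, 1)), verbose=0)
```

The key design point is that the 3*3*3 filters convolve across neighbouring slices, so inter-slice spatial context contributes to the learned features.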
CHAPTER 3
METHODOLOGY
3.1 INTRODUCTION
Lung cancer is one of the deadliest cancers that affects humans; it is caused by the
growth of malignant tumors in the lungs. The disease can be cured if it is detected in
its early stages, thereby reducing the deaths caused by it. So researchers are trying to
develop a system which detects lung cancer in its early stages, which can be very helpful. In
developing such a system, we need data which should accurately display the tumors in
the early stage of lung cancer. Early-stage lung cancer tumor cells may be very small,
so revealing such tumors can be challenging. CT (Computed Tomography) images are
said to reveal even such small cells, which is exactly what is required to develop a
system for early detection.
There are various steps that are involved in developing a system to detect and classify
lung cancer. These steps include pre-processing, lung segmentation, feature extraction,
and classification. Pre-processing mainly deals with the removal of noise in the data
and with making the data suitable for further processing. Segmentation is the process
of finding the region of the tumor and separating it from the rest of the image. Feature
extraction deals with extracting the required features from the segmented image, and
classification uses those features to label the image.
Different techniques are used in the different stages of the above pipeline to detect
lung cancer. In pre-processing, filters such as median filters etc. can be used to clean
the noise and improve the image features. In segmentation, image processing
techniques like thresholding or deep learning models like CNNs can be used.
Generally, for feature extraction, CNNs are used. And finally, for classification, we
can use a fully connected network or a classical ML classifier.
There are many systems that use ML algorithms for detection and classification
purposes. These may not be helpful, as hand-picked features are not very efficient in
detecting and classifying lung cancer. ML algorithms also tend to be less effective on
high-dimensional data such as CT images. The non-linear relationships among the data
are also not modelled clearly in ML, and such algorithms are not quite suitable for
unstructured data.
3.3 REQUIREMENTS
Software Requirements
1. Programming Language: Python
Hardware Requirements
1. RAM: 8 GB or higher
3. Cloud-based GPU
We proposed to use CNNs (Convolutional Neural Networks), which are a type of
neural network that considers the spatial information of the pixels in an image. CNNs
are used to extract features from the images, which can then be used to detect and
classify them. CNNs use filters as weights, which extract important features from the
image provided to them.
There are various CNN architectures that have been proposed by various researchers
and are widely used.
3.4.1 ALEXNET
We tried to use a modified AlexNet model which takes a 128*128*1 input image. It
consists of a convolutional layer with 96 filters of size 11*11 and stride 4, and a
max-pooling layer of size 3*3 with stride 2*2. The next set of layers consists of 256
filters of size 5*5 with the same max-pooling layer. The output is next given to layers
that consist of 2 convolutional layers with 384 filters of size 3*3. Finally, it passes
through a conv layer containing 256 filters with the same filter size as the previous
one, and a max-pooling layer of size 3*3 with stride 2*2. The output of this is flattened
and given to 2 dense layers of 4096 nodes with dropout regularization, and finally to
the output softmax layer.
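The modified AlexNet described above can be written out in Keras. The padding choices on the inner conv layers and the 4-class softmax output are assumptions made to obtain a buildable model, since the report does not state them:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Modified-AlexNet sketch following the description in the text;
# 'same' padding on conv2-5 and the 4-class output are assumed.
model = keras.Sequential([
    layers.Input(shape=(128, 128, 1)),
    layers.Conv2D(96, (11, 11), strides=4, activation="relu"),   # 96 filters, 11x11, stride 4
    layers.MaxPooling2D((3, 3), strides=2),                      # 3x3 pool, stride 2
    layers.Conv2D(256, (5, 5), padding="same", activation="relu"),
    layers.MaxPooling2D((3, 3), strides=2),
    layers.Conv2D(384, (3, 3), padding="same", activation="relu"),
    layers.Conv2D(384, (3, 3), padding="same", activation="relu"),
    layers.Conv2D(256, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((3, 3), strides=2),
    layers.Flatten(),
    layers.Dense(4096, activation="relu"),
    layers.Dropout(0.5),                                         # dropout regularization
    layers.Dense(4096, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(4, activation="softmax"),                       # assumed class count
])
probs = model.predict(np.zeros((1, 128, 128, 1)), verbose=0)
```

Tracing the shapes: 128 → 30 (conv, stride 4) → 14 (pool) → 6 (pool) → 2 (pool), so the flattened vector feeding the dense layers has 2*2*256 = 1024 values.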
Fig 3.5.1 Workflow diagram
First, the input images are sent for pre-processing, where the noise is removed and the
images are enhanced; unwanted noise removal is a must for accurate results, so every
system consists of this preprocessing step. These enhanced images are then sent to a
segmentation model to segment the tumor regions. The segmentation task can be done
by a model like a CNN or by an image processing technique like thresholding. The
segmented output is then sent for feature extraction and classification.
Prewitt operator is one of the basic operators used for extracting features from images.
The Prewitt operator only considers the contrast difference between the pixels. Its
horizontal (x) and vertical (y) kernels are:

Prewitt x:        Prewitt y:
-1  0  1          -1 -1 -1
-1  0  1           0  0  0
-1  0  1           1  1  1

Let us consider a 3*3 image patch, where we need to find whether the central pixel
contains an edge:

a  b  c
d  e  f
g  h  i

The distance between horizontally and vertically adjacent pixels is 1, and between
diagonal pixels it is √2. The directional derivative between the central pixel e and the
pixel to its right f is (f − e) î, where e and f represent the intensities of the pixels at
their positions. Likewise, the directional derivative between the central pixel and the
pixel to its left d is (e − d) î. For a diagonal neighbour such as c, the distance is √2, so
the directional derivative is

    (1/√2) [(c − e) cos 45° î + (c − e) sin 45° ĵ] = (1/2)(c − e)(î + ĵ)

Repeating the same process for all eight neighbours and adding the results gives

    [(f − d) + (1/2)(−a + c − g + i)] î + [(b − h) + (1/2)(a + c − g − i)] ĵ

which corresponds to the Sobel kernels:

Sobel x:          Sobel y:
-1  0  1           1  2  1
-2  0  2           0  0  0
-1  0  1          -1 -2 -1

In the derivation a coefficient of 1/8 is multiplied for each pixel (1/16 if additional
smoothing is included); this coefficient is dropped to get the required operator.

The Laplacian operator is the result of second-order differentiation. The finite
differences of the second derivatives are

    f''_x(x, y) = f(x + 2, y) − 2 f(x + 1, y) + f(x, y)
    f''_y(x, y) = f(x, y + 2) − 2 f(x, y + 1) + f(x, y)

which correspond to the kernels

X-derivative:     Y-derivative:
 0  0  0           0  1  0
 1 -2  1           0 -2  0
 0  0  0           0  1  0

Adding the two gives the standard Laplacian kernel:

 0  1  0
 1 -4  1
 0  1  0
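Applying these operators is a plain 2D correlation of the kernel with the image. A small from-scratch sketch (the step-edge test image is made up for illustration; in practice one would use `cv2.Sobel` or `cv2.Laplacian`):

```python
import numpy as np

def correlate2d(img, kernel):
    """Valid-mode 2D correlation, enough to apply a 3x3 operator."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

prewitt_x = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])
laplacian = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]])

# A vertical step edge: left half intensity 0, right half intensity 10.
img = np.hstack([np.zeros((5, 3)), np.full((5, 3), 10.0)])

gx = correlate2d(img, prewitt_x)    # each row -> [0, 30, 30, 0]: peaks at the edge
lap = correlate2d(img, laplacian)   # each row -> [0, 10, -10, 0]: sign change at edge
```

The first-order operator responds with a broad peak at the edge, while the Laplacian produces a zero-crossing, which is why it is often used for edge localization.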
CHAPTER 4
DESIGN
4.1 ALGORITHMS
The algorithm that is considered to be used is the convolutional neural network (CNN).
A CNN is a type of deep learning model which is used for the analysis of
grid-structured data, such as
images. It uses convolutional layers to extract patterns and features from the data. These
layers use filters which scan across input data, capturing spatial information in the data.
Additionally, pooling layers are used to reduce the dimensionality of representations and
select the best features out of the available ones. CNNs use the learned parameters to
classify or make predictions regarding complex data through the use of spatial
information for pattern recognition. CNNs work well in the domains of image and
signal processing due
to their capability to learn and extract the features which can be used for various
purposes. They can even be used as the basic components in tasks like object detection.
4.2 Design and Workflow Diagrams
A use-case diagram represents the functionality of a system within a given context.
These diagrams mainly use a group of actors, use cases, and their relationships, drawn
within the system's boundary, which encloses the various actors involved in the system
and the use cases they rely on. Actors are external entities who interact with the system
through the provided use cases. The use-case diagram serves as a visual tool for knowing, visualizing,
and documenting the behavioral aspects of a particular system or any component. These
diagrams are particularly useful for clearly explaining and clarifying the operational
overview of how different actors (which can be individuals, other systems, or even
hardware devices) interact with the system's functionalities to achieve specific goals
or outcomes.
Fig 4.2.1.1 Use case diagram
A sequence diagram shows the sequential order of the messages exchanged between
objects. Within a specific time period, it shows the objects that participate in the
interaction through their lifelines and the exact sequence of messages that they send.
Sequence diagrams are particularly used in scenarios that involve real-time
specifications and processes because of their ability to provide a clear and detailed
representation of the sequential flow of information. An activity diagram, in turn,
serves as a visual representation of the dynamic behavior of the system. Activity diagrams
show the flow of messages from one action to another action. Actions within these
diagrams tell us about those specific system functions. Activity diagrams are not only
helpful in constructing executable systems through various engineering methods but they
can also be used in showing the dynamic characteristics of a system. They represent the
Fig 4.2.3.1 Activity diagram
4.2.4 DESIGN ARCHITECTURE
[Design architecture diagram: Segmentation → Segmented Images → Classification Model Building → Model Training → Model Evaluation → Classification]
First, we start with data collection, where we gather the required data (CT images). Then we preprocess the data to remove noise from the dataset. A segmentation model is built for the segmentation task; the preprocessed data is sent to this model to segment the images into tumor and background portions. A classification model is built for classifying the cancer type, and the segmented images are sent to it. Finally, the trained model is evaluated.
CHAPTER 5
IMPLEMENTATION
LIBRARIES
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import pydicom
from pydicom.pixel_data_handlers.util import apply_modality_lut
from pathlib import Path
import os
from tqdm import tqdm
import cv2 as cv
import skimage
from skimage.segmentation import clear_border
from skimage.measure import label, regionprops, perimeter
from skimage.morphology import ball, disk, dilation, binary_erosion, erosion, binary_closing
from skimage import exposure, filters, util
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, Reshape, Input
from tensorflow.keras.metrics import Accuracy, Recall, Precision, F1Score, AUC
We have used different libraries for building the model. NumPy is a popular library for working with arrays and mathematical operations. Pandas is used to work with dataframes and tabular data such as Excel sheets. Matplotlib is a well-known plotting library, and Seaborn is another plotting library that extends the features of Matplotlib. Pydicom is a library specifically used to work with medical images in DICOM format. Pathlib and os are used for path and operating-system-related operations. OpenCV (cv) and scikit-image (skimage) are used to perform image-related tasks easily, and TensorFlow is used to build and train models. Tqdm displays a progress bar for iterations that might take a long time to run.
32
SEGMENTATION
First we set some constants, and then we segment each image to obtain the portion of the lungs required for the later stages.
DATA_PATH_1 = Path("../Data/manifest-1703080534832/")
DATA_PATH_2 = Path("../Data/manifest-1703166770855/")
OUTPUT_PATH = Path("../segmented/")
ANNOTATION_PATH = Path("../Annotation/")
We have downloaded the data into two different folders, so DATA_PATH_1 and DATA_PATH_2 hold those two paths. OUTPUT_PATH is the path where the segmented images are stored, and ANNOTATION_PATH is the path that contains the annotations of the data.
df = pd.read_csv(DATA_PATH_1.joinpath('metadata.csv'))
Each data folder contains a metadata CSV file that we use to drive the segmentation easily. We can switch the data path after the first folder is done.
Next, we define two functions. The first function, transform_to_hu(), converts the image to Hounsfield units based on the values in the image's DICOM file. The second function, windowing_img(), performs windowing, which is the process of converting the image so that abnormalities become clearly visible.
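The listings for these two functions are not reproduced above; a minimal sketch of what they likely look like, assuming the standard DICOM rescale attributes and clipping-based windowing (the actual implementations may differ):

```python
import numpy as np

def transform_to_hu(dcm):
    # Convert raw pixel values to Hounsfield units using the
    # RescaleSlope/RescaleIntercept stored in the DICOM header.
    img = dcm.pixel_array.astype(np.float64)
    return img * float(dcm.RescaleSlope) + float(dcm.RescaleIntercept)

def windowing_img(dcm, window_min, window_max):
    # Clip the HU image to a window so the structures of interest
    # (lung tissue and abnormalities) stand out clearly.
    hu = transform_to_hu(dcm)
    return np.clip(hu, window_min, window_max)
```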
def border_removal(img):
    rescaled_img = exposure.rescale_intensity(img, in_range='image', out_range=(0, 255)).astype('i')
    thresh = rescaled_img < 100
    cleared = clear_border(thresh)
    return rescaled_img, cleared
The above function border_removal() removes the borders that remain after windowing. Since we only need the lungs in the image, we rescale the image, threshold it, and then clear the regions touching the image border.
def extract_lung(img):
    labeled_img = label(img)
    areas = [r.area for r in regionprops(labeled_img)]
    areas.sort()
    if len(areas) > 2:
        for region in regionprops(labeled_img):
            if region.area < areas[-2]:
                for coordinates in region.coords:
                    labeled_img[coordinates[0], coordinates[1]] = 0
    binary = labeled_img > 0
    return binary
Then we extract the lungs from the borderless images using extract_lung(). We first label the image, keep the two labeled regions with the largest areas (the two lungs), and convert the result into a binary image.
Since we now have a binary mask of the lungs, we reconstruct the image with the reconstruct_img() function.
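The reconstruct_img() listing itself is missing from the extracted text; a plausible sketch, assuming it simply masks the rescaled image with the binary lung mask (the actual implementation may also apply morphological closing):

```python
import numpy as np

def reconstruct_img(rescaled_img, lung_mask):
    # Keep the original intensities only inside the binary lung mask;
    # everything outside the lungs is zeroed out.
    return rescaled_img * lung_mask.astype(rescaled_img.dtype)
```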
def dilate_res(img):
    res_er = dilation(img, disk(0.9))
    return res_er
Then we apply a morphological operation called dilation to the resulting image so that small parts of the image become clearer.
def overall_extraction(img):
    rescaled_img, rmborder_img = border_removal(img)
    lung_img = extract_lung(rmborder_img)
    morph_img = reconstruct_img(rescaled_img, lung_img)
    temp = dilate_res(morph_img)
    return temp
In this function we apply all the functions created previously, one after the other. Instead of calling each function separately, we call them all through a single function, which is more readable and understandable.
folder_name = ''
for r in tqdm(range(0, df.shape[0])):
    p = df.loc[r, "File Location"][2:]
    t = p.split("\\")[1]
    folder_name = t
    if t.find("A") != -1:
        typ = 'adenocarcinoma'
    elif t.find("B") != -1:
        typ = 'smallcell'
    elif t.find("E") != -1:
        typ = 'largecell'
    else:
        typ = 'squamouscell'
    dicom_path = Path(os.path.join(DATA_PATH_2, p))
    c = 0
    for dimg in dicom_path.iterdir():
        img = pydicom.dcmread(dimg)
        slices = img[0x28, 0x02].value
        wd_img = windowing_img(img, -220, 1300)
        if slices > 1:
            wd_img = wd_img[:, :, -1]
        result = overall_extraction(wd_img)
        instance_id = img[0x08, 0x18].value
        instance_path = ANNOTATION_PATH.joinpath(f"{t[8:]}/{instance_id}.xml")
        if os.path.exists(instance_path):
            np.save(OUTPUT_PATH.joinpath(f"{typ}/{instance_id}.npy"), result)
        else:
            np.save(OUTPUT_PATH.joinpath(f"safe/{instance_id}.npy"), result)
        c += 1
In the above code snippet, we segment all the images and store them as NumPy arrays at the required path. We iterate over all the rows of the metadata file, which lists all the downloaded CT slices. For each row, we get the location where the slice was downloaded and determine its cancer type, and based on that we place it into one of the available folders. Since a single CT scan contains more than one slice, we iterate over all available CT slices and apply the windowing and extraction functions defined previously. Then, based on the presence of an annotation file, we move each result into the required folder.
DATA AUGMENTATION
Data augmentation is the process of generating new images from the images that are already present, which helps increase the number of images and balance the overall dataset.
TYPE = "largecell"
DATA = Path(f"../segmented/{TYPE}/")
AUGMENT_DATA = Path(f"../segmented/augment_{TYPE[0:2]}/")
COUNT = 200
These are some of the constants that we are setting before starting with data
augmentation.
def augment_data(img):
    data_augmentation = tf.keras.Sequential([
        tf.keras.layers.RandomContrast(0.2),
        tf.keras.layers.RandomZoom(0.1, fill_mode='constant', fill_value=255),
        tf.keras.layers.RandomTranslation(height_factor=0.2, width_factor=0.2,
                                          fill_mode='constant', fill_value=255),
        tf.keras.layers.RandomBrightness(0.2)
    ])
    return data_augmentation(img)
This function returns an augmented image when a normal image is given. We randomly alter the contrast, zoom, and brightness, and also randomly translate the image as part of the augmentation.
In this step, we generate and save the augmented images at a specific location. For each image, we generate several new images based on the COUNT constant set previously. Data augmentation improves the balance of the dataset.
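The generation-and-save snippet itself is not reproduced in the extracted text; a sketch of such a loop, assuming the constants defined earlier and the `augment_data()` function (the function name `save_augmented` is introduced here for illustration):

```python
import numpy as np
from pathlib import Path

def save_augmented(data_dir, out_dir, count, augment_fn):
    # For every segmented .npy image, generate augmented copies until
    # roughly `count` new images exist in out_dir.
    out_dir.mkdir(parents=True, exist_ok=True)
    files = sorted(data_dir.glob("*.npy"))
    per_image = max(1, count // max(1, len(files)))
    saved = 0
    for f in files:
        img = np.load(f)[..., np.newaxis]          # add a channel axis
        for i in range(per_image):
            aug = augment_fn(img)                   # e.g. augment_data
            np.save(out_dir / f"{f.stem}_aug{i}.npy",
                    np.asarray(aug)[..., 0])        # drop channel axis
            saved += 1
    return saved
```

This would be invoked as `save_augmented(DATA, AUGMENT_DATA, COUNT, augment_data)`.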
ALEXNET
SEGMENTED_DATA = Path('../segmented/')
MODEL_PATH = Path("../models/")
TRAIN_SPLIT = 0.8
VAL_SPLIT = 0.1
CT_SLICES = 20293
These are the constants that are set to train the model.
def datagen():
    for dirpath, subdirs, files in os.walk(SEGMENTED_DATA):
        if len(files) > 0:
            for i in range(len(files)):
                data = np.load(os.path.join(dirpath, files[i]))
                depth = 5
                cancer_type = dirpath.split('\\')[2]
                if cancer_type == 'adenocarcinoma':
                    label = tf.one_hot(1, depth)
                elif cancer_type == 'largecell':
                    label = tf.one_hot(2, depth)
                elif cancer_type == "smallcell":
                    label = tf.one_hot(3, depth)
                elif cancer_type == "squamouscell":
                    label = tf.one_hot(4, depth)
                else:
                    label = tf.one_hot(0, depth)
                if data.ndim == 2:
                    data = data[:, :, np.newaxis]
                yield data, label
The function datagen() above is a data generator that yields data from the mentioned directory as (data, label) pairs. This generator is given as input to the model for training.
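The resizing helper mentioned next is missing from the extracted listing; a minimal sketch, assuming it is implemented with `tf.image.resize`:

```python
import tensorflow as tf

def resize_images(img, label):
    # Downscale each 512x512 slice to 128x128; the label passes
    # through unchanged so the function works with Dataset.map.
    return tf.image.resize(img, (128, 128)), label
```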
The resize function is used to resize the images from 512×512 to 128×128 so that training is more manageable.
dataset = tf.data.Dataset.from_generator(
    datagen,
    output_signature=(tf.TensorSpec(shape=(512, 512, 1), dtype=tf.int32),
                      tf.TensorSpec(shape=(5,), dtype=tf.int32)))
dataset = dataset.map(resize_images).map(lambda x, y: (tf.cast(x, tf.int32), tf.cast(y, tf.float32)))
In this code, we apply the resize function after creating the dataset from the data generator written previously. This dataset is given to the model to train on.
dataset = dataset.shuffle(buffer_size=1000)
train_size = int(TRAIN_SPLIT * CT_SLICES)
val_size = int(VAL_SPLIT * CT_SLICES)
train_ds = dataset.take(train_size)
val_ds = dataset.skip(train_size).take(val_size)
test_ds = dataset.skip(train_size).skip(val_size)
Here we shuffle the dataset and split it into training, validation, and test sets according to the split sizes defined in the constants.
x = Flatten()(x)
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
model = alexnet()
The above is the dense (classifier) portion of the AlexNet model code used for building and training the model.
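The listing above is truncated in the extracted text (the convolutional stack and output layer are missing); a sketch of a complete AlexNet-style builder consistent with the fragment, assuming 128×128×1 inputs and the five one-hot classes produced by datagen():

```python
import tensorflow as tf
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, Flatten,
                                     Dense, Dropout, Input)
from tensorflow.keras.models import Model

def alexnet():
    inp = Input(shape=(128, 128, 1))
    # Convolutional feature extractor (AlexNet-style filter counts).
    x = Conv2D(96, 11, strides=4, activation='relu', padding='same')(inp)
    x = MaxPooling2D(3, strides=2)(x)
    x = Conv2D(256, 5, activation='relu', padding='same')(x)
    x = MaxPooling2D(3, strides=2)(x)
    x = Conv2D(384, 3, activation='relu', padding='same')(x)
    x = Conv2D(384, 3, activation='relu', padding='same')(x)
    x = Conv2D(256, 3, activation='relu', padding='same')(x)
    x = MaxPooling2D(3, strides=2)(x)
    # Classifier head, matching the fragment above.
    x = Flatten()(x)
    x = Dense(4096, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(4096, activation='relu')(x)
    x = Dropout(0.5)(x)
    out = Dense(5, activation='softmax')(x)
    model = Model(inp, out)
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

model = alexnet()
```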
CHAPTER 6
RESULTS
We were able to obtain around 95% accuracy, with precision and recall of 0.96, which gives an F1 score (the harmonic mean of precision and recall) of 0.96.
CHAPTER 7
CONCLUSION
The application of deep learning in lung cancer diagnosis can change the way we detect, classify, and manage this deadly disease. Deep learning models can be improved in many ways, such as better accuracy, earlier detection, and personalized treatment for lung cancer patients, which will be very useful in catching the disease early. These models can also be integrated with various medical data sources, such as medical images and electronic health records, which can give them a more detailed picture of each patient.
The future of lung cancer diagnosis using deep learning contains numerous opportunities. With the help of AI, healthcare professionals can make more informed decisions. We also need to take care of various things, such as addressing ethical, privacy, and regulatory considerations, to make sure that deep learning in healthcare is safe. Collaboration between people is essential to improve the field as a whole, which finally benefits the patients.
CHAPTER 8
FUTURE SCOPE
Drug Discovery and Treatment Optimization: AI can help identify potential avenues for lung cancer treatment and optimize treatment plans based on patient-specific factors.
Personalized Medicine: Deep learning can provide appropriate treatment plans based on the specific characteristics of a patient's lung cancer by extracting various features of tumors.
Remote Diagnosis: Deep learning models can be used for remote medical applications where specialist expertise is not available on site.
Risk Assessment and Prediction: Deep learning can be used to develop models that assess the risk of developing lung cancer based on various factors such as smoking history.
Integration with Electronic Health Records (EHRs): EHRs can be collected and used to provide data for training and enhancing deep learning models; integration with such records can make the models more useful in clinical practice.