Lung Cancer Detection Using Transfer Learning

Lung cancer is one of the deadliest cancers worldwide. However, the early detection of lung cancer significantly improves survival rate. Cancerous (malignant) and noncancerous (benign) pulmonary nodules are the small growths of cells inside the lung. Detection of malignant lung nodules at an early stage is necessary for the crucial prognosis.

Uploaded by

jagan

Lung Cancer Prediction using CNN and Transfer Learning
Table of Contents
1. Introduction
2. Visualization of Dataset
3. Proposed Models
4. Convolutional Neural Networks
5. Transfer Learning: VGG16-Net
6. Future Work
7. References
INTRODUCTION
Lung cancer is one of the deadliest cancers worldwide, but early detection significantly improves the survival rate.
Pulmonary nodules are small growths of cells inside the lung and may be cancerous (malignant) or noncancerous (benign).
Detecting malignant lung nodules at an early stage is therefore crucial for a good prognosis.
Early-stage cancerous lung nodules closely resemble noncancerous nodules, so a differential diagnosis must rely on slight
morphological changes, nodule location, and clinical biomarkers. The challenging task is to estimate the probability of
malignancy for these early nodules. Physicians combine several diagnostic procedures for the early diagnosis of malignant
lung nodules, such as clinical evaluation, computed tomography (CT) scan analysis (morphological assessment), positron
emission tomography (PET) (metabolic assessment), and needle biopsy analysis.

The input to the model consists of lung nodule CT images, which are used at every step of the project. The images come
from the LUNA16 dataset.

The LUNA16 dataset is a subset of the LIDC-IDRI dataset in which the heterogeneous scans are filtered by several criteria.
Since pulmonary nodules can be very small, thin slices are needed. Therefore, scans with a slice thickness greater than
2.5 mm were discarded.
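The slice-thickness criterion can be sketched in a few lines. This is only an illustration of the rule stated above, not the LUNA16 organizers' code: the `Scan` record, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass

MAX_SLICE_THICKNESS_MM = 2.5  # threshold stated in the text


@dataclass
class Scan:
    """Hypothetical record for one CT scan; field names are illustrative."""
    series_uid: str
    slice_thickness_mm: float


def keep_thin_slice_scans(scans, max_thickness_mm=MAX_SLICE_THICKNESS_MM):
    """Return only the scans whose slice thickness does not exceed the limit."""
    return [s for s in scans if s.slice_thickness_mm <= max_thickness_mm]


# Made-up example scans: the 3.0 mm scan is discarded, the others are kept.
scans = [Scan("scan-a", 1.25), Scan("scan-b", 2.5), Scan("scan-c", 3.0)]
kept = keep_thin_slice_scans(scans)
print([s.series_uid for s in kept])
```

A scan at exactly 2.5 mm is kept here because the text discards only scans thicker than 2.5 mm.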
VISUALIZATION OF DATASET
Visualizing the dataset is an important step before training because it gives a better understanding of the data. However,
CT scan images are stored as DICOM files, which ordinary image viewers and web browsers cannot open. We therefore use the
pydicom library, which returns the image as an array together with the metadata stored in each CT slice, such as the
patient's name, patient's ID, patient's birth date, image position, image number, and physician's name.
(Fig. 3. Small sample of the metadata contained in a single DICOM slice)
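A minimal sketch of reading this metadata is given below. It assumes the third-party pydicom package is installed and that the slice carries the standard DICOM keywords listed in `METADATA_KEYWORDS`; the file path in the comment and the stand-in values are hypothetical, and the exact fields present depend on how the scans were exported and anonymized.

```python
from types import SimpleNamespace

# Standard DICOM keywords matching the metadata fields named in the text.
METADATA_KEYWORDS = [
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "ImagePositionPatient",
    "InstanceNumber",
    "ReferringPhysicianName",
]


def load_slice(path):
    """Read one DICOM file; requires the third-party pydicom package."""
    import pydicom  # imported lazily so summarize_slice works without it
    return pydicom.dcmread(path)


def summarize_slice(ds):
    """Collect the listed metadata fields; missing ones become None."""
    return {key: getattr(ds, key, None) for key in METADATA_KEYWORDS}


# Stand-in object with made-up values, used here instead of a real file.
fake_slice = SimpleNamespace(PatientID="anon-001", InstanceNumber=42)
for key, value in summarize_slice(fake_slice).items():
    print(f"{key}: {value}")

# With a real file (hypothetical path):
#   ds = load_slice("slice_0001.dcm")
#   image = ds.pixel_array  # NumPy array holding the slice's pixel values
```

Looking fields up with a default of `None` keeps the summary robust when a tag has been stripped during anonymization.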
PROPOSED MODELS
The proposed model is a convolutional neural network approach based on lung segmentation of CT scan images. First, we
preprocess the LUNA16 dataset. We then tried three different convolutional neural network models, selected from a
comparative study of how each type of model performs on different datasets and different classification problems.

CONVOLUTIONAL NEURAL NETWORKS
A convolutional neural network, or CNN, is a deep learning neural network designed for processing structured arrays of data
such as images. CNNs are widely used in computer vision and have become the state of the art for many visual applications
such as image classification; they have also found success in natural language processing for text classification. CNNs are
very good at picking up patterns in the input image, such as lines, gradients, circles, or even eyes and faces, and it is this
property that makes them so powerful for computer vision. Unlike earlier computer vision algorithms, CNNs can operate
directly on raw pixel values and do not need hand-crafted feature extraction, although simple steps such as resizing and
intensity normalization are still commonly applied. A CNN is a feed-forward neural network, often with up to 20 or 30
layers. Its power comes from a special kind of layer called the convolutional layer. A CNN contains many convolutional
layers stacked on top of each other, each one capable of recognizing more sophisticated shapes. With three or four
convolutional layers it is possible to recognize handwritten digits, and with 25 layers it is possible to distinguish human
faces.
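To make the idea of a convolutional layer picking up a pattern concrete, here is a small pure-Python sketch of a single 3×3 filter sliding over a toy image. The image and the hand-picked edge kernel are illustrative assumptions; in a trained CNN the kernel values are learned from data rather than chosen by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists) with stride 1, no padding.

    Like deep learning libraries, this computes cross-correlation:
    the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy 4x5 image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
# Hand-picked 3x3 kernel that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(image, vertical_edge)
for row in feature_map:
    print(row)
```

The feature map is `[[3, 3, 0], [3, 3, 0]]`: the filter responds strongly where its window straddles the dark-to-bright edge and gives zero over the uniformly bright region. Stacking such layers lets later filters combine these simple responses into more complex shapes.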
TRANSFER LEARNING : VGG16-NET
VGG Net is a convolutional neural network (CNN) architecture introduced in 2014 by Simonyan and Zisserman of the Visual
Geometry Group (VGG) at the University of Oxford. It was the first runner-up in the classification task of ILSVRC (ImageNet
Large Scale Visual Recognition Competition) 2014. VGG Net was trained on the ImageNet ILSVRC dataset, which contains
images of 1000 classes split into 1.3 million training images, 100,000 test images, and 50,000 validation images. The model
obtained 92.7% top-5 test accuracy on ImageNet. VGG Net has been successful in many real-world applications, such as
estimating heart rate from body motion and detecting pavement distress.
VGG Net has learned to extract features that distinguish objects (it acts as a feature extractor), and these features can be
used to classify unseen objects. VGG was designed to improve classification accuracy by increasing the depth of the CNN.
VGG 16 and VGG 19, with 16 and 19 weight layers respectively, have been used for object recognition. VGG Net takes
224×224 RGB images as input and passes them through a stack of convolutional layers with a fixed filter size of 3×3 and a
stride of 1. Five max-pooling layers are placed between the convolutional layers to down-sample the representation
(image, hidden-layer output matrix, etc.). The stack of convolutional layers is followed by three fully connected layers
with 4096, 4096, and 1000 channels, respectively. The last layer is a soft-max layer. The figure below shows the VGG
network structure.
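The layer arithmetic described above can be checked with a short sketch that traces the feature-map size and counts the parameters of VGG16. The per-layer channel widths, the padding of 1, and the 2×2 pooling with stride 2 are taken from the published VGG16 configuration rather than from the text above, so treat them as assumptions of this example.

```python
# Published VGG16 layout: numbers are 3x3 conv output channels, "M" is a
# 2x2 max-pooling layer with stride 2. These widths come from the VGG
# paper's 16-layer configuration, not from the text above.
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]
FC_SIZES = [4096, 4096, 1000]


def trace_vgg16(input_size=224, input_channels=3):
    """Return (final spatial size, final channels, total parameters)."""
    size, channels, params = input_size, input_channels, 0
    for item in VGG16_CONFIG:
        if item == "M":
            size //= 2  # 2x2 pooling with stride 2 halves height and width
        else:
            # 3x3 conv, stride 1, padding 1: spatial size is unchanged.
            params += 3 * 3 * channels * item + item  # weights + biases
            channels = item
    features = size * size * channels  # flattened input to the first FC layer
    for width in FC_SIZES:
        params += features * width + width
        features = width
    return size, channels, params


size, channels, params = trace_vgg16()
print(f"feature map before the FC layers: {size}x{size}x{channels}")
print(f"total parameters: {params:,}")
```

For a 224×224 RGB input the five pooling layers reduce the feature map to 7×7×512, and the total comes to 138,357,544 parameters, most of them in the first fully connected layer.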
In our approach, however, the images have a shape of (512, 512), so we build our own model based on the VGG16
architecture. We compile the model with the Adam optimizer at a learning rate of 0.0001, binary_crossentropy as the loss,
and accuracy as the metric. The figure below shows the model summary, including the convolution layers, max-pooling
layers, and parameter counts.
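A VGG16-style model with these compile settings could be sketched in Keras as follows. This is not the authors' exact network: the single input channel, the global-average-pooling head, and the 256-unit dense layer are assumptions made for illustration, while the block layout follows the published VGG16 design and the optimizer, learning rate, loss, and metric follow the text. TensorFlow must be installed to call `build_model`.

```python
# Convolutional blocks of VGG16: (number of 3x3 conv layers, filters).
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
INPUT_SHAPE = (512, 512, 1)  # assumption: single-channel CT slices
LEARNING_RATE = 1e-4         # 0.0001, as stated in the text


def build_model():
    """Build and compile a VGG16-style binary classifier (needs TensorFlow)."""
    import tensorflow as tf  # imported lazily; third-party dependency
    layers = tf.keras.layers

    model = tf.keras.Sequential([tf.keras.Input(shape=INPUT_SHAPE)])
    for n_convs, filters in VGG16_BLOCKS:
        for _ in range(n_convs):
            model.add(layers.Conv2D(filters, 3, padding="same",
                                    activation="relu"))
        model.add(layers.MaxPooling2D(pool_size=2, strides=2))
    # Assumed classification head; the source does not describe it.
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dense(256, activation="relu"))
    # One sigmoid unit: assumed to output the probability of the cancerous class.
    model.add(layers.Dense(1, activation="sigmoid"))

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Usage (requires TensorFlow):
#   model = build_model()
#   model.summary()
```

With a 512×512 input, the five pooling stages leave a 16×16×512 feature map. Flattening that into a 4096-unit dense layer as in the original VGG16 would add hundreds of millions of weights, which is why this sketch uses global average pooling instead.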
FUTURE WORK

To increase the accuracy of the model, we plan to implement more efficient data-preprocessing techniques both before and
after the image segmentation step. These will focus mainly on dividing the data efficiently into cancerous and
non-cancerous classes and on making the dataset compatible with a Python computer vision library; where that is not
possible, the algorithms will be implemented on the dataset with self-defined functions.

We also plan to propose a new data processing, training, and classification pipeline that will help the models make more
accurate predictions.

Current suggestions include using other transfer learning models pre-trained on ImageNet that are available in Keras, in
addition to the one described above, implementing feature extraction algorithms such as BRISK and SIFT from a computer
vision library, and integrating ML training methods.
REFERENCES
1. Bjerager M., Palshof T., Dahl R., Vedsted P., Olesen F. Delay in diagnosis of lung cancer in general practice. Br. J. Gen. Pract.
2006;56:863–868.

2. Nair M., Sandhu S.S., Sharma A.K. Cancer molecular markers: A guide to cancer detection and management. Semin. Cancer Biol.
2018;52:39–55. doi: 10.1016/j.semcancer.2018.02.002.

3. Silvestri G.A., Tanner N.T., Kearney P., Vachani A., Massion P.P., Porter A., Springmeyer S.C., Fang K.C., Midthun D., Mazzone P.J.
Assessment of plasma proteomics biomarker’s ability to distinguish benign from malignant lung nodules: Results of the PANOPTIC
(Pulmonary Nodule Plasma Proteomic Classifier) trial. Chest. 2018;154:491–500. doi: 10.1016/j.chest.2018.02.012.

4. Shi Z., Zhao J., Han X., Pei B., Ji G., Qiang Y. A new method of detecting pulmonary nodules with PET/CT based on an improved
watershed algorithm. PLoS ONE. 2015;10:e0123694.

5. Lee K.S., Mayo J.R., Mehta A.C., Powell C.A., Rubin G.D., Prokop C.M.S., Travis W.D. Incidental Pulmonary Nodules Detected on
CT Images: Fleischner 2017. Radiology. 2017;284:228–243.
ABOUT TechieYan Technologies

TechieYan Technologies offers a special platform where you can study all the most cutting-edge technologies directly from
industry professionals and get certifications. TechieYan collaborates closely with engineering schools, engineering students,
academic institutions, the Indian Army, and businesses.

Address: 16-11-16/V/24, Sri Ram Sadan, Moosarambagh, Hyderabad 500036

Phone : +91 7075575787

Website: https://techieyantechnologies.com/
THANK YOU
