Lung Cancer

International Journal of Computer Trends and Technology (IJCTT) – Volume 67 Issue 11 - November 2019
Lung Cancer Detection Model Using Convolution Neural Network and Fuzzy Clustering Algorithms
P. Prakashbabu1, D. Ashok Kumar2, T. Vithyaa3
Department of Computer Science, Government Arts College, Kulithalai, Karur, Tamilnadu - 639 120, India.
Abstract: This paper presents a lung cancer detection system built with image processing techniques. The system accepts any one of three types of medical image as input: CT, MRI or ultrasound. The proposed model is developed using Fuzzy C-Means (FCM) and a Convolution Neural Network (CNN), which is used for feature selection. The work extends image-processing-based lung cancer detection and reports the results of feature extraction and feature selection after segmentation. After preprocessing, a Wiener filter is used to remove noise and unwanted regions. The present work proposes a method to detect cancerous cells effectively from CT, MRI and ultrasound images. Pixel segmentation is performed with FCM, and filtering is used to de-noise the medical images. Simulation results for the cancer detection system are obtained using MATLAB, and normal and abnormal lung images are compared.

Keywords: Lung cancer, lung segmentation, Fuzzy C-Means, CNN, feature extraction.

I. INTRODUCTION
Cancer is one of the major causes of non-accidental death, and lung cancer is the leading cause of cancer death in men and women worldwide. The death rate can be reduced through early diagnosis, so that clinicians can administer suitable treatment in time. Cancer arises when a group of cells grows irregularly and uncontrollably and forms malignant tumors that invade surrounding tissues. Lung cancer is classified as non-small cell lung cancer (NSCLC) or small cell lung cancer (SCLC). In this paper we confine ourselves to NSCLC, as it is more prevalent than SCLC. The diagnosis and treatment of the two types differ. Among the various ways to detect lung cancer, this work uses image processing, fuzzy c-means and a convolutional neural network to develop computer-aided diagnosis. CT, MRI and ultrasound images are used. A CT scan, or Computerized Axial Tomography (CAT) scan, is the most sensitive and specific detection modality; it produces cross-sectional images of specific areas of the scanned object by computer-processed combination of many X-ray images taken from different angles [1]. Radio waves and a magnetic field are used to form images of the body in the imaging technique known as Nuclear Magnetic Resonance Imaging (NMRI). The aim of this paper is to design a system that takes any one of the three image types as input and produces the desired output. The evaluation metrics used are sensitivity, specificity and accuracy. The proposed model consists of the following steps: collection of a lung image data set, preprocessing, Wiener filtering and FCM segmentation of CT and MRI images. Each step is described in the following sections.

We apply extensive preprocessing to obtain accurate nodules and thereby improve the accuracy of lung cancer detection. Moreover, we train the CNN end-to-end from scratch in order to realize the full potential of the neural network, i.e. to learn discriminative features. Extensive experimental evaluations are performed on a dataset comprising lung nodules from more than 1390 low-dose CT scans [2].

Figure 1: CT scan slice containing a small early stage lung cancer nodule.
ISSN: 2231-2803 http://www.ijcttjournal.org Page 18

II. REVIEW OF THE LITERATURE

Recently, deep artificial neural networks have been applied in many pattern recognition and machine learning applications, especially convolutional neural networks (CNNs), which are one class of such models [3]. An ensemble of CNNs applied to ImageNet classification in 2012 outperformed the best results then known in the computer vision community [4]. Deep learning has also been applied in recent medical imaging research with promising results.

H. Suk et al. [5] suggested a new latent and shared feature representation of brain neuro-imaging data using a Deep Boltzmann Machine (DBM) for AD/MCI diagnosis. G. Wu et al. [6] developed deep feature learning for deformable registration of brain MR images, using deep features to improve image registration. Y. Xu et al. [7] demonstrated the effectiveness of deep neural networks (DNNs) for feature extraction in medical image analysis as a supervised approach. Kumar et al. [8] proposed a CAD system that uses deep features extracted from an autoencoder to classify lung nodules as malignant or benign on the LIDC database. Yaniv et al. [9] presented a system for chest pathology detection in X-rays that uses convolutional neural networks learned from a non-medical archive. That work showed that a combination of deep learning (Decaf) and PiCodes features achieves the best performance, demonstrating the feasibility of detecting pathology in chest X-rays using deep learning based on non-medical training. The database used comprised 93 images. They obtained an area under the curve (AUC) of 0.93 for right pleural effusion detection, 0.89 for enlarged heart detection and 0.79 for classification between healthy and abnormal chest X-rays.

In [10], W. Sun et al. implemented three deep learning algorithms, a Convolutional Neural Network (CNN), Deep Belief Networks (DBNs) and a Stacked Denoising Autoencoder (SDAE), and compared them with a traditional image-feature-based CAD system. The CNN architecture contains eight layers, alternating convolutional and pooling layers. For the traditional algorithm used for comparison, about 35 texture and morphological features were extracted and fed to a kernel-based support vector machine (SVM) for training and classification. The CNN approach reached an accuracy of 0.7976, slightly higher than the 0.7940 of the traditional SVM. They used the Lung Image Database Consortium and Image Database Resource Initiative (LIDC/IDRI) public database, with about 1018 lung cases.

In [11], J. Tan et al. designed a framework that detects lung nodules and then reduces the false positives among the detected nodules using a deep neural network and a convolutional neural network. The CNN has four convolutional layers and four pooling layers. The filters were of depth 32 and size 3, 5. The dataset was acquired from LIDC-IDRI for about 85 patients. The resulting sensitivity was 0.82. The false positive reduction obtained by the DNN was 0.329.

In [12], R. Golan proposed a framework that trains the weights of the CNN by back-propagation to detect lung nodules in CT image sub-volumes. This system achieved a sensitivity of 78.9% with 20 false positives per scan, and 71.2% with 10 false positives per scan, on lung nodules annotated by all four radiologists. Convolutional neural networks have outperformed Deep Belief Networks in current studies on benchmark computer vision datasets. CNNs have attracted considerable interest in machine learning in recent years because of their strong ability to learn useful features from input data.

The fuzzy k-c-means clustering algorithm for medical image segmentation was introduced by Ajala et al. in 2012 [13]. Fuzzy c-means is a clustering method that allows one piece of data to belong to two or more clusters, while k-means is a simple clustering method with lower computational complexity than fuzzy c-means. The two methods were combined to produce a more time-efficient segmentation algorithm called fuzzy k-c-means. The authors also describe thresholding, the most elementary technique for medical image segmentation, which divides pixels into classes depending on their gray level: it segments a scalar image by forming a binary partition of its intensity values around a chosen intensity value, termed the threshold, which separates the desired classes. Classifier techniques, which are used for pattern recognition, partition a feature space derived from the image using data with known labels. A feature space is an N × M matrix, where N is the number of observations and M is the number of attributes. Classifiers are known as supervised methods, since they require manually segmented training data, which are then used to segment new data automatically.

In [14] (Fatma, 2012), two more segmentation methods were used: the Hopfield Neural Network (HNN) and the Fuzzy C-Means (FCM) clustering algorithm. The authors found that the HNN gives better, more accurate and more reliable segmentation results than FCM clustering in all cases. The HNN also separates the nuclei and cytoplasm regions, whereas FCM failed to detect the nuclei: it detected only part of the nucleus rather than the whole nucleus in a given cell. FCM was also found to be less sensitive to intensity variations, as its segmentation error at convergence was larger than that of the HNN. According to the latest World Health Organization estimates, around 7.6 million deaths occur worldwide each year because of this type of cancer. Moreover, mortality from cancer is estimated to keep rising, approaching 17 million deaths worldwide in 2020. Better methods are therefore required to extract the nucleus region for very early detection. A magazine article (IEEE Pulse) provided an overview of current trends in medical image analysis.

In [15] (Mokhled, 2012), images were first enhanced with a Gabor filter, which gave better results than other enhancement techniques. The authors worked only on color image enhancement and did not extract the nucleus region or even the cell region. In the feature extraction stage they obtained the general features of the enhanced and segmented image, which they later used in binarization. A refined Charged Fluid Model (CFM) together with an improved Otsu's method was used for the automatic segmentation of MRI images.

In [16] (Sajith, 2012), glandular cells were detected using multiple color spaces and two clustering algorithms, K-means and Fuzzy C-means.

A novel lung segmentation technique was proposed by Lin-Yu Tseng et al. to improve segmentation accuracy and to separate and remove the trachea from the lungs [17]. Anita Chaudhary et al. used digital image processing techniques to achieve higher quality and accuracy [18]. Azian Azamini Abdullah et al. described the development of an algorithm that detects symptoms of lung cancer in X-ray films by simulation of CNN (Cellular Neural Network) templates [19].

III. METHODOLOGY
In this method, lung cancer is detected and forecast from CT images using a dynamic particle swarm optimization method. The processing steps of the proposed method are shown in Figure 2.

A. Input Image.
Image acquisition is the process of acquiring a digital image from a database. Images are generally acquired by different types of scanners, such as MRI and CT; a CT image is acquired from a CT scanner. Computed Tomography (CT) is an imaging procedure that generates cross-sectional images. The scan is a non-invasive and painless diagnostic tool. It is also referred to as CAT (computerized axial tomography).

Figure 2: Proposed Model. (Flowchart: Input Image, Pre-processing, FCM Segmentation, Training Image and Feature Extraction, CNN Classifier, Validate, Normal Lung or Abnormal Lung.)

First, the CT image of the lung is read from the database. The acquired image usually contains a low level of noise; removing it directly risks losing clarity, so the noise is removed with a processing technique: a median filter, which is a nonlinear digital filter, is used to reduce the noise. The enhanced image is then segmented by the FCM algorithm. The extracted image is then given to the CNN classifier, which classifies the lung as normal or abnormal.
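The median filter named in the model overview can be illustrated with a short sketch. This is a pure-Python 3 × 3 version written for illustration only; the window size, the border handling and the function name `median_filter_3x3` are assumptions, and the paper's own implementation is in MATLAB.

```python
from statistics import median_low


def median_filter_3x3(image):
    """Return a copy of `image` (a list of rows of gray values) in which
    each pixel is replaced by the median of its 3x3 neighborhood.
    Border pixels use only the neighbors that exist (an assumed policy)."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            # median_low keeps the result an actual pixel value.
            out[r][c] = median_low(window)
    return out


# Hypothetical 3x4 patch with one impulse-noise pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
]
cleaned = median_filter_3x3(noisy)
```

Because the median ignores a single outlier in the window, the isolated bright pixel is replaced by the surrounding gray level while uniform regions are left unchanged.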

Figure 3: Input lung cancer image.

The input CT scan lung cancer image is shown in Figure 3.

B. Preprocessing.
The images are subjected to preprocessing steps to remove noise and unwanted regions. First, the input image is obtained and resized to a size acceptable to the processing system; because the images have different sizes, they are all resized to the same size. The resized image is converted to a gray-scale image so that only one color channel is used. Gray-scale comparison involves simple scalar algebraic operators, and a gray-scale image is sufficient to distinguish peaks of intensity. The gray-scale image is then converted to a binary image, a digital image in which each pixel has one of two possible values.

Figure 4: Noisy image.
Figure 5: Wiener Filter Image.
Figure 6: Binarization image.
Figure 7: Estimated Bias Field.

The noise and unwanted regions of the input image are shown in Figure 4. To remove them, the input image is filtered, as shown in Figure 5. The binarized image is shown in Figure 6 and the estimated bias field in Figure 7.

C. Segmentation.
This step uses an automatic graph-cut-based segmentation framework with a distance-constrained energy function to produce topologically restricted solutions. The distance term ensures that labels are assigned only to lung pixels, even in the presence of other anatomical regions with similar lung-like patterns. The Euclidean distance is specified to make clear that the distance referred to in this work is the distance between two points, not a measure of the difference between two regions; any metric can therefore be used to measure the distance between points or regions. The contribution of this work is an automatic lung segmentation method using graph cut that produces topologically restricted solutions to identify the lungs accurately in a CT image.
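Figure 2 names FCM as the segmentation step of the proposed model. As a rough illustration of how fuzzy clustering separates gray levels, the sketch below runs the textbook fuzzy c-means iteration on scalar intensities in pure Python. The fuzzifier `m = 2`, the initial centers, the tolerance and all function names are assumptions made for the example, and the update rule is the standard one, not necessarily the exact variant used in the paper.

```python
def fuzzy_c_means(values, centers, m=2.0, max_iter=100, tol=1e-6):
    """Standard fuzzy c-means on scalar gray values.

    values  : list of pixel intensities
    centers : initial cluster centers, one per cluster (assumed)
    m       : fuzzifier (> 1); m = 2 is a common choice
    Returns (centers, memberships) with memberships[k][n] in [0, 1].
    """
    centers = list(centers)
    memberships = []
    for _ in range(max_iter):
        # Membership update: inverse squared distance, normalized per pixel.
        memberships = []
        for x in values:
            dist2 = [(x - c) ** 2 for c in centers]
            if 0.0 in dist2:
                # Pixel lies exactly on a center: full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in dist2]
            else:
                row = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in dist2]
            total = sum(row)
            memberships.append([w / total for w in row])
        # Center update: mean of the pixels weighted by membership ** m.
        new_centers = []
        for n in range(len(centers)):
            weights = [row[n] ** m for row in memberships]
            weighted_sum = sum(w * x for w, x in zip(weights, values))
            new_centers.append(weighted_sum / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


def segment(memberships):
    """Assign each pixel to the cluster with the largest membership."""
    return [row.index(max(row)) for row in memberships]


# Hypothetical gray levels: four dark pixels and four bright pixels.
pixels = [12, 15, 10, 14, 200, 210, 205, 198]
centers, u = fuzzy_c_means(pixels, centers=[0.0, 255.0])
labels = segment(u)
```

Each pixel keeps a graded membership in every cluster during the iteration; a hard label is taken only at the end, from the largest membership.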

Figure 8: FCM filter.
Figure 9: Gradient magnitude.
Figure 10: Opening-closing by reconstruction.
Figure 11: Regional maxima of opening-closing by reconstruction.
Figure 12: Modified regional maxima superimposed on original image.
Figure 13: Threshold opening-closing by reconstruction.

D. Training.
The back-propagation algorithm is used to train the CNN to detect lung tumors in CT images of size 512 × 512 pixels. The network consists of two phases. In the first phase, a CNN consisting of multiple volumetric convolution, rectified linear unit (ReLU) and max pooling layers is used to extract valuable volumetric features from the input data. The second phase is the classifier: it has multiple fully connected (FC) and threshold layers, followed by a SoftMax layer, to perform the high-level reasoning of the neural network. No scaling was applied to the CT images of the dataset, to preserve the original values of the DICOM images as much as possible. During training, random sub-volumes are extracted from the CT images of the training set and normalized according to an estimate of the normal distribution of the voxel values in the dataset.
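Two small pieces of the training description can be sketched in code: normalizing voxel values with dataset-level statistics, and the SoftMax layer that turns class scores into probabilities. The sketch assumes that "normalized according to an estimate of the normal distribution" means a z-score with the dataset mean and standard deviation; the numeric values and function names are hypothetical.

```python
import math


def normalize_voxels(voxels, dataset_mean, dataset_std):
    """Standardize voxel values with dataset-level statistics (z-score)."""
    return [(v - dataset_mean) / dataset_std for v in voxels]


def softmax(scores):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values for illustration only.
sub_volume = [-1000.0, -400.0, 40.0, 400.0]   # flattened voxel values
normalized = normalize_voxels(sub_volume, dataset_mean=-240.0, dataset_std=500.0)
probabilities = softmax([2.0, 0.5])           # scores for two classes
```

Using one mean and one standard deviation for the whole dataset, rather than per image, keeps the relative intensity scale of the original DICOM values, which matches the stated aim of not rescaling individual scans.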

Figure 14: CNN Performance Validation. (Mean squared error (mse) on a logarithmic axis versus 148 epochs for the Train, Validation and Test sets; best validation performance is 0.0086636 at epoch 142.)

Figure 15: CNN Training state. (Gradient = 0.00076703 at epoch 148; validation checks = 6 at epoch 148.)

The mean squared error for training, validation and test is shown in Figure 14. The validation checks and training state are shown in Figure 15.

Figure 16: Regression. (Output versus Target panels with Data, Fit and Y = T lines. Training: R = 0.98677; Validation: R = 0.87533; Test: R = 0.86456; All: R = 0.95237. One panel's fit label reads Output ~= 0.94*Target + 0.0031.)

The cancer-affected level is fitted by regression on the training, validation and test sets, as shown in Figure 16.

E. Validate.
The neural network, based on convolution and segmentation, has been implemented in MATLAB, and the system is trained with sample data sets so that the model learns to recognize lung cancer. A sample image is fed as input to the trained model, which at this stage is able to report the presence of cancer and locate the cancer spot in the sample lung image. The process involves feeding the input image, preprocessing, feature extraction, identifying the cancer spot and indicating the results to the user. If malignancy is present, a message indicating its presence is displayed on the screen.

Figure 17: Cancer detected part.

Figure 18: Result.

The cancer-affected part is shown in Figure 17 and the result in Figure 18.

IV. CLUSTERING ALGORITHMS
A. Fuzzy C-Means Clustering Algorithm.
Clustering is the method of separating data into homogeneous units by considering the relationships between objects. The clustering method allocates the feature vectors to N clusters; every n-th cluster has C_n as its center. Fuzzy clustering is employed in numerous areas such as pattern recognition and fuzzy detection. Among the various kinds of fuzzy clustering methods, Fuzzy C-Means (FCM) clustering is the most extensively used. FCM uses reciprocal distance to determine fuzzy weights. The input of the process is a known number of clusters, N. The mean position of all the members of a cluster is identified. The output is a partition of a class of objects into N clusters. The goal of FCM clustering is to reduce the total weighted mean square error (MSE). FCM allows each feature vector to belong to several clusters with different fuzzy membership values. The final segmentation is based on the maximum weight of the feature vector over all clusters. The steps of the FCM algorithm are given below.

Input: feature vectors (image voxels) x_1, x_2, ..., x_V; N = number of clusters.
Output: a group of clusters that lessens the sum of the distance errors.
Steps:
1. Set a random weight for every pixel using fuzzy weighting, with positive weights {W_vn} ranging from 0 to 1.
2. Normalize the starting weights of each voxel v over all N clusters:
$$W_{vn} \leftarrow \frac{W_{vn}}{\sum_{i=1}^{N} W_{vi}} \tag{1}$$
3. For each cluster n = 1, ..., N, normalize the weights over all voxels to get W_vn:
$$W_{vn} \leftarrow \frac{W_{vn}}{\sum_{i=1}^{V} W_{in}}, \quad v = 1, 2, \ldots, V \tag{2}$$
4. Estimate the new centroids C_n, n = 1, ..., N, from
$$C_n = \sum_{v=1}^{V} W_{vn}\, x_v, \quad n = 1, 2, \ldots, N \tag{3}$$
5. Update the weights {W_vn} using
$$W_{vn} = \frac{\left(1/\lVert x_v - C_n \rVert^{2}\right)^{1/(p-1)}}{\sum_{i=1}^{N} \left(1/\lVert x_v - C_i \rVert^{2}\right)^{1/(p-1)}}, \quad n = 1, 2, \ldots, N,\; v = 1, 2, \ldots, V \tag{4}$$
where p > 1 is the fuzziness exponent.
6. If the weights were altered, repeat from step 3; otherwise stop.
7. Assign each pixel to a cluster according to its maximum weight.

B. Convolution Neural Network.
The architecture of a CNN with one hidden layer is depicted in Figure 19. It is examined for its ability to classify the nodules. The network consists of three layers: one input layer, one hidden layer and one output layer. The input layer has P × P neurons, which represent the P × P pixels of the image obtained from the segmentation process. The hidden layer contains n groups of N × N neurons, each organized as an independent N × N feature map (where N = P − r + 1); the r × r region is referred to as the area of interest. Each hidden neuron takes its input from an r × r neighborhood of the input image. If two neurons in the same feature map are one neuron apart, their areas of interest in the input layer are one pixel apart. Every neuron of the same feature map is constrained to use the identical set of r × r weights and to perform the same operation on the corresponding part of the input image.

Figure 19: Architecture of One Hidden Layer CNN.

Constraining the weights in this way allows the network to achieve shift-invariant pattern recognition. The overall operation is thus an r × r convolution kernel, and the feature map is the output obtained by convolving the input with this kernel. Each hidden neuron y_j produces its output by means of an activation function. The minimum and maximum values of the activation function are zero and one, respectively.
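A plausible form of the two activations is sketched below. It is an assumption, not the paper's printed equation: a logistic sigmoid is used for the hidden neuron because it matches the stated output range of zero to one, and the paper itself describes the output neuron as a sigmoid. The symbols W_ij, X_i, A_j, W_oj and g_o follow the paper's notation; the set name area(j), standing for the r × r pixels feeding hidden neuron j, is introduced here for illustration.

```latex
% Hidden neuron j: sigmoid of the weighted sum over its r x r area of interest.
y_j = f\!\left( \sum_{i \in \mathrm{area}(j)} W_{ij}\, X_i + A_j \right),
\qquad
f(a) = \frac{1}{1 + e^{-a}}, \quad 0 < f(a) < 1

% Output neuron o: sigmoid of the weighted sum over all nN^2 hidden outputs.
z_o = f\!\left( \sum_{j=1}^{nN^2} W_{oj}\, y_j + g_o \right)
```

Under this form, every neuron in one feature map applies the same r × r weights to a shifted patch, which is exactly a convolution followed by a pointwise nonlinearity.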

W_ij - the weight between hidden neuron j and pixel i of the input image.
X_i - the gray value of input pixel i.
A_j - the bias of hidden neuron j.
x_1, x_2, ..., x_r - the pixels of the input image that are connected to neuron j.

The output layer is fully connected to the hidden layer. The sigmoid activation function z_o of the output neuron is expressed in terms of:
W_oj - the weight between the output neuron and neuron j in the hidden layer.
nN^2 - the total number of neurons in the hidden layer.
g_o - the bias of the output neuron.

Hence, the network contains (O + P^2 + nN^2) neurons and (nN^2 (r^2 + O + 1) + O) links, where r × r is the kernel size. These numbers include the input neurons and the bias links. The number of independent links is nN^2 (O + 1) + n r^2 + O, where O is the number of output neurons.

The network weights and the bias weights are adjusted by the Back Propagation (BP) algorithm. The BP algorithm iteratively adjusts the weights with the aim of reducing the total error between the actual output vector and the target vector. The error function to be minimized is the Sum-of-Squared Error (SSE). During training, the areas of interest within one hidden feature map are constrained to share the same set of weights. The weights between the hidden and output layers and the weights of every area of interest are updated in stochastic mode: the weight change for each training sample is obtained from the back-propagated error and applied immediately for every neuron.

V. DATA SECTION

Our primary dataset is the patient lung CT scan dataset from Kaggle's Data Science Bowl (DSB) 2017 [20]. The dataset contains labeled data for 1387 patients, which we divide into a training set of size 968 and a test set of size 419. For each patient, the data consist of CT scan data and a label (0 for no cancer, 1 for cancer). Note that the Kaggle dataset does not have labeled nodules. For each patient, the CT scan data consist of a variable number of images (typically around 100-400, each image being an axial slice) of 512 × 512 pixels. The slices are provided in DICOM format. Around 75% of the provided labels in the Kaggle dataset are 0, so we used a weighted loss function in our malignancy classifier to address this imbalance.

Because the Kaggle dataset alone proved inadequate to classify the validation set accurately, we also used the patient lung CT scan dataset with labeled nodules from the Lung Nodule Analysis 2016 (LUNA16) Challenge [21] to train a U-Net for lung nodule detection. The LUNA16 dataset contains labeled data for 888 patients, which we divided into a training set of size 710 and a validation set of size 178. For each patient, the data consist of CT scan data and a nodule label (a list of nodule center coordinates and diameters). For each patient, the CT scan data consist of a variable number of images (typically around 100-400, each image being an axial slice) of 512 × 512 pixels.

LUNA16 data was used to train a U-Net for nodule detection, one of the phases in our classification pipeline. The problem is to predict a patient's label ('cancer' or 'no cancer') accurately from the patient's Kaggle lung CT scan.

VI. EXPERIMENTAL RESULTS.

The performance of the proposed study is evaluated by benchmark metrics: sensitivity, specificity and accuracy. These metrics and the way their values are estimated are described below. They are computed from the confusion matrix, which contains the true and false positives and the true and false negatives. True positives and true negatives are cases predicted as diseased and non-diseased that are in fact diseased and non-diseased, respectively. False positives and false negatives are the opposite: cases predicted as diseased or non-diseased that are in fact the reverse.

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{5}$$
Sensitivity is the number of true positive predictions divided by the total number of actual positives.

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{6}$$
Specificity is the number of true negative predictions divided by the total number of actual negatives.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$
where TP = true positive, TN = true negative, FP = false positive and FN = false negative.
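Equations (5) to (7) translate directly into code. The short sketch below is an illustration with a hypothetical function name; the counts are those of the training row of Table 1, so the printed values can be compared with the percentages reported there.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                  # Eq. (5)
    specificity = tn / (tn + fp)                  # Eq. (6)
    accuracy = (tp + tn) / (tp + tn + fp + fn)    # Eq. (7)
    return sensitivity, specificity, accuracy


# Counts from the training row of Table 1 (Kaggle dataset, 968 cases).
sens, spec, acc = confusion_metrics(tp=842, tn=10, fp=12, fn=104)
print(f"{sens:.1%} {spec:.1%} {acc:.1%}")   # 89.0% 45.5% 88.0%
```

With so few true negatives in the row (22 actual negatives out of 968 cases), accuracy is dominated by sensitivity, which is why specificity is worth reporting alongside it.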

Table 1: Kaggle Dataset Accuracy Calculation.
Sample               Dataset size   TP    TN   FP   FN    Sensitivity   Specificity   Accuracy
Training dataset     968            842   10   12   104   89%           46%           88%
Testing/Validation   419            376   5    7    31    93%           42%           91%
Average                                                                               89%

Table 2: LUNA16 Dataset Accuracy Calculation.
Sample               Dataset size   TP    TN   FP   FN    Sensitivity   Specificity   Accuracy
Training dataset     710            601   9    11   89    86%           45%           86%
Testing/Validation   178            148   4    5    21    88%           44%           85%
Average                                                                               85%
VII. CONCLUSION

In this paper we developed a convolutional neural network (CNN) architecture to detect nodules in CT scans of lung cancer patients. Segmentation serves as a preprocessing step for the CNN. We perform well considering that we use less labeled data than most state-of-the-art CAD systems. As an interesting observation, the first layer is a preprocessing layer for segmentation using different techniques. Thresholding, FCM and CNN are used to identify the nodules of patients.

The network can be trained end-to-end from image patches. Its main requirement is the availability of a training database; otherwise no assumptions are made about the objects of interest or the underlying image modality.

In the future, it could be possible to extend our current model not only to determine whether or not the patient has cancer, but also to determine the exact location of the cancerous nodules. The most immediate future work is to use FCM segmentation as the initial lung segmentation. Also, we saved our model based on accuracy, but other metrics could have been used instead. Other future work includes extending our models to images of other cancers. Because the approach does not require much labeled data specific to lung cancer, it could generalize to other cancers.

VIII. REFERENCES

[1] W. J. Choi and T. S. Choi, "Automated pulmonary nodule detection system in computed tomography images: A hierarchical block classification approach," Entropy, vol. 15, no. 2, pp. 507–523, 2013.
[2] A. Chon, N. Balachandar, and P. Lu, "Deep convolutional neural networks for lung cancer detection," tech. rep., Stanford University, 2017.
[3] Y. LeCun, K. Kavukcuoglu, and C. Farabet, "Convolutional networks and applications in vision," in Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), pp. 253–256, IEEE, 2010.
[4] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems 25 (NIPS 2012) (F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, eds.), pp. 1097–1105, 2012.
[5] H. Suk, S. Lee, and D. Shen, "Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis," NeuroImage, vol. 101, pp. 569–582, 2014.
[6] G. Wu, M. Kim, Q. Wang, Y. Gao, S. Liao, and D. Shen, "Unsupervised deep feature learning for deformable registration of MR brain images," Medical Image Computing and Computer-Assisted Intervention, vol. 16, no. Pt 2, pp. 649–656, 2013.
[7] Y. Xu, T. Mo, Q. Feng, P. Zhong, M. Lai, and E. I. Chang, "Deep learning of feature representation with multiple instance learning for medical image analysis," in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, pp. 1626–1630, 2014.
[8] D. Kumar, A. Wong, and D. A. Clausi, "Lung nodule classification using deep features in CT images," in 2015 12th Conference on Computer and Robot Vision, pp. 133–138, June 2015.
[9] Y. Bar, I. Diamant, L. Wolf, S. Lieberman, E. Konen, and H. Greenspan, "Chest pathology detection using deep learning with non-medical training," Proceedings - International Symposium on Biomedical Imaging, vol. 2015-July, pp. 294–297, 2015.
[10] W. Sun, B. Zheng, and W. Qian, "Computer aided lung cancer diagnosis with deep learning algorithms," in SPIE Medical Imaging, vol. 9785, pp. 97850Z–97850Z, International Society for Optics and Photonics, 2016.
[11] J. Tan, Y. Huo, Z. Liang, and L. Li, "A comparison study on the effect of false positive reduction in deep learning based detection for juxtapleural lung nodules: CNN vs DNN," in Proceedings of the Symposium on Modeling and Simulation in Medicine, MSM '17, (San Diego, CA, USA), pp. 8:1–8:8, Society for Computer Simulation International, 2017.
[12] R. Golan, C. Jacob, and J. Denzinger, "Lung nodule detection in CT images using deep convolutional neural networks," in 2016 International Joint Conference on Neural Networks (IJCNN), pp. 243–250, July 2016.
[13] Ajala Funmilola A, Oke O.A, Adedeji T.O, Alade O.M, Oyo Adewusi E.A, "Fuzzy k-c-means Clustering Algorithm for Medical Image Segmentation", Journal of Information Engineering and Applications, ISSN 2224-5782 (print), ISSN 2225-0506 (online), Vol 2, No. 6, 2012.
[14] Christian D., Naoufel W., Fatma T., Hussain, "Cell Extraction from Sputum Images for Early lung Cancer Detection", IEEE 978-1-4673-0784-0/12, 2012.
[15] Mokhled S. AL-TARAWNEH, "Lung Cancer Detection Using Image Processing Techniques", Leonardo Electronic Journal of Practices and Technologies, ISSN 1583-1078, Issue 20, January-June 2012.
[16] Sajith Kecheril S, D Venkataraman, J Suganthi and K Sujathan, "Segmentation of Lung Glandular Cells using Multiple Color Spaces", International Journal of Computer Science, Engineering and Applications (IJCSEA), Vol. 2, No. 3, June 2012.
[17] Tseng L, Huang L, "An Adaptive Thresholding Method for Automatic Lung Segmentation in CT Images", IEEE AFRICON, pp. 1-5, September 23-25, 2009.
[18] Chaudhary A, Singh S S, "Lung Cancer Detection Using Digital Image Processing", IJREAS, vol. 2, Issue 2, pp. 1351-1359, 2012.
[19] Abdullah A A and Mohamaddiah H, "Development of Cellular Neural Network Algorithm for Detecting Lung Cancer Symptoms", IEEE EMBS Conference on Biomedical Engineering & Sciences, pp. 138-143, 2010.
[20] Kaggle, "Data science bowl 2017." https://www.kaggle.com/c/data-science-bowl-2017/data, 2017.
[21] LUNA16, "Lung nodule analysis 2016." https://luna16.grand-challenge.org/, 2017.
