Lung Cancer Detection Based On CT-Scan Images With Detection Features Using Gray Level Co-Occurrence Matrix GLCM and Support Vector Machine SVM Methods
Anwar Anwar
Department of Informatics and
Computer Engineering
Politeknik Elektronika Negeri Surabaya
Surabaya, Indonesia
anwar@pasca.student.pens.ac.id
Abstract— Lung cancer comprises all malignant diseases of the lungs, including malignancies originating in the lungs themselves (primary) and those originating in other organs (metastasis). Lung cancer is one of the leading causes of death worldwide. It is a tumor that grows rapidly and can spread to other organs. The onset of cancer is characterized by abnormal cell growth that can damage normal tissue cells. Computerized Tomography (CT) is an imaging technique often used to diagnose lung cancer. Lung cancer can be classified as benign or malignant. Diagnosing lung cancer at an early stage is very important because it speeds up treatment and the actions to be taken. This study aims to develop a lung cancer detection system based on CT-scan images. The detection system has 4 main stages: pre-processing of CT-scan images to improve image quality; segmentation to identify and separate the desired cancer object from the background; feature extraction based on area, contrast, energy, entropy, and homogeneity; and classification of lung cancer as benign or malignant. In the system trial, the accuracy of the system's decision on whether a lung cancer is benign or malignant was 83.33%.

Keywords— CT-Scan Image, Classification, Lung Cancer, Segmentation, System Detection

I. INTRODUCTION

Lung cancer is a disease of the lungs that includes malignancies originating in the lungs themselves (primary) and those originating in other organs (metastasis). It is one of the leading causes of death worldwide: cancer caused 9.6 million deaths in 2018, of which 1.76 million were due to lung cancer (WHO, 2018). According to WHO statistics, in Indonesia lung cancer accounts for 21.8% of cancer deaths in men and 9.1% in women, with an average of 22,475 men and 8,390 women affected by lung cancer each year (WHO, 2018).

Lung cancer is a tumor located in the lung that grows rapidly and spreads to other organs. The process of cancer is characterized by abnormal cell growth that can damage normal tissue cells. Based on histopathology, lung cancer is divided into two types: Small Cell Lung Cancer (SCLC) and Non-Small Cell Lung Cancer (NSCLC), the latter consisting of squamous cell carcinoma (SCC), adenocarcinoma (ADC), and large cell carcinoma. Of these types, NSCLC causes 80-90% of lung cancer deaths in the world. Based on pathological findings, lung cancers are also divided into benign and malignant cancers. Most cases of lung cancer are found at an advanced stage, making healing difficult (Dandil, 2014).

Computerized Tomography (CT) is one of the imaging techniques most often used in the diagnosis of lung cancer. Lung cancers with various pathological features can be seen on CT. Lung cancer is categorized as benign or malignant. During diagnosis, a cancer of a certain density can in some cases be categorized as benign. In many cases, however, a lung that tends to be congested is classified as malignant cancer. It is very important to diagnose lung cancer at an early stage to improve the treatment process. Systems designed for medical applications provide a variety of benefits for optimizing lung cancer detection. Such a system allows patients to start treatment early and can simplify the decision-making process of doctors (Riti, 2016).

Segmentation aims to make the image easier to analyze by distinguishing the objects of interest from one another. Assigning a label to each pixel in an image distinguishes the characteristics of each pixel. Image segmentation produces a set of contours separated from the background. Each pixel in an image has a different color, intensity, and
Authorized licensed use limited to: Alliance University. Downloaded on March 13,2025 at 15:18:04 UTC from IEEE Xplore. Restrictions apply.
B. General system design
The system design used in this research consists of 6 main parts: pulmonary CT-scan image input, pre-processing, segmentation, feature extraction, classification, and decision making, as shown in Figure 1.

Fig. 2. Lung Cancer CT Scan Image.

2) Pre-processing
The second process, carried out after the image has been loaded successfully, is pre-processing. It consists of two stages: grayscale conversion to improve the gray-level quality, and conversion of the grayscale image to a binary image using the thresholding method.

Fig. 3. Diagram of preprocessing

a. Gray Scale
The input image is a CT-scan image that must be converted to grayscale to make further processing easier. The quality of the grayscale image is then improved to make the next process easier. The result is the output of the grayscale process.

3) Segmentation
The segmentation stage identifies and separates the desired cancer object from the background. It applies the find contour method to the result of the preceding thresholding process and looks for the widest cancer area in terms of pixels. The segmentation process takes the pre-processed image and uses it to extract the area detected as cancer from the original image; the system selects the region containing the most pixels of value 1 in the pre-processed lung CT scan.

4) Feature extraction
Shape-based feature extraction is done by calculating the area of the segmented cancer object, which is then reconstructed to the original image color. After
the cancer object is detected, its area is calculated using the contourArea function in OpenCV.

The contourArea function returns the area enclosed by the contour in pixels. This pixel value is then converted to millimeters using the formula

$$ \mathrm{mm} = \frac{\mathrm{pixels} \times 25.4}{\mathrm{dpi}} \tag{1} $$

At 96 dpi there are 96 pixels per inch, where 1 inch = 25.4 mm, so 1 pixel equals 0.2645833 millimeters.

Texture-based feature extraction is carried out using the Gray Level Co-occurrence Matrix (GLCM) method. The GLCM method calculates the contrast, energy, entropy, and homogeneity of the cancer object. With p(i,j) denoting the normalized GLCM entry for gray levels i and j, the formulas used to calculate these values are:

Contrast:
$$ Con = \sum_{i}\sum_{j} (i-j)^2 \, p(i,j) \tag{2} $$

Energy:
$$ E = \sum_{i}\sum_{j} p(i,j)^2 \tag{3} $$

Entropy:
$$ En = -\sum_{i}\sum_{j} p(i,j) \log p(i,j) \tag{4} $$

Homogeneity:
$$ H = \sum_{i}\sum_{j} \frac{p(i,j)}{1+|i-j|} \tag{5} $$

5) Classification
The classification stage is carried out using the Support Vector Machine (SVM) method. The inputs to this stage are the parameter values for cancer area, contrast, energy, entropy, and homogeneity. The output of the classification stage is a decision of normal, benign, or malignant. Classification with SVM has 2 stages, namely training and testing.

A. Training process
The training process in OpenCV can be done using the SVM::train function. This process builds the SVM model. Training with the Support Vector Machine (SVM) method is done in two stages: the creation of a .txt file and the training itself, which is run from the command prompt.

• Creation of a .txt File
The first step in the training process using the Support Vector Machine (SVM) method is the creation of a .txt file. This file stores the feature extraction values of images that have been labeled as normal, benign lung cancer, or malignant lung cancer. In this process 35 CT scan images are used: 10 normal, 20 malignant lung cancer, and 5 benign lung cancer. Writing format in .txt file.

The .txt file uses numeric labels. In this research there are three labels: label 0 for the parameter values of a normal lung, label 1 for the parameter values of malignant lung cancer, and label 2 for the parameter values of benign lung cancer. Five parameter values are used: the area of the lung cancer, contrast, energy, entropy, and homogeneity.

• SVM Training
After the .txt file has been created, the next step is the training process, which is run from the command prompt using the existing library in OpenCV. The command is used at the command prompt.

Training through the command prompt generates a .model file, which stores the model parameters learned by SVM-train. This .model file is used for predictions in the testing process.

B. Testing Process
The testing process in OpenCV can be done using the SVM::predict function. This process classifies input samples using the trained SVM. Figure 4 illustrates how the testing process works using the SVM-Predict method.

Fig. 4. How the Testing Process Works

In the testing process using SVM, the inputs are the parameter values generated by the feature extraction stage: cancer area, contrast, energy, entropy, and homogeneity of the CT-scan image given to the system. These parameter values are then matched against the parameter values from the training process stored in the .model file. The output of this process is 0, 1, or 2, where 0 is normal, 1 is malignant lung cancer, and 2 is benign lung cancer, using equation (6).

$$ y_i (w \cdot x_i + b) > 0, \quad i = 1, 2, \ldots, n \tag{6} $$

Here x_i is the input data, y_i is the output result with a value of +1 or +2, and w and b are the parameter values. If the output y_i = +1, the result is malignant lung cancer, whereas if the output y_i = +2, the result is benign lung cancer.

C. Decision Making
The last stage is decision making. It is carried out after obtaining the values of the area, contrast, energy, entropy, and homogeneity of the input image, which are then matched with the data of the parameter values
in the result database of the training process. Decisions
resulting from this system can be normal, benign lung cancers,
or malignant lung cancers.
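
The pixel-to-millimeter conversion of equation (1) can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name `pixels_to_mm` is hypothetical, and the default of 96 dpi is the value quoted in the text.

```python
def pixels_to_mm(pixels: float, dpi: float = 96.0) -> float:
    """Convert a pixel measurement to millimeters: mm = pixels * 25.4 / dpi."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return pixels * 25.4 / dpi


if __name__ == "__main__":
    # One pixel at 96 dpi, rounded to the 7 decimals quoted in the text.
    print(round(pixels_to_mm(1), 7))
```

The same function applies to any value returned by an area or length measurement in pixels, provided the image resolution in dpi is known.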
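
The four GLCM texture features can be illustrated with a small pure-Python sketch. It assumes a symmetric co-occurrence matrix for a single pixel offset and the natural logarithm in the entropy; the paper states neither choice, and the names `glcm` and `glcm_features` are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Build a normalized, symmetric GLCM for the pixel offset (dx, dy).

    `image` is a list of rows holding integer gray levels in [0, levels).
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # assumption: count each pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Return contrast, energy, entropy, and homogeneity of a normalized GLCM."""
    contrast = energy = entropy = homogeneity = 0.0
    for i, row in enumerate(p):
        for j, pij in enumerate(row):
            contrast += (i - j) ** 2 * pij
            energy += pij ** 2
            homogeneity += pij / (1 + abs(i - j))
            if pij > 0:  # skip empty cells, since log(0) is undefined
                entropy -= pij * math.log(pij)
    return {
        "contrast": contrast,
        "energy": energy,
        "entropy": entropy,
        "homogeneity": homogeneity,
    }


if __name__ == "__main__":
    tiny = [[0, 1], [0, 1]]  # toy 2x2 image with two gray levels
    print(glcm_features(glcm(tiny, levels=2)))
```

In practice the gray levels are usually quantized to a small number of bins first, since the matrix has `levels * levels` entries.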
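
The idea of thresholding followed by selecting the largest foreground region can be sketched without OpenCV. The code below is a stand-in, not the find contour method itself: it binarizes with a fixed threshold (127 is an assumed value, not taken from the paper) and picks the largest 4-connected region of 1-pixels by breadth-first search. Its pixel count is generally not identical to the polygon area that OpenCV's contourArea reports.

```python
from collections import deque


def threshold(gray, t=127):
    """Binarize a grayscale image (list of rows): 1 where value > t, else 0."""
    return [[1 if v > t else 0 for v in row] for row in gray]


def largest_region(binary):
    """Return (area_in_pixels, pixel_list) of the largest 4-connected region of 1s."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    return len(best), best


if __name__ == "__main__":
    gray = [[200, 200, 0, 0],
            [200, 0, 0, 255],
            [0, 0, 0, 0]]
    area, pixels = largest_region(threshold(gray))
    print(area, sorted(pixels))
```

The returned pixel list can be used as a mask to copy the corresponding pixels from the original image, which mirrors the reconstruction step described for the segmented object.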
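
The separating condition of equation (6) can be checked numerically with a short sketch. It assumes the standard binary SVM convention of labels +1 and -1 rather than the +1/+2 coding used in the text, and the weights, bias, and two-dimensional samples are made-up values for illustration, not parameters from the trained model.

```python
def decision_value(w, b, x):
    """Compute w . x + b for one feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def separates(w, b, samples, labels):
    """Check y_i * (w . x_i + b) > 0 for every sample, with labels in {+1, -1}."""
    return all(y * decision_value(w, b, x) > 0 for x, y in zip(samples, labels))


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if decision_value(w, b, x) > 0 else -1


if __name__ == "__main__":
    # Hypothetical hyperplane and two toy samples (assumed values).
    w, b = [1.0, -1.0], 0.0
    samples = [[2.0, 1.0], [0.0, 3.0]]
    labels = [1, -1]
    print(separates(w, b, samples, labels))
    print(predict(w, b, [5.0, 1.0]))
```

A three-class decision such as normal, malignant, or benign is typically built from several such binary classifiers, for example in a one-versus-one or one-versus-rest scheme.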
Fig. 7. Result Find Contour

After the cancer contour area has been identified, a reconstruction process is carried out to restore the original color of the find contour result. The output of the reconstruction process can be seen in Figure 8.

V. CONCLUSION
This paper discusses the development of a lung cancer detection system based on CT-scan images. The system helps determine whether a lung cancer is benign or malignant from CT scan images processed by the system, and can thus contribute to the medical field by facilitating the diagnosis of lung cancer. In the system trial, the accuracy of the system's decision in diagnosing benign or malignant lung cancer was 83.33%.

REFERENCES
[1] https://www.who.int/en/news-room/fact-sheets/detail/cancer (accessed 28 June 2019)
[2] E. Dandil, ..., and A. Canan,
"Artificial neural network-based classification system for lung nodules
on computed tomography scans," 2014 6th International Conference of
Soft Computing and Pattern Recognition (SoCPaR), Tunis, 2014, pp.
382-386.
[3] Y. F. Riti, H. A. Nugroho, S. Wibirama, B. Windarta, and L. Choridah,
"Feature extraction for lesion margin characteristic classification from
CT Scan lungs image," 2016 1st International Conference on
Information Technology, Information Systems and Electrical Engineering (ICITISEE), Yogyakarta, 2016, pp. 54-58.
[4] R. Wulandari, R. Sigit, and S. Wardhana, "Automatic lung cancer detection using color histogram calculation," 2017 International Electronics Symposium on Knowledge Creation and Intelligent Computing (IES-KCIC), Surabaya, 2017, pp. 120-126.
[5] L. Anifah, Haryanto, R. Harimurti, Z. Permatasari, P. W. Rusimamto, and A. R. Muhamad, "Cancer lung detection on CT scan image using artificial neural network backpropagation based gray level co-occurrence matrices feature," 2017 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Bali, 2017, pp. 327-332.
[6] D. P. Kaucha, P. W. C. Prasad, A. Alsadoon, A. Elchouemi, and S. Sreedharan, "Early detection of lung cancer using SVM classifier in biomedical image processing," 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), Chennai, 2017, pp. 3143-3148.
[7] F. Taher, N. Werghi, and H. Al-Ahmad, "Computer-aided diagnosis system for early lung cancer detection," 2015 International Conference on Systems, Signals and Image Processing (IWSSIP), London, 2015, pp. 5-8.
[8] E. Rendon-Gonzalez and V. Ponomaryov, "Automatic Lung nodule segmentation and classification in CT images based on SVM," 2016 9th International Kharkiv Symposium on Physics and Engineering of Microwaves, Millimeter and Submillimeter Waves (MSMW), Kharkiv, 2016, pp. 1-4.
[9] A. Kulkarni and A. Panditrao, "Classification of lung cancer stages on CT scan images using image processing," 2014 IEEE International Conference on Advanced Communications, Control and Computing Technologies, Ramanathapuram, 2014, pp. 1384-1388.
[10] S. A. El-Regaily, M. A. M. Salem, M. H. A. Aziz and M. I. Roushdy, "Lung nodule segmentation and detection in computed tomography," 2017 Eighth International Conference on Intelligent Computing and Information Systems (ICICIS), Cairo, 2017, pp. 72-78.
[11] M. H. Jony, F. Tuj Johora, P. Khatun and H. K. Rana, "Detection of Lung Cancer from CT Scan Images using GLCM and SVM," 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), Dhaka, Bangladesh, 2019, pp. 1-6.