Article 4
to acquire a clear-cut knowledge of the condition of the patients. Moreover, it helps physicians in identifying the type and vigor of the cancer cells [8, 9].

Machine learning techniques can be classified into two major types based on their application and working nature. For lung cancer prediction, several research contributions and prediction methods have been introduced. In this research work, we have taken two supervised learning methods, Artificial Neural Networks (ANN) and Support Vector Machine (SVM), and two unsupervised learning methods, Apriori and K-means, for this comparative study. The datasets were downloaded from the open-source Cancer Imaging archives and given as the training set to these machine learning algorithms. The preprocessing, feature extraction and selection are kept the same for all four methods.

This comparative study is organized as follows. Section 2 describes the difference between supervised learning and unsupervised learning. Section 3 explains the preprocessing, feature extraction and selection. Section 4 evaluates the performance of ANN, SVM, Apriori and K-means, and Section 5 concludes and discusses the future work of the comparative study.

2. SUPERVISED LEARNING AND UNSUPERVISED LEARNING

In supervised learning, the machine learning system is trained with well-labeled information, which means that some data is already tagged with the correct answer, so the system's output can be compared directly with it during the learning process. A supervised learning algorithm learns from labeled training data and helps to predict outcomes for unforeseen data. In an unsupervised machine learning technique, by contrast, the dataset does not have clear labels. Instead, the algorithm must be programmed so that it discovers the information on its own. Unsupervised machine learning techniques can perform more complex processing tasks than supervised learning algorithms, but their results are less predictable than those of other deep learning and natural learning processes.

A. Supervised learning algorithm: Artificial Neural Networks (ANN)

An ANN maintains a set of interconnected nodes, called neurons, that gather information by identifying relationships and new patterns in the data. It has three layers: an input neuron layer, a hidden neuron layer and an output neuron layer. Neurons in each layer receive the input data, perform operations and forward the data to the connected neurons nearby. Each neuron, and each edge that connects neurons, has a particular weight. The weights change based on what is learned. ANN allows both forward and backward propagation for learning.

The final result of an ANN is produced based on the maximum probability among the neurons in the output layer. Even though several algorithms exist to predict early-stage lung cancer, using ANN will produce an accurate result.
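
The forward propagation through the three layers, and the choice of the output neuron with the maximum probability, can be illustrated with a minimal sketch. It is not the network used in this study: the layer sizes, the random weights, the tanh hidden activation and the softmax output are all assumptions made for illustration, and training by backward propagation is left out.

```python
import math
import random

def softmax(scores):
    """Turn raw output-layer scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum of the inputs plus a bias per neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(features, w_hidden, b_hidden, w_out, b_out):
    """Forward propagation: input layer -> hidden layer (tanh) -> output layer (softmax)."""
    hidden = [math.tanh(h) for h in dense(features, w_hidden, b_hidden)]
    return softmax(dense(hidden, w_out, b_out))

def predict(probabilities, labels):
    """Pick the class whose output neuron has the highest probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best]

# Hypothetical sizes and random weights, for illustration only.
random.seed(0)
n_in, n_hidden, labels = 4, 3, ["T", "M", "N"]
w_hidden = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(len(labels))]
b_out = [0.0] * len(labels)

probs = forward([0.2, 0.7, 0.1, 0.5], w_hidden, b_hidden, w_out, b_out)
print(probs, predict(probs, labels))
```

In a trained network the weights would come from backward propagation on the labeled training set rather than from a random generator.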
be attained only if the SVM draws the hyperplane separating all the objects into their classes correctly.

Here, support vectors are the data points that lie closest to the hyperplane and influence its position and orientation. Using these support vectors, the margin of the classifier can be maximized. Deleting the support vectors will change the position of the hyperplane. These are the points that help us build an accurate SVM model.

C. Unsupervised learning algorithm: Apriori Algorithm

The Apriori algorithm is a classical frequent itemset generation algorithm and a milestone in the development of data mining. It is used for finding frequent itemsets in a dataset for Boolean association rules. The Apriori algorithm uses prior knowledge of frequent itemset properties. It follows an iterative, level-wise search in which the frequent k-itemsets are used to find the frequent (k+1)-itemsets.
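
The level-wise search can be sketched in a few lines. This is a minimal illustration and not the implementation used in this study: the function name `apriori`, the absolute `min_support` threshold and the example records are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count the transactions containing each candidate; keep the frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = count({frozenset([i]) for i in items})
    frequent = dict(current)

    k = 1
    while current:
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k + 1}
        # Prune step (prior knowledge): every k-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k))}
        current = count(candidates)
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical records, invented for illustration.
records = [
    {"cataract", "elder", "female"},
    {"cataract", "elder", "male"},
    {"rhinitis", "child", "female"},
    {"cataract", "elder", "female"},
]
result = apriori(records, min_support=2)
for itemset, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

The prune step is where the prior knowledge enters: an itemset can only be frequent if all of its subsets are frequent, so most candidates are discarded before the dataset is scanned again.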
Feature extraction is another dimensionality reduction method through which the raw data is transformed into a manageable group of data for further processing. It plays an important role in image processing, as multiple parameters are needed to process the images. It includes low-level extraction, edge-level extraction, curvature extraction, shape detection, motion detection and so on. Here, the low-level processing of images includes several kinds of detection: detection of edges, detection of corners, blob detection for detecting regions in the images, ridge detection for extracting thin lines that are brighter than the nearby regions, and feature transforms through differences in the scales of images. Curvature extraction aims to extract the direction of edges. It also identifies the change in intensity of the images and the autocorrelation. Shape detection involves finding the threshold of the images, region extraction and template matching. It also includes the Hough transform, which extracts imperfect instances of objects within a class through a voting procedure. Motion detection involves extracting the motion in the images and the optical flow by examining the area of the images.

                              True/Actual
Predicted           Type ‘T’   Type ‘M’   Type ‘N’
Cancer Type ‘T’         96          8          4
Cancer Type ‘M’          5         89          5
Cancer Type ‘N’          4          5        104
Table 1. Prediction of cancer type using ANN

                              True/Actual
Predicted           Type ‘T’   Type ‘M’   Type ‘N’
Cancer Type ‘T’        101          6          2
Cancer Type ‘M’          7         95          8
Cancer Type ‘N’          8          5         88
Table 2. Prediction of cancer type using linear SVM

Table 1 and Table 2 represent the predictions made through ANN and SVM, respectively. The accuracy of the ANN model is 90.2%: out of 320 total predicted values, ANN correctly predicted 290. The accuracy of the SVM algorithm, however, is 88%, with 284 predictions made correctly.
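
The accuracy figures follow from the confusion matrices: the correct predictions lie on the diagonal, and accuracy is their sum divided by the total count. The sketch below recomputes this from the values as transcribed in Table 1 and Table 2, assuming that rows are predicted types and columns are actual types. On these transcribed values the diagonal sums to 289 of 320 for ANN and 284 of 320 for SVM, so the printed result may differ slightly from the rounded figures quoted in the text.

```python
def accuracy(matrix):
    """Return (correct, total, fraction correct) for a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))  # diagonal entries
    total = sum(sum(row) for row in matrix)
    return correct, total, correct / total

# Rows: predicted types T, M, N. Columns: actual types T, M, N.
# Values transcribed from Table 1 (ANN) and Table 2 (linear SVM).
ann = [[96, 8, 4], [5, 89, 5], [4, 5, 104]]
svm = [[101, 6, 2], [7, 95, 8], [8, 5, 88]]

for name, matrix in (("ANN", ann), ("SVM", svm)):
    correct, total, acc = accuracy(matrix)
    print(f"{name}: {correct}/{total} correct, accuracy {acc:.1%}")
```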
Using Apriori (Full Data)
Disease Diagnose            Age Cluster   Gender   Status of Care   Confidence (%)
Cataracts not Specified     --            Male     Out              69
Cataracts not Specified     Elder         --       Out              76
Cataracts not Specified     Elder         Female   Out              60
Cataracts not Specified     Elder         --       Out              66
Another Allergic Rhinitis   --            Female   Out              69
Another Allergic Rhinitis   Child         Female   Out              69
Post Operation              Adult         Male     Out              69
Table 6. Data processing using Apriori algorithm

K-Means + Apriori
Cluster     Disease Diagnose            Age Cluster   Gender   Status of Care   Confidence (%)
Cluster 1   Cataracts not Specified     Elder         Male     Outpatient       66
Cluster 2   Cataracts not Specified     Elder         Female   Outpatient       66
Cluster 3   Another allergic rhinitis   Child         Female   Outpatient       92
Cluster 4   Post Operation              Adult         Male     Outpatient       93
Table 7. Data processing using K-Means and Apriori
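
The confidence values in these tables are rule confidences. For an association rule X -> Y, the standard definition is the share of records containing X that also contain Y. The sketch below shows that computation on hypothetical records invented for illustration. It does not reproduce the figures in Table 6 or Table 7, and the function name `confidence` is an assumption.

```python
def confidence(records, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent, as a percentage."""
    antecedent, consequent = set(antecedent), set(consequent)
    has_antecedent = [r for r in records if antecedent <= r]
    if not has_antecedent:
        return 0.0  # the rule never applies, so report zero confidence
    has_both = [r for r in has_antecedent if consequent <= r]
    return 100.0 * len(has_both) / len(has_antecedent)

# Hypothetical patient records, invented for illustration.
records = [
    {"elder", "female", "cataract", "outpatient"},
    {"elder", "male", "cataract", "outpatient"},
    {"elder", "female", "post operation", "outpatient"},
    {"child", "female", "allergic rhinitis", "outpatient"},
]
print(confidence(records, {"elder"}, {"cataract"}))
```

Running Apriori inside each K-means cluster restricts `records` to more homogeneous groups, which is one plausible reason the confidence of a rule can rise compared with mining the full data.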