Comparative Study of Classification Techniques on Breast Cancer FNA Biopsy Data

Haowen You¹ and George Rumbe²

¹ Department of Systems and Information Engineering, University of Virginia, Charlottesville, Virginia.
² Department of Systems Science and Industrial Engineering, Binghamton University, Binghamton, New York.

International Journal of Artificial Intelligence and Interactive Multimedia, Vol. 1, Nº 3.

Abstract - Accurate diagnostic detection of cancerous cells in a patient is critical and may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently being used in various classification problems due to their accurate prediction performance. Different techniques may provide different accuracies, and it is therefore imperative to use the most suitable method, which provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Bayesian classifier and other artificial neural network classifiers (Backpropagation, linear programming, Learning vector quantization, and K nearest neighborhood) on the Wisconsin breast cancer classification problem.

Keywords: Artificial Neural Networks, Classification, Breast cancer diagnosis

III. INTRODUCTION

The development of automated diagnostics was instigated by the need to aid the physician in decision making. Their application in healthcare has spanned from electrocardiograms to ultrasounds. The traditional set-up for error detection and monitoring of disease progression rests heavily on the technicians within healthcare. The increase in the number of patients who require continuous assessment has led to the technical development of automated systems. Transformations of qualitative information into quantitative measures are at the forefront in solving classification problems. Breast cancer has been identified as the second largest cause of cancer deaths among women aged 40 to 55. The number of breast cancer diagnoses is estimated to be 1.2 million among women every year according to projections by the World Health Organization [4]. According to statistics by the American Cancer Society in 2001, about 40,200 deaths are caused by breast cancer and 192,000 cases consist of women who are newly diagnosed [8]. Additional statistics from 2006 estimated 214,460 new cancer diagnoses and at least 41,000 deaths within the US [10]. Early detection and accurate diagnosis have been crucial in reducing the number of deaths, which has increased the survival rate of those diagnosed with breast cancer [8].

Identifying cancerous cells in a patient is challenging: it is highly subjective and relies on the physician's expertise. This may lead to inaccurate predictions, since the examinations are prone to human and visual error and may be affected by blurred mammogram visuals [11]. These challenges necessitate accurate tools for detection and classification of breast cancer cells. There have been effective systems such as the medical diagnostic decision support systems (MDSS) used in aiding the detection of cancerous breast cells [8]. Machine learning techniques have been instrumental in providing evidence in support of the accuracy of the classification of breast cancer patients. Once the breast cancer diagnosis has been performed, the prognosis is subsequently determined to predict the future development and characteristics of the cancerous cells. Prognosis has been determined to be more complex due to the censoring of data [9]. Diagnosis is used to accurately discern between malignant and benign patterns. Some of the conventionally used approaches for breast cancer detection and diagnosis include mammography, surgical biopsy and fine needle aspirate [9]. The sensitivity of these approaches in accurately identifying malignant lumps ranges as follows: mammography 68%-79%, fine needle aspirate 65%-98% and surgical biopsy about 100% [9]. Surgical biopsy, despite being an effective approach, has been determined to be a costly procedure which induces negative psychological effects on the patients [10]. Another effective method to diagnose breast masses is based on Fine Needle Aspiration biopsy, which is a technique to extract cell samples from a lump and observe the cells visually under a microscope [1]. A diagnostic conclusion (benign or malignant) can be drawn according to the judgment of domain experts [2]. Currently, artificial intelligence techniques, which treat the diagnosis as a pattern classification problem using the cellular nuclei shape information from cell slide images, have been introduced into this area to improve the accuracy, consistency and efficiency of the diagnosis process.

A. Research Objective
The objective of this research is to provide a comparative study of potential classification tools (linear programming, back-propagation neural network, support vector machine and Bayesian network) on this problem, using a benchmark dataset which consists of numeric cellular shape features extracted from preprocessed Fine Needle Aspiration biopsy images of cell slides.

B. Research scope
This research will first implement Support Vector Machine (SVM) and Bayesian network solutions on the benchmark dataset. Then a comparison on this benchmark dataset between the previously adopted techniques (linear programming and back-propagation neural network) and these two newly developed modeling approaches will be conducted. The measures for this comparative study will be selected according to those proposed by the latest publication on this problem [4]. These will include classification accuracy, sensitivity, specificity, positive predictive value and negative predictive value. K-fold cross-validation [5] will also be used to evaluate the overall performance of each model built by the aforementioned approaches. The rest of this research is organized as follows: Section 2 provides detailed information on the literature review, Section 3 introduces the strategies employed by the SVM and Bayesian network classifier, Section 4 discusses a detailed analysis of the results, the complexity of the modeling process and the computational expenditure of these approaches, and Section 5 provides the summary and conclusion of the research.

IV. LITERATURE REVIEW
The increase in the number of deaths determined within the healthcare systems has led to the development of medical diagnostic support systems to aid medical personnel in the decision-making process [10]. Various expert systems and machine learning algorithms have been utilized to provide supporting information based on the input knowledge. Significant developments include 2D and 3D medical imaging, feature extraction, pattern analysis and classification, which have been used to provide solutions for edge detection and region growing, among other problems [10]. According to Pena-Reyes and Sipper (1999), an effective diagnostic system should be able to provide high accuracy in identifying the disease as malignant or benign. In addition, the system should be able to attach a degree of confidence to the diagnosis. Another important aspect is the system's interpretability, which provides information on the steps followed to reach the outcomes generated. The artificial neural network, on the other hand, has been determined to be an effective tool in classification, though the operations within the network structure are hidden.

The classification problem has generated interest among researchers. The classification approach is used in data analysis and pattern recognition problems. This approach involves classifier modeling, in which the classifier is a function that associates a class with a set of attributes. The concept of association based on similarities or trained performance has been embedded in various approaches such as neural networks, decision trees and decision graphs [14]. The methodology of neural networks can be performed in two phases, i.e., training and testing. The training phase involves feature extraction and computation utilizing the classification rules. Testing data, on the other hand, is used to evaluate the accuracy of the classification process determined by the training data [10]. Breast cancer diagnosis and prognosis has attracted research interest and has been explored utilizing various artificial neural networks such as Radial Basis Function, Multilayer perceptrons, Backpropagation, and Learning Vector Quantization networks. Other methods which have been utilized for breast cancer diagnosis include Fuzzy systems and Evolutionary algorithms. The fuzzy systems are used to represent different degrees of the disease (malignant or benign) a patient suffers from; the evolutionary algorithms are used to search for the most suitable fuzzy systems [6].

Isotonic separation, which is a linear programming technique, is based on the underlying assumption of maintaining consistency in diagnosis. For example, in the Wisconsin breast cancer dataset, if a patient is diagnosed with a malignant tumor based on certain characteristics of the cell structures, other patients showing similar symptoms with more damage to the cells would receive the same diagnosis [7]. Another technique is the rank nearest neighbor technique (k-RNN) [11]. The k-RNN approximates densities based on evaluations of the nearest neighbors [11]. It has been applied to univariate and multivariate data in examining various classification problems, including breast cancer. For a patient to receive the appropriate breast cancer treatment, it is necessary that the cells be classified accurately. This has led researchers to combine and employ various machine learning techniques and select the one with the highest prediction accuracy [16].
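The five evaluation measures and the K-fold split named in the research scope can be illustrated with a short sketch. This is not the authors' MATLAB code; the function names, the choice of "M" as the positive class, and the contiguous unshuffled folds are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # malignant cases predicted as malignant
    fp: int  # benign cases predicted as malignant
    tn: int  # benign cases predicted as benign
    fn: int  # malignant cases predicted as benign


def count_outcomes(y_true, y_pred, positive="M"):
    """Tally the four outcome counts, treating `positive` as the malignant class."""
    counts = ConfusionCounts(tp=0, fp=0, tn=0, fn=0)
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                counts.tp += 1
            else:
                counts.fp += 1
        else:
            if truth == positive:
                counts.fn += 1
            else:
                counts.tn += 1
    return counts


def _ratio(numerator, denominator):
    # Return NaN rather than raising when a measure is undefined.
    return numerator / denominator if denominator else math.nan


def evaluation_measures(c):
    """Compute the five measures listed in the research scope."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
    }


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

In practice the observations would normally be shuffled before splitting so that each fold has a similar class mix; that step is left out here to keep the sketch deterministic.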

A Direct Path to Intelligent Tools ISSN - 1989-1660

The comparative analyses of ANNs range from two to six or more networks being evaluated to determine the most appropriate technique. Integration of different ANNs has led to improved performance measures. The RBF properties, when applied to tuning the SVM, have been determined to provide higher prediction accuracy for breast cancer data [12].

V. METHODOLOGY

There have been numerous artificial neural network approaches used for examining the classification of breast cancer cells; some of these approaches are the Bayesian classifier and SVM. This section provides a descriptive discussion of the SVM and Bayesian classifier frameworks. In addition, it examines the strategies employed and some of the parameters that are used for effective classification of patterns.

A. 3.1 Support Vector Machine Stratagem
The Support Vector Machine (SVM) was introduced by Vapnik. It is a technique based on statistical learning theory and has been applied to classification and regression problems [15]. The objective of the SVM is to separate two classes by determining the linear classifier that maximizes the margin, referred to as the optimal separating hyperplane [15]. SVM has been employed in various classification problems and is of current interest in breast cancer detection due to its robustness. The regularization parameter and the kernel function are the two major components that have to be determined before conducting training. Some of the significant research using the SVM for breast cancer detection utilized heuristic SVM approaches such as the smooth SVM, the linear SVM and the general non-linear SVM [12]. The goal of SVM is to determine a suitable hyperplane with maximum margin, which can be computed as an optimization problem [10].

B. 3.2 Bayesian Network Approach
Bayesian networks are characterized by the use of a probabilistic approach in problem solving and encompass the uncertainty of certain occurrences. They are based on a probability distribution which can be depicted graphically. The Bayesian network classifier is composed of a set of variables related to each other by directed edges. The variables represent the data attributes and the class; the arcs, together with the conditional probability tables, depict their relationship in a visual format. Bayesian network classifiers are also referred to as directed acyclic graphs that provide information on the joint probability distribution over various random variables [14]. In the Bayesian network classifier, the connecting arcs between different nodes provide an independence assumption that is associated with the different random variables. The independence assumption provides information on the probability distribution that is represented within the network. Generally, the probability distribution within the network must initially be specified for the root nodes, followed by the conditional probabilities of the remaining non-root nodes based on the combinations of their direct predecessors [13]. The conditional probabilities can only be determined once information on some of the nodes in the network has been identified.

The Bayesian network classifier uses the unsupervised learning algorithm, where the class target is unknown though we have the inputs (attributes) [14]. The classifier learning algorithm can be structured into two parts: (i) a function for assessing a given network based on the data and (ii) an approach for searching the space of possible networks. Various learning algorithms are employed with the Bayesian network, including AD (All dimensions) Trees, TAN (Tree Augmented Naïve Bayes) and K2. K2 has been used in breast cancer classification problems due to its fast convergence. Bayesian nets have been utilized in providing solutions to medical diagnosis, heuristic search and map learning problems, among other challenges [13]. The Bayesian network is based on the independence assumption between the nodes.

C. 3.3 Data Structure
The benchmark dataset in this research was obtained from the UCI Irvine machine learning repository, http://archive.ics.uci.edu/ml/index.html. This dataset was originally created by Dr. Wolberg, Street and Mangasarian, all from the University of Wisconsin. Data items in the dataset are composed of an ID number, the diagnosis, which is either malignant (M) or benign (B), and numeric shape features of the extracted cellular nuclei such as radius, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry and fractal dimension. The dataset is composed of a total of 569 observations, with benign and malignant cases being 357 and 212 observations respectively. Each observation in the dataset is composed of 30 variables, and 10 of the featured variables are related to the aforementioned characteristics [3].

VI. RESULTS AND DATA ANALYSIS
This section provides a discussion of the results and analysis for SVM, Bayesian, LVQ, KNN and BNT_Clustering. Furthermore, a comparative analysis of the aforementioned approaches is presented. The SVM and Bayesian network classifier approaches were developed using MATLAB, and the 10 variables (see section 3.3) were experimented with within the classifiers.
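The record layout described in section 3.3 (ID number, diagnosis, 30 numeric variables) can be sketched as a small parser. The comma-separated format, the field order, and the assumption that the first 10 of the 30 variables are the per-characteristic values are illustrative choices, not details given in the paper; `parse_record` and `mean_features` are hypothetical helper names.

```python
from dataclasses import dataclass
from typing import Dict, List

# The ten nuclear characteristics named in section 3.3.
FEATURE_NAMES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave_points", "symmetry",
    "fractal_dimension",
]


@dataclass
class Observation:
    patient_id: str
    diagnosis: str          # "M" (malignant) or "B" (benign)
    features: List[float]   # the 30 numeric variables


def parse_record(line: str) -> Observation:
    """Parse one record: ID, diagnosis, then 30 numeric values (assumed CSV)."""
    fields = line.strip().split(",")
    if len(fields) != 32:
        raise ValueError(f"expected 32 fields, got {len(fields)}")
    diagnosis = fields[1]
    if diagnosis not in ("M", "B"):
        raise ValueError(f"unknown diagnosis label: {diagnosis!r}")
    return Observation(fields[0], diagnosis, [float(v) for v in fields[2:]])


def mean_features(obs: Observation) -> Dict[str, float]:
    """Select 10 variables, ASSUMING they occupy the first 10 feature slots."""
    return dict(zip(FEATURE_NAMES, obs.features[:10]))
```

Keeping the diagnosis label separate from the feature vector makes it straightforward to pass the same observations to each classifier under comparison.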


A. 4.1 Support Vector Machine

Training for the SVM was conducted by varying the values of C and gamma (γ) under 10-fold cross-validation. The ranges of C and γ were selected within $2^{-15}$ to $2^{5}$ and $2^{3}$ to $2^{15}$ respectively [18]. The two major SVM classifiers evaluated were C-SVM and Nu-SVM, and the kernel functions used were the polynomial, sigmoid and radial basis function (RBF). For the C-SVM with the polynomial kernel function, $C = 1$ and $\gamma = 2^{-3}$ gave 98.07% prediction accuracy, which was the best of all combinations. Figure 1 shows the surface plot over the values of C and γ. The initial values examined show a flat surface, indicating that the classification accuracy remained constant at 62.74%; progressively better predictions above 90% were then obtained.

Figure 1: Surface plot for the C-SVM and polynomial kernel function

The RBF kernel function was also examined, and it showed better prediction accuracy than the polynomial kernel function, as shown in Figure 2. $C = 2^{15}$ and $\gamma = 2^{-15}$ gave a higher prediction accuracy of 98.24%. In Figure 2, the flat regions at the top indicate high prediction accuracy. The best prediction accuracy of 97.54% for the C-SVM using the sigmoid kernel function was obtained in two regions (see Figure 3), i.e., when $C = 2^{10}$ and $\gamma = 2^{-6}$ and when $C = 2^{10}$ and $\gamma = 2^{-9}$.

Figure 2: Surface plot for the C-SVM and RBF kernel function

Figure 3: Surface plot for the C-SVM and sigmoid kernel function

A similar analysis was carried out using the Nu-SVM classifier with the polynomial, sigmoid and RBF kernel functions. A prediction accuracy of 92.79% was obtained between the regions where $C = 2^{-1}$ and $\gamma = 2^{-6}$, and $C = 2^{15}$ and $\gamma = 2^{3}$, as shown in Figure 4. The flat topmost region, which lies between the boundaries $C = 2^{15}$, $\gamma = 2^{-9}$ and $C = 2^{3}$, $\gamma = 2^{3}$, showed a consistent prediction accuracy of more than 90%. Figure 5 shows the surface plot for the Nu-SVM and RBF kernel function, which is flat with a small raised region where prediction accuracy above 90% was obtained for the C and γ parameters. A prediction accuracy of 95.08% was obtained where $C = 2^{1}$ and $\gamma = 2^{3}$, and where $C = 2^{5}$ and $\gamma = 2^{1}$. A higher prediction accuracy of 93.67% using the Nu-SVM and sigmoid kernel function was obtained within the region where $C = 2^{15}$ and $\gamma = 2^{-15}$, as shown in Figure 6. Low prediction accuracy of less than 70% was obtained for values of $C = 2^{-5}$ to $2^{15}$ with $\gamma = 1$ and $\gamma = 8$.
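The parameter sweep described above, in which C and γ are varied over powers of two and the pair with the best cross-validated accuracy is kept, can be sketched as follows. The SVM training itself is not shown: `evaluate` is a hypothetical callable standing in for "train with (C, γ) and return the 10-fold cross-validation accuracy", and the exponent bounds are left as arguments because they are not fixed by this sketch. The RBF kernel is written in its usual form, exp(-γ·||x - z||²).

```python
import math
from itertools import product


def power_of_two_grid(lo_exp, hi_exp, step=2):
    """Return [2**lo_exp, 2**(lo_exp + step), ...] up to at most 2**hi_exp."""
    return [2.0 ** e for e in range(lo_exp, hi_exp + 1, step)]


def rbf_kernel(x, z, gamma):
    """Radial basis function kernel between two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def grid_search(c_values, gamma_values, evaluate):
    """Try every (C, gamma) pair and return (best_score, best_C, best_gamma).

    `evaluate(C, gamma)` is a placeholder supplied by the caller; it is
    assumed to return a cross-validated accuracy. On ties, the first
    pair encountered is kept.
    """
    best = None
    for c, gamma in product(c_values, gamma_values):
        score = evaluate(c, gamma)
        if best is None or score > best[0]:
            best = (score, c, gamma)
    return best
```

Recording the score for every pair, rather than only the best one, is what would allow accuracy surfaces such as those in Figures 1 to 6 to be plotted.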


Figure 4: Surface plot for the Nu-SVM and polynomial kernel function

Figure 5: Surface plot for the Nu-SVM and RBF kernel function

Figure 6: Surface plot for the Nu-SVM and sigmoid kernel function

B. Bayesian Network
The Bayesian network utilizes the Davies-Bouldin index during data preprocessing to change continuous data to discrete. In addition, the Davies-Bouldin index assists in determining the appropriate clusters to be used in evaluating the network. A smaller Davies-Bouldin index indicates a more appropriate selection of clusters. Figure 7 shows the Davies-Bouldin index which was utilized in this project.

Figure 7: Davies-Bouldin index for different clusters on mean radius data item

The Bayesian network classifier was used for breast cancer classification. Three types of Bayesian network, i.e., Naïve, K2 and Bdeu, were examined to determine the network with the highest prediction accuracy. The topologies for these different networks are shown in Figures 8 and 9. The topology for Naïve Bayes (see Figure 8) shows that no learning takes place between input variables in the network. On the other hand, for K2 and Bdeu (see Figure 9) relationships between the input variables are learned. The experiments for each network were conducted using all the input features (All), the mean and standard error features (Mean+SE), and the mean features only (Mean). The results illustrated in Figure 10 show that the Bdeu network with (All) had the highest prediction accuracy of 91.31%, followed by Naïve (All) at 89.55% and K2 (Mean+SE) at 88.41%.
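The Davies-Bouldin index used during preprocessing can be sketched for a single continuous variable such as mean radius. The paper does not say which clustering algorithm produced the clusters, so the clusters are taken as given here; the scatter of a cluster is assumed to be the mean absolute distance to its centroid, and `discretize` is a hypothetical helper showing how a continuous value could then be mapped to a discrete cluster label.

```python
def davies_bouldin_1d(clusters):
    """Davies-Bouldin index for a list of 1-D clusters (lists of floats).

    For each cluster i, take the worst ratio (s_i + s_j) / d_ij over all
    other clusters j, where s is the within-cluster scatter and d_ij the
    distance between centroids; the index is the mean of those ratios.
    Lower values indicate more compact, better separated clusters.
    Clusters with identical centroids raise ZeroDivisionError.
    """
    if len(clusters) < 2 or any(len(c) == 0 for c in clusters):
        raise ValueError("need at least two non-empty clusters")
    centroids = [sum(c) / len(c) for c in clusters]
    scatters = [
        sum(abs(x - m) for x in c) / len(c)
        for c, m in zip(clusters, centroids)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatters[i] + scatters[j]) / abs(centroids[i] - centroids[j])
            for j in range(k)
            if j != i
        )
    return total / k


def discretize(value, centroids):
    """Map a continuous value to the index of its nearest cluster centroid."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
```

Computing the index for several candidate cluster counts and keeping the count with the smallest value mirrors the selection rule stated in the text.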


Figure 8: Naïve Bayesian network topology

Figure 9: K2 and Bdeu network topology

Figure 10: Bayesian network prediction results
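The difference between the Naïve topology and the K2 or Bdeu topologies can be stated as two factorizations of the joint distribution. The symbols are introduced here for illustration only: $C$ is the class (diagnosis), $X_1, \dots, X_n$ are the discretized input features, and $\mathrm{Pa}(X_i)$ denotes the parents of $X_i$ among the other features in a learned structure.

$$P(C, X_1, \dots, X_n) = P(C) \prod_{i=1}^{n} P(X_i \mid C) \qquad \text{(Naïve: no arcs between features)}$$

$$P(C, X_1, \dots, X_n) = P(C) \prod_{i=1}^{n} P\big(X_i \mid C, \mathrm{Pa}(X_i)\big) \qquad \text{(class node as a parent of every feature, plus learned arcs between features)}$$

In both cases a new observation would be assigned the class that maximizes the posterior:

$$\hat{c} = \arg\max_{c} \; P(c \mid x_1, \dots, x_n) = \arg\max_{c} \; P(c, x_1, \dots, x_n)$$

The second factorization assumes the class node remains a parent of every feature, as in an augmented Naïve Bayes structure; a general learned network need not have this form.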


C. Learning Vector Quantization

Combinations of the number of hidden neurons (5, 10, 15, 20, 25, and 30) and the learning rate (0.01, 0.1, 0.5, and 1) were evaluated. The number of iterations for the network was set at 50. The highest prediction accuracy of 90.47% was obtained with a learning rate of 0.1 and 5 hidden neurons. Figure 11 shows the LVQ surface plot, with low and high regions varying with the increase of learning rate and hidden neurons.

Figure 11: LVQ accuracy prediction surface plot

D. K-Nearest Neighborhood (KNN)

The KNN was evaluated using the Euclidean and Cityblock distance approaches. The K (neighbors) evaluated ranged from 1 to 15. Figure 12 shows the results obtained, with high prediction accuracy being observed using both approaches. The Euclidean distance approach showed a prediction accuracy of 100% with K=5, 10 and 11, as did the Cityblock distance approach with K=13.

Figure 12: KNN prediction accuracy

E. Comparative Analysis
Table 1 shows a comparative analysis for the six different networks, i.e., Support vector machines (SVM), Bayesian network (BNT), K nearest neighborhood (KNN), Learning vector quantization (LVQ), Linear programming (LP) and Backpropagation network (BPN). Based on the results in Table 1, the K nearest neighborhood had the highest prediction accuracy of 100%, followed by the SVM using the RBF kernel function with a prediction accuracy of 98.24%. The K2 Bayesian network had the lowest prediction accuracy of all the networks evaluated. The results show that machine learning techniques can provide accurate prediction and may enable proper classification of a patient's condition and improve their quality of life. Although the Bayesian network classifier performed poorly compared to the SVM, the CPU time it took to produce the output results was low compared to the other networks. Table 2 shows the training and prediction time associated with each of the networks observed in this project. The Bayesian network shows a low prediction time of 0.07 seconds.
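The KNN evaluation in section D, with its two distance measures and a varying number of neighbors K, can be sketched in a few lines. This is an illustrative implementation rather than the MATLAB routine used in the study; the function names and the majority-vote tie handling are assumptions.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cityblock(a, b):
    """Cityblock (Manhattan) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k, distance=euclidean):
    """Label `query` by majority vote among its k nearest training points.

    Ties in the vote go to the label encountered first among the
    k nearest neighbours.
    """
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Passing `distance=cityblock` switches the measure, so the same loop over K from 1 to 15 can be run once per distance.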

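Table 1: Comparative performance of breast cancer

Type   Variant                     Prediction accuracy
SVM    C-SVM, polynomial kernel    97.54%
SVM    C-SVM, RBF kernel           98.24%
SVM    C-SVM, sigmoid kernel       97.72%
SVM    Nu-SVM, polynomial kernel   92.79%
SVM    Nu-SVM, RBF kernel          95.08%
SVM    Nu-SVM, sigmoid kernel      93.85%
BNT    Naïve                       89.55%
BNT    K2                          88.41%
BNT    Bdeu                        91.31%
KNN    Euclidean                   100%
KNN    CityBlock                   100%
LVQ    -                           91.04%
LP     -                           97.50%
BPN    -                           95.33% [17]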

Table 2: Networks CPU time

Type of Network                Training time (s)   Prediction time (s)
SVM                            2.11                0.94
Bayesian Network Classifier    4.27 + 1.51         0.07
Learning Vector Quantization   67.18               1.45
K Nearest Neighbor             N/A                 0.08
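Learning vector quantization has the longest training time in Table 2, which is consistent with its iterative training loop. A minimal sketch of that loop follows, using the basic LVQ1 update rule as an assumption, since the paper does not state which LVQ variant was used. The prototypes play the role of the hidden neurons, and the defaults mirror the reported learning rate of 0.1 and 50 iterations; the constant learning rate and the function names are illustrative.

```python
def nearest_prototype(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((w - xi) ** 2 for w, xi in zip(prototypes[i], x)),
    )


def lvq1_train(samples, labels, prototypes, proto_labels,
               learning_rate=0.1, epochs=50):
    """Train prototypes with the LVQ1 rule and return the updated copies.

    The winning prototype moves toward a sample of its own class and
    away from a sample of a different class.
    """
    protos = [list(p) for p in prototypes]  # leave the caller's lists intact
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            i = nearest_prototype(protos, x)
            sign = 1.0 if proto_labels[i] == y else -1.0
            protos[i] = [
                w + sign * learning_rate * (xi - w)
                for w, xi in zip(protos[i], x)
            ]
    return protos


def lvq_predict(prototypes, proto_labels, x):
    """Classify x with the label of its nearest prototype."""
    return proto_labels[nearest_prototype(prototypes, x)]
```

Every epoch visits every training sample, so training cost grows with both the number of iterations and the number of prototypes, whereas prediction needs only one nearest-prototype lookup.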


VII. DISCUSSION AND CONCLUSION
Breast cancer cells can be detected early and accurately through the use of machine learning techniques. This may result in a decrease in health costs and may reduce the time required for a patient to receive treatment. In this project the SVM and the Bayesian network have been discussed as tools for diagnostic and prognostic assessment of breast cancer. The SVM has been determined to be superior to the Bayesian network since it provides higher prediction accuracy. When the performance of both networks was compared to other neural network approaches, the KNN was found to provide 100% classification accuracy. The prediction accuracy of the networks discussed in this project emphasizes the need to employ machine learning techniques not only for the prediction of breast cancer but also for other medical conditions that are difficult to diagnose.

VIII. REFERENCES
[1] McMorran, J., Crowther, D.C., "Fine needle aspiration cytology (breast)", General Practice Notebook – a UK medical reference on the world wide web, Feb 2009.
[2] Mangasarian, O.L., Street, W.N., Wolberg, W.H., "Breast cancer diagnosis and prognosis via linear programming", Operations Research, Vol. 43, No. 4, 1995, pp. 570-577.
[3] UCI Irvine machine learning repository, "Breast Cancer Wisconsin (Diagnostic) Data Set", http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic), Nov. 1995.
[4] Akay, M., "Support vector machines combined with feature selection for breast cancer diagnosis", Expert Systems with Applications, Vol. 36, 2009, pp. 3240-3247.
[5] Kohavi, R., "A study of cross-validation and bootstrap for accuracy estimation and model selection", Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, Vol. 12, No. 2, 1995, pp. 1137-1143.
[6] Pena-Reyes, C., and Sipper, M., "A fuzzy approach to breast cancer diagnosis", Artificial Intelligence in Medicine, Vol. 17, 1999, pp. 131-135.
[7] Ryu, Y., Chandrasekaran, R., and Jacob, V., "Breast cancer prediction using the isotonic separation technique", European Journal of Operational Research, Vol. 181, 2007, pp. 842-854.
[8] West, D., Mangiameli, P., Rampal, R., West, V., "Ensemble strategies for a medical diagnostic decision support system: A breast cancer diagnosis application", European Journal of Operational Research, Vol. 162, 2005, pp. 532-551.
[9] Pantel, P., "Breast cancer diagnosis and prognosis", University of Manitoba, 1998.
[10] Maglogiannis, I., and Zafiropoulos, E., "An intelligent system for automated breast cancer diagnosis and prognosis using SVM based classifiers", Applied Intelligence, Vol. 30, 2009, pp. 24-36.
[11] Bagui, S., Bagui, S., Pal, K., and Pal, N., "Breast cancer detection using rank nearest neighbor classification rules", Pattern Recognition, Vol. 36, 2003, pp. 25-34.
[12] Mu, T., and Nandi, A., "Breast cancer detection from FNA using SVM with different parameter tuning systems and SOM-RBF classifier", Journal of the Franklin Institute, Vol. 344, 2007, pp. 285-311.
[13] Charniak, E., "Bayesian networks without tears", AI Magazine, Vol. 12, No. 4, 1991, pp. 50-63.
[14] Friedman, N., Geiger, D., and Goldszmidt, M., "Bayesian network classifiers", Machine Learning, Vol. 29, 1997, pp. 131-163.
[15] Gunn, S., "Support vector machines for classification and regression", Technical paper, 1998.
[16] Übeyli, E., "Implementing automated diagnostic systems for breast cancer detection", Expert Systems with Applications, Vol. 33, 2007, pp. 1054-1062.
[17] Kim, Y., Jang, S., Cho, K., and Park, G., "Performance comparison between Backpropagation, Neuro-Fuzzy Network, and SVM", Springer Berlin/Heidelberg, 2006.
[18] Hsu, C., Chang, C., and Lin, C., "A practical guide to support vector classification", Technical report, Department of Computer Science and Information Engineering, National Taiwan University, Taipei, 2003. http://www.csie.ntu.edu.tw/~cjlin/libsvm/
