Chapter 21 Classification of Lymphoma Cell Image Based on Improved SVM
Ting Yan, Quan Liu, Qin Wei, Fen Chen and Ting Deng
Abstract Because lymphoma takes many forms, its classification in clinical
pathology depends on experienced pathologists. Many image processing and
recognition methods have been proposed to improve the accuracy of lymphoma
classification. The support vector machine (SVM) has been widely applied to
medical image classification as an effective method. However, its application is
limited by the requirement that every classifier use the same feature vector. In
this paper, an improved SVM is proposed to overcome this limitation. By analyzing
the features of the different classes, a separate feature vector is obtained for each
class. The improved SVM, based on the “one-against-one” strategy, is then applied
to separate the classes one by one. Results on images of seven types of lymphoma
show that the proposed method achieves higher precision than the conventional
SVM and the PSO-SVM model.
21.1 Introduction
Lymphoma is a cancer that affects the lymph nodes and rapidly weakens the body's
immunity. Automatic classification of lymphoma images can make diagnosis of this
disease more accurate and less labor-intensive. In particular, the support vector
machine (SVM) has become an effective tool for the automatic classification of
medical images. Varol et al. [1] used an ensemble of linear support vector machine
classifiers to classify structural magnetic resonance images of the brain after
ranking individual image features with Welch’s t-test. Schwamborn et al. [2]
applied a five-dimensional genetic algorithm and an SVM algorithm to distinguish
classical Hodgkin lymphoma (HL) from lymphadenitis. Lu et al. [3] proposed an
SVM-based method for liquor classification to address the deficiencies of the
conventional method. SVM has also played a prominent role in the classification
of breast cancer [4, 5] and brain states [6].
A conventional SVM uses the same feature vector for every class, which restricts its
application and degrades classification results in practice. To address this
limitation, an improved SVM algorithm is proposed in this paper. It includes a
feature selection method that constructs an effective, adaptive feature vector for
each class, and a two-class classifier consisting of two parts: feature combination
and a “one-against-one” SVM. The rest of the paper is organized as follows.
Section 21.2 describes the improved SVM algorithm. Section 21.3 applies the
proposed algorithm to lymphoma images and reports experimental results.
Section 21.4 presents other classical classification algorithms for comparison.
Section 21.5 gives conclusions and outlines future work in lymphoma image
processing.
Fig. 21.1 The distribution of different feature vectors. a Distribution of feature 1 and feature 2.
b Distribution of feature 1 and feature 3
and {V1min, V2min} represent the maximum and minimum values of the two classes in
one specified feature (such as F1), respectively. The distance is calculated as follows:
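\[
D = \begin{cases}
V_{1\min} - V_{2\max}, & V_{1\min} \ge V_{2\max} \ \text{or}\ V_{2\max} \ge V_{1\min} \\
V_{2\min} - V_{1\max}, & V_{2\min} \ge V_{1\max} \ \text{or}\ V_{1\max} \ge V_{2\min}
\end{cases}
\tag{21.1}
\]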
[Fig. 21.2 panel labels, top to bottom: (a) V1max, V1min, V2max, V2min; (b) V1max, V2max, V1min, V2min]
Fig. 21.2 The distribution of a specified feature. a The distance between two classes is larger than
zero. b The distance between two classes is less than zero
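The distance between two classes on a single feature can be sketched in a few lines. This is a minimal illustration under one assumed reading of Eq. (21.1) that agrees with Fig. 21.2: the distance is the signed gap between the two value ranges, positive when the ranges are separated and negative when they overlap. The function name `class_distance` and the sample values are hypothetical.

```python
def class_distance(values_class1, values_class2):
    """Signed gap between the value ranges of two classes on one feature."""
    v1_min, v1_max = min(values_class1), max(values_class1)
    v2_min, v2_max = min(values_class2), max(values_class2)
    # Assumed reading of Eq. (21.1): use the gap on whichever side applies.
    # The result is > 0 for separated ranges and < 0 for overlapping ranges.
    return max(v1_min - v2_max, v2_min - v1_max)


# Hypothetical feature values for two classes.
separated = class_distance([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])    # as in Fig. 21.2a
overlapping = class_distance([3.0, 6.0, 7.0], [1.0, 2.0, 4.0])  # as in Fig. 21.2b
print(separated, overlapping)
```

A feature with a large positive distance separates the two classes cleanly, which is why such a measure is a natural score for choosing the effective features of a class.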
The main idea of SVM is to find the optimal separating hyperplane between two
classes, that is, the one that maximizes the margin between them. The training
samples are (xi, yi), i = 1, 2, … , l, with xi ∈ R^n and yi ∈ {+1, −1}, where l is the
number of samples and n is the input dimension, i.e., the number of feature values
extracted from an image. The optimal hyperplane is found by solving:
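\[
\begin{aligned}
\min \quad & \frac{\|w\|^{2}}{2} + C \sum_{i=1}^{l} e_i \\
\text{s.t.} \quad & y_i (w \cdot x_i + b) \ge 1 - e_i, \\
& e_i \ge 0, \quad i = 1, 2, \ldots, l
\end{aligned}
\tag{21.2}
\]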
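To make the objective in Eq. (21.2) concrete, the sketch below evaluates it for a fixed hyperplane. It relies on the fact that, for fixed w and b, the smallest slack satisfying both constraints is e_i = max(0, 1 − y_i (w · x_i + b)). The function name `soft_margin_objective` and the toy samples are hypothetical; the value C = 200 is the penalty factor reported in the experiments.

```python
def soft_margin_objective(w, b, samples, labels, C):
    """Return ||w||^2 / 2 + C * sum(e_i) and the slack values e_i."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    # Smallest feasible slack per sample for this (w, b):
    # zero when the margin constraint already holds, positive otherwise.
    slacks = [max(0.0, 1.0 - y * (dot(w, x) + b)) for x, y in zip(samples, labels)]
    return 0.5 * dot(w, w) + C * sum(slacks), slacks


# Toy 2-D samples: the second one lies inside the margin.
samples = [(2.0, 0.0), (0.5, 0.0), (-2.0, 0.0)]
labels = [+1, +1, -1]
objective, slacks = soft_margin_objective(
    w=(1.0, 0.0), b=0.0, samples=samples, labels=labels, C=200.0
)
print(objective, slacks)
```

A large C makes margin violations expensive, so the optimizer prefers hyperplanes that fit the training samples closely; a small C tolerates more violations in exchange for a wider margin.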
A standard SVM separates only two classes, so how best to construct a multiclass
support vector machine (MSVM) remains an open problem. Two methods are widely
used to extend the binary SVM to multiclass problems: the one-against-one (OAO)
method and the one-against-all (OAA) method. Studies indicate that the OAO method
performs better than the OAA method [7]. Hence, the OAO strategy is adopted here
for multiclass classification.
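The OAO strategy trains one binary classifier for every pair of classes and lets them vote. The sketch below shows only the voting step. The pairwise decision function is a hypothetical stand-in (nearest prototype on a scalar feature) for trained binary SVMs, and all names are illustrative.

```python
from collections import Counter
from itertools import combinations


def one_against_one_predict(x, classes, pairwise_decide):
    """Majority vote over all k(k-1)/2 pairwise decisions."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, x)] += 1
    # Ties are broken by first-seen order here; a real system needs an explicit rule.
    return votes.most_common(1)[0][0], dict(votes)


# Hypothetical stand-in for trained binary SVMs: choose the class whose
# prototype value is closer to the scalar sample x.
prototypes = {"A": 0.0, "B": 5.0, "C": 10.0}


def nearest_prototype(a, b, x):
    return a if abs(x - prototypes[a]) <= abs(x - prototypes[b]) else b


label, votes = one_against_one_predict(4.0, ["A", "B", "C"], nearest_prototype)
print(label, votes)
```

With seven lymphoma classes, a full OAO scheme would need 7 × 6 / 2 = 21 pairwise classifiers.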
Using the same feature vector for all classes restricts the practical application of
the conventional OAO-MSVM. An improved classification algorithm is designed in
this paper to remove this drawback. The improved SVM consists of a sequence of
two-class classifiers, as shown in Fig. 21.3. As mentioned above, the effective
features and the division order of the classes are obtained with the feature
selection method. A two-class classifier is constructed for each class and consists
of feature combination and an OAO-MSVM. Feature combination rebuilds the feature
vector from the effective features of the corresponding class. The new feature
vector is the input of the OAO-MSVM, which uses a voting model to divide all
samples into two groups: the corresponding class and the other classes. The samples
assigned to the other classes become the input of the next classifier.
[Fig. 21.3 diagram labels: Feature combination; SVM3 (Class2 vs Class3); Majority voting; classifier2; outputs Yes/No]
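The cascade described above can be sketched as follows. This is an illustration of the control flow only, not the authors' implementation: each stage keeps the features chosen for its class (feature combination), runs one-against-one voting among the classes still in play, and either accepts the stage's class or passes the sample on. A nearest-centroid rule is swapped in for the pairwise SVMs so that the example stays self-contained; all names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def centroid(rows):
    return [sum(col) / len(col) for col in zip(*rows)]


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def select(x, idx):
    """Feature combination: keep only the features chosen for this stage."""
    return [x[i] for i in idx]


def train_cascade(train, stages):
    """train: {class: [feature vectors]}; stages: [(target class, feature indices)]."""
    remaining = list(train)
    model = []
    for target, idx in stages:
        cents = {c: centroid([select(x, idx) for x in train[c]]) for c in remaining}
        model.append((target, idx, cents))
        remaining.remove(target)  # later stages only see the other classes
    return model, remaining[0]    # the last class left needs no classifier


def predict_cascade(model, fallback, x):
    for target, idx, cents in model:
        z = select(x, idx)
        votes = Counter()
        for a, b in combinations(cents, 2):  # one-against-one voting
            votes[a if sq_dist(z, cents[a]) <= sq_dist(z, cents[b]) else b] += 1
        if votes.most_common(1)[0][0] == target:
            return target                    # "Yes": sample belongs to this class
    return fallback                          # "No" at every stage


# Toy data: class X stands out on feature 0, Y and Z differ on feature 2.
train = {
    "X": [[9.0, 1.0, 5.0], [10.0, 2.0, 5.5]],
    "Y": [[1.0, 1.5, 0.0], [1.5, 1.0, 0.5]],
    "Z": [[1.2, 1.2, 8.0], [0.8, 1.8, 9.0]],
}
stages = [("X", [0]), ("Y", [2])]
model, fallback = train_cascade(train, stages)
tests = ([9.5, 0.0, 5.0], [1.0, 9.0, 0.2], [1.0, 0.0, 8.5])
predictions = [predict_cascade(model, fallback, x) for x in tests]
print(predictions)
```

Because each stage uses its own feature subset, a feature that is uninformative for one class (feature 1 in the toy data) never affects the decision for that class.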
21.3 Experiment
Fig. 21.4 Selection diagrams of various types of lymphoma cells. a BL, b DLBCL, c FL, d HL,
e MCL, f MCHL, g RLH
are extracted: mean, standard deviation, smoothness, third moment, consistency, and
entropy from the gray-level histogram, and maximum probability, contrast,
correlation, energy, homogeneity, and relative entropy from the gray-level
co-occurrence matrix [9]. They are labeled F1–F12, respectively.
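The six histogram-based statistics can be computed from the normalized gray-level histogram alone. The sketch below uses the standard first-order texture definitions. It assumes that the "consistency" feature corresponds to the usual uniformity measure (sum of squared histogram probabilities) and that variance and third moment are normalized by (L − 1)², where L is the number of gray levels; neither assumption is confirmed by the text. The function name and the four-pixel input are hypothetical.

```python
import math


def histogram_features(pixels, levels=256):
    """Six first-order texture statistics of a flat list of integer gray levels."""
    n = len(pixels)
    counts = [0] * levels
    for z in pixels:
        counts[z] += 1
    p = [c / n for c in counts]  # normalized histogram p(z)

    mean = sum(z * pz for z, pz in enumerate(p))
    variance = sum((z - mean) ** 2 * pz for z, pz in enumerate(p))
    scale = (levels - 1) ** 2    # assumed normalization to keep values small
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "smoothness": 1 - 1 / (1 + variance / scale),
        "third_moment": sum((z - mean) ** 3 * pz for z, pz in enumerate(p)) / scale,
        "uniformity": sum(pz ** 2 for pz in p),
        "entropy": -sum(pz * math.log2(pz) for pz in p if pz > 0),
    }


# Hypothetical 4-pixel image with two equally likely gray levels.
features = histogram_features([0, 0, 255, 255])
print(features)
```

For this two-level input the histogram is symmetric, so the third moment vanishes and the entropy is exactly one bit.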
The scatterplot of the twelve features is presented in Fig. 21.5. As can be seen,
the patterns belonging to the same class lie close to each other and are relatively
well separated from those of the other classes within the feature space.
The effective features and the division order of the classes are obtained for these
lymphoma images with the proposed feature selection approach, and the results are
presented in Table 21.1. F5, F6, F7, and F10 are not used, because they are not
among the effective features of any class.
Six classifiers are constructed for the seven types of lymphoma cells. As
Table 21.1 shows, the input feature vector of the first classifier consists of the
four features F8, F9, F11, and F12 and is used to distinguish DLBCL from the other
lymphoma cells. The input feature vector of the second classifier consists of the
two features F2 and F3 and is used to distinguish FL from the remaining lymphoma
cells, and so on.
In this paper, the improved SVM uses the Gaussian kernel. With penalty factor
C = 200 and kernel parameter σ = 3, the classification accuracies on the training
and testing groups are approximately 99.48 % and 97.08 %, respectively.
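For reference, a Gaussian kernel with width parameter σ can be written as below. The parameterization K(x, y) = exp(−‖x − y‖² / (2σ²)) is an assumption, since the text does not state which form is used; some libraries use exp(−γ‖x − y‖²) instead, with γ = 1 / (2σ²). The function name is hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma=3.0):
    """K(x, y) = exp(-||x - y||^2 / (2 * sigma^2)); parameterization assumed."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


same = gaussian_kernel([1.0, 2.0], [1.0, 2.0])   # identical points
apart = gaussian_kernel([0.0, 0.0], [3.0, 0.0])  # squared distance 9
print(same, apart)
```

A larger σ makes the kernel decay more slowly with distance and gives a smoother decision boundary, while a smaller σ makes each training sample influence only its immediate neighborhood.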
[Fig. 21.5 panels: one scatterplot per feature, each titled “k-th feature for class 1-7” (titles for the 10th, 11th, and 12th features survive); horizontal axis ticks run from 0 to 8]
Fig. 21.5 The scatterplot of 12 features for different classes (1 = BL, 2 = DLBCL, 3 = FL,
4 = HL, 5 = MCL, 6 = MCHL, 7 = RLH)
Fig. 21.6 Recognition accuracy of training set and testing set. a Penalty factor tuned in the range
[0.1, 600]. b Kernel parameter tuned in the range [0.1, 8]
shows the best performance of the three classifiers. The results indicate that the
proposed algorithm is more effective than OAO-MSVM and PSO-MSVM for lymphoma
classification.
21.5 Conclusions
Acknowledgments The research work was supported by the National Natural Science Foundation of
China under Grant Nos. 50935005 and 51175389, and the Key Grant Project of the Chinese Ministry
of Education No. 313042.
References
1. Varol E, Gaonkar B, Erus G et al (2012) Feature ranking based nested support vector machine
ensemble for medical image classification. In: 9th IEEE international symposium on
biomedical imaging (ISBI), 2012. IEEE, pp 146–149
2. Schwamborn K, Krieg RC, Jirak P et al (2010) Application of MALDI imaging for the
diagnosis of classical Hodgkin lymphoma. J Cancer Res Clin Oncol 136(11):1651–1655
3. Lu J, Du L, Ding H et al (2014) Application of support vector machine in base liquor
classification. In: Proceedings of the 2012 international conference on applied biotechnology
(ICAB 2012). Springer, Berlin, pp 1051–1056
4. George G, Raj VC (2011) Review on feature selection techniques and the impact of SVM for
cancer classification using gene expression profile. arXiv:1109.1062,
doi:10.5121/ijcses.2011.2302
5. Huang CL, Liao HC, Chen MC (2008) Prediction model building and feature selection with
support vector machines in breast cancer diagnosis. Expert Syst Appl 34(1):578–587
6. Sitaram R, Lee S, Ruiz S et al (2011) Real-time support vector classification and feedback of
multiple emotional brain states. Neuroimage 56(2):753–765
7. Hsu CW, Lin CJ (2002) A comparison of methods for multiclass support vector machines.
IEEE Trans Neural Netw 13(2):415–425
8. Vardiman JW (2010) The World Health Organization (WHO) classification of tumors of the
hematopoietic and lymphoid tissues: an overview with emphasis on the myeloid neoplasms.
Chem Biol Interact 184(1):16–20
9. Gonzalez RC, Woods RE, Eddins SL (2009) Digital image processing using MATLAB.
Gatesmark Publishing, Knoxville
10. Meng Q, Ma X, Zhou Y (2012) Application of the PSO-SVM model for coal mine safety
assessment. In: Eighth international conference on natural computation (ICNC), 2012. IEEE,
pp 393–397