
Chapter 21

Classification of Lymphoma Cell Image Based on Improved SVM

Ting Yan, Quan Liu, Qin Wei, Fen Chen and Ting Deng

Abstract Due to the diversity of lymphoma, its classification in clinical pathologic analysis must rely on experienced pathologists. To improve the accuracy of lymphoma classification, many image processing technologies and recognition methods have been presented. The support vector machine (SVM), an effective classification method, has been widely applied in medical image classification. However, its application is hindered by the limitation that every classifier must adopt the same feature vector. In this paper, an improved SVM is proposed to overcome this limitation. Through an analysis of the features of the different classes, a distinct feature vector is obtained for each class. The improved SVM, based on the "one-against-one" strategy, is then applied to classify the classes one by one. According to the results of classifying seven different kinds of lymphoma images, our classification method achieves higher precision in lymphoma classification than the conventional SVM and PSO-SVM models.

Keywords: Support vector machine · Lymphoma cell · Feature selection

21.1 Introduction

Lymphoma is a cancer that affects the lymph nodes and rapidly undermines the body's immunity. Automatic classification of lymphoma images can make the diagnosis of this disease more accurate and less labor intensive. In particular, the

T. Yan (✉) · Q. Liu · Q. Wei · F. Chen · T. Deng
Key Laboratory of Fiber Optic Sensing Technology and Information Processing,
Ministry of Education, Wuhan University of Technology, 430070 Wuhan, China
e-mail: [email protected]

Q. Wei
School of Mechanical and Electronic Engineering, Wuhan University of Technology,
430070 Wuhan, China

© Springer-Verlag Berlin Heidelberg 2015


T.-C. Zhang and M. Nakajima (eds.), Advances in Applied Biotechnology,
Lecture Notes in Electrical Engineering 332, DOI 10.1007/978-3-662-45657-6_21

support vector machine (SVM) has become an effective tool in the automatic classification of medical images. Varol et al. [1] employed an ensemble of linear support vector machine classifiers for the classification of structural magnetic resonance images of the brain, after ranking individual image features by Welch's t-test. Schwamborn et al. [2] applied a five-dimensional genetic algorithm and an SVM algorithm to distinguish classical Hodgkin lymphoma (HL) from lymphadenitis. Lu et al. [3] proposed a novel SVM-based method for liquor classification after considering the deficiencies of conventional liquor classification methods. Moreover, SVM has played a predominant role in the classification of breast cancer [4, 5] and brain states [6].
In practice, SVM often adopts the same feature vector for every class, which restricts its application and affects the final classification results. To address this limitation, an improved SVM algorithm is proposed in this paper. It includes a feature selection method that constructs an effective, adaptive feature vector for each class, and a two-class classifier consisting of two parts: feature combination and a "one-against-one" SVM. The rest of the paper is organized as follows. Section 21.2 describes the improved SVM algorithm. Section 21.3 presents the application of the proposed algorithm to lymphoma images and the experimental results. Section 21.4 compares the proposed algorithm with other classical classification algorithms. Section 21.5 gives conclusions and points out future work in lymphoma image processing.

21.2 Improved Support Vector Machine

21.2.1 Feature Selection

In classification, better discrimination ability of a classifier accompanies greater differences among the feature vectors of the various classes. Many efficient and robust feature selection methods exist that extract meaningful features and eliminate noisy ones. However, these methods extract the same features for all classes and neglect the correlation between features and individual classes: after some classes have been eliminated, the remaining classes may be classified better with different features. For example, consider three classes with three features. In Fig. 21.1, class 3 and class 1 can be separated from the mixed classes according to the differing distributions of features 1 and 2 (Fig. 21.1a) and features 1 and 3 (Fig. 21.1b), respectively.
Fig. 21.1 The distribution of different feature vectors. a Distribution of feature 1 and feature 2. b Distribution of feature 1 and feature 3

A feature selection approach is proposed to extract different features for different classes. The feature vector for each class is searched from the original features using the Euclidean distance as the identification criterion. First, given the known classes and features, one class is selected at random as the analysis object (denoted C1); the remaining classes are merged into a single class (denoted C2). To find effective features for the analysis object, the distance between the two classes (C1 and C2) is calculated for each feature. Let {V1max, V2max} and {V1min, V2min} represent the maximum and minimum values of the two classes for one specified feature (such as F1), respectively. The distance is calculated as follows:

$$
D =
\begin{cases}
V_{1\min} - V_{2\max}, & \text{if } V_{1\min} \ge V_{2\max} \\
V_{2\min} - V_{1\max}, & \text{if } V_{2\min} \ge V_{1\max}
\end{cases}
\tag{21.1}
$$

The specified feature is regarded as an effective feature of the analysis object when the distance is larger than zero (as shown in Fig. 21.2a), which means the feature has good discriminative power. The number of effective features of each class is counted by calculating the distance for every feature in turn. The selected class is determined by the number (N) of effective features and the sum (S) of the calculated distances over those features: the class with the unique maximum N is selected; otherwise, the class with the maximum S among the tied classes is selected. Furthermore, if the maximum N is zero, no class has an effective feature; in that case, the class with the maximum distance for a single feature is selected, and that feature becomes its effective feature. After eliminating the selected class from the original classes, the remaining classes repeat this process until the effective features and division order of all classes are confirmed.

Fig. 21.2 The distribution of a specified feature. a The distance between two classes is larger than
zero. b The distance between two classes is less than zero
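The selection procedure above can be sketched as follows (a minimal NumPy rendering; the array layout and the tie-breaking details are assumptions of this sketch, not the authors' implementation):

```python
import numpy as np

def interval_distance(a, b):
    # Eq. (21.1): gap between the value ranges of two sample sets for one
    # feature; positive exactly when the two ranges do not overlap.
    return max(a.min() - b.max(), b.min() - a.max())

def select_features(X, y):
    """Greedy selection: repeatedly pick the class that is easiest to
    separate from the rest, record its effective features, remove it."""
    remaining = list(np.unique(y))
    order = []                          # [(class, [feature indices]), ...]
    while len(remaining) > 1:
        best = None
        for c in remaining:
            own = X[y == c]
            rest = X[np.isin(y, [r for r in remaining if r != c])]
            d = np.array([interval_distance(own[:, f], rest[:, f])
                          for f in range(X.shape[1])])
            eff = np.flatnonzero(d > 0)
            # rank by (number N of effective features, sum S of distances);
            # with no effective feature, fall back to the single best one
            key = (len(eff), d[eff].sum()) if len(eff) else (0, d.max())
            feats = list(eff) if len(eff) else [int(d.argmax())]
            if best is None or key > best[0]:
                best = (key, c, feats)
        _, c, feats = best
        order.append((c, feats))
        remaining.remove(c)
    return order
```

The loop terminates with one class left over, whose identity is implied once every other class has been separated, matching the division order in the text.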

21.2.2 Improved SVM Classifier

21.2.2.1 Support Vector Machine

The main idea of SVM is to find the optimal classification hyperplane between two classes, the one that maximizes the margin between them. The training samples are denoted (xi, yi), i = 1, 2, …, l, x ∈ Rn, y ∈ {+1, −1}, where l is the number of samples and n is the input dimension, i.e., the number of feature values of an image. The optimal classification hyperplane is found by solving:

$$
\min \; \frac{\|w\|^{2}}{2} + C \sum_{i=1}^{l} \varepsilon_i
\qquad \text{s.t.} \quad y_i \,(w \cdot x_i + b) \ge 1 - \varepsilon_i,\;
\varepsilon_i \ge 0,\; i = 1, 2, \ldots, l
\tag{21.2}
$$

where εi ≥ 0, i = 1, 2, …, l are the non-negative slack variables and C is the penalty parameter; a larger C imposes a heavier penalty on misclassification. By introducing Lagrange multipliers for the optimization problem, the optimal decision function is obtained as follows:
" #
Xl
f ð xÞ ¼ sgn yi ai ðx  xi Þ þ b ð21:3Þ
i¼1

in which α is the Lagrange coefficient.
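As a concrete illustration of Eqs. (21.2) and (21.3) — using scikit-learn's `SVC`, which is an assumption of this sketch and not the authors' implementation — the penalty parameter C maps directly onto the `C` argument:

```python
import numpy as np
from sklearn.svm import SVC

# toy two-class data in R^2, one well-separated cluster per label
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, (20, 2)),
               rng.normal(2.0, 0.5, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)

# a larger C penalizes the slack variables of Eq. (21.2) more heavily
clf = SVC(kernel="linear", C=200.0).fit(X, y)

# the sign of the decision function realizes Eq. (21.3)
pred = np.sign(clf.decision_function(X))
print("training accuracy:", (pred == y).mean())
```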

21.2.3 Design of SVM Classifier

Standard SVM separates only two classes, so how to construct a multiclass support vector machine (MSVM) remains an open problem. Two methods are widely used to extend binary SVM to multiclass problems: the one-against-one (OAO) method and the one-against-all (OAA) method. Studies indicate that the OAO method outperforms the OAA method [7]; hence, the OAO strategy is adopted here for multiclass classification.

Adopting the same feature vector for all classes restricts the conventional OAO-MSVM in practice. An improved classification algorithm is designed in this paper to remove this drawback. The improved SVM consists of a cascade of two-class classifiers, as shown in Fig. 21.3. As described above, the effective features and the division order of each class are obtained by the feature selection method. A two-class classifier is constructed for each class, consisting of feature combination followed by an OAO-MSVM. Feature combination reconstructs the feature vector from the effective features of the corresponding class. The new feature vector is the input of the OAO-MSVM, which uses majority voting to separate all samples into two groups: the corresponding class and the remaining classes. The remaining classes then become the input of the next classifier.

Fig. 21.3 The flow chart of the proposed classification algorithm. In the three-class example, classifier 1 performs feature combination and runs SVM1 (Class 1 vs Class 2), SVM2 (Class 1 vs Class 3), and SVM3 (Class 2 vs Class 3), separating Class 1 by majority voting; classifier 2 then performs feature combination and separates Class 2 from Class 3.
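The cascade can be sketched as follows. For brevity this sketch collapses each stage's OAO voting into a single class-vs-rest SVM (a simplification of the scheme described above); the class names, feature subsets, and parameter values are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import SVC

class CascadeOAO:
    """One two-class stage per class: stage k separates its class from all
    still-undecided classes using only that class's effective features,
    then passes the remaining samples to the next stage."""

    def __init__(self, order):
        self.order = order          # [(class_label, [feature indices]), ...]

    def fit(self, X, y):
        self.stages = []
        mask = np.ones(len(y), dtype=bool)
        for c, feats in self.order:
            yb = np.where(y[mask] == c, 1, -1)   # this class vs the rest
            clf = SVC(kernel="rbf", C=200.0).fit(X[mask][:, feats], yb)
            self.stages.append((c, feats, clf))
            mask &= (y != c)                     # eliminated class drops out
        self.fallback = y[mask][0]               # the last remaining class
        return self

    def predict(self, X):
        out = np.full(len(X), self.fallback)
        undecided = np.ones(len(X), dtype=bool)
        for c, feats, clf in self.stages:
            if not undecided.any():
                break
            hit = undecided.copy()
            hit[undecided] = clf.predict(X[undecided][:, feats]) == 1
            out[hit] = c
            undecided &= ~hit
        return out
```

Because each stage receives its own feature subset, the cascade realizes the variable feature vectors that the conventional OAO-MSVM cannot accommodate.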

21.3 Experiment

21.3.1 Experimental Object

Malignancies originating from lymphocytes are known as malignant lymphomas, which are divided into Hodgkin's disease (HD) and non-Hodgkin's lymphoma (NHL). Hodgkin lymphoma (HL) and mixed cellularity Hodgkin lymphoma (MCHL) belong to HD, whereas diffuse large B-cell lymphoma (DLBCL), follicular lymphoma (FL), Burkitt lymphoma (BL), mantle cell lymphoma (MCL), and reactive lymphoid hyperplasia (RLH) belong to NHL. Each kind of lymphoma has its own distinctive features; for example, in terms of distribution, those with a diffuse distribution of large cells are DLBCL, and those with a diffuse distribution of medium-sized cells are BL [8]. The seven kinds of lymphoma images are displayed in Fig. 21.4.
Different kinds of cells differ in shape, arrangement, and distribution. To classify the different kinds of lymphoma cells, texture analysis is used to extract features from the cell images. Twelve common texture statistics are extracted: mean, standard deviation, smoothness, third moment, uniformity, and entropy based on the gray-level histogram, and maximum probability, contrast, correlation, energy, homogeneity, and relative entropy based on the gray-level co-occurrence matrix [9], labeled F1–F12 respectively.

Fig. 21.4 Sample images of the various types of lymphoma cells. a BL, b DLBCL, c FL, d HL, e MCL, f MCHL, g RLH
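A subset of these statistics can be computed as follows. This is a pure-NumPy sketch; the quantization level and the single horizontal pixel offset are illustrative choices, not parameters taken from the chapter:

```python
import numpy as np

def texture_features(img, levels=8):
    """img: 2-D uint8 grayscale image. Returns a few of the 12 statistics:
    histogram-based (mean, std, entropy) and GLCM-based (max probability,
    contrast, energy, homogeneity)."""
    img = np.asarray(img)
    # histogram-based statistics
    hist = np.bincount(img.ravel(), minlength=256) / img.size
    lv = np.arange(256)
    mean = (lv * hist).sum()
    std = np.sqrt(((lv - mean) ** 2 * hist).sum())
    nz = hist[hist > 0]
    entropy = -(nz * np.log2(nz)).sum()
    # gray-level co-occurrence matrix for horizontally adjacent pixels
    q = img.astype(int) * levels // 256
    P = np.zeros((levels, levels))
    np.add.at(P, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1)
    P = P + P.T                     # make the co-occurrences symmetric
    P /= P.sum()
    i, j = np.indices(P.shape)
    return {
        "mean": mean, "std": std, "entropy": entropy,
        "max_probability": P.max(),
        "contrast": ((i - j) ** 2 * P).sum(),
        "energy": (P ** 2).sum(),
        "homogeneity": (P / (1.0 + np.abs(i - j))).sum(),
    }
```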

21.3.2 Experimental Results

The scatterplots of the twelve features are presented in Fig. 21.5. As can be seen, patterns belonging to the same class lie close to each other and are relatively well separated from the other classes in the feature space.
Effective features and the division order of the classes are obtained by applying the proposed feature selection approach to the lymphoma images; the results are presented in Table 21.1. Evidently F5, F6, F7, and F10 are useless features, because they do not belong to the effective features of any class.
Six classifiers are constructed for the seven kinds of lymphoma cells. From Table 21.1, the input feature vector of the first classifier consists of the four features F8, F9, F11, and F12, with the aim of distinguishing DLBCL from the other lymphoma cells. The input feature vector of the second classifier consists of the two features F2 and F3, with the aim of distinguishing FL from the remaining lymphoma cells, and so on.
In this paper, the improved SVM adopts the popular Gaussian kernel. When the penalty factor C is 200 and the kernel parameter σ is 3, the classification accuracies on the training and testing groups are approximately 99.48 % and 97.08 %, respectively.


Fig. 21.5 The scatterplot of 12 features for different classes (1 = BL, 2 = DLBCL, 3 = FL,
4 = HD, 5 = MCL, 6 = MCHL, 7 = RLH)

Table 21.1 The result of feature selection

All lymphomas   Feature combination   Division order
DLBCL           F8, F9, F11, F12      1
FL              F2, F3                2
RLH             F1, F4                3
MCL             F9                    4
MCHL            F2, F3, F4, F9        5
BL              F9                    6
HL              F9                    7

21.4 Performance Analysis and Comparison

The performances of the different SVM classifiers are evaluated using threefold cross-validation. In total, there are 84 samples (12 per class), divided into three equal groups. Each classifier is trained on two-thirds of the samples and tested on the remaining third; this procedure is repeated three times so that each third serves once as the test group. To obtain a more reliable estimate, the threefold cross-validation is repeated five times with different random partitions of the samples. The average train/test accuracy is used as the estimator for the corresponding algorithm.
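The evaluation protocol can be sketched as follows, using scikit-learn utilities (an assumption of this sketch); the σ-to-γ conversion assumes the Gaussian kernel exp(−‖x − x′‖² / (2σ²)) with the σ = 3, C = 200 values quoted in the text:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

def repeated_cv(X, y, repeats=5, folds=3, seed=0):
    """Five repetitions of stratified threefold cross-validation,
    averaging the train and test accuracies over all folds and runs."""
    sigma = 3.0                                # kernel parameter from the text
    train_acc, test_acc = [], []
    for r in range(repeats):
        skf = StratifiedKFold(n_splits=folds, shuffle=True,
                              random_state=seed + r)
        for tr, te in skf.split(X, y):
            clf = SVC(kernel="rbf", C=200.0,
                      gamma=1.0 / (2.0 * sigma ** 2))
            clf.fit(X[tr], y[tr])
            train_acc.append(clf.score(X[tr], y[tr]))
            test_acc.append(clf.score(X[te], y[te]))
    return float(np.mean(train_acc)), float(np.mean(test_acc))
```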
The main difficulties in establishing an SVM model are the choice of the kernel function and its parameters. All MSVM classifiers described in this paper use the Gaussian kernel because of its popularity in practice, so the variation of its parameters is considered when comparing the different algorithms. The OAO-MSVM classifier directly utilizes all features.
Figure 21.6a compares the performance of the OAO method and the proposed (Pro) method with the SVM classifier under different penalty factors. The OAO-Train and Pro-Train curves denote the recognition accuracy on the training set for the two classifiers: when C reaches 0.5, both classifiers attain approximately 100 % accuracy, but while C ranges from 0.1 to 0.5, the results achieved by our classifier are better than those of the OAO method. The OAO-Test and Pro-Test curves show the recognition accuracy on the testing set: the proposed method reaches about 92 %, which is clearly higher than the OAO method (about 75 %). These experimental results demonstrate the effectiveness of the proposed algorithm in comparison with the OAO classifier. According to Fig. 21.6b, the same conclusion is obtained when the kernel parameter is varied instead.
Fig. 21.6 Recognition accuracy of the training and testing sets. a Penalty factor tuned in the range [0.1, 600]. b Kernel parameter tuned in the range [0.1, 8]

In [10], the authors applied a particle swarm optimization algorithm to choose a suitable parameter combination for an SVM model, called the PSO-MSVM model. This paper also applies the PSO-MSVM model to the classification of lymphoma cells and compares it with the proposed algorithm. The input features and cross-validation process of PSO-MSVM are the same as for OAO-MSVM. Table 21.2 shows the best performance of the three classifiers. The results imply that the proposed algorithm is more effective than OAO-MSVM and PSO-MSVM for lymphoma classification.

Table 21.2 Performance comparison of OAO-MSVM, PSO-MSVM, and the proposed MSVM

Classifier   Train accuracy (%)   Test accuracy (%)
OAO-MSVM     99                   95
PSO-MSVM     98                   90
Pro-MSVM     99                   97

21.5 Conclusions

The application of conventional SVM is restricted by the requirement that the same feature vector be adopted for every class. To overcome this limitation, an improved classification algorithm is presented to classify different kinds of lymphoma cell images. The improvement centers on a novel feature selection method and a multiclass SVM adopting the OAO strategy. A comparison among our method, PSO-MSVM, and the common OAO-MSVM indicates that the improved method correctly selects the input features and achieves high classification accuracy. Furthermore, it is also efficient in classification, owing to the variable feature vectors used by each classifier. The experiments on the classification of seven kinds of lymphoma images show the effectiveness and efficiency of our method.
Like most multiclass classification algorithms, the proposed method suffers from error accumulation, especially from identification errors at the root node. How to solve this problem needs further study.

Acknowledgments The research work was supported by National Natural Science Foundation of
China under Grant Nos. 50935005 and 51175389, and the Key Grant Project of Chinese Ministry
of Education No. 313042.

References

1. Varol E, Gaonkar B, Erus G et al (2012) Feature ranking based nested support vector machine
ensemble for medical image classification. In: 9th IEEE international symposium on
biomedical imaging (ISBI), 2012. IEEE, pp 146–149
2. Schwamborn K, Krieg RC, Jirak P et al (2010) Application of MALDI imaging for the
diagnosis of classical Hodgkin lymphoma. J Cancer Res Clin Oncol 136(11):1651–1655
3. Lu J, Du L, Ding H, et al (2014) Application of support vector machine in base liquor
classification. In: Proceedings of the 2012 international conference on applied biotechnology
(ICAB 2012). Springer, Berlin, pp 1051–1056

4. George G, Raj VC (2011) Review on feature selection techniques and the impact of SVM for cancer classification using gene expression profile. arXiv:1109.1062, doi:10.5121/ijcses.2011.2302
5. Huang CL, Liao HC, Chen MC (2008) Prediction model building and feature selection with
support vector machines in breast cancer diagnosis. Expert Syst Appl 34(1):578–587
6. Sitaram R, Lee S, Ruiz S et al (2011) Real-time support vector classification and feedback of
multiple emotional brain states. Neuroimage 56(2):753–765
7. Hsu CW, Lin CJ (2002) A comparison of methods for multiclass support vector machines.
IEEE Trans Neural Netw 13(2):415–425
8. Vardiman JW (2010) The World Health Organization (WHO) classification of tumors of the
hematopoietic and lymphoid tissues: an overview with emphasis on the myeloid neoplasms.
Chem Biol Interact 184(1):16–20
9. Gonzalez RC, Woods RE, Eddins SL (2009) Digital image processing using MATLAB.
Gatesmark Publishing, Knoxville
10. Meng Q, Ma X, Zhou Y (2012) Application of the PSO-SVM model for coal mine safety
assessment. In: Eighth international conference on natural computation (ICNC), 2012. IEEE
pp 393–397
