
Med Biol Eng Comput

DOI 10.1007/s11517-016-1590-x

ORIGINAL ARTICLE

Automatic detection and classification of leukocytes using convolutional neural networks
Jianwei Zhao1 · Minshu Zhang1 · Zhenghua Zhou1 · Jianjun Chu2 · Feilong Cao1 

Received: 15 March 2016 / Accepted: 26 October 2016


© International Federation for Medical and Biological Engineering 2016

Abstract The detection and classification of white blood cells (WBCs, also known as leukocytes) is an active research topic because of its important applications in disease diagnosis. At present, the morphological analysis of blood cells is performed manually by skilled operators, which has drawbacks such as slow analysis, non-standard accuracy, and dependence on the operator's skills. Although many papers have studied the detection of WBCs or the classification of WBCs independently, few consider them together. This paper proposes an automatic detection and classification system for WBCs from peripheral blood images. It first proposes an algorithm to detect WBCs from microscope images based on a simple relation between the colors R and B and on morphological operations. Then a granularity feature (pairwise rotation invariant co-occurrence local binary pattern, PRICoLBP) and an SVM are applied to separate eosinophils and basophils from the other WBCs. Lastly, convolutional neural networks are used to extract high-level features from WBCs automatically, and a random forest is applied to these features to recognize the other three kinds of WBCs: neutrophils, monocytes and lymphocytes. Detection experiments on the Cellavision database and the ALL-IDB database show that the proposed detection method performs better than the iterative threshold method in most cases while taking less time, and classification experiments show that the proposed classification method is more accurate than some other methods in most cases.

Keywords White blood cell · Detection · Classification · Convolutional neural networks · Random forest

* Feilong Cao
[email protected]
1 Department of Applied Mathematics, College of Science, China Jiliang University, Hangzhou 310018, Zhejiang Province, People's Republic of China
2 Jiashan Jasdaq Medical Device Co., Ltd., Jiashan 314100, Zhejiang Province, People's Republic of China

1 Introduction

Computer technology and artificial intelligence play an increasing role in disease diagnosis [1–5]. Among these applications, the counting of white blood cells (WBCs, also known as leukocytes) in peripheral blood is an important issue because it can assist pathologists in diagnosing diseases such as leukemia and other blood diseases. Generally, there are five main types of WBCs: eosinophils, basophils, neutrophils, monocytes and lymphocytes, as shown in Fig. 1. They can be counted with manual or automatic methods. Although the manual method can attain a recognition rate of 100% when it is performed by very skilled operators, it has drawbacks such as slow analysis, non-standard accuracy, and high dependence on the operator's skills. Hence, automatic methods for counting WBCs are increasingly preferred in computer-aided diagnosis systems.

Generally speaking, an automatic WBC recognition system consists of three key steps: (1) detecting WBCs from a peripheral blood image, (2) extracting effective features, (3) designing a classifier [4]. That is, it first detects WBCs from a peripheral blood image and then extracts effective features of the WBCs for the classification.

To a certain extent, a good detection method that correctly separates WBCs from their background is the first step to success. There have been a lot of methods for


Fig. 1  Five types of white blood cells: eosinophils, basophils, neutrophils, monocytes and lymphocytes

detecting WBCs from the background, such as clustering [6], thresholding [7–14], morphological operators [15–17], the Gram–Schmidt orthogonalization method [18], edge detection [19], region growing [20, 21], colors [14, 22, 23], optimization-based methods [24, 25], fuzzy-based methods [9, 26], and the support vector machine (SVM) [27]. On the one hand, each method has its advantages and disadvantages. For example, the threshold method is simple but is not able to segment WBCs accurately from the background. Some methods (e.g., the SVM method and the region growing method) can produce reasonably accurate detection results, but they are slow and need high computational resources. While some color-based segmentation methods (e.g., [14]) were conducted directly on the RGB color space, some approaches (e.g., [22, 23]) adopted the hue-saturation-intensity (HSI) color space (especially the S component). In general, the S-component-based methods outperformed the RGB-based methods. On the other hand, few of these methods detect WBCs automatically from the microscope images. That is, many methods start the classification from WBC images that have been cropped by experts, which is inconvenient in real applications. For example, Tai et al. [31] begin directly with feature extraction, such as roundness, color of cytoplasm, and nucleus-cytoplasm ratio. Therefore, some researchers have begun to study automatic WBC detection methods for practical applications. For example, Guimaraes et al. [19] present a new automatic circular decomposition algorithm that separates connected circular particles in order to locate their center coordinates and estimate their radii. Sinha and Ramakrishnan [29] propose an automatic detection method using k-means clustering and the EM algorithm on the HSV-equivalent of the image. Nilufar et al. [12] propose an automatic detection method based on the contour of blood cells. Rezatofighi and Soltanian-Zadeh [18] find the discriminating region of the WBC in the hue-saturation-intensity (HSI) color space, segment the WBC with a morphological process, extract some geometrical, color and texture features, and classify the obtained features with three kinds of neural networks: multilayer perceptron, SVM, and the hyper rectangular composite neural networks. Cuevas et al. [24, 25] propose some automatic detection methods based on optimization theory, i.e., they treat the automatic detection of WBCs as an ellipse or circle detection problem to improve the detection accuracy, robustness and stability.

An effective feature extraction method and a good classifier are the second step to success for a WBC recognition system. Many papers extract some features of the nucleus and cytoplasm, and then distinguish them with neural network classifiers. For example, Mohapatra et al. [28] use the gray level co-occurrence matrix (GLCM) as well as some shape features of the leukocytes to classify five types. Sinha and Ramakrishnan [29] use shape, color and texture features to identify leukocytes. Kuse et al. [30] use GLCM texture to obtain 18 signatures to identify lymphocytes. Rezatofighi and Soltanian-Zadeh [18] extract the color and morphological features of cells, leaf area and texture feature (local binary pattern) to recognize leukocytes. Tai et al. [31] extract features such as roundness, color of cytoplasm, and nucleus-cytoplasm ratio. Su et al. [4] extract some geometrical, color and texture features. However, those features are designed by experts from experience, according to the characteristics of the cells. These low-level features can be hand-crafted with great success for some specific data, but designing effective features for new data usually requires new domain knowledge, because most hand-crafted features cannot simply be applied under new conditions. Therefore, researchers have begun to look for ways of learning features from the data of interest to remedy the limitation of hand-crafted features. A representative example of such methods is learning through deep neural networks, which has attracted significant attention recently [32]. The idea of deep learning is to find higher-level features that provide more invariance to intra-class variability. One successful application of deep learning in image classification is the use of the convolutional neural network (CNN) [33, 34]


architecture. The human visual system can effortlessly extract temporal information from a cluttered scene in the external environment and analyze the target or region of interest quickly and accurately from the aspect of understanding and awareness, while CNN imitates this operating principle with its special network structure and learning rule. Therefore, CNN can extract high-level features from images.

Up to now, many classifiers have been designed for pattern recognition, such as the Bayes classifier [14], the predictor of natural disordered regions (PONDR predictor) [40] for the MobiDB database [41], feed-forward back propagation [35, 36], SVM [18], local linear map [37], fuzzy cellular neural network [38], extreme learning machine [39], random forest [46], and so on. Some of these classifiers have been applied successfully to the classification of WBCs. As a random forest consists of a collection of tree-structured classifiers, it usually classifies better than a single classifier.

In this paper, we address the issue of an automatic recognition system that can effectively detect and classify WBCs from peripheral blood images. We want to detect WBCs from the background automatically with a simple and effective algorithm. Then we want to extract high-level features automatically with CNN. Finally, we consider applying an ensemble classifier, the random forest, to improve the classification. The main contributions of our study are highlighted as follows. (1) In the step of detecting WBCs from the microscope image, unlike some traditional methods that convert the color image into other color spaces, we just apply the simple relationship of the R and B components, based on the special characteristics of WBCs, to separate WBCs from the background. Then we propose an algorithm that combines the lobes of nuclei using their own characteristics to detect WBCs automatically, which is fast and accurate in experiments. (2) In the step of extracting features, unlike some traditional methods in which features are designed by experts from experience, we apply CNN to extract effective features of WBCs automatically. (3) In the step of choosing a classifier, unlike the traditional strategy that constructs a single classifier, we apply a random forest that consists of a collection of tree-structured classifiers to improve the accuracy for the features extracted by CNN.

2 Methods

In this section, we propose a novel automatic WBC recognition system. The proposed system first detects WBCs from the microscope images based on the simple relationship of colors and the morphological operations. Then for each WBC, a granularity feature (pairwise rotation invariant co-occurrence local binary pattern (PRICoLBP) [42]) is extracted, and SVM is applied to these granularity features to discern the eosinophil and basophil from other types of WBCs. For the remaining three types of WBCs, we apply CNN to extract their high-level features automatically. Then an effective classifier, the random forest, is applied to recognize which kind of WBC each one is: neutrophil, monocyte or lymphocyte. The flowchart of our proposed method is shown in Fig. 2.

Fig. 2 A block diagram of our proposed automatic recognition system for WBCs: WBCs are detected from a microscope image first based on the simple relationship of colors and the morphological operations, then a granularity feature (PRICoLBP) is extracted for each WBC, and SVM is applied to discern the eosinophil and basophil from other types of WBCs. Last, CNN is applied to the remaining three types of WBCs to extract their features automatically, and a random forest is used to recognize them

2.1 Detection of WBCs

A peripheral blood image $I_0$ contains not only different kinds of WBCs, but also a lot of red blood cells. Therefore, the first step of our proposed method is to detect WBCs from the image $I_0$. Generally, after blood cells are dyed with some method, e.g., Wright's staining, the nuclei of WBCs appear deeply colored, which makes WBCs easy to distinguish from other cells. For example, see the sampled images (a) and (b) in Fig. 7. The classical methods for detecting the nucleus often convert the microscope image from the RGB space to some other color space such as HSI or HSV, and then choose the significant components with an appropriate threshold value. Here, in our proposed method, we adopt a special property of blood cell images to detect WBCs. That is, we first use the difference value $R - B$ of the colors R and B to get an $R - B$ image $I_1$, then apply a binarization to the


Fig. 3 Color transformation comparison of our method with some other methods for detecting WBCs initially. a Examples of converting color image into some other images. The first row shows the process of converting color microscope image into gray image. The second row shows the process of converting color image into “S” component image after HSI transformation. The last row shows the process of converting color image into R − B image with our method. b The process of getting I2 image with our proposed method. First, convert the color image into R − B image, then get a binary image using a binarization on the R − B image with a threshold value


$R - B$ image $I_1$ with a threshold value to get an image $I_2$ that makes the nuclei more obvious, as shown in Fig. 3. We can see that the obtained image $I_2$ in Fig. 3 contains some small objects that are not nuclei of WBCs, and some nuclei may be incomplete. In order to remove those small objects from image $I_2$ and complete the nuclei, we apply a combination of morphological operations [43], such as erosion and dilation, to the image $I_2$. The erosion operation can eliminate small and insignificant objects. An erosion of $X$ by $B$ can be described as

$$X \ominus B = \{\, z \in \mathbb{Z}^2 \mid (B)_z \subset X \,\},$$

where $X$ is a binary image, $B$ is a structuring element, and $(B)_z$ is the translation of $B$ with respect to the point $z$. As opposed to the erosion operation, the dilation operation can fill in the holes of the objects by "growing" or "thickening" them. A dilation of $X$ by $B$ can be described as

$$X \oplus B = \{\, z \in \mathbb{Z}^2 \mid (\hat{B})_z \cap X \neq \emptyset \,\},$$

where $\hat{B}$ is the reflection of $B$. In our proposed method, we first use the erosion operation once to get rid of small and insignificant objects, and then the dilation operation twice to grow the nuclei of WBCs. The formula for getting a clear image $I_3$ from image $I_2$ can be described as follows:

$$I_3 = (I_2 \ominus B \oplus B \oplus B) \cap I_2,$$

where $B$ is some chosen structuring element.

With the information about the nuclei in image $I_3$, we begin to locate and crop the WBCs. Let $(x_i, y_i)$ be the center coordinates of the minimum bounding rectangle $A_i$ that contains the $i$-th WBC, $i = 1, 2, \ldots, N$. Sometimes, the selected rectangle may not cover a whole cell, but just a lobe of a nucleus. For example, as shown in Fig. 4, for one WBC there are three rectangles that each contain one lobe. So we need to find a rectangle that contains the whole WBC. Here, we calculate the Euclidean distance between the centers of two rectangles. If the distance is less than the diameter of the nucleus and the number of pixels in the two areas is less than the maximum value, we consider these two lobes to be in the same nucleus and update the rectangle and its center coordinate with the following formula

$$A_i \leftarrow A_i \cup A_j \quad \text{and} \quad (x_i, y_i) \leftarrow \left( \frac{x_i + x_j}{2}, \frac{y_i + y_j}{2} \right),$$

until the condition above fails. Generally, the diameter of the nucleus and the maximum value are related to the magnification of the image. For example, in our sampled image, one nucleus contains around 3000 pixels. The detailed procedure of merging the lobes of one nucleus is shown in the following Algorithm 1.

Fig. 4 One cropped WBC image illustrates the algorithm for mergence of lobe areas


Fig. 5 The flowchart of our proposed method for detecting WBCs from a peripheral image. Convert color image into R − B image, get a binary image using a binarization on the R − B image with a threshold value, get rid of small objects and complete the nuclei with morphological operations, locate WBCs with our algorithm of merging lobes of one nucleus, and crop each WBC from the peripheral image
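
The first three steps of this flowchart can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB implementation. Only the erode-once, dilate-twice, intersect-with-I2 sequence and the threshold range of −5 to 0 come from the paper. The 3 × 3 square structuring element, the direction of the threshold test (nucleus pixels are assumed to be those with R − B below the threshold), the treatment of pixels outside the image as background, and all function names are assumptions made here.

```python
# Sketch of the detection front end: R - B image, binarization, then
# I3 = (I2 eroded once, dilated twice) intersected with I2.
# Assumptions (not from the paper): 3x3 square structuring element,
# nucleus pixels satisfy R - B < threshold, outside pixels are background.

def rb_binarize(rgb, threshold=-5):
    """Return a 0/1 mask I2 from an image given as rows of (r, g, b) tuples."""
    return [[1 if (r - b) < threshold else 0 for (r, g, b) in row] for row in rgb]

def _neighbours(mask, y, x):
    """Values in the 3x3 neighbourhood of (y, x); out-of-range pixels are 0."""
    h, w = len(mask), len(mask[0])
    return [mask[j][i] if 0 <= j < h and 0 <= i < w else 0
            for j in (y - 1, y, y + 1) for i in (x - 1, x, x + 1)]

def erode(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [[1 if all(_neighbours(mask, y, x)) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]

def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return [[1 if any(_neighbours(mask, y, x)) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]

def clean_mask(i2):
    """I3: erode once, dilate twice, then intersect with the original I2."""
    grown = dilate(dilate(erode(i2)))
    return [[g & o for g, o in zip(row_g, row_o)] for row_g, row_o in zip(grown, i2)]
```

On a toy image with one 3 × 3 dark-blue blob and one isolated dark-blue pixel, `clean_mask(rb_binarize(image))` keeps the blob and drops the isolated pixel, because the single erosion removes anything smaller than the structuring element and the final intersection prevents the two dilations from growing the blob beyond its original extent.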


Fig. 6  The concrete architecture of CNN applied in our proposed method to extract features in high level. It consists of 5 convolutional layers
and 2 pooling layers
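
Fig. 6 names two layer types, convolutional and pooling. As the paper describes them, a convolutional layer filters its input with a small weight matrix and applies the function f(x) = max(0, x), and a pooling layer holds no weights and shrinks its input by max-pooling. The sketch below illustrates one layer of each kind on a small single-channel grid. It is not the network of Fig. 6: the "valid" border handling, the 2 × 2 non-overlapping pooling window, and the function names are assumptions chosen for brevity.

```python
# Toy single-channel versions of the two layer types; sizes and border
# handling are illustrative assumptions, not the architecture of Fig. 6.

def conv2d_relu(image, kernel):
    """Valid 2D convolution (kernel flipped), followed by f(x) = max(0, x)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    flipped = [row[::-1] for row in kernel[::-1]]
    return [[max(0, sum(image[y + j][x + i] * flipped[j][i]
                        for j in range(kh) for i in range(kw)))
             for x in range(out_w)] for y in range(out_h)]

def max_pool(feature_map, size=2):
    """Non-overlapping size x size max-pooling; has no weights."""
    return [[max(feature_map[y + j][x + i]
                 for j in range(size) for i in range(size))
             for x in range(0, len(feature_map[0]) - size + 1, size)]
            for y in range(0, len(feature_map) - size + 1, size)]
```

Stacking such layers, with many filters per layer and learned weights, is what lets the later layers respond to progressively more abstract patterns.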

Algorithm 1 The mergence of lobes of one nucleus.

Input: The number $N$ of minimum bounding rectangles, and the center coordinate $(x_i, y_i)$ of each bounding rectangle $A_i$, $i = 1, 2, \ldots, N$.
Output: The new number $M$ and the coordinates of bounding rectangles $\{(x_i, y_i)\}_{i=1}^{M}$.

For i = 1 to N;
    k = 0;
    For j = i + 1 to N;
        Calculate the distance of centers $(x_i, y_i)$ and $(x_j, y_j)$ of two bounding rectangles

            $$d = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}.$$

        If d is less than the diameter of the nucleus and the number of pixels in $A_i \cup A_j$ is less than the maximum value, update the rectangle and its center coordinate with the following formula

            $$A_i \leftarrow A_i \cup A_j \quad \text{and} \quad (x_i, y_i) \leftarrow \left( \frac{x_i + x_j}{2}, \frac{y_i + y_j}{2} \right);$$

            k = k + 1.
        end if.
        N = N − k.
    end
end
Denote the final N as M, and collect all the new coordinates $\{(x_i, y_i)\}_{i=1}^{M}$.
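
The merging rule of Algorithm 1 can be sketched in Python as follows. This is one possible reading, not a line-by-line transcription: instead of the counters N and k, it removes a merged rectangle from the list and rescans, so that merging continues until no pair satisfies the condition. The rectangle format `(x_min, y_min, x_max, y_max)`, the use of the enclosing rectangle as the union, the use of its area as the pixel count, and the parameter names `nucleus_diameter` and `max_pixels` are all assumptions made for illustration.

```python
import math

def merge_lobes(rects, centers, nucleus_diameter, max_pixels):
    """Merge bounding rectangles whose centers are closer than one nucleus diameter.

    rects:   list of (x_min, y_min, x_max, y_max), one per detected region
    centers: list of (x, y) center coordinates, one per rectangle
    Returns the merged (rects, centers); inputs are not modified.
    """
    rects, centers = list(rects), list(centers)
    i = 0
    while i < len(rects):
        merged = False
        for j in range(i + 1, len(rects)):
            (xi, yi), (xj, yj) = centers[i], centers[j]
            d = math.hypot(xi - xj, yi - yj)          # Euclidean distance of centers
            a, b = rects[i], rects[j]
            # Assumed union: smallest rectangle enclosing both A_i and A_j.
            union = (min(a[0], b[0]), min(a[1], b[1]),
                     max(a[2], b[2]), max(a[3], b[3]))
            pixels = (union[2] - union[0]) * (union[3] - union[1])
            if d < nucleus_diameter and pixels < max_pixels:
                rects[i] = union
                centers[i] = ((xi + xj) / 2, (yi + yj) / 2)   # midpoint update
                del rects[j], centers[j]
                merged = True
                break                                  # rescan with the updated A_i
        if not merged:
            i += 1
    return rects, centers
```

For three overlapping lobe rectangles and one distant cell, calling `merge_lobes(rects, centers, nucleus_diameter=60, max_pixels=3000)` returns two rectangles. The two thresholds are illustrative values only, since the paper notes that they depend on the magnification of the image.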


Table 1 Databases

Database               Num    Resolution     Format   Usage
Cellavision database   14     2864 × 2909    JPG      Detection
                       1080   300 × 300      JPG      Classification
ALL-IDB database       59     2592 × 1944    JPG      Detection
                       130    1712 × 1368    JPG      Classification
Jiashan database       215    300 × 300      JPG      Classification

Fig. 7 Some images are sampled from ALL-IDB database, Cellavision database and Jiashan database. a One peripheral blood image sampled from ALL-IDB database. b One peripheral blood image sampled from Cellavision database. c One cropped image from ALL-IDB database. d One cropped image from Cellavision database. e One image from Jiashan database

With the new coordinates $\{(x_i, y_i)\}_{i=1}^{M}$ of the rectangles $\{A_i\}_{i=1}^{M}$, we can locate WBCs and crop them from image $I_3$. Up to now, we have detected WBCs from the peripheral image automatically. The flowchart can be seen in Fig. 5.

2.2 Classification of WBCs

In Sect. 2.1, we detected all kinds of WBCs from the peripheral image automatically. The remaining work is to determine the type of each WBC: neutrophil, eosinophil, basophil, monocyte or lymphocyte. Generally speaking, traditional methods for classifying WBCs extract some features of the nucleus or cytoplasm for the subsequent classifier, such as geometrical features, texture features or color features. Therefore, the classification results depend greatly on the dissimilarity of the designed features. That is to say, designing an effective feature is very important for WBC classification. In our proposed classification method, we seek a feature extraction of WBCs based on CNN [33, 34]. As CNN performs better when it deals with larger sets of images, while the proportions of eosinophils and basophils in a peripheral blood image are 1–5% and less than 1%, respectively, we first distinguish the eosinophils and basophils from the other types of WBCs before applying CNN.

Since eosinophils and basophils are full of concentrated and big granules (as shown in Fig. 1a, b), we apply the PRICoLBP feature [42] to capture the granularity of eosinophils and basophils, which enhances their discriminability from the other types of WBCs. PRICoLBP is a variant of the local binary pattern


(LBP), which is a gray-scale invariant texture descriptor. For a point $A$ in an image, its LBP code is computed by

$$\mathrm{LBP}(A) = \sum_{i=0}^{n-1} s(g_i - g_c)\, 2^i,$$

where $g_c$ is the pixel value of point $A$, $g_i$ is the pixel value of the $i$-th neighbor of point $A$, and $s(\cdot)$ is the sign function whose value is 0 or 1 [44]. Let $U(\cdot)$ be the uniformity measure on LBPs defined as

$$U(\mathrm{LBP}(A)) = \sum_{i=1}^{n} \left| s(g_i - g_c) - s(g_{i-1} - g_c) \right|,$$

then $\mathrm{LBP}^{u}(A)$, the uniform LBP of point $A$, is defined as those LBPs of point $A$ with $U(\mathrm{LBP}(A)) \le 2$, and $\mathrm{LBP}^{ru}(A)$, the rotation invariant uniform LBP of point $A$, is defined as [44]

$$\mathrm{LBP}^{ru}(A) = \begin{cases} \sum_{i=0}^{n-1} s(g_i - g_c), & U(\mathrm{LBP}(A)) \le 2, \\ n + 1, & \text{otherwise.} \end{cases}$$

Now the PRICoLBP of two points $A$ and $B$ can be described as

$$\mathrm{PRICoLBP}(A, B) = [\mathrm{LBP}^{ru}(A), \mathrm{LBP}^{u}(B, i(A))]_{co},$$

where $\mathrm{LBP}^{u}(B, i(A))$ is the uniform LBP of point $B$ obtained by using the $i(A)$-th index as the start point of the binary sequence [42]. The PRICoLBP feature not only captures the spatial context co-occurrence information effectively, but also possesses rotation invariance.

With the PRICoLBP features obtained for the five types of WBC, we use SVM [45] to classify them into three classes: eosinophil, basophil, and others.

The remaining work is to classify the other three kinds of WBCs: neutrophil, monocyte and lymphocyte. Here we apply CNN [34] to extract high-level features of WBCs, and take a random forest [46] as the classifier to distinguish these features. The idea of CNN is to discover multiple levels of representation, with the assumption that higher-level features can represent more abstract semantics of the data. The features learned from a deep network are expected to provide more invariance to intra-class variability [32]. CNN is a special feed-forward neural network that consists of several convolutional layers and pooling layers. The convolutional layer filters the input image with a small matrix of weights and applies a nonlinear function as the activation function. For example, for an input image $I$, let $W \in \mathbb{R}^{k_1 \times k_2}$ be a filter of weights; then the operation in the convolutional layer is to take the 2D convolution $I * W$, and the activation function can be taken as $f(x) = \max(0, x)$. The pooling layer does not contain weights and simply reduces the size of the preceding output with the max-pooling operation. In this paper, the CNN applied in our method consists of 5 convolutional layers and 2 pooling layers. Its concrete architecture is shown in Fig. 6.

The learning algorithm used for training the CNN is the stochastic gradient descent method. The update rule for the weight $w$ is

$$v_{i+1} := 0.9\, v_i - 0.0005\, \epsilon\, w_i - \epsilon \left\langle \frac{\partial L}{\partial w} \Big|_{w_i} \right\rangle_{D_i},$$

$$w_{i+1} := w_i + v_{i+1},$$

where $i$ is the iteration index, $v$ is the momentum variable, $\epsilon$ is the learning rate, and $\left\langle \frac{\partial L}{\partial w} \big|_{w_i} \right\rangle_{D_i}$ is the average over the $i$-th batch $D_i$ of the derivative of the objective with respect to $w$, evaluated at $w_i$ [34]. With the trained CNN, we can extract a 4096-dimensional feature vector from each WBC image.

Now a random forest [46] is used to classify the features extracted by the CNN and so determine the remaining three types of WBCs: neutrophil, monocyte and lymphocyte. A random forest is a classifier consisting of a collection of tree-structured classifiers, and each tree casts a unit vote for the most popular class at the input [46]. The mechanism and structure of a random forest can be summarized in the following two phases.

1. Training phase: Given a training set $X = \{x_i\}_{i=1}^{N} \subseteq \mathbb{R}^m$ with the corresponding class labels $Y = \{y_i\}_{i=1}^{N} \subseteq \{1, 2, \ldots, c\}$, where $c$ is the number of classes, let $L$ be the number of trees in the forest and $T_i$ the $i$-th decision tree, $i = 1, 2, \ldots, L$.

Step 1. For each decision tree $T_i$, generate a training set by bagging, that is, by sampling $N$ times from the training set $X$ with replacement.
Step 2. At each non-leaf node $t$, the best split attribute is calculated over the randomly chosen features using the Gini impurity index, defined by

$$\mathrm{Gini}(t) = 1 - \sum_{j=1}^{c} [p(j \mid t)]^2,$$

where $p(j \mid t)$, $j = 1, \ldots, c$, is the probability of class $j$ in the node $t$.
Step 3. Go to Step 2 until $T_i$ is fully grown without being pruned.

2. Classification phase: A given test sample $x$ is pushed down each classifier in the forest, and every decision tree gives exactly one vote on the label of this sample. The predicted label of $x$ is then the one that has the most votes in the forest.
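
Two pieces of the random forest described above are easy to make concrete: the Gini impurity used to score a node in Step 2, and the majority vote of the classification phase. The Python sketch below is illustrative only. The function names are made up here, and the size-weighted impurity used to compare candidate splits is a common convention that the paper does not spell out.

```python
from collections import Counter

def gini(labels):
    """Gini impurity 1 - sum_j p(j|t)^2 of the class labels reaching a node t."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(left, right):
    """Size-weighted impurity of a candidate split (assumed scoring convention)."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def forest_vote(tree_predictions):
    """Predicted label: the class that receives the most votes from the trees."""
    return Counter(tree_predictions).most_common(1)[0][0]
```

A node holding only one class has impurity 0, so a split is better the lower its weighted impurity. In the voting step each tree contributes exactly one label, as in the classification phase above.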


Table 2 Sensitivity and precision comparison of our proposed method and the iterative threshold method in the paper [14] on the Cellavision database

                 Our method                         Iterative threshold method [14]
No. of images    TP   FP   FN   TPR (%)   PPV (%)   TP   FP   FN   TPR (%)   PPV (%)
#1               8    2    0    100       80        10   0    9    52.6      100
#2               7    0    0    100       100       –    –    –    –         –
#3               7    0    0    100       100       6    1    5    54.5      85.7
#4               3    0    1    75        100       3    0    0    100       100
#5               0    0    0    100       100       0    0    3    0         100
#6               4    1    0    100       80        –    –    –    –         –
#7               6    0    0    100       100       6    0    0    100       100
#8               5    0    0    100       100       –    –    –    –         –
#9               5    0    0    100       100       5    0    8    38.5      100
#10              9    0    1    90        100       –    –    –    –         –
#11              12   1    2    85.7      92.3      13   0    4    76.5      100
#12              7    0    0    100       100       5    2    6    45.5      71.4
#13              1    0    0    100       100       –    –    –    –         –
#14              2    0    0    100       100       –    –    –    –         –
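
The TPR and PPV columns of Table 2 follow from the TP, FP and FN counts through the ratios TPR = TP/(TP + FN) and PPV = TP/(TP + FP), which the paper uses as its evaluation criteria. The short Python sketch below recomputes them. The function names are illustrative, and reporting an empty ratio (0/0) as 100% is an assumption read off row #5 of the table rather than a rule stated in the paper.

```python
def rate(numerator, denominator):
    """Percentage rounded to one decimal; 0/0 is reported as 100 (see row #5)."""
    if denominator == 0:
        return 100.0
    return round(100.0 * numerator / denominator, 1)

def tpr(tp, fn):
    """Sensitivity in percent: TP / (TP + FN)."""
    return rate(tp, tp + fn)

def ppv(tp, fp):
    """Precision in percent: TP / (TP + FP)."""
    return rate(tp, tp + fp)
```

For example, row #11 of our method has TP = 12, FP = 1 and FN = 2, and `tpr(12, 2)` and `ppv(12, 1)` give the tabulated 85.7 and 92.3.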

Table 3 Cost time (unit: s) comparison of our method and the iterative threshold method [14]

Database      Cost time of our method (s)   Cost time of iterative threshold method (s)
Cellavision   29.06                         65.18
ALL-IDB       71.18                         125.09

Fig. 8 Proportions of images with different PPV via our detection method for 59 images in ALL-IDB database. 92% of 59 peripheral blood images have PPV 100%; 2, 3, 3% of 59 peripheral blood images have PPV 75, 67, 50%, respectively

2.3 Databases and evaluation criteria

There is no large public database for WBC detection and classification. Many papers therefore test their recognition systems with only a few WBC images, or with their own databases that are not public. In order to illustrate the effectiveness of our proposed method, we collected as many of the currently known datasets as we could, together with some sample images from a local hospital. Cellavision [47] is a global medical technology company that develops and sells systems for the routine analysis of blood and other body fluids. From the database in CellaVision Proficiency Software, we downloaded 14 microscope images of size 2864 × 2909 in JPG format with 24-bit color depth to test our detection method, and 1080 cropped images of size 300 × 300 in JPG format with 24-bit color depth to test our classification method. The ALL-IDB database [48] is a public image database consisting of peripheral blood samples collected by experts from normal individuals and from patients with childhood leukemia and hematological diseases. It contains two distinct versions. One includes 108 images in JPG format with 24-bit color depth, most of which were captured with an optical microscope at magnifications ranging from 300 to 500 and a Canon PowerShot G5 camera with a resolution of 2592 × 1944. The other images were captured with a microscope at a constant magnification and an Olympus C2500L camera with a resolution of 1712 × 1368. In our experiments, the normal samples are chosen as our training and testing data. Jiashan database


Table 4 Accuracy comparison of our proposed classification method with Seyed method [18] and HSVM method [31] on the mixed database of Cellavision, ALL-IDB and Jiashan databases

Methods      Basophil (%)   Eosinophil (%)   Lymphocyte (%)   Monocyte (%)   Neutrophil (%)   Classification accuracy (%)
Ours         100            70               74.8             85.3           97.1             92.8
Seyed [18]   53.0           63               85.0             39.0           50.8             76.8
HSVM [31]    43.8           0                66.8             0              7.5              76.3

is an image database consisting of peripheral blood samples collected from patients by experts in Jiashan First People's Hospital. It includes 215 cropped images in JPG format with 24-bit color depth and a resolution of 300 × 300 for testing the classification.

In all of these databases, each image for detection has an associated text file containing the coordinates of the centroid of each leukocyte. They were labelled manually by a skilled operator and can be used as ground truth. The information on these databases is shown in Table 1, and some sampled images from these databases are shown in Fig. 7.

To evaluate the performance of a method, we apply sensitivity (or true positive rate, TPR) and precision (or positive predictive value, PPV) as follows:

$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{PPV} = \frac{TP}{TP + FP},$$

where TP is the number of WBCs in a microscope image that have been detected correctly as WBCs, FP is the number of WBCs in a microscope image that have not been detected as WBCs, and FN is the number of non-WBCs in a microscope image that have been detected as WBCs. Therefore, the higher the sensitivity and precision, the better the detection method.

All the programs are run in the MATLAB 8.4.0.150421 (R2014b) environment on 8 processors with a speed of 2.40 GHz.

3 Results

3.1 Experiments on WBC detection

In this subsection, we compare our proposed detection method for WBCs with the iterative threshold method in the paper [14]. For the iterative threshold method [14], Wu et al. consider WBC recognition as a global process. They start from images cropped by experts from the microscope image, find the discriminating region of the WBC in the HSI color space, segment the WBC with a morphological process, extract some geometrical, color and texture features, and classify the obtained features with three kinds of neural networks: multilayer perceptron, SVM, and the hyper rectangular composite neural networks. We take some microscope images of the Cellavision database and the ALL-IDB database as the testing data. In our detection method described in Sect. 2.1, the threshold for separating the nucleus from the background can be set from −5 to 0.

Table 2 shows the sensitivity (TPR) and precision (PPV) of our proposed method and of the iterative threshold method [14] on the Cellavision database. Furthermore, Table 3 shows their cost time on the Cellavision database and the ALL-IDB database.

As shown in Table 2, our proposed detection method performs better than the iterative threshold method [14] on most sampled images. For example, the iterative threshold method [14] cannot detect WBCs in 6 microscope images, which account for 43% of the sampled images. Even for the images that it can process, our proposed method has a higher sensitivity in most cases, and the iterative threshold method easily detects non-WBCs as WBCs, which would also have a bad effect on diagnosis.

Table 3 compares the cost time of our proposed method with that of the iterative threshold method [14], where the cost time is measured from inputting the peripheral blood image to obtaining the subimages of all kinds of WBCs. As observed from Table 3, the iterative threshold method takes roughly twice as long as our method to detect WBCs, which implies that our method is quick and its cost time is acceptable in medical diagnosis.

Table 2 illustrates the good performance of our detection method on the Cellavision database. We also ran the same experiments on the ALL-IDB database. Here we further show the overall effect of our detection method on 59 peripheral blood images in the ALL-IDB database. The proportions of images with different PPV via our detection method are shown in Fig. 8.

Figure 8 shows that the images with a PPV of 100% account for 92% of the 59 peripheral blood images; the remaining images, with a PPV of 75, 67 and 50%, account for 2, 3 and 3% of the 59 peripheral blood images, respectively, which implies that our detection method can detect almost all WBCs from peripheral blood images.

3.2 Experiments on WBC classification

In this subsection, 1080 cropped images of Cellavision database, 20 cropped images of ALL-IDB database,

13
Med Biol Eng Comput

and 215 images from the Jiashan database, all normalized to a size of 227 × 227, are put together to test the effect of our classification method. Firstly, we extract the PRICoLBP feature of each image and apply an SVM to distinguish the eosinophils and basophils from the other three types of WBCs. Then, for the remaining three kinds of WBCs (neutrophil, monocyte, and lymphocyte), a CNN is used to extract high-level features with 4096 dimensions. After normalizing these features, we use a random forest to classify them. The classification process is run 50 times, and the average result is taken as the final classification accuracy. The classification results of our proposed method, compared with the Seyed method [18] and the hierarchical SVM (HSVM) method [31], are shown in Table 4.

Although the Seyed [18] and HSVM [31] methods are successful on their respective databases, Table 4 shows that they are not successful on our mixed database. The reason is that the mixed database is larger and more challenging. Our method, however, employs a CNN that can effectively extract multi-level features from WBCs on the larger database. Therefore, it outperforms the Seyed and HSVM methods in most cases on this challenging database, whether measured per kind of WBC or by the total recognition rate.

4 Discussion

Since counting WBCs in peripheral blood images can assist pathologists in diagnosing diseases, it is meaningful to study the automatic detection and classification of WBCs in microscope images. The proposed method first detects WBCs based on a simple relation between the colors R and B and a morphological operation; then the PRICoLBP feature is extracted, and an SVM is applied to distinguish the eosinophils and basophils from the other three types of WBCs. For the remaining three kinds of WBCs (neutrophil, monocyte, and lymphocyte), a CNN is used to extract high-level features and a random forest is applied to distinguish them. In short, the design of our method is based on the essential information of the peripheral blood image and the structure of WBCs. Furthermore, some effective methods are introduced to improve the classification accuracy.

For detection, as Tables 2 and 3 show, our proposed method performs better than the iterative threshold method [14] on most images. Because the iterative threshold method [14] uses the iterative Otsu's approach on the HSI space based on a circular histogram to detect WBCs, it sometimes fails to detect WBCs in peripheral blood images. For example, for images #2, #6, #8, #10, #13, and #14 in Table 2, no matter how the threshold is chosen by iteration, the WBCs cannot be segmented from the background. Even for the images in which the iterative threshold method [14] can detect WBCs, our proposed method mostly has a higher sensitivity (TPR) and precision (PPV). Because our method makes the best of the simple relation between the colors R and B and uses the morphological operation to delete noise and complete the nucleus, it can often avoid recognizing non-WBCs as WBCs, whereas the iterative threshold method often regards non-WBCs as WBCs. Finally, our detection method does not need iteration, so it takes less time than the iterative threshold method.

Of course, our detection method is not perfect, and it has some limitations. It cannot detect all the WBCs in some peripheral blood images, for example images #1, #6, and #11 in Table 2. It also sometimes regards a few non-WBCs as WBCs, as in images #4, #10, and #11. Hence, finding a more effective detection method based on our method is a direction of our future study.

For classification, as shown in Table 4, the accuracy of our proposed method is superior in most cases to that of the Seyed method [18] and the HSVM method [31] on the mixed database built from the Cellavision, ALL-IDB, and Jiashan databases. Generally speaking, feature extraction and classifier design are the two key steps in classification, and our proposed classification method considers both. For example, considering that eosinophils and basophils are full of granules, we extract the PRICoLBP features to better discriminate eosinophils and basophils from the other types of WBCs. As expected, the accuracies for basophil and eosinophil with our method are 10 and 70%, respectively, which are far higher than the results of the Seyed and HSVM methods. The reason is that we design better features for eosinophils and basophils than the Seyed and HSVM methods do. Furthermore, for the remaining three types of WBCs (lymphocyte, monocyte, and neutrophil), we apply a CNN to discover high-level features that strengthen the discrimination among them, and an effective classifier, the random forest, is applied to classify the extracted features. Several studies have shown that combining multiple weak classifiers into one aggregated classifier can lead to better classification performance than any of the individual weak classifiers [49], and the random forest classifies effectively here. As expected, the accuracies for monocyte and neutrophil with our method are 85.3 and 97.1%, respectively, which are far higher than the results of the Seyed and HSVM methods.

Of course, our classification method is not perfect either. The classification accuracy for lymphocyte is 74.7%, which is lower than the 85% of the Seyed method. The reason is that the number of lymphocyte images available for training the CNN is small, whereas a CNN works well with large data. Therefore, improving the classification of lymphocyte images based on our method is another direction of our future study.
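The two-stage design discussed above (an SVM on PRICoLBP features to separate eosinophils and basophils, then a random forest on CNN features for the remaining three types) can be sketched as a simple routing function. This is only an illustrative Python sketch, not the authors' MATLAB code: `classify_wbc`, `stage1`, and `stage2` are hypothetical names, the two trained models are passed in as plain callables, and the convention that stage 1 returns the label "other" for the remaining cells is an assumption.

```python
from typing import Callable, Sequence

# Labels handled by each stage (taken from the text).
STAGE1_LABELS = {"eosinophil", "basophil"}
STAGE2_LABELS = {"neutrophil", "monocyte", "lymphocyte"}


def classify_wbc(
    texture_features: Sequence[float],
    deep_features: Sequence[float],
    stage1: Callable[[Sequence[float]], str],
    stage2: Callable[[Sequence[float]], str],
) -> str:
    """Route one cell image through the two-stage classifier.

    stage1 stands in for the SVM on PRICoLBP features and is assumed to
    return "eosinophil", "basophil", or "other".
    stage2 stands in for the random forest on CNN features and is assumed
    to return "neutrophil", "monocyte", or "lymphocyte".
    """
    first = stage1(texture_features)
    if first in STAGE1_LABELS:
        return first  # granular cell: decided by stage 1 alone
    if first != "other":
        raise ValueError(f"unexpected stage-1 label: {first!r}")
    second = stage2(deep_features)
    if second not in STAGE2_LABELS:
        raise ValueError(f"unexpected stage-2 label: {second!r}")
    return second


if __name__ == "__main__":
    # Toy stand-ins for the trained models (hypothetical decision rules).
    toy_stage1 = lambda f: "eosinophil" if f[0] > 0.5 else "other"
    toy_stage2 = lambda f: "neutrophil" if f[0] > 0.0 else "lymphocyte"
    print(classify_wbc([0.9], [0.1], toy_stage1, toy_stage2))
    print(classify_wbc([0.2], [0.1], toy_stage1, toy_stage2))
```

Passing the stages as callables keeps the routing logic separate from whichever SVM, CNN, or random forest implementation is actually used.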

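The sensitivity (TPR) and precision (PPV) compared in the detection discussion can be computed from per-image counts. The sketch below assumes the standard definitions TPR = TP / (TP + FN) and PPV = TP / (TP + FP); the function name `detection_metrics`, the choice to return 0.0 when a denominator is zero, and the example counts are assumptions for illustration and are not taken from Table 2.

```python
from typing import Tuple


def detection_metrics(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (TPR, PPV) for one microscope image.

    tp: WBCs correctly detected as WBCs
    fp: non-WBCs wrongly detected as WBCs
    fn: WBCs that were not detected
    """
    if min(tp, fp, fn) < 0:
        raise ValueError("counts must be non-negative")
    # Assumption: report 0.0 when a ratio is undefined (zero denominator).
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return tpr, ppv


if __name__ == "__main__":
    # Hypothetical image: 3 WBCs found, 1 false detection, none missed.
    tpr, ppv = detection_metrics(tp=3, fp=1, fn=0)
    print(f"TPR = {tpr:.2f}, PPV = {ppv:.2f}")
```

With these hypothetical counts every WBC is found, but one of the four detections is spurious, so the sensitivity is 1.0 while the precision drops to 0.75.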
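The remark that aggregating several weak classifiers can outperform any single one [49] rests on voting. The following sketch shows plain majority voting over the labels predicted by the individual classifiers (in a random forest, the individual trees). `majority_vote` is a hypothetical helper, and breaking ties in favor of the label that appears first in the input is an assumption of this sketch, not something stated in the paper.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Return the label predicted by the most classifiers.

    Ties are broken by first appearance in `predictions` (an assumption).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable: some label always has the top count")


if __name__ == "__main__":
    # Hypothetical votes from five trees for one cell image.
    votes = ["monocyte", "neutrophil", "monocyte", "lymphocyte", "monocyte"]
    print(majority_vote(votes))
```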

5 Conclusion

In this paper, we have proposed an automatic detection and classification system for WBCs in peripheral blood images. Our proposed method first uses the simple relation between the colors R and B to obtain the R − B image, applies a morphological operation to delete noise and complete the nucleus, and then uses an algorithm for merging the lobes of a nucleus to help detect WBCs in peripheral blood images. With the detected WBCs, we first use the PRICoLBP feature and an SVM to distinguish the eosinophils and basophils from the other three types of WBCs. Then, for the remaining three kinds of WBCs (neutrophil, monocyte, and lymphocyte), a CNN is used to extract high-level features and a random forest is applied to distinguish them.

The main contributions of this paper are highlighted as follows:

1. In the step of detecting WBCs in the peripheral blood image, unlike the traditional strategy that converts the color image into another color space, we have applied the simple relation between R and B in place of the HSI space.
2. We have proposed an algorithm that combines the lobes of nuclei using their own characteristics and then detects WBCs using the location of the leukocyte nucleus, which is time-saving and accurate in experiments.
3. Considering that eosinophils and basophils are full of granules, we designed a plan to extract the granularity via the PRICoLBP feature to separate eosinophils and basophils from the other types of WBCs.
4. For the remaining three types of WBCs (lymphocyte, monocyte, and neutrophil), we have applied a CNN to extract the most effective features and a random forest to classify them accurately.
5. We have proposed an automatic recognition system for WBCs, that is to say, a method for detecting WBCs in peripheral blood images directly and classifying them without manual operation.

Experiments on the Cellavision, ALL-IDB, and Jiashan databases show that our proposed method achieves better detection and classification than some other methods, at a lower time cost.

Of course, our proposed method is not perfect, and it has some limitations. For example, it cannot detect all the WBCs in some peripheral blood images, and it sometimes regards a few non-WBCs as WBCs. Moreover, the classification of lymphocytes needs to be improved. Hence, finding a more effective detection method and improving the classification of lymphocyte images based on our method are the directions of our future study.

Acknowledgements This study was funded by the National Natural Science Foundation of China (61571410, 61672477, and 91330118) and the Zhejiang Provincial Natural Science Foundation of China (LY14A010027).

Compliance with ethical standards

Conflict of interest The authors declare that they have no conflict of interest.

Human and animal rights This study did not involve human participants or animals.

Informed consent All the authors of this paper have consented to the submission.

References

1. Ding Y, John NW, Smith L, Sun JA, Smith M (2015) Combination of 3D skin surface texture features and 2D ABCD features for improved melanoma diagnosis. Med Biol Eng Comput 53(10):961–974
2. Ross NE, Pritchard CJ, Rubin DM, Duse AGY, Ding NW, John L, Smith JA, Sun MS (2006) Automated image processing method for the diagnosis and classification of malaria on thin blood smears. Med Biol Eng Comput 44(5):427–436
3. Acharya UR, Mookiah MRK, Sree SV, Afonso D, Sanches J, Shafique S, Nicolaides A, Pedro LM, Fernandes JFE, Suri JS (2013) Atherosclerotic plaque tissue characterization in 2D ultrasound longitudinal carotid scans for automated classification: a paradigm for stroke risk assessment. Med Biol Eng Comput 51(5):513–523
4. Su MC, Cheng CY, Wang PC (2014) A neural-network-based approach to white blood cell classification. Sci World J 1:1–9
5. Gu G, Cui D, Li X (2012) Segmentation of overlapping Leucocyte images with phase detection and spiral interpolation. Comput Methods Biomech Biomed Eng 15(4):425–433
6. Sheikh H, Zhu B, Tzanakou EM (1996) Blood cell identification using neural networks. In: Proceedings of the IEEE 22nd annual northeast bioengineering conference, pp 119–120
7. Yampri P, Pintavirooj C, Daochai S, Teartulakarn S (2006) White blood cell classification based on the combination of eigen cell and parametric feature detection. In: Proceedings of the 1st IEEE conference on industrial electronics and applications (ICIEA 06), pp 1–4
8. Lin QM, Deng YY (2002) An accurate segmentation method for white blood cell images. IEEE Int Symp Biomed Imaging 2002:245–248
9. Shirazi SH, Umar AI, Naz S, Razzak MI (2016) Efficient Leukocyte segmentation and recognition in peripheral blood image. Technol Health Care 24(3):335–347
10. Li Y, Zhu R, Mi L, Cao YH, Yao D (2016) Segmentation of white blood cell from acute Lymphoblastic Leukemia images using dual-threshold method. Comput Math Methods Med. doi:10.1155/2016/9514707
11. Bikhet SF, Darwish AM, Tolba HA, Shaheen SI (2000) Segmentation and classification of white blood cells. Proc IEEE Int Conf Acoust Speech Signal Process 4:2259–2261
12. Nilufar S, Ray N, Zhang H (2008) Automatic blood cell classification based on joint histogram based feature and Bhattacharya Kernel. In: Proceedings of the 42nd Asilomar conference on signals, systems and computers (ASILOMAR 08), pp 1915–1918


13. Nazlibilek S, Karacor D, Ercan T, Sazli MH, Kalender O, Ege Y (2014) Automatic segmentation, counting, size determination and classification of white blood cells. Measurement 55:58–65
14. Wu J, Zeng P, Zhou Y, Olivier C (2007) A novel color image segmentation method and its application to white blood cell image analysis. In: International conference on signal processing proceedings, ICSP, vol 2
15. Dorini LB, Minetto R, Leite NJ (2013) Semiautomatic white blood cell segmentation based on multiscale analysis. IEEE J Biomed Health Inform 17(1):250–256
16. Osowski S, Siroic R, Markiewicz T, Siwek K (2009) Application of support vector machine and genetic algorithm for improved blood cell recognition. IEEE Trans Instrum Meas 58(7):2159–2168
17. Rubeto CD, Dempster A, Khan S, Jarra B (2000) Segmentation of blood images using morphological operators. In: Proceedings of the 15th international conference on pattern recognition, vol 3, p 3401
18. Rezatofighi SH, Soltanian-Zadeh H (2011) Automatic recognition of five types of white blood cells in peripheral blood. Comput Med Imaging Graph 35(4):333–343
19. Guimaraes LV, Suzim AA, Maeda J (2000) A new automatic circular decomposition algorithm applied to blood cells image. In: IEEE international symposium on bio-informatics and biomedical engineering, pp 277–280
20. Chassery JM, Garbay C (1984) An iterative segmentation method based on contextual color and shape criterion. IEEE Trans Pattern Anal Mach Intell 6(6):794–800
21. Ghosh P, Bhattacharjee D, Nasipuri M (2016) Blood smear analyzer for white blood cell counting: a hybrid microscopic image analyzing technique. Appl Soft Comput 46:629–638
22. Hazlyna N, Mashor MY (2011) Segmentation technique for acute leukemia blood cells images using saturation component and moving l-mean clustering procedures. Int J Electr Electron Eng Technol 1:23–35
23. Salihah ANA, Mashor MY, Harun NH, Abdullah AA, Rosline H (2010) Improving colour image segmentation on acute myelogenous leukaemia images using contrast enhancement techniques. In: Proceedings of the IEEE EMBS conference on biomedical engineering and sciences (IECBES 10), pp 246–251
24. Cuevas E, Díaz M, Manzanares M, Zaldivar D, Pérez-Cisneros M (2013) An improved computer vision method for white blood cells detection. Comput Math Methods Med 2013:137392. doi:10.1155/2013/137392
25. Cuevas E, Oliva D, Díaz M, Zaldivar D, Pérez-Cisneros M, Pajares G (2013) White blood cell segmentation by circle detection using electromagnetism-like optimization. Comput Math Methods Med 2013:395071
26. Chaira T (2014) Accurate segmentation of Leukocyte in blood cell images using Atanassov’s intuitionistic fuzzy and interval Type II fuzzy set theory. Micron 61:1–8
27. Guo N, Zeng L, Wu Q (2007) A method based on multispectral imaging technique for white blood cell segmentation. Comput Biol Med 37(1):70–76
28. Mohapatra S, Patra D, Satpathy S (2011) Automated leukemia detection in blood microscopic images using statistical texture analysis. In: Proceedings of the 2011 international conference on communication, computing and security, pp 184–187
29. Sinha N, Ramakrishnan AG (2003) Automation of differential blood count. Proc TENCON Conf Converg Technol Asia Pac Reg 2:547–551
30. Kuse M, Sharma T, Gupta S (2010) A classification scheme for lymphocyte segmentation in H&E stained histology images. In: Ünay D, Çataltepe Z, Aksoy S (eds) Recognizing patterns in signals, speech, images and videos. Springer, Berlin, pp 235–243
31. Tai WL, Hu RM, Hsiao HCW, Chen RM, Tsai JJP (2011) Blood cell image classification based on Hierarchical SVM. IEEE Int Symp Multimed ISM 2011:129–136
32. Hinton GE, Osindero S, Teh YW (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554
33. LeCun Y, Bengio Y (1995) Convolutional networks for images, speech, and time series. In: Arbib MA (ed) The handbook of brain theory and neural networks. MIT Press, Cambridge, MA, USA, pp 255–258
34. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25(2):1097–1105
35. Umpon NT, Gader PD (2002) System-level training of neural networks for counting white blood cells. IEEE Trans Syst Man Cybern Part C 32(1):48–53
36. Long X, Cleveland WL, Yao YL (2005) A new preprocessing approach for cell recognition. IEEE Trans Inf Technol Biomed 9:407–412
37. Nattkemper TW, Ritter HJ, Schubert W (2001) A neural classifier enabling high-throughput topological analysis of lymphocytes in tissue sections. IEEE Trans Inf Technol Biomed 5:138–149
38. Shitong W, Min W (2006) A new detection algorithm (NDA) based on fuzzy cellular neural networks for white blood cell detection. IEEE Trans Inf Technol Biomed 10:5–10
39. Ravikumar S (2016) Image segmentation and classification of white blood cells with the extreme learning machine and the fast relevance vector machine. Artif Cells Nanomed Biotechnol 44(3):985–989
40. Bomma R, Venkatesh P, Dlvnsssr AK, Babu AY, Rao SK (2012) PONDR (predicators of natural disorder regions). Int J Comput Technol Electron Eng IJCTEE 2(4):1–10
41. Domenico TD, Walsh L, Martin AJM, Tosatto SCE (2012) MobiDB: a comprehensive database of intrinsic protein disorder annotations. Bioinformatics 28(15):2080–2081
42. Qi XB, Xiao R, Li CG, Qiao Y, Guo J, Tang XO (2014) Pairwise rotation invariant co-occurrence local binary pattern. IEEE Trans Pattern Anal Mach Intell 36(11):2199–2213
43. Gonzalez RC (2009) Digital image processing. Pearson Education India, New York City, pp 649–657
44. Ojala T, Pietikäinen M, Mäenpää T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
45. Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans Neural Netw 10(5):988–999
46. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
47. Cellavision Inc (2011). http://www.cellavision.com/
48. Labati RD, Piuri V, Scotti F (2011) All-IDB: the acute lymphoblastic leukemia image database for image processing. In: 18th IEEE international conference on image processing (ICIP), pp 2045–2048
49. Bauer E, Kohavi R (1999) An empirical comparison of voting classification algorithms: bagging, boosting, and variants. Mach Learn 36(1–2):105–139

Dr. Jianwei Zhao is currently a Professor and the Head of the Department of Applied Mathematics, China Jiliang University. Her research interests include image processing and neural networks.


Minshu Zhang is currently working toward an M.Sc. degree in Applied Mathematics at China Jiliang University, China. Her research interests include medical image processing and machine learning.

Dr. Zhenghua Zhou is currently an Associate Professor with the Department of Information and Computational Sciences, China Jiliang University. His research interests include medical image processing and CAGD.

Dr. Jianjun Chu is currently a General Manager of Jiashan Jasdaq Medical Device Co., Ltd., China. His research interests include medical image processing.

Dr. Feilong Cao is currently a Professor and the Dean of the College of Sciences, China Jiliang University. His research interests include image processing and big data analysis and computation.
