IWSSIP2020
All content following this page was uploaded by Joao Manuel R. S. Tavares on 13 July 2020.
Abstract—Acute leukemia is a cancer related to a bone marrow abnormality. It is more common in children and young adults. This type of leukemia generates unusual cell growth in a short period, requiring a quick start of treatment. Acute Lymphoid Leukemia (ALL) and Acute Myeloid Leukemia (AML) are the main types responsible for deaths caused by this cancer. The classification of these two leukemia types on blood slide images is a vital step of an automatic system that can assist doctors in the selection of appropriate treatment. This work presents a convolutional neural network (CNN) architecture capable of differentiating blood slides with ALL, AML and Healthy Blood Slides (HBS). The experiments were performed using 16 datasets with 2,415 images, and an accuracy of 97.18% and a precision of 97.23% were achieved. The results of the proposed model were compared with the results obtained by state-of-the-art methods, including methods also based on CNNs.
Index Terms—leukemia diagnosis, convolutional neural network, computer aided diagnosis.
978-1-7281-7539-3/20/$31.00 ©2020 IEEE
I. INTRODUCTION
The bone marrow produces a large proportion of blood cells, among them, on average, 100 million leukocytes (white blood cells) per day. Leukocytes combat and eliminate microorganisms and foreign chemical structures in the body by engulfing them (phagocytosis) or by producing antibodies. One of the diseases affecting the bone marrow function is leukemia [1].
Leukemia is one of the types of cancer that most affect the population. The American Cancer Society (ACS)1 estimated 61,780 new cases for the year 2019, with approximately 22,840 deaths. This disease has no defined etiology and affects the production of cells by the bone marrow. Over time, diseased cells replace healthy blood cells (white blood cells, red blood cells, and platelets), and the individual suffers from problems in transporting oxygen and fighting infections [1]. Among the forms of diagnosis of leukemia, the complete blood count (CBC) and the myelogram are the most used.
1 https://www.cancer.org/cancer/leukemia.html
There are numerous types of leukemia. However, the most usual classification considers two main characteristics: (1) the cell maturation time, resulting in Acute Leukemia (AL) and Chronic Leukemia (CL), and (2) the type of leukocyte affected, where the two main types of leukocytes are lymphoid and myeloid. Thus, the four main types of leukemia are Acute Lymphoblastic Leukemia (ALL), Acute Myeloid Leukemia (AML), Chronic Lymphocytic Leukemia (CLL), and Chronic Myeloid Leukemia (CML) [1]. While ALs generate abnormal cell growth in a short time and mostly affect children, CLs tend to affect adults and the elderly, because the production of mature cells progresses more slowly, taking months or even years [1]. Thus, ALL and AML require a diagnosis in the early stages of the disease to provide appropriate treatment. Figure 1 shows examples of blood slide images used in our tests with ALL, AML, and Healthy Blood Slides (HBS).
Figure 1. Examples of images used in this work: (a) ALL example [2], (b) AML example [3], and (c) HBS example [4].
The use of computer systems can assist in fast leukemia diagnosis. Currently, Convolutional Neural Networks (CNNs) are one of the most effective techniques in diagnosing medical images. However, CNNs demand a high computational cost, and in systems with low processing and storage power, this technique becomes difficult to employ [5]. Therefore, in this work, we propose CNN architecture models with residual characteristics to classify blood slides into three classes: ALL, AML, and HBS. When building the model, we evaluated the tradeoff between accuracy and the number of parameters to attain a model that takes up less memory and has performance comparable to the state of the art. We evaluated the proposed model on 16 heterogeneous datasets with 2,415 images, combined with data augmentation techniques. Additionally, we compared the achieved results with the results obtained by state-of-the-art methods, including methods based on CNNs.
This paper is organized as follows. Section II presents related works. In Section III, we present the proposed CNN models, the dataset used, the data augmentation technique employed, and the applied evaluation metrics. Sections IV and V present
the achieved results and a discussion; and finally, we present the conclusion and possibilities of future work in Section VI.
II. RELATED WORKS
We carried out a systematic survey of the state of the art related to leukemia computer-aided diagnosis. The survey aimed to identify and classify the available works in the literature based on the techniques used, the year of publication, and the relevance.
The survey was conducted using three public databases: Scopus, Web of Science and IEEE Xplore. We used the following search strings: “leukemia classification”, “white blood cell classification”, and “blood smear leukemia classification”. Following this, we selected works published after 2012 in the engineering and computer science fields. As a result, we obtained 423 articles. We then analyzed the title and abstract of these, aiming to eliminate repeated documents and those with non-automatic classification methods. Table I presents the works found in the literature, organized according to their purpose.
We organized the selected papers into four approaches using the diagnosis type suggested by the authors. We found studies that distinguished images with leukemia from healthy ones, regardless of the type of leukemia [6]. Some authors
Table I
SUMMARY OF WORKS IDENTIFIED IN THE STATE OF THE ART AS TO: YEAR, DESCRIPTOR(S), CLASSIFIER, NUMBER OF IMAGES USED AND ACCURACY.
images, where 150 were of ALL type, 150 of AML type, and 200 of HBS type. According to the authors, the tests showed promising results.
III. MATERIALS AND METHODS
The purpose of this paper is to present models of CNN architectures to diagnose acute leukemia types in blood slide images. To develop the architectural model proposed in this work, we relied on architectures that recently obtained the best results in leukemia detection, according to the studies found in the literature.
The dataset used in this research holds 2,415 images, which does not represent a large number of images for training a CNN. One solution to increase the generality of the model and to address the shortage of training samples is the Data Augmentation technique, which generates new samples for training.
A. Proposed CNN Models
Based on state-of-the-art architectures, such as AlexNet [19], CaffeNet [20], and VggNet [21], we initially studied the Acute Leukemias Recognition Network (Alert Net), a CNN for the acute leukemia classification in blood slides.
Alert Net has five convolutional layers, followed by Batch Normalization and Max Pooling layers. The shallower layers are formed by two fully connected layers, followed by a dropout operation and a softmax layer with three neurons. This model has characteristics found in the sequential architectures presented in the state of the art. However, we were searching for a trade-off between the number of parameters and accuracy. Therefore, we proposed a model with 8 million parameters, which is, for example, about seven times fewer than AlexNet.
From the initial model, we carried out an ablation study to remove or replace layers in Alert Net. Thus, we built two models using technologies implemented in some of the CNNs with the best results in the ImageNet competition, namely ResNet [22] and Xception [23]. We developed the Alert Net with a Residual Layer (Alert Net-R) and the Alert Net with Depthwise Separable Convolution Layers (Alert Net-X). Figure 2 shows the three models used and illustrates the layer types in each one.
Figure 2. Topologies of the proposed models. Layer types in the legend: Convolutional Layer, Depth-wise Separable Convolution, BatchNormalization, Fully Connected Layer, MaxPooling, Dropout.
In the Alert Net-R development, we inserted a residual structure similar to ResNet. Initially, we use max pooling after the input layer to resize the original image. The result of this operation is concatenated with the max pooling output of the second convolutional layer. Therefore, the image to be concatenated does not undergo modifications in the initial convolutional layers. We observed that the residue generated tends to propagate the image’s essential characteristics during the training. Studies presented in the literature demonstrate the efficiency of this approach [22].
To develop a model that enables less computationally expensive training, we studied Alert Net-X. The Depth-Wise
Separable Convolution layers were introduced in the Xception architecture and provide greater computational efficiency since the number of operations performed during convolution is reduced. That is, they have less complexity and require less training time than regular convolutional layers.
B. Image Dataset
The development of a robust methodology to aid in the diagnosis depends on the data used in its validation. The main challenge found in the state of the art is related to the acquisition of the datasets, since most of them are private. However, we obtained 16 public datasets with 2,415 images for the evaluation of the proposed model. Table II presents the image datasets used, according to the addressed classes.
Table II
SUMMARY OF THE USED IMAGE DATASETS.
Dataset             HBS    ALL    AML    Other types    Total    Ref.
ALL-IDB 1            59     49      -              -      108    [2]
ALL-IDB 1 (Crop)      -    510      -              -      510    [2]
ALL-IDB 2           130    130      -              -      260    [2]
Leukocytes          149      -      -              -      149    [24]
CellaVision         109      -      -              -      109    [25]
Atlas                 -     25     40             23       88    -
Omid et al. 2014    154      -      -              -      154    [4]
Omid et al. 2015      -      -     27              -       27    [26]
ASH-OK                -      -     96              -       96    [3]
Bloodline             -      -    217              -      217    [27]
ONKODIN               -      -     78              -       78    [28]
CellaVision 2       100      -      -              -      100    [29]
JTSC                300      -      -              -      300    [29]
UFG                  57     10     27             27      121    -
PN-ALL Dataset        -     30      -              -       30    [30]
leukemia-images       -     40     78             22      140    -
Total of images    1058    794    563             72     2487    -
Among the images listed in Table II, we disregarded those of the “Other types” class, since the number of images in this class does not form a sufficiently representative set. Thus, we used the HBS, ALL, and AML classes in the building of the proposed model. One can note that these classes were built using different datasets, contributing to the creation of a complex set with different resolutions, dyes, approximations, and contrast. Such an approach is similar to the one used to obtain microscopic images in daily medical practices [14].
For the proper use of the images under study, we carried out two pre-processing operations. The first was a central crop based on the smaller side of the image, since CNN architectures require square inputs. The second operation was to resize the input images to 224 × 224 pixels because these are standard CNN input dimensions.
In the Bloodline dataset, we observed the existence of 15 rectangular images containing at least two leukocytes. In these images, the clipping of the leukocytes was done manually, since the pre-processing operations would eliminate the region of interest for the classification. The clipped images were added to the dataset, resulting in a total of 217 samples.
C. Data Augmentation
Deep neural networks have been successfully applied to Computer Vision tasks such as image classification, object detection, and image segmentation, thanks to the evolution of CNNs. However, these networks rely on a large amount of data to avoid overfitting [31].
Improving the generalization of these models is one of the main challenges in the area, but Data Augmentation is a powerful way to overcome this difficulty. Augmented data is expected to represent a more extensive dataset, minimizing the differences between the training and validation sets as well as any future test sets [31].
Usual augmentation operations are rotation in the range of 0º to 40º; vertical and horizontal translation, shear, and zoom in the range of 0 to 0.2; as well as horizontal and vertical flip. Note that the nuclei images do not exhibit asymmetry, which allows flipping in both directions. The reflection fill operation was applied to replace black pixels resulting from rotation and translation techniques. Finally, we normalized the input image pixels to
values between 0 (zero) and 1 (one). The augmentation resulted in a dataset 20 times larger than the original.
D. Evaluation Metrics
To analyze the classification results, we computed the confusion matrix. Then, from the elements of this matrix, we calculated the precision (P), recall (R), and accuracy (A).
We also computed the kappa index (k), which is recommended as an appropriate exactitude measure as it can adequately represent the confusion matrix; it takes all elements of the matrix into account, rather than just those on the main diagonal, which occurs when calculating the global classification accuracy [57]. This metric can be calculated as:
$$ k = \frac{\text{observed} - \text{expected}}{1 - \text{expected}}. \tag{1} $$
According to Landis and Koch [32], k assumes values between 0 (zero) and 1 (one). The result is qualified according to the k value as follows: k ≤ 0.2: Bad; 0.2 < k ≤ 0.4: Fair; 0.4 < k ≤ 0.6: Good; 0.6 < k ≤ 0.8: Very Good; and k > 0.8: Excellent.
The cost function metric (loss) was also used in this work. This function measures how far the prediction is from the ideal one and, therefore, quantifies the “cost” or “loss” of accepting the prediction generated by the current parameters of the model [33].
IV. RESULTS
regularization technique, and its use reduces the effective capacity of the model. To deal with this, it would be necessary to increase the model size because, typically, the error in the validation dataset is much smaller when using dropout, but at the cost of larger models and more training iterations. When the training dataset is small, the use of dropout becomes less effective.
V. DISCUSSION
We compared the results of Alert Net-RWD with those obtained by works in the literature that address the same problem. Table IV shows that the works in the literature reported higher accuracy values than Alert Net-RWD. However, the number of images used in those studies is at least four times smaller than the number of images used in this work. Another critical point is that only the current research used more than one image dataset. This characteristic leads to a greater diversity in the training data, which yields a method that is robust to different input image types.
Table IV
COMPARISON OF THE RESULTS OBTAINED BY THE PROPOSED METHOD AGAINST THE ONES OBTAINED BY RELATED METHODS.
Method                          Descriptors                                      N. of images    Accuracy
Rawat et al. [17]               Geometrical, color and texture                            420      99.50%
Laosai and Chamnongthai [18]    Shape, color, texture and number of nucleoli              500      99.85%
Proposed method                 Deep features                                            2415      97.18%
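As an illustration of the evaluation metrics described in Section III-D, the following minimal Python sketch derives the accuracy, precision, recall, and kappa index of Eq. (1) from a confusion matrix, and qualifies k with the thresholds quoted from Landis and Koch [32]. It is a sketch under stated assumptions, not the code used in this work: the function names `classification_metrics` and `qualify_kappa` are hypothetical, precision and recall are macro-averaged over the classes (the averaging scheme is not stated in the text), and the example matrix is made up for illustration rather than taken from the experiments.

```python
from typing import Dict, List


def classification_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Accuracy, macro precision/recall and kappa from a confusion matrix.

    cm[i][j] counts the samples of true class i predicted as class j.
    Macro averaging is an assumption of this sketch.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    row_sums = [sum(row) for row in cm]  # samples per true class
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # per predicted class

    # Observed agreement: fraction of samples on the main diagonal (accuracy).
    observed = sum(cm[i][i] for i in range(n)) / total
    # Expected chance agreement, computed from the row and column marginals.
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / (total * total)

    precision = sum(cm[i][i] / col_sums[i] if col_sums[i] else 0.0 for i in range(n)) / n
    recall = sum(cm[i][i] / row_sums[i] if row_sums[i] else 0.0 for i in range(n)) / n

    # Eq. (1); kappa is undefined when expected == 1, so 0.0 is a placeholder here.
    kappa = (observed - expected) / (1.0 - expected) if expected != 1.0 else 0.0
    return {"accuracy": observed, "precision": precision, "recall": recall, "kappa": kappa}


def qualify_kappa(k: float) -> str:
    """Map k to the qualitative scale quoted in Section III-D."""
    if k <= 0.2:
        return "Bad"
    if k <= 0.4:
        return "Fair"
    if k <= 0.6:
        return "Good"
    if k <= 0.8:
        return "Very Good"
    return "Excellent"


if __name__ == "__main__":
    # Illustrative 3-class matrix (rows: true ALL, AML, HBS); not data from this work.
    example = [[50, 0, 0], [0, 40, 10], [0, 0, 50]]
    metrics = classification_metrics(example)
    print(metrics, qualify_kappa(metrics["kappa"]))
```

Because kappa subtracts the agreement expected by chance from the observed agreement, it is lower than the accuracy whenever some agreement could arise from the class marginals alone, which is why it complements the accuracy reported in Table IV.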
belong to a different domain when compared to blood smear images. In these situations, it is better to apply Deeply Fine-Tuning (DFT). The DFT approach allows training the entire network. However, it requires a higher computational cost and a more considerable amount of data. Table VI presents the results obtained by applying DFT with the CNNs found in the literature.
Table VI
RESULTS OBTAINED WITH DEEPLY FINE TUNING.
Comparing Tables V and VI, one can see that the use of DFT resulted in a substantial performance gain of the pre-trained CNNs. With DFT, ResNet50 outperformed Alert Net-RWD, with the best results in terms of accuracy (97.80%), precision (97.81%), recall (97.80%), and kappa (0.9660).
We performed a statistical evaluation using Student's t-test with a significance level of 5% and found that the results achieved by Alert Net-RWD and ResNet50 are statistically equivalent. Therefore, one can conclude that Alert Net-RWD, although less complex (it has about one-third of the parameters relative to other CNNs), achieves results comparable to pre-trained architectures. Analyzing the file size generated from the training, we observed that Alert Net-RWD is more attractive for use on mobile devices. These devices can have a crucial role in the disease diagnosis in isolated regions, for example.
Figure 3 shows examples of the Alert Net-RWD activation maps for the three classes. It is possible to identify which regions are used to differentiate healthy images from those with acute leukemia (lymphoid or myeloid).
Figure 3. Examples of activation maps for blood slides: (a) images with one leukocyte, (b) images with various leukocytes. The first column shows images of the ALL class, the second column images of the AML class, and the third column images of the HBS class.
The number of leukocytes may vary depending on the input image. We see in Figure 3 that Alert Net-RWD generates different activation map patterns for each of these situations. Also, since it is trained on different databases, the proposed model can adapt to different characteristics, and it is possible to observe that the maps differ from one database to another. However, the CNN activates different regions for each class. For example, in Figure 3(a), for the ALL class, the leukocyte turned predominantly blue. This pattern changes in the other classes, becoming mostly red in the HBS class.
VI. CONCLUSION
The conducted systematic survey showed that many researchers have focused their efforts on the field of Computer-Aided Diagnosis systems, which includes the automatic diagnosis of leukemia.
In this work, we developed Alert Net-RWD, a CNN model for the automated diagnosis of acute lymphoid and acute myeloid leukemia. The achieved results are promising, and the number of parameters used in Alert Net-RWD is lower than that of other architectures found in the literature. Also, the proposed model has a smaller file size, which makes it more attractive to be used in applications on mobile devices.
The bibliographical survey also showed that authors usually use only a single private dataset to evaluate their proposals. However, this situation does not represent a real environment for the application of this type of system. Therefore, we evaluated our approach on a highly heterogeneous image dataset.
In future work, the proposed model needs to be applied to a larger number of images. Moreover, in addition to the differentiation of the three classes proposed in this work, a distinction will also be made between the images that have Chronic Lymphocytic Leukemia and Chronic Myeloid Leukemia.
VII. ACKNOWLEDGEMENTS
This study was partially funded by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001 and Fundação de Amparo à Pesquisa do Piaui (FAPEPI). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used in this research.
REFERENCES
[1] G. S. Travlos, “Normal structure, function, and histology of the bone marrow,” Toxicologic Pathology, vol. 34, no. 5, pp. 548–565, 2006.