DermaKNet: Incorporating the Knowledge of Dermatologists to Convolutional Neural Networks for Skin Lesion Diagnosis

Iván González-Díaz
Abstract—Traditional approaches to automatic diagnosis of skin lesions consisted of classifiers working on sets of hand-crafted features, some of which modeled lesion aspects of special importance for dermatologists. Recently, the broad adoption of convolutional neural networks (CNNs) in most computer vision tasks has brought about a great leap forward in terms of performance. Nevertheless, with this performance leap, CNN-based computer-aided diagnosis (CAD) systems have also brought a notable reduction of the useful insights provided by hand-crafted features. This paper presents DermaKNet, a CAD system based on CNNs that incorporates specific subsystems modeling properties of skin lesions that are of special interest to dermatologists, aiming to improve the interpretability of its diagnosis. Our results prove that the incorporation of these subsystems not only improves the performance, but also enhances the diagnosis by providing more interpretable outputs.

Index Terms—Skin lesion analysis, melanoma, convolutional neural networks, dermoscopy, CAD.

Manuscript received October 13, 2017; revised December 5, 2017 and January 15, 2018; accepted February 12, 2018. Date of publication February 16, 2018; date of current version March 6, 2019. This work was supported in part by the National Grant TEC2014-53390-P and National Grant TEC2014-61729-EXP of the Spanish Ministry of Economy and Competitiveness, and in part by NVIDIA Corporation with the donation of the TITAN X GPU. The author is with the Department of Signal Theory and Communications, Universidad Carlos III de Madrid, Madrid 28045, Spain (e-mail: [email protected]). Digital Object Identifier 10.1109/JBHI.2018.2806962

I. INTRODUCTION

Early Melanoma Diagnosis is one of the traditional fields of application of Computer Aided Diagnosis (CAD) systems. In addition to the high incidence and aggressiveness of melanoma (it is the skin cancer that causes the most deaths in Europe [1]), there are other aspects that make it an especially suitable field for automatic diagnosis methods. For example, the early removal of the lesion completely cures the disease, effectively preventing metastasis [2]. Melanocytes are among the very few cells that are naturally colored and visible to the eye, which makes it possible to diagnose them using clinical images. Also, the use of portable and affordable acquisition instruments, such as dermatoscopes, can improve the accuracy of the diagnosis by 5–30% [3]. As a result, there is a growing interest in incorporating automatic systems into the daily practice of dermatologists, aiming not to replace their diagnosis, but to improve it by providing valuable information about the clinical case, and serving as filtering tools that automatically detect those cases with a high confidence of benignity, which can have a great impact on the final number of moles that must be analyzed by the clinicians.

However, despite the research efforts devoted to the topic, these systems have yet to become part of everyday clinical practice. From our point of view, there are two factors currently hampering the adoption of CAD systems by dermatologists. Firstly, the lack of large, open, annotated datasets, containing images of lesions gathered by different medical institutions and a great variety of dermatoscopes, has undermined the generalization capability of the developed CAD systems, leading to poor results when they are applied to different datasets. Additionally, it has prevented standard and fair comparisons between proposed methods, thus hindering scientific advances in the field. Secondly, most CAD systems simply provide a tentative diagnosis to the clinicians, which does not actually help them much in practice. Hence, it would be more desirable for these systems to be able to provide some insight about the elements and properties of the lesion that support the diagnosis.

In regard to the first factor, the International Skin Imaging Collaboration: Melanoma Project (ISIC¹) is an academia-industry partnership created to facilitate digital skin imaging technologies that help reduce melanoma mortality. In addition to developing standards to address the technologies, techniques, and terminology used in skin imaging, ISIC is continuously building an open-source, public-access archive (ISIC Archive²) of skin images that allows researchers to assess and validate their CAD systems. The archive is large and includes images acquired using different devices from multiple medical institutions. Furthermore, since 2016, the association has also been promoting research in the field by organizing an international challenge in which automatic methods for lesion segmentation, dermoscopic feature detection and skin disease diagnosis are evaluated using images of the archive [4], [5].

¹ http://isdis.net/isic-project/
² https://isic-archive.com/

With respect to the second factor, in the last few years there have been changes in machine learning technology that have increased the difficulty of interpreting the results of CAD systems. Whereas traditional approaches relied on low-level handcrafted features computed over the lesion [6], [7],
some of them modeling aspects of special importance for dermatologists in their diagnosis [8], [9], modern approaches, such as [10] and [11], have adopted the use of Convolutional Neural Networks (CNNs), due to their impressive performance in many computer vision tasks such as classification [12], [13], detection [14] and segmentation [15], [16]. The drawback of CNN-based systems is the lack of a clear understanding of the underlying factors and properties that support the final decision.

This paper presents DermaKNet (Dermatologist Knowledge Network), a CAD system for automatic diagnosis of skin lesions that aims to keep the best of both alternatives and, consequently, incorporates the intuitions of dermatologists into a CNN-based framework. By developing novel computational blocks in the net, we model properties of the lesions that are known to be discriminative for clinicians. The benefit of our approach is twofold: firstly, as we will demonstrate in the experimental section, the performance of the diagnosis is improved and, secondly, the interpretability of the system can be enhanced by analyzing the outputs of these expert-inspired blocks. In particular, our system includes several elements, not found in general-purpose classification CNNs, that become the main contributions of this work:

- A Dermoscopic Structure Segmentation Network, which segments the lesion area into a set of high-level dermoscopic features corresponding to global and local structures that have turned out to be of special interest for dermatologists in their diagnosis. In the absence of strongly annotated data, we have trained this network from weakly-annotated clinical cases.
- A novel Modulation Block that incorporates these segmentations into the diagnosis process as probabilistic modulators of neuron activations.
- Two additional novel blocks, Polar Pooling and Asymmetry, that mimic the way in which dermatologists analyze skin lesions.
- A 3-branch top layer in the diagnosis CNN that provides the final diagnosis using both the traditional information channels and these novel pathways modeling expert intuitions.
- Some other elements with a great impact on the final system performance, such as a specifically tailored data augmentation process, or an external classifier based on non-visual meta-data.

The remainder of this paper is organized as follows: Section II reviews the related literature. In Section III we provide a general description of our method for the automatic diagnosis of skin lesions. Sections IV and V present our Dermoscopic Structure Segmentation and Diagnosis Networks, respectively. Section VI explains the experiments and discusses the results that support our method and, finally, Section VII summarizes our conclusions and outlines future lines of research.

II. RELATED WORK

Traditional approaches address the problem of automatic melanoma diagnosis using discriminative methods working over sets of hand-crafted visual features from dermoscopic images. These features vary from general-purpose descriptors, e.g., color and texture filter-banks [17]–[19], to problem-dependent knowledge-based features. The latter deserve more interest from our point of view, since they aim to model particular lesion aspects of special importance for dermatologists. Consequently, besides improving the system performance, they also enhance the interpretability of the automatic diagnosis [20]. In [8], the authors proposed a reduced set of interpretable features modeling some properties of the ABCD rule [21], such as symmetry and border sharpness. Along the same lines, some other methods start by detecting a set of dermoscopic structures that are later used to generate the diagnosis. Examples of these dermoscopic features include reticular patterns, dots and globules, streaks, etc. The complete set of structures that are commonly considered was defined in the pattern analysis method for melanoma diagnosis [3], which has been widely adopted by specialists due to its accurate results.

To tackle the problem of detecting dermoscopic features, classical segmentation techniques, such as Gaussian Mixture Models [22], Markov Random Fields [23] and Topic Models [9], or even discriminative approaches working over textons [24], have been adopted in the literature. Once the areas corresponding to some of these structures have been identified, a diagnosis can be inferred: in [25] the ABCD rule is combined with structure recognition in an attempt to detect suspicious lesions, in [26] the 7-point checklist method is applied to the outputs of these structure detectors, and in [9] probabilistic segmentation maps are used to build a set of specific classifiers, each one focusing on a particular structure, which are then fused to provide the final diagnosis.

During the last few years, with the advent and broad adoption of CNNs in many recognition problems in computer vision, several works have been proposed that apply this paradigm to melanoma classification. In [10] CNNs are combined with sparse coding and SVMs to provide a diagnosis. In [27] a Fully Convolutional Neural Network (FCNN) is first used to segment the input image into lesion area and surrounding skin; then a square and tight cropping is performed, and finally a diagnosis is provided using a CNN that is fine-tuned from the well-known resnet model [13]. In [28], the authors trained a CNN using a very large dataset with 129,450 clinical images and 2,032 different diseases, and tested its performance against 21 board-certified dermatologists on biopsy-proven clinical images with two critical binary classification use cases: malignant carcinomas versus benign seborrheic keratoses, and malignant melanomas versus benign nevi. Their results show that the automatic system achieves performance similar to that of all tested experts across both tasks, demonstrating a level of competence comparable to dermatologists. However, despite their impressive performance when enough training data is available, CNN-based methods still lack a clear understanding of the underlying factors and properties that support their final decision, limiting their usability and preventing their broad adoption by dermatologists.

In this paper, we propose to incorporate knowledge-based interpretable properties of skin lesions into the framework of CNNs. Although Majtner et al. [29] have previously tried to
Fig. 1. Main processing pipeline of DermaKNet. Each clinical case is defined by an image $X_c$. The Lesion Segmentation Net firstly segments the image into areas corresponding to lesion and surrounding skin, giving rise to the binary masks $M_c$. Then, the Data Augmentation Module extends the initial visual support of the lesion and generates additional views $\tilde{X}^v_c$ of the lesion by applying rotations and crops. Next, the Dermoscopic Structure Segmentation Network segments each lesion view into a set of high-level dermoscopic structures $s$. Finally, the whole set of lesion images $\tilde{X}^v_c$ and their corresponding segmentation maps $S^{v}_{c,s}$ are passed to the Diagnosis Network, which generates the diagnosis.
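To make the data flow of Fig. 1 concrete, the following minimal Python sketch strings the stages together. All callables (`lesion_segmentation_net`, `augment_views`, `dssn`, `diagnosis_net`, `metadata_svm`) are hypothetical placeholders for the modules described in this paper, not the released implementation.

```python
import numpy as np

def dermaknet_inference(image, metadata, lesion_segmentation_net,
                        augment_views, dssn, diagnosis_net, metadata_svm):
    """Hypothetical end-to-end pass mirroring Fig. 1 (all names are placeholders)."""
    # 1) Lesion vs. surrounding-skin segmentation -> binary mask M_c.
    mask = lesion_segmentation_net(image)

    # 2) Data augmentation: rotations and crops produce several views of the lesion.
    views = augment_views(image, mask)

    # 3) Dermoscopic structure segmentation for every view (DSSN).
    structure_maps = [dssn(v) for v in views]

    # 4) Per-view diagnosis, fused over views as a product of per-view outputs.
    per_view = np.stack([diagnosis_net(v, s) for v, s in zip(views, structure_maps)])
    visual_probs = per_view.prod(axis=0)

    # 5) Factorize with the probabilistic output of the meta-data SVM.
    fused = visual_probs * metadata_svm(metadata)
    return fused / fused.sum()   # renormalize to a probability distribution
```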
As shown in Fig. 1, a probabilistic output of this SVM is then factorized with the output of the Diagnosis Network to provide the final diagnosis $Y_c$ of the system.

IV. DERMOSCOPIC STRUCTURE SEGMENTATION NETWORK (DSSN)

The goal of the Dermoscopic Structure Segmentation Network is the following: given an input view of the lesion $\tilde{X}^v_c$, it aims to provide a segmentation considering a pre-defined set of dermoscopic features that correspond with global and local structures of special interest for dermatologists in their diagnosis.

A. Considered Dermoscopic Structures

In this work we have considered a set of eight structures:

1. Dots, globules and cobblestone pattern [32, pp. 15–17]: although different, they have been fused into one category for the purpose of the system development due to their visual similarities. These patterns consist of a certain number of round or oval elements, variously sized, with shades that can be brown and gray-black. In the case of cobblestone structures, they are usually larger, more densely grouped and somewhat angulated. In general, they are often located in lesion areas that are growing. While an even spatial distribution with regular size and shape is associated with benignity, various sizes and shapes, or an irregular or localized distribution, usually occur in melanoma. Depending on their relative extent in the lesion area, these features can be identified either as local structures or as global patterns.

2. Reticular pattern and pigmented networks [32, pp. 10–13]: they cover most parts of certain lesions. They look like grids of thin brown lines over a light brown background and are quite common in melanocytic lesions. If globally distributed, this structure is related to benign lesions. However, variations in size and form are indicative of malignancy. Depending on their relative extent in the lesion area, these features can be identified either as local structures or as global patterns.

3. Homogeneous areas [32, pp. 14–15]: these areas are diffuse, with brown, grey-black, grey-blue or reddish-black shade, where there is no other local feature that can be recognized. A globally distributed pattern of bluish hue is the hallmark of the blue nevus. With other shades, it may be present in several types of lesions, such as Clark nevi, dermal nevi or nodular and metastatic melanomas. Depending on their relative extent in the lesion area, these features can be identified either as local structures or as global patterns.

4. Regression [32, pp. 20–21]: these structures are generally well-defined white and/or blue areas that appear when the immune system has attacked the lesion. White areas resemble a superficial scar, and blue areas may appear as diffuse blue-gray areas or peppering, which is an aggregation of blue-grey dots. Regression areas are always considered local structures.

5. Blue-white veil [32, pp. 22–23]: a region of grey-blue to whitish-blue blurred pigmentation, correlated with pigmented network disorder (globules or streaks) and highly indicative of melanoma. This structure is always considered local in our annotations.

6. Streaks [32, pp. 17–18]: black or light to dark brown longish structures of variable thickness, not clearly combined with pigmented networks, and easily observed when located at the periphery of the lesion. In general, they tend to converge to the center of the lesion. An even, radial distribution of the streaks around the border of the lesion is characteristic of Reed nevus. However, an asymmetric or localized distribution of streaks suggests malignancy. This structure is always local, and spatially localized on the lesion borders.

7. Vascular structures [32, p. 23]: they are homogeneous areas with vessels. Depending on their shape, they may be a clear sign of malignancy. While abundant and prominent comma vessels often exist in dermal nevi, some other vascular patterns, such as arborizing, hairpin or linear irregular ones, are more frequent in melanomas. These structures are always considered local in our annotations.

8. Unspecific pattern: we group in this category those parts of the lesion that cannot be assigned to any of the previous structures. No direct diagnostic implication can be inferred from it. Nevertheless, it is more often related to melanoma, or at least it suggests that the lesion must be carefully explored. Depending on its relative extent in the lesion area, this feature can be identified either as local or as global in our annotations.

B. A Weak Learning Approach for Segmentation

The main challenge in developing the DSSN is the annotation of the training dataset. A traditional supervised approach would require providing a ground-truth pixel-wise segmentation for each training image. This kind of strong annotation is often hard to obtain, as it demands a huge effort from the dermatologists to manually outline the segmentations of the structures. Alternatively, providing weak image-level labels indicating only which dermoscopic structures are present in a lesion is much easier for dermatologists and becomes more affordable. Hence, following this alternative approach, we asked dermatologists of a collaborating medical institution, the Hospital Doce de Octubre in Madrid, to annotate the ISIC 2016 training dataset [4] with the presence or absence of the 8 aforementioned dermoscopic structures. In particular, we asked them to provide one label $L(s)$ per structure $s$ and clinical case: $L(s) = 0$ if the structure is not present, $L(s) = 1$ if it takes up just a local area of the lesion (local structure), and $L(s) = 2$ if it is present and dominant enough to be considered a global pattern in the lesion.

Given this weakly-annotated dataset, we have developed a segmentation network based on the method described in [33], where the authors introduced a Constrained Convolutional Neural Network for weakly supervised segmentation. For the sake of completeness, we include here some equations of the original model that accommodate the extensions and modifications for our particular scenario. For an in-depth discussion and derivation of these equations, the interested reader is referred to the original paper [33].
To keep the notation simple, we omit the image index in the following paragraphs. Let us consider the dermoscopic structure segmentation as a pixel-wise labeling problem in which each pixel $i$ in the lesion area is labeled as belonging to a particular structure $s_i$, $s = 1 \ldots P = 8$, or to a background class ($s = 0$). Passing the input image through the segmentation CNN produces a spatially reduced score map $f_i(s_i; \theta)$ ($64 \times 64$ in our case) at its top layer, where $\theta$ represents the set of parameters of the CNN. Applying a parametric softmax over the network scores, we can model the label of each pixel location $i$ as a probabilistic random variable with value $q_i(s_i|\theta)$:

$$q_i(s_i|\theta) = \frac{1}{Z_i} \exp\left(\gamma f_i(s_i|\theta)\right) \tag{1}$$

Here $s_i$ is the random variable that represents the label (dermoscopic structure) at location $i$, and $Z_i = \sum_{s=0}^{P=8} \exp\left(\gamma f_i(s|\theta)\right)$ is the partition function at location $i$. The utility and appropriateness of the parameter $\gamma$, which was not included in the original model, will be discussed later on.

At this point, we have introduced another modification to the original model. Our problem is highly unbalanced: whereas structures such as dots/globules or reticular patterns are very common, others, like blue-white veil or regression patterns, are less frequent. Moreover, the frequency of a structure does not correspond with its impact on the diagnosis, and less frequent patterns are in general more indicative of malignant lesions. We have observed that learning the model directly from the data leads to solutions that focus more on the correct segmentation of the most frequent structures. To compensate for this imbalance, we introduce a weighting factor $w$, computed for each image as:

$$w = \frac{1}{P} \sum_{s=1}^{P} \left( \mathbb{1}[L(s) > 0]\, p^{-}(s) + \mathbb{1}[L(s) = 0]\, p^{+}(s) \right) \tag{3}$$

where $p^{-}(s) = 1 - p^{+}(s)$, and $\mathbb{1}[\cdot]$ is an indicator function which is evaluated only when the inner condition is satisfied.

Given the probability distribution of an image stated in (2), the constrained CNN optimization for weakly-supervised segmentation proposed in [33] is:

$$\text{find } \theta \quad \text{subject to } A\vec{Q} \ge \vec{b} \tag{4}$$

where $\vec{Q}$ is the vectorized form of the network output $Q(S|\theta)$, and $A \in \mathbb{R}^{K \times PN}$ and $\vec{b} \in \mathbb{R}^{K}$ define $K$ linear constraints over the output distribution $Q$. Since this problem is not convex with respect to the network parameters $\theta$, the authors defined a variational latent probability distribution $P(S)$ over the semantic labels, which is independent of the CNN parameters $\theta$, applied the constraints to this new distribution rather than to the original network output $Q(S|\theta)$, and enforced $P(S)$ and $Q(S|\theta)$ to model the same probability distribution by minimizing the Kullback–Leibler divergence between them. The resulting formulation becomes a Lagrangian optimization problem and gives rise to the following update equation:

$$p_i(s_i) = \frac{1}{Z_i} \exp\left(\gamma f_i(s_i; \theta) + A^{T}_{i,s_i} \lambda\right) \tag{5}$$

where $\lambda \ge 0$ are the dual variables introduced in the optimization, and $Z_i = \sum_{s} \exp\left(\gamma f_i(s; \theta) + A^{T}_{i,s} \lambda\right)$ is the local partition function at location $i$. Additionally, the loss to be minimized by the optimization becomes:

$$L(\theta) = -\sum_{i} \sum_{s_i} w_i\, p_i(s_i) \log q_i(s_i|\theta) \tag{6}$$

The weak image-level labels enter the problem as linear constraints over the latent distribution; for instance, (9) imposes a lower bound $l_s$ on the total probability mass assigned to a structure $s$ annotated as present:

$$l_s \le \sum_{i=1}^{N} p_i(s) \tag{9}$$
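To illustrate how (1), (3), (5) and (6) interact, the following NumPy sketch computes the parametric softmax, the latent distribution biased by the dual variables, the per-image weight, and the weighted cross-entropy between the two distributions. The constraint tensor layout and the simple dual-ascent step are assumptions in the spirit of [33], not the released training code, and the presence probabilities `p_plus` are taken as given.

```python
import numpy as np

def parametric_softmax(f, gamma):
    """Eq. (1): pixel-wise softmax with temperature gamma.
    f: (N, L) scores over N pixel locations and L = P + 1 labels."""
    z = gamma * f
    z -= z.max(axis=1, keepdims=True)                 # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def latent_distribution(f, gamma, A, lam):
    """Eq. (5): latent p_i(s) biased by the duals lambda >= 0.
    A: (K, N, L) linear-constraint tensor, lam: (K,)."""
    z = gamma * f + np.tensordot(lam, A, axes=1)      # adds A^T_{i,s} lambda
    z -= z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def image_weight(labels, p_plus):
    """Eq. (3): per-image weight from weak labels L(s) in {0, 1, 2} and
    presence probabilities p_plus(s) (their definition is omitted here)."""
    labels, p_plus = np.asarray(labels), np.asarray(p_plus)
    return np.mean(np.where(labels > 0, 1.0 - p_plus, p_plus))

def weighted_loss(q, p, w):
    """Eq. (6): weighted cross-entropy between the latent p and the network q."""
    return -w * np.sum(p * np.log(q + 1e-12))

def dual_ascent_step(p, A, b, lam, step=0.1):
    """One projected (sub)gradient step on the duals for constraints A.Q >= b,
    e.g. the lower bound of (9); this update rule is an assumption."""
    violation = b - np.tensordot(A, p, axes=([1, 2], [0, 1]))   # shape (K,)
    return np.maximum(0.0, lam + step * violation)
```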
Fig. 6. Processing pipeline of the Diagnosis Network. The outputs of layer res5c of the original resnet-50 are modulated by the segmentation
maps coming from DSSN, providing an extended set of channels. These channels, after batch normalization and ReLU activation function, are
passed through a 3-branch processing pipeline that analyzes the presence of visual patterns, their spatial location, and the asymmetry of the lesion,
respectively, to generate the diagnosis.
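Read together with Fig. 6, the top of the Diagnosis Network can be summarized by the schematic Python sketch below. The callables `modulate`, `polar_pool` and `asymmetry` stand for the custom blocks detailed in Sections V-B to V-D, the weight matrices and function names are illustrative assumptions, and batch normalization is omitted for brevity.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def diagnosis_head(x, s, W1, W2, W3, modulate, polar_pool, asymmetry):
    """Top layers of the DN for one lesion view (names are illustrative).
    x: res5c features (M, N, O); s: DSSN maps (M, N, P); W1/W2/W3: FC weights."""
    y = modulate(x, s)                     # (M, N, O*(P+1)) modulated channels
    y = np.maximum(y, 0.0)                 # ReLU (batch normalization omitted)
    b1 = y.mean(axis=(0, 1)) @ W1          # branch 1: average pooling + FC1
    pol = polar_pool(y)                    # branch 2/3 input: (R, Theta, C)
    b2 = pol.reshape(-1) @ W2              # branch 2: polar pooling + FC2
    b3 = asymmetry(pol).reshape(-1) @ W3   # branch 3: asymmetry measures + FC3
    return softmax(b1 + b2 + b3)           # Sum Block + softmax -> class probs

def fuse_views(view_probs):
    """Product of the per-view outputs (view independence), renormalized."""
    y = np.prod(np.stack(view_probs), axis=0)
    return y / y.sum()
```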
In our approach, we have modified the structure of the top layers of the network, giving rise to the pipeline illustrated in Fig. 6. In the following sections, we will first introduce the structure of the top layers in the DN and, then, we will provide a detailed description of those blocks that have been specifically designed to work with dermoscopic images of skin lesions.

A. Overview of the DN

As shown in Fig. 6, we have introduced several blocks to generate a final diagnosis $Y^v_c$ for each considered view of a clinical case. Compared to the original resnet-50, we first apply a Modulation Block over the outputs of the convolutional res5c layer. This block, described in Section V-B, aims to modulate the previous outputs using the probabilistic segmentation maps provided by the DSSN. As explained below, this block multiplies the total number of channels or latent visual patterns by 9, going from 2048 to 18,432. Next, the Modulation Block is followed by Batch Normalization and non-linear ReLU (Rectified Linear Unit) activation layers. Finally, rather than just applying the Average Pooling + Fully Connected approach of the original resnet-50, we have subdivided the pipeline into three parallel processing branches:

1) Branch 1: the original pipeline, with an average pooling ($8 \times 8$ in our case) followed by a fully connected layer (FC1).
2) Branch 2: it performs an average normalized polar pooling (see Section V-C for further details) ($R \times \Theta$; $R = 3$, $\Theta = 8$), followed by a fully connected layer (FC2). This branch provides a spatially discriminant analysis of the lesion.
3) Branch 3: it follows the previous polar pooling, estimates the asymmetry of the lesion (see Section V-D for a complete description), and applies a fully connected layer (FC3) over the asymmetry measures.

The outputs of these three branches are then linearly combined using a Sum Block, and the class probabilities are computed using a softmax. Finally, in order to generate a unified final output for each clinical case $Y_c$, we consider independence between views, leading to a factorization:

$$Y_c = \prod_{v=1}^{V} Y^v_c \tag{13}$$

B. Modulation Block

The goal of the Modulation Block is to incorporate the segmentations provided by the DSSN into the diagnosis process. To do so, this block fuses the structure segmentation maps described in Section IV-B with the outputs of the previous layer in the CNN.

In particular, if the output of the previous layer is a tensor $x \in \mathbb{R}^{M \times N \times O}$, where $M \times N$ are the spatial dimensions of the output and $O$ is the number of output channels, and $s \in \mathbb{R}^{M \times N \times P}$ is a segmentation map that has been previously re-sized to match the feature map, the output of this module is an extended and modulated feature map $y \in \mathbb{R}^{M \times N \times OP}$. To compute this output, we modulate the $o$-th channel $x_o$ with the $s$-th segmentation map $s_s$, producing a modulated channel $y_k$:

$$y_k = x_o \odot s_s, \quad o = 1 \ldots O,\; s = 1 \ldots P,\; k = 1 \ldots OP \tag{14}$$

Since the segmentations computed by the DSSN are fixed, this module has no parameters to be optimized during the training phase. Hence, the backpropagation process only requires the derivative with respect to the data:

$$\frac{\partial z}{\partial x_o} = \sum_{k \in K_o} \frac{\partial z}{\partial y_k} \odot s_k \tag{15}$$

where $K_o$ corresponds to all the modulated channels $k$ generated from the channel $o$, and $s_k$ is the corresponding modulating map for that $k$.

The application of this module to our diagnosis network has been adapted as follows: it has been added to the network just after the res5c layer of the original resnet-50 [13]. Hence, we modulate $O = 2048$ channels using the probabilities of the $P = 8$ segmentation maps of local and global structures described in Section IV-A. In addition, we also concatenate the original input channels to the modulated ones, resulting in an extended set of $O(P + 1)$ channels (18,432 in our case).
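A minimal NumPy sketch of the forward and backward passes of (14) and (15) is given below; shapes are channel-last for readability and the code is an illustrative reconstruction rather than the released implementation.

```python
import numpy as np

def modulation_forward(x, s, keep_original=True):
    """Eq. (14): modulate every channel x_o with every DSSN map s_s.
    x: (M, N, O) features, s: (M, N, P) resized probability maps.
    Returns (M, N, O*(P+1)) when the original channels are concatenated."""
    M, N, O = x.shape
    P = s.shape[-1]
    y = (x[..., :, None] * s[..., None, :]).reshape(M, N, O * P)
    return np.concatenate([x, y], axis=-1) if keep_original else y

def modulation_backward(dz_dy, s, O):
    """Eq. (15): the block has no parameters, so only dz/dx is required.
    dz_dy: (M, N, O*(P+1)) gradient w.r.t. the extended output."""
    M, N, _ = dz_dy.shape
    P = s.shape[-1]
    d_orig = dz_dy[..., :O]                                  # gradient through the copied channels
    d_mod = dz_dy[..., O:].reshape(M, N, O, P)               # one slice per (o, s) pair, i.e. K_o
    return d_orig + (d_mod * s[..., None, :]).sum(axis=-1)   # sum over the modulating maps
```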
C. Polar Pooling

This block aims to perform pooling operations (average or max pooling), but instead of computing them over rectangular spatial regions, these operations are done over sectors defined in polar coordinates. Hence, for a given number of rings $R$ (with $r \in [0, 1]$) and angular sectors $\Theta$ (angles $\theta \in [0, 2\pi)$), this block transforms an input $x \in \mathbb{R}^{M \times N \times O}$ into an output $y \in \mathbb{R}^{R \times \Theta \times O}$, where $O$ is the number of channels.

Furthermore, in order to deal with lesions of irregular shape, we use the normalized polar coordinates described in Section III-C. Since, depending on the particular shape of a lesion and the size of the tensor being pooled, some combinations $(r, \theta)$ may not contain pixels within the lesion, we can also define overlaps between adjacent sectors to improve the smoothness of the outputs. Moreover, we use a non-uniform radius quantization in order to generate fixed-area rings that contain the same number of pixels in the hypothetical case of an ideally circular lesion. To that end, the $k$-th ring is defined as:

$$\frac{k-1}{R} \le r < \frac{k}{R} \tag{16}$$

for $k = 1 \ldots R$. Given the proposed normalized polar coordinate system, the equations needed to perform the forward and backward steps in the inference process do not differ from those of a regular max or average pooling block in Cartesian coordinates. Furthermore, it is worth noting that, once this block is applied and the data is converted into polar coordinates, no more convolutional layers can be applied, as the spatial relationships between contiguous values in the output matrix have been redefined (e.g., considering that columns in the data matrix refer to angles, the first and last columns are adjacent in the angular space). For that reason, in our approach, this module is followed by blocks that are either fully connected or specifically designed to work with polar coordinates (e.g., the Asymmetry block).

D. Asymmetry Block

Melanomas tend to grow differently along each direction, becoming more asymmetric than benign lesions. This is why symmetry is present in a variety of diagnosis algorithms, such as the ABCD rule of dermoscopy [21]. The symmetry rule requires finding the axis of maximum symmetry according to some criteria (e.g., shape, color), and its perpendicular. In doing so, the lesion is labeled by dermatologists either as symmetric in one or two axes, or as asymmetric.

Our asymmetry block computes metrics that evaluate the asymmetry of a lesion with respect to various axes. In particular, given a polar division of the lesion into $R \times \Theta$ sectors, we compute the asymmetry for axes aligned with the $\Theta/2$ angles in the range $[0, \pi)$. To do so, our approach folds the lesion over each angle $\theta$ and computes the accumulated squared difference between corresponding sectors. Hence, for a given input $x \in \mathbb{R}^{R \times \Theta \times O}$, this module generates an output $y \in \mathbb{R}^{1 \times \Theta \times O}$ as follows:

$$y_{\theta_k,o} = \frac{1}{R\Theta} \sum_{i=1}^{R} \sum_{j=1}^{\Theta/2} \left( x_{r_i,\theta_{k+j-1},o} - x_{r_i,\theta_{k-j},o} \right)^2 \tag{17}$$

where angular indices falling outside the valid range are wrapped around, i.e., substituted by $\Theta - j$.

During back-propagation, the gradients needed by the stochastic gradient descent algorithm are:

$$\frac{\partial z}{\partial x_{r_i,\theta_j,o}} = \frac{2}{R\Theta} \sum_{k=1}^{\Theta/2} \frac{\partial z}{\partial y_{\theta_k,o}}\, \varphi(r_i, \theta_j, \theta_k) \tag{18}$$

where:

$$\varphi(r_i, \theta_j, \theta_k) = \begin{cases} x_{r_i,\theta_j} - x_{r_i,\theta_{k-j}}, & \theta_j \in [\theta_k, \theta_k + \pi) \\ x_{r_i,\theta_{k-j}} - x_{r_i,\theta_j}, & \text{otherwise} \end{cases} \tag{19}$$
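The NumPy sketch below illustrates the normalized polar average pooling of Section V-C and the folding-based asymmetry measure of (17). The per-pixel normalized coordinates and lesion mask are assumed to come from the lesion-centred coordinate system of Section III-C (not reproduced here), and the index and wrap-around conventions are assumptions rather than the released code.

```python
import numpy as np

def polar_average_pooling(x, r, theta, inside, R=3, Theta=8):
    """Average pooling over polar sectors.
    x: (M, N, O) features; r, theta: per-pixel normalized radius in [0, 1] and
    angle in [0, 2*pi); inside: (M, N) boolean lesion mask. Returns (R, Theta, O)."""
    O = x.shape[-1]
    y = np.zeros((R, Theta, O))
    ring = np.clip((r * R).astype(int), 0, R - 1)         # eq. (16): (k-1)/R <= r < k/R
    sector = np.clip((theta / (2 * np.pi) * Theta).astype(int), 0, Theta - 1)
    for k in range(R):
        for t in range(Theta):
            sel = inside & (ring == k) & (sector == t)
            if sel.any():                                  # empty sectors stay at zero
                y[k, t] = x[sel].mean(axis=0)
    return y

def asymmetry(y):
    """Eq. (17): fold the polar map over each candidate angle and accumulate
    squared differences between mirrored sectors. y: (R, Theta, O) -> (1, Theta, O)."""
    R, Theta, O = y.shape
    out = np.zeros((1, Theta, O))
    for k in range(Theta):                                 # folding angle theta_k
        for j in range(1, Theta // 2 + 1):
            a = y[:, (k + j - 1) % Theta, :]               # x_{r, theta_{k+j-1}}
            b = y[:, (k - j) % Theta, :]                   # x_{r, theta_{k-j}}, wrapped
            out[0, k] += ((a - b) ** 2).sum(axis=0)
    return out / (R * Theta)
```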
E. Details About Learning and Evaluation Processes

In this section we provide some useful details about the learning process of the DN. As mentioned in Section V, we have taken the original resnet-50 as initialization and fine-tuned the network using our own training data. When nothing else is specified, all the new layers in the network are initialized using weights computed with Xavier's method [34].

Furthermore, due to the high degree of expressiveness of branches 2 and 3 with respect to the first branch, we have observed that training the whole system at once was prone to overfitting. Hence, instead, we have first trained a model using only the first branch, with a learning rate of $Lr = 10^{-4}$ and a weight decay of $Wd = 10^{-4}$. Once a coarse convergence is reached, we have added the other two branches, frozen all layers up to (and including) the Modulation Block, initialized the weights of branches 2 and 3 to zero, and learned the weights of the upper layers using the following learning rates (a schematic sketch of this schedule is given after the list):

- For branch 1, the original learning rate $Lr_1 = 10^{-4}$ and weight decay $Wd = 10^{-4}$.
- For branches 2 and 3, the original learning rate and weight decay are divided and multiplied, respectively, by the total number of input spatial neurons in the fully connected block. This stronger regularization and slower learning rate prevent these branches from gaining more relevance than the original one due to their expressiveness, and therefore minimize the likelihood of overfitting.

The code that implements DermaKNet is available online.³
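The two-stage schedule above can be written down roughly as follows, as a PyTorch-style sketch under assumed module names (`model.backbone`, `model.branch1/2/3`); the released implementation may differ.

```python
import torch

def build_optimizer(model, stage, n_fc2_inputs, n_fc3_inputs,
                    base_lr=1e-4, base_wd=1e-4):
    """Hypothetical helper mirroring the two-stage schedule of Section V-E."""
    if stage == 1:
        # Stage 1: train the backbone and branch 1 only (avg pooling + FC1).
        params = list(model.backbone.parameters()) + list(model.branch1.parameters())
        return torch.optim.SGD(params, lr=base_lr, weight_decay=base_wd)

    # Stage 2: freeze everything up to (and including) the Modulation Block,
    # keep branch 1 at the original hyper-parameters, and slow down / regularize
    # branches 2 and 3 by the number of input spatial neurons of their FC layers.
    for p in model.backbone.parameters():
        p.requires_grad = False
    groups = [
        {"params": model.branch1.parameters(),
         "lr": base_lr, "weight_decay": base_wd},
        {"params": model.branch2.parameters(),
         "lr": base_lr / n_fc2_inputs, "weight_decay": base_wd * n_fc2_inputs},
        {"params": model.branch3.parameters(),
         "lr": base_lr / n_fc3_inputs, "weight_decay": base_wd * n_fc3_inputs},
    ]
    return torch.optim.SGD(groups, lr=base_lr)
```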
VI. EXPERIMENTAL SECTION

A. Datasets and Experimental Setup

DermaKNet has been assessed using the official dataset of the 2017 ISBI Challenge on Skin Lesion Analysis Towards Melanoma Detection⁴ [5]. This challenge consists of three different parts: 1) Lesion Segmentation, 2) Detection and Localization of Visual Dermoscopic Features/Patterns, and 3) Disease Classification. We focus on part 3, our goal being the automatic diagnosis of dermoscopic images into three different categories: 1) Nevus: benign skin tumor, derived from melanocytes (melanocytic), 2) Melanoma: malignant skin tumor, derived from melanocytes (melanocytic), and 3) Seborrheic Keratosis: benign skin tumor, derived from keratinocytes.

⁴ https://challenge.kitware.com/#challenge/583f126bcad3a51cc66c8d9a

TABLE II
A comparison between DermaKNet and the top five performing official submissions to the 2017 ISBI Challenge on Skin Lesion Analysis. Columns: Method, Mel AUC, SK AUC, Avg AUC, Mel SP95, SK SP95, Avg SP95. AUC and SP95 are given for Melanoma (Mel) vs rest, Seborrheic Keratosis (SK) vs rest, and average (Avg).
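For reference, the specificity-at-95%-sensitivity figure (SP95) reported in Table II can be computed from a set of scores as in the short sketch below; this is a straightforward reading of the metric, not the official challenge evaluation script.

```python
import numpy as np

def specificity_at_sensitivity(y_true, scores, target_sensitivity=0.95):
    """SP95: highest specificity among operating points whose sensitivity
    (true positive rate) is still >= the target. y_true: binary labels."""
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    best_spec = 0.0
    for t in np.unique(scores)[::-1]:          # sweep thresholds, high to low
        pred = scores >= t
        sens = (pred & y_true).sum() / max(y_true.sum(), 1)
        spec = (~pred & ~y_true).sum() / max((~y_true).sum(), 1)
        if sens >= target_sensitivity:
            best_spec = max(best_spec, spec)
    return best_spec
```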
Fig. 8. Two examples of an interpretable system output generated with our approach: (a) Melanoma and (b) Seborrheic Keratosis. Each case
contains 6 figures which represent (from top to bottom and left to right): original image and diagnosis, binary mask with lesion/skin, segmentation
into dermoscopic features, automatic diagnosis, contribution of each dermoscopic feature to the final diagnosis, symmetry measures by angle.
In addition to gaining more insight on the automatic diagnosis, this information might also become the basis for other end-user applications, such as e-learning tools to help in the training of new specialists.

VII. CONCLUSION AND FURTHER WORK

In this paper we have introduced DermaKNet, a CAD system for the diagnosis of skin lesions that is composed of several CNNs, each one devoted to a specific task: lesion-skin segmentation, detection of dermoscopic features, and global lesion diagnosis. Our goal throughout the whole system is to incorporate the expert knowledge provided by dermatologists into the decision process, overcoming the traditional limitation of deep learning regarding the lack of interpretability of the results. In order to achieve a seamless integration between CNNs and this expert information, we have developed several novel processing blocks.

We have assessed our system on the challenging dataset used in the 2017 ISBI Challenge on Skin Lesion Analysis Towards Melanoma Detection, in the task of automatic diagnosis of melanoma and seborrheic keratosis. Our results prove that modeling expert-based information enhances the system performance and achieves very competitive results. In particular, the last version of our model ranks first in the Seborrheic Keratosis category and in average AUC, and is very competitive in melanoma. Furthermore, our results in Specificity at 95% Sensitivity are clearly better than those of the rest of the approaches, which makes our system very suitable as an automatic filtering module reducing the workload of dermatologists.

In addition to this gain in performance, we have also shown that we can produce a more interpretable diagnosis on top of our system. Looking at the outputs of those intermediate blocks modeling intuitions from dermatologists, we can get more insight about which dermoscopic features are influencing the diagnosis, the lesion symmetry, and even the spatial locations that support a certain diagnosis.

The main lines of further research comprise the design of new blocks implementing other aspects of the lesions that are of interest for dermatologists, the development of segmentation methods that account for other useful dermoscopic features, and the exploration of novel ways of incorporating the dermoscopic structure segmentations into the diagnosis process. With respect to the latter, we will consider multi-task losses [42], which allow for sharing processing layers in both tasks and fusing the segmentation and diagnosis networks into end-to-end trainable architectures.

ACKNOWLEDGMENT

The author would like to thank the dermatologists of Hospital 12 de Octubre of Madrid for their invaluable help annotating the data contents with the weak labels of structural patterns.

REFERENCES

[1] J. Ferlay et al., "Cancer incidence and mortality patterns in Europe: Estimates for 40 countries in 2012," Eur. J. Cancer, vol. 49, no. 6, pp. 1374–1403, 2013. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0959804913000075
[2] M. A. Weinstock, "Cutaneous melanoma: Public health approach to early detection," Dermatologic Therapy, vol. 19, no. 1, pp. 26–31, 2006. [Online]. Available: http://dx.doi.org/10.1111/j.1529-8019.2005.00053.x
[3] H. Pehamberger, A. Steiner, and K. Wolff, "In vivo epiluminescence microscopy of pigmented skin lesions. I. Pattern analysis of pigmented skin lesions," J. Amer. Acad. Dermatology, vol. 17, no. 4, pp. 571–583, 1987. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0190962287702394
[4] D. Gutman et al., "Skin lesion analysis toward melanoma detection: A challenge at the international symposium on biomedical imaging (ISBI) 2016, hosted by the international skin imaging collaboration (ISIC)," arXiv:1605.01397, 2016.
[5] N. C. F. Codella et al., "Skin lesion analysis toward melanoma detection: A challenge at the 2017 international symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (ISIC)," arXiv:1710.05006, 2017. [Online]. Available: http://arxiv.org/abs/1710.05006
[6] A. Madooei, M. S. Drew, M. Sadeghi, and M. S. Atkins, Intrinsic Melanin and Hemoglobin Colour Components for Skin Lesion Malignancy Detection. Berlin, Germany: Springer, 2012, pp. 315–322. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-33415-3_39
[7] P. Rubegni et al., "Objective follow-up of atypical melanocytic skin lesions: A retrospective study," Arch. Dermatological Res., vol. 302, no. 7, pp. 551–560, 2010. [Online]. Available: http://dx.doi.org/10.1007/s00403-010-1051-6
[8] M. Zortea et al., "Performance of a dermoscopy-based computer vision system for the diagnosis of pigmented skin lesions compared with visual evaluation by experienced dermatologists," Artif. Intell. Med., vol. 60, no. 1, pp. 13–26, 2014. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0933365713001589
[9] J. López-Labraca, M. Á. Fernández-Torres, I. González-Díaz, F. Díaz-de María, and Á. Pizarro, "Enriched dermoscopic-structure-based CAD system for melanoma diagnosis," Multimedia Tools Appl., Jun. 2017. [Online]. Available: https://doi.org/10.1007/s11042-017-4879-3
[10] N. Codella, J. Cai, M. Abedini, R. Garnavi, A. Halpern, and J. R. Smith, Deep Learning, Sparse Coding, and SVM for Melanoma Recognition in Dermoscopy Images. Cham, Switzerland: Springer, 2015, pp. 118–126. [Online]. Available: https://doi.org/10.1007/978-3-319-24888-2_15
[11] L. Yu, H. Chen, Q. Dou, J. Qin, and P. A. Heng, "Automated melanoma recognition in dermoscopy images via very deep residual networks," IEEE Trans. Med. Imag., vol. 36, no. 4, pp. 994–1004, Apr. 2017.
[12] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Red Hook, NY, USA: Curran Associates, Inc., 2012, pp. 1097–1105. [Online]. Available: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
[13] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Las Vegas, NV, USA, Jun. 2016, pp. 770–778. [Online]. Available: https://doi.org/10.1109/CVPR.2016.90
[14] R. Girshick, "Fast R-CNN," in Proc. Int. Conf. Comput. Vis., 2015, pp. 1440–1448.
[15] E. Shelhamer, J. Long, and T. Darrell, "Fully convolutional networks for semantic segmentation," IEEE Trans. Pattern Anal. Machine Intell., vol. 39, no. 4, pp. 640–651, Apr. 2017.
[16] O. Ronneberger, P. Fischer, and T. Brox, U-Net: Convolutional Networks for Biomedical Image Segmentation. Cham, Switzerland: Springer, 2015, pp. 234–241. [Online]. Available: https://doi.org/10.1007/978-3-319-24574-4_28
[17] M. Varma and A. Zisserman, "A statistical approach to texture classification from single images," Int. J. Comput. Vis., vol. 62, no. 1–2, pp. 61–81, 2005.
[18] M. Abedini, Q. Chen, N. Codella, R. Garnavi, and X. Sun, "Accurate and scalable system for automatic detection of malignant melanoma," in Digital Imaging and Computer Vision. Boca Raton, FL, USA: CRC Press, Sep. 2015, pp. 293–343. [Online]. Available: http://dx.doi.org/10.1201/b19107-11
[19] H. Zare and M. Taghi Bahreyni Toossi, "Early detection of melanoma in dermoscopy of skin lesion images by computer vision-based system," in Digital Imaging and Computer Vision. Boca Raton, FL, USA: CRC Press, Sep. 2015, pp. 345–384.
[20] G. Fabbrocini et al., Automatic Diagnosis of Melanoma Based on the 7-Point Checklist. Berlin, Germany: Springer, 2014, pp. 71–107. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-39608-3_4
[21] F. Nachbar et al., "The ABCD rule of dermatoscopy," J. Amer. Acad. Dermatology, vol. 30, no. 4, pp. 551–559, 1994. [Online]. Available: http://dx.doi.org/10.1016/S0190-9622(94)70061-3
[22] A. Sáez, C. Serrano, and B. Acha, "Model-based classification methods of global patterns in dermoscopic images," IEEE Trans. Med. Imag., vol. 33, no. 5, pp. 1137–1147, May 2014.
[23] C. Serrano and B. Acha, "Pattern analysis of dermoscopic images based on Markov random fields," Pattern Recognit., vol. 42, no. 6, pp. 1052–1057, 2009. [Online]. Available: http://dblp.uni-trier.de/db/journals/pr/pr42.html#SerranoA09
[24] M. Sadeghi, T. K. Lee, D. McLean, H. Lui, and M. S. Atkins, "Global pattern analysis and classification of dermoscopic images using textons," Proc. SPIE, vol. 8314, 2012, Art. no. 83144X. [Online]. Available: http://dx.doi.org/10.1117/12.911818
[25] A. G. Isasi, B. G. Zapirain, and A. M. Zorrilla, "Melanomas non-invasive diagnosis application based on the ABCD rule and pattern recognition image processing algorithms," Comp. Bio. Med., vol. 41, no. 9, pp. 742–755, 2011. [Online]. Available: http://dblp.uni-trier.de/db/journals/cbm/cbm41.html#IsasiZZ11
[26] G. D. Leo, A. Paolillo, P. Sommella, G. Fabbrocini, and O. Rescigno, "A software tool for the diagnosis of melanomas," in Proc. IEEE Instrum. Meas. Technol. Conf., May 2010, pp. 886–891.
[27] L. Yu, H. Chen, Q. Dou, J. Qin, and P. A. Heng, "Automated melanoma recognition in dermoscopy images via very deep residual networks," IEEE Trans. Med. Imag., vol. 36, no. 4, pp. 994–1004, Apr. 2017.
[28] A. Esteva et al., "Dermatologist-level classification of skin cancer with deep neural networks," Nature, vol. 542, pp. 115–118, 2017.
[29] T. Majtner, S. Yildirim-Yayilgan, and J. Y. Hardeberg, "Combining deep learning and hand-crafted features for skin lesion classification," in Proc. 6th Int. Conf. Image Process. Theory, Tools Appl., Dec. 2016, pp. 1–6.
[30] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The pascal visual object classes challenge: A retrospective," Int. J. Comput. Vis., vol. 111, no. 1, pp. 98–136, Jan. 2015.
[31] C. Cortes and V. Vapnik, "Support-vector networks," Mach. Learn., vol. 20, no. 3, pp. 273–297, Sep. 1995. [Online]. Available: https://doi.org/10.1007/BF00994018
[32] A. Marghoob, J. Malvehy, R. Braun, and A. Kopf, An Atlas of Dermoscopy (Encyclopedia of Visual Medicine). Boca Raton, FL, USA: CRC Press, 2004.
[33] D. Pathak, P. Krähenbühl, and T. Darrell, "Constrained convolutional neural networks for weakly supervised segmentation," in Proc. IEEE Int. Conf. Comput. Vis., 2015, pp. 1796–1804.
[34] X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Proc. 13th Int. Conf. Artificial Intelligence and Statistics (Proceedings of Machine Learning Research), Y. W. Teh and M. Titterington, Eds., vol. 9. Sardinia, Italy: PMLR, May 2010, pp. 249–256.
[35] G. Argenziano, H. P. Soyer, and V. D. Giorgi, Interactive Atlas of Dermoscopy. Milan, Italy: Edra Medical Publishing and New Media, 2002.
[36] "International skin imaging collaboration: Melanoma project," ISIC Arch., 2017. [Online]. Available: https://isic-archive.com/
[37] K. Matsunaga, A. Hamada, A. Minagawa, and H. Koga, "Image classification of melanoma, nevus and seborrheic keratosis by deep neural network ensemble," arXiv:1703.03108, 2017. [Online]. Available: http://arxiv.org/abs/1703.03108
[38] I. González-Díaz, "Incorporating the knowledge of dermatologists to convolutional neural networks for the diagnosis of skin lesions," arXiv:1703.01976, 2017. [Online]. Available: http://arxiv.org/abs/1703.01976
[39] A. Menegola, J. Tavares, M. Fornaciali, L. T. Li, S. E. F. de Avila, and E. Valle, "RECOD titans at ISIC challenge 2017," arXiv:1703.04819, 2017. [Online]. Available: http://arxiv.org/abs/1703.04819
[40] L. Bi, J. Kim, E. Ahn, and D. Feng, "Automatic skin lesion analysis using large-scale dermoscopy images and deep residual networks," arXiv:1703.04197, 2017. [Online]. Available: http://arxiv.org/abs/1703.04197
[41] X. Yang, Z. Zeng, S. Y. Yeo, C. Tan, H. L. Tey, and Y. Su, "A novel multi-task deep learning model for skin lesion segmentation and classification," arXiv:1703.01025, 2017. [Online]. Available: http://arxiv.org/abs/1703.01025
[42] P. Kisilev, E. Sason, E. Barkan, and S. Hashoul, Medical Image Description Using Multi-Task-Loss CNN. Cham, Switzerland: Springer, 2016, pp. 121–129. [Online]. Available: https://doi.org/10.1007/978-3-319-46976-8_13