Abstract—Dermatological segmentation has always been a hot topic in medical imaging. At present, many algorithms have achieved good results in the segmentation of skin diseases, such as super-pixel segmentation and the U-Net network. The method used in this paper is an improvement on the instance segmentation model Mask R-CNN. First, we trained the classification branch of Mask R-CNN in advance. Second, we made some adjustments to the parameters of Mask R-CNN. These two changes give our method higher segmentation accuracy and detection accuracy than the traditional Mask R-CNN. The data set used in this paper comes from ISIC (International Skin Imaging Collaboration). Experimental results demonstrate that our method segments skin lesion images better than the traditional Mask R-CNN.

Keywords—Deep learning; Skin Segmentation; Mask R-CNN; CNN

I. INTRODUCTION

Today, deep learning models are widely used in various fields. They have made enormous contributions to medical imaging, especially to dermatological segmentation. Many computer vision laboratories are involved in research on dermatological segmentation algorithms. As a result, many classic algorithms have appeared, such as the Fully Convolutional Network (FCN) [1], the super-pixel segmentation algorithm [2], and U-Net [3], which are milestones in dermatological segmentation. Many later dermatological segmentation algorithms are based on them. Nevertheless, they still have problems to be solved. Because of the down-sampling steps caused by pooling and the large receptive fields in the convolutional layers, the lesion segmentation predicted by FCN is sometimes vague and lacks lesion boundary details [2]. The super-pixel segmentation algorithm does not take into account differences from the ground truth segmentation masks of the training data set. For U-Net, overfitting is hard to avoid as the number of iterations increases. These problems have long made it difficult to improve dermatological segmentation algorithms. As for the skin disease itself, it is difficult to determine the boundary of the affected area accurately by visual inspection. Meanwhile, the type of skin disease is also difficult to judge because of the visual similarity between diseases. The resulting delay may cause the patient's condition to deteriorate. Nevi and melanomas (as shown in Fig. 1) are so similar in physical characteristics that even experienced doctors cannot distinguish them at first sight.

Fig. 1. The left image is a nevus, and the other is a melanoma. They do not differ much in color, shape, or distribution area, so judging them manually takes a certain amount of time and effort.

If a skin disease develops into skin cancer, the malignant cases, about 5% of the total, account for around 75% of deaths [4], which seriously endangers human health. At the same time, the development of methods and tools for the automated diagnosis of skin lesions could provide low-cost medical help around the world and eventually benefit humanity, especially those who live in less-developed areas that lack professional dermatologists and where the cost of professional help is high [5]. Medical experts in South Korea have obtained good results by using deep learning algorithms to detect skin diseases [6]; the accuracy even exceeds that of the corresponding medical experts. We verified the correctness of their results by reproducing the code of their paper and using it to detect skin diseases. Moreover, on this basis, we studied the deep learning model they adopted, Faster R-CNN [7], and replaced the original non-maximum suppression (NMS) algorithm [8] with Soft-NMS [9] to optimize the selection of candidate boxes. The successful application of their deep learning model in the medical field prompted us to go deeper in this direction. We also found a more powerful model based on Faster R-CNN, called Mask R-CNN [10]. Mask R-CNN originates from the R-CNN [11] series. R-CNN builds on convolutional neural networks (CNN) [12] and is also the first deep learning algorithm applied to target detection. Based on R-CNN, Fast R-CNN [13], Faster R-CNN, and Mask R-CNN were developed in succession.
Mask R-CNN is a model that is easy to extend and improve, and it has good robustness and stability. It extends Faster R-CNN with a mask branch that adds only a small computational overhead, enabling a fast system and rapid experimentation [10]. Mask R-CNN is also quite successful in the field of instance segmentation and is the basis for many later instance segmentation models. These models will be compared with Mask R-CNN. The method used in this paper is to train the classification branch of Mask R-CNN and to modify some settings of Mask R-CNN to adapt it to skin lesion segmentation tasks.

The dermatological data set we used is from ISIC (International Skin Imaging Collaboration) [14], which has 23906 images of skin lesions. We selected and downloaded the data set used in the dermatological segmentation competition for the training and testing of our method.

The paper is organized as follows. Section 1 is the introduction. Section 2 introduces related work. Section 3 describes the methodology. Section 4 presents experimental results on a skin lesion data set. Section 5 concludes the paper.

II. RELATED WORK

Following Faster R-CNN, Mask R-CNN became a landmark in the field of instance segmentation. Many researchers have built improvements on Mask R-CNN for their own research projects. The papers [15][16][17] improve Mask R-CNN for the recognition and segmentation of lungs, ships, and remote sensing images, respectively.

We borrowed their improvement methods and made improvements and adjustments to Mask R-CNN by studying the characteristics of skin diseases in the skin disease data set. We also made use of the capabilities of Mask R-CNN itself. To address the difficulty of judging the type of skin disease, we pre-trained the classification network of Mask R-CNN; to address the segmentation regions and their candidate boxes, we adjusted some settings of Mask R-CNN after studying its structure.

At the same time, given the characteristics of skin diseases, we also read related literature, including literature on the corresponding segmentation algorithms, to gain a better understanding of skin diseases. Taking these characteristics into account, we can improve Mask R-CNN to deal with skin diseases more specifically.

In order to verify the effectiveness of our method, we adopted the ISIC skin disease data set; the experimental results and related conclusions are described later.

III. METHODOLOGY

A. Mask R-CNN

We constructed our framework based on Mask R-CNN, as shown in Fig. 2. Mask R-CNN, a general framework for object instance segmentation, can accurately detect objects in an image and simultaneously generate a segmentation mask for each instance. In skin lesion data sets, most images contain only one instance to segment, but we set a higher standard for the accuracy of the segmented area of that instance. Mask R-CNN contains two stages. The first stage proposes candidate object bounding boxes with the Region Proposal Network (RPN). The second stage is made up of a Fast R-CNN classifier and a binary mask prediction branch. The detailed steps for each stage are as follows:

Stage 1: The original image enters the Feature Pyramid Network (FPN) [18], a vital part of the feature extractor of Mask R-CNN, which extracts features and produces a feature map. The feature map passes through the RPN to generate candidate boxes, which are then combined with the feature map, yielding a feature map with candidate boxes. The number of candidate boxes is quite large, so they must be filtered in the later stage.

Stage 2: The feature map with candidate boxes is screened by the NMS algorithm to obtain the optimal candidate boxes. NMS is widely used in object detection algorithms; its purpose is to eliminate redundant candidate boxes and find the best object detection position. The feature map then passes through three branches, classification, pixel-level segmentation, and candidate box optimization, to obtain the final result. The pixel-level segmentation occurs in the mask prediction branch, which judges at the pixel level whether each pixel belongs to the target object: if it does, the pixel is marked as part of the segmentation; otherwise, it is not. The features are fed into the mask prediction branch, which consists of four convolution layers and one de-convolution layer, to predict the target mask of the skin lesion area.

B. Our Method

Due to the characteristics of skin diseases, moles and melanoma are not well distinguished by their physical features.
We pre-trained the Mask R-CNN classification network and assigned corresponding weights before the formal training.
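
For reference, the candidate box screening described in Stage 2 of Section III-A, together with the Soft-NMS variant [9] mentioned in the introduction, can be sketched as follows. This is a minimal illustration and not the implementation used in our experiments: the function names, the (x1, y1, x2, y2) box format, the IoU threshold of 0.5, the Gaussian decay with sigma = 0.5, and the sample boxes are all assumptions made for the example.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.5):
    """Hard NMS: keep the highest-scoring box, then discard every
    remaining box whose IoU with it exceeds iou_thresh. Returns indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: instead of discarding overlapping boxes, decay
    their scores by exp(-IoU^2 / sigma). Returns (index, score) pairs."""
    scores = list(scores)  # work on a copy so the caller's list is untouched
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        keep.append((best, scores[best]))
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
        # Boxes whose decayed score falls below the threshold are dropped.
        remaining = [i for i in remaining if scores[i] >= score_thresh]
    return keep


# Hypothetical candidate boxes: two overlapping boxes and one separate box.
boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 260, 260)]
scores = [0.9, 0.8, 0.7]
hard_result = nms(boxes, scores)
soft_result = soft_nms(boxes, scores)
```

With these three sample boxes, hard NMS keeps boxes 0 and 2 and discards box 1, whereas Soft-NMS keeps all three but lowers the score of box 1 from 0.8 to about 0.32. This shows why Soft-NMS can retain a heavily overlapped candidate that hard NMS would remove outright.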
Fig. 3. Comparison of segmentation results of skin diseases. The first row is the original image, the second row is the Mask R-CNN segmentation map, and the third row is the result of our method.
