Skin Lesion Classification Using Convolutional Neural Network For Melanoma Recognition
1 Introduction
Skin cancer is a common type of cancer that originates in the epidermis layer of the skin from irregular cells, mainly due to ultraviolet radiation exposure [1]. One in five people in the United States (US) living in a region under strong sunshine is at risk of skin cancer [2]. Among all skin cancer types, melanoma is the nineteenth most frequently occurring cancer, with approximately 3.0 million new cases identified in 2018. On average, 2,490 females and 4,740 males lost their lives due to melanoma in 2019 in the US alone [3]. It is estimated that approximately 1.0 million newly affected melanoma patients will be diagnosed in 2020. An estimated 6,850 deaths due to melanoma are anticipated in 2020 in the US alone, comprising 4,610 males and 2,240 females [4].
Fig. 1: An example of the challenging images, in the ISIC dataset [11], for the accurate SLC, exhibiting artifacts such as marker signs, non-uniform gel, water bubbles, ink marks, and vignetting.
Nowadays, several methods are being used for the SLC [12, 13]. For melanoma recognition, the sensitivity and specificity of general practitioners are 62 % and 63 %, while those of dermatologists are 80 % and 60 %, respectively [14].
In [15], the authors presented a CAD system for the SLC, where they used border and wavelet-based texture features with Support Vector Machine (SVM), Hidden Naïve Bayes (HNB), Logistic Model Tree (LMT), and Random Forest (RF) classifiers. In [16], the authors proposed a model for the SLC that comprises a Self-generating Neural Network (SGNN), extraction of texture, color, and border features, and an ensemble classifier. A deep residual network (DRN) was presented in [17] for the SLC, where the authors demonstrated that DRNs can learn more distinctive features than low-level hand-crafted features or shallower CNN architectures. A CNN architecture, along with the transfer learning paradigm, was employed for the SLC in [18]. A 3-D skin lesion reconstruction technique was presented in [19], where depth and 3-D shape features were extracted in addition to regular color, texture, and 2-D shape features; different machine learning classifiers (SVM and AdaBoost) were employed to classify those features. In [20], the authors proposed an effective iterative learning framework for the SLC, where they designed a sample re-weighting strategy to preserve the effectiveness of accurately annotated hard samples. A stacking ensemble pipeline based on a meta-learning algorithm was proposed in [21], where two hybrid methods were introduced to combine the mixture of classifiers. The effect of dermoscopic image size on pre-trained CNNs, along with transfer learning, was analyzed in [22], where the images were resized from 224 × 224 to 450 × 450. The authors proposed a multi-scale multi-CNN fusion approach using EfficientNetB0, EfficientNetB1, and SeReNeXt-50, where the three network architectures were trained on cropped images of different sizes. An architecture search framework was presented in [23] to recognize melanoma, where a hill-climbing search strategy, along with network morphism operations, was employed to explore the search space.
This study proposes a framework for the SLC (a multi-class task), where preprocessing, geometric augmentation, CNN-based classification, and transfer learning are the integrated steps. We have performed various types of geometric image augmentation to increase the number of training samples, since in most medical imaging domains a massive number of manually annotated training images is not yet available [24]. A CNN-based classifier has been used to avoid tedious feature engineering, as it can learn features automatically during the forward and backward passes over the training images. Transfer learning of the CNN is used to initialize all the kernels in the convolutional layers by leveraging previously trained knowledge rather than random initialization. Extensive experiments are conducted to select different hyper-parameters, such as the types of image augmentation, the optimizer and loss function, the metric to be maximized during training, and the number of CNN layers to be frozen. We validate our proposed framework by comparing it with several state-of-the-art methods on the ISBI-2017 dataset, where our proposed pipeline achieves better results while being an end-to-end system for the SLC.
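The paper does not include source code; the following is a minimal sketch, assuming a Keras/TensorFlow implementation, of the geometric augmentation and intensity normalization steps described above. The augmentation ranges and the directory layout are illustrative assumptions, not the authors' published settings.

# Hedged sketch: geometric augmentation + intensity normalization of training
# images. Framework choice, parameter ranges, and paths are assumptions only.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_gen = ImageDataGenerator(
    rescale=1.0 / 255,        # normalize pixel intensities to [0, 1]
    rotation_range=90,        # random rotation
    width_shift_range=0.1,    # random horizontal shift
    height_shift_range=0.1,   # random vertical shift
    zoom_range=0.2,           # random zoom
    horizontal_flip=True,     # random horizontal flip
    vertical_flip=True,       # random vertical flip
)

# Stream augmented batches from a hypothetical directory with one folder per
# class (Mel, Nev, SK); labels are inferred from the folder names.
train_flow = train_gen.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32, class_mode="categorical"
)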
The remainder of this paper is set out as follows. Section 2 describes the materials and the proposed framework. Section 3 presents detailed results with proper illustrations. Finally, the paper is concluded in Section 4 along with future works.
2 Materials and Methods
The detailed description of the materials and the methods used in this work is presented in the following subsections:
The dataset utilized for training, validation, and testing is presented in Table 1, where we use the ISIC-2017 dataset from the ISIC archive [11]. Table 1 presents the class-wise distribution and a short description of the ISIC-2017 dataset.
In the proposed framework, as shown in Fig. 2, the feature extraction and classification of the skin lesion for melanoma recognition have been automated using an end-to-end CNN architecture, where image augmentation and normalization are crucial and integral parts of the proposed framework.
Fig. 2: Block diagram of the proposed framework for an automatic SLC towards
melanoma recognition using CNN-based classifiers, preprocessing, and transfer
learning.
Deep CNNs are widely used in both medical and natural image classification and have achieved tremendous success since 2012 [25], often rivalling human expertise [26]. In CheXNet [27], a CNN was trained on more than 100,000 frontal-view chest X-rays, where it was able to achieve better recognition results than the average recognition by four experts.
In this article, the CNN model, shown in Fig. 3, has 13 convolutional layers arranged in 5 convolutional blocks. Each block ends with a max-pooling layer.
Fig. 3: The CNN network for the SLC, where Hm ∈ Rn is the mth hidden layer in n-dimensional space (H1 ∈ R256, H2 ∈ R128, H3 ∈ R64). The output layer, HOut ∈ R3, lies in 3-dimensional (Mel, SK, and Nev) space.
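The 13-convolutional-layer, 5-block topology together with the hidden layers H1 ∈ R256, H2 ∈ R128, H3 ∈ R64, and HOut ∈ R3 in Fig. 3 can be assembled as sketched below, assuming a VGG16-style backbone in Keras/TensorFlow; the backbone choice, pooling, and optimizer are assumptions for illustration rather than the authors' confirmed implementation.

# Hedged sketch of the network in Fig. 3: a VGG16-style backbone (13 conv
# layers in 5 blocks) followed by hidden layers of 256, 128, and 64 units and
# a 3-class softmax output (Mel, SK, Nev). Backbone, pooling, and optimizer
# are illustrative assumptions.
import tensorflow as tf

def build_slc_model(input_shape=(224, 224, 3), n_frozen=0):
    base = tf.keras.applications.VGG16(weights="imagenet",   # transfer learning
                                       include_top=False,
                                       input_shape=input_shape)
    for layer in base.layers[:n_frozen]:                      # optionally freeze layers
        layer.trainable = False

    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.Dense(256, activation="relu")(x)      # H1 in R^256
    x = tf.keras.layers.Dense(128, activation="relu")(x)      # H2 in R^128
    x = tf.keras.layers.Dense(64, activation="relu")(x)       # H3 in R^64
    out = tf.keras.layers.Dense(3, activation="softmax")(x)   # HOut in R^3

    model = tf.keras.Model(base.input, out)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model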
Table 2: Classification report for the SLC, where the class weights used for averaging the metrics were calculated from the per-class support (number of samples).
Class Precision Recall F1-score Support
Mel 0.53 0.62 0.57 117
Nev 0.87 0.76 0.81 393
SK 0.55 0.73 0.63 90
Weighted Average 0.76 0.73 0.74 600
intuition for the quantitative evaluation of the classifier, which can also show
Table 3: Confusion matrix of the test results, where each column and row represent the instances in a predicted and actual class, respectively.
              Predicted
              Mel   Nev   SK
Actual  Mel    72    28   17
        Nev    57   300   36
        SK      8    16   66
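The per-class precision and recall reported in Table 2 follow directly from this confusion matrix; a small NumPy sketch of that arithmetic, using only the published counts, is shown below.

# Reproducing the per-class metrics of Table 2 from the confusion matrix of
# Table 3 (rows = actual class, columns = predicted class).
import numpy as np

cm = np.array([[ 72,  28, 17],   # actual Mel
               [ 57, 300, 36],   # actual Nev
               [  8,  16, 66]])  # actual SK
classes = ["Mel", "Nev", "SK"]

support = cm.sum(axis=1)                  # 117, 393, 90
recall = np.diag(cm) / support            # e.g. Mel: 72 / 117 ~ 0.62
precision = np.diag(cm) / cm.sum(axis=0)  # e.g. Mel: 72 / 137 ~ 0.53
f1 = 2 * precision * recall / (precision + recall)

for c, p, r, f, s in zip(classes, precision, recall, f1, support):
    print(f"{c}: precision={p:.2f} recall={r:.2f} f1={f:.2f} support={s}")

# Support-weighted averages, as in the last row of Table 2
w = support / support.sum()
print("weighted precision:", round(float((w * precision).sum()), 2))  # ~ 0.76
print("weighted recall:", round(float((w * recall).sum()), 2))        # ~ 0.73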
Table 4: Several examples of wrongly classified images from the proposed framework, with confidence probabilities.
Example Images Predicted Class with Confidence
Actual Class: SK
Predicted Class: Nev
Confidence Probability: 0.994
Actual Class: SK
Predicted Class: Mel
Confidence Probability: 0.876
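The confidence probabilities in Table 4 correspond to the softmax output of the predicted class; a hedged sketch of extracting them from a trained model is given below, where the saved-model file and image path are hypothetical placeholders.

# Hedged sketch: predicted class and its softmax confidence, as in Table 4.
# The model file and image path are hypothetical placeholders.
import numpy as np
import tensorflow as tf

classes = ["Mel", "Nev", "SK"]
model = tf.keras.models.load_model("slc_model.h5")  # hypothetical saved model

img = tf.keras.preprocessing.image.load_img("lesion.jpg", target_size=(224, 224))
x = tf.keras.preprocessing.image.img_to_array(img)[np.newaxis] / 255.0

probs = model.predict(x)[0]               # softmax probabilities over 3 classes
pred = classes[int(np.argmax(probs))]
print(f"Predicted class: {pred}, confidence probability: {probs.max():.3f}")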
For a given random sample, the probability of accurate recognition as Mel, Nev, or SK is as high as 87.3 %. The precision-recall curve, as shown in Fig. 4 (b), shows the trade-off between precision and recall for different thresholds, where a high area under the curve represents both high recall and high precision. High scores for both show that the proposed pipeline returns accurate results (high precision) as well as a majority of all positive results (high recall). The macro-average precision from the proposed pipeline is 80.6 %, which indicates that the proposed pipeline is well-suited for the SLC for melanoma recognition.
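As a hedged sketch, assuming scikit-learn, of how the ROC and precision-recall summaries of Fig. 4 and the macro-average precision can be computed from one-hot test labels and predicted class probabilities; the arrays below are random placeholders standing in for the actual test labels and model outputs.

# Hedged sketch: ROC and precision-recall summaries as in Fig. 4.
# y_true (one-hot) and y_prob (softmax outputs) are random placeholders for
# the real test labels and predictions.
import numpy as np
from sklearn.metrics import (average_precision_score, precision_recall_curve,
                             roc_auc_score)

classes = ["Mel", "Nev", "SK"]
y_true = np.eye(3)[np.random.randint(0, 3, size=600)]  # placeholder labels
y_prob = np.random.dirichlet(np.ones(3), size=600)     # placeholder probabilities

# Micro-averaged ROC AUC over the three classes
print("ROC AUC (micro):", roc_auc_score(y_true, y_prob, average="micro"))

# Per-class precision-recall curves (for plotting) and macro-average precision
ap = []
for k in range(len(classes)):
    precision, recall, _ = precision_recall_curve(y_true[:, k], y_prob[:, k])
    ap.append(average_precision_score(y_true[:, k], y_prob[:, k]))
print("macro-average precision:", float(np.mean(ap)))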
Table 5 shows the state-of-the-art comparison of our proposed pipeline with recent works, where AlexNet [33] and ResNet-101 [34] were implemented in [35] for the SLC. The proposed framework produces the best classification of the skin lesion, as shown in Table 5.
Fig. 4: (a) ROC curve to summarize the trade-off between the true-positive rate
and false-positive rate, and (b) Precision-recall curve to summarize the trade-off
between the true-positive rate and the positive predictive value.
Our pipeline produces the best results in terms of recall and precision, beating the state-of-the-art works in [36] by a 12.0 % margin and in [34] by 5.0 %, respectively. Our method also beats the method in [34] by margins of 39.0 % and 5.0 % in terms of recall and precision, respectively, although the AUC is the same for both methods.
4 Conclusion
In this article, an automatic and robust framework for melanoma recognition has been proposed and implemented. The potential of the proposed framework has been validated via several comprehensive experiments.
References
1. Narayanamurthy, V., Padmapriya, P., Noorasafrin, A., Pooja, B., Hema, K.,
Nithyakalyani, K., Samsuri, F., et al.: Skin cancer detection using non-invasive tech-
niques. RSC advances 8(49), 28095-28130 (2018).
2. Ries, L. A., Harkins, D., Krapcho, M., Mariotto, A., Miller, B., Feuer, E. J., Clegg,
L. X., Eisner, M., Horner, M. J., Howlader, N., et al.: SEER cancer statistics review,
1975-2003 (2006).
3. Zhang, N., Cai, Y. X., Wang, Y. Y., Tian, Y. T., Wang, X. L., Badami, B.: Skin
cancer diagnosis based on optimized convolutional neural network. Artificial Intel-
ligence in Medicine 102, 101756 (2020).
4. Siegel, R. L., Miller, K. D., Jemal, A.: Cancer statistics, 2020. CA: A Cancer Journal
for Clinicians 70(1), 7–30 (2020).
5. World Health Ranking, https://www.worldlifeexpectancy.com/bangladesh-skin-
cancers, last accessed 1 May 2020.
6. Ge, Z., Demyanov, S., Chakravorty, R., Bowling, A., Garnavi, R.: Skin disease
recognition using deep saliency features and multimodal learning of dermoscopy
and clinical images. In: International Conference on Medical Image Computing
and Computer-Assisted Intervention, pp. 250–258. Springer, Quebec City, Canada
(2017).
7. Smith, L., MacNeil, S.: State of the art in non-invasive imaging of cutaneous
melanoma. Skin Res. Technol 17(3), 257–269 (2011).
8. Hasan, M. K., Dahal, L., Samarakoon, P. N., Tushar, F. I., Martı́, R.: DSNet: Au-
tomatic dermoscopic skin lesion segmentation. Computers in Biology and Medicine
120, 103738 (2020).
9. Jalalian, A., Mashohor, S., Mahmud, R., Karasfi, B., Saripan, M. I. B., Ramli, A. R.
B.: Foundation and methodologies in computer-aided diagnosis systems for breast
cancer detection. EXCLI Journal 16, 113–137 (2017).
10. Mishraa, N. K., Celebi, M. E.: An overview of melanoma detection in dermoscopy
images using image processing and machine learning. arXiv:1601.07843 (2016).
11. Codella, N. F., Gutman, D., Celebi, M. E., Helba, B., Marchetti, M. A., Dusza,
S. W., Kalloo, A., Liopyris, K., Mishra, N., Kittler, H.: Skin lesion analysis toward
melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC).
In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018),
pp. 168–172. IEEE, Washington, DC (2018).
12. Brinker, T. J., Hekler, A., Utikal, J. S., Grabe, N., Schadendorf, D., Klode, J.,
Berking, C., Steeb, T., Enk, A. H., von Kalle, C.: Skin cancer classification us-
ing convolutional neural networks: systematic review. Journal of medical Internet
research 20(10), e11936 (2018).
13. Ma, Z., Tavares, J. M. R., et al.: A review of the quantification and classification
of pigmented skin lesions: from dedicated to hand-held devices. Journal of medical
systems 39(11), 177 (2015).
14. Menzies, S. W., Bischof, L., Talbot, H., Gutenev, A., Avramidis, M., Wong, L.,
Lo, S. K., Mackellar, G., Skladnev, V., McCarthy, W., et al.: The performance of
solar scan: an automated dermoscopy image analysis instrument for the diagnosis
of primary melanoma. Archives of dermatology 141(11), 1388–1396 (2005).
15. Garnavi, R., Aldeen, M., Bailey, J.: Computer-aided diagnosis of melanoma us-
ing border and wavelet-based texture analysis. IEEE Transactions on Information
Technology in Biomedicine 16(6), 1239–1252 (2012).
16. Xie, F., Fan, H., Li, Y., Jiang, Z., Meng, R., Bovik, A.: Melanoma classification on
dermoscopy images using a neural network ensemble model. IEEE transactions on
medical imaging 36(3), 849–858 (2016).
17. Yu, L., Chen, H., Dou, Q., Qin, J., Heng, P.A.: Automated melanoma recognition in
dermoscopy images via very deep residual networks. IEEE transactions on medical
imaging 36(4), 994–1004 (2016).
18. Lopez, A. R., Giro-i Nieto, X., Burdick, J., Marques, O.: Skin lesion classification
from dermoscopic images using deep learning techniques. In: 2017 13th IASTED in-
ternational conference on biomedical engineering (BioMed), pp. 49–54. IEEE, Inns-
bruck, Austria (2017).
19. Satheesha, T., Satyanarayana, D., Prasad, M. G., Dhruve, K. D.: Melanoma is
skin deep: A 3D reconstruction technique for computerized dermoscopic skin lesion
classification. IEEE journal of translational engineering in health and medicine 5,
1–17 (2017).
20. Xue, C., Dou, Q., Shi, X., Chen, H., Heng, P. A.: Robust learning at noisy labeled
medical images: applied to skin lesion classification. In: 2019 IEEE 16th Interna-
tional Symposium on Biomedical Imaging (ISBI 2019), pp. 1280–1283. IEEE, Venice,
Italy (2019).
21. Ghalejoogh, G. S., Kordy, H. M., Ebrahimi, F.: A hierarchical structure based on
stacking approach for skin lesion classification. Expert Systems with Applications
145, 113127 (2020).
22. Mahbod, A., Schaefer, G., Wang, C., Dorffner, G., Ecker, R., Ellinger, I.: Transfer
learning using a multi-scale and multi-network ensemble for skin lesion classification.
Computer Methods and Programs in Biomedicine 193, 105475 (2020).
23. Kwasigroch, A., Grochowski, M., Mikolajczyk, A.: Neural architecture search for
skin lesion classification. IEEE Access 8, 9061–9071 (2020).
24. Harangi, B.: Skin lesion classification with ensembles of deep convolutional neural
networks. Journal of biomedical informatics 86, 25–32 (2018).
25. Rawat, W., Wang, Z.: Deep convolutional neural networks for image classification:
A comprehensive review. Neural computation 29(9), 2352–2449 (2017).
26. Yadav, S. S., Jadhav, S. M.: Deep convolutional neural network based medical
image classification for disease diagnosis. Journal of Big Data 6(1), 113 (2019).
27. Rajpurkar, P., Irvin, J., Zhu, K., Yang, B., Mehta, H., Duan, T., Ding, D., Bagul,
A., Langlotz, C., Shpanskaya, K., et al.: CheXNet: Radiologist-level pneumonia de-
tection on chest x-rays with deep learning. arXiv:1711.05225 (2017).
28. Lin, M., Chen, Q., Yan, S.: Network in network. arXiv:1312.4400 (2013).
29. Huh, M., Agrawal, P., Efros, A. A.: What makes ImageNet good for transfer learn-
ing? arXiv:1608.08614 (2016).
30. Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: ImageNet: A large-scale
hierarchical image database. In: IEEE Conference on Computer Vision and Pattern
Recognition, pp. 248–255. IEEE, Florida, USA (2009).
31. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward
neural networks. In: Proceedings of the thirteenth international conference on arti-
ficial intelligence and statistics, pp. 249–256. Sardinia, Italy (2010).
32. Kingma, D. P., Ba, J.: Adam: A method for stochastic optimization.
arXiv:1412.6980 (2014).
33. Krizhevsky, A., Sutskever, I., Hinton, G. E.: ImageNet classification with deep con-
volutional neural networks. In: Advances in neural information processing systems,
pp. 1097–1105. Curran Associates, Inc., Nevada, USA (2012).
34. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition.
In: Proceedings of the IEEE conference on computer vision and pattern recognition,
pp. 770–778. IEEE, Las Vegas, NV, USA (2016).
35. Li, Y., Shen, L.: Skin lesion analysis towards melanoma detection using deep learn-
ing network. Sensors 18(2), 556 (2018).
36. Yang, J., Xie, F., Fan, H., Jiang, Z., Liu, J.: Classification for dermoscopy images
using convolutional neural networks based on region average pooling. IEEE Access
6, 65130–65138 (2018).
37. Sultana, N. N., Mandal, B., Puhan, N. B.: Deep residual network with regularised
fisher framework for detection of melanoma. IET Computer Vision 12(8), 1096–1104
(2018).
38. Serte, S., Demirel, H.: Gabor wavelet-based deep learning for skin lesion classifica-
tion. Computers in biology and medicine 113, 103423 (2019).