A Deep Transfer Learning Approach For Identification of Diabetic Retinopathy Using Data Augmentation
Corresponding Author:
Yerrarapu Sravani Devi
Department of Computer Science & Engineering, GITAM Deemed to be University
Hyderabad, Telangana, India
Email: [email protected]
1. INTRODUCTION
Throughout the world, eye examinations, in which fundus cameras are used to capture retinal images, are extremely significant for early detection so that the chances of effective treatment can be improved.
Diabetic eye disease (DED) comprises several ocular conditions, including glaucoma, diabetic macular edema, and diabetic retinopathy (DR). All forms of DED can lead to major loss of sight and ultimately
result in vision impairment in patients aged 20 to 74. By 2045, the number of people affected by diabetes is expected to rise to 690 million. The key consequences of diabetes can be seen in various parts of the body, including the retina.
The onset of severe DED is marked by abnormal growth of blood vessels, deterioration of the optic nerve, and the formation of hard exudates near the macula (the central part of the retina). Four types of DED are
considered a threat to vision, and they are explained briefly in the following subsection. DR, a diabetic complication affecting the eyes, is identified by observing
damage to the blood vessels of the retina at the fundus of the eye. The retina senses light and sends the information as signals to the brain, which then decodes those signals
so that one can see objects.
Stages of DR based on severity features. DR has been classified into various stages based on its complications in [1]–[3]. The levels shown in Figure 1 are:
Level 0 : No apparent retinopathy; there are no visible abnormalities.
Level 1 : Mild NPDR. Patients have at least one microaneurysm, with or without the presence of other lesions.
Level 2 : Moderate NPDR. Patients have haemorrhages and numerous microaneurysms.
Level 3 : Severe NPDR. Patients have i) haemorrhages and multiple microaneurysms in all four quadrants of the retina; ii) cotton wool spots in two or more quadrants; and iii) intraretinal microvascular abnormalities in at least one quadrant.
Level 4 : Proliferative DR. Patients suffer from an advanced stage beyond NPDR. At this stage, neovascularisation carries a high risk of leakage, which can cause severe loss of sight and may lead to blindness.
Manual classification and detection of DR is time-consuming, and time is critical when a patient's case is severe. Beyond classical machine learning techniques such as [4], an automated system is needed to identify the stages of DR efficiently. To build such a system, a number of papers were reviewed to gain a better understanding of convolutional neural networks (CNN). The surveyed literature on the diagnosis of DR covers a variety of methods used to detect retinopathy.
Pratt et al. [5] and Abramoff et al. [6] built CNN-based networks with data augmentation. These networks can detect the complex features involved in the classification task, such as microaneurysms, retinal haemorrhages, and exudates, and thus diagnose automatically without user input. They obtained 95% sensitivity and 75% precision on five thousand validation images. Several other studies on CNNs have been carried out by other researchers, as in [7], [8]. Further work has focused on transfer-learning CNN architectures: the method in [9] trained Inception V3, pretrained on the ImageNet dataset, for a five-class categorization and obtained 90.0% precision, while Zhang et al. [10] trained ResNet50, Xception, DenseNet, and visual geometry group (VGG) networks with ImageNet pretraining and obtained precision of around 81.3%. Both research teams used the APTOS and Kaggle datasets and proposed methods combining a CNN architecture with data augmentation.
Lam et al. [11] trained a deep learning (DL) model based on the Inception architecture on a large dataset of more than 1.6 million retinal images, then fine-tuned it on a set of 2,000 images whose labels were approved by three ophthalmologists as a reference standard. Altogether, this model achieved a 5-class accuracy of 88.4%, with a precision of 96.9% for images with no DR and 57.9% for images with mild or severe non-proliferative DR. Xu et al. [12] suggested a method based on a CNN with the VGG architecture, compared against a back-propagation neural network (NN), a deep neural network (DNN), and a CNN. Dutta et al. [13] proposed a multi-cell multi-task CNN (MCNN). Gulshan et al. [14] used transfer learning with AlexNet and GoogLeNet models pretrained on ImageNet. In [15], [16], another model classifies DR into five classes; the EyePACS dataset is used, with a training set of 35,126 images and a test set of 53,576 images. The proposed DR classifier achieved sensitivity and specificity of 90% in detecting the severity levels of DR.
Traditional diagnosis of DR is done by capturing retinal images and studying them for signs of the disease. Moreover, fundus imaging devices and their installation in healthcare centres are expensive. Because of the global shortage of ophthalmologists, healthcare professionals, and facilities, research has also been conducted on mobile-based DR diagnosis services. With advances in technology, numerous researchers have developed image restoration and image enhancement methods, as well as deep learning architectures for images, particularly CNNs with classification layers at the end.
Transfer learning techniques [17] are widely accepted and in demand because of the shortage of labelled training data for designing and training deep CNN models [18]. In healthcare applications of deep learning, the major obstacle is the inadequacy of annotated training data. Transfer learning is the use of pre-trained neural network models to categorise previously unseen datasets. This is extremely significant for medical image classification, which faces the same shortage of annotated training data. Accuracies of some existing transfer learning models are shown in Table 1.
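As a concrete illustration of this transfer-learning setup (and not the exact configuration of any model in Table 1), the sketch below loads an ImageNet-pretrained ResNet50 backbone in Keras and attaches a new five-class head for the DR grades; the head layers, dropout rate, and optimizer are illustrative assumptions.

```python
# Minimal transfer-learning sketch (assumed configuration): an ImageNet-pretrained
# ResNet50 backbone with a new 5-class head for the DR severity levels 0-4.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_dr_classifier(input_shape=(224, 224, 3), num_classes=5):
    # Load the convolutional backbone with ImageNet weights, dropping its
    # original 1000-class classification head.
    backbone = tf.keras.applications.ResNet50(
        include_top=False, weights="imagenet", input_shape=input_shape)
    backbone.trainable = False  # freeze pretrained weights for the first training stage

    # Attach a small classification head for the five DR grades.
    model = models.Sequential([
        backbone,
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.3),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_dr_classifier()
model.summary()
```

After an initial training pass with the frozen backbone, the upper backbone layers can optionally be unfrozen and fine-tuned with a smaller learning rate.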
Various international studies show that existing algorithms were developed on limited clinical datasets that were not annotated by expert ophthalmologists. Moreover, standardized, balanced datasets are not available outside clinical environments for specific diseases, so the available algorithms cannot accurately estimate the prevalence of eye disease. In this paper, we investigate the problem of the scarcity of medical images, such as eye-disease images, for image classification: we generate images using traditional data augmentation techniques and compare the classification model's performance metrics with and without the synthetic images. The classification model is based on a CNN.
2. RESEARCH METHOD
2.1. Datasets
The most widely used datasets for detecting DR are Kaggle [19] and Messidor [20]. Some authors have used the Kaggle data while others have used the Messidor data. The Kaggle dataset comprises 88,702 images, of which 35,126 are used for training and 53,576 for testing. Messidor is a widely used dataset containing 1,200 fundus images. Both the Kaggle and Messidor datasets are labelled for the stages of DR. For the proposed model, the APTOS 2019 Blindness Detection dataset [21] was taken. The full dataset consists of 18,590 fundus photographs, divided into 3,662 training, 1,928 validation, and 13,000 testing images by the organizers of the Kaggle competition. All datasets have similar class distributions; the train and test data distributions for APTOS 2019 are shown in Figures 2 and 3.
2.2. Data augmentation
Figures 2 and 3 bring out the class imbalance in the chosen images: the number of images in some classes is much smaller, with a large section of healthy retina images as opposed to the DED retina images. Augmentation, as in [22]–[27], is a common approach for improving the results and preventing overfitting. Moreover, the Kaggle dataset distribution is observed to be uneven. We also used the Kaggle APTOS-Blindness dataset, which comprises approximately 13,000 colour fundus images, each of dimension 3216×2136 pixels, as displayed in Figure 4(a). When a deep network is trained with such a dataset, the classification becomes biased. In the first step of data augmentation, each image is resized to 224×224; the resizing maintains the original aspect ratio, as presented in Figure 4(b). The augmented images are presented in Figure 4(c). These methods increase the dataset size, balance the samples in each class, and prevent overfitting. During training, the validation set is used to check for and reduce errors such as overfitting. Sample images resulting from several data augmentation techniques are presented below, and a minimal augmentation sketch follows Figure 4.
Figure 4. Sample images after applying data augmentation techniques: (a) original image; (b) resized to 224×224, 256×256, 299×299, and 512×512; and (c) augmented images using rotations
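A minimal sketch of the traditional augmentation described above, resizing to 224×224 and generating rotated copies, is given next; the rotation range, flips, directory layout, and batch size are illustrative assumptions rather than the exact settings used to produce Figure 4.

```python
# Illustrative traditional augmentation (assumed parameters): resize fundus
# images to 224x224 and generate rotated copies to balance the DR classes.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=30,        # random rotations, as in Figure 4(c)
    horizontal_flip=True,
    fill_mode="nearest",
    rescale=1.0 / 255)        # normalize pixel intensities to [0, 1]

# Stream resized, augmented batches from class-labelled folders
# (hypothetical directory layout: train_dir/<class_0..4>/*.png).
train_generator = augmenter.flow_from_directory(
    "train_dir",
    target_size=(224, 224),   # resizing step of the first augmentation stage
    batch_size=32,
    class_mode="sparse")
```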
2.3. Pre-processing
Image pre-processing steps can be applied to improve the images. To adjust the images and make them clearer, so that the model can learn features effectively, we use image processing techniques from the OpenCV library in Python (cv2). Gaussian blur is used to bring out different features in the images: in the Gaussian blur operation, the image is convolved with a Gaussian filter [28]. The Gaussian filter removes high-frequency components, so this noise removal step smooths the image and reduces its detail. The result is an image with a resolution suited to the system. Next, a circle crop operation is applied to the resulting images; this function identifies the circular part of the image. Figure 5 shows sample images after performing the pre-processing operations on the APTOS-Blindness dataset, and a compact pre-processing sketch is given below.
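The following sketch shows the two pre-processing steps above, Gaussian blurring and circle cropping, using the OpenCV (cv2) library mentioned in the text; the kernel sigma, the blend weights, and the crop heuristic are illustrative assumptions rather than the paper's exact parameters.

```python
# Sketch of the pre-processing pipeline (assumed parameters): Gaussian blurring
# to suppress high-frequency noise and a circular crop around the retina.
import cv2
import numpy as np

def circle_crop(img):
    # Mask everything outside the largest circle that fits the image,
    # keeping the circular fundus region and discarding the black border.
    h, w = img.shape[:2]
    center = (w // 2, h // 2)
    radius = min(center)
    mask = np.zeros((h, w), dtype=np.uint8)
    cv2.circle(mask, center, radius, 255, thickness=-1)
    return cv2.bitwise_and(img, img, mask=mask)

def preprocess(path, size=224, sigma=10):
    img = cv2.imread(path)                       # BGR fundus photograph
    img = cv2.resize(img, (size, size))
    img = circle_crop(img)
    # Convolve with a Gaussian filter to smooth out high-frequency detail.
    blurred = cv2.GaussianBlur(img, (0, 0), sigmaX=sigma)
    # Blend the image with its blur to emphasise local contrast
    # (a common fundus-enhancement trick; the weights here are assumptions).
    return cv2.addWeighted(img, 4, blurred, -4, 128)

sample = preprocess("sample_fundus.png")         # hypothetical file path
```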
Model performance is analyzed based on the receiver operating characteristic (ROC) curve. The ROC curve maps the relationship between the false positive rate on the x-axis and the true positive rate on the y-axis across the full range of possible thresholds. The classification accuracy on the original dataset without synthetic data is 86%, and with the synthetic dataset it is 91.1%. Figure 7 presents the classification report of the model without data augmentation: Figure 7(a) shows the classification metrics of the model, Figure 7(b) shows the confusion matrix of the ROC curve analysis, and Figure 7(c) shows the ROC curve in the baseline case (i.e., without synthetic data). A short sketch of the ROC computation follows Figure 7.
Figure 7. Classification report of the model without data augmentation: (a) classification metrics of the model; (b) confusion matrix of the ROC curve analysis; and (c) ROC curve in the baseline case (i.e., without synthetic data)
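The per-class ROC analysis behind Figures 7 and 8 can be reproduced along the following lines with scikit-learn; the one-vs-rest treatment of the five DR grades and the placeholder arrays standing in for the true labels and the model's softmax scores are assumptions for illustration.

```python
# Sketch of the ROC evaluation (assumptions: y_true holds the 5 DR grades 0-4,
# y_score holds the model's softmax probabilities with shape (N, 5)).
import numpy as np
from sklearn.metrics import roc_curve, auc, classification_report
from sklearn.preprocessing import label_binarize

rng = np.random.default_rng(0)
y_true = rng.integers(0, 5, size=500)          # placeholder ground-truth grades
y_score = rng.dirichlet(np.ones(5), size=500)  # placeholder softmax outputs

classes = [0, 1, 2, 3, 4]
y_true_bin = label_binarize(y_true, classes=classes)   # one-vs-rest labels

for c in classes:
    # False positive rate (x-axis) vs. true positive rate (y-axis) across
    # all thresholds, one curve per DR severity level.
    fpr, tpr, _ = roc_curve(y_true_bin[:, c], y_score[:, c])
    print(f"class {c}: AUC = {auc(fpr, tpr):.3f}")

# Per-class precision/recall/F1, as in the classification report panels.
print(classification_report(y_true, np.argmax(y_score, axis=1)))
```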
The ROC curves demonstrate that the performance of the model consistently improved after adding the synthetic images to the original dataset. The results show that the overall accuracy of the classification model improved from 86% to 91.1%. The aim here was to correctly categorise and identify all DR stages, most importantly the initial DR stages. Figure 8 presents the classification report of the model with traditional data augmentation: Figure 8(a) shows the classification metrics of the model, Figure 8(b) shows the confusion matrix of the ROC curve analysis, and Figure 8(c) shows the ROC curve when the traditional data augmentation methods are applied.
Figure 8. Classification report of the model with traditional data augmentation: (a) classification metrics of the model; (b) confusion matrix of the ROC curve analysis; and (c) ROC curve with synthetic data
4. CONCLUSION
Modern deep learning tools open a wide range of possibilities for developing effective models that render better results, and these tools have potential utility in ophthalmology. Diabetes is a rapidly growing disease that affects the body severely, and manual diagnosis is tiresome and often error-prone; therefore, computational tools that diagnose automatically, as discussed in the literature, have been developed. In the present study, we showcase an automatic deep learning model that identifies the different stages of DR using the APTOS-Blindness dataset. The proposed model is ResNet50; combined with synthetic data, it strengthens the classifier and improves its capability, reaching an accuracy of 91.1%. The results compare the classification metrics obtained with synthetic and non-synthetic images. Unlike existing methods, the model identifies all stages of DR, achieving 91% accuracy with synthetic data and 86% without. In future work, we also intend to train stage-specific models so that the accuracy on the initial stages can be further improved.
REFERENCES
[1] Q. Abbas, I. Fondon, A. Sarmiento, S. Jiménez, and P. Alemany, “Automatic recognition of severity level for diagnosis of
diabetic retinopathy using deep visual features,” Medical & Biological Engineering & Computing, vol. 55, no. 11, pp. 1959–1974,
Mar. 2017, doi: 10.1007/s11517-017-1638-6.
[2] C. P. Wilkinson et al., “Proposed international clinical diabetic retinopathy and diabetic macular edema disease severity scales,”
Ophthalmology, vol. 110, no. 9, pp. 1677–1682, Sep. 2003, doi: 10.1016/s0161-6420(03)00475-5.
[3] M. Sandhya, M. K. Morampudi, R. Grandhe, R. Kumari, C. Banda, and N. Gonthina, “Detection of diabetic retinopathy (DR)
severity from fundus photographs: an ensemble approach using weighted average,” Arabian Journal for Science and Engineering,
pp. 1–8, Jan. 2022, doi: 10.1007/s13369-021-06381-1.
[4] M. Panda, D. P. Mishra, S. M. Patro, and S. R. Salkuti, “Prediction of diabetes disease using machine learning algorithms,” IAES
International Journal of Artificial Intelligence (IJ-AI), vol. 11, no. 1, pp. 284–290, Mar. 2022, doi: 10.11591/ijai.v11.i1.pp284-
290.
[5] H. Pratt, F. Coenen, D. M. Broadbent, S. P. Harding, and Y. Zheng, “Convolutional neural networks for diabetic retinopathy,”
International Conference On Medical Imaging Understanding and Analysis 2016, MIUA 2016, vol. 90, pp. 200–205, Jul. 2016,
doi: 10.1016/j.procs.2016.07.014.
[6] M. D. Abràmoff et al., “Improved automated detection of diabetic retinopathy on a publicly available dataset through integration
of deep learning,” Investigative Opthalmology & Visual Science, vol. 57, no. 13, pp. 5200–5206, Oct. 2016, doi: 10.1167/iovs.16-
19964.
[7] Y.-H. Li, N.-N. Yeh, S.-J. Chen, and Y.-C. Chung, “Computer-assisted diagnosis for diabetic retinopathy based on fundus images
using deep convolutional neural network,” Mobile Information Systems, vol. 2019, pp. 1–15, Jan. 2019,
doi: 10.1155/2019/6142839.
[8] M. Shaban et al., “A convolutional neural network for the screening and staging of diabetic retinopathy,” PLOS ONE, vol. 15,
no. 6, p. e0233514, Jun. 2020, doi: 10.1371/journal.pone.0233514.
[9] Y. S. Devi and S. P. Kumar, “A scoping review of diabetic retinopathy detection techniques using deep learning: taxonomy,
methods, and recent developments,” High Technology Letters, vol. 26, no. 11, pp. 392–406, 2020.
[10] W. Zhang et al., “Automated identification and grading system of diabetic retinopathy using deep neural networks,” Knowledge-
Based Systems, vol. 175, pp. 12–25, Jul. 2019, doi: 10.1016/j.knosys.2019.03.016.
[11] C. Lam, D. Yi, M. Guo, and T. Lindsey, “Automated detection of diabetic retinopathy using deep learning,” in AMIA summits on
translational science proceedings, 2018, pp. 147–155.
[12] K. Xu, D. Feng, and H. Mi, “Deep convolutional neural network-based early automated detection of diabetic retinopathy using
fundus image,” Molecules, vol. 22, no. 12, p. 2054, Nov. 2017, doi: 10.3390/molecules22122054.
[13] S. Dutta, B. C. S. Manideep, S. M. Basha, R. D. Caytiles, and N. C. S. N. Iyengar, “Classification of diabetic retinopathy images
by using deep learning models,” International Journal of Grid and Distributed Computing, vol. 11, no. 1, pp. 89–106, Jan. 2018,
doi: 10.14257/ijgdc.2018.11.1.09.
[14] V. Gulshan et al., “Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal
fundus photographs,” JAMA, vol. 316, no. 22, pp. 2402–2410, Dec. 2016, doi: 10.1001/jama.2016.17216.
[15] R. Pires, S. Avila, J. Wainer, E. Valle, M. D. Abramoff, and A. Rocha, “A data-driven approach to referable diabetic retinopathy
detection,” Artificial Intelligence in Medicine, vol. 96, pp. 93–106, May 2019, doi: 10.1016/j.artmed.2019.03.009.
[16] Mobeen-ur-Rehman, S. H. Khan, Z. Abbas, and S. M. D. Rizvi, “Classification of diabetic retinopathy images based on
customised CNN architecture,” in 2019 Amity International Conference on Artificial Intelligence (AICAI), Feb. 2019,
pp. 244–248, doi: 10.1109/aicai.2019.8701231.
[17] Y. Wang, S. Nazir, and M. Shafiq, “An overview on analyzing deep learning and transfer learning approaches for health
monitoring,” Computational and Mathematical Methods in Medicine, vol. 2021, pp. 1–10, Mar. 2021,
doi: 10.1155/2021/5552743.
[18] R. Rajalakshmi, R. Subashini, R. M. Anjana, and V. Mohan, “Automated diabetic retinopathy detection in smartphone-based
fundus photography using artificial intelligence,” Eye, vol. 32, no. 6, pp. 1138–1144, Mar. 2018, doi: 10.1038/s41433-018-0064-
9.
[19] Kaggle, “Diabetic retinopathy detection.” 2015, Accessed: Sep. 10, 2021. [Online]. Available:
https://www.kaggle.com/c/diabetic-retinopathy-detection.
[20] “Messidor-Adcis.” 2010, Accessed: Oct. 18, 2021. [Online]. Available: https://www.adcis.net/en/third-party/messidor/.
“APTOS 2019 blindness detection,” Asia Pacific Tele-Ophthalmology Society (APTOS), 2019, Accessed: Sep. 11, 2021. [Online]. Available:
https://www.kaggle.com/c/aptos2019-blindness-detection.
[22] R. Sarki, K. Ahmed, H. Wang, and Y. Zhang, “Automatic detection of diabetic eye disease through deep learning using fundus
images: a survey,” IEEE Access, vol. 8, pp. 151133–151149, 2020, doi: 10.1109/access.2020.3015258.
[23] H. Chen and P. Cao, “Deep learning based data augmentation and classification for limited medical data learning,” in 2019 IEEE
International Conference on Power, Intelligent Computing and Systems (ICPICS), Jul. 2019, pp. 300–303, doi:
10.1109/icpics47731.2019.8942411.
[24] T. Araújo et al., “Data augmentation for improving proliferative diabetic retinopathy detection in eye fundus images,” IEEE
Access, vol. 8, pp. 182462–182474, 2020, doi: 10.1109/ACCESS.2020.3028960.
[25] S. S. Rahim, V. Palade, I. Almakky, and A. Holzinger, “Detection of diabetic retinopathy and maculopathy in eye fundus images
using deep learning and image augmentation,” in International Cross-Domain Conference for Machine Learning and Knowledge
Extraction, 2019, pp. 114–127, doi: 10.1007/978-3-030-29726-8_8.
[26] C. Xu, L. Xu, P. Ohorodnyk, M. Roth, B. Chen, and S. Li, “Contrast agent-free synthesis and segmentation of ischemic heart
disease images using progressive sequential causal GANs,” Medical Image Analysis, vol. 62, p. 101668, May 2020,
doi: 10.1016/j.media.2020.101668.
[27] F. J. Moreno-Barea, J. M. Jerez, and L. Franco, “Improving classification accuracy using data augmentation on small data sets,”
Expert Systems with Applications, vol. 161, p. 113696, Dec. 2020, doi: 10.1016/j.eswa.2020.113696.
[28] M. Seetha, N. Kalyani, and Y. Sravani Devi, “An ensemble CNN model for identification of diabetic retinopathy eye disease,” in
Smart Intelligent Computing and Applications, 2022, vol. 2, pp. 191–198, doi: 10.1007/978-981-16-9705-0_19.
[29] A. Bin Tufail et al., “Diagnosis of diabetic retinopathy through retinal fundus images and 3D convolutional neural networks with
limited number of samples,” Wireless Communications and Mobile Computing, vol. 2021, pp. 1–15, Nov. 2021, doi:
10.1155/2021/6013448.
[30] T. Shanthi and R. S. Sabeenian, “Modified Alexnet architecture for classification of diabetic retinopathy images,” Computers &
Electrical Engineering, vol. 76, pp. 56–64, Jun. 2019, doi: 10.1016/j.compeleceng.2019.03.004.
BIOGRAPHIES OF AUTHORS
Singam Phani Kumar completed his B.E. (CSE) from VTU, Belgaum, and his M.Tech. (SE) and Ph.D. from Bharath University, Chennai. Presently he is working as Professor & Head, Department of CSE, School of Technology, GITAM Deemed to be University, Hyderabad. He has 30 research papers in reputed peer-reviewed international journals, in addition to 14 papers in international conferences and 2 Indian patents published to his credit. He has co-authored five book chapters in Springer series. He is a life member of ISTE, a member of CSI, and a member of the Indian Science Congress Association. His research interests are safety-critical systems, software safety, wireless sensor networks, machine intelligence, and IoT security. He can be contacted at email: [email protected].