Banana Ripeness Classification Based On Deep Learning Using Convolutional Neural Network
All content following this page was uploaded by Andi Wahju Rahardjo Emanuel on 01 June 2021.
Abstract—Fruit ripeness is an important factor in agriculture because it determines the fruit's quality. Determining fruit ripeness manually has several weaknesses: it takes a relatively long time, requires a lot of labor, and can cause inconsistencies. The agricultural sector is one of the essential sectors of the economy in Indonesia. However, the process of determining fruit ripeness is sometimes still done manually. The development of computer vision and machine learning technologies can be used to classify fruit ripeness automatically. This study applies a Convolutional Neural Network to classify the ripeness of bananas. The banana's ripeness is divided into four classes: unripe/green, yellowish-green, mid-ripen, and overripe. Two pre-trained models are used: MobileNet V2 and NASNetMobile. The experiment was conducted using Google Colab and several libraries such as OpenCV, TensorFlow, and scikit-learn. The results show that MobileNet V2 achieves higher accuracy and faster execution time than NASNetMobile. The highest accuracy achieved is 96.18%.

Keywords—Fruit ripeness, computer vision, CNN, pre-trained model

technology has developed, and one of the deep learning methods that plays a significant role is the Convolutional Neural Network (CNN).

This study aims to apply the Convolutional Neural Network to classify the ripeness of bananas. The levels of ripeness measured are unripe/green, yellowish-green, mid-ripen, and overripe. Image preprocessing, such as removing noise from the images and resizing, was applied, and data augmentation was used to produce variations of the images in the dataset. The dataset used in this study is the banana ripeness dataset provided by [11]. The pre-trained CNN models used in this study are MobileNet V2 [12] and NASNetMobile [13]. Transfer learning by fine-tuning is applied to the pre-trained MobileNet V2 and NASNetMobile for classifying banana ripeness. The transfer learning method is chosen because training a CNN model from scratch requires immense computational and memory resources and a large dataset [14]. After training, several measurements are computed to determine each model's performance: accuracy, precision, recall, and F1 score.
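The preprocessing and augmentation steps described above can be sketched with OpenCV, one of the libraries named in the abstract. The filter parameters, target size, and the particular augmentation operations below are illustrative assumptions rather than the paper's exact settings:

```python
import cv2
import numpy as np

def preprocess(img, size=(224, 224)):
    """Remove noise with an edge-preserving bilateral filter, then resize.

    The filter parameters and target size are illustrative assumptions;
    the paper does not report its exact values.
    """
    denoised = cv2.bilateralFilter(img, 9, 75, 75)
    resized = cv2.resize(denoised, size)
    return resized.astype(np.float32) / 255.0  # scale pixels to [0, 1]

def augment(img):
    """Generate simple variations of one image: horizontal flip,
    vertical flip, a brightness change, and a small rotation."""
    h, w = img.shape[:2]
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), 15, 1.0)  # 15-degree rotation
    return [
        cv2.flip(img, 1),                     # horizontal flip
        cv2.flip(img, 0),                     # vertical flip
        cv2.convertScaleAbs(img, alpha=1.2),  # brighten by 20%
        cv2.warpAffine(img, rot, (w, h)),     # rotation
    ]
```

In practice, such augmented variants are generated on the fly during training so the model sees a different variation of each banana image at each epoch.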
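Transfer learning by fine-tuning with a chosen "starting layer for unfreezing" can be sketched with TensorFlow as follows. The use of MobileNet V2, the four ripeness classes, and the unfreezing idea come from the paper; the classifier head, optimizer, learning rate, and input size are assumptions for illustration only:

```python
import tensorflow as tf

NUM_CLASSES = 4  # unripe/green, yellowish-green, mid-ripen, overripe

def build_finetune_model(unfreeze_from=100, weights="imagenet"):
    """Build a MobileNet V2 fine-tuning model.

    Layers before `unfreeze_from` stay frozen; layers from it onward are
    trainable, mirroring the paper's starting layer for unfreezing.
    """
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, weights=weights)
    for i, layer in enumerate(base.layers):
        layer.trainable = i >= unfreeze_from
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

The same pattern applies to NASNetMobile by swapping the base model and passing a higher `unfreeze_from` (e.g., 600 or 700), since NASNetMobile has many more layers.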
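The four performance measurements can be computed with scikit-learn, also named in the abstract. The label arrays below are illustrative placeholders, not the paper's data:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 1, 2, 3, 0, 1, 2, 3]  # ground-truth ripeness classes
y_pred = [0, 1, 2, 3, 0, 1, 2, 2]  # model predictions (one mistake)

accuracy = accuracy_score(y_true, y_pred)
# Macro averaging weights each of the four ripeness classes equally.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro")
print(accuracy, precision, recall, f1)
```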
TABLE I. PERFORMANCE RESULT OF MOBILENET V2

No. of Epochs | Unfreezing Starting Layer | Accuracy | Precision | Recall | F1 score
50  | 100th | 94.66% | 95.25% | 94.53% | 94.41%
50  | 125th | 91.60% | 91.80% | 91.59% | 91.37%
100 | 100th | 96.18% | 96.53% | 96.09% | 96.02%
100 | 125th | 93.13% | 93.21% | 93.08% | 93.22%

As shown in Table 1, with a total of 50 epochs and unfreezing from the 100th layer, the model achieves 94.66% accuracy, 95.25% precision, 94.53% recall, and 94.41% F1 score. However, increasing the starting layer for unfreezing to the 125th layer lowers the accuracy, precision, recall, and F1 score to 91.60%, 91.80%, 91.59%, and 91.37%, respectively. The result shows that increasing the starting layer for unfreezing lowers the model's performance in classifying the banana's ripeness.

The next experiment uses a total of 100 epochs to train the model. The result shows that the model's performance improves over the previous experiment, with 96.18% accuracy, 96.53% precision, 96.09% recall, and 96.02% F1 score. As in the previous experiment, increasing the starting layer for unfreezing to the 125th layer yields lower performance than choosing the 100th layer. Using 100 epochs with the 125th layer as the starting layer is, however, better than using 50 epochs with the same starting layer: 93.13% accuracy, 93.21% precision, 93.08% recall, and 93.22% F1 score.

In this experiment, choosing the 100th layer as the starting layer for unfreezing the model gives better performance at both 50 and 100 epochs, with the highest performance achieved using 100 epochs. However, increasing the starting layer from the 100th to the 125th tends to reduce the model's performance.

The next result shows the performance of NASNetMobile, which can be seen in Table 2 below. The experiment was done using the same numbers of epochs as the previous experiment on MobileNet V2: 50 epochs and 100 epochs. For NASNetMobile, the starting layers for unfreezing the model are the 600th and 700th layers because NASNetMobile has more layers than MobileNet V2.

TABLE II. PERFORMANCE RESULT OF NASNETMOBILE

No. of Epochs | Unfreezing Starting Layer | Accuracy | Precision | Recall | F1 score
50  | 600th | 90.08% | 90.20% | 90.35% | 90.15%
50  | 700th | 88.55% | 88.75% | 88.76% | 88.57%
100 | 600th | 90.84% | 90.81% | 91.02% | 90.84%
100 | 700th | 90.08% | 90.22% | 90.35% | 90.11%

Training the NASNetMobile using 50 epochs and choosing the 600th layer as the starting layer for unfreezing achieves 90.08% accuracy, 90.20% precision, 90.35% recall, and 90.15% F1 score. Increasing the starting layer to the 700th layer, on the contrary, achieves lower performance than the previous: 88.55%, 88.75%, 88.76%, and 88.57% on the accuracy, precision, recall, and F1 score, respectively. The resulting pattern is like the previous result of using the MobileNet V2.

The following experiment increases the number of epochs in training to 100. Using the 600th layer as the starting layer for unfreezing, the model achieves slightly higher performance than training with 50 epochs on either the 600th or 700th layer. The accuracy, precision, recall, and F1 score are 90.84%, 90.81%, 91.02%, and 90.84%. The next experiment uses 100 epochs and raises the starting layer for unfreezing to the 700th layer. The model achieves higher performance than with 50 epochs: the accuracy reached by the model is 90.08%, with 90.22% precision, 90.35% recall, and 90.11% F1 score. This result is nearly identical to the experiment using 50 epochs and the 600th layer, differing only in precision and F1 score.

Overall, in the experiments using NASNetMobile, the highest performance is achieved by using 100 epochs and choosing the 600th layer as the starting layer for unfreezing the model. As in the previous experiment using MobileNet V2, increasing the starting layer for unfreezing NASNetMobile tends to lower the performance in classifying the banana's ripeness.

Comparing MobileNet V2 and NASNetMobile, MobileNet V2 achieves the higher performance. However, both models achieve their highest performance when trained for 100 epochs; thus, increasing the number of epochs in training results in better performance. Several trials are conducted using images never seen by the model to determine whether the model can predict a banana's ripeness in an image and to measure the execution time of both models when using a CPU or a GPU. Fig. 6 shows the test images, while the results are shown in Table 3 for MobileNet V2 and Table 4 for NASNetMobile.

Fig. 6. Test images of banana with different ripeness, (a) unripe/green, (b) yellowish-green, (c) mid-ripen, and (d) overripe

TABLE III. PREDICTION RESULT AND EXECUTION TIME OF MOBILENET V2

Test Images | Prediction | Execution Time (CPU) | Execution Time (GPU)
Unripe/Green | Unripe/Green | 0.093 s | 0.046 s
Yellowish Green | Yellowish Green | 0.098 s | 0.043 s
Mid-Ripen | Mid-Ripen | 0.091 s | 0.042 s
Overripe | Overripe | 0.125 s | 0.081 s

The results in Table 3 show that MobileNet V2 predicts all the banana ripeness levels correctly. When the prediction is executed on the CPU, the execution time is in the range of 0.091 to 0.125 seconds. The execution time is faster on the GPU, with a range of 0.042 to 0.081 seconds. The variation in execution time on both CPU and GPU occurs because each image has a different size, which affects the execution time; the bigger the image, the longer the execution time.
TABLE IV. PREDICTION RESULT AND EXECUTION TIME OF NASNETMOBILE

Test Images | Prediction | Execution Time (CPU) | Execution Time (GPU)
Unripe/Green | Yellowish Green | 0.157 s | 0.055 s
Yellowish Green | Yellowish Green | 0.150 s | 0.053 s
Mid-Ripen | Mid-Ripen | 0.175 s | 0.062 s
Overripe | Overripe | 0.189 s | 0.083 s

The results in Table 4 show that NASNetMobile cannot correctly predict the unripe/green banana, although the rest are correctly predicted; this misprediction is consistent with NASNetMobile's lower accuracy. When the prediction is executed on the CPU, the execution time is in the range of 0.150 to 0.189 seconds, while on the GPU it is in the range of 0.053 to 0.083 seconds. Overall, the execution time is faster when using the GPU. Moreover, the execution time of MobileNet V2 is faster than that of NASNetMobile because the MobileNet V2 model is smaller.

V. CONCLUSION

In this study, two pre-trained models, MobileNet V2 and NASNetMobile, were used to classify banana ripeness. The banana ripeness is divided into four classes: unripe/green, yellowish-green, mid-ripen, and overripe. The transfer learning by fine-tuning approach was applied to train both models, using different numbers of epochs and starting layers for unfreezing the model. Image preprocessing, such as applying the bilateral filter, was used to remove noise from the images before training. Data augmentation such as horizontal flip, vertical flip, brightness, zoom, shear, rotation, and shifting was applied to add variation to the training data. The experiment results show that MobileNet V2 achieves higher performance than NASNetMobile, with the highest accuracy achieved by MobileNet V2 being 96.18%. Choosing the starting layer for unfreezing the model affects the performance of the model: the experiments show that choosing a higher layer as the starting layer tends to lower each model's performance. The results of each experiment also show that increasing the number of epochs in training improves the model's performance. In terms of execution time, MobileNet V2 is faster than NASNetMobile. The image size affects the execution time; the larger the image, the longer the execution time. Future research includes training the models with more data containing variations of bananas in an image, and applying object detection that can automatically detect a banana in an image or video stream, combined with a classifier, to create a real-time banana ripeness classifier.

ACKNOWLEDGMENT

The authors would like to convey gratitude to Universitas Atma Jaya Yogyakarta, Yogyakarta, Indonesia, for funding this research.

REFERENCES

[1] J. Naranjo-Torres, M. Mora, R. Hernández-García, R. J. Barrientos, C. Fredes, and A. Valenzuela, "A review of convolutional neural network applied to fruit image processing," Appl. Sci., vol. 10, no. 10, 2020.
[2] M. Khojastehnazhand, V. Mohammadi, and S. Minaei, "Maturity detection and volume estimation of apricot using image processing technique," Sci. Hortic., vol. 251, pp. 247–251, 2019.
[3] A. Koirala, K. B. Walsh, Z. Wang, and C. McCarthy, "Deep learning for real-time fruit detection and orchard fruit load estimation: benchmarking of 'MangoYOLO,'" Precision Agriculture, vol. 20, no. 6, pp. 1107–1135, 2019.
[4] Y. D. Zhang et al., "Image based fruit category classification by 13-layer deep convolutional neural network and data augmentation," Multimed. Tools Appl., vol. 78, no. 3, pp. 3613–3632, 2019.
[5] W. Castro, J. Oblitas, M. De-La-Torre, C. Cotrina, K. Bazan, and H. Avila-George, "Classification of Cape Gooseberry Fruit According to its Level of Ripeness Using Machine Learning Techniques and Different Color Spaces," IEEE Access, vol. 7, pp. 27389–27400, 2019.
[6] M. P. Arakeri and Lakshmana, "Computer Vision Based Fruit Grading System for Quality Evaluation of Tomato in Agriculture industry," Procedia Comput. Sci., vol. 79, pp. 426–433, 2016.
[7] A. Wajid, N. K. Singh, P. Junjun, and M. A. Mughal, "Recognition of ripe, unripe and scaled condition of orange citrus based on decision tree classification," in 2018 Int. Conf. Comput. Math. Eng. Technol. (iCoMET 2018), pp. 1–4, 2018.
[8] A. Bashir, S. Suhel, A. Azwardi, D. P. Atiyatna, I. Hamidi, and N. Adnan, "The Causality Between Agriculture, Industry, and Economic Growth: Evidence from Indonesia," Etikonomi, vol. 18, no. 2, pp. 155–168, 2019.
[9] I. B. Suban, A. Paramartha, M. Fortwonatus, and A. J. Santoso, "Identification the Maturity Level of Carica Papaya Using the K-Nearest Neighbor," J. Phys. Conf. Ser., vol. 1577, no. 1, 2020.
[10] J. Pardede, M. G. Husada, A. N. Hermana, and S. A. Rumapea, "Fruit Ripeness Based on RGB, HSV, HSL, L∗a∗b∗ Color Feature Using SVM," in 2019 Int. Conf. Comput. Sci. Inf. Technol. (ICoSNIKOM), 2019.
[11] F. M. A. Mazen and A. A. Nashat, "Ripeness Classification of Bananas Using an Artificial Neural Network," Arab. J. Sci. Eng., vol. 44, no. 8, pp. 6901–6910, Aug. 2019.
[12] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. C. Chen, "MobileNetV2: Inverted Residuals and Linear Bottlenecks," Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 4510–4520, 2018.
[13] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, "Learning Transferable Architectures for Scalable Image Recognition," Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 8697–8710, 2018.
[14] R. Patel and A. Chaware, "Transfer Learning with Fine-Tuned MobileNetV2 for Diabetic Retinopathy," in 2020 International Conference for Emerging Technology (INCET), 2020, pp. 1–4.
[15] K. Kangune, V. Kulkarni, and P. Kosamkar, "Grapes Ripeness Estimation using Convolutional Neural network and Support Vector Machine," in 2019 Global Conference for Advancement in Technology (GCAT), 2019, pp. 1–5.
[16] Z. Ibrahim, N. Sabri, and D. Isa, "Palm oil fresh fruit bunch ripeness grading recognition using convolutional neural network," J. Telecommun. Electron. Comput. Eng., vol. 10, no. 3–2, pp. 109–113, 2018.
[17] M. Momeny, A. Jahanbakhshi, K. Jafarnezhad, and Y. D. Zhang, "Accurate classification of cherry fruit using deep CNN based on hybrid pooling approach," Postharvest Biol. Technol., vol. 166, p. 111204, 2020.
[18] R. Thakur, G. Suryawanshi, H. Patel, and J. Sangoi, "An Innovative Approach for Fruit Ripeness Classification," in Proc. Int. Conf. Intell. Comput. Control Syst. (ICICCS 2020), pp. 550–554, 2020.
[19] DSCEteamD, "BananaCo_images." [Online]. Available: https://github.com/DSCEteamD. [Accessed: 02-Nov-2020].
[20] D. Zhang, J. Kang, L. Xun, and Y. Huang, "Hyperspectral Image Classification Using Spatial and Edge Features Based on Deep Learning," Int. J. Pattern Recognit. Artif. Intell., vol. 33, no. 9, 2019.
[21] N. K. Manaswi, Deep Learning with Applications Using Python. Karnataka: Apress, 2018.
[22] P. Szymak, P. Piskur, and K. Naus, "The effectiveness of using a pretrained deep learning neural networks for object classification in underwater video," Remote Sens., vol. 12, no. 18, pp. 1–19, 2020.
[23] A. Rosebrock, Deep Learning for Computer Vision with Python Practitioner Bundle. Maryland: PyImageSearch, 2017.