Banana Ripeness Classification Based on Deep Learning using Convolutional Neural Network

Conference Paper · April 2021
DOI: 10.1109/EIConCIT50028.2021.9431928

Raymond Erz Saragih
Magister Informatika
Universitas Atma Jaya Yogyakarta
Yogyakarta, Indonesia

Andi W. R. Emanuel
Magister Informatika
Universitas Atma Jaya Yogyakarta
Yogyakarta, Indonesia

Abstract—Fruit ripeness is important in agriculture because it determines the fruit's quality. Determining ripeness manually has several weaknesses: it takes a relatively long time, requires a lot of labor, and can cause inconsistencies. The agricultural sector is one of the essential sectors of the economy in Indonesia. However, fruit ripeness is sometimes still determined manually. Computer vision and machine learning technologies can be used to classify fruit ripeness automatically. This study applies the Convolutional Neural Network (CNN) to classify the ripeness of bananas. The banana's ripeness is divided into four classes: unripe/green, yellowish-green, mid-ripen, and overripe. Two pre-trained models are used: MobileNet V2 and NASNetMobile. The experiment was conducted using Google Colab and several libraries such as OpenCV, TensorFlow, and scikit-learn. The results show that MobileNet V2 achieves higher accuracy and faster execution time than NASNetMobile. The highest accuracy achieved is 96.18%.

Keywords—Fruit ripeness, computer vision, CNN, pre-trained model

I. INTRODUCTION

Advances in technology have a significant impact on human life. Technology is used in various fields to assist humans in carrying out different processes. One such field is agriculture, where the use of innovative technology is a crucial aspect [1]. Some of the latest technologies used are computer vision and machine learning [1], [2]. In agriculture, they are used for several tasks, such as fruit detection [3], fruit classification [4], and determining the level of ripeness or the defects of the fruit [5].

Fruit ripeness is essential in agriculture because it determines the fruit's quality [5]. Previously, fruit ripeness was determined manually. The manual method has several weaknesses: it takes a relatively long time, requires a lot of labor, and can cause inconsistencies in determining fruit ripeness [5], [6], [7]. Computer vision technology can overcome these problems because fruit ripeness can be classified automatically, which is relatively quick, consistent, and inexpensive [2].

In Indonesia, agriculture plays a vital role in the country's economy [8], [9]. Various kinds of fruit are produced, such as mango, salak, orange, banana, watermelon, and many more [10]. However, fruit ripeness is sometimes still determined manually. Computer vision and machine learning technologies can be used to classify fruit ripeness automatically. Within machine learning, deep learning technology has developed, and one of the deep learning methods that plays a significant role is the Convolutional Neural Network (CNN).

This study aims to apply the Convolutional Neural Network to classify the ripeness of bananas. The levels of ripeness considered are unripe/green, yellowish-green, mid-ripen, and overripe. Image preprocessing, such as noise removal and resizing, is applied, and data augmentation is used to produce variations of the images in the dataset. The dataset used in this study is the banana ripeness dataset provided by [11]. The pre-trained CNN models used in this study are MobileNet V2 [12] and NASNetMobile [13]. Transfer learning by fine-tuning is applied to the pre-trained MobileNet V2 and NASNetMobile to classify banana ripeness. Transfer learning is chosen because training a CNN model from scratch requires immense computational and memory resources and a large amount of data [14]. After training, several measurements are taken to determine each model's performance: accuracy, precision, recall, and F1 score.

II. LITERATURE REVIEW

Various studies have been conducted to determine the level of fruit ripeness using computer vision and machine learning. Kangune et al. [15] compared the accuracy of CNN and Support Vector Machine (SVM) in classifying the maturity of grapes. Color features such as RGB and HSV and morphological features were used, and Gaussian blur was applied in the preprocessing stage. The accuracy of the CNN was 79.49%, higher than that of the SVM, which was 69%. Ibrahim et al. [16] used AlexNet and SVM to classify the ripeness of oil palm fruit bunches. The SVM's input features were the color moment, the Fast Retina Keypoint (FREAK) binary feature, and the Histogram of Oriented Gradient (HOG) texture feature. Preprocessing consisted of resizing the images and converting them to grayscale. The accuracy using AlexNet was higher than using SVM. Momeny et al. [17] classified cherries based on their shape. The accuracy of CNN was compared with various machine learning methods, such as K-Nearest Neighbors (KNN), Artificial Neural Network (ANN), Fuzzy, and Ensemble Decision Trees. Feature extractors, such as HOG and LBP, were used. In the preprocessing stage, the images were segmented and resized, and data augmentation was used to increase the number of images in the dataset. The accuracy of CNN reached up to 99%, which is higher than the other methods.

The study by Mazen and Nashat classified banana ripeness using several machine learning methods [11]. The methods include ANN, SVM, Naïve Bayes, KNN, Decision Tree, and Discriminant Analysis. Several preprocessing steps were carried out, such as image smoothing, color channel
conversion, morphological filters, and segmentation. Tamura texture features and new features were used to define the banana ripeness factor and served as the input. The accuracy achieved was up to 100% for the green and overripe bananas and 97.75% for yellowish-green and mid-ripen bananas. Thakur et al. classified strawberries by ripeness [18]. A CNN model was used and achieved an accuracy of 91.6%. Pardede et al. [10] classified the ripeness of various fruits, such as apple, mango, orange, and tomato. Color features, such as RGB, HSV, HSL, and L*a*b*, were used, with SVM as the classifier. Polynomial kernels of different degrees were tried to obtain the best result. The highest accuracy, 76%, was obtained using the HSV color feature and a 6th-degree polynomial.

Khojastehnazhand et al. used LDA and QDA to classify the ripeness of apricots and estimate the fruit's volume [2]. Color features, such as the G channel, grayscale, L*, and b*, were used as input for the LDA and QDA. In the preprocessing stage, noise removal and segmentation were conducted. The results show that the highest accuracy was achieved using QDA. Suban et al. classified the ripeness of Carica papaya fruit using KNN [9]. The RGB values were used as the input feature for KNN, and the accuracy obtained was 100%. Arakeri and Lakshmana used ANN to classify tomatoes based on their ripeness and degree of defect. The accuracy obtained was 100% for defects and 96.47% for ripeness [6]. Castro et al. [5] classified Cape Gooseberry fruit by maturity level, using classifiers such as SVM, ANN, Decision Tree, and KNN with RGB, HSV, and L*a*b* color features. In addition, PCA was used to combine the color features. The results show that the L*a*b* color features with SVM achieved the highest F-measure, and the use of PCA improved the model's performance.

This study aims to apply and evaluate two pre-trained CNN models, MobileNet V2 and NASNetMobile, for classifying the ripeness of banana fruit as unripe/green, yellowish-green, mid-ripen, or overripe. Image processing was conducted to reduce noise, and data augmentation was applied to add variations to the dataset.

III. RESEARCH METHODOLOGY

Fig. 1 shows the stages in this study: literature study, obtaining the dataset, image preprocessing, training the CNN models, and evaluation.

Fig. 1. Research method

Based on Fig. 1, the first stage is the literature study. In this stage, research documents and reports related to the topic are collected from various journals. The research documents and information are then used as references for this study. Through the literature study, different methods for classifying fruit ripeness can be compared and become the basis for selecting the method used in this study. CNN was chosen because previous studies stated that it could achieve high accuracy.

The second stage is obtaining the dataset. The dataset used in this study combines images from [11] and [19]. It consists of 436 banana images with four ripeness classes: unripe/green, yellowish-green, mid-ripen, and overripe. The next stage is processing the images from the dataset to remove noise. Two pre-trained models are used in the training stage: MobileNet V2 [12] and NASNetMobile [13]. Among the models available in Keras, MobileNet V2 is the smallest at 14 MB, while MobileNet is 16 MB and NASNetMobile is 23 MB. Despite their small size, MobileNet V2 and NASNetMobile can achieve high image recognition accuracy [12], [13]. Therefore, MobileNet V2 and NASNetMobile were used in this study to create an efficient model with good performance in classifying banana ripeness. Following the training, the next step is to evaluate the models by measuring the performance of each. Accuracy, precision, recall, specificity, sensitivity, and F-measure were used to measure the performance, and the results determine which model performs better.

A. Dataset Description
The dataset used in this study consists of 436 images of banana fruit obtained from [11] and [19]. The images are divided into four ripeness classes: unripe/green, yellowish-green, mid-ripen, and overripe. The image size varies from 225 x 225 pixels to 960 x 540 pixels. Fig. 2 shows several images from the banana dataset.

Fig. 2. Images of banana with different ripeness, (a) unripe/green, (b) yellowish-green, (c) mid-ripen, (d) overripe

B. Image Preprocessing
In this stage, a filter is applied to the images to remove noise. The filter used in this study is the bilateral filter. The process of applying the bilateral filter is shown in Fig. 3.

Fig. 3. Process of applying the bilateral filter

The original images from the dataset, as well as the test images, are processed by applying the bilateral filter. The bilateral filter in this work uses the function provided by OpenCV. The bilateral filter can remove noise from an image while still preserving the edges of the object in the image; therefore, the object's shape is preserved [20]. The filter produces a new output image that is then used as the CNN model's input. An example of applying the bilateral filter to an image from [11] is shown in Fig. 4.
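To illustrate why the bilateral filter of Section III-B smooths noise yet keeps edges, the following is a minimal pure-Python sketch for a small grayscale image. It is not the implementation used in the study, which relies on OpenCV's built-in function (presumably `cv2.bilateralFilter` in the Python bindings). The function name `bilateral_filter`, the toy image, and the values of `radius`, `sigma_space`, and `sigma_color` are assumptions for illustration only; the paper does not report its filter parameters.

```python
import math


def bilateral_filter(img, radius=2, sigma_space=2.0, sigma_color=25.0):
    """Bilateral-filter a grayscale image given as a list of rows of numbers."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = img[y][x]
            total = 0.0
            weight_sum = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue  # skip neighbours outside the image
                    value = img[ny][nx]
                    # Spatial weight: nearby pixels count more.
                    w_space = math.exp(-(dx * dx + dy * dy) / (2 * sigma_space ** 2))
                    # Range weight: pixels with similar intensity count more,
                    # so pixels across an edge contribute almost nothing.
                    w_color = math.exp(-((value - center) ** 2) / (2 * sigma_color ** 2))
                    total += w_space * w_color * value
                    weight_sum += w_space * w_color
            out[y][x] = total / weight_sum
    return out


# Hypothetical toy image: dark left half (about 50) and bright right half
# (about 200), each with a little noise.
noisy = [
    [52, 48, 51, 198, 203, 199],
    [49, 53, 50, 202, 197, 201],
    [51, 47, 52, 199, 204, 198],
    [50, 52, 49, 201, 196, 202],
]
smoothed = bilateral_filter(noisy)
for row in smoothed:
    print([round(v, 1) for v in row])
```

In this sketch the noise inside each half is averaged out, while the step between the two halves stays sharp because the range weight suppresses pixels on the other side of the edge.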
Fig. 4. Result of the bilateral filter (b) from the original image (a)

Applying the bilateral filter to an image creates a new image that has been smoothed and has less noise, while the shape of the banana is preserved. As stated above, the dataset consists of images of different sizes; therefore, after applying the bilateral filter, the resulting images are resized to 224 x 224 pixels, the default input size of MobileNet V2 and NASNetMobile.

C. Training CNN
After the images are processed, the next stage is training the CNN models. The CNN models used are the pre-trained MobileNet V2 [12] and NASNetMobile [13], both provided by the Keras library. MobileNet V2 is currently the smallest model provided in the Keras applications library, while NASNetMobile is the third smallest. Both models were trained on ImageNet to classify 1000 classes of objects. The original MobileNet V2 consists of 156 layers with 3,538,984 total parameters, whereas the original NASNetMobile has 771 layers with 5,326,716 total parameters. NASNetMobile therefore has a deeper architecture and more parameters than MobileNet V2. In this work, however, the original classification layer, which is based on the number of classes in the ImageNet dataset, is not used, which reduces MobileNet V2 to 154 layers and NASNetMobile to 769 layers.

Transfer learning was applied in this study to train both models to classify banana ripeness. Transfer learning consists of two approaches. The first approach uses a pre-trained CNN as a feature extractor for the new classification task by removing the last fully connected layer. A classifier is then trained on the features extracted from the CNN [21]. The second approach is fine-tuning a pre-trained CNN: training a new classification layer and retraining several layers of the pre-trained CNN model [22]. In this study, the pre-trained MobileNet V2 and NASNetMobile are fine-tuned to classify banana ripeness. An overview of the transfer learning used in this study is shown in Fig. 5.

As shown in Fig. 5, the top layers of the original pre-trained MobileNet V2 and NASNetMobile were removed. A global average pooling layer was added on top of each base model, and the final layer is the prediction layer, which uses the softmax activation function [14]. The final classification layer is customized to the desired number of classes, in this case four: green (unripe), yellowish-green, mid-ripen, and overripe. A dropout layer with a probability of 0.3 is added between the global average pooling layer and the softmax layer to reduce overfitting [23].

D. Evaluation of Model
The trained models must then be evaluated to determine their performance. The performance is measured in terms of accuracy, precision, recall, and F1 score, calculated using the following equations, where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives:

\[ \text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{1} \]

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{2} \]

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{3} \]

\[ \text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4} \]

The scikit-learn library provides methods to calculate accuracy, precision, recall, and F1 score automatically. Therefore, scikit-learn is used in the experiment to report the performance of each model.

IV. EXPERIMENTAL RESULT

The experiments were conducted using the dataset from [11] and [19]. The combined dataset contains 436 images of bananas in four ripeness classes: green, yellowish-green, mid-ripen, and overripe. The experiments were run on Google Colab with a GPU hardware accelerator and several libraries, such as OpenCV, TensorFlow, and scikit-learn. The dataset is split randomly into training data (70% of the dataset) and testing data (30% of the dataset). Each model was trained for 50 and 100 epochs with a batch size of 10. The optimizer used for each model is Adam, and the loss function is categorical cross-entropy. Data augmentation, such as horizontal and vertical flips, brightness, zoom, shear, rotation, and shifting, was applied to the training data to create variations of the training images.
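As an illustration of the evaluation in Section III-D, the sketch below computes accuracy and macro-averaged precision, recall, and F1 score from a multi-class confusion matrix in plain Python. The study itself obtains these values from scikit-learn; the function name `macro_metrics` and all confusion-matrix counts here are hypothetical. The counts assume a test set of 131 images (30% of 436, rounded) with 126 correct predictions, which gives 126/131 = 96.18% accuracy; the per-class breakdown is invented for illustration and is not the paper's data.

```python
def macro_metrics(confusion):
    """Accuracy and macro-averaged precision, recall, F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # In the multi-class case, accuracy is the share of correct predictions.
    correct = sum(confusion[k][k] for k in range(n))
    precision, recall, f1 = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, but wrong
        fn = sum(confusion[k]) - tp                       # true k, but missed
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(2 * p * r / (p + r) if p + r else 0.0)
    # Macro averaging: unweighted mean of the per-class scores.
    return {
        "accuracy": correct / total,
        "precision": sum(precision) / n,
        "recall": sum(recall) / n,
        "f1": sum(f1) / n,
    }


# Hypothetical counts; rows and columns follow the order
# unripe/green, yellowish-green, mid-ripen, overripe.
confusion = [
    [33, 1, 0, 0],
    [1, 30, 1, 0],
    [0, 1, 31, 1],
    [0, 0, 0, 32],
]
metrics = macro_metrics(confusion)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Macro averaging weights every class equally regardless of how many test images it contains, so the macro precision, recall, and F1 score can differ slightly from the accuracy even on the same predictions.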
Fig. 5. Overview of the transfer learning

The first step is to freeze the base model and do a warm-up training of the new classification layer [23]. The warm-up stage uses 20 epochs. The next step is to unfreeze several top layers of the base model and retrain the model together with the new classification layers [14]. For MobileNet V2, unfreezing was done from the 100th and from the 125th layer of the base model, while for NASNetMobile, unfreezing was done from the 600th and from the 700th layer. The learning rate used in the warm-up stage is 0.0001, and in the fine-tuning stage the learning rate is lowered to 0.00001.

Following the training process, an evaluation was done to determine the performance of each model. Table 1 shows the performance of the MobileNet V2 model after the training process. The precision, recall, and F1 score listed are macro averages.

TABLE I. PERFORMANCE RESULT OF MOBILENET V2

No. of Epochs   Unfreezing Starting Layer   Accuracy   Precision   Recall   F1 score
50              100th                       94.66%     95.25%      94.53%   94.41%
50              125th                       91.60%     91.80%      91.59%   91.37%
100             100th                       96.18%     96.53%      96.09%   96.02%
100             125th                       93.13%     93.21%      93.08%   93.22%

As shown in Table 1, with a total of 50 epochs and unfreezing from the 100th layer, the model achieves 94.66% accuracy, 95.25% precision, 94.53% recall, and 94.41% F1 score. When the starting layer for unfreezing is raised to the 125th layer, the accuracy, precision, recall, and F1 score are lower: 91.60%, 91.80%, 91.59%, and 91.37%, respectively. The result shows that raising the starting layer for unfreezing lowers the model's performance in classifying the banana's ripeness.

The next experiment uses a total of 100 epochs to train the model. The model's performance is higher than in the previous experiment, with 96.18% accuracy, 96.53% precision, 96.09% recall, and 96.02% F1 score. As in the previous experiment, raising the starting layer for unfreezing to the 125th layer gives lower performance than starting from the 100th layer. Using 100 epochs with the 125th layer as the starting layer, however, gives higher performance than using 50 epochs with the same starting layer: 93.13% accuracy, 93.21% precision, 93.08% recall, and 93.22% F1 score.

In these experiments, choosing the 100th layer as the starting layer for unfreezing gives better performance at both 50 and 100 epochs, with the highest performance achieved using 100 epochs. Raising the starting layer from the 100th to the 125th tends to reduce the model's performance.

The next result is the performance of NASNetMobile, shown in Table 2. The experiment used the same numbers of epochs as the MobileNet V2 experiment, 50 and 100. For NASNetMobile, the starting layers for unfreezing are the 600th and 700th layers because NASNetMobile has more layers than MobileNet V2.

TABLE II. PERFORMANCE RESULT OF NASNETMOBILE

No. of Epochs   Unfreezing Starting Layer   Accuracy   Precision   Recall   F1 score
50              600th                       90.08%     90.20%      90.35%   90.15%
50              700th                       88.55%     88.75%      88.76%   88.57%
100             600th                       90.84%     90.81%      91.02%   90.84%
100             700th                       90.08%     90.22%      90.35%   90.11%

Training NASNetMobile for 50 epochs with the 600th layer as the starting layer for unfreezing achieves 90.08% accuracy, 90.20% precision, 90.35% recall, and 90.15% F1 score. Raising the starting layer to the 700th layer gives lower performance: 88.55%, 88.75%, 88.76%, and 88.57% for accuracy, precision, recall, and F1 score, respectively. The pattern is the same as in the MobileNet V2 result: raising the starting layer for unfreezing tends to lower performance.

The following experiment increases the number of training epochs to 100. Using the 600th layer as the starting layer for unfreezing, the model achieves slightly higher performance than training for 50 epochs with either the 600th or the 700th layer. The accuracy, precision, recall, and F1 score are 90.84%, 90.81%, 91.02%, and 90.84%. The next experiment uses 100 epochs and raises the starting layer for unfreezing to the 700th layer. The model achieves higher performance than with 50 epochs and the same starting layer. The accuracy reached by the model is 90.08%, with 90.22% precision, 90.35% recall, and 90.11% F1 score. This result is nearly identical to that of the experiment using 50 epochs and the 600th layer, differing only in precision and F1 score.

Overall, in the NASNetMobile experiments, the highest performance is achieved by using 100 epochs and choosing the 600th layer as the starting layer for unfreezing. As in the MobileNet V2 experiments, raising the starting layer for unfreezing NASNetMobile tends to lower the performance in classifying the banana's ripeness.

Comparing the two models, MobileNet V2 achieves higher performance than NASNetMobile. Both models achieve their highest performance when trained for 100 epochs; thus, in these experiments, increasing the number of epochs from 50 to 100 results in better performance. Several trials were conducted using images never seen by the models to check whether each model can predict a banana's ripeness in an image and to measure the execution time of both models on CPU and GPU. Fig. 6 shows the test images, while the results are shown in Table 3 for MobileNet V2 and Table 4 for NASNetMobile.

Fig. 6. Test images of banana with different ripeness, (a) unripe/green, (b) yellowish-green, (c) mid-ripen, and (d) overripe

TABLE III. PREDICTION RESULT AND EXECUTION TIME OF MOBILENET V2

Test Image        Prediction        Execution Time (CPU)   Execution Time (GPU)
Unripe/Green      Unripe/Green      0.093 s                0.046 s
Yellowish Green   Yellowish Green   0.098 s                0.043 s
Mid-Ripen         Mid-Ripen         0.091 s                0.042 s
Overripe          Overripe          0.125 s                0.081 s

The results in Table 3 show that MobileNet V2 predicts the ripeness of all test bananas correctly. When the prediction is executed on the CPU, the execution time ranges from 0.091 seconds to 0.125 seconds. The execution time is shorter on the GPU, ranging from 0.042 seconds to 0.081 seconds. The execution time differs between images on both CPU and GPU because each image has a different size: the bigger the image, the longer the execution time.
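The paper does not describe how the per-image execution times were measured, nor whether they include image loading and preprocessing. One common way to take such a measurement is sketched below, using `time.perf_counter` with one untimed warm-up call and an average over several repeats. The names `time_prediction` and `placeholder_predict` are hypothetical, and the placeholder stands in for the real preprocessing and model inference, which are not reproduced here.

```python
import time


def time_prediction(predict, image, repeats=10):
    """Return (label, mean seconds per call) for predict(image)."""
    predict(image)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        label = predict(image)
    mean_seconds = (time.perf_counter() - start) / repeats
    return label, mean_seconds


def placeholder_predict(image):
    """Stand-in for preprocessing plus model inference (hypothetical)."""
    # Do some work proportional to the image size, then return a fixed label.
    _ = sum(sum(row) for row in image)
    return "Mid-Ripen"


# Hypothetical 224 x 224 single-channel image filled with zeros.
image = [[0] * 224 for _ in range(224)]
label, seconds = time_prediction(placeholder_predict, image)
print(f"{label}: {seconds:.6f} s per prediction")
```

Averaging over repeats and discarding the first call reduces the influence of one-off costs such as lazy initialization, which matters when comparing times on the order of tens of milliseconds.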
TABLE IV. PREDICTION RESULT AND EXECUTION TIME OF NASNETMOBILE

Test Image        Prediction        Execution Time (CPU)   Execution Time (GPU)
Unripe/Green      Yellowish Green   0.157 s                0.055 s
Yellowish Green   Yellowish Green   0.150 s                0.053 s
Mid-Ripen         Mid-Ripen         0.175 s                0.062 s
Overripe          Overripe          0.189 s                0.083 s

The results in Table 4 show that NASNetMobile misclassifies the unripe/green banana as yellowish-green, while the rest are predicted correctly; the error is attributed to the lower accuracy of NASNetMobile. When the prediction is executed on the CPU, the execution time ranges from 0.150 seconds to 0.189 seconds, while on the GPU it ranges from 0.053 seconds to 0.083 seconds. Overall, the execution time is shorter on the GPU. In addition, MobileNet V2 is faster than NASNetMobile because the MobileNet V2 model is smaller.

V. CONCLUSION

In this study, two pre-trained models, MobileNet V2 and NASNetMobile, were used to classify banana ripeness. The banana ripeness is divided into four classes: unripe/green, yellowish-green, mid-ripen, and overripe. Transfer learning by fine-tuning was applied to train both models, using different numbers of epochs and different starting layers for unfreezing. Image preprocessing with the bilateral filter was used to remove noise from the images before training. Data augmentation, such as horizontal flip, vertical flip, brightness, zoom, shear, rotation, and shifting, was applied to add variations to the training data. The experimental results show that MobileNet V2 achieves higher performance than NASNetMobile, with a highest accuracy of 96.18%. The choice of starting layer for unfreezing affects the performance of the model: the experiments show that choosing a higher layer as the starting layer tends to lower each model's performance. The results also show that increasing the number of training epochs from 50 to 100 improves each model's performance. In terms of execution time, MobileNet V2 is faster than NASNetMobile. The image size affects the execution time: the larger the image, the longer the execution time. Future research will train the models with more data containing varied bananas in an image and apply object detection that automatically detects a banana in an image or video stream, combined with a classifier, to create a real-time banana ripeness classifier.

ACKNOWLEDGMENT

The authors would like to convey gratitude to Universitas Atma Jaya Yogyakarta, Yogyakarta, Indonesia, for funding this research.

REFERENCES
[1] J. Naranjo-Torres, M. Mora, R. Hernández-García, R. J. Barrientos, C. Fredes, and A. Valenzuela, "A review of convolutional neural network applied to fruit image processing," Appl. Sci., vol. 10, no. 10, 2020.
[2] M. Khojastehnazhand, V. Mohammadi, and S. Minaei, "Maturity detection and volume estimation of apricot using image processing technique," Sci. Hortic. (Amsterdam)., vol. 251, no. January, pp. 247–251, 2019.
[3] A. Koirala, K. B. Walsh, Z. Wang, and C. McCarthy, "Deep learning for real-time fruit detection and orchard fruit load estimation: benchmarking of 'MangoYOLO,'" Precision Agriculture, vol. 20, no. 6, pp. 1107–1135, 2019.
[4] Y. D. Zhang et al., "Image based fruit category classification by 13-layer deep convolutional neural network and data augmentation," Multimed. Tools Appl., vol. 78, no. 3, pp. 3613–3632, 2019.
[5] W. Castro, J. Oblitas, M. De-La-Torre, C. Cotrina, K. Bazan, and H. Avila-George, "Classification of Cape Gooseberry Fruit According to its Level of Ripeness Using Machine Learning Techniques and Different Color Spaces," IEEE Access, vol. 7, pp. 27389–27400, 2019.
[6] M. P. Arakeri and Lakshmana, "Computer Vision Based Fruit Grading System for Quality Evaluation of Tomato in Agriculture industry," Procedia Comput. Sci., vol. 79, pp. 426–433, 2016.
[7] A. Wajid, N. K. Singh, P. Junjun, and M. A. Mughal, "Recognition of ripe, unripe and scaled condition of orange citrus based on decision tree classification," 2018 Int. Conf. Comput. Math. Eng. Technol. Inven. Innov. Integr. Socioecon. Dev. iCoMET 2018 - Proc., vol. 2018-Janua, pp. 1–4, 2018.
[8] A. Bashir, S. Suhel, A. Azwardi, D. P. Atiyatna, I. Hamidi, and N. Adnan, "The Causality Between Agriculture, Industry, and Economic Growth: Evidence from Indonesia," Etikonomi, vol. 18, no. 2, pp. 155–168, 2019.
[9] I. B. Suban, A. Paramartha, M. Fortwonatus, and A. J. Santoso, "Identification the Maturity Level of Carica Papaya Using the K-Nearest Neighbor," J. Phys. Conf. Ser., vol. 1577, no. 1, 2020.
[10] J. Pardede, M. G. Husada, A. N. Hermana, and S. A. Rumapea, "Fruit Ripeness Based on RGB, HSV, HSL, L*a*b* Color Feature Using SVM," 2019 Int. Conf. Comput. Sci. Inf. Technol. ICoSNIKOM 2019, 2019.
[11] F. M. A. Mazen and A. A. Nashat, "Ripeness Classification of Bananas Using an Artificial Neural Network," Arab. J. Sci. Eng., vol. 44, no. 8, pp. 6901–6910, Aug. 2019.
[12] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. C. Chen, "MobileNetV2: Inverted Residuals and Linear Bottlenecks," Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 4510–4520, 2018.
[13] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, "Learning Transferable Architectures for Scalable Image Recognition," Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 8697–8710, 2018.
[14] R. Patel and A. Chaware, "Transfer Learning with Fine-Tuned MobileNetV2 for Diabetic Retinopathy," in 2020 International Conference for Emerging Technology (INCET), 2020, pp. 1–4.
[15] K. Kangune, V. Kulkarni, and P. Kosamkar, "Grapes Ripeness Estimation using Convolutional Neural network and Support Vector Machine," in 2019 Global Conference for Advancement in Technology (GCAT), 2019, pp. 1–5.
[16] Z. Ibrahim, N. Sabri, and D. Isa, "Palm oil fresh fruit bunch ripeness grading recognition using convolutional neural network," J. Telecommun. Electron. Comput. Eng., vol. 10, no. 3–2, pp. 109–113, 2018.
[17] M. Momeny, A. Jahanbakhshi, K. Jafarnezhad, and Y. D. Zhang, "Accurate classification of cherry fruit using deep CNN based on hybrid pooling approach," Postharvest Biol. Technol., vol. 166, no. December 2019, p. 111204, 2020.
[18] R. Thakur, G. Suryawanshi, H. Patel, and J. Sangoi, "An Innovative Approach for Fruit Ripeness Classification," Proc. Int. Conf. Intell. Comput. Control Syst. ICICCS 2020, no. Iciccs, pp. 550–554, 2020.
[19] DSCEteamD, "BananaCo_images." [Online]. Available: https://github.com/DSCEteamD. [Accessed: 02-Nov-2020].
[20] D. Zhang, J. Kang, L. Xun, and Y. Huang, "Hyperspectral Image Classification Using Spatial and Edge Features Based on Deep Learning," Int. J. Pattern Recognit. Artif. Intell., vol. 33, no. 9, 2019.
[21] N. K. Manaswi, Deep Learning with Applications Using Python. Karnataka: Apress, 2018.
[22] P. Szymak, P. Piskur, and K. Naus, "The effectiveness of using a pretrained deep learning neural networks for object classification in underwater video," Remote Sens., vol. 12, no. 18, pp. 1–19, 2020.
[23] A. Rosebrock, Deep Learning for Computer Vision with Python Practitioner Bundle. Maryland: PyImageSearch, 2017.
