Classification of Fruits Based On Shape and Color Using Combined Nearest Mean Classifiers
Decree of the Director General of Higher Education, Research, and Technology, No. 158/E/KPT/2021
Validity period from Volume 5 Number 2 of 2021 to Volume 10 Number 1 of 2026
JURNAL RESTI
(Rekayasa Sistem dan Teknologi Informasi)
Vol. 7 No. 1 (2023) 51 - 57 ISSN Media Electronic: 2580-0760
Abstract
Fruit classification is an important task in many agricultural industries. A fruit classification system can be used to identify the
type and price of fruit. Manual classification is inefficient for large quantities of fruit, and advances in information
technology have made it possible for a machine to perform the task. This research proposes a fruit classification
methodology based on shape and color. To reduce the effect of lighting variability, color normalization is carried out prior
to feature extraction. The color features used are the mean and standard deviation; the shape features are area,
perimeter, and compactness. An unknown fruit is classified using the nearest mean classifier. The method
is tested on 12 classes of fruit, each represented by a number of samples. The
experimental results show that the proposed method achieves an accuracy of 95.83% with two samples per class
and 100% with three samples per class. Experiments on small training sets were conducted to evaluate the performance
of the proposed combined nearest mean classifiers, and the results show that the technique provides good
accuracy.
Keywords: fruit classification, nearest mean classifier, color features, shape features
The image of the fruit is preprocessed to obtain its features. Background subtraction and color normalization are performed on the image. Background subtraction separates the image of the fruit from its background. Color normalization is then performed to eliminate the influence of different lighting.

Features in the image of the fruit are extracted and placed in feature vectors. The color features are the mean and standard deviation of each red, green, and blue (RGB) channel. The shape features are area, perimeter, and compactness. The area of the fruit reflects the actual fruit size or weight. The perimeter of the fruit is the length of its boundary. The compactness of the fruit is the ratio of the area of the fruit to the area of a circle with the same perimeter.

According to Sen et al. [10] and [11], the classification described above is called supervised classification, since the classes are known and data samples are available. To develop a supervised classifier, the computer system must first acquire knowledge by learning from the samples and recording them in a database [12].

The fruit classification system follows the structure of a recognition system proposed by Yan and Gao [13], which includes a sensor, processing, feature extraction, and a classifier algorithm. Fruits are classified indirectly, by capturing an image of the fruit object with the sensor. An object image whose features are identical to those of the real object belongs to the same class [14].

The sensor used for image capture in this system is a digital camera (or webcam), as shown in Figure 2.
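As an illustration of the shape features described above, the following Python sketch computes area, perimeter, and compactness from a binary fruit mask. The function names are hypothetical, and the perimeter estimate (a count of pixel edges between fruit and background) is an assumption made for this example, since the text does not state how the perimeter is measured. Because a circle with perimeter P has area P^2/(4π), the compactness reduces to 4πA/P^2.

```python
import math
import numpy as np

def mask_area(mask):
    """Area as the number of fruit pixels in a binary mask."""
    return int(np.count_nonzero(mask))

def mask_perimeter(mask):
    """Perimeter estimated as the number of pixel edges between fruit
    and background (an assumed estimator, not taken from the paper)."""
    m = np.pad(np.asarray(mask, dtype=bool), 1, constant_values=False)
    horizontal = np.count_nonzero(m[:, 1:] != m[:, :-1])
    vertical = np.count_nonzero(m[1:, :] != m[:-1, :])
    return int(horizontal + vertical)

def compactness(area, perimeter):
    """Ratio of the shape's area to the area of a circle with the same perimeter."""
    return 4.0 * math.pi * area / perimeter ** 2

# Toy mask: a 4 x 6 rectangle of fruit pixels on an 8 x 10 background.
mask = np.zeros((8, 10), dtype=bool)
mask[2:6, 2:8] = True

area = mask_area(mask)
perimeter = mask_perimeter(mask)
shape_features = np.array([area, perimeter, compactness(area, perimeter)])
```

A perfect circle gives a compactness of 1, and less round shapes give smaller values, which is why this feature helps separate round fruits from elongated ones.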
DOI: https://doi.org/10.29207/resti.v7i1.4693
Creative Commons Attribution 4.0 International License (CC BY 4.0)
Abdullah, Agus Harjoko, Othman Mahmod
Figure 3. Architecture of the fruit classification system

2.1. Training phase

In the training phase, the image of each fruit sample is captured through a sensor. Several fruit samples are used for each category, and each sample image is then processed.

The first process is background subtraction, which separates the fruit from its background by a pixel subtraction operation. The result is the absolute

$$r_{avg} = \frac{1}{P}\sum_{p=1}^{P} r(p) \tag{5}$$

$$g_{avg} = \frac{1}{P}\sum_{p=1}^{P} g(p) \tag{6}$$

$$b_{avg} = \frac{1}{P}\sum_{p=1}^{P} b(p) \tag{7}$$

If the fruit image is x and the number of pixels is P, the standard deviation of the fruit image color is:

$$x = (r_{std}, g_{std}, b_{std})^{T}$$ where

$$\sum_{p=1}^{P} \left(r(p) - r_{avg}\right)^{2}$$
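The color features given by equations (5) to (7), together with the per-channel standard deviation, can be sketched in Python as follows. This is a minimal example under stated assumptions: the fruit pixels are selected by a boolean mask obtained from background subtraction, the standard deviation divides by P (the population form), and the function name `color_features` is hypothetical.

```python
import numpy as np

def color_features(image, mask):
    """Return (mean, std) per RGB channel over the P fruit pixels.

    image: H x W x 3 array; mask: H x W boolean array (True = fruit).
    """
    pixels = np.asarray(image, dtype=float)[np.asarray(mask, dtype=bool)]  # shape (P, 3)
    mean = pixels.mean(axis=0)   # r_avg, g_avg, b_avg
    std = pixels.std(axis=0)     # ddof=0, i.e. divides by P (assumed)
    return mean, std

# Toy 2 x 3 image: the last column is background and is excluded by the mask.
image = np.zeros((2, 3, 3))
image[..., 0] = [[10, 20, 0], [30, 40, 0]]
image[..., 1] = 100
image[..., 2] = [[0, 2, 9], [4, 6, 9]]
mask = np.array([[True, True, False], [True, True, False]])

mean, std = color_features(image, mask)
```

Restricting the statistics to the masked pixels keeps the background color from leaking into the fruit's color features.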
the image feature database that is classified based on the category or class label.

2.2. Recognition Phase

In the recognition phase, the image of the fruit to be classified is captured through the sensor and undergoes the same preprocessing and feature extraction as in the training phase.

The extracted shape and color features form the query image features. Classification is done by measuring the shape and color similarity between the query image and each class mean, as in equation (13). The unknown fruit is represented by a feature vector q, which is assigned to class i if it is closer to the mean vector of class i than to that of any other class.

Similarity is measured through the vector distance: two vectors that are close to each other are similar and differ only slightly [14]. The nearest mean classifier (NMC) generally uses the Euclidean distance [22], [23], and it is also known as the nearest centroid classifier [24]. The distance metric used here for fruit classification is the L2 (Euclidean) metric. According to Malkov [25], the Euclidean distance between two vectors x and y is given by equation (14).

$$\|x - y\| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^{2}} \tag{14}$$

After the similarity has been measured for each feature group as above, it is measured simultaneously over the three feature groups by adding up the three group distances. Because the distance scale differs between feature groups, each distance is first normalized by dividing it by the maximum distance for that feature group. The normalized distance of each group lies in the range 0 to 1, so the total similarity distance lies in the range 0 to 3. The similarity distance is computed with equation (18).

$$d_{sim}(q, x) = \frac{d_{avg}(q, x)}{\max d_{avg}} + \frac{d_{stdev}(q, x)}{\max d_{stdev}} + \frac{d_{shape}(q, x)}{\max d_{shape}} \tag{18}$$

where $d_{sim}(q, x)$ is the similarity distance,
$\max d_{avg}$ is the maximum distance of the mean color,
$\max d_{stdev}$ is the maximum distance of the color standard deviation, and
$\max d_{shape}$ is the maximum distance of the shape features.

The similarity distance is measured against all class means. The classification rule according to [14] is given for two classes w1 and w2. The object vectors are written as {x1,...,xn}; if x1 is the mean of class w1, the new object Z is represented in the space as Zx..
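The combined nearest mean classification expressed by equations (14) and (18) might be sketched as follows. This is an illustrative Python example rather than the authors' implementation: the function names, the layout of a 9-element feature vector (three mean color, three standard deviation color, three shape values), the toy data, and the choice of taking each maximum distance over the class means for the current query are all assumptions.

```python
import numpy as np

# Assumed layout of a 9-element feature vector: mean color, std color, shape.
GROUPS = {"avg": slice(0, 3), "stdev": slice(3, 6), "shape": slice(6, 9)}

def class_means(samples, labels):
    """Training: mean feature vector of each class."""
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    means = np.stack([samples[labels == c].mean(axis=0) for c in classes])
    return classes, means

def similarity_distances(q, means):
    """Sum of the three group-wise Euclidean distances, each divided by its maximum."""
    q = np.asarray(q, dtype=float)
    total = np.zeros(len(means))
    for sl in GROUPS.values():
        d = np.linalg.norm(means[:, sl] - q[sl], axis=1)  # distance to every class mean
        d_max = d.max()
        if d_max > 0:  # skip the group if every distance is zero
            total += d / d_max
    return total

def classify(q, classes, means):
    """Assign q to the class whose mean has the smallest combined distance."""
    return classes[np.argmin(similarity_distances(q, means))]

# Toy training data, made up purely for illustration.
train = [
    [200, 30, 30, 10, 5, 5, 500, 80, 0.90],
    [210, 40, 20, 12, 6, 4, 520, 82, 0.92],
    [220, 210, 40, 15, 14, 6, 900, 160, 0.45],
    [230, 220, 50, 17, 16, 8, 940, 164, 0.47],
]
labels = ["apple", "apple", "banana", "banana"]

classes, means = class_means(train, labels)
query = [205, 36, 26, 11, 5, 5, 505, 81, 0.91]
distances = similarity_distances(query, means)
predicted = classify(query, classes, means)
```

Normalizing each group before summing prevents the shape features, whose raw values are much larger than the color values, from dominating the combined distance.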
Table 2. Test Results for the Same Fruit Images

No  Training images          Test images  True  False  Rejected  Result (%)
1   One sample per class     12           11    1      0         91.67
                             24           23    1      0         95.83
                             36           34    2      0         94.44
                             48           46    2      0         95.83
2   Two samples per class    12           11    1      0         91.67
                             24           23    1      0         95.83
                             36           35    1      0         97.22
                             48           46    1      1         95.83
3   Three samples per class  12           12    0      0         100
                             24           24    0      0         100
                             36           36    0      0         100
                             48           48    0      0         100

Table 3. Test Results for Other Fruit Images

No  Training images          Test images  True  False  Rejected  Result (%)
1   One sample per class     12           12    0      0         100
                             24           24    0      0         100
                             36           36    0      0         100
                             48           46    1      1         95.83
2   Two samples per class    12           12    0      0         100
                             24           24    0      0         100
                             36           36    0      0         100
                             48           45    2      0         95.83
3   Three samples per class  12           12    0      0         100
                             24           24    0      0         100
                             36           36    0      0         100
                             48           48    0      0         100

4. Conclusion

Classification of fruits using the proposed multiple nearest mean classifier technique has shown that the technique is capable of producing high accuracy with a small sample size. The number of samples in each class influences the system's performance: the system improves as the number of samples increases. With three samples in each class, the system classified 48 fruits with a 100% success rate. The image capture process needs to be taken into account so that the color and shape of the fruit are represented well; this can be achieved by using supplementary lighting and a solid-state image sensor. In practice, fruit surfaces are not always clean and sometimes carry stains and dust whose colors are identical to the background, so the background subtraction result is not perfect. Further algorithm design and image processing techniques are needed to address these weaknesses. Hardware for the fruit classification system is also needed so that fruit sorting and grading can be carried out by a machine or robotic system.

References

[1] H. A. Hambali, S. L. S. Abdullah, N. Jamil, and H. Harun, “Fruit classification using neural network model,” J. Telecommun. Electron. Comput. Eng., vol. 9, no. 1–2, pp. 43–46, 2017.
[2] O. Sagi and L. Rokach, “Ensemble learning: A survey,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 8, no. 4, 2018, doi: 10.1002/widm.1249.
[3] M. Astani, M. Hasheminejad, and M. Vaghefi, “A diverse ensemble classifier for tomato disease recognition,” Comput. Electron. Agric., vol. 198, p. 107054, 2022, doi: 10.1016/j.compag.2022.107054.
[4] J. Kang and J. Gwak, “Ensemble of multi-task deep convolutional neural networks using transfer learning for fruit freshness classification,” Multimed. Tools Appl., vol. 81, Jul. 2022, doi: 10.1007/s11042-021-11282-4.
[5] H. S. Gill, O. I. Khalaf, Y. Alotaibi, S. Alghamdi, and F. Alassery, “Fruit Image Classification Using Deep Learning,” Comput. Mater. Contin., vol. 71, no. 2, pp. 5135–5150, 2022, doi: 10.32604/cmc.2022.022809.
[6] C. C. Ukwuoma, Q. Zhiguang, M. B. Bin Heyat, L. Ali, Z. Almaspoor, and H. N. Monday, “Recent Advancements in Fruit Detection and Classification Using Deep Learning Techniques,” Math. Probl. Eng., vol. 2022, p. 9210947, 2022, doi: 10.1155/2022/9210947.
[7] P. Sumari et al., “A Precision Agricultural Application: Manggis Fruit Classification Using Hybrid Deep Learning,” Rev. d’Intelligence Artif., vol. 35, no. 5, pp. 375–381, 2021, doi: 10.18280/ria.350503.
[8] Z. Mai, R. Li, H. Kim, and S. Sanner, “Supervised Contrastive Replay: Revisiting the Nearest Class Mean Classifier in Online Class-Incremental Continual Learning,” CoRR, vol. abs/2103.13885, 2021, [Online]. Available: https://arxiv.org/abs/2103.13885.
[9] E. Santucci, “Quantum Minimum Distance Classifier,” Entropy, vol. 19, no. 659, pp. 1–14, 2017, doi: 10.3390/e19120659.
[10] P. C. Sen, M. Hajra, and M. Ghosh, “Supervised Classification Algorithms in Machine Learning: A Survey and Review,” in Emerging Technology in Modelling and Graphics, 2020, pp. 99–111.
[11] K. D. Copsey, Statistical Pattern Recognition. Wiley, 2011.
[12] I. H. Sarker, “Machine Learning: Algorithms, Real-World Applications and Research Directions,” SN Comput. Sci., vol. 2, no. 3, pp. 1–21, 2021, doi: 10.1007/s42979-021-00592-x.
[13] X. Yan and L. Gao, “A feature extraction and classification algorithm based on improved sparse auto-encoder for round steel surface defects,” Math. Biosci. Eng., vol. 17, no. 5, pp. 5369–5394, 2020, doi: 10.3934/MBE.2020290.
[14] I. Reppa, K. E. Williams, W. J. Greville, and J. Saunders, “The relative contribution of shape and colour to object memory,” Mem. Cognit., vol. 48, no. 8, pp. 1504–1521, 2020, doi: 10.3758/s13421-020-01058-w.
[15] M. H. Guo et al., “Attention mechanisms in computer vision: A survey,” Comput. Vis. Media, vol. 8, no. 3, pp. 331–368, 2022, doi: 10.1007/s41095-022-0271-y.
[16] M. Han, H. Wu, Z. Chen, M. Li, and X. Zhang, “A survey of multi-label classification based on supervised and semi-supervised learning,” Int. J. Mach. Learn. Cybern., 2022, doi: 10.1007/s13042-022-01658-9.
[17] K. Kendall and J. Kendall, System Analysis and Design, 8th ed. New Jersey: Prentice-Hall, 2020.
[18] V. Meshram, K. Patil, V. Meshram, D. Hanchate, and S. D. Ramkteke, “Machine learning in agriculture domain: A state-of-art survey,” Artif. Intell. Life Sci., vol. 1, no. September, pp. 1–11, 2021, doi: 10.1016/j.ailsci.2021.100010.
[19] D. N. Arulnathan, B. C. W. Koay, W. K. Lai, T. K. Ong, and L. L. Lim, “Background Subtraction for Accurate Palm Oil Fruitlet Ripeness Detection,” in 2022 IEEE International Conference on Automatic Control and Intelligent Systems (I2CACIS), 2022, pp. 48–53, doi: 10.1109/I2CACIS54679.2022.9815275.
[20] N. Phuangsaijai, J. Jakmunee, and S. Kittiwachana, “Investigation into the predictive performance of colorimetric sensor strips using RGB, CMYK, HSV, and CIELAB coupled with various data preprocessing methods: a case study on an analysis of water quality parameters,” J. Anal. Sci. Technol., vol. 12, no. 1, 2021, doi: 10.1186/s40543-021-00271-9.
[21] T. Gevers and A. Smeulders, “Foreword,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 9909 LNCS, p. V, 2016, doi: 10.1007/978-3-319-46493-0.
[22] N. Ali, D. Neagu, and P. Trundle, “Evaluation of k-nearest neighbour classifier performance for heterogeneous data sets,” SN Appl. Sci., vol. 1, no. 12, pp. 1–15, 2019, doi: 10.1007/s42452-019-1356-9.
[23] K. Taunk, S. De, S. Verma, and A. Swetapadma, “A Brief Review of Nearest Neighbor Algorithm for Learning and Classification,” in 2019 International Conference on Intelligent Computing and Control Systems (ICCS), 2019, pp. 1255–1260, doi: 10.1109/ICCS45141.2019.9065747.
[24] S. Johri et al., “Nearest centroid classification on a trapped ion quantum computer,” npj Quantum Inf., vol. 7, no. 1, 2021, doi: 10.1038/s41534-021-00456-5.
[25] Y. A. Malkov, “Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs,” IEEE Trans. Pattern Anal. Mach. Intell., pp. 31–33, 2018, [Online]. Available: https://github.com/nmslib/nmslib, http://ann-benchmarks.com/hnsw(nmslib).html.
[26] R. C. Chen, C. Dewi, S. W. Huang, and R. E. Caraka, “Selecting critical features for data classification based on machine learning methods,” J. Big Data, vol. 7, no. 52, 2020, doi: 10.1186/s40537-020-00327-4.
[27] J. Lv and J. Fang, “A Color Distance Model Based on Visual Recognition,” Math. Probl. Eng., vol. 2018, 2018, doi: 10.1155/2018/4652526.
[28] R. Manisha, “Content Based Image Retrieval using Color and Texture Feature with Distance Matrices,” Int. J. Sci. Res. Publ., vol. 7, no. 8, pp. 512–523, 2012, doi: 10.5121/sipij.2012.3104.