
Pattern Recognition Letters 136 (2020) 142–153

Contents lists available at ScienceDirect

Pattern Recognition Letters


journal homepage: www.elsevier.com/locate/patrec

A computer vision system for automatic cherry beans detection on coffee trees

Jhonn Pablo Rodríguez a,∗, David Camilo Corrales a,b, Jean-Noël Aubertot b, Juan Carlos Corrales a

a Department of Telematics Engineering, Engineering Telematics Group, University of Cauca, Popayán 190002, Colombia
b INRAE, University of Toulouse, UMR 1248 AGIR, Centre de recherche Occitanie-Toulouse, Auzeville, France

∗ Corresponding author. E-mail addresses: [email protected] (J.P. Rodríguez), [email protected] (D.C. Corrales), [email protected] (J.-N. Aubertot), [email protected] (J.C. Corrales).

Article history: Received 1 November 2019; Revised 27 May 2020; Accepted 28 May 2020; Available online 31 May 2020.

Keywords: Coffee production; Caturra; Bourbon; Castillo; Noise reduction; Segmentation; Morphological transformations.

Abstract

Coffee production estimation is an essential task for coffee farmers in terms of money investment and planning time. In Colombia, the traditional methodology to estimate the total amount of cherry coffee beans relies on direct measurements in the field that remove the sampled cherry beans from the coffee production (destructive sampling). The cherry coffee dropped in this process cannot be harvested by the producer. We found several shortcomings in this methodology: counting errors in the sampling process, insufficient coffee bean samples, significant expenses of cost and time, and coffee bean losses. To handle these issues, we propose a classic Computer Vision (CV) approach to detect cherry beans on coffee trees. This approach substitutes the destructive counting method as a first step to estimate coffee production. To evaluate the proposed CV system, seven coffee farmers counted the number of cherry beans on 600 images of coffee trees (castillo, bourbon, and caturra varieties) by human visual perception (ground truth). From the evaluations of the coffee farmers, we computed statistical measures such as precision, recall, and F1-score. The CV system achieved the best results for bourbon coffee trees, with a precision of 0.594 and 0.669 of the total relevant cherry beans correctly classified.

© 2020 Elsevier B.V. All rights reserved.

1. Introduction

Coffee is an essential product for rural communities in Colombia, according to statistics from the Federación Nacional de Cafeteros (FNC, acronym in Spanish) of 2018 [18]. In Colombia, 931,746 coffee hectares are sown by 555,692 coffee families as the primary source of income. Ninety-six percent of the coffee-farming population produces coffee on 1.3 hectares on average. Coffee is the product with the most significant impact on the Colombian rural sector [17], generating nearly 730 thousand jobs and representing around 25% of agricultural employment in the country. During 2018, coffee production amounted to 13.57 million bags (60 kg sacks).

Coffee production estimation is useful for scheduling coffee crop management tasks and serves as a support tool for decision making: for instance, determining the number of coffee workers for crop harvesting and calculating the coffee value chain supplies until grain commercialization. Likewise, from the estimated coffee production, the coffee farmer can establish an estimated price for his coffee production and calculate production losses, among other activities.

There have been remarkable advances in the development of methodologies to predict grain production from a national perspective. The first approaches to the estimation of coffee production were based on supply functions that depend on the harvested area and historical production levels. Later, these functions added variables such as the yield of plants by age, prices paid to the producer, and the effects of applied technology. Currently, the estimation of coffee production implies a sampling methodology in coffee plantations.

According to [35], the FNC currently estimates the national coffee production through direct measurements in the field that remove the sampled cherry beans from the harvest (destructive sampling). Each sample consists of 60 coffee trees per hectare, in an area of 2000 hectares. Although the FNC methodology can compute an estimation of coffee production, this approach has four shortcomings:

1. Counting error in the sampling process: this task involves human interaction in the bean-counting process. This fact can reduce the reliability of the sample.


2. Insufficient coffee bean samples: coffee plots contain around 2000-10,000 coffee trees. However, coffee farmers collect only a small sample per plot because they follow the steps of the FNC methodology. In this sense, the samples do not represent the entire population of the crop.
3. Significant expenses of cost and time: the activities proposed by the FNC methodology require investment to hire coffee farmers for bean-counting. Besides, the sampling task takes time because it is a manual operation.
4. Coffee bean losses: the collection process involves extracting the cherry coffee beans from the trees, affecting the coffee yield on crops, hence the name destructive process.

To handle the issues presented by the FNC methodology, we propose a first approximation of a CV system to detect cherry beans on coffee trees and facilitate coffee production estimation. The remainder of this paper is organized as follows: Section 2 presents the basis and related works in agriculture CV; Section 3, the CV approach to detect cherry beans; Section 4, the results; Section 5, a comparative study; and Section 6, conclusions and future works.

2. Background

In this section, we explain the computer vision basis and related works that involve CV approaches in agriculture.

2.1. Computer vision

The principle of computer vision consists of understanding images and videos; in other words, CV emulates the task of the human visual system [33]. Computer vision has been widely used in the field of agricultural automation and plays a crucial role in its development. CV applications in agriculture involve the monitoring of the healthy growth of crops, the detection of crop diseases and insect pests, the classification and quality inspection of agricultural products, and the automated management of modern farms [41]. A CV system is composed of the following stages (Fig. 1).

Fig. 1. Description of main stages of a classic computer vision system.

2.1.1. Image acquisition
In this step, the instruments to collect the images are selected. In agriculture, different instruments are used: cameras, ultrasound, magnetic resonance imaging (MRI), electrical tomography, and computed tomography (CT). In the analysis of fruits and vegetables, the lighting systems are structured as front and backlighting, or the image is simply captured in the environment; however, this makes it more difficult to individualize the objects clearly [7].

2.1.2. Image pre-processing
Frequently, images contain different kinds of noise that deteriorate the image quality. To solve the noise problem, image pre-processing enhances the collected image, removing unwanted distortions and enhancing degraded features. In this step, several filters are used, such as smoothing filters (to reduce noise), median filters (to reduce peak noise), and modified unsharp filters (to identify borders). Image pre-processing also separates the regions of interest from the other areas to extract features [33].

2.1.3. Segmentation
The segmentation step separates an image into different areas [7]. Its primary function is to identify the background, which is subsequently separated from the remaining image areas. Segmentation classifies each pixel of the image into two classes (i.e., interest area and background area): pixels with a particular gray level belong to the interest class, while the remaining pixels belong to the background class.

2.1.4. Morphological transformation
Morphological transformation involves operations based on the size and shape of objects in the image [7]. Erosion, dilation, opening, and closing are the basis of morphological transformation. In agriculture, these methods are used for crop disease detection [15].

2.1.5. Features extraction
In this step, image features are extracted from the outputs of the previous steps. The main idea of feature extraction is to map a high-dimensional space to a low-dimensional space. In crop disease recognition, basic features such as color, texture, and shape have been widely extracted [15]. Besides, color, texture, and morphological features are frequently used to analyze the defects and maturity of fruits and vegetables [33].

Depending on the CV tasks and regions of interest, the previous CV steps are repeated (Fig. 1).

2.2. Related works

Image processing has been used in several agriculture fields, such as land identification [16], estimation of plant nitrogen content [40], recognition of areas infected by animal pests [27], detection of plant disease from shape, texture, and color [32], and notably in the classification of beans, grains, and fruits.

For instance, the work developed in [25] proposed a computer vision system for the quality inspection of beans (cereals, legumes, fruits, and vegetables). From 14 images, 511 bean samples were extracted.

Sixty-nine bean samples were used for training an Artificial Neural Network (ANN), 71 for validation, and 371 for testing. The ANN correctly classified 99.3% of white beans, 93.3% of yellow-green damaged beans, 69.1% of black damaged beans, 74.5% of low-damaged beans, and 93.8% of highly damaged beans. The overall correct classification rate obtained was 90.6%.

The authors of [38] categorized soybean plant foliar diseases from five descriptors: histogram, Wavelet Decomposed Color Histogram (WDH), Border/Interior Classification (BIC), Color Coherence Vector (CCV), and Color Difference Histogram (CDH). The descriptors were evaluated on 347 images generated by Gaussian filtering of the original images to test the blur insensitivity of the different methods. For the classification, three models were analyzed: Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Probabilistic Neural Network (PNN). Among the classification methods utilized, KNN and SVM showed the best performance.

An algorithm was proposed to detect injuries in the stems and skin of citrus fruits [8].

They used an approach based on segmentation on 10,660 images, including 2132 oranges and mandarins with skin injuries. The proposed algorithm detects injuries with 95.00% precision.

A CV system to analyze tomato injuries and ripeness is proposed in [3]. The dataset contains 520 tomato images collected from different gardens of the University of Agricultural Sciences, Bangalore, and consists of ripe, unripe, injured, and healthy tomatoes. The proposed CV system reached 96.47% precision.

In [4], researchers present an image processing framework for fruit detection on orchard images. The pixel-wise fruit segmentation output is processed using the Watershed Segmentation (WS) and Circular Hough Transform (CHT) algorithms to detect and count individual fruits. Experiments were conducted in a commercial apple orchard near Melbourne, Australia. The WS algorithm produced the best apple detection and counting results, with a detection F1-score of 0.858.

The authors of [6] propose a weakly-supervised framework to count three varieties of fruits: olives, almonds, and apples. They proposed a convolutional neural network architecture that aims to detect whether the image contains instances of the fruits or not and combines this information with image spatial consistency constraints.

The authors of [34] present a simulated deep convolutional neural network for yield estimation from the number of fruits, flowers, and trees. The authors train the network with synthetic data and test it with real data from tomato crop images. Results show a 91% average test precision on real images and 93% on synthetic images.

In [11], the authors propose a fruit (orange and green apple) counting approach based on a blob detector and a convolutional network. First, candidate regions are extracted from the images, and second, the convolutional network counts the number of orange and green apple fruits in each region.

A state-of-the-art object detection framework is presented in [5]. They built a Faster Region-based Convolutional Neural Network (Faster R-CNN) to detect orchard fruits (mangoes, almonds, and apples). The results of this work show an F1-score greater than 0.9 for apples and mangoes.

The approach exposed in [21] shows a comparative study of multiple apple detection and counting methods. The images of apple crops used for that paper were collected at the University of Minnesota Horticultural Research Center (HRC) in Eden Prairie, Minnesota. The results indicated that the semi-supervised method based on Gaussian Mixture Models outperforms the deep learning-based methods in the majority of the data sets.

A CV mechanism for detecting and counting immature green citrus fruit is proposed in [20]. The researchers used a fast normalized cross-correlation (FNCC) method. The dataset used in this work is composed of 118 images from an experimental citrus grove at the University of Florida, Gainesville, USA.

The authors used a validation data set of 59 images; 84.4% of the fruits were successfully detected.

The authors of [39] propose a multi-sensor framework to identify, track, localize, and map every piece of fruit in a commercial mango orchard. Five hundred twenty-two trees and 71,609 fruits were scanned in a Calypso mango orchard near Bundaberg, Queensland, Australia. The researchers manually counted the mangoes of sixteen trees to validate the proposed method (before and after harvest). The multi-view approach achieved an error rate of less than 1.36% per tree.

The authors of [36] adapted a last-generation neural network model, a CNN (Faster R-CNN), for fruit detection. One hundred tree images were captured with multispectral cameras during day and night.

In coffee crops, different works classify coffee beans into four groups: whitish, cane green, green, and bluish-green. The CV classification system proposed in [14] classifies color variations in green coffee beans. The dataset contains green beans of commercial Coffea arabica, also known as Arabian coffee, harvested in 2013 and provided by coffee growers from Minas Gerais State, Brazil. They selected 120 samples of 50 g (30 per color). The neural network models reached a generalization error of 1.15%, and the Bayesian classifier achieved 100% precision.

A mobile application was developed in [31] to process and store records of coffee tree branches. The app takes videos or photo sequences of coffee tree branches (cherry beans) through inertial sensors. This approach was tested on an experimental coffee plot with 16 videos (of good and low quality). They evaluate and implement different algorithms (Laplacian, Sobel, and Canny) for quality detection before the storage and synchronization of each record.

Following the methodology proposed by the FNC, the authors in [35] developed a strategy based on image analysis, considering only images of coffee tree branches to detect the maturity status of coffee beans. For image segmentation, they used the Food-Color Inspector application. The strategy obtained an average error of 5.5 percent.

Although previous works proposed CV mechanisms to detect coffee beans, they are not focused on detecting the beans on coffee trees. The work of [14] detected beans on images after harvest, while [31,35] used images of coffee tree branches. In this context, we propose a first approximation of a CV system to detect cherry beans on coffee trees.

3. Material and methods

This section explains the approach proposed for detecting and quantifying cherry coffee beans in crops, as shown in Fig. 1. The computer vision approach was implemented with the OpenCV 3.2.0 library [9]. The steps of our CV approach are described below.

3.1. Image acquisition

The images used in this work were collected from 4 plots (plots with IDs 3, 4, 5, and 12 in Fig. 2) at the Experimental Farm "Los Naranjos" by Tecnicafé, in Cajibío, Cauca, Colombia (21 35 08"N, 76 32 53"W). Plot 3 (ds_castillo_3) is composed of the Castillo variety, while plots 4 (ds_bourbon_4) and 5 (ds_bourbon_5) contained Bourbon coffee trees, and plot 12 (ds_caturra_12) Caturra coffee trees. Six hundred coffee tree images were captured (150 photos per plot). Photos were collected during the harvest period of April 2018, from 14:00 to 17:00. To use phones similar to those of the coffee farmers, we captured the images with a mid-range phone, a Samsung J5 (13-megapixel resolution, f/1.9 aperture, 24-bit color, JPG format, 2322×4128 pixels). The photos were captured approximately 2 meters away from the coffee tree, at a height of 1 meter from the ground. The angle used to capture the entire tree and its most significant amount of cherry beans was 0 degrees. Besides, we only considered the cherry coffee beans visible in the captured image (i.e., we did not take into account cherry beans behind leaves or branches). Concerning the weather conditions, the images were captured on sunny days, avoiding wind, to obtain good quality images. The capturing process of the 600 images was done by one person in 3 hours.

Fig. 2. Plot distribution of the Naranjos farm [12]. The images were collected from plots 3, 4, 5 and 12.

3.2. Pre-processing

This phase aims to detect the pixels with cherry coffee beans. Initially, the original image is converted to the YCrCb image space to highlight the color of the cherry coffee beans from the remainder of the objects contained in the image, as shown in Fig. 3(a) and Fig. 3(b).

Fig. 3. (a) Original image captured by smartphone. (b) Image 3(a) converted to YCrCb space.

Usually, images are in the RGB space, which is a mathematical representation of a set of colors: each image can be represented by different combinations of red, green, and blue, and this is called the RGB color space [23]. The original image was converted to different image spaces such as XYZ, Lab, HSV, HLS, and Luv. However, YCrCb is the image space that best highlights the cherry coffee beans from the remainder of the image objects. The YCrCb space encodes a color image similarly to the human visual system, which allows reducing the bandwidth of the chrominance components. Besides, the YCrCb space enables transmission errors or compression artifacts to be more efficiently masked by human perception than the RGB color space. Y represents luminance, and Cr and Cb represent the red-difference and blue-difference chrominance components [23].

Subsequently, the threshold method [19] was applied to the image in order to separate the cherry coffee bean pixels from the background. Results are depicted in Fig. 4.

Fig. 4. YCrCb space converted to threshold space.
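As a concrete illustration of this pre-processing step, the sketch below applies the YCrCb conversion and thresholding with the OpenCV Python bindings; the input file name, the choice of the Cr channel, and the use of Otsu's method [19] to select the threshold value are illustrative assumptions rather than the exact settings of our implementation.

    import cv2

    img = cv2.imread("coffee_tree.jpg")  # hypothetical input image

    # Convert from BGR (OpenCV default) to YCrCb so that the red-difference
    # chrominance (Cr) of the cherry beans stands out from the foliage.
    ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)
    _, cr, _ = cv2.split(ycrcb)

    # Separate cherry-bean pixels from the background; Otsu's method
    # selects the threshold value automatically.
    _, bean_mask = cv2.threshold(cr, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    cv2.imwrite("threshold_space.png", bean_mask)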
3.3. Noise reduction

In this phase, pixels without cherry bean information are deleted. To achieve this task, first, the image is converted to grayscale and, second, a nonlinear median filter (the medianBlur function in the OpenCV library) is applied to blur the objects contained in the image [30]; in this way, the regions with cherry coffee beans are observed in detail. The results are shown in Fig. 5.

Fig. 5. Threshold space converted to medianBlur space.
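A minimal sketch of this phase is shown below, again assuming the OpenCV Python bindings; the 5×5 kernel size of the median filter is an illustrative assumption.

    import cv2

    img = cv2.imread("coffee_tree.jpg")            # hypothetical input image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # grayscale conversion

    # The median filter removes isolated noisy pixels while preserving the
    # edges of the regions that contain cherry coffee beans.
    blurred = cv2.medianBlur(gray, 5)              # kernel size 5 is an assumption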
3.4. Segmentation

Once the regions of interest are identified in the previous phase, we detect the edges of the coffee tree branches using the canny method. The canny method [10] determines the first partial derivatives with respect to X and Y; based on these values, the magnitude and direction of the best edge are found [13]. Fig. 6 shows the result of applying the canny method. The canny method is considered highly efficient because it can be applied to various types of images without a decrease in performance in the presence of noise in the original image [42].

Fig. 6. The canny method applied to medianBlur space.

In order to show the transformation of the image, Fig. 8 shows the processing carried out from the original image of the coffee plant until the extraction of the regions of interest.

Fig. 7. Image transformation process. From original image to canny method application.

Fig. 8. Extraction of coffee tree branches through contour function. Tree branches are saved as independent images.
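The edge detection of this phase can be sketched as follows; the hysteresis thresholds passed to Canny are illustrative assumptions, not the values tuned for our images.

    import cv2

    gray = cv2.imread("coffee_tree.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input
    blurred = cv2.medianBlur(gray, 5)

    # Canny edge detector [10]: gradients are computed in X and Y, and the
    # magnitude/direction of the best edge is kept between both thresholds.
    edges = cv2.Canny(blurred, 50, 150)
    cv2.imwrite("canny_space.png", edges)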

3.5. Features extraction

Once the edges have been detected by the canny method, the contours are found with the findContours function. The contours found correspond to tree branches with cherry coffee beans. Later, each contour is saved as an image, which is considered a sample (see Fig. 8).
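A sketch of the branch extraction is given below, assuming the OpenCV 3.x Python API (where findContours returns three values); the file names and the external-contour retrieval mode are assumptions for the example.

    import cv2

    img = cv2.imread("coffee_tree.jpg")                          # hypothetical input
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(cv2.medianBlur(gray, 5), 50, 150)

    # OpenCV 3.x returns (image, contours, hierarchy).
    _, contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                      cv2.CHAIN_APPROX_SIMPLE)

    # Each contour approximates a branch; crop and save it as a sample.
    for i, contour in enumerate(contours):
        x, y, w, h = cv2.boundingRect(contour)
        cv2.imwrite("sample_%d.png" % i, img[y:y + h, x:x + w])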
Fig. 9 shows samples corresponding to branches with cherry coffee beans from the original image.

Fig. 9. Coffee tree branches without image background.

In order to count the number of cherry coffee beans (depicted in Fig. 1), for each sample we defined four steps: (i) FE: image pre-processing, (ii) FE: segmentation of regions, (iii) FE: morphological transformations, and (iv) FE: features extraction of the regions. These steps are explained below.

3.5.1. FE: image pre-processing

The samples were converted to grayscale, and their histograms were equalized (the equalizeHist function in the OpenCV library) to improve the contrast of the cherry beans. The grayscale image conversion approach is the linear projection described in [43]:

I = αr R + αg G + αb B    (1)

where the coefficients αr, αg, αb are non-negative. The histogram equalization normalizes the brightness and increases the contrast of the image [43]. Afterwards, a nonlinear median filter (the medianBlur function in the OpenCV library) is applied [30]. In Fig. 10(a) and Fig. 10(b), we present three samples related to the FE: image pre-processing step.

Fig. 10. (a) Coffee tree branches extracted from the original image. (b) Application of noise reduction filters to coffee tree branches.
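The per-sample pre-processing reduces to three OpenCV calls; the sample file name and the median kernel size are assumptions of this sketch.

    import cv2

    sample = cv2.imread("sample_0.png")               # one branch sample (hypothetical file)
    gray = cv2.cvtColor(sample, cv2.COLOR_BGR2GRAY)   # grayscale projection of Eq. (1)
    equalized = cv2.equalizeHist(gray)                # normalize brightness, raise contrast
    smoothed = cv2.medianBlur(equalized, 5)           # kernel size 5 is an assumption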

3.5.2. FE: segmentation of regions

In this step, we used a threshold filter (the adaptiveThreshold function in the OpenCV library) to convert a sample with different levels of gray into a black-and-white scale, in order to highlight the cherry bean centers in white, as seen in Fig. 11.

Fig. 11. Application of threshold filter to images with noise reduction filters.
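A sketch of this step with the OpenCV Python bindings is shown below; the block size (11), the constant C (2), and the mean adaptive method are illustrative assumptions.

    import cv2

    # Re-create the pre-processed gray sample so the sketch runs on its own.
    gray = cv2.cvtColor(cv2.imread("sample_0.png"), cv2.COLOR_BGR2GRAY)
    smoothed = cv2.medianBlur(cv2.equalizeHist(gray), 5)

    # Adaptive threshold: gray levels become black and white, and the bright
    # bean centers appear in white.
    binary = cv2.adaptiveThreshold(smoothed, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 11, 2)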

3.5.3. FE: morphological transformations (erosion and dilation)

On the results obtained in the previous step, an erosion filter [22] was applied to remove the noise generated in the segmentation task. Afterward, a dilation filter [22] was used to expand the cherry bean regions. Fig. 12 shows this process.

Fig. 12. Application of erosion and dilation methods to images with threshold filter.
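Erosion and dilation are available directly in OpenCV; the 3×3 structuring element and the iteration counts below are assumptions for illustration.

    import cv2
    import numpy as np

    binary = cv2.imread("sample_0_threshold.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

    kernel = np.ones((3, 3), np.uint8)                   # structuring element (assumption)
    eroded = cv2.erode(binary, kernel, iterations=1)     # remove segmentation noise [22]
    dilated = cv2.dilate(eroded, kernel, iterations=2)   # expand the cherry bean regions [22]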

3.5.4. FE: features extraction of the regions

Finally, we detected the contours through the OpenCV function findContours. Each edge found is an approximation of a cherry bean. Once the contours have been detected, they are counted to express the number of cherry beans. Fig. 13 depicts the contours found on each sample, enclosed by rectangles using the crop and boundingRect functions of the OpenCV library.

Fig. 13. Extraction of interest regions through crop and boundingRect, functions of OpenCV library.
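The counting itself can be sketched as follows, again under the OpenCV 3.x Python API; the input file name is hypothetical, and no filtering of very small contours is shown.

    import cv2

    mask = cv2.imread("sample_0_morph.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

    # Each remaining external contour approximates one cherry bean.
    _, contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                      cv2.CHAIN_APPROX_SIMPLE)
    print("estimated cherry beans in sample:", len(contours))

    # Enclose every detected bean with a rectangle, as in Fig. 13.
    annotated = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        cv2.rectangle(annotated, (x, y), (x + w, y + h), (0, 255, 0), 1)
    cv2.imwrite("sample_0_beans.png", annotated)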

4. Results

The approach used for the detection and quantification of cherry coffee beans was tested with the 600 images collected at the Naranjos farm (Section 3.1). We built the ground truth [26] from the coffee collectors and farmers described in Table 1. They counted the number of cherry coffee beans on each image by human visual perception. The collectors have more experience than the producers in the bean collection process: collectors are directly focused on coffee growing, while producers are centered on coffee farm management. The coffee farmers counted the beans of the 600 tree images over approximately two months. We planned eight bean-counting sessions with the coffee farmers; during each session, the farmers counted coffee beans from 75 images chosen randomly.

Table 1
Coffee farmers profiles: job and years of experience.

Coffee Farmer   Job               Experience
1               Collector         7 years
2               Collector         8 years
3               Coffee Producer   5 years
4               Collector         1 year
5               Coffee Taster     3 years
6               Collector         5 years
7               Coffee Taster     4 years

We use three metrics for evaluation purposes: precision, recall, and F1-score. These metrics are obtained using the confusion matrix (depicted in Table 2) from the true positive (TP), false positive (FP), false negative (FN), and true negative (TN) rates. The TP are cherry coffee beans that were detected both by the coffee farmers and by the computer vision approach. The FP correspond to cherry coffee beans identified by the computer vision approach but not by the coffee farmers. The FN are cherry coffee beans that were not detected by the CV approach but are cherry coffee beans.

Table 2
Representation of the confusion matrix.

            Positives   Negatives
Positives   TP          FP
Negatives   FN          TN

Concerning the statistical metrics, precision measures the percentage of relevant coffee beans obtained by the CV system [37], formally expressed by Eq. (2):

P = TP / (TP + FP)    (2)

Recall refers to the percentage of the total relevant coffee beans correctly classified by the CV system [37], given by Eq. (3):

Recall = TP / (TP + FN)    (3)

The F1-score is the harmonic mean of precision and recall; it reaches its best value at 1 (perfect precision and recall) and its worst at 0 [21]. Eq. (4) shows the F1-score measure:

F1 = 2 * (P * Recall) / (P + Recall)    (4)
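The three measures are straightforward to compute once the TP, FP, and FN counts have been accumulated over the test images; the helper below is a direct transcription of Eqs. (2)-(4).

    def evaluation_metrics(tp, fp, fn):
        """Precision, recall and F1-score from the confusion matrix counts (Eqs. (2)-(4))."""
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * (precision * recall) / (precision + recall)
        return precision, recall, f1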

In Table 3, the results of the evaluation metrics for each of the datasets are presented.

Table 3
Results of statistical criteria (precision, recall, F1-score) to evaluate the proposed CV system on three coffee varieties: castillo, bourbon, and caturra.

Dataset         Precision   Recall   F1-score
ds_castillo_3   0.523       0.706    0.600
ds_bourbon_4    0.594       0.699    0.642
ds_bourbon_5    0.590       0.696    0.638
ds_caturra_12   0.425       0.697    0.528

Regarding precision, the results obtained for all coffee varieties are greater than 0.42; the CV system achieved the highest precision on bourbon coffee trees (in Table 3, ds_bourbon_4: 0.594 and ds_bourbon_5: 0.590). The foliage of this kind of coffee tree is sparser than that of the other coffee varieties analyzed [2]; in this sense, the CV system detects the cherry coffee beans more easily in the bourbon variety.

The lowest precision was reached by the caturra variety coffee trees (Plot 12). This variety is characterized by low bearing, a main axis with little branching, abundant secondary branches, and short internodes. Its leaves are large, full, and of a slightly rough texture, with a wavy edge; new leaves or buds are green and show vegetative vigor [2]. Therefore, the cherry coffee beans are barely visible in the captured images.

The CV system retrieved the highest quantity of relevant coffee beans for the castillo variety (recall of 0.706). In bourbon and caturra trees, the recall of relevant cherry beans retrieved by the CV system is higher than 0.69.

As we explained above, the F1-score shows the balance between the precision and recall measures. The CV system achieved the highest F1-score values for coffee trees of the bourbon variety (0.642 and 0.638 for plots 4 and 5, respectively). In other words, our CV approach correctly classifies and retrieves relevant cherry beans in bourbon trees compared with the castillo and caturra varieties.

5. Comparative study

In order to compare the performance of the proposed CV system against popular CV approaches, we tested a Convolutional Neural Network (CNN) with the same dataset explained in Section 3.1.

A CNN comprises several convolutional and pooling layers, similar to the human visual system. The input layer produces a vector of different features correlated with the object classes through the output layer. Series of convolution and pooling layers represent the hidden layers, followed by fully connected layers [6]. The training process is performed in forward and backward stages based on the prediction output and the labeled ground truth [29]. In the backpropagation stage, the gradients of each parameter are computed based on the loss cost. The parameters are updated based on the gradients and are subsequently used in the next forward computation. The training process can be stopped after a determined number of iterations of the forward and backward stages.

To train the CNN, we labeled the cherry coffee beans of the dataset (Section 3.1) with LabelImg [28]. This image annotation tool allows labelling the bounding boxes of the cherry coffee beans, as shown in Fig. 14.

Fig. 14. Example of image labeled by the LabelImg tool. Green boxes correspond to the labeled cherry beans.

We built a Faster Region-based Convolutional Neural Network (Faster R-CNN) [24] developed in TensorFlow [1]. The Faster R-CNN was trained with 400 images on a cloud server with the following technical specifications: 12-core Intel Xeon, 32 GB RAM. Two hundred images were used as the test set.

5.1. Evaluation of computer vision approaches

Table 4 shows the results obtained by the proposed cherry coffee beans detection system (CBD) and the CNN on a test set of 200 images. CBD falls behind the CNN in precision on the castillo and caturra crops (by 0.172 and 0.226, respectively) due to the complexity of detecting cherry coffee beans in coffee varieties with abundant foliage. In contrast, for coffee varieties with low foliage such as bourbon, the precision difference between the CNN and CBD decreases (plots 4 and 5, precision differences of 0.13 and 0.113, respectively).

Table 4
Comparative study of CV approaches: the proposed cherry coffee beans detection system (CBD) and the Convolutional Neural Network (CNN). Three statistical criteria (precision, recall, F1-score) were used on three coffee varieties: castillo, bourbon, and caturra.

                Precision       Recall          F1-score
Dataset         CBD     CNN     CBD     CNN     CBD     CNN
ds_castillo_3   0.523   0.695   0.706   0.621   0.600   0.656
ds_bourbon_4    0.594   0.724   0.699   0.545   0.642   0.622
ds_bourbon_5    0.590   0.703   0.696   0.757   0.638   0.729
ds_caturra_12   0.425   0.651   0.697   0.905   0.528   0.757

In the castillo coffee crop, CBD correctly classified 70.6% of the cherry beans out of the total number of cases, compared with 62.1% of the beans classified by the CNN (in Table 4, the recall values are normalized as 0.706 and 0.621, respectively). In contrast, in the caturra coffee crop, the CNN reached a recall value (0.905) higher than CBD (0.697). From the results of the F1-score (interpreted as a weighted average of precision and recall) shown in Table 4, the CNN achieved values closer to 1 than CBD, although none of the F1-score values is above 0.76.

5.2. Training time of the CNN

In this section, we present the training time of the Convolutional Neural Network. Regarding CBD, our approach does not use a training phase, because the cherry bean count is implemented through the color contrast of the cherry coffee beans with respect to immature beans, the leaves of the coffee trees, and the whole image background.

The use of the CNN improved the results obtained by CBD (Section 5.1); however, the CNN requires high computational resources in the model training stage [24]. This is due to the large number of layers identifying the features correlated with cherry beans and updating the parameters in the forward and backward stages [34]. In Fig. 15, the learning time of the CNN is presented for the training set of 400 images.

Fig. 15. Training time of the Convolutional Neural Network. Training set of 400 images.

The learning time observed in the CNN experiments grows steeply and is roughly proportional to the number of images: the time/training proportion is 100:100, i.e., the Convolutional Neural Network needs around 100 hours to train on 100 images of coffee trees.

6. Conclusions and future works

The proposed CV system reduces the time needed to estimate coffee production. This reduction is reflected in time savings for the management tasks of the coffee production chain. However, the detection of objects by CV is affected by external factors such as sunlight, camera quality, angle and resolution, and photo distance. In our case, the coffee tree variety influenced the results of our CV system: the CV system detects more FP for coffee trees with high foliage (e.g., the castillo variety).

From the technology viewpoint, our CV approach extracts coffee beans from tree branches, in contrast to the research works presented in Section 2.2. Those works are focused on the quality of the grain, the acquisition of coffee branch images [31], and grain color classification [14]. In particular, in [35], under a semi-controlled environment, the authors classify coffee beans by maturity.

As future work, we propose to increase the image database with the remaining plots of the Naranjos farm; these plots contain other coffee varieties (e.g., Colombia, Maragogype, Tabi, etc.). The inclusion of new coffee varieties in the image database will force us to calibrate our CV approach (due to the shapes of the new coffee trees) and to use other CV approaches such as deep learning. In this sense, the comparative study (Section 5) demonstrated the capabilities of deep learning methods like Convolutional Neural Networks. Deep learning solves classification and detection applications that are complex for classic computer vision algorithms, for instance, the defect variations in applications that require an appreciation of acceptable deviations from the control [4-6,11,20,21,34,36,39].

Besides, we propose to calibrate our CV approach to identify the maturity of the coffee beans. Detecting the maturity level of the beans allows coffee farmers to plan the coffee yield. We also propose to create a time series to predict several stages of coffee production from the information collected on bean maturity level.

Acknowledgements

We thank the Telematics Engineering Group (GIT) of the University of Cauca and Tecnicafé for the technical support. In addition, we are grateful to COLCIENCIAS for the PhD scholarship granted to David Camilo Corrales. This work has also been supported by Innovacción-Cauca (SGR-Colombia) under the project "Alternativas Innovadoras de Agricultura Inteligente para sistemas productivos agrícolas del departamento del Cauca soportado en entornos de IoT, ID 4633 - Convocatoria 04C-2018 Banco de Proyectos Conjuntos UEES - Sostenibilidad".

References

[1] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G.S. Corrado, A. Davis, J. Dean, M. Devin, TensorFlow: large-scale machine learning on heterogeneous systems, 2015. www.tensorflow.org.
[2] Anacafé, Semillas de las variedades de café cultivadas por Anacafé, Asociación Nacional del Café, 2000. http://www.anacafe.org/glifos/index.php?title=Variedades_del_cafe.
[3] M.P. Arakeri, Lakshmana, Computer vision based fruit grading system for quality evaluation of tomato in agriculture industry, Procedia Comput. Sci. 79 (2016) 426–433, doi:10.1016/j.procs.2016.03.055.
[4] S. Bargoti, J. Underwood, Image segmentation for fruit detection and yield estimation in apple orchards, 2016. arXiv:1610.08120 [cs].
[5] S. Bargoti, J. Underwood, Deep fruit detection in orchards, 2017. arXiv:1610.03677 [cs].
[6] E. Bellocchio, T.A. Ciarfuglia, G. Costante, P. Valigi, Weakly supervised fruit counting for yield estimation using spatial consistency, IEEE Robot. Autom. Lett. 4 (3) (2019) 2348–2355, doi:10.1109/LRA.2019.2903260.
[7] A. Bhargava, A. Bansal, Fruits and vegetables quality evaluation using computer vision: a review, J. King Saud Univ. - Comput. Inf. Sci. (2018), doi:10.1016/j.jksuci.2018.06.002.
[8] J. Blasco, N. Aleixos, J. Gómez-Sanchís, E. Moltó, Recognition and classification of external skin damage in citrus fruits using multispectral data and morphological features, Biosyst. Eng. 103 (2) (2009) 137–145, doi:10.1016/j.biosystemseng.2009.03.009.
[9] G. Bradski, A. Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library, 2008.
[10] J.F. Canny, Finding Edges and Lines in Images, MIT Artificial Intelligence Laboratory, 545 Technology Square, Cambridge, 1983.
[11] S.W. Chen, S.S. Shivakumar, S. Dcunha, J. Das, E. Okon, C. Qu, C.J. Taylor, V. Kumar, Counting apples and oranges with deep learning: a data-driven approach, IEEE Robot. Autom. Lett. 2 (2) (2017) 781–788, doi:10.1109/LRA.2017.2651944.
[12] D.C. Corrales, A. Ledezma, A.J.P. Q, J. Hoyos, A. Figueroa, J.C. Corrales, A new dataset for coffee rust detection in Colombian crops base on classifiers, Sist. y Telemática 12 (29) (2014) 9–23, doi:10.18046/syt.v12i29.1802.
[13] A. De la Escalera, Visión por computador: fundamentos y métodos, 2001.
[14] E.M. de Oliveira, D.S. Leme, B.H.G. Barbosa, M.P. Rodarte, R.G.F.A. Pereira, A computer vision system for coffee beans classification based on computational intelligence techniques, J. Food Eng. 171 (2016) 22–27, doi:10.1016/j.jfoodeng.2015.10.009.
[15] Z. Diao, C. Zhao, G. Wu, X. Qiao, Review of application of mathematical morphology in crop disease recognition, in: D. Li, C. Zhao (Eds.), Computer and Computing Technologies in Agriculture II, Volume 2, Springer US, 2009, pp. 981–990, doi:10.1007/978-1-4419-0211-5_23.
[16] B. Erdenee, T. Ryutaro, G. Tana, Particular agricultural land cover classification case study of Tsagaannuur, Mongolia, in: 2010 IEEE International Geoscience and Remote Sensing Symposium, 2010, pp. 3194–3197, doi:10.1109/IGARSS.2010.5649664.
[17] Federación Nacional de Cafeteros, Ensayos sobre Economía Cafetera No. 30, 2015.
[18] Federación Nacional de Cafeteros de Colombia, Informe Estadístico Cafetero, 2019. www.federaciondecafeteros.org.
[19] T.Y. Goh, S.N. Basah, H. Yazid, M.J. Aziz Safar, F.S. Ahmad Saad, Performance analysis of image thresholding: Otsu technique, Measurement 114 (2018) 298–307, doi:10.1016/j.measurement.2017.09.052.
[20] L. Han, L. Suk, W. Ku, Immature green citrus fruit detection and counting based on fast normalized cross correlation (FNCC) using natural outdoor colour images, Precis. Agric. 17 (2016), doi:10.1007/s11119-016-9443-z.
[21] N. Häni, P. Roy, V. Isler, A comparative study of fruit detection and counting methods for yield mapping in apple orchards, J. Field Robot. (2019), doi:10.1002/rob.21902.
[22] L. Ji, J. Piper, J.-Y. Tang, Erosion and dilation of binary images by arbitrary structuring elements using interval coding, Pattern Recognit. Lett. 9 (3) (1989) 201–209, doi:10.1016/0167-8655(89)90055-X.
[23] H. Jiang, H. Li, T. Liu, P. Zhang, J. Lu, A fast method for RGB to YCrCb conversion based on FPGA, in: Proceedings of the 2013 3rd International Conference on Computer Science and Network Technology, 2013, pp. 588–591, doi:10.1109/ICCSNT.2013.6967182.
[24] A. Kamilaris, F.X. Prenafeta-Boldú, A review of the use of convolutional neural networks in agriculture, J. Agric. Sci. (2019), doi:10.1017/S0021859618000436.
[25] K. Kılıç, İ.H. Boyacı, H. Köksel, İ. Küsmenoğlu, A classification system for beans using computer vision system and artificial neural networks, J. Food Eng. 78 (3) (2007) 897–904, doi:10.1016/j.jfoodeng.2005.11.030.
[26] S. Krig, Ground truth data, content, metrics, and analysis, in: S. Krig (Ed.), Computer Vision Metrics: Survey, Taxonomy, and Analysis, Apress, 2014, pp. 283–311, doi:10.1007/978-1-4302-5930-5_7.
[27] M.K. Krishnan, Pest control in agricultural plantations using image processing, 2013.
[28] LabelImg tool, 2015. Repository: https://github.com/tzutalin/labelImg; https://theailearner.com/tag/labelimg/.
[29] T. Liu, S. Fang, Y. Zhao, P. Wang, J. Zhang, Implementation of training convolutional neural networks, 2015. arXiv:1506.01195.
[30] Z. Liu, P. Zhao, Parameters identification for blur image combining motion and defocus blurs using BP neural network, in: 2011 4th International Congress on Image and Signal Processing, 2, 2011, pp. 798–802, doi:10.1109/CISP.2011.6100307.
[31] C.M. Muñoz Pérez, Desarrollo de una aplicación Android que permita procesar y almacenar registros de ramas de café, Tesis de Maestría en Ingeniería Electrónica, Universidad Nacional de Colombia, 2017.
[32] J.K. Patil, R. Kumar, Advances in image processing for detection of plant diseases, J. Adv. Bioinf. Appl. Res. (2011).
[33] D.I. Patrício, R. Rieder, Computer vision and artificial intelligence in precision agriculture for grain crops: a systematic review, Comput. Electron. Agric. 153 (2018) 69–81, doi:10.1016/j.compag.2018.08.001.

[34] M. Rahnemoonfar, C. Sheppard, Deep count: fruit counting based on deep simulated learning, Sensors 17 (4) (2017) 905, doi:10.3390/s17040905.
[35] P. Ramos, F. Prieto, C. Oliveros, N. Aleixos, F. Albert, J. Blasco, Medición del porcentaje de madurez en ramas de café mediante dispositivos móviles y visión por computador, 2015.
[36] I. Sa, Z. Ge, F. Dayoub, B. Upcroft, T. Perez, C. McCool, DeepFruits: a fruit detection system using deep neural networks, Sensors 16 (8) (2016) 1222, doi:10.3390/s16081222.
[37] G. Salton, M.J. McGill, Introduction to Modern Information Retrieval, 1983.
[38] S. Shrivastava, S.K. Singh, D.S. Hooda, Soybean plant foliar disease detection using image retrieval approaches, Multimed. Tools Appl. 76 (24) (2017) 26647–26674, doi:10.1007/s11042-016-4191-7.
[39] M. Stein, S. Bargoti, J. Underwood, Image based mango fruit detection, localisation and yield estimation using multiple view geometry, Sensors 16 (11) (2016) 1915, doi:10.3390/s16111915.
[40] V.K. Tewari, A.K. Arudra, S.P. Kumar, V. Pandey, N.S. Chandel, Estimation of plant nitrogen content using digital image processing, Agric. Eng. Int.: CIGR J. 15 (2) (2013) 78–86.
[41] H. Tian, T. Wang, Y. Liu, X. Qiao, Y. Li, Computer vision technology in agricultural automation—a review, Inf. Process. Agric. (2019), doi:10.1016/j.inpa.2019.09.006.
[42] J. Valverde-Rebaza, Detección de bordes mediante el algoritmo de Canny, Universidad Nacional de Trujillo, 2007. https://www.researchgate.net/publication/267240432_Deteccion_de_bordes_mediante_el_algoritmo_de_Canny.
[43] Y. Wan, Q. Xie, A novel framework for optimal RGB to grayscale image conversion, in: 2016 8th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2, 2016, pp. 345–348, doi:10.1109/IHMSC.2016.201.
