
Journal of Agricultural Engineering 2024; volume LV:1641

YOLO deep learning algorithm for object detection in agriculture: a review

Kamalesh Kanna S.,1 Kumaraperumal R.,1 Pazhanivelan P.,2 Jagadeeswaran R.,1 Prabu P.C.1

1Department of Remote Sensing and Geographic Information System, Tamil Nadu Agricultural University, Coimbatore, India; 2Centre for Water and Geospatial Studies, Tamil Nadu Agricultural University, Coimbatore, India

Abstract
YOLO represents one-stage, also called regression-based, object detection: objects in the given input are directly classified and located instead of first proposing candidate regions. The accuracy of two-stage detection is higher than that of one-stage detection, whereas the speed of one-stage detection is higher. YOLO has become popular because of its detection accuracy, good generalization, open-source availability, and speed. YOLO owes its exceptional speed to treating frame detection as a regression problem, which eliminates the need for a complex pipeline. In agriculture, using remote sensing and drone technologies, YOLO classifies and detects crops, diseases, and pests, and is also used for land use mapping, environmental monitoring, urban planning, and wildlife. Recent research highlights YOLO's impressive performance in various agricultural applications. For instance, YOLOv4 demonstrated high accuracy in counting and locating small objects in UAV-captured images of bean plants, achieving an AP of 84.8% and a recall of 89%. Similarly, YOLOv5 showed significant precision in identifying rice leaf diseases, with a precision rate of 90%. In this review, we discuss the basic principles behind YOLO, the different versions of YOLO, their limitations, and YOLO applications in agriculture and farming.

Correspondence: Kumaraperumal R, Department of Remote Sensing and Geographic Information System, Tamil Nadu Agricultural University, Coimbatore, India.
E-mail: kumaraperumal.r@tnau.ac.in

Key words: agriculture; computer vision; deep learning; object detection; real-time farming; YOLO.

Contributions: all the authors contributed equally, read and approved the final version of the manuscript, and agreed to be accountable for all aspects of the work.

Conflict of interest: the authors declare that they have no competing interests, and all authors confirm accuracy.

Received: 14 March 2024.
Accepted: 13 July 2024.

©Copyright: the Author(s), 2024
Licensee PAGEPress, Italy
Journal of Agricultural Engineering 2024; LV:1641
doi:10.4081/jae.2024.1641

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).

Publisher's note: all claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Introduction
An increasing population results in an increasing demand for food, which in turn requires increased production of agricultural products. Even though production is rising, we cannot meet the need because of several factors such as pest and disease attacks, improper harvesting, climate factors, and biodiversity loss. Using advanced technologies like drones, artificial intelligence (AI), and robots, we can manage those factors. Thus, by introducing AI techniques we can improve production in the agriculture sector and reduce crop loss by limiting pest and disease attacks, improving nutrient management, and harvesting on time. Using deep learning, big data, and the Internet of Things (IoT), we can monitor crops, predict yield, manage irrigation, manage weeds, and detect plant stress. In computer vision, the most challenging and fundamental task is object detection. Object detection involves accurately finding the object in the input image and classifying it according to the labels. Object detection, object classification, instance segmentation, and semantic segmentation are related tasks (Xiao et al., 2020; Zhang and Cloutier, 2021). With the development of deep learning, object detection has moved from classical machine learning to deep learning methods. In remote sensing, object detection is more challenging due to the smaller number of available datasets and low-resolution images (Teng et al., 2019; Yin et al., 2018, 2019).
Object detection is classified into two types based on its working stages: two-stage and single-stage object detection. Two-stage object detection is represented by R-CNN (Region-based Convolutional Neural Network) (Girshick, 2015). It involves object detection in two stages: first, candidate regions are generated in the images; then, regression processing and object classification are performed on the candidate regions (Papageorgiou et al., 1998; Zhou et al., 2018). In our review, YOLO represents one-stage, regression-based object detection, in which the object in the given input is directly classified and located instead of using candidate regions. The accuracy of two-stage detection is higher than that of one-stage detection, whereas one-stage detection is faster (Dollár et al., 2014; Song et al., 2011). In real-time object detection, the YOLO (You Only Look Once) algorithm is remarkable for its accuracy and speed compared to other algorithms such as DPM (Deformable Parts Model), OverFeat, SSD (Single Shot MultiBox Detector), R-CNN (Regions with Convolutional Neural Network features), SPPNet (Spatial Pyramid Pooling Network), Fast R-CNN, and Mask R-CNN, providing reliable results in a brief period.

The main idea of GoogLeNet (Zhong et al., 2015) has been incorporated into the YOLO network, and the deep convolutional network can be improved by training on large-scale image datasets such as ImageNet and COCO. In agriculture, YOLO classifies and detects crops (Espinoza-Hernández et al., 2023; Tian et al., 2019; Wu et al., 2020), weeds (Ajayi et al., 2023), and diseases and pests (Lippi et al., 2021), and is used for land use mapping (Cheng et al., 2021), environmental monitoring (Zakria et al., 2022), urban planning (Qing et al., 2021), and wildlife (Roy et al., 2023).

YOLO (you only look once)
YOLO is a real-time object identification technique that was introduced in 2015 by Redmon and colleagues in their research paper "You only look once: unified, real-time object detection" (Redmon et al., 2016) (Figure 1). YOLO approached object detection as a spatial regression problem. The YOLO method is a direct object detection technique that employs a single neural network to forecast several bounding boxes and the corresponding probability of each box's class. YOLO trains and enhances detection performance directly on full-size images, implicitly adding contextual knowledge about classes and their visual properties. Notably, Fast R-CNN, a leading detection method, tends to misinterpret background patches as objects due to its limited contextual awareness.

History of YOLO

YOLOv1 - you only look once version 1
YOLO spatially detects objects present in images based on regression by predicting associated class probabilities and separate bounding boxes. YOLO's speed is high because it only requires the image to be passed through the network once to obtain the final detection result. This makes it possible for YOLO to perform real-time object detection on videos as well (Jiang et al., 2022). YOLO predicts class probabilities and bounding boxes with a single neural network in a single evaluation, so the algorithm can be optimized for object detection directly (Redmon et al., 2016). Real-time object detection has thereby been achieved, and objects can be detected even in videos with high frame rates. Fast R-CNN, an object detection algorithm, makes errors by identifying background patches as objects in an image, but YOLO makes less than half as many such errors. Because of YOLO's generalizability, it is more stable when applied to a new domain of interest or to unexpected inputs: when YOLO trained on natural images was tested on artworks, it outperformed other algorithms such as R-CNN and DPM, although the accuracy of YOLOv1 was lower. At test time YOLO is extremely fast because it requires only a single network evaluation. Non-maximum suppression (NMS) is used to reduce multiple-detection errors. Source code: https://github.com/pjreddie/darknet
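Concretely, the single-pass workflow looks like the following minimal sketch. It assumes the modern Ultralytics Python package rather than the original Darknet code, and a hypothetical image file "field.jpg"; the confidence and IoU thresholds feed the internal NMS step.

```python
# Minimal single-pass YOLO inference sketch (assumes the Ultralytics
# package; pretrained weights are downloaded automatically on first use).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # any pretrained YOLO checkpoint

# One forward pass returns boxes, classes, and confidences; non-maximum
# suppression is applied internally with the given thresholds.
results = model("field.jpg", conf=0.25, iou=0.45)

for r in results:
    for box in r.boxes:
        cls_name = model.names[int(box.cls)]
        print(f"{cls_name}: conf={float(box.conf):.2f}, xyxy={box.xyxy.tolist()}")
```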
YOLO9000
YOLO9000 can detect over nine thousand object categories. A 2% improvement in mAP is achieved by adding batch normalization to all convolutional layers in YOLO, which allows dropout to be removed from the model without overfitting. YOLO9000 was also adjusted to work better on higher-resolution inputs, yielding a further 4% mAP increase. For predicting bounding boxes, YOLOv2 uses anchor boxes instead of fully connected layers. The network input was reduced to 416 x 416 instead of 448 x 448 to obtain an odd number of locations in the feature map, so that there is a single centre cell. YOLOv2 also uses multiscale training: every 10 batches, a different input resolution is chosen by the algorithm itself, so the network learns to detect objects at different resolutions. At lower resolutions the algorithm works fairly accurately, producing 69 mAP, and at higher resolutions it still operates above real-time speed, producing 78.6 mAP (Redmon and Farhadi, 2017). On the PASCAL VOC 2012 dataset, YOLOv2 runs faster than other methods while achieving 73.4 mAP (Table 1).
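The multiscale training idea can be sketched as a short training loop; this is an illustration with assumed placeholder model and dataloader objects, not the Darknet implementation.

```python
# YOLOv2-style multiscale training sketch: every 10 batches the input
# resolution is re-drawn from multiples of 32 (320-608 px), so one
# network learns to detect at several scales.
import random
import torch.nn.functional as F

scales = [320 + 32 * i for i in range(10)]  # 320, 352, ..., 608

def train_epoch(model, loader, optimizer):
    size = 416
    for step, (images, targets) in enumerate(loader):
        if step % 10 == 0:                  # re-sample every 10 batches
            size = random.choice(scales)
        images = F.interpolate(images, size=(size, size), mode="bilinear")
        loss = model(images, targets)       # assumed to return the loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```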
In YOLOv2, a detection dataset (which provides bounding boxes, objectness, and common-object classes) and a classification dataset (which expands the number of categories the algorithm can detect) are used cooperatively. A multi-label model combines data that are not mutually exclusive. If an image is labelled for detection, the network backpropagates the complete YOLOv2 loss function; when YOLOv2 encounters an image that carries only classification labels, it backpropagates the loss only from the parts of the architecture that are specific to classification. Source code: https://pjreddie.com/darknet/yolov2/

YOLOv3
Redmon and Farhadi published YOLOv3 in ArXiv in 2018. Despite a bigger architecture, they maintained real-time performance. The YOLOv3 backbone is made up of 53 convolutional layers. It predicts objects using a multiscale prediction method with bounding boxes at different grid sizes, which improves the prediction of smaller objects. Using regression, YOLOv3 predicts an objectness score for each bounding box: the anchor box with the highest overlap with a ground-truth object is given an objectness score of 1, whereas the other boxes are given 0. The authors added Spatial Pyramid Pooling to the backbone of the architecture, which improves AP50 by 2.7%. YOLOv3-spp achieves 36.2% average precision (AP) on the MS COCO dataset, and YOLOv3 achieves 60.6% AP50 at 20 FPS, two times faster than comparable detectors (Redmon and Farhadi, 2018). Source code: https://pjreddie.com/darknet/yolo/

Figure 1. Timeline of different versions of YOLO algorithms.


Table 1. YOLOv2 performance on the PASCAL VOC 2007 and PASCAL VOC 2012 datasets.

Detection frame | Resolution | FPS | mAP (%)
VOC 2007 dataset
YOLOv2 (Redmon and Farhadi, 2017) | 288 x 288 | 91 | 69
YOLOv2 | 352 x 352 | 81 | 73.7
YOLOv2 | 416 x 416 | 67 | 76.8
YOLOv2 | 480 x 480 | 59 | 77.8
YOLOv2 | 544 x 544 | 40 | 78.6
VOC 2012 dataset
YOLOv2 | 544 x 544 | — | 73.4
SSD (Liu et al., 2016) | 512 x 512 | — | 74.9
SSD | 300 x 300 | — | 72.4
YOLOv1 | — | — | 57.9
Fast R-CNN (Girshick, 2015) | — | — | 68.4
Faster R-CNN (Zhang et al., 2016) | — | — | 70.4

Table 2. Some studies related to object detection using YOLO in agriculture.

Author | YOLO model used | Number of images used for training | Accuracy | Resolution for training (px) | Inference
Buzzy et al., 2020 | Tiny-YOLOv3 | >1000 | Inference time 0.01 s; F1 score 0.94; FPR 24% | 410 x 410 | Counting of plant leaves using the Tiny-YOLOv3 model
Hamidisepehr et al., 2020 | YOLOv2 | 478 | AP 97% to 55.99% | 570 x 430 | Compared different object detection algorithms for corn damage assessment
Bazame et al., 2021 | Tiny-YOLOv3 | — | mAP 84%; F1 score 82%; precision 83%; recall 82% | 800 x 800 | Mapping, classification, and detection of coffee fruits from videos using computer vision (Tiny-YOLOv3) in the Patos de Minas region, Brazil
Ohnemüller and Briassouli, 2021 | Scaled YOLOv4 | 3782 | 10% higher mAP score than the baseline model | 480 x 480 | Improvement of YOLOv4 accuracy and efficiency for detection of plants using the MS COCO dataset
Nugroho et al., 2022 | YOLOv4 | 400 | Average accuracy 94.6% | 1024 x 720; 720 x 480 | Detection of tomato ripeness using different deep learning models; the prediction results improved as the total loss was reduced
Wiggers et al., 2022 | YOLOv3 and YOLOv4 | 68 | AP 84.8% (YOLOv4); recall 89% (YOLOv4) | 416 x 416 | Bean plants captured using UAV and counted using YOLOv3 and YOLOv4; YOLOv4 performed better because of Spatial Pyramid Pooling (SPP)
Zhang and Li, 2022 | YOLO-VOLO-LS | 300 | Recall 96.059%; precision 96.014%; F1 score 96.039% | 384 x 384 | YOLO used for object detection and VOLO for variety identification of early lettuce seedlings
Ajayi et al., 2023 | YOLOv5 | 254 | Recall 69.2%; precision 82.3%; F1 score 75.2% | 416 x 416 | Automatic detection of crops classified as banana, sugarcane, pepper, spinach, and weed using YOLOv5 on data collected through UAV; too many epochs weakened the model
Yeh et al., 2024 | YOLOv4 | 94 | Accuracy 0.97; F1 score 0.91 | 224 x 224; 896 x 896 | Using the Mish function, the accuracy of the YOLOv4 model is improved in counting and locating small objects
Haque et al., 2022 | YOLOv5 | 1500 | Precision 90%; recall 67%; mAP 76%; F1 score 81% | 416 x 416 | Rice leaf diseases detected and classified using the YOLOv5 model, trained in Google Colab
Sulemane et al., 2022 | Tiny YOLO versions | 1696 (RGB only) | mAP <70% | 406 x 406 | Gaps between plantations automatically identified to reduce water wastage in orchards; Tiny YOLO performed well; different spectral images such as NDVI and NDWI were used for the identification of gaps


YOLOv4
In April 2020, YOLOv4 was introduced by Bochkovskiy and colleagues in ArXiv. YOLOv4 aimed to discover the ideal equilibrium by exploring numerous modifications classified as "bag-of-freebies" and "bag-of-specials". "Bag-of-freebies" encompasses techniques that alter the training strategy and increase training cost, yet without increasing inference time, with data augmentation being the predominant example. Conversely, "bag-of-specials" includes methods that slightly increase inference cost but markedly enhance accuracy. In YOLOv4, Self-Adversarial Training (SAT) is used, in which the ground-truth objects are hidden and the network must detect the correct objects based on the original labels. An AP of 43.5% is achieved on the MS COCO dataset test-dev 2017, with 65.7% AP50 at more than 50 FPS on an NVIDIA V100 (Bochkovskiy et al., 2020). Source code: https://github.com/AlexeyAB/darknet
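As an illustration of "bag-of-freebies" augmentation (applied only at training time, with no inference cost), the sketch below performs a random horizontal flip and HSV jitter with OpenCV; the image path is a hypothetical example, and in practice the box labels must be flipped together with the image.

```python
# Train-time augmentation sketch: random flip plus hue/saturation/value
# jitter, two common "bag-of-freebies" transforms.
import cv2
import numpy as np

def augment(image):
    # random horizontal flip (remember to mirror box labels as well)
    if np.random.rand() < 0.5:
        image = np.ascontiguousarray(image[:, ::-1, :])
    # random HSV jitter (OpenCV uint8 hue range is 0-179)
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV).astype(np.float32)
    hsv[..., 0] = (hsv[..., 0] + np.random.uniform(-5, 5)) % 180
    hsv[..., 1:] *= np.random.uniform(0.7, 1.3, size=2)
    hsv = np.clip(hsv, 0, 255).astype(np.uint8)
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

augmented = augment(cv2.imread("bean_plant.jpg"))  # hypothetical image
```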
YOLOv5
A few months after the release of YOLOv4, YOLOv5 was released by Glenn Jocher. YOLOv5 was developed in PyTorch. It uses the AutoAnchor method, which checks the anchor boxes for unsuitability for the training settings and dataset and adjusts them. Installation of YOLOv5 on IoT devices is easier because it is written in the Python programming language. Even though no article was published by the author for YOLOv5, it is said that YOLOv5 outperforms the previous versions. Different model versions of YOLOv5 have been released, namely YOLOv5n (nano), YOLOv5s (small), YOLOv5m (medium), YOLOv5l (large), and YOLOv5x (extra-large), whose convolutional sizes are changed for different hardware requirements and applications. YOLOv5x is developed for high-resource devices with high performance, whereas YOLOv5s and YOLOv5n are developed for low-resource devices. An AP of 50.7% is achieved by YOLOv5x with an image size of 640 pixels on the MS COCO dataset test-dev 2017. Source code: https://github.com/ultralytics/yolov5
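The released model sizes can be loaded directly from PyTorch Hub, as in this sketch; the weights download on first use and the image path is a hypothetical example.

```python
# Loading different YOLOv5 sizes from PyTorch Hub: larger variants trade
# speed for accuracy (n/s for edge devices, x for high-resource GPUs).
import torch

yolov5s = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
yolov5x = torch.hub.load("ultralytics/yolov5", "yolov5x", pretrained=True)

results = yolov5s("rice_field.jpg")  # hypothetical field image
results.print()                      # per-class detections and timing
```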
YOLOv6
Meituan's Vision AI Department published YOLOv6 in ArXiv in 2022. Using post-training quantization (PTQ) and quantization-aware training (QAT), YOLOv6 inference speed was boosted without much reduction in performance. YOLOv6 was mainly developed for and focused on industrial applications, and it incorporates a self-distillation strategy. In the network design, RepBlock (Ding et al., 2021) is used to construct the backbone for small models, and the CSP block (Wang et al., 2020) is used for large models. For the neck (Liu et al., 2018), PAN topology (used in YOLOv5 and YOLOv6) with RepBlocks or CSPStackRep blocks is adopted to form Rep-PAN, an enhanced version of the PAN topology. An Efficient Decoupled Head is used for head construction. For label assignment, task alignment learning (TAL) (Feng et al., 2021) is considered more efficient. YOLOv6 employs a hybrid-channel strategy to create a more streamlined decoupled head: the number of intermediate 3x3 convolutional layers is decreased to just one, and the head's width is simultaneously adjusted by the width multiplier for both the backbone and the neck. These adjustments effectively diminish computational expense, resulting in decreased inference latency. YOLOv6 adopts an anchor-free (anchor point-based) detector (Ge et al., 2021; Tian et al., 2019), in which the box regression branch predicts the distance from the anchor point to all four sides of the bounding box (Li et al., 2022). Source code: https://github.com/meituan/YOLOv6

YOLOv7
Wang and colleagues published YOLOv7 in ArXiv in July 2022. YOLOv7 outperformed all existing object detectors in both accuracy and speed in the 5 FPS to 160 FPS range. Like YOLOv4, YOLOv7 was trained solely on the MS COCO dataset without leveraging pre-trained backbones. YOLOv7 introduced several architectural modifications and a set of "bag-of-freebies", contributing to enhanced accuracy without compromising inference speed, with the only impact being on training time. ELAN is a strategy developed to improve the learning and convergence efficiency of a deep model by controlling the shortest and longest gradient paths. YOLOv7 introduced E-ELAN, a feature designed for models that stack an arbitrarily large number of computational blocks. E-ELAN increases network learning by shuffling and merging cardinality among distinct groups, boosting the learning process without affecting the integrity of the original gradient path. It attains the highest accuracy, exhibiting an astonishing 56.8% average precision (AP) and outperforming all other real-time object detectors intended for GPUs such as the V100 when working at 30 FPS or above (C.-Y. Wang et al., 2023). Source code: https://github.com/WongKinYiu/yolov7

YOLOv8
YOLOv8 uses two loss functions to increase its performance: the CIoU and DFL loss functions are utilized for bounding-box loss, whereas binary cross-entropy is employed for classification loss. These loss functions have been demonstrated to increase object detection performance, especially when dealing with tiny objects. YOLOv8 has a semantic segmentation component known as YOLOv8-Seg. The YOLOv8-Seg model has a prediction layer and five detection modules, which are similar to the detection heads of YOLOv8. This model has exhibited leading performance on a range of object detection and semantic segmentation benchmarks, all while sustaining fast processing and effectiveness. The model uses a CSPDarknet53 feature extractor as its backbone, followed by a C2f module instead of the usual YOLO neck architecture. For the prediction of semantic segmentation, the C2f module is followed by two segmentation heads. YOLOv8 also supports different integrations for labelling, training, and deployment. On the MS COCO dataset test-dev 2017, YOLOv8x achieved an average precision (AP) of 53.9% with an image size of 640 pixels, a large improvement over YOLOv5's AP of 50.7% at the same input size. YOLOv8x reaches a speed of 280 frames per second (FPS) when running on an NVIDIA A100 with TensorRT, as indicated by Terven and Cordova-Esparza (2023). Source code: https://github.com/ultralytics/ultralytics

YOLOv9
YOLOv9 involves the use of PGI (Programmable Gradient Information) and a lightweight network called GELAN (Generalised Efficient Layer Aggregation Network). PGI is an auxiliary supervision framework developed to solve information bottleneck problems such as the loss of information during the feedforward mechanism. PGI consists of three components: a main branch, an auxiliary reversible branch, and a multi-level auxiliary branch. The auxiliary reversible branch is employed in PGI to retain the information that would otherwise be lost to the information bottleneck. By introducing GELAN (formed by combining CSPNet and ELAN), they improved the model's architecture and reduced the information bottleneck (Tishby and Zaslavsky, 2015) that generally occurs during the feedforward mechanism.


C.-Y. Wang et al. (2024) used the MS COCO dataset to validate the model against other models. Training was train-from-scratch object detection, for a total of 500 epochs. From Figure 2, we can see that YOLOv9 performed well while utilizing fewer parameters. They also conducted ablation studies and found that the CSP block with ELAN gave good results; accuracy shows a linear relationship for ELAN and CSP block depths of 2 and more. Source code: https://github.com/WongKinYiu/yolov9


Figure 2. Comparison chart of YOLOv9 with other state-of-the-art object detectors.

Figure 3. YOLO and other state-of-the-art object detectors speed comparison.


Metrics for measuring the accuracy of YOLO

Mean average precision
For analysing the efficiency of object identification and segmentation, we often use a metric called mean average precision (mAP). Algorithms such as SSD, YOLO, and R-CNN use mAP to measure their performance. This statistic is often employed in benchmark challenges, including PASCAL VOC, COCO, etc. The procedure requires obtaining the mean of the average precision (AP) values computed across recall values that vary from 0 to 1. The mAP formula incorporates sub-metrics such as:
- Confusion matrix
- Intersection over Union (IoU)
- Recall
- Precision

Confusion matrix
The confusion matrix is a very popular measure used when solving classification problems. It can be applied to binary classification as well as to multi-class classification issues. Confusion matrices represent counts of predicted versus actual values. "TN" stands for true negative and shows the number of negative cases identified accurately. Similarly, "TP" stands for true positive and shows the number of positive cases identified accurately. "FP" denotes a false positive, i.e., the number of actual negative cases classified as positive, while "FN" denotes a false negative, the number of actual positive cases classified as negative. To obtain a confusion matrix, users need to pass the real values and the predicted values to the function (Kulkarni et al., 2020) (Figure 3).
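A minimal sketch of this, assuming scikit-learn and hypothetical crop/weed labels, where the real and the predicted values are passed to the confusion_matrix function:

```python
# Confusion matrix sketch: rows are actual classes, columns predicted.
from sklearn.metrics import confusion_matrix

y_true = ["crop", "crop", "weed", "weed", "crop", "weed"]
y_pred = ["crop", "weed", "weed", "weed", "crop", "crop"]

cm = confusion_matrix(y_true, y_pred, labels=["crop", "weed"])
print(cm)
# [[2 1]
#  [1 2]]
```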

Intersection over union (IoU)
Bounding boxes: bounding boxes are rectangular zones drawn around the object of interest in an image. We use x and y coordinates to represent the bounding boxes. Object detection methods such as YOLO, CNN-based detectors, and SSD use bounding boxes with probabilistic classes for identified objects (Breuers et al., 2016). Tracking of objects, instance segmentation (Hsu et al., 2019), and scene understanding are done in images using bounding boxes.
Intersection over Union (IoU): for the assessment of bounding boxes, we use the IoU metric. It quantifies the overlap between the predicted boxes and the ground-truth boxes: IoU is the ratio of the area of intersection of two bounding boxes to their area of union (Figure 4). The standard Pascal Visual Object Classes (VOC) Challenge 2007 requires that IoU values surpass 0.5 to be considered acceptable (Cowton et al., 2019).
Mathematically,

IoU = Area of Intersection / Area of Union

where the Area of Intersection is the region where the predicted and ground-truth bounding boxes overlap, and the Area of Union is the combined region covered by both the predicted and the ground-truth bounding boxes.
Intersection over Union values range from 0 to 1. An IoU of zero indicates no overlap between the ground-truth and predicted bounding boxes. An IoU of one indicates a perfect match, i.e., the ground-truth boxes are precisely aligned with the predicted bounding boxes. Figure 5 represents the threshold values used to check whether a prediction is a true positive or a false positive. For example, if the acquired IoU value exceeds the predefined threshold (e.g., 0.6), the prediction is counted as a true positive; otherwise it is treated as a false positive.
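The formula translates directly into code; a minimal sketch for two axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates:

```python
# IoU of two axis-aligned boxes, implementing the formula above.
def iou(box_a, box_b):
    # corners of the intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# e.g. a prediction exceeding a 0.6 threshold would count as a true positive
print(iou((10, 10, 50, 50), (20, 20, 60, 60)))  # ~0.39, below 0.6
```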
Table 3. YOLO applications in agriculture and key findings.

Source | YOLO version | Performance metrics | Dataset used | Specific task | Application area | Key findings
Rajamohanan and Latha, 2023 | YOLOv5 | — | Self-built dataset | Tomato leaf disease identification | Crop monitoring | Helps farmers manage disease-affected plants and take preventive measures
Ajikaran et al., 2023 | YOLOv5s | — | Public and self-built dataset | Stem blight and brown spot detection | Crop monitoring | —
Sportelli et al., 2023 | YOLOv3, YOLOv4, YOLOv5 | — | Weed and self-built dataset | Weed detection | Crop monitoring | Detection of weeds in different turfgrasses, such as manila grass, ryegrass, and bermudagrass
Chen Z. et al., 2022 | YOLOv5 | — | Hafiz Tayyab Rauf dataset | Citrus disease detection | Crop monitoring | Identified and classified citrus leaf diseases into citrus canker, citrus greening, and healthy leaf
Abhijit et al., 2023 | YOLOv8L | mAP@0.5 0.9795; mAP@0.5:0.95 0.8123 | Bird-eye chillies dataset (private) | Recognition and categorization of bird-eye chillies | Quality assessment | Relieves farmers of the tedious work of categorizing and recognizing bird-eye chillies
Hobbs et al., 2021 | YOLOv5 | — | Corn kernel counting dataset | Counting and estimation of on-ear corn kernels | Crop monitoring | YOLOv5 outperformed other algorithms in localization and counting of on-ear corn kernels in images obtained from agricultural fields
Song et al., 2020 | YOLOv4 | Accuracy 97.65% | Self-built dataset | Grain counting | Crop monitoring | The study found YOLOv4 suitable and effective for counting grains
Zhu et al., 2023 | EADD-YOLO | mAP 95.5% | Apple leaf disease dataset (ALDD) | Apple leaf disease detection | Crop monitoring | The model has few parameters with high calculation accuracy compared with other models
Yu et al., 2022 | — | — | Self-built dataset | Counting and estimation in asparagus | Quality assessment | Supports mechanized harvesting of asparagus to reduce labour costs and increase production efficiency
— | YOLOv5 | — | Apple flower dataset, PlantVillage dataset | Apple flower detection | Crop monitoring | Apple flower detection indicates the apple thinning time and helps predict yield

Precision and recall
Precision and recall serve as commonly employed and favoured metrics in classification tasks. Precision assesses the model's accuracy in predicting positive values, thus quantifying the correctness of positive predictions; this measure is alternatively referred to as the positive predictive value. Recall, also termed sensitivity, evaluates a model's capability to predict positive outcomes effectively (Chen, 2021; Pedregosa et al., 2011).


Precision and recall are computed from the true positive (TP), false positive (FP), and false negative (FN) counts:

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 x (Precision x Recall) / (Precision + Recall)

TP (true positive) = the objects were detected as those objects. FP (false positive) = objects other than the targets were detected as the target objects. FN (false negative) = the objects were not detected as those objects. A good F1 score suggests that good precision and recall values were attained.
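A short sketch computing these three metrics from example TP, FP, and FN counts (the numbers are illustrative only):

```python
# Precision, recall, and F1 from confusion-matrix counts.
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# precision=0.90 recall=0.75 f1=0.82
```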
Figure 4. Visual representation of intersection over union (IoU).

Non-maximum suppression algorithm
Non-maximum suppression (NMS) is employed as a post-processing methodology to enhance object detection by mitigating overlapping bounding boxes and improving overall accuracy. During the detection process, the algorithm commonly produces numerous bounding boxes around the desired object, each accompanied by a distinct confidence score (Figure 6). To eliminate redundant and repetitive boxes and retain only the most accurate ones, we utilize NMS (Hosang et al., 2017).

General steps of the NMS algorithm, following Subramanyam (2021):
i) Confidence threshold and IoU threshold values are defined.
ii) Bounding boxes are sorted in descending order of confidence.
iii) Boxes with a confidence lower than the confidence threshold are removed.
iv) A loop is then executed, keeping the highest-confidence box as the first.
v) The IoU of the current box is calculated with every remaining box belonging to the same class.
vi) If the IoU of two boxes exceeds the IoU threshold, the box with the lower confidence is removed from the list.
vii) This step is repeated until all the boxes in the list have been processed.
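The seven steps above map directly onto a short routine; this sketch assumes detections given as (x1, y1, x2, y2, confidence, class_id) tuples and reuses the iou() function from the IoU section.

```python
# Direct implementation of the NMS steps listed above.
def nms(detections, conf_thresh=0.5, iou_thresh=0.5):
    # steps i-iii: drop low-confidence boxes, sort by confidence descending
    boxes = [d for d in detections if d[4] >= conf_thresh]
    boxes.sort(key=lambda d: d[4], reverse=True)
    kept = []
    # steps iv-vii: keep the best box, remove same-class boxes that overlap
    # it above the IoU threshold, and repeat until the list is exhausted
    while boxes:
        best = boxes.pop(0)
        kept.append(best)
        boxes = [d for d in boxes
                 if d[5] != best[5] or iou(d[:4], best[:4]) < iou_thresh]
    return kept
```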
Figure 5. Evaluation of IoU.

YOLO architecture and design principles
YOLO partitions an image into a grid with dimensions S x S. Within each grid cell, predictions are made for B bounding boxes and their corresponding confidence levels. The confidence of an object indicates the reliability and accuracy of the bounding box that both identifies and classifies the object (Štancel and Hulič, 2019). The core idea guiding the detection of an object within any grid cell is that the centre of the object must be situated inside that specific grid cell; the detection of a particular object is the responsibility of that grid cell, aided by an appropriate bounding box (Diwan et al., 2023). The grid cell forecasts parameters for each bounding box, with the initial five parameters being specific to that bounding box; the remaining parameters are common to all bounding boxes within the same grid cell, regardless of the bounding boxes present.
The variable pc denotes the probability of an object being present in the grid cell through the associated bounding box. The coordinates bx, by specify the centre of the predicted bounding box, while bw, bh indicate its predicted dimensions. The term p(ci) signifies the conditional probability of the object belonging to the ith class, given pc, where n is the total number of classes or categories. In total, a grid cell generates (B x 5 + n) values, where B represents the number of bounding boxes per grid cell. The shape of the output tensor is S x S x (B x 5 + n), since the image is divided into an S x S grid (Diwan et al., 2023).
The confidence score (cs) for each bounding box in a grid is calculated by multiplying pc with the IoU between the ground-truth and the predicted bounding box. If no object is present in the grid cell, the confidence score is set to zero. We calculate the class-specific score (CSS) for each bounding box across all grid cells; this score reflects both the probability of the class being present in that box and the degree to which the predicted box accurately aligns with the object. Typically, these bounding boxes vary in size to accommodate different shapes and effectively capture various objects; they are referred to as anchor boxes. The objective is to detect an object in an image with a bounding box in which the centre of the object lies; however, multiple object centres may fall within the same bounding box. The authors introduce the term "anchor boxes" to denote the bounding boxes associated with a single grid cell. Anchor boxes constitute a set of standardized bounding boxes, chosen by analysing the dataset and the objects in it. These selected anchor boxes aim to cover most classes/categories by considering diverse combinations of width and height, such as vertical, square, or horizontal rectangles, ensuring the representation of various aspect ratios and scales for all objects present in the dataset.
The CNN demonstrates remarkable performance in extracting features from visual input by efficiently transmitting low-level features from the initial convolutional layers to subsequent ones in a deep CNN. The key challenge lies in precisely identifying multiple objects and determining their positions within a single visual input. Effective handling of the YOLO object detection problem is facilitated by two essential CNN features: parameter sharing and the use of multiple filters.
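The output layout can be illustrated with a small NumPy sketch; S=7, B=2, and n=20 follow the original YOLOv1 defaults and stand in for a real network output.

```python
# YOLOv1-style output layout: an S x S grid, B boxes per cell, and n
# class probabilities, giving an S x S x (B*5 + n) tensor.
import numpy as np

S, B, n = 7, 2, 20
output = np.random.rand(S, S, B * 5 + n)  # stand-in for network output

cell = output[3, 4]                        # one grid cell
for b in range(B):
    pc, bx, by, bw, bh = cell[b * 5: b * 5 + 5]
    print(f"box {b}: objectness={pc:.2f}, centre=({bx:.2f}, {by:.2f})")

class_probs = cell[B * 5:]                 # p(c_i), shared by the cell's boxes
# class-specific score = pc * p(c_i), as defined in the text
print("most likely class:", (cell[0] * class_probs).argmax())
```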

Figure 6. Application of NMS to remove redundant bounding boxes.

Applications of YOLO in agricultural remote sensing

Detection of objects in satellite imagery
Benayad et al. (2023) employed YOLOv3 to discover geomembrane basins automatically from satellite imagery. They used 100 high-resolution satellite photos from Google Earth to train the model. The algorithm focused on classifying five main objects: geomembrane basins, crop areas, roads, houses, and bare fields. To enhance the training process, over 300 basins were annotated and trained using Darknet, chosen for its precision and speed. In the evaluation on fresh images, an average precision of 80.6% was achieved, with a precision of 83.3%. However, YOLOv3 exhibited poor performance when dealing with small or closely located objects.
Li et al. (2020) detected agricultural greenhouses (AG) in areas of Baoding, Hebei province, China, by comparing different algorithms: Faster R-CNN, YOLOv3, and SSD (single shot multibox detector). In their work they fused high-resolution Gaofen-1 (2 m spatial resolution) and Gaofen-2 (1 m spatial resolution) satellite images for the detection of AG. All the architectures were implemented in the PyTorch deep learning framework; the Darknet model of YOLOv3 was converted to PyTorch. By adapting the Feature Pyramid Network (FPN) and multilabel classification in YOLOv3, detection was enhanced. Among the different architectures, YOLOv3 performed best, with an mAP (GF-1 and GF-2) of 90.4% at 73 FPS. They concluded that to increase detection quality, the spatial resolution of the input images needs to be increased.
Tundia et al. (2020) detected minor irrigation structures using Google satellite images. They compared the speed and accuracy of Faster R-CNN, YOLOv3, Tiny YOLOv3, and RetinaNet. Tiny YOLOv3 had the least inference time among the architectures due to its reduced convolutional layers, but its accuracy was lower (Tables 2 and 3).
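Because satellite scenes are far larger than a detector's input size, a common practical pattern, sketched below under the assumption of the Ultralytics package and a hypothetical exported scene image (it is not taken from the cited studies), is to tile the scene, detect per tile, and shift the boxes back into scene coordinates:

```python
# Tile-and-detect sketch for large scenes; overlapping tiles reduce
# missed objects at tile edges (a production version would also run
# cross-tile NMS and pad the borders).
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
scene = cv2.imread("scene.png")            # hypothetical exported GeoTIFF
tile, stride = 640, 512                    # 128 px overlap between tiles
detections = []
h, w = scene.shape[:2]
for y in range(0, max(h - tile, 1), stride):
    for x in range(0, max(w - tile, 1), stride):
        crop = scene[y:y + tile, x:x + tile]
        for box in model(crop, verbose=False)[0].boxes:
            x1, y1, x2, y2 = box.xyxy[0].tolist()
            # shift tile coordinates back to scene coordinates
            detections.append((x1 + x, y1 + y, x2 + x, y2 + y,
                               float(box.conf)))
print(f"{len(detections)} raw detections before cross-tile NMS")
```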

Tree detection
For the detection of date palms in regions of the Arabian Peninsula, North Africa, and the Middle East, Jintasuttisak et al. (2022) used state-of-the-art YOLOv5 (small, medium, large, and extra-large), YOLOv3, YOLOv4, and SSD300. They randomly selected 125 images captured using an RGB drone camera, of which 60% were used for training, 20% for validation, and 20% for testing, and then applied data augmentation, which increased the size of the training dataset five-fold. From their studies, they concluded that YOLOv5m (medium CNN depth) performed better than the other architectures, with an mAP of 92.34%, while YOLOv5s had the lowest training time (11.33 ms) because of its small CNN network. Nurhabib and Seminar (2022) identified and counted oil palm trees using YOLO with Citra satellite series (1, 2, 3) images. Özer et al. (2022) carried out an inter-comparative analysis of YOLOv5, comparing the results of YOLOv5s, YOLOv5m, and YOLOv5x for the detection of cherry trees in Afyonkarahisar. A total of 889 images were obtained; 80% were used for training and the rest for testing. The YOLOv5s model performed well, obtaining precision, recall, and F1 scores of 0.983, 0.978, and 0.980, respectively. Palm tree detection was carried out by Ariyadi et al. (2023) using 500 UAV images. The detection was carried out using YOLOv7, with a precision of 98.5%, recall of 98.17%, overall accuracy of 98.31%, and mean average precision of 99.7%. For training they used 80% of the data, and 20% was used for testing. For each image, the detection time ranged from 17 ms to 18.4 ms.
Monitoring forests enables us to tackle the loss of biodiversity in forest ecosystems and the effects of climate change. Straker and colleagues (2023) counted trees and segmented tree crowns using YOLOv5 and a tessellation approach. They used the "FOR-instance" dataset, which consists of 4192 annotated images. The YOLO model performed 27% and 34% better than the individual tree crown approach at point densities of 50 and 10 points m-2, respectively.
In countries such as India, transmission lines pass through cultivated land, and it is important to monitor these lines to avoid damage from trees growing under them. Xu et al. (2023) used YOLOv7 and YOLOv4 to classify tree species in transmission line corridors. They classified trees into betel nut, jackfruit, neem, banyan, rubber, and coconut, with 9531, 4688, 1113, 2336, 2195 and 290 labels, respectively. The images were collected by drones mounted with an MS600 Pro multispectral camera. They also applied image augmentations such as flipping, random cropping, colour dithering, rotation, scaling, and affine transformation. The images were input using three different band combinations, i.e., R-G-B, NIR-R-G, and NIR-G-B. YOLOv7 achieved an average accuracy of 75.77%, and among the different band combinations the RGB composition acquired the highest mean mAP.

Weed detection
Etienne et al. (2021) used YOLOv3 for the identification of monocot and dicot weeds in corn and soybean research plots. They created four different training image sets, with images acquired from 10 m above ground level (AGL), 30 m AGL, 30 m and 10 m AGL combined, and 10 m AGL with only dicot weeds. The obtained images were reduced to 416 x 416 pixels before training, and 25,560 weed instances were manually annotated. Average precision (AP) scores of 91.48% and 86.13% were obtained at a threshold of 0.25.
Gallo et al. (2023) used UAV images, for their flexibility of data acquisition and high resolution, and created 12,113 bounding box annotations from 3000 collected RGB images. In their studies they used two datasets: one specifically developed for chicory plantations, called the chicory plant (CP) dataset, and another called Lincoln beet (LB). For detection they used YOLOv7 and obtained mAP@0.5, precision, and recall of 56.6%, 61.3%, and 62.1%, respectively, on the CP dataset. Using the LB dataset, they improved overall mAP, mAP for weeds, and mAP for sugar beets from 51% to 61%, from 67.5% to 74.1%, and from 34.6% to 48%, respectively. For spraying weedicide, Narayana and Ramana (2023) developed object detection using YOLOv7, trained on two datasets: the early crop weed detection dataset (308 images) and the 4weed dataset (618 RGB images). They used 90% of the data for training and 10% for testing. The model was trained and tested in Google Colab, a cloud-based environment. An mAP of 99.6% was obtained for the early crop weed dataset and 78.53% for the 4weed dataset.
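Training runs like these follow a common fine-tuning pattern; a minimal sketch, assuming the Ultralytics package and a hypothetical dataset configuration file "weeds.yaml" (train/val image paths plus class names in YOLO format):

```python
# Fine-tuning sketch for a custom weed dataset.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                # start from pretrained weights
model.train(data="weeds.yaml", epochs=100, imgsz=416, batch=16)

metrics = model.val()                     # mAP, precision, recall on val set
print(metrics.box.map50)                  # mAP@0.5
```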

Fruit detection
Kumar and Kumar (2023) used a new approach to object detection, applying a multi-head attention mechanism and depth values to YOLOv7 for the detection of apples in an orchard. The input data were acquired with a DJI Mavic Mini 3; images were extracted from video and annotated with depth labels, and augmentations such as mirroring, blurring, and noise were applied to the input. This modified YOLOv7 consists of three detection heads, which also help to detect the depth of apples in the orchard, further used to estimate distribution and density. In the end, the base YOLOv7 could not identify all apples during detection, but the modified YOLOv7 (i.e., with the multi-head detection mechanism) detected almost all apples, giving precision, recall, and F1 scores of 0.91, 0.96, and 0.92, respectively. For better marketing, ripeness is an important factor for tomatoes; thus, tomatoes need to be harvested at the correct stage. For this, Appe et al. (2023) used a modified version of YOLO called CAM-YOLO, which combined YOLOv5 with the convolutional block attention module (CBAM) for detecting ripened tomatoes. By this they achieved an accuracy of 88.1% and performed better than the base YOLO. A tomato health monitoring system was developed by Quach et al. (2024) using a combined method of MobileNetv2 and YOLOv8 for the classification, counting, and detection of tomatoes. YOLOv8 performed well in the detection of small objects because the C3 module used in YOLOv5 was replaced with the C2f module in YOLOv8 (Sohan et al., 2024). For annotation they used RoboFlow and divided the dataset in a 6:2:2 ratio for training, validation, and testing, respectively. An image resolution of 640 x 640 was used for training the YOLOv8m and MobileNetv2 models, which achieved 95.76%, 95.74%, and 95.75% precision, recall, and F1 score, respectively.
Fukada et al. (2023) used YOLOv5 (pre-trained on the COCO 2017 dataset) to analyse tomato growth using industrial camera devices. This implementation of YOLOv5-based object detection reduced the effort required to analyse crop growth by 80%. Lawal (2021) detected tomatoes in complex environments using YOLO-Tomato (a modified version of YOLOv3), divided into three types: YOLO-Tomato-A, YOLO-Tomato-B, and YOLO-Tomato-C. YOLO-Tomato-C, which has a mish activation function with a front detection layer (FDL) and SPP, outperformed the other two types, producing an AP of 99.5%; the use of SPP improved the AP of the model compared to the other two. Fruits such as bananas, apricots, apples, and strawberries ripen faster than other fruits, and detection of ripened strawberries in fields by traditional methods is time-consuming and results in spoilage. An et al. (2022) developed a strawberry growth detection algorithm based on YOLOX. Though the model size remains the same as YOLOX, it has 3.64%, 2.04%, and 4.08% higher accuracy, recall, and precision, respectively. This model also addresses problems such as the low accuracy of models in complex environments. Chen et al. (2023) overcame dense and occluded grape detection and missed detection of grapes by developing a lightweight model called GA-YOLO. In this model, SE-CSPGhostnet is designed and introduced in the backbone, with 82.79% fewer parameters; it has an mAP of 96.87% and a detection speed of 55.867 FPS. Using artificial intelligence as a classifier and cameras as sensors, Chen M.-C. et al. (2022) assessed the external quality of fruits such as apples, oranges, and lemons based on size, height, width, etc., reducing labour intensiveness and improving work speed. They used the YOLOv3 algorithm for fruit detection and acquired an accuracy of 88% on 6000 test images. Detecting cherry fruits in open environments suffers reduced accuracy due to shading; thus, Gai et al. (2023) introduced an improved version of YOLOv4 called YOLOv4-dense, which has a modified backbone of CSPDarknet53 combined with DenseNet. Image augmentations such as flipping, zooming, and colour gamut changes were applied to the input images, and the rectangular bounding boxes were changed to circular bounding boxes, which increased the algorithm's speed and improved feature extraction. This model produced 0.15 higher mAP than YOLOv4.
With the help of computer vision, we can reduce input and labour costs and increase production efficiency. Gremes et al. (2023) counted green oranges directly from trees against green-leaf backgrounds using YOLOv4. The performance of the YOLOv4 model was compared with an optimal object detector model, where each orange in the captured video was detected frame by frame. By combining these two techniques, double-counting errors were reduced and the detected and actual orange counts were almost equal. The algorithm obtained mAP50, mAP50:95, precision, recall, F1-score, and average IoU of 80.16%, 53.83%, 0.92, 0.93, 0.93, and 82.08%, respectively.

Disease detection
Amarasingam et al. (2022) used a one-stage object detector, YOLOv5, for detecting white leaf disease in sugarcane using a DJI Phantom 4 equipped with RTK technology in regions of eastern Sri Lanka. The obtained images were augmented using the Python Augmentor package 0.2.9, and 1200, 240, and 240 images were used for the training, testing, and validation processes. They concluded that, among the different algorithms used, YOLOv5 outperformed the others in precision, mAP@0.5, and mAP@0.95 and has a very small model size of 14 MB compared to YOLOR, DETR, and Faster R-CNN. Amarasingam et al. (2022) also conducted object detection using XGB, RF, DT, and KNN in the same fields and obtained much lower accuracy than YOLOv5. Mathew and Mahesh (2022) used YOLOv3 for disease detection in apples. They identified diseases visible on apple tree leaves, such as black rot, cedar rust, and apple scab, classifying the image dataset into four classes and utilizing 1500 and 500 images per class for training and testing, respectively. At the 700th iteration, they obtained an average loss of 0.6010. da Silva et al. (2023), in their study on the detection of diseases in citrus, used YOLOv3 and Faster R-CNN for detection tasks and concluded that YOLO was faster and utilizes less computational power than Faster R-CNN. They used LabelImg (Tzutalin, 2015) for annotation; YOLOv3 and Faster R-CNN were run on a Keras back-end and evaluated using mAP. During detection they used mobile GPS to map how the infection spread spatially through the orchard. To detect crop leaf diseases, Dai and Fan (2022) used YOLOv5-CAcT with the Plant Village and AI Challenger datasets. The model achieved an accuracy of 94.24% over 59 crop disease categories and 10 crop species, with an average inference time of 1.563 ms and a model size of 2 MB.
Madhurya and Jubilson (2023) detected and classified plant leaf diseases using a YOLOv7 framework called YR2S (YOLO-Enhanced Rat Swarm Optimizer - Red Fox Optimization (RFO)-ShuffleNetv2). They used PCFAN for the generation of feature maps. The model detected and classified with a high accuracy of 99.69%. Bandi et al. (2023) used YOLOv5 for leaf disease detection and U2-Net to remove the background of the affected leaf; they also used a vision transformer for classifying the disease into different stages such as high, medium, and low. They used open datasets like PlantDoc and Plant Village and achieved an F1 score of 0.57 and a confidence score of 0.2 for YOLOv5 in disease detection. Bachhal et al. (2023) used CNN+YOLO, compared against other models, for the detection of maize plant disease. They used the Plant Village dataset with 100 images of common rust, 50 images of southern rust, 30 images of maize leaf blight, 30 images of turcicum leaf blight, 70 images of grey leaf spot, and 90 healthy leaf images. To detect verticillium fungus in olive trees, Mamalis et al. (2023) used different models of YOLOv5, namely nano, medium, and small. For annotation they used the LabelImg package and classified the trees as healthy or damaged with a withered effect. The images were trained at two sizes, 1216 x 1216 and 640 x 640. YOLOv5m with a model input of 640 x 640 outperformed the other models in their studies; they concluded that as the input size decreases and the model capacity increases, performance increases. Pine wilt disease (PWD) is one of the most dangerous diseases in forest regions because of its rapid spread and management challenges, and traditional methods suffer from excessive time consumption and poor accuracy. Detection of PWD in forest regions helps policymakers manage the situation based on the results. Zhu et al. (2024) used YOLOv7-SE for the detection of PWD from high-resolution helicopter images; the model achieved a precision of 0.9281, an F1 score of 0.9117, and a recall of 0.8958. Similarly, Wu et al. (2024) used YOLOv3 for detecting PWD from UAV images, applying the CIoU loss function for detecting forest pests and diseases. Yao et al. (2024) developed a model called Pine-YOLO (a modified version of YOLOv8) to identify PWD; this model achieved mAP@0.5 of 90.69%, mAP@0.5:0.95 of 49.72%, recall of 85.72%, precision of 91.31%, and F1-score of 88.43%.

Crop detection
Espinoza-Hernández et al. (2023) determined agave plant density using high-resolution RGB images captured by remotely piloted drones. They used YOLOv4 and YOLOv4-tiny for accurate detection at different phenological stages and produced a mean average precision of 0.99 for both architectures, with F1 scores of 0.95 and 0.96 for YOLOv4 and YOLOv4-tiny, respectively. Qin et al. (2021) developed an algorithm derived from YOLO called Ag-YOLO, which was operated on an NCS2 (Intel Neural Compute Stick 2). They also compared the developed model with YOLOv3-Tiny: Ag-YOLO outperformed YOLOv3-Tiny, producing a higher accuracy of 0.9205 (F1 score) and a higher frame rate of 36.5 FPS, two times faster, while using 12x fewer parameters.
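Plant-density and seedling-counting pipelines like those in this section ultimately reduce to tallying detections per class; a minimal sketch, assuming the Ultralytics package and a hypothetical UAV plot image (an illustration, not the cited authors' code):

```python
# Counting detections above a confidence threshold, tallied per class.
from collections import Counter
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
result = model("uav_plot.jpg", conf=0.4)[0]

counts = Counter(model.names[int(c)] for c in result.boxes.cls)
print(counts)                         # e.g. Counter({'plant': 132})
print("total objects:", sum(counts.values()))
```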

tecture. They utilized the UAV dataset provided by AIdea (25 rice upcoming days. We can make the YOLO algorithm use RGB
images with a resolution of 3000 x 2000, 19 images with a resolu- images as well as multispectral bands to analyse chlorophyll con-
tion of 2304 x 1728). The experiment was conducted in six models, tent and monitor the stress condition of crops in real-time
and it was found that model 6 (modified YOLOv4 with mish acti- (Thomson and Sullivan, 2006). Increased fertilizer application
vation function) had given accuracy of 0.97, an average precision results in wastage of input and has adverse effects on the environ-
of 0.917, and an F1-score of 0.91. Wang Y. et al. (2023) in their ment. Using YOLO, we can apply fertilizers to specified crops
studies proposed a YOLOv5-AC model for detecting the efficiency through IoT technology. By this, we can reduce the input cost,
of uncrewed rice transplanters. The model achieved an accuracy of reduce the wastage of the raw materials, and protect the environ-
95.8% and F1 score of 93.39%. Lu et al. (2023) modified YOLOv8 mental impact. Weeds play a crucial role in the agricultural field
for UAV-based object detection and developed a model for precise since they result in reduced yield of crops due to nutrient uptake by
agriculture. When compared with YOLOv8-N, this model per- them (Nath et al., 2024). In some studies, weeds were detected and
formed well by obtaining 0.921,0.883,0.937and 0.565 precision, management practices using YOLO algorithms. By implementing
recall, AP50 and AP50:95 respectively. Pu et al. (2023) used a YOLO, we need to distinguish between crops and weeds to apply
modified version of YOLOv7 called Tassel-YOLO which used site-specific management practices such as applying weedicides.
GSConv and VoVGSCSP module in the neck part and SIoU loss By collecting high-resolution images of croplands through drones
function in the head part. Tassel-YOLO achieves 96.14% we can predict the yield and plan harvesting. The data obtained can
mAp@0.5, with a counting accuracy of 97.55%. They used the be integrated with weather data, satellite imagery, and crop models
global attention mechanism (GAM) (Liu et al., 2021) which to create a decision support system for farmers (Table 4).
improves the feature representation ability through channel atten-
tion and the accuracy of spatial data through spatial attention
(Wang et al., 2018). Images were acquired using a DJI Mavic

ly
drone and the image resolution was reduced to 640 x 640 during Conclusions
Due to a lack of knowledge and experience, coffee farmers find it difficult to harvest coffee fruits at the right time. Bazame et al. (2022) detected and classified coffee fruits into unripe (green), overripe (dry), and ripe (cherry) classes using YOLOv3 and YOLOv4. The YOLOv4 and YOLOv4-tiny models performed well, obtaining mAP values of 81% and 79%, respectively. Camacho and Morocho-Cayamcela (2023) used YOLOv8 for the segmentation and detection of tomatoes at different maturity stages; YOLOv8 produced an R2 of 0.809, 0.897, and 0.968 for the ripe, half-ripe, and green categories, respectively. Tea quality depends on the correct identification and harvesting of perfect tea buds, which improves the industry's profit, but the harvest is labour-intensive and time-consuming. By combining the YOLOv3 algorithm with semantic segmentation, minimum bounding rectangles, and skeleton extraction, C. Chen et al. (2022) located the picking point of tea buds; YOLOv3 obtained an average accuracy of 71.96% for tea bud identification. Wang C. et al. (2023) developed an improved YOLOv5n model for accurate and rapid target detection. The model is suitable for lightweight, real-time applications and, compared with other YOLO versions, obtained an average accuracy of 95.2%.

Discussion

YOLO in agriculture
Monitoring of crops using drones is becoming increasingly popular. The YOLO algorithm can be made to work with RGB images as well as multispectral bands to analyse chlorophyll content and monitor the stress condition of crops in real time (Thomson and Sullivan, 2006).
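As a minimal sketch of the multispectral analysis mentioned above, a vegetation index such as NDVI can be computed from the red and near-infrared bands and thresholded to flag potentially stressed areas; the band arrays and the stress threshold below are hypothetical:

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / np.clip(nir + red, 1e-6, None)

# Hypothetical 8-bit band rasters from a UAV multispectral sensor
nir_band = np.random.randint(0, 256, (100, 100))
red_band = np.random.randint(0, 256, (100, 100))
index = ndvi(nir_band, red_band)
stressed = index < 0.3  # illustrative threshold; the cut-off is crop-dependent
print(f"{stressed.mean():.1%} of pixels flagged for closer inspection")
```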
Excessive fertilizer application wastes inputs and has adverse effects on the environment. By combining YOLO with IoT technology, fertilizers can be applied only to specified crops, which reduces input costs, limits the wastage of raw materials, and protects the environment. Weeds are a crucial problem in agricultural fields, since their nutrient uptake reduces crop yield (Nath et al., 2024). Several studies have used YOLO algorithms to detect weeds and guide management practices. By implementing YOLO, crops and weeds can be distinguished so that site-specific measures, such as spraying weedicides, are applied only where needed. By collecting high-resolution images of croplands through drones, yield can be predicted and harvesting planned; the data obtained can be integrated with weather data, satellite imagery, and crop models to create a decision support system for farmers (Table 4).
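A minimal sketch of such site-specific weed management is given below. It assumes the ultralytics Python package and a hypothetical weights file (crop_weed.pt) fine-tuned to separate "crop" and "weed" classes; the spot-sprayer logic is illustrative only:

```python
# Sketch: route YOLO weed detections to a spot sprayer.
# "crop_weed.pt" is a hypothetical fine-tuned model, not a published one.
from ultralytics import YOLO

model = YOLO("crop_weed.pt")
results = model.predict("field_tile.jpg", conf=0.5)

spray_targets = []
for box in results[0].boxes:
    label = results[0].names[int(box.cls)]
    if label == "weed":
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        # Centre of the weed's bounding box, e.g. for aiming a spot sprayer
        spray_targets.append(((x1 + x2) / 2, (y1 + y2) / 2))

print(f"{len(spray_targets)} weed locations queued for spot spraying")
```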

Table 4. Future directions and research opportunities.

Research area | Description | Potential impact on agriculture | Current state of research
Real-time processing | Enhance YOLO for quick inference times | Enables real-time monitoring and helps in decision making | Ongoing improvements in model architecture and hardware acceleration
Multispectral imaging | Integrate multispectral data with YOLO | Better detection of physiographic characters and diseases | Multispectral cameras are being combined with deep learning
Integration with robotics | Monitor and harvest by integrating YOLO | Improves efficiency and automates labour-intensive tasks | Computer vision is being combined with robotics in agriculture
Transfer learning | YOLO adapts to new datasets quickly | Less annotated data is sufficient | Pretrained models like Ag-YOLO are being developed
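The transfer-learning direction in Table 4 corresponds to a workflow like the sketch below: start from weights pretrained on a large generic dataset and fine-tune on a small annotated farm dataset. It assumes the ultralytics Python package; the dataset file (paddy_weeds.yaml) and the training settings are hypothetical:

```python
# Sketch of transfer learning with a pretrained YOLO model.
# "paddy_weeds.yaml" is a hypothetical dataset description (paths + class names).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")   # COCO-pretrained starting point
model.train(
    data="paddy_weeds.yaml",
    epochs=50,               # few epochs often suffice when transferring
    imgsz=640,
    freeze=10,               # keep the first backbone layers fixed
)
metrics = model.val()        # reports mAP@0.5 and mAP@0.5:0.95 on the validation split
```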

Conclusions

In agriculture, YOLO has been used for crop detection (Sneha et al., 2024), fruit detection (Appe et al., 2023), and pest and disease detection (Amara et al., 2023). However, the base YOLO algorithm struggles to identify small objects, which poses challenges for detecting crops like rice, sorghum, and maize. To address this, modified versions such as Tassel-YOLO for maize tassel detection (Pu et al., 2023), Ag-YOLO for broader agricultural studies (Qin et al., 2021), and a modified YOLOv4 for cherry detection (Gai et al., 2023) have been developed. Further enhancements include replacing the activation function with the Mish function and using a modified PANet in YOLOv4 to improve the counting and locating of small objects (Yeh et al., 2024). For real-time crop detection using UAVs, computational efficiency is crucial; a modified version of YOLOv7, called YR2S, has been used for disease detection, achieving a high accuracy of 99.69% (Madhurya and Jubilson, 2023). While YOLO has been increasingly used in agriculture, the newest versions have not yet been widely adopted. In other applications, such as livestock management, a modified YOLOv3 has been used for detecting and monitoring cow estrus behavior, though challenges remain due to size variations in the animals; the introduction of the DenseBlock structure improved detection in these cases (Z. Wang et al., 2024). Developing software for UAVs compatible with low-computation YOLO models is highly valued, and the latest YOLOv9 model, which requires less computational power, can enhance real-time performance on low-processing devices. YOLO has shown significant potential and versatility in agriculture and other fields, but ongoing improvements and adaptations are necessary to fully leverage its capabilities in real-world applications.


References

Abhijit, Akhil, S., Kumar, V.A., Jose, B.K., Abubeker, K. 2023. Computer vision assisted bird–eye chilli classification framework using YOLO V5 object detection model. In: Shrivastava, V., Bansal, J.C., Panigrahi, B.K. (eds.), Power Engineering and Intelligent Systems. PEIS 2023. Lecture Notes in Electrical Engineering vol 1097. Singapore, Springer.
Ajayi, O.G., Ashi, J., Guda, B. 2023. Performance evaluation of YOLO v5 model for automatic crop and weed classification on UAV images. Smart Agr. Technol. 5:100231.
Ajikaran, R., Hewarathna, A.I., Palanisamy, V., Joseph, C., Thuseethan, S. 2023. An image analysis-based automated method using deep learning for grain counting. IEEE 17th Int. Conf. on Industrial and Information Systems (ICIIS), Peradeniya. pp. 25-30.
Amara, S.J., Yamini, S., Sumathi, D. 2023. Pest detection using YOLO V7 model. In: Namasudra, S., Trivedi, M.C., Crespo, R.G., Lorenz, P. (eds.), Data Science and Network Engineering. ICDSNE 2023. Lecture Notes in Networks and Systems vol 791. Singapore, Springer.
Amarasingam, N., Gonzalez, F., Salgadoe, A.S.A., Sandino, J., Powell, K. 2022. Detection of white leaf disease in sugarcane crops using UAV-derived RGB imagery with existing deep learning models. Remote Sens. (Basel) 14:6137.
An, Q., Wang, K., Li, Z., Song, C., Tang, X., Song, J. 2022. Real-time monitoring method of strawberry fruit growth state based on YOLO improved model. IEEE Access 10:124363-124372.
Appe, S.N., Arulselvi, G., Balaji, G. 2023. CAM-YOLO: tomato detection and classification based on improved YOLOv5 using combining attention mechanism. PeerJ Comp. Sci. 9:e1463.
Ariyadi, M.R.N., Pribadi, M.R., Widiyanto, E.P. 2023. Unmanned aerial vehicle for remote sensing detection of oil palm trees using you only look once and convolutional neural network. 10th Int. Conf. on Electrical Engineering, Computer Science and Informatics (EECSI), Palembang. pp. 226-230.
Bachhal, P., Kukreja, V., Ahuja, S. 2023. Real-time disease detection system for maize plants using deep convolutional neural networks. Int. J. Comput. Dig. Syst. 14:10263-10275.
Bandi, R., Swamy, S., Arvind, C. 2023. Leaf disease severity classification with explainable artificial intelligence using transformer networks. Int. J. Adv. Technol. Eng. Explor. 10:278.
Bazame, H.C., Molin, J.P., Althoff, D., Martello, M. 2021. Detection, classification, and mapping of coffee fruits during harvest with computer vision. Comput. Electron. Agr. 183:106066.
Bazame, H.C., Molin, J.P., Althoff, D., Martello, M. 2022. Detection of coffee fruits on tree branches using computer vision. Sci. Agric. 80:e20220064.
Benayad, M., Houran, N., Aamir, Z., Maanan, M., Rhinane, H. 2023. Geomembrane basins detection based on satellite high-resolution imagery using deep learning algorithms. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 48:75-79.
Bochkovskiy, A., Wang, C.-Y., Liao, H.-Y.M. 2020. Yolov4: Optimal speed and accuracy of object detection. arXiv:2004.10934.
Breuers, S., Yang, S., Mathias, M., Leibe, B. 2016. Exploring bounding box context for multi-object tracker fusion. IEEE Winter Conf. Applications of Computer Vision (WACV), Lake Placid. pp. 1-8.
Buzzy, M., Thesma, V., Davoodi, M., Mohammadpour Velni, J. 2020. Real-time plant leaf counting using deep object detection networks. Sensors (Basel) 20:6896.
Camacho, J.C., Morocho-Cayamcela, M.E. 2023. Mask R-CNN and YOLOv8 comparison to perform tomato maturity recognition task. In: Maldonado-Mahauad, J., Herrera-Tapia, J., Zambrano-Martínez, J.L., Berrezueta, S. (eds.), Information and Communication Technologies. TICEC 2023. Communications in Computer and Information Science vol 1885. Cham, Springer.
Chen, C., Lu, J., Zhou, M., Yi, J., Liao, M., Gao, Z. 2022. A YOLOv3-based computer vision system for identification of tea buds and the picking point. Comput. Electron. Agr. 198:107116.
Chen, J., Ma, A., Huang, L., Su, Y., Li, W., Zhang, H., Wang, Z. 2023. GA-YOLO: a lightweight YOLO model for dense and occluded grape target detection. Horticulturae 9:443.
Chen, L.-P. 2021. Practical statistics for data scientists: 50+ essential concepts using R and Python. Technometrics 63:272-273.
Chen, M.-C., Cheng, Y.-T., Liu, C.-Y. 2022. Implementation of a fruit quality classification application using an artificial intelligence algorithm. Sensors Mater. 34:151-162.
Chen, Z., Su, R., Wang, Y., Chen, G., Wang, Z., Yin, P., Wang, J. 2022. Automatic estimation of apple orchard blooming levels using the improved YOLOv5. Agronomy (Basel) 12:2483.
Cheng, L., Li, J., Duan, P., Wang, M. 2021. A small attentional YOLO model for landslide detection from satellite remote sensing images. Landslides 18:2751-2765.
Cowton, J., Kyriazakis, I., Bacardit, J. 2019. Automated individual pig localisation, tracking and behaviour metric extraction using deep learning. IEEE Access 7:108049-108060.
da Silva, J.C., Silva, M.C., Luz, E.J., Delabrida, S., Oliveira, R.A. 2023. Using mobile edge AI to detect and map diseases in citrus orchards. Sensors (Basel) 23:2165.
Dai, G., Fan, J. 2022. An industrial-grade solution for crop disease image detection tasks. Front. Plant Sci. 13:921057.
Ding, X., Zhang, X., Ma, N., Han, J., Ding, G., Sun, J. 2021. Repvgg: Making vgg-style convnets great again. IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Nashville. pp. 13728-13737.
Diwan, T., Anirudh, G., Tembhurne, J.V. 2023. Object detection using YOLO: Challenges, architectural successors, datasets and applications. Multimed. Tools Appl. 82:9243-9275.
Dollár, P., Appel, R., Belongie, S., Perona, P. 2014. Fast feature pyramids for object detection. IEEE T. Pattern Anal. 36:1532-1545.
Espinoza-Hernández, J., de Jesús López-Canteñs, G., López-Cruz, I.L., Romantchik-Kriuchkova, E. 2023. Agave plant density using convolutional neural networks on aerial imagery. Agrociencia 57. Online Ahead of Print.
Etienne, A., Ahmad, A., Aggarwal, V., Saraswat, D. 2021. Deep learning-based object detection system for identifying weeds using UAS imagery. Remote Sens. (Basel) 13:5182.
Feng, C., Zhong, Y., Gao, Y., Scott, M.R., Huang, W. 2021. Tood: Task-aligned one-stage object detection. IEEE/CVF Int. Conf. on Computer Vision (ICCV), Montreal. pp. 3490-3499.
Fukada, K., Hara, K., Cai, J., Teruya, D., Shimizu, I., Kuriyama, T., et al. 2023. An automatic tomato growth analysis system using YOLO transfer learning. Appl. Sci. 13:6880.
Gai, R., Chen, N., Yuan, H. 2023. A detection algorithm for cherry fruits based on the improved YOLO-v4 model. Neural Comput. Appl. 35:13895-13906.
Gallo, I., Rehman, A.U., Dehkordi, R.H., Landro, N., La Grassa, R., Boschetti, M. 2023. Deep object detection of crop weeds: Performance of YOLOv7 on a real case dataset from UAV images. Remote Sens. (Basel) 15:539.
Ge, Z., Liu, S., Wang, F., Li, Z., Sun, J. 2021. Yolox: Exceeding yolo series in 2021. arXiv:2107.08430.
Girshick, R. 2015. Fast R-CNN. IEEE Int. Conf. on Computer Vision, Santiago. pp. 1440-1448.
Gremes, M.F., Fermo, I.R., Krummenauer, R., Flores, F.C., Gonçalves Andrade, C.M., da Motta Lima, O.C. 2023. System of counting green oranges directly from trees using artificial intelligence. AgriEngineering (Basel) 5:1813-1831.
Hamidisepehr, A., Mirnezami, S.V., Ward, J.K. 2020. Comparison of object detection methods for corn damage assessment using deep learning. T. ASABE 63:1969-1980.
Haque, M.E., Rahman, A., Junaeid, I., Hoque, S.U., Paul, M. 2022. Rice leaf disease classification and detection using yolov5. arXiv:2209.01579.
Hobbs, J., Khachatryan, V., Anandan, B.S., Hovhannisyan, H., Wilson, D. 2021. Broad dataset and methods for counting and localization of on-ear corn kernels. Front. Robot. AI 8:627009.
Hosang, J., Benenson, R., Schiele, B. 2017. Learning non-maximum suppression. IEEE Conf. Computer Vision and Pattern Recognition, Honolulu. pp. 6469-6477.
Hsu, C.-C., Hsu, K.-J., Tsai, C.-C., Lin, Y.-Y., Chuang, Y.-Y. 2019. Weakly supervised instance segmentation using the bounding box tightness prior. 33rd Conf. Neural Information Processing Systems (NeurIPS 2019), Vancouver.
Jiang, P., Ergu, D., Liu, F., Cai, Y., Ma, B. 2022. A review of Yolo algorithm developments. Procedia Comput. Sci. 199:1066-1073.
Jintasuttisak, T., Edirisinghe, E., Elbattay, A. 2022. Deep neural network based date palm tree detection in drone imagery. Comput. Electron. Agr. 192:106560.
Kulkarni, A., Chong, D., Batarseh, F.A. 2020. Foundations of data imbalance and solutions for a data democracy. In: Batarseh, F.A., Yang, R. (eds.), Data democracy. Cambridge, Academic Press. pp. 83-106.
Kumar, P., Kumar, N. 2023. Drone-based apple detection: Finding the depth of apples using YOLOv7 architecture with multi-head attention mechanism. Smart Agr. Technol. 5:100311.
Lawal, M.O. 2021. Tomato detection based on modified YOLOv3 framework. Sci. Rep. 11:1477.
Li, C., Li, L., Jiang, H., Weng, K., Geng, Y., Li, L., et al. 2022. YOLOv6: A single-stage object detection framework for industrial applications. arXiv:2209.02976.
Li, M., Zhang, Z., Lei, L., Wang, X., Guo, X. 2020. Agricultural greenhouses detection in high-resolution satellite images based on convolutional neural networks: Comparison of faster R-CNN, YOLO v3 and SSD. Sensors (Basel) 20:4938.
Lippi, M., Bonucci, N., Carpio, R.F., Contarini, M., Speranza, S., Gasparri, A. 2021. A yolo-based pest detection system for precision agriculture. 29th Mediterranean Conf. Control and Automation (MED). pp. 342-347.
Liu, S., Qi, L., Qin, H., Shi, J., Jia, J. 2018. Path aggregation network for instance segmentation. IEEE Conf. Computer Vision and Pattern Recognition, Salt Lake City. pp. 8759-8768.
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., Berg, A.C. 2016. SSD: Single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.), Computer Vision – ECCV 2016. Lecture Notes in Computer Science vol 9905. Cham, Springer.
Liu, Y., Shao, Z., Hoffmann, N. 2021. Global attention mechanism: Retain information to enhance channel-spatial interactions. arXiv:2112.05561.
Lu, D., Ye, J., Wang, Y., Yu, Z. 2023. Plant detection and counting: Enhancing precision agriculture in UAV and general scenes. IEEE Access.
Madhurya, C., Jubilson, E.A. 2023. YR2S: efficient deep learning technique for detecting and classifying plant leaf diseases. IEEE Access 11:116196-116205.
Mamalis, M., Kalampokis, E., Kalfas, I., Tarabanis, K. 2023. Deep learning for detecting verticillium fungus in olive trees: Using YOLO in UAV imagery. Algorithms (Basel) 16:343.
Mathew, M.P., Mahesh, T.Y. 2022. Leaf-based disease detection in bell pepper plant using YOLO v5. Signal Image Video P. 16:841-847.
Narayana, C.L., Ramana, K.V. 2023. An efficient real-time weed detection technique using YOLOv7. Int. J. Adv. Comput. Sci. Appl. 14:550-556.
Nath, C.P., Singh, R.G., Choudhary, V.K., Datta, D., Nandan, R., Singh, S.S. 2024. Challenges and alternatives of herbicide-based weed management. Agronomy (Basel) 14:126.
Nugroho, D.P., Widiyanto, S., Wardani, D.T. 2022. Comparison of deep learning-based object classification methods for detecting tomato ripeness. Int. J. Fuzzy Logic Intell. Syst. 22:223-232.
Nurhabib, I., Seminar, K. 2022. Recognition and counting of oil palm tree with deep learning using satellite image. IOP Conf. Ser. Earth Environ. Sci. 974:012058.
Ohnemüller, L., Briassouli, A. 2021. Improving accuracy and efficiency in plant detection on a novel, benchmarking real-world dataset. IEEE Int. Workshop on Metrology for Agriculture and Forestry (MetroAgriFor), Trento-Bolzano. pp. 172-176.
Özer, T., Akdoğan, C., Cengız, E., Kelek, M.M., Yildirim, K., Oğuz, Y., Akkoç, H. 2022. Cherry tree detection with deep learning. IEEE Conf. on Innovations in Intelligent Systems and Applications (ASYU), Antalya. pp. 1-4.
Papageorgiou, C.P., Oren, M., Poggio, T. 1998. A general framework for object detection. IEEE 6th Int. Conf. on Computer Vision, Bombay. pp. 555-562.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. 2011. Scikit-learn: Machine learning in Python. J. Mach. Learning Res. 12:2825-2830.
Pu, H., Chen, X., Yang, Y., Tang, R., Luo, J., Wang, Y., Mu, J. 2023. Tassel-YOLO: A new high-precision and real-time method for maize tassel detection and counting based on UAV aerial images. Drones 7:492.
Qin, Z., Wang, W., Dammer, K.-H., Guo, L., Cao, Z. 2021. Ag-YOLO: A real-time low-cost detector for precise spraying with case study of palms. Front. Plant Sci. 12:753603.
Qing, Y., Liu, W., Feng, L., Gao, W. 2021. Improved Yolo network for free-angle remote sensing target detection. Remote Sens. (Basel) 13:2171.
Quach, L.-D., Quoc, K.N., Quynh, A.N., Ngoc, H.T., Nghe, N.T. 2024. Tomato health monitoring system: tomato classification, detection, and counting system based on YOLOv8 model with explainable MobileNet models using Grad-CAM++. IEEE Access 12:9719-9737.
Rajamohanan, R., Latha, B.C. 2023. An optimized YOLO v5 model for tomato leaf disease classification with field dataset. Eng. Technol. Appl. Sci. Res. 13:12033-12038.
Redmon, J., Divvala, S., Girshick, R., Farhadi, A. 2016. You only look once: Unified, real-time object detection. IEEE Conf. on Computer Vision and Pattern Recognition, Las Vegas. pp. 779-788.
Redmon, J., Farhadi, A. 2017. YOLO9000: better, faster, stronger. IEEE Conf. on Computer Vision and Pattern Recognition, Honolulu. pp. 6517-6525.
Redmon, J., Farhadi, A. 2018. Yolov3: An incremental improvement. arXiv:1804.02767.
Roy, A.M., Bhaduri, J., Kumar, T., Raj, K. 2023. WilDect-YOLO: An efficient and robust computer vision-based accurate object localization model for automated endangered wildlife detection. Ecol. Inform. 75:101919.
Sneha, N., Sundaram, M., Ranjan, R. 2024. Acre-scale grape bunch detection and predict grape harvest using YOLO deep learning network. SN Comput. Sci. 5:250.
Sohan, M., Sai Ram, T., Reddy, R., Venkata, C. 2024. A review on YOLOv8 and its advancements. Int. Conf. on Data Intelligence and Cognitive Informatics. pp. 529-545.
Song, C., Wang, C., Yang, Y. 2020. Automatic detection and image recognition of precision agriculture for citrus diseases. IEEE Eurasia Conf. on IOT, Communication and Engineering, Yunlin, Taiwan. pp. 187-190.
Song, Z., Chen, Q., Huang, Z., Hua, Y., Yan, S. 2011. Contextualizing object detection and classification. IEEE T. Pattern Anal. 37:13-27.
Sportelli, M., Apolo-Apolo, O.E., Fontanelli, M., Frasconi, C., Raffaelli, M., Peruzzi, A., Perez-Ruiz, M. 2023. Evaluation of YOLO object detectors for weed detection in different turfgrass scenarios. Appl. Sci. 13:8502.
Štancel, M., Hulič, M. 2019. An introduction to image classification and object detection using YOLO detector. Proc. CEUR Workshop.
Straker, A., Puliti, S., Breidenbach, J., Kleinn, C., Pearse, G., Astrup, R., Magdon, P. 2023. Instance segmentation of individual tree crowns with YOLOv5: A comparison of approaches using the ForInstance benchmark LiDAR dataset. ISPRS Open J. Photogramm. Remote Sens. 9:100045.
Subramanyam, V.S. 2021. Non Max Suppression (NMS). Available from: https://medium.com/analytics-vidhya/non-max-suppression-nms-6623e6572536
Sulemane, S., Matos-Carvalho, J.P., Pedro, D., Moutinho, F., Correia, S.D. 2022. Vineyard gap detection by convolutional neural networks fed by multi-spectral images. Algorithms 15:440.
Teng, L., Li, H., Karim, S. 2019. DMCNN: a deep multiscale convolutional neural network model for medical image segmentation. J. Healthc. Eng. 2019:8597606.
Terven, J., Cordova-Esparza, D. 2023. A comprehensive review of YOLO: From YOLOv1 to YOLOv8 and beyond. arXiv:2304.00501.
Thomson, S.J., Sullivan, D.G. 2006. Crop status monitoring using multispectral and thermal imaging systems for accessible aerial platforms. 2006 ASAE Annual Meeting 061179.
Tian, Y., Yang, G., Wang, Z., Wang, H., Li, E., Liang, Z. 2019. Apple detection during different growth stages in orchards using the improved YOLO-V3 model. Comput. Electron. Agr. 157:417-426.
Tishby, N., Zaslavsky, N. 2015. Deep learning and the information bottleneck principle. IEEE Information Theory Workshop, Jerusalem. pp. 1-5.
Tundia, C., Tank, P., Damani, O.P. 2020. Aiding irrigation census in developing countries by detecting minor irrigation structures from satellite imagery. Proc. 6th Int. Conf. on Geographical Information Systems Theory, Applications and Management. pp. 208-215.
Tzutalin, D. 2015. tzutalin/labelImg. Available from: https://github.com/tzutalin/labelImg
Wang, C.-Y., Bochkovskiy, A., Liao, H.-Y.M. 2023. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Vancouver. pp. 7464-7475.
Wang, C.-Y., Liao, H.-Y.M., Wu, Y.-H., Chen, P.-Y., Hsieh, J.-W., Yeh, I.-H. 2020. CSPNet: A new backbone that can enhance learning capability of CNN. IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Seattle. pp. 1571-1580.
Wang, C.-Y., Yeh, I.-H., Liao, H.-Y.M. 2024. YOLOv9: learning what you want to learn using programmable gradient information. arXiv:2402.13616.
Wang, C., Wang, C., Wang, L., Wang, J., Liao, J., Li, Y., Lan, Y. 2023. A lightweight cherry tomato maturity real-time detection algorithm based on improved YOLOV5n. Agronomy (Basel) 13:2106.
Wang, H., Fan, Y., Wang, Z., Jiao, L., Schiele, B. 2018. Parameter-free spatial attention network for person re-identification. arXiv:1811.12150.
Wang, Y., Fu, Q., Ma, Z., Tian, X., Ji, Z., Yuan, W., et al. 2023. YOLOv5-AC: a method of uncrewed rice transplanter working quality detection. Agronomy (Basel) 13:2279.
Wang, Z., Hua, Z., Wen, Y., Zhang, S., Xu, X., Song, H. 2024. E-YOLO: Recognition of estrus cow based on improved YOLOv8n model. Expert Syst. Appl. 238:122212.
Wiggers, K.L., Pohlod, C.D., Orlovski, R., Ferreira, R., Santos, T.A. 2022. Detection and counting of plants via deep learning using images collected by RPA. Rev. Bras. Cien. Agr. 17:1.
Wu, D., Lv, S., Jiang, M., Song, H. 2020. Using channel pruning-based YOLO v4 deep learning algorithm for the real-time and accurate detection of apple flowers in natural environments. Comput. Electron. Agr. 178:105742.
Wu, Y., Yang, H., Mao, Y. 2024. Detection of the pine wilt disease using a joint deep object detection model based on drone remote sensing data. Forests (Basel) 15:869.
Xiao, Y., Tian, Z., Yu, J., Zhang, Y., Liu, S., Du, S., Lan, X. 2020. A review of object detection based on deep learning. Multim. Tools Appl. 79:23729-23791.
Xu, S., Wang, R., Shi, W., Wang, X. 2023. Classification of tree species in transmission line corridors based on YOLO v7. Forests (Basel) 15:61.
Yao, J., Song, B., Chen, X., Zhang, M., Dong, X., Liu, H., et al. 2024. Pine-YOLO: a method for detecting pine wilt disease in unmanned aerial vehicle remote sensing images. Forests (Basel) 15:737.
Yeh, J.-F., Lin, K.-M., Yuan, L.-C., Hsu, J.-M. 2024. Automatic counting and location labeling of rice seedlings from unmanned aerial vehicle images. Electronics (Basel) 13:273.
Yin, S., Zhang, Y., Karim, S. 2018. Large scale remote sensing image segmentation based on fuzzy region competition and Gaussian mixture model. IEEE Access 6:26069-26080.
Yin, S., Zhang, Y., Karim, S. 2019. Region search based on hybrid convolutional neural network in optical remote sensing images. Int. J. Distrib. Sensor N. 15:1550147719852036.
Yu, J., Zhang, C., Wang, J., Zhang, M., Zhang, X., Li, X. 2023. Research on asparagus recognition based on deep learning. IEEE Access 11:117362-117367.
Zakria, Z., Deng, J., Kumar, R., Khokhar, M.S., Cai, J., Kumar, J. 2022. Multiscale and direction target detecting in remote sensing images via modified YOLO-v4. IEEE J. Sel. Top. Appl. 15:1039-1048.
Zhang, H., Cloutier, R.S. 2021. Review on one-stage object detection based on deep learning. EAI Endor. T. e-Learning 7:e5.
Zhang, L., Lin, L., Liang, X., He, K. 2016. Is faster R-CNN doing well for pedestrian detection? ECCV 2016. Lecture Notes in Computer Science vol 9906. Cham, Springer. pp. 443-457.
Zhang, P., Li, D. 2022. YOLO-VOLO-LS: a novel method for variety identification of early lettuce seedlings. Front. Plant Sci. 13:806878.
Zhong, Z., Jin, L., Xie, Z. 2015. High performance offline handwritten chinese character recognition using googlenet and directional feature maps. 3rd Int. Conf. on Document Analysis and Recognition, Tunis. pp. 846-850.
Zhou, P., Ni, B., Geng, C., Hu, J., Xu, Y. 2018. Scale-transferrable object detection. IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Salt Lake City. pp. 528-537.
Zhu, S., Ma, W., Wang, J., Yang, M., Wang, Y., Wang, C. 2023. EADD-YOLO: An efficient and accurate disease detector for apple leaf using improved lightweight YOLOv5. Front. Plant Sci. 14:1120724.
Zhu, X., Wang, R., Shi, W., Liu, X., Ren, Y., Xu, S., Wang, X. 2024. Detection of pine-wilt-disease-affected trees based on improved YOLO v7. Forests (Basel) 15:691.
