Hybrid Object Detection and Distance Measurement For Precision Agriculture: Integrating YOLOv8 With Rice Field Sidewalk Detection Algorithm
Corresponding Author:
Nipat Jongsawat
Data and Information Science, Faculty of Science and Technology
Rajamangala University of Technology Thanyaburi
Pathum Thani, Thailand
Email: [email protected]
1. INTRODUCTION
Obstacle detection is an important element in autonomous robot navigation, with the complex
trade-off between safety and mobility at the heart of the problem. The former ensures that the robot does not
harm itself or the environment, including humans and animals. The latter is equally crucial, as it enables the
robot to fully plan and execute paths, thus determining its ability to successfully complete tasks [1].
Additionally, obstacle detection needs to be updated rapidly, allowing the robot to react promptly to
safety-critical information. This need is especially pronounced when transitioning from controlled indoor
environments to more challenging outdoor settings, where safe navigation becomes paramount [2]‒[4].
In this study, we focus on outdoor environments where robots must navigate various terrains, including
roads, paths, grasslands, meadows, and forest trails. This setup presents particular challenges, as the
definition of obstacles varies depending on terrain type. Such considerations are pertinent to a range of robot
applications, including forestry and agriculture [5].
Several researchers have conducted studies on object detection across various environments.
These include obstacle detection [6], crack detection in tiled sidewalks [7], identification of accessibility
problems on sidewalks [8], detection of furrows in corn fields [9], and identification of pathways in rice
fields for agricultural vehicles [10]. These studies underscore the importance of real-time detection and
adaptability to diverse environmental conditions. Terrain-adaptive obstacle detection integrates 3D-light
detection and ranging (LiDAR) data with geometric and semantic terrain features to ensure reliable
navigation of autonomous systems across different terrain types [11]. Real-time crack detection in tiled
sidewalks utilizes unmanned aerial vehicle (UAV) imagery and YOLO-based methods, demonstrating
excellent accuracy and adaptability to environmental factors such as shadows and rain. The PreSight system
accelerates object detection by leveraging prior data collection, significantly reducing latency for real-time
identification of accessibility issues on sidewalks. Neural network-based algorithms for corn field furrow
detection offer high accuracy and versatility, overcoming challenges posed by color and texture similarities.
While LiDAR sensors present a straightforward solution, they introduce obstacles of their own, most notably cost rather
than the sensing capability itself [12]‒[16]. Thus, this research focuses on developing a rice field sidewalk
detection (RIFIS-D) algorithm. The novelty of this research lies in employing a low-cost tool, such as a
camera, for detection, as opposed to sensors with environmental mapping capabilities like LiDAR, as shown
in Figure 1. Given the usage of a camera as a sensor, an accurate detection algorithm becomes imperative.
The proposed algorithm incorporates a hybrid technique combining YOLOv8 and RIFIS-D.
Figure 1. Comparison of the novelty of this research with previous studies, underscoring its unique
contributions to the field
The main algorithm used in RIFIS-D sidewalk detection is based on YOLOv8. YOLOv8, the eighth
version of the you only look once (YOLO) family, is a deep learning algorithm employed for object detection. It is notable for its speed and accuracy,
making it well-suited for real-time applications [17]‒[20]. Compared to other object detection algorithms,
YOLOv8 offers several advantages, including high accuracy at rapid inference speeds, simultaneous
detection of multiple objects, and seamless integration into various systems. These features render it a
preferred choice over other algorithms for object detection tasks. However, as this algorithm cannot function
in isolation, an additional algorithm in the form of RIFIS-D is necessary to measure the distance between the
sidewalk and the tractor. The RIFIS-D algorithm involves reading an image using OpenCV, dividing it into
chunks with a defined function, and preprocessing it with bilateral filtering and edge detection.
Edge array generation identifies edges and divides them for processing. Lines are drawn based on calculated
coordinates, connecting points in the edge array and extending from the bottom center of the image. The
processed image is then displayed. This process enables efficient rice field sidewalk detection in agricultural images, supporting
tasks such as plowing fields with hand tractors. An overview of the concept of this research is provided in
Figure 2.
2. METHOD
2.1. The process of generating the YOLOv8 model
The generation of the YOLOv8 model begins with comprehensive data preparation. This involves
collecting relevant images and videos from various sources and meticulously annotating them with accurate
bounding boxes and class labels to facilitate supervised learning [21]. To increase dataset variability and
robustness, data augmentation techniques such as rotation, flipping, and scaling are applied [22]. The
annotated data is then converted into a suitable format, typically involving JavaScript object notation (JSON)
or extensible markup language (XML) files, and divided into training, validation, and test subsets to ensure
the model's performance is accurately assessed. Preprocessing steps like normalizing and resizing images,
along with efficient data loading pipelines, prepare the data for the training phase. Quality control measures
are implemented to ensure the dataset is free from errors and inconsistencies [23]. During model training, the
process starts with initializing the YOLOv8 model architecture and defining its layers and parameters. An
appropriate loss function, such as mean squared error for regression tasks or cross-entropy loss for
classification, is selected to measure the model's performance. The choice of optimizer, such as Adam or
stochastic gradient descent (SGD), is crucial for adjusting the model weights during training [24]. The
training loop involves iterating over the training data, performing forward and backward passes, and updating
the model parameters. Regular checkpointing saves the model's state, allowing training to resume from
specific points if necessary. Model testing involves running the trained YOLOv8 model on the test dataset to
generate predictions. Post-processing steps, such as non-maximum suppression, refine these predictions.
Evaluation includes plotting precision-recall curves to assess the trade-off between
precision and recall [25]. Once the YOLOv8 model is trained and tested, it is deployed for RIFIS-D.
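To make this pipeline concrete, the following minimal sketch, assuming the Ultralytics Python API (consistent with the 'best.pt' weights and YOLOv8n-seg model referenced below), walks through training, validation, and test-time inference; the dataset configuration file 'rifis.yaml' and all hyperparameter values are illustrative assumptions rather than the exact settings used in this study.

```python
# Minimal train-validate-test sketch with the Ultralytics API.
# 'rifis.yaml' and the hyperparameters below are illustrative assumptions.
from ultralytics import YOLO

# Initialize the YOLOv8 nano segmentation architecture
model = YOLO("yolov8n-seg.pt")

# Train on the annotated sidewalk dataset (paths defined in the YAML file);
# the trainer handles augmentation, checkpointing, and optimizer selection,
# saving the best checkpoint as 'best.pt'
model.train(data="rifis.yaml", epochs=100, imgsz=640, optimizer="SGD")

# Evaluate on the validation split; this reports precision, recall, and mAP
# and saves precision-recall curves alongside the run artifacts
metrics = model.val()

# Run the trained model on a test image; non-maximum suppression is applied
# internally with the given confidence/IoU thresholds
results = model("test_image.jpg", conf=0.25, iou=0.7)
```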
Figure 4. Ground truth for dataset annotations (a) six points, (b) seven points, (c) five points, (d) four points,
(e) three points, (f) eight points, (g) legend
Utilizing the YOLOv8 model generated and saved in 'best.pt' format, a mask extraction process is
initiated from the detected sidewalks. Initially, the pre-trained YOLOv8 model is loaded, followed by the
execution of inference on the specified image files, yielding a list of segmentation results. Subsequently,
the process iterates through each segmentation result, extracting the mask tensor and converting it into a
NumPy array. For each mask, a corresponding Python imaging library (PIL) image object is instantiated,
converting the pixel values to 'uint8' format and scaling them within the range [0, 255]. Eventually, the
resulting mask image is saved as a '.jpg' file. This methodology ensures the accurate preservation of
segmentation masks extracted from the YOLOv8n-seg model as '.jpg' images, thereby facilitating the
subsequent distance measurement process using the RIFIS-D algorithm approach.
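A minimal sketch of this mask-extraction step is given below, assuming the Ultralytics and Pillow packages; the image file names are placeholders rather than the actual dataset paths.

```python
# Sketch of the described mask-extraction step; file names are placeholders.
import numpy as np
from PIL import Image
from ultralytics import YOLO

# Load the trained segmentation weights
model = YOLO("best.pt")

# Run inference; each element of `results` corresponds to one input image
results = model("sidewalk.jpg")

for i, result in enumerate(results):
    if result.masks is None:  # no sidewalk detected in this image
        continue
    for j, mask in enumerate(result.masks.data):
        # Mask tensor -> NumPy array in {0, 1}, scaled to [0, 255] as uint8
        mask_np = (mask.cpu().numpy() * 255).astype(np.uint8)
        # Wrap as a PIL image and save as .jpg for the RIFIS-D distance step
        Image.fromarray(mask_np).save(f"mask_{i}_{j}.jpg")
```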
RIFIS-D is an algorithm proposed in this research based on input from previous sidewalk detections.
To provide further clarification of this algorithm, several mathematical formulas are presented. The first
formula describes the image processing procedure used to detect edges and calculate the sidewalk distance. Firstly, the
original image, denoted as $I$, undergoes application of a bilateral filter to smooth the image, resulting in $I_{blur}$
as shown in (1). Subsequently, edge detection using the Canny method is performed on the blurred image,
generating an edge map $I_{edge}$ as presented in (2). Following this, in the vertical edge detection step, a vertical
scan is conducted on the edge map with a step size $S$ to identify the lowest edge coordinate in each column,
which is stored in $E_{array}$ according to (3). This array is then partitioned into three parts, denoted as $slc$ in (4).
The average coordinates $avg_x$ and $avg_y$ are calculated for each part, as outlined in (5), and the line distance
is measured from the bottom center point of the image to the average point using the Euclidean formula,
yielding $l_{line}$ in (6). Finally, lines are drawn between the detected pairs of edge points, as well as lines from
the bottom of the image to each edge point, and the results are displayed in the final image. Details of each
symbol can be found in Table 1.
$I_{blur} = \mathrm{BilateralFilter}(I, S_c, S_s)$ (1)

$I_{edge} = \mathrm{Canny}(I_{blur})$ (2)

$E_{array} = \{(x, \max\{y : I_{edge}(y, x) > 0\}) \mid x = 0, S, 2S, \ldots, W\}$ (3)

$slc = \{E_{array}[k : k + n] \mid k \in [0, \mathrm{length}(E_{array})], \, n = \frac{\mathrm{length}(E_{array})}{3}\}$ (4)

$avg_x = \frac{\sum x_{vals}}{\mathrm{length}(x_{vals})}, \qquad avg_y = \frac{\sum y_{vals}}{\mathrm{length}(y_{vals})}$ (5)

$l_{line} = \sqrt{\left(avg_x - \frac{W}{2}\right)^2 + (avg_y - H)^2}$ (6)
Table 1. Nomenclature
Symbol : Description
$I$ : Original image with dimensions H×W (height and width)
$I_{blur}$ : Image after application of the bilateral filter
$I_{edge}$ : Edge map resulting from edge detection (Canny method)
$S$ : Step size, here S = 5
$S_c$ : Sigma color (bilateral filter parameter)
$S_s$ : Sigma space (bilateral filter parameter)
$H$ : Image height reduced by 1 (height(I) − 1)
$W$ : Image width reduced by 1 (width(I) − 1)
$E_{array}$ : Array storing the coordinates of the detected edge points
$slc$ : Slice of $E_{array}$ with length length($E_{array}$)/3
$x_{vals}$ : List of x coordinates of each slice
$y_{vals}$ : List of y coordinates of each slice
$avg_x$ : Average x coordinate in one slice
$avg_y$ : Average y coordinate in one slice
$l_{line}$ : Length of the line measured from the bottom center point of the image to the average point in the slice
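A minimal Python sketch of steps (1)-(6), assuming OpenCV, is given below; the bilateral-filter parameters and Canny thresholds are illustrative choices, while the step size S = 5 follows Table 1, and the mask file name is a placeholder for the output of the extraction step above.

```python
# Sketch implementing (1)-(6) on a saved sidewalk mask; parameter values
# other than S = 5 are illustrative assumptions.
import cv2
import numpy as np

# Load the mask produced by the YOLOv8 step as a grayscale image I
img = cv2.imread("mask_0_0.jpg", cv2.IMREAD_GRAYSCALE)
H, W = img.shape[0] - 1, img.shape[1] - 1  # height/width reduced by 1 (Table 1)

# (1) bilateral filter -> I_blur; (2) Canny edge detection -> I_edge
I_blur = cv2.bilateralFilter(img, 9, 75, 75)
I_edge = cv2.Canny(I_blur, 50, 150)

# (3) vertical scan with step size S: lowest edge pixel (largest y) per column
S = 5
E_array = [(x, int(np.where(I_edge[:, x] > 0)[0].max()))
           for x in range(0, W + 1, S)
           if np.any(I_edge[:, x] > 0)]

# (4) partition E_array into three slices of equal length
n = max(1, len(E_array) // 3)
slices = [E_array[k:k + n] for k in range(0, len(E_array), n)][:3]

# (5) average coordinates per slice and (6) Euclidean distance from the
# bottom center point (W/2, H) to each average point
for slc in slices:
    x_vals = [x for x, _ in slc]
    y_vals = [y for _, y in slc]
    avg_x = sum(x_vals) / len(x_vals)
    avg_y = sum(y_vals) / len(y_vals)
    l_line = np.hypot(avg_x - W / 2, avg_y - H)
    print(f"average point ({avg_x:.1f}, {avg_y:.1f}) -> l_line = {l_line:.2f} px")
```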
Figure 7 demonstrates the creation and testing process of the proposed algorithm through three
testing stages. Figure 7(a) depicts the fundamental concept using an ideal image, which undergoes a sequence
of operations including smoothing (blurring), edge detection, and distance calculation. Figure 7(b) assesses
the algorithm's performance using indoor laboratory images, following the same stages of processing:
original image, smoothing, edge detection, and distance calculation. Figure 7(c) evaluates the algorithm using
real outdoor images, incorporating object detection using YOLOv8, followed by edge detection and distance
calculation. These three parts collectively showcase the algorithm's capability to accurately process images
from diverse conditions, enabling precise edge detection and distance calculation.
Figure 8 illustrates the final results of the RIFIS-D algorithm, displaying the distance measurement
between the center point of the bottom of the image and the detected sidewalk. The edge of the sidewalk,
marked in red, is detected in the image. The algorithm identifies three main points on the sidewalk, with each
annotated with the distance from the bottom center point of the image. The blue line represents the distance
measurement from the center point of the bottom of the image to the three edge points of the sidewalk.
The numbers in the image, such as 353.36, 253.02, and 372.03, denote the Euclidean distance (in pixels)
from the bottom center point of the image to each detected edge point. These results underscore the RIFIS-D
algorithm's ability to visually and accurately identify objects in images and measure their distances.
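This overlay can be reproduced with standard OpenCV drawing calls; in the sketch below, the mask path is a placeholder and the three edge points carry the distance values quoted above, standing in for the actual RIFIS-D output.

```python
# Visualization sketch mirroring Figure 8; the input path and the point
# list are illustrative placeholders for the computed RIFIS-D results.
import cv2

vis = cv2.imread("mask_0_0.jpg")  # mask image from the extraction step
h, w = vis.shape[:2]
bottom_center = (w // 2, h - 1)

# Hypothetical average edge points with their pixel distances, using the
# values quoted in the text above
points = [(120, 300, 353.36), (320, 280, 253.02), (520, 310, 372.03)]

for avg_x, avg_y, l_line in points:
    pt = (int(avg_x), int(avg_y))
    cv2.line(vis, bottom_center, pt, (255, 0, 0), 2)  # blue line (BGR order)
    cv2.putText(vis, f"{l_line:.2f}", (pt[0], pt[1] - 8),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 1)  # distance label

cv2.imwrite("rifis_d_result.jpg", vis)
```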
Figure 7. Creation and testing of the proposed algorithm; (a) basic concept using ideal images, (b) algorithm testing using indoor laboratory images, and (c) algorithm testing using original images and detection results by YOLOv8
Figure 8. The final result of the RIFIS-D algorithm displays the results of measuring the distance between the
center point of the bottom of the image and the detected sidewalk
Figure 9 shows the sidewalk detection process using YOLOv8 as well as the mask extraction and
distance measurement steps. In Figure 9(a), the results of sidewalk detection are shown in two different
images with YOLOv8, where the areas detected as sidewalks are colored red with confidence levels of
0.92 and 0.95 respectively. The next step is mask extraction, as seen in Figure 9(b), where the detected
sidewalk area is represented in binary form (black and white), with white indicating the sidewalk area.
The next process is the distance measurement shown in Figure 9(c), where the vertical green lines and blue
lines indicate the distance measurement points on the mask, with distance values listed at several points. This
process illustrates how YOLOv8 can be used to detect objects accurately and how the detection results can be
further analyzed for specific purposes such as measuring the distance between points in the detected area.
Figure 9. Experimental results; (a) sidewalk detection results using YOLOv8, (b) mask extraction, and
(c) distance measurement algorithm
4. CONCLUSION
This research describes in detail a new approach for semantic segmentation of sidewalks in rice
fields using the YOLOv8 algorithm enriched with the RIFIS-D algorithm. Through a series of experimental
steps including development environment preparation, data extraction, model training, evaluation, and
analysis, this research succeeded in producing significant findings. The evaluation results show that the
proposed model is able to detect and measure sidewalks with high precision, even in a variety of different
environmental conditions. It was found that this hybrid approach has consistent robustness and accuracy, and
has great potential to improve the efficiency and effectiveness of agricultural monitoring. Thus, this research
effectively fills the knowledge gap in the domain of automated navigation and monitoring in agriculture,
making a significant contribution to scientific and technological progress in this field. For the future,
experimental suggestions include further trials to evaluate the reliability and adaptability of this model in a
wider range of environmental scenarios and agricultural applications, while continuing to develop the
integration of this technology into more integrated and automated agricultural systems. Thus, this conclusion
reflects the detailed findings of the report as well as providing insight into future research directions
regarding direct implementation of this algorithm on tractors.
ACKNOWLEDGEMENTS
The authors would like to thank RMUTT for the support and facilities they provided for this
research. The availability of datasets and computing infrastructure has greatly helped in the development and
implementation of the proposed algorithm. The authors also appreciate the collaboration and discussions of
the researchers at RMUTT, who provided valuable insights and technical support throughout the research
process. The contribution from RMUTT has been an important pillar in the success of this research, and the
authors hope that this good collaboration can continue in the future for further research.
REFERENCES
[1] J. P. A. Yaacoub, H. N. Noura, and B. Piranda, “The internet of modular robotic things: Issues, limitations, challenges, &
solutions,” Internet of Things, vol. 23, 2023, doi: 10.1016/j.iot.2023.100886.
[2] M. Sadaf et al., “Connected and automated vehicles: infrastructure, applications, security, critical challenges, and future aspects,”
Technologies, vol. 11, no. 5, 2023, doi: 10.3390/technologies11050117.
[3] D. Maneetham, P. N. Crisnapati, and Y. Thwe, “Autonomous open-source electric wheelchair platform with internet-of-things
and proportional-integral-derivative control,” International Journal of Electrical and Computer Engineering, vol. 13, no. 6,
pp. 6764–6777, 2023, doi: 10.11591/ijece.v13i6.pp6764-6777.
[4] S. Ketsayom, D. Maneetham, and P. N. Crisnapati, “AGV maneuverability simulation and design based on pure pursuit algorithm
with obstacle avoidance,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 34, no. 2, pp. 835–847, 2024,
doi: 10.11591/ijeecs.v34.i2.pp835-847.
[5] L. Wijayathunga, A. Rassau, and D. Chai, “Challenges and solutions for autonomous ground robot scene understanding and
navigation in unstructured outdoor environments: a review,” Applied Sciences, vol. 13, no. 17, 2023, doi: 10.3390/app13179877.
[6] J. W. Hu et al., “A survey on multi-sensor fusion based obstacle detection for intelligent ground vehicles in off-road
environments,” Frontiers of Information Technology and Electronic Engineering, vol. 21, no. 5, pp. 675–692, 2020, doi:
10.1631/FITEE.1900518.
[7] Q. Qiu and D. Lau, “Real-time detection of cracks in tiled sidewalks using YOLO-based method applied to unmanned aerial
vehicle (UAV) images,” Automation in Construction, vol. 147, 2023, doi: 10.1016/j.autcon.2023.104745.
[8] C. Bennett, E. Ackerman, B. Fan, J. Bigham, P. Carrington, and S. Fox, “Accessibility and the Crowded Sidewalk:
Micromobility’s Impact on Public Space,” DIS 2021-Proceedings of the 2021 ACM Designing Interactive Systems Conference:
Nowhere and Everywhere, pp. 365–380, 2021, doi: 10.1145/3461778.3462065.
[9] N. A. Simon and C. H. Min, “Neural network based corn field furrow detection for autonomous navigation in agriculture
vehicles,” in IEMTRONICS 2020- International IOT, Electronics and Mechatronics Conference, Proceedings, 2020, doi:
10.1109/IEMTRONICS51293.2020.9216347.
[10] P. N. Crisnapati and D. Maneetham, “Two-dimensional path planning platform for autonomous walk behind hand tractor,”
Agriculture, vol. 12, no. 12, 2022, doi: 10.3390/agriculture12122051.
[11] S. Jiang, W. Jiang, and L. Wang, “Unmanned aerial vehicle-based photogrammetric 3D mapping: a survey of techniques,
applications, and challenges,” IEEE Geoscience and Remote Sensing Magazine, vol. 10, no. 2, pp. 135–171, 2022, doi:
10.1109/MGRS.2021.3122248.
[12] T. Raj, F. H. Hashim, A. B. Huddin, M. F. Ibrahim, and A. Hussain, “A survey on LiDAR scanning mechanisms,” Electronics,
vol. 9, no. 5, 2020, doi: 10.3390/electronics9050741.
[13] P. Borges et al., “A survey on terrain traversability analysis for autonomous ground vehicles: methods, sensors, and challenges,”
Field Robotics, vol. 2, no. 1, pp. 1567–1627, 2022, doi: 10.55417/fr.2022049.
[14] Z. Li, C. Jiang, X. Gu, Y. Xu, F. Zhou, and J. Cui, “Collaborative positioning for swarms: A brief survey of vision, LiDAR and
wireless sensors based methods,” Defence Technology, vol. 33, pp. 475–493, 2024, doi: 10.1016/j.dt.2023.05.013.
[15] R. Hasan and R. Hasan, “Pedestrian safety using the Internet of Things and sensors: Issues, challenges, and open problems,”
Future Generation Computer Systems, vol. 134, pp. 187–203, 2022, doi: 10.1016/j.future.2022.03.036.
[16] M. B. Alatise and G. P. Hancke, “A review on challenges of autonomous mobile robot and sensor fusion methods,” IEEE Access,
vol. 8, pp. 39830–39846, 2020, doi: 10.1109/ACCESS.2020.2975643.
[17] B. Xiao, M. Nguyen, and W. Q. Yan, “Fruit ripeness identification using YOLOv8 model,” Multimedia Tools and Applications,
vol. 83, no. 9, pp. 28039–28056, 2024, doi: 10.1007/s11042-023-16570-9.
[18] G. Wang, Y. Chen, P. An, H. Hong, J. Hu, and T. Huang, “UAV-YOLOv8: A small-object-detection model based on improved
YOLOv8 for UAV aerial photography scenarios,” Sensors, vol. 23, no. 16, 2023, doi: 10.3390/s23167190.
[19] J. Terven, D. M. C. Esparza, and J. A. R. González, “A comprehensive review of YOLO architectures in computer vision: from
YOLOv1 to YOLOv8 and YOLO-NAS,” Machine Learning and Knowledge Extraction, vol. 5, no. 4, pp. 1680–1716, 2023, doi:
10.3390/make5040083.
[20] M. Safaldin, N. Zaghden, and M. Mejdoub, “An improved YOLOv8 to detect moving objects,” IEEE Access, vol. 12,
pp. 59782–59806, 2024, doi: 10.1109/ACCESS.2024.3393835.
[21] P. Jiang, D. Ergu, F. Liu, Y. Cai, and B. Ma, “A review of YOLO algorithm developments,” Procedia Computer Science, vol. 199,
pp. 1066–1073, 2021, doi: 10.1016/j.procs.2022.01.135.
[22] K. Maharana, S. Mondal, and B. Nemade, “A review: data pre-processing and data augmentation techniques,” Global Transitions
Proceedings, vol. 3, no. 1, pp. 91–99, 2022, doi: 10.1016/j.gltp.2022.04.020.
[23] W. Li, M. I. Solihin, and H. A. Nugroho, “RCA: YOLOv8-based surface defects detection on the inner wall of cylindrical high-
precision parts,” Arabian Journal for Science and Engineering, vol. 49, no. 9, pp. 12771–12789, 2024, doi: 10.1007/s13369-023-
08483-4.
[24] M. Reyad, A. M. Sarhan, and M. Arafa, “A modified Adam algorithm for deep neural network optimization,” Neural Computing
and Applications, vol. 35, no. 23, pp. 17095–17112, 2023, doi: 10.1007/s00521-023-08568-z.
[25] E. Casas, L. Ramos, C. Romero, and F. R. Echeverría, “A comparative study of YOLOv5 and YOLOv8 for corrosion
segmentation tasks in metal surfaces,” Array, vol. 22, 2024, doi: 10.1016/j.array.2024.100351.
[26] P. N. Crisnapati and D. Maneetham, “RIFIS: A novel rice field sidewalk detection dataset for walk-behind hand tractor,” Data,
vol. 7, no. 10, 2022, doi: 10.3390/data7100135.
[27] Q. Lin, G. Ye, J. Wang, and H. Liu, “RoboFlow: a data-centric workflow management system for developing AI-enhanced
robots,” in Proceedings of Machine Learning Research, 2021, pp. 1789–1794.
BIOGRAPHIES OF AUTHORS
Padma Nyoman Crisnapati obtained a Bachelor's degree in 2009 from the Department of
Informatics Engineering at Sepuluh Nopember Institute of Technology. He pursued a master's
degree in learning technology in 2011 and a master's degree in computer science in 2018 from
Ganesha Education University. He is a lecturer at STIKOM Bali Institute of Technology and
Business, teaching sensors and transducers, assembly language, and animation. He previously
served as the Head of the Computer Systems Study Program from 2016 to 2020. He is pursuing
a Ph.D. in the Department of Mechatronics Engineering at Rajamangala University of
Technology Thanyaburi (RMUTT). His research interests encompass 2D and 3D animation,
the internet of things, robotics, automation, and augmented and virtual reality. He can be
contacted at email: [email protected].