
Reducing Computation Requirements For Autonomous Mobile Robots On Low Powered Embedded Systems

The document presents two innovative algorithms, Pure Image Segmentation Approach (PISA) and UNet Based Approach to Semantic Segmentation (UBASS), designed to enhance the efficiency and autonomy of mobile robots while reducing computational requirements. PISA utilizes classical computer vision techniques for tasks like object detection and lane following, while UBASS employs deep learning for semantic segmentation, both demonstrating superior performance compared to traditional methods. The research aims to advance autonomous mobile robotics by providing practical solutions for navigation and perception challenges, particularly in low-powered embedded systems.


2023 International Conference on Modeling, Simulation & Intelligent Computing (MoSICom)

December 7-9, 2023, BITS Pilani Dubai Campus, Dubai, UAE

Reducing Computation Requirements for Autonomous Mobile Robots on Low Powered Embedded Systems

2023 International Conference on Modeling, Simulation & Intelligent Computing (MoSICom) | 979-8-3503-9341-5/23/$31.00 ©2023 IEEE | DOI: 10.1109/MoSICom59118.2023.10458726
Abhinav Pathak
Department of Computer Science, Birla Institute of Technology and Science Pilani, Dubai Campus, Dubai, UAE
[email protected]

Aditya Singhal
Department of Computer Science, Birla Institute of Technology and Science Pilani, Dubai Campus, Dubai, UAE
[email protected]

Adit Tewari
Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science Pilani, Dubai Campus, Dubai, UAE
[email protected]

Aadish Jain
Department of Computer Science, Birla Institute of Technology and Science Pilani, Dubai Campus, Dubai, UAE
[email protected]

V. Kalaichelvi
Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science Pilani, Dubai Campus
[email protected]

Abstract: Fully autonomous mobile robots have the potential to revolutionize various industries, from warehouse management to hospital logistics and last-mile deliveries. However, a significant obstacle to achieving reliable autonomy lies in the high computational and energy requirements. In response to this challenge, our paper introduces two innovative algorithms: the Pure Image Segmentation Approach (PISA) and the UNet Based Approach to Semantic Segmentation (UBASS). PISA leverages classical computer vision techniques, offering a fresh perspective on solving crucial tasks such as object detection, object avoidance, and lane detection. In contrast, UBASS harnesses the power of deep learning algorithms for semantic segmentation, unlocking new capabilities in robot perception. Our experiments showcase the effectiveness of these algorithms, demonstrating their accuracy and computational efficiency. Notably, PISA and UBASS outperform or match traditional techniques, including End-to-End Deep Learning and Canny Edge Detection, in terms of both task performance and resource utilization. This research contributes to the advancement of autonomous mobile robotics by offering practical and efficient solutions for navigation and perception challenges. By combining classic and contemporary approaches, we aim to inspire further research in the field, ultimately paving the way for more accessible and dependable autonomous mobile robots.

Keywords: Autonomous Mobile Vehicle, Computer Vision, UBASS, PISA, Low Powered Embedded Systems, UNet, Convolutional Neural Networks

I. INTRODUCTION

In today's fast-paced technological landscape, fully autonomous mobile robots have emerged as catalysts for profound changes across various industries. They offer potential improvements in warehouse management, hospital logistics, and last-mile deliveries by automating tasks. However, a significant obstacle hampers their widespread adoption: the substantial demands they place on computational power and energy.

Achieving dependable autonomy in mobile robotics has long been a complex balancing act between harnessing advanced technology and conserving resources. This delicate balance has motivated researchers and engineers to seek innovative solutions that can enhance the efficiency and independence of these robots.

One of the key challenges in autonomous driving is the development of algorithms that can accurately detect and track objects in the environment, including other vehicles, pedestrians, and obstacles. This requires a combination of computer vision techniques, such as object detection and tracking, as well as machine learning algorithms that can learn to recognize and classify different objects in the environment.

For lane detection specifically, several algorithms are commonly used in autonomous driving systems. One of the most popular methods for computationally inexpensive lane detection is Canny edge detection [3], which works by identifying the edges in an image and using those edges to define the boundaries of the lane. However, Canny edge detection has several limitations: it struggles with noisy images and low-contrast environments, and it may produce false positives if there are other edges in the image that are not part of the lane. Additionally, Canny edge detection is only able to identify the lane, and a separate algorithm is required to manage object detection, which can add complexity to the system.

To improve the accuracy of Canny edge detection, some researchers have combined it with other techniques like the Hough transform and deep learning. These methods can be more computationally expensive, but they can help to reduce false positives and improve lane detection in challenging environments. However, these methods still require a separate algorithm for object detection, and they may struggle with real-time performance and scalability.

Another approach that has gained popularity in recent years is end-to-end deep learning. This method uses a neural network to directly map the input image to the steering commands needed to control the car. This approach doesn't require a separate algorithm for object detection, which can simplify the system, and it has been shown to be effective in some cases. However, end-to-end deep learning requires a lot of

training data, and it can be very computationally expensive. Additionally, this method is hard to debug, and it is not predictive in nature, meaning that it may struggle to handle unexpected situations that were not seen during training.

In response to these critical challenges, our research paper introduces two novel algorithms: the Pure Image Segmentation Approach (PISA) and the UNet Based Approach to Semantic Segmentation (UBASS). These algorithms are meticulously designed to address specific aspects of a mobile robot's ability to understand its surroundings and navigate effectively. PISA draws inspiration from traditional computer vision techniques and reimagines how robots detect objects, avoid obstacles, and follow lanes. In contrast, UBASS takes advantage of deep learning, leveraging its transformative capabilities to enhance a robot's perception.

Our experiments with PISA, UBASS and End-to-End Deep Learning on a mobile robot equipped with an Nvidia Jetson Nano (4 GB of RAM, quad-core ARM Cortex-A57 MPCore CPU) and a Logitech webcam yielded valuable insights, demonstrating not only the precision of these algorithms but also their efficient use of computational resources. Perhaps most notably, PISA and UBASS often outperform established methods like End-to-End Deep Learning and the conventional Canny Edge Detection, both in task performance and responsible resource usage.

This paper aims to advance the field of autonomous mobile robotics. By providing practical and efficient solutions to the intricate challenges of navigation and perception, we aspire to empower today's robots and inspire future research endeavors. With a blend of timeless principles and cutting-edge techniques, our journey beckons further exploration and discovery, ultimately paving the way for more accessible and dependable autonomous mobile robots.

II. METHODOLOGY

This paper proposes two algorithms, each tailored to suit different use cases depending on the anticipated driving conditions and the available computing resources.

A. Pure Image Segmentation Approach (PISA)

This algorithm is specifically designed for deployment in extremely low-powered embedded systems, which necessitates its inherent simplicity [5]. The algorithm utilizes a two-stage image processing technique: a Gaussian blur is applied to the image, followed by adaptive thresholding to binarize the blurred image. This strategy effectively mitigates noise and enhances image smoothness, thereby producing a binary map output.

Fig. 1. Converting raw input image to binarized map

Despite the effectiveness of the aforementioned technique, the resulting binary map output [3] may still contain some level of noise that could affect the accuracy of the subsequent computations. One way to mitigate this issue is by defining a Region of Interest (ROI) that covers only the relevant parts of the image. However, a simple static ROI may lead to the loss of important road shape data, while a dynamic ROI would require significant computing resources. Therefore, a simpler approach is to split the image into two halves and keep only the lower half, since most of the irrelevant noise tends to occur in the upper half of the image.

Fig. 2. Reducing noise via simple cropping

The subsequent step in the pipeline involves the computation of the steering angle. Given the high frame rate of input images, it is not pragmatic to calculate the exact steering angle for each frame. Instead, it can be viewed as a rate of angle at which the vehicle must steer in order to remain on the road while evading obstacles [2,3]. To achieve this, one can capitalize on the fact that in the binary map, all road pixels are represented in black. Thus, the objective is to compute a steering angle that maximizes the presence of black pixels in the image, which corresponds to staying on the road. A straightforward method of accomplishing this is to divide the image into left and right halves and compute the total number of black pixels in each half. By comparing these pixel counts [3], the direction of steering can be ascertained. Finally, the magnitude of the steering angle (S) can be calculated using the following equation:

$$S = (\rho_L - \rho_R) \cdot \theta_{\max} \tag{1}$$

Here $\rho_L$ represents the black pixel density in the left half of the input image, while $\rho_R$ represents the black pixel density in the right half of the input image. $\theta_{\max}$ represents the maximum angle that the mobile robot is expected to steer at each instant. This value is usually set to 90 degrees; however, depending on the use case, it can be changed to some other value.

This algorithm has the added advantage of obstacle avoidance, as any obstacles in the environment would be represented as white pixels. The steering angle would then be calculated based on the location of these pixels, allowing the robot to avoid obstacles while maintaining its position within the lane. This can be visualized using the accompanying figure.

Fig. 3. Image processing pipeline, showing how the obstacle is ignored in the map.

It should be noted that the proposed algorithm [2,4] is intended for deployment on low-powered embedded systems that may lack hardware acceleration. As such, the algorithm is constructed to be simple in nature, and may exhibit temporary inconsistencies under challenging lighting conditions. Additionally, it may not perform optimally in cases where there is significant noise in the lower half of the input image. It is important to mention that the algorithm's approach to calculating the steering angle via a simple comparison of the right and left halves of the input image renders it incapable of detecting sharp curves exceeding 90 degrees. Despite these limitations, the proposed algorithm succeeds in achieving the fundamental objective of realizing autonomous navigation even in resource-constrained environments.
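The steering rule of equation (1) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the binary map is taken to be a list of rows with 0 for black (road) and 255 for white, the function names `black_density` and `pisa_steering_angle` are hypothetical, and the sign convention (a positive value meaning more road on the left) is an assumption, since the paper does not state one.

```python
def black_density(region):
    """Fraction of pixels in `region` (a list of rows) that are black (0)."""
    total = sum(len(row) for row in region)
    if total == 0:
        return 0.0
    black = sum(1 for row in region for px in row if px == 0)
    return black / total


def pisa_steering_angle(binary_map, theta_max=90.0):
    """Equation (1): S = (rho_L - rho_R) * theta_max.

    Assumption: 0 = black (road), 255 = white (everything else).
    Assumption: a positive result means there is more road on the left.
    """
    height = len(binary_map)
    lower = binary_map[height // 2:]      # simple crop: drop the upper half
    mid = len(lower[0]) // 2
    left = [row[:mid] for row in lower]   # left half of the cropped map
    right = [row[mid:] for row in lower]  # right half of the cropped map
    return (black_density(left) - black_density(right)) * theta_max


# Tiny 4x4 example: the upper two rows are discarded by the crop.
demo_map = [
    [255, 255, 255, 255],
    [255, 255, 255, 255],
    [0,   0,   0,   255],
    [0,   0,   255, 255],
]
print(pisa_steering_angle(demo_map))  # 67.5
```

In the example the left half of the cropped map is entirely road (density 1.0) and the right half is one quarter road (density 0.25), so S = 0.75 * 90 = 67.5 degrees. Because the rule only counts pixels, its cost grows linearly with the number of pixels, which is consistent with the low-power target described above.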

Fig. 4. Overview of the Complete PISA Algorithm

B. UNet Based Approach to Semantic Segmentation (UBASS)

In order to overcome several of the downsides of the PISA algorithm, we can use a more context-based, intelligent semantic segmentation algorithm [2]. Semantic image segmentation is a popular research topic in the field of computer vision, and the UNet architecture is often used as a framework to achieve this task. UNet is an encoder-decoder network that utilizes convolutional layers and transpose convolutional layers for feature encoding and decoding. The encoder part of UNet comprises multiple layers of 3x3 convolution, each followed by a batch normalization layer and a rectified linear unit (ReLU) activation function. The encoder also includes a 2x2 max pooling operation with a stride of 2 that down-samples the image, with the number of feature channels doubling at each step. The decoder part of UNet involves up-sampling feature maps and reducing the number of feature channels using a 2x2 convolution. Cropped feature maps from the contracting path are concatenated, and two 3x3 convolutions with ReLU follow before high-resolution features are provided by skip connections.

To segment a road image, the proposed UNet architecture replaces hand-crafted features with a deep neural network to generate robust road mask proposals. The deep neural network analyses global appearance and contextual information at each pixel, and it helps to suppress false positive detections. The auxiliary encoder creates feature embeddings from the road proposal mask. The module consists of four auxiliary blocks, each followed by a 2x2 max pooling operation to reduce the size of the feature map. The final output of the UNet architecture is a binarized map that accurately segments only the road, while everything else is turned white. The road is masked as black in the output, indicating the directions in which the robot can steer. The following figure illustrates the image output by this architecture.

Fig. 5. Overview of the Upgraded Image Pipeline

For training this neural network we have made our very own data set comprising 5,554 images of roads and obstacles of different complexity and under different lighting conditions. Out of these, 4,166 images have been used for training the model, whereas 1,388 images have been reserved for testing purposes. This model has been trained on a high-end computer comprising an Intel i7-11700K, an RTX 3070 Ti and 16 gigabytes of RAM. Along with this we have also used Kaiming He weight initialization, which derives the random initial weights from a Gaussian distribution. The network also uses ReLU activation, which enables much faster convergence of the network when training from scratch.

This method is effective in real-time mobile navigation, as it accurately segments the road even in challenging lighting conditions. Furthermore, the use of UNet architecture for semantic image segmentation can be applied to other tasks, such as object detection and classification, making it a versatile and powerful tool in the field of computer vision.

The steering angle calculation method utilized in this algorithm is identical to the one implemented in the PISA approach. Nonetheless, a minor modification is made by introducing a new variable, referred to as the 'factor' ($f$), which is influenced by the density of white pixels in the half opposite to the steering direction. This modification has the effect of enhancing the sensitivity of the algorithm to white pixels, leading to a more precise response in the steering angle calculation. The formula for this is given below:

$$S = f \cdot (\rho_L - \rho_R) \cdot \theta_{\max} \tag{2}$$

Specifically, this adaptation enables the robot to adhere to the road path even more accurately when navigating through sharper turns. Consequently, this algorithmic adjustment can be considered an improvement in the overall performance of the steering angle calculation method. The following flow chart visualizes the complete UBASS algorithm.

Fig. 6. Overview of the Complete UBASS Algorithm
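Equation (2) can be illustrated with a small plain-Python sketch. The paper only says that the factor is influenced by the white pixel density in the half opposite to the steering direction and does not give its exact form, so the expression used here, f = 1 + (white density of the opposite half), is purely a hypothetical choice for illustration. The function names, the 0/255 pixel encoding, and the sign convention (positive meaning more road on the left) are likewise assumptions.

```python
def pixel_density(region, value):
    """Fraction of pixels in `region` (a list of rows) equal to `value`."""
    total = sum(len(row) for row in region)
    if total == 0:
        return 0.0
    return sum(1 for row in region for px in row if px == value) / total


def ubass_steering_angle(binary_map, theta_max=90.0):
    """Equation (2): S = f * (rho_L - rho_R) * theta_max.

    Assumption: 0 = black (road), 255 = white (non-road or obstacle).
    Assumption: f = 1 + white density of the half opposite to the
    steering direction. The paper does not specify this form.
    """
    mid = len(binary_map[0]) // 2
    left = [row[:mid] for row in binary_map]
    right = [row[mid:] for row in binary_map]
    base = pixel_density(left, 0) - pixel_density(right, 0)
    # With base >= 0 the robot steers left, so the opposite half is the right one.
    opposite = right if base >= 0 else left
    factor = 1.0 + pixel_density(opposite, 255)
    return factor * base * theta_max


# Segmented map in which one white (obstacle) pixel sits in the right half.
demo_map = [
    [0, 0, 0, 0],
    [0, 0, 0, 255],
]
print(ubass_steering_angle(demo_map))  # 28.125
```

Here the base term is 1.0 - 0.75 = 0.25 and the right half is one quarter white, so f = 1.25 and S = 1.25 * 0.25 * 90 = 28.125 degrees, compared with 22.5 degrees from equation (1) alone. With this particular choice of f the result can exceed the maximum steering angle for strongly unbalanced maps, so a real controller would probably clamp it; that step is not described in the paper and is left out of the sketch.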

The current study proposes two novel algorithms for autonomous mobile navigation, which aim to optimize the navigation process through the incorporation of distinct decision-making strategies. To validate the effectiveness of these algorithms, a comparative analysis is conducted, whereby an end-to-end deep learning approach is implemented and evaluated alongside the proposed algorithms. In this regard, the experimental setup for this study involves the implementation of a rubber mat road track that simulates real-world environments by integrating diverse obstacles, including cones and green boxes. This setup enables the evaluation of the proposed algorithms and the end-to-end deep learning approach against challenging navigation conditions and varying levels of complexity. The end-to-end deep learning approach, a promising technique in the field of mobile robot navigation, involves training a navigation system solely on raw sensory data. This methodology bypasses the traditional pipeline of intermediate processing steps, leading to a more streamlined and efficient navigation system. Thus, comparing the proposed algorithms with the end-to-end deep learning approach presents an opportunity to evaluate the efficacy of the different techniques in addressing the challenges associated with mobile robot navigation. Overall, the results obtained from this study will provide insights into the strengths and weaknesses of the proposed algorithms and the end-to-end deep learning approach, enabling researchers to make informed decisions regarding the optimal navigation strategy for autonomous mobile robots.

The proposed experiment aims to capture and analyze multiple performance metrics to evaluate the efficiency and effectiveness of the experimental approach. Specifically, the experiment seeks to record and compare various metrics, including the central processing unit (CPU) consumption, random access memory (RAM) consumption, and frame rate of the processed video stream. Measuring the CPU and RAM consumption will provide a comprehensive understanding of the system's resource utilization, highlighting any bottlenecks or inefficiencies that may be present. Additionally, the analysis of the frame rate of the processed video stream will provide insights into the system's responsiveness and real-time processing capabilities. Along with this, we have implemented a system for monitoring instances where the robot deviates from its intended path, makes minor contact with obstacles, or experiences a full collision with an obstacle. This system allows us to maintain a comprehensive record of the robot's performance and its ability to navigate its environment with precision and accuracy. To ensure complete testing of the proposed algorithms, we have run the test 60 times while varying the road complexity and obstacles for the targeted problem domain.

III. RESULTS

The Nvidia Jetson Nano, equipped with a quad-core ARM Cortex-A57 MPCore processor and 128 CUDA cores for hardware-accelerated deep learning tasks, was chosen as the testing platform for our mobile robot. The system was also outfitted with a Logitech C70 web camera to capture input video streams, with the camera able to capture video at 30 frames per second. As a result, the maximum possible output stream frame rate is 30 frames per second. To determine performance metrics for each algorithm, the testing method outlined in the previous section was repeated 50 times, with measurements taken for the system's output frame rate, CPU usage, and RAM usage. These measurements are summarized in the table below.

Fig. 7. Object Avoidance Using PISA

Fig. 8. Object Avoidance Using UBASS

The figures presented illustrate the successful implementation of lane detection and object avoidance systems in our custom-built mobile robot using the PISA and UBASS algorithms. Notably, the UBASS algorithm surpasses the PISA algorithm in its ability to detect and steer away from obstacles. This superiority can be attributed to the use of the UNet architecture in the UBASS algorithm, which facilitates enhanced road extraction. Additionally, the UBASS algorithm's superior white pixel avoidance system enables it to detect and avoid obstacles at an earlier stage than the PISA algorithm.

TABLE I. EXPERIMENTAL DATA FOR COMPUTATIONAL EXPENSE IN TABULAR FORM

S. No  Algorithm                                             CPU%    RAM%    FPS
1.     Pure Image Segmentation Approach (PISA)               65.24%  43.55%  30.00
2.     UNet Based Approach to Semantic Segmentation (UBASS)  72.52%  66.34%  13.87
3.     End to End Deep Learning Approach                     97.9%   70.32%  7.35

Fig. 9. Graph illustrating the comparison of CPU% and RAM% (bar chart of CPU% and RAM% for PISA, UBASS, and End-to-End DL; vertical axis from 0% to 100%)
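The relative cost of each algorithm against the end-to-end baseline can be recomputed directly from Table I. The short sketch below does only that arithmetic; the dictionary layout and the function name `relative_to_baseline` are hypothetical, and the ratios it prints are derived solely from the Table I values, so they need not coincide with summary figures quoted elsewhere in the paper, which may rest on other measurements.

```python
# Values copied from Table I (CPU% and RAM% in percent, FPS in frames per second).
TABLE_I = {
    "PISA": {"cpu": 65.24, "ram": 43.55, "fps": 30.00},
    "UBASS": {"cpu": 72.52, "ram": 66.34, "fps": 13.87},
    "End-to-End DL": {"cpu": 97.9, "ram": 70.32, "fps": 7.35},
}


def relative_to_baseline(table, baseline="End-to-End DL"):
    """Return frame-rate speedup and CPU/RAM usage ratios versus the baseline row."""
    base = table[baseline]
    return {
        name: {
            "speedup": row["fps"] / base["fps"],      # >1 means faster than baseline
            "cpu_ratio": row["cpu"] / base["cpu"],    # <1 means less CPU than baseline
            "ram_ratio": row["ram"] / base["ram"],    # <1 means less RAM than baseline
        }
        for name, row in table.items()
        if name != baseline
    }


for name, ratios in relative_to_baseline(TABLE_I).items():
    print(name, {key: round(value, 2) for key, value in ratios.items()})
# PISA {'speedup': 4.08, 'cpu_ratio': 0.67, 'ram_ratio': 0.62}
# UBASS {'speedup': 1.89, 'cpu_ratio': 0.74, 'ram_ratio': 0.94}
```

Read this way, Table I puts PISA at roughly four times the baseline frame rate with about two thirds of its CPU load, while UBASS roughly doubles the frame rate at about three quarters of the CPU load.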

TABLE II. EXPERIMENTAL DATA FOR REAL WORLD PERFORMANCE

S. No  Algorithm                                             Lighting Condition  No. of runs with derailment  No. of runs with complete collision  No. of runs with slight obstacle grazing  No. of perfect runs
1.     Pure Image Segmentation Approach (PISA)               Low                 23                           16                                   37                                        8
                                                             Medium              19                           13                                   19                                        15
                                                             Ideal               17                           9                                    10                                        19
2.     UNet Based Approach to Semantic Segmentation (UBASS)  Low                 15                           7                                    5                                         29
                                                             Medium              9                            5                                    1                                         37
                                                             Ideal               8                            2                                    1                                         50
3.     End to End Deep Learning Approach                     Low                 19                           14                                   3                                         31
                                                             Medium              10                           5                                    2                                         35
                                                             Ideal               5                            1                                    1                                         51

IV. CONCLUSION

This paper introduces two novel algorithms: PISA (Pure Image Segmentation Approach) and UBASS (UNet Based Approach to Semantic Segmentation). These algorithms offer significant advantages in terms of computational requirements compared to existing methods like the End-to-End Deep Learning approach. Notably, they have been successfully implemented on low-end embedded systems such as the Nvidia Jetson Nano.

Our experiments demonstrate that PISA is 4.7 times faster than the End-to-End deep learning approach while consuming only 76% of the CPU resources and using 60.2% less memory. Similarly, UBASS outperforms the End-to-End Deep Learning model by being 2.18 times faster, 94% lighter on memory, and 89% less CPU intensive.

Our results show that PISA is the least computationally expensive; however, it is also the least accurate. We believe that PISA has a lot of potential to be used in lower-end applications where accuracy in low-light environments isn't crucial. This might prove useful for demonstration in schools, where students can have a hands-on approach to autonomous mobile robot algorithms. It can also potentially be deployed in environments with fixed lighting and low complexity, such as small warehouses. On the other hand, UBASS shows performance that's on par with the end-to-end deep learning approach while simultaneously being a lot less computationally expensive. We believe that UBASS can be used in warehouses for inventory management, medicine transportation in hospitals, last-mile deliveries for food and packages, automated surveillance and much more.

Even though these approaches have a high accuracy, there are still improvements that could be made to these algorithms, such as replacing the encoders on UBASS with MobileNet encoders to increase performance and further reduce computational requirements. Along with that, more accuracy can be achieved using a larger database with a more diverse range of complexity and lighting conditions. Furthermore, these approaches still lack the ability to map their surrounding areas, which is quite useful for indoor mobile robots.

In conclusion, our paper has proposed two novel approaches to reduce computational requirements for autonomous mobile navigation on low-end embedded systems. This paves the way for further innovations and advancements in this field that will make such mobile robots even more accessible, energy efficient and cost effective.

ACKNOWLEDGMENT

We express our sincere gratitude to the esteemed institution, BITS Pilani, Dubai Campus, for affording us the opportunity to conduct this research project. We acknowledge the institution's unwavering support, which proved instrumental in enabling us to accomplish this endeavor.

Furthermore, we extend our heartfelt appreciation to Dr. V. Kalaichelvi, Associate Professor, EEE, Birla Institute of Technology and Science Pilani, Dubai Campus, for providing us with invaluable research counsel and guidance at each stage of our academic journey. Her insights and suggestions were pivotal in shaping our research approach, and we are deeply indebted to her mentorship.

REFERENCES

[1] S. Cass, "Nvidia makes it easy to embed AI: The Jetson nano packs a lot of machine-learning power into DIY projects - [Hands on]," in IEEE Spectrum, vol. 57, no. 7, pp. 14-16, July 2020, doi: 10.1109/MSPEC.2020.9126102.
[2] H. J. Anak Undit, M. F. Abu Hassan and Z. M. Zin, "Vision-Based Unmarked Road Detection with Semantic Segmentation using Mask R-CNN for Lane Departure Warning System," 2021 4th International Symposium on Agents, Multi-Agent Systems and Robotics (ISAMSR), Batu Pahat, Malaysia, 2021, pp. 1-6, doi: 10.1109/ISAMSR53229.2021.9567892.
[3] F. Sadik, M. R. Subah, A. G. Dastider, S. A. Moon, S. S. Ahbab and S. A. Fattah, "Bangla Sign Language Recognition with Skin Segmentation and Binary Masking," 2019 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE), Bangalore, India, 2019, pp. 1-5, doi: 10.1109/WIECON-ECE48653.2019.9019931.
[4] T. D. Nguyen, A. Shinya, T. Harada and R. Thawonmas, "Segmentation Mask Refinement Using Image Transformations," in IEEE Access, vol. 5, pp. 26409-26418, 2017, doi: 10.1109/ACCESS.2017.2772269.
[5] S. Kumar, M. Jailia and S. Varshney, "A Comparative Study of Deep Learning based Lane Detection Methods," 2022 9th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 2022, pp. 579-584, doi: 10.23919/INDIACom54597.2022.9763110.
