ABSTRACT The recent development of object-tracking frameworks has affected the performance of many manufacturing and service industries, such as product delivery, autonomous driving systems, security systems, military and transportation, retailing, smart cities, healthcare systems, agriculture, etc. Achieving accurate object tracking in physical environments and conditions is much more challenging. However, the process can be evaluated using simulation techniques or platforms to check the model's performance under different simulation conditions and weather changes. This paper presents a target tracking approach based on reinforcement learning integrated with a tf-agent (TensorFlow-Agent) to accomplish the tracking process in the Unreal Game Engine simulation platform, Blocks. The productivity of these platforms can be seen while experimenting in virtual-reality conditions with virtual drone agents and performing fine-tuning to achieve the best or desired performance. In this proposal, the tf-agent drone learns how to track an object by integrating a deep reinforcement learning process to control the actions, states, and tracking, receiving sequential frames from a simple Blocks environment. The tf-agent is trained in the Blocks environment to adapt to the environment and the existing objects in the simulation for further testing and evaluation regarding tracking accuracy and speed. We have tested and compared two algorithmic approaches, DQN- and PPO-based trackers, integrated with the simulation process regarding stability, rewards, and numerical performance.
INDEX TERMS Object Tracking, Object Detection, Reinforcement Learning, AirSim, Virtual
Environment, Virtual Simulation, tf-agent, Unreal Game Engine
I. INTRODUCTION
The development of innovative technologies is taking novel ideas from modeling systems such as virtual reality environment simulators. Creating a virtual version of physical objects and process simulations can provide proper service optimization and allow free experimentation with conditions for any activity.
In recent years, most of the UAV-based research community has paid attention to improving the performance of visual tracking techniques with several neural-network-based architectures, such as CNN [10], DNN [11], LSTM [12], RL [13], and others. However, these research works present training and testing results on physical-environment datasets, such as image collections taken from different cameras in public areas and image and video sets taken by drone cameras. These object-tracking applications work with certain object classes, where the drone tracks dynamic objects along an exact pathway and localizes objects with specific methods as an additional task for the target-tracking framework. Still, there are challenges while tracking moving and static objects with apparently identical aspects to recognize which one is the actual tracking target. Recent object-tracking research has begun to combine intensive learning with virtual reality integration platforms [14], [15], so scientists can integrate their technique or algorithm with a virtual reality platform to test their proposals with various fine-tuning parameters and conditions. It gives scholars more opportunities to explore their methods better and more deeply in a hardware-free environment at zero cost, and to optimize them as much as possible. Additionally, some techniques motivated by a robust extension of integral schemes for mismatched uncertain nonlinear systems have been proposed to support asymptotic tracking [16]. Asymptotic tracking means ensuring that the system's output tracks a desired reference trajectory over time with negligible tracking error. The main goal is to design a tracking control system that guarantees the output converges to the reference trajectory as time approaches infinity in uncertain environments. Another model is the output feedback adaptive RISE control technique [17], used for uncertain nonlinear systems to achieve accurate tracking of desired trajectories. The term "adaptive" indicates that the controller parameters are updated online based on the system's behavior and the tracking error. A deep Q-learning-based approach [18] has been suggested for firefighting situations, applicable to agent robots or drones for finding or planning paths and navigating through fire environments. In such complex and hazardous cases, the situation must be controlled carefully, with concrete plans and actions for rescuing the injured or victims of the incident by coordinating situational awareness with other rescuers, which is an urgent task. When the framework is installed and applied to real drones or robots, it can help firefighters or rescuers make the right decision in extreme, panicky, and disorienting conditions.
Several relevant research studies have been published that integrate virtual simulation platforms with proposed algorithms. Kalidas A. P. et al. [19] presented vision-based navigation of UAVs based simply on image data by employing deep reinforcement learning to avoid stationary and movable obstacles autonomously in discrete and continuous action spaces. W. Zhao et al. [20] also proposed perception-based hierarchical active tracking control for UAVs, deploying a high-level controller and action orders in a V-REP-based environment. A trained PPO algorithm [21] with reward shaping for guiding an aircraft to a moving destination in a three-dimensional continuous space model was suggested, with agent-specific target guidance in virtual state space using a novel reward calculation. A PPO-based DRL algorithm [22] was suggested for UAV tracking with the assistance of another UAV, introducing a generalized distributed deep reinforcement learning platform, which provides solutions to various problems such as tracking, controlling, and mission coordination of UAVs. Moreover, M. A. B. Abdelkader et al. [23] propose RL-based drone elevation control on a Python-Unity integrated simulation framework to achieve a stable user datagram protocol (UDP) connection with the suggested algorithm. E. Çetin [24] proposes countering drones in a 3D space, presenting several DRL methods to counter a drone with another drone in an environment provided by the AirSim simulator.
In this study, we developed an algorithm based on tf-agent drone tracking in a Blocks environment, where the tf-agent actively makes decisions to track the target object in the runtime environment. This proposal includes different reward techniques to boost the learning, tracking, and decision-making processes via a TF-agent-based drone in a simulation platform. There are some computational considerations for correctly applying parameter values to achieve a higher accuracy rate, and the state representation was formulated to clear out unnecessary losses and constraints for the training and testing processes.
The following list summarizes our work's primary contributions:
• We introduce a virtual environmental simulation-based object-tracking algorithm model that receives input images directly from a realistic virtual platform.
• Direct access to network-feedable source images from the simulation environment makes the framework more advantageous for learning and testing under unknown environmental conditions.
• The experiment is implemented in an AirSim-based basic Blocks environment with a randomly walking person to be tracked by a virtual drone agent.
• Two different methods were adopted and integrated with the virtual simulation platform to demonstrate the performance of the models.
FIGURE 1. Proposed DRL-based TF-agent object tracking baseline integration with the Game Engine.
II. RELATED WORKS
The recent development of object tracking via reinforcement learning has improved by integrating it with many target tracking techniques, which produce better performance with decision-making in tracking procedures. Although most visual tracking concepts based on DRL could perform better in the case of a representation model with adopted manners for locating the target object within a search region, the final estimated target coordinates are ideally centered.

A. OBJECT TRACKING VIA DEEP REINFORCEMENT LEARNING
The advancement of object tracking via reinforcement learning is a comparatively novel idea, where object localization and tracking are integrated with a decision-making model [13], [25]-[30] applied to the learning and tracking process as well. Several studies have discovered that combining deep learning and RL [31] in various settings confers many advantages. Visual object tracking [32], localizing temporal activity [33], identifying object classes [26], object recognition through video sequences [34], and segmentation [35] are just a few of the computer vision problems that have used DRL. Notably, studies on visual object tracking via DRL frameworks have increased in recent years, where DRL is associated with several techniques to make the training and decision-making ability more robust while targeting the object location. In the most typical use of DRL for visual object tracking, the agent must estimate the target position (bounding box) in every frame of the sequence by repeatedly selecting the best-fitting actions to obtain accurate tracking results.
Accordingly, the state representation is the fulfillment status of the general frame states within a targeted bounding box. In general, actions are the transformation results of the bounding box while tracking, which can be shift, scale, and turn actions, depending on how the network learned and adapted to the environment at training time. In DRL-based object tracking, accuracy (precision) is emphasized as a reward value, showing the difference between the targeted action bounding box and the ground-truth values; in general, this is called intersection-over-union (IoU). Reward values change according to the difference between the action value and the ground-truth output, which reflects tracking accuracy.
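As a concrete illustration of this IoU-based reward, a minimal sketch is given below; the function name, the (x_min, y_min, x_max, y_max) box format, and the final reward-shaping line are our own illustrative assumptions rather than the exact formulation used in the cited works.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Coordinates of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    if inter == 0.0:
        return 0.0
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A reward can then be shaped from the overlap, for example:
# reward = iou(predicted_box, ground_truth_box)
```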
B. VIRTUAL SIMULATION-BASED OBJECT TRACKING VIA DEEP REINFORCEMENT LEARNING
In the last few years, most research topics have interacted with innovative trends in virtual simulation world environments that allow the simulation of any action, object, or process, enabling experimentation with complex conditions to manage and optimize results. Algorithm integration with simulation platforms makes it challenging to conduct testing and experimentation while taking advantage of simulation behavior closely related to real-world models with dynamic and inactive action modes. Several simulation platforms have flexible functionality to connect with software algorithms for experimentation. The most widely used open-source platforms currently are AirSim [14] and Unity [15], which intend to bridge the gap between the virtual and real worlds to support the development of autonomous control and a realistic replica of the actual world. Both platforms are advancing their technical abilities with high intensity to positively influence the development and testing of data-driven machine intelligence techniques such as reinforcement learning and deep learning. W. Luo et al. suggested an active object tracking technique [29] via deep reinforcement learning, in which a drone agent adopted a ConvNet-LSTM function approximator for predicting the target movement using a frame-to-action strategy. Besides, they performed additional environment augmentation techniques (ViZDoom and Unreal Engine simulation) and used a customized reward function to boost the training process and achieve better target tracking performance. Another virtual simulation-based approach [36] uses a monocular onboard camera via a DRL model to follow the detected target object.
FIGURE 2. The flow chart of the DRL-based tf-agent object tracking model in Game Engine.
They state that this technique is a more accurate and cost-efficient strategy for adopting an algorithm in a virtual environment, using multiple sensor data points from the pre-calculated trajectory. The proposed model combines one of the object detection models, MobileNet [37], to get the bounding box information from the image input of the learning process. The model includes convergence-based exploration and exploitation for adaptively aligning algorithms with the network.
Moreover, J. Schulman et al. suggested a reinforcement learning-based drone follow-me behavior object tracking framework [38] using the Deep Q-Learning (DQN) model to control RL agents with adaptive and flexible behavior. In this object-tracking model, stacked image frames and depth information are integrated as input frames for the learning and testing process. The proposed model was experimented with in different-level environments with several reasonable structural changes. Experimental output under several specific conditions showed that the RL-based drone following technique succeeded in its adaptive and generalizing behavior.
In our recent research, we proposed virtual simulation-based visual object tracking via a deep reinforcement learning algorithm [25], which an AirSim drone agent uses to track the targeted object class in a runtime virtual simulation environment by utilizing sequential frames directly from it. Additionally, the suggested model has been tested with a public dataset to evaluate its performance against recent research outputs. The main advantage of a virtual simulation platform is that researchers can conduct experimentation several times with different fine-tuning techniques at no cost until they improve their proposal to high accuracy. Accordingly, generating new, fake, or augmented data, or collecting or reusing data from public sets to reinforce model learning and boost localization exactness while decreasing estimation time and human effort, is unnecessary.

III. PROPOSED METHOD
In this technique, we created an algorithm implementation that includes several components of the object tracking framework, including training and tracking, and we evaluated it in a virtual simulation environment. The framework's fundamental idea is to learn the action space using direct input from a virtual simulation platform. Platforms allow training the network with little effort spent on active learning and tracking operations. However, the method must be correctly linked with the Q-learning network model to get the required object feature information to study the environment and aid in making continuous action decisions in each frame of the tracking sequence.

A. ALGORITHM BASELINE
Figure 1 shows the baseline illustration of our proposed method. Firstly, the AirSim simulation platform must be installed and set up with the required characteristic parameters to integrate the designed algorithm model. We manually insert an object into the simulation platform with a defined walking route around the particular location specified in the virtual environment part of the pipeline (Figure 1). The virtual simulation platform provides essential input frame sequences with feature information, such as ordinary, segmented, and gray-scale (negative) depth images, that proceed through the tf-agent DRL network layers to learn and take action for target tracking measures. We use image depth to identify the object location and targeted class while experimenting through the network for adaptation to unknown conditions.

B. TF-AGENT-BASED DRL OBJECT TRACKING MODEL
Tracking objects on a virtual simulation platform differs from typical state-of-the-art target-tracking framework approaches. The target object moves automatically across the simulation platform area, occluded by obstacles such as high walls, several different-shaped objects, etc. In this proposal, a randomly walking pedestrian was set into the simulation environment to create learning and tracking conditions for a virtual AirSim drone agent. As shown in Figure 2, we integrated the simulation platform with the suggested alternative algorithm model to jointly optimize representations by experimenting in different conditions. Firstly, we request from the environment simulation platform the typical depth images and the segmentation map to obtain the pixels belonging to the target. In the next step, frames are gray-scaled and normalized for further recognition of an object in the virtual simulation model.
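A minimal sketch of this acquisition and preprocessing step is shown below. It assumes the standard AirSim Python client; the camera name, the choice of DepthPerspective images, and the depth normalization constant are illustrative assumptions rather than the exact settings of our pipeline.

```python
import airsim
import numpy as np

client = airsim.MultirotorClient()
client.confirmConnection()

def get_observation(max_depth_m=100.0):
    """Request depth and segmentation frames and return a normalized depth map plus the segmentation image."""
    responses = client.simGetImages([
        airsim.ImageRequest("0", airsim.ImageType.DepthPerspective, pixels_as_float=True, compress=False),
        airsim.ImageRequest("0", airsim.ImageType.Segmentation, pixels_as_float=False, compress=False),
    ])
    depth_resp, seg_resp = responses

    # Depth arrives as a flat float list; reshape to H x W, then clip and normalize to [0, 1].
    depth = airsim.list_to_2d_float_array(
        depth_resp.image_data_float, depth_resp.width, depth_resp.height)
    depth = np.clip(depth, 0.0, max_depth_m) / max_depth_m

    # Segmentation arrives as packed uint8 color values; the target's ID color marks its pixels.
    # The channel count is inferred, since it can differ between AirSim builds.
    seg = np.frombuffer(seg_resp.image_data_uint8, dtype=np.uint8)
    seg = seg.reshape(seg_resp.height, seg_resp.width, -1)
    return depth, seg
```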
After getting the pixels containing the target, the bounding box points are concatenated and transformed into network-readable values for the following process.

1) DQN-BASED TF-AGENT
The DQN agent is suitable for any environmental condition with a discrete action space, formulated deterministically for simplicity, with expectations over stochastic environmental transitions. The main goal of the DQN agent in this model is to train a policy to maximize the discounted cumulative reward (1):

$$R_{t_0} = \sum_{t=t_0}^{\infty} \gamma^{\,t-t_0}\, r_t \qquad (1)$$

This is also known as the return value $R_{t_0}$. In most RL-based networks, the discount factor $\gamma$ should be a constant between 0 and 1 so that the sum converges. It allows our agent to gain better reward values by discounting uncertain environment feature information and identifying what is less relevant than a fairly confident estimate. $Q^*$ aims to achieve an affordable reward or return value, $Q^*: \mathit{State} \times \mathit{Action} \rightarrow \mathbb{R}$: when the action is taken in a given state, the return results from a constructed policy to achieve maximized rewards (2).

In every update step, the previous Q-value is blended with the computed Q-value. However, this should happen at a balanced learning rate to keep the trade-off between the previous and new Q-values for the further training process. The learned value is the reward $R_{t+1}$ that the drone agent receives when moving from the starting state, plus the discounted estimate $\gamma$ of the optimal future Q-value for the new state-action pair $(s', a')$ one time step ahead. The learned value is multiplied by the learning rate $\alpha$ to obtain the optimal policy value update. The Q-learning update process is illustrated in Figure 3.
As illustrated in Figure 3, there can be several actions through the training or learning process, where an agent chooses the seemingly optimal action $Q_\pi(s_t, a_t)$ and receives a reward for the agent's performance through steps in a virtual environment. For further learning, the agent should choose an action from the $S_{t+1}$ state to continuously learn and analyze the environment with more profound feature results. Here, the epsilon-greedy option is a straightforward strategy for balancing exploration and exploitation by randomly selecting between the two. The method, where epsilon is the likelihood of choosing to explore rather than exploit, determines whether the agent proceeds to explore the environment with a slight chance. The mathematical formulation of action selection with the epsilon-greedy method is given in equation (5).
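To make the preceding description concrete, the sketch below shows the discounted return of (1), an epsilon-greedy action choice, and the Q-value update described above in simplified tabular form; it is an illustrative sketch with our own variable names, not the exact TF-Agents training loop.

```python
import random

def discounted_return(rewards, gamma=0.99):
    """Discounted return from equation (1), with t0 = 0."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

def epsilon_greedy(q_values, state, actions, epsilon=0.1):
    """With probability epsilon explore a random action; otherwise exploit the best known one."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))

def q_update(q_values, state, action, reward, next_state, actions,
             alpha=0.001, gamma=0.99):
    """Blend the previous Q-value with the learned value R_{t+1} + gamma * max_a' Q(s', a')."""
    best_next = max(q_values.get((next_state, a), 0.0) for a in actions)
    learned = reward + gamma * best_next
    old = q_values.get((state, action), 0.0)
    q_values[(state, action)] = old + alpha * (learned - old)
```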
FIGURE 4. Probability ratio r of the surrogate function L^CLIP with positive (A > 0) and negative (A < 0) advantages. The red circle on each plot shows the starting point for the optimization, i.e., r = 1. The sum of the surrogate function L^CLIP will be performed for many terms [40].
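For reference, the clipped surrogate objective visualized in Figure 4 has the standard form (cf. [40]), with probability ratio $r_t(\theta)$ and advantage estimate $\hat{A}_t$:

$$r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}, \qquad L^{CLIP}(\theta) = \hat{\mathbb{E}}_t\!\left[\min\!\left(r_t(\theta)\hat{A}_t,\ \mathrm{clip}\!\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t\right)\right]$$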
IV. EXPERIMENTAL RESULTS
One of the main objectives and focuses was to get the most advantage from the simulation platform to perform experiments in different conditions with parametric changes. Many related researchers have used several simulation platforms to test and evaluate their algorithms in several evaluation studies. There are different ways to take advantage of a realistic virtual platform. In most cases, platforms are applied for experimentation purposes only. However, they can also be widely prioritized for learning, training, and testing. The current development of simulation platforms like Unity, Unreal Engine, and Cesium gives great opportunities and advantages to process and experiment with state-of-the-art models in multiple and otherwise impractical circumstances. One of the prime features of the simulators is the interconnection between programming languages (JavaScript, Python, Go, Java, Kotlin, PHP, C#, Swift, etc.) and frameworks (Angular, jQuery, React, Ruby on Rails, Vue, ASP.NET Core, Django, Express, etc.). However, building or setting up this type of architecture and framework is quite tricky, and it could only be successful in some cases due to third-party programs' and libraries' conflicts and disproportionality. In this research work, we conducted experiments with different parametric changes and fine-tuning, as explained in the following sections.

A. TRAINING RESULTS
We have trained our proposed model in a simple Blocks environment by inserting randomly moving objects to learn the environmental space and to create a model for future testing and evaluation purposes. We applied two types of tf-agent models, DQN- and PPO-based tf-agents, to achieve more comparable output results with a 0.001 learning rate configuration.
Figure 5 illustrates the minimum reward outputs of the trained models in a typical Blocks environment, where the DQN-based tf-agent and the PPO-based tf-agent models are marked in blue and pink, respectively. The minimum reward is the smallest value that the agent can receive as a reward during the training process. The minimum reward is typically negative, since most problems involve a penalty for making suboptimal decisions; the training epochs and reward were set at 2000 and 50, respectively.

FIGURE 5. The minimum received reward output for two training models: DQN-based TF-AGENT and PPO-based TF-AGENT.

Furthermore, the background was set with a plot tab color for each method to show the overall performance of the training agents. Each agent model initially gained different rewards, whereas the DQN-based agent performed better. Nevertheless, at the end of the training epochs, the PPO-based agent receives better results than the DQN-based model agent. The whole training reward performance is illustrated in Figure 6, with the training epochs and reward set to 2000 and 50, respectively. The maximum point is the highest value that the agent can receive in the training process. The maximum reward is typically a positive value, since most problems involve a reward for making optimal decisions. The DQN-based TF-AGENT model initially gains a higher reward value in this graph. However, the PPO-based model performs better after 400 epochs until the end of the training steps. Understanding the range of possible rewards can help set the hyperparameters of the models, such as the learning rate or the discount factor. It can also help assess the performance of the trained agent, as the rewards obtained by the agent are compared against the minimum and maximum possible values.
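As an illustration of how the two tf-agents can be instantiated with this 0.001 learning rate, a minimal TF-Agents sketch is shown below; the network layer sizes, the PPO clipping value and epoch count, and the assumption that the Blocks simulation is already wrapped as a TFPyEnvironment (train_env) are illustrative choices rather than our exact configuration.

```python
import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.agents.ppo import ppo_agent
from tf_agents.networks import actor_distribution_network, q_network, value_network
from tf_agents.utils import common

LEARNING_RATE = 1e-3  # 0.001, as used for both trackers.

def build_dqn_agent(train_env):
    # Q-network mapping observations to one Q-value per discrete action.
    q_net = q_network.QNetwork(
        train_env.observation_spec(), train_env.action_spec(),
        fc_layer_params=(128, 64))  # layer sizes are illustrative
    return dqn_agent.DqnAgent(
        train_env.time_step_spec(), train_env.action_spec(),
        q_network=q_net,
        optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        td_errors_loss_fn=common.element_wise_squared_loss,
        train_step_counter=tf.Variable(0))

def build_ppo_agent(train_env):
    # Separate actor and value networks for the clipped-surrogate PPO agent.
    actor_net = actor_distribution_network.ActorDistributionNetwork(
        train_env.observation_spec(), train_env.action_spec(),
        fc_layer_params=(128, 64))
    value_net = value_network.ValueNetwork(
        train_env.observation_spec(), fc_layer_params=(128, 64))
    return ppo_agent.PPOAgent(
        train_env.time_step_spec(), train_env.action_spec(),
        optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        actor_net=actor_net, value_net=value_net,
        importance_ratio_clipping=0.2,  # clip range epsilon; illustrative
        num_epochs=10)                  # illustrative
```

Both agents are then initialized with `agent.initialize()` before the training loop collects experience from the simulation environment.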
FIGURE 6. The received maximum reward output for two types of training models: DQN-based TF-AGENT and PPO-based TF-AGENT.

The average reward shown below refers to the mean value of the rewards received by the agents during their interactions with the environment while training with the DQN- and PPO-based model algorithms. The average reward is essential for evaluating the agents' performance during training. During training, agents try to learn an optimal policy that maximizes the cumulative reward obtained over time. The average evaluation reward is calculated by dividing the sum of the rewards received during all episodes by the total number of episodes. The estimated calculation is the average reward the agent will receive when interacting with the environment using the learned policy.
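A sketch of this evaluation loop, following the common TF-Agents pattern, is shown below; the default episode count is an illustrative choice.

```python
def compute_avg_return(environment, policy, num_episodes=10):
    """Average undiscounted episode return of `policy` in `environment`."""
    total_return = 0.0
    for _ in range(num_episodes):
        time_step = environment.reset()
        episode_return = 0.0
        # Roll out one full episode under the evaluated policy.
        while not time_step.is_last():
            action_step = policy.action(time_step)
            time_step = environment.step(action_step.action)
            episode_return += time_step.reward
        total_return += episode_return
    return total_return / num_episodes
```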
FIGURE 7. The received average rewards outcome of training for DQN and PPO-based TF-AGENTS.

By evaluating the average reward received, we can see the difference between the DQN- and PPO-based models' performance in varied configurations. However, in some scenarios, the average reward may not be the most suitable metric for evaluating the agent's performance.

B. TESTING RESULTS
We have tested our proposed DQN- and PPO-based model agents under the same environmental conditions but with different unseen test episodes to explore the ability of the models and compare their performance. As mentioned above, the DRL-based algorithm's performance evaluation differs from other state-of-the-art algorithms in terms of performance metrics and comparison techniques. The agent-based models' precision can be seen or taken as a received reward value. As mentioned earlier, the average reward obtained by the agent during training can help create a model and apply this model to the testing process as a performance metric. This metric measures the agent's ability to navigate the environment and obtain the expected tracking output. The diagram below (Figure 8) represents the DQN-based TF-AGENT model's output with the reward percentage received from unseen testing scenarios. In the testing session, the maximum received reward percentage was set to 100 over the 50 steps of every episode. The reward received in every step is marked with a column, and the red line illustrates the smoothed value of the DQN-based TF-AGENT testing results, trained and tested with the standard reward, in Figure 8 (a).
Figure 8 (b) shows the PPO-based TF-AGENT's received percentage reward testing results over the 50 steps of the episode, along with the smoothed red line output. The DQN and PPO models received different reward results in every testing step. As we can see, the testing results show that both models give high accuracy and precise learning performance in every testing step output.

V. CONCLUSION
In this research work, we have presented a DQN- and PPO-based TF-AGENT object tracking framework integrated with a simple Blocks environment to experiment with and evaluate the performance of the proposed algorithm. It has been integrated with the simulation platform to highlight the algorithm's overall performance. The simulation platform provides three types of essential input images to experiment with and evaluate the overall status. While testing in a virtual-reality scenario with virtual drone agents and fine-tuning to reach the best or desired results, the productivity and eligibility of these platforms are vital. The DQN- and PPO-based virtual tf-agent drones learn how to detect and track an object inserted in this platform by obtaining consecutive frames from a primary Blocks environment and using a DRL network to manage the actions, states, and tracking pipeline. Both tf-agents are trained in a Blocks environment to adapt to the surroundings and existing objects in the simulation for additional testing, tracking accuracy, and speed assessment. In the training process, both models showed presentable results: minimum rewards of 49 (PPO) and 48 (DQN) in 2000 epochs; maximum rewards of 49 (DQN) and 49 (PPO) in 2000 epochs; and an average reward of 49 for both models (PPO and DQN). In testing, both models' performance was contrasted over the 50 steps of one test episode, where the PPO-based tf-agent reached its peak reward of 97% in step 23 and the DQN-based agent received its maximum value of 86% in step 17. However, the overall performance of the received percentage reward graphs (Figure 8, a and b) indicates that the DQN-based model's sequential results are better than the PPO-based model's.
FIGURE 8. The testing reward distribution of the DQN-based TF-AGENT (a) and the PPO-based TF-AGENT (b) in 50
steps of the episode.
Regarding stability, reward contribution, and numeric graphical performance, we examined and compared the algorithm techniques under various established hyperparametric changes with reinforcement learning-based network control incorporated into the simulation process. In future work, we are going to integrate our model with several state-of-the-art tracking techniques to improve the performance of the target tracking framework by testing it in more complex virtual simulation environments.

REFERENCES
1. The Manufacturer, "The benefits of drones in manufacturing," The Manufacturer, 29 Jun. 2022. Available: https://www.themanufacturer.com/articles/the-benefits-of-drones-in-manufacturing/
2. Croptracker, "Drone Technology in Agriculture," Dragonfly IT, 26 April 2022. Available: https://www.croptracker.com/blog/drone-technology-in-agriculture.html
3. Gregory McNeal, "Drones and aerial surveillance: Considerations for legislatures," November 2014. Available: https://www.brookings.edu/research/drones-and-aerial-surveillance-considerations-for-legislatures/
4. Purahong, B., Anuwongpinit, T., Juhong, A., Kanjanasurat, I., and Pintaviooj, C., "Medical Drone Managing System for Automated External Defibrillator Delivery Service," Drones 2022, 6(4), 93. https://doi.org/10.3390/drones6040093
5. Norbert Tusnio and Wojciech Wroblewski, "The efficiency of drones usage for safety and rescue operations in an open area: A case from Poland," Sustainability 2022, 14, 327. https://doi.org/10.3390/su14010327
6. Mouna Elloumi, Riadh Dhaou, Benoit Escrig, and Hanen Idoudi, "Monitoring road traffic with a UAV-based system," IEEE Wireless Communications and Networking Conference, 11 June 2018. DOI: 10.1109/WCNC.2018.8377077
7. Chodorek, A., Chodorek, R.R., and Yastrebov, A., "Weather Sensing in an Urban Environment with the Use of a UAV and WebRTC-Based Platform: A Pilot Study," Sensors 2021, 21, 7113.