0% found this document useful (0 votes)
25 views8 pages

FG-PE: Factor-Graph Approach For Multi-Robot Pursuit-Evasion

Uploaded by

Anil Nara
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
25 views8 pages

FG-PE: Factor-Graph Approach For Multi-Robot Pursuit-Evasion

Uploaded by

Anil Nara
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 8

1

FG-PE: Factor-graph Approach for Multi-robot Pursuit-Evasion


Messiah Abolfazli Esfahani*, Ayşe Başar*, and Sajad Saeedi*

Abstract—With the increasing use of robots in daily life, there from sensor measurements and past observations. The main
is a growing need to provide robust collaboration protocols for contributions of the proposed method are: (1) FG-PE provides
robots to tackle more complicated and dynamic problems effec- a factor graph-based solution for addressing the pursuit-evasion
tively. This paper presents a novel, factor graph-based approach
to address the pursuit-evasion problem, enabling accurate esti- problem. By constructing a factor graph, the proposed method
mation, planning, and tracking of an evader by multiple pursuers can estimate the position of the evader at each time step. The
working together. It is assumed that there are multiple pursuers generated graph enables querying the position of each pursuer
and only one evader in this scenario. The proposed method and other entities present in the scene. To the best of our
arXiv:2411.00741v1 [cs.RO] 1 Nov 2024

significantly improves the accuracy of evader estimation and knowledge, this paper is the first to formulate the PE problem
tracking, allowing pursuers to capture the evader in the shortest
possible time and distance compared to existing techniques. In as a factor graph. (2) FG-PE can estimate the evader’s position
addition to these primary objectives, the proposed approach and plan the pursuers while considering short time windows.
effectively minimizes uncertainty while remaining robust, even This means that at each time step, it can determine the
when communication issues lead to some messages being dropped planned position for each pursuer for the subsequent time step.
or lost. Through a series of comprehensive experiments, this (3) FG-PE accounts for uncertainty in predictions using the
paper demonstrates that the proposed algorithm consistently
outperforms traditional pursuit-evasion methods across several probabilistic nature of factor graphs. It aims to minimize this
key performance metrics, such as the time required to capture uncertainty by providing a plan through factor graphs, which
the evader and the average distance traveled by the pursuers. Ad- enable efficient representation and optimization in the presence
ditionally, the proposed method is tested in real-world hardware of noise. Videos of the experiments are available on the
experiments, further validating its effectiveness and applicability. website of the project: https://sites.google.com/view/pursuit-
Index Terms—multi-robot system, factor graph, pursuit eva- evasion. (4) FG-PE is easily extendable to multiple pursuers
sion, uncertainty, tracking and allows for high obstacle-wise scalability. (5) FG-PE is
robust against lost messages and performs well even when
I. I NTRODUCTION some messages are dropped within the graph structure.
The rest of the paper is organized as follows: Sec. II presents
With the increasing presence of robots across various in-
background. Sec. III introduces the proposed method. Sec. IV
dustries, there has been a growing emphasis on inter-robot
shows the experimental results. Sec. V concludes the paper.
collaboration. Collaborative robots assist with tasks such as
search and rescue [1], trajectory planning [2], agriculture [3],
object transportation [4], and collision avoidance [5]. A key II. L ITERATURE R EVIEW AND BACKGROUND
challenge in search and rescue operations is the pursuit-evasion The PE problem finds applications in various robotics sce-
(PE) problem, where pursuers try to catch evaders. narios, such as surveillance, search and rescue operations, and
Several methods have been utilized to solve the PE problem. multiplayer games [11]. PE involves two groups of agents:
Classical methods use graph-based techniques, treating the PE pursuers and evaders. The evaders aim to reach a predefined
problem as a search problem [6]. Reinforcement learning (RL) goal, such as a specific location, object, or target area, while
approaches [7], [8] and game theory methods [9] have also the pursuers try to prevent them from doing so. Key aspects
been investigated. However, existing methods often lack an of the PE problem include multi-agent tracking [12] and path
efficient representation of the relationships between pursuers, planning [2]. There are three main approaches solving the
evaders, and their environment. For instance, if one pursuer PE problem: search-based, learning-based, and graph-based
loses its history of movements, retrieving that information can approaches.
be challenging. Utilizing a graph-based approach can establish A direct method for addressing the PE problem is to use search
structural connections between entities, improving the overall algorithms like Depth-First Search (DFS) [13] or Breadth-
understanding of the situation. In addition, the aforementioned First Search (BFS) [14]. A critical aspect of the pursuit-
methods do not account for uncertainties in predictions and do evasion problem is the speed of the agents. Furhan et al. [15]
not facilitate message passing among robots. This limitation introduced a method that accounts for uncertainties in the
affects the scalability and versatility of PE variants. pursuers’ speed, which can lead to issues such as moving to
In this paper, we present FG-PE, a method that employs unintended locations.
factor graphs (FG) [10] to tackle the PE problem. Factor Learning-based algorithms are also suitable for solving the
graphs provide a flexible framework capable of adapting to PE problem. Yu et al. [16] proposed a method focusing on
varying numbers of pursuers and obstacles. This approach ad- the application of reinforcement learning (RL) techniques to
dresses prediction uncertainty by guiding pursuers to minimize address PE. The authors propose a fully decentralized multi-
capture time, distance, and uncertainty, utilizing information agent deep RL approach. In this approach, each agent is
*Toronto Metropolitan University, Toronto, Canada modeled as an individual deep RL agent with its learning
email: {mabolfazli, ayse.bener, s.saeedi}@torontomu.ca system, including an individual action-value function, learning
2

Var. Description
update process, and action output. To facilitate coordination
xqt Evader pose in SE(2) at time step t
among agents, limited information about other environmental x̂qt Estimated evader pose in SE(2) at time step t
agents is incorporated into the learning process. Engin et xpt Pursuer pose in SE(2) at time step t
al.[17] aimed to solve a type of PE in which agents have Ω Information matrix of variables
limited visibility. To overcome the challenges associated with ftdq Dynamic factor of the evader at time step t
dp
large state spaces, a new learning-based method is introduced ft j Dynamic factor of the j th pursuer at time step t
mp
that compresses the game state and uses it to plan actions ft j Measurement factor btw j th pursuer and the evader at time t
m o
for the players [17]. Nikolaos-Marios et. al formulated the ft j i Measurement factor btw j th pursuer and ith obstacle at time t
p mov
problem of PE as a zero-sum differential game and they ft j Planning factor of the j th pursuer at time t
cp
ft j Collision avoidance factor for j th pursuer at time t
proposed an online RL approach to solve it [18]. One of o p
ft i j Obstacle avoidance factor btw j th pursuer and ith obstacle at
the other proposed approaches for solving the PE problem is time t
utilizing the Q-learning approach[19]. eqd Cost of dynamic factor of the evader
t
Graph-based methods have also been utilized to solve the PE p d
et j Cost of dynamic factor of the j th pursuer
problem. Kirousis et al. [20] converted the PE problem into mp
et j Cost of measurement factor btw j th Pursuer and Evader
m o
a graph searching problem. Another approach to solving the et j i Cost of measurement factor btw j th pursuer and ith obstacle
cp
problem in a graph searching manner, involves vertex pursuit, et j Cost of collision avoidance btw pursuer j and other pursuers
o p
where the pursuer aims to occupy the same vertex as the et i j Cost of obstacle avoidance factor btw pursuer j and obstacle i
evader [21]. Another type is the edge pursuit, in which the TABLE I: Variables used in this work.
entities move along the edges of the graph, and the pursuer
attempts to move to the edge occupied by the evader [22]. In Eq. (3), if at time step t the distance between a pursuer and
There are several real-world applications that require the the evader is less than the capturing radius, it indicates that the
collaboration of multiple robots such as object transportation pursuers win the game. In this paper, we assume: 1) pursuers
[23] and task allocation [24]. One approach to coordinating the can observe the evaders but the measurements (range/bearing)
necessary communication among robots is by utilizing graph are noisy, 2) there are random number of obstacles in the
representation. Factor graphs [10], described in Sec. III-C in environment that is shown by No , 3) the evader has a pre-
detail, can represent complex relationships between agents defined goal point to reach, and 4) pursuers neither know the
concisely and are suitable for addressing a wide range of goal of the evader nor the motion model of the evader. The
problems such as kinodynamic motion planning that fully objective for pursuers is to 1) estimate the position of the
considers whole-body dynamics and contacts [25]. Factor evader, 2) minimize the uncertainty of the estimate, and 3)
graphs have been widely used to solve various multi-robot capture the evader.
problems, such as localization [26], path planning [27], and
control [4]. In this paper, we utilize factor graphs to address B. Path Planning of Evader
the PE problem. To the best of the authors’ knowledge, there
is no existing work, that formulates the PE problem using a The evader aims to reach a predefined goal, known only
factor graph representation. to itself. The evader employs the Dynamic Window Approach
(DWA), a popular method in robotics for motion planning and
obstacle avoidance [28]. DWA is effective in scenarios where
III. P ROPOSED M ETHOD a robot navigates through moving obstacles. It dynamically
In this section, the problem statement and the proposed adjusts the robot’s velocity and steering commands based on
solution are presented. Table I lists the notations used. real-time sensory input and desired goal states. By contin-
uously evaluating and updating potential future trajectories
A. Problem Statement within a dynamically defined window, the evader can effi-
ciently navigate toward its goal while avoiding obstacles and
In this work, we assume that a team of Np pursuers, indexed pursuers. The evader iterates through all possible trajectories
by N = 1, 2, . . . , Np , and one evader (Nq = 1) are exploring based on angular velocity and linear velocity, selecting the best
the R2 space. The linear and angular velocity of robots are movement for the next time step. In this paper, we utilize a
bounded based on the following conditions, where v max and modification of the DWA approach for planning the pursuer
ω max represent the maximum allowable linear and angular trajectory.1
velocity for the robots:
U = {v ∈ R2 | ||v|| ≤ v max } (1) C. Factor Graphs
This paper proposes a factor graph based solution to deter-
R = {ω ∈ R2 | ||ω|| ≤ ω max } (2) mine the optimal motion of pursuers. In factor graphs, nodes
represent variables, such as a robot’s pose, while edges repre-
The goal of the pursuers is to capture the evader within a sent relationships between these variables, including kinematic
defined capturing radius r > 0. The pursuers win if the constraints and measurements. An optimization cost is defined
following condition is satisfied: based on minimizing the costs associated with the edges of
∃i ∈ Np , ||xpt i − xqt || ≤ r (3) 1 https://github.com/RajPShinde/Dynamic-Window-Approach
3

m o
f0 1 1
m o
f1 1 1
m o
f2 1 1
m o
f3 1 1 dynamic cost of the evader’s movements at time step t can be
computed using the following cost function:
dp dp dp 2
p
f1 1
p
f2 1
p
f3 1
p
eqd q q q
t (xt ) = ||xt − xt−1 ||Ωq , (5)
x0 1 x1 1 x2 1 x3 1 t

In Eq. (5), xqt represents the evader’s position at time t. The


mp
f0 1
mp
f3 1 cost function measures the prediction error of the relative trans-
dq
f0
dq
f1
dq
f2
formation between two consecutive evader poses, comparing
q
x0
q
x1
q
x2
q
x3 the predicted motion based on current pose estimates with the
m o
f1 2 1
m o
f2 2 1
actual odometry measurements. This error is weighted by the
mp
f0 2
mp
f3 2
uncertainty of those measurements.
m o dp dp dp m o
2) Dynamic Factor of Pursuers: Similar to the cost function
f0 2 1 f1 2 f2 2 f3 2 f3 2 1
p p p p of the dynamic factor for the evader, the cost function for the
x0 2 x1 2 x2 2 x3 2
dynamic factor of each pursuer can be defined as follows:
Fig. 1: The factor graph representation for the PE problem. Green p d 2
nodes represent the variables associated with the poses of the
et j (xpt ) = ||xpt − xpt−1 ||Ωp , (6)
t
pursuers. Red nodes are poses of evaders. Squares represent the
constraints between the variables, including motion models (e.g. where xpt represents the pursuer’s pose at time t.
f1dp1 ), measurements of the robots (e.g. f0mp1 ) or landmarks (e.g. 3) Measurement factor between each pursuer and evader:
f1m1 o1 ). The measurement factor between each pursuer and evader
compares the estimated pose of the evader with the measured
the factor graph. The optimization problem is formulated as value. The measurement factor can be defined as follows:
a maximum a posteriori (MAP) inference problem and is mpj 2
implemented using GTSAM [29]. The Levenberg–Marquardt et (xqt , xpt ) = ||x̂qt − h(xqt , xpt )||Ωpq , (7)
t

(LM) algorithm [30] is employed as the optimizer, providing where h represents the measurement function, and x̂qt is the
fast convergence. The factor graph representation of the PE evader’s estimated position at time t.
problem is adaptable and can be designed to enable the 4) Measurement factor between pursuers and obstacles:
pursuers to move within the search space to actively pursue Pursuers need to localize themselves in the scene, and as a
the evader. Fig. 1 illustrates the factor graph of the PE result, they will take measurements with respect to obstacles.
problem. Variables that need to be estimated or optimized are In this factor, the pose of each pursuer is compared against the
represented as nodes inside circles (green for pursuers and red estimated pursuer’s pose at each time step. The measurement
for the evader). xpt i denotes the pose of pursuer i at time t, factor compares the estimated pose of the evader and the
while xqt represents the pose of the evader at time t and T measured value. This measurement factor can be defined as
shows the current time step. follows:
The overall cost function can be defined as mj oi 2
et (xo , xpt ) = ||x̂pt − h(xo , xpt )||Ωqo , (8)
T  NP Np NP t
mp cp
X X X X
J(xq0:T , xp0:T ) = eqd
t + et j + where et j x̂pt
represents the estimated position of a pursuer at time
o
t=0 j=1 t, and x represents the position of obstacle o.
j=1 i=1
i̸=j
5) Planning factor for pursuers: As seen in Fig. 1, pursuers
Np No Np Np  can move around the search space. The goal function defined
X X op X p d
X p mov
+ et i j + emj oi + et j + et j
for the pursuers is to get as close to the evader as possible.
j=1 i=1 j=1 j=1
As mentioned earlier, the goal of the pursuer is to get as close
(4)
to the evader as possible. To achieve this goal, each pursuer
q
where, ||.|| denotes the Mahalanobis distance [31], and Ω attempts to move as close as possible to x̂t , noted with
qd
represents the information matrices of the variables. Here, et p mov q
et j
2
(x̂t , x̂pt ) = ||x̂qt − x̂pt ||Ωpq . (9)
represents the cost associated with the dynamic factor of the t
pj d
evader, while et denotes the dynamic factor cost of the 6) Collision Avoidance factor for pursuers: Since there
th is a possibility of collisions between pursuers, the collision
j pursuer. The measurement factor costs are denoted by
mpj o i pj mpj
et and et . Specifically, et indicates the measurement avoidance factor is defined as follows:
th
factor cost between the j pursuer and the evader, whereas
m o
et j i represents the measurement factor cost between the ith
 2
th p j mov ||1 − dp || ,

dp < ds
obstacle and the j pursuer. et represents the movement cp
et j (xpt ) = c1 Ωpt , (10)
th
cost for the j pursuer, minimizing the distance between the 
0, d ≥ d

evader’s estimated position and the j th pursuer. The terms p s
o p cp
et i j and et j correspond to the cost of the obstacle avoidance where the objective is to define a safety bubble for each
and the collision avoidance factor, respectively. The problem pursuer. The radius of the safety bubble is given by ds . If
consists of seven types of factors, described below: the distance between pursuers pi and pj is less than ds , the
d
1) Dynamic Factor of Evader: The evader aims to reach error term ||1 − c1p || is applied. Here, c1 is a constant, and dp
a predefined goal (G) by applying the DWA approach. The represents the distance between the two pursuers pi and pj .
4

7) Obstacle Avoidance factor for pursuers: To avoid col- The LM solver can be approximated by O(ϵe −2 ) [32] where
lisions with obstacles in the scene, pursuers use the obstacle O exhibits the existence of logarithmic factors in ϵ, and ϵ
e
avoidance factor. Similar to the collision avoidance factor, the represents an ϵ-stationary point. By considering m as solver
purpose of this factor is to create a safety bubble around the parameters and considering the post-optimization that takes
pursuers, preventing collisions with obstacles. e −2 (n2 +
place in O(I), the overall complexity would be O(nϵ
 2 m) + nI).
||1 − do || ,

do < ds
o i pj p
et (xt ) = c2 Ωot . (11)
 E. Convergence Analysis
0, do ≥ ds

To analyze the convergence of the proposed method, the
Here, do represents the distance between each moving pursuer inference method must be investigated, and a scenario that
and the obstacles in the scene, while c2 is a constant. always converges should be provided. Referring to Eq. (4),
The optimized cost function generates the path based on the
the cost function consists of nonlinear least square parts. It is
estimated position of pursuers (x̂pt ) and the evader (x̂qt ) at time
proven that MAP inference becomes the maximization of the
t.
product of all factors [33] as
Y
D. Time Complexity Analysis X MAP = arg max fi (Xi ) (12)
X
To analyze the complexity of the problem, the worst-case i
scenario is considered. The number of variables and factors where factors are in the following form
grows over time. The number of variables added at each time  
1 2
step is Np + Nq . To investigate the number of factors added at fi (Xi ) ∝ exp − ∥hi (Xi ) − zi ∥Σi (13)
2
each time step, the effect of each factor must be investigated.
The number of factors added at each time step for the dynamic In Eq. (13), Σi shows the covariance, zi shows the actual
factor of pursuers f dp , planning factor of pursuers f pmove , and measurement, and hi (Xi ) shows the measurement model.
measurement factor between pursuers and evader f mp is the Taking a negative likelihood of Eq. (13), the problem becomes
same and is Np . The number of factors added for the dynamic the minimization of non-linear factors:
factor of evader (f dq ) at each time step is Nq . The number X 2
X MAP = arg min ∥hi (Xi ) − zi ∥Ωi (14)
of factors added at each time step for collision avoidance X
N ×(N −1) i
factor is p 2 p . The number of factors added for the
measurement factor between pursuers and obstacles (f mo ) and The LM optimizer can solve Eq. (14) by liniarizing it as:
obstacle avoidance factor(f op ) is Np × No . ∆∗ = arg min ||A∆ − b||22 (15)
The overall complexity is ∆

In Eq. (15), A is the single measurement Jacobian, and b shows


Np × (Np − 1)
O((Np + Nq ) × n + ( + 2 × Np × No all the prediction errors in the form of zi − hi (Xt0 ).
2 As a simple scenario, if there are two pursuers, p1 and p2 , and
+ Np × Nq + Np + 1) × n), they observe the evader only once the factor graph simplifies
The first part, which represents the number of variables, grows as shown in Fig. 3. This graph has a tree structure, and it
linearly in O(n), while the second part, which represents the converges surely [34]. In the factor graph, the factor function
number of factors added at each time step, grows quadratically for the movements of pursuers is denoted as f1dp1 for p1 at
in O(n2 ). Thus, the overall complexity of the problem is in time-step 0, as shown in Fig. 3. The prior factor of the pursuer
O(n2 ). p is represented as f0p1 at time-step 0.
Variable elimination is an exact inference method.
350 f dp As an example, the probability function of variable xq3 can
# of Factors (×103)

f dq
300 f mo
be found by summing out all other variables in the joint
f mp distribution:
250
f cp
200 f pmov
X XX X
150 f op P (xq3 ) = ··· ···
Total
100 xq0 xq3 xp
0
1 p
x3 1
50 X X
0 ··· P (xq0 , . . . , xq3 , xp01 , . . . , xp31 , xp02 , . . . , xp32 )
0 500 1000 1500 2000 p
x0 2
p
x3 2
Time
(16)
Fig. 2: The number of different factors at each time step is shown
in the figure above. In this figure, f op and f mo are increasing at the In Eq. (16), all variables are marginalized out, except By xq3 .
same rate, while f dp , f mp , and f pmove are increasing at the same starting the elimination process in the graph shown in Fig. 3,
speed. we begin by eliminating node xp01 . The involved factors are
f0p1 , f0mp1 , and f1dp1 . After eliminating xp01 , two marginal
As shown in Fig. 2, the number of measurement factors factors are created.
between pursuers and obstacles and obstacle avoidance factors
increase faster compared with other factors and variables. F (xp01 , f1dp1 ) = f0p1 .f1dp1 (17)
5

dp
f1 1
dp
f2 1
dp
f3 1 In the above equation, J represents the Jacobian matrix of
p
x0 1
p
x1 1
p
x2 1
p
x3 1 the cost function, and λ is the damping factor that balances
between the gradient descent and Gauss-Newton methods.
mp
f0 1
By considering that the only factor existing in the problem
dq dq dq
f1 f2 f3
q q q q
is the dynamic factor, the uncertainty can be expressed as:
x0 x1 x2 x3

T
X
mp
f3 2 σT = σ0 + eqt − ∆LM (25)
dp dp dp
f1 2 f2 2 f3 2
i=0
p p p p
x0 2 x1 2 x2 2 x3 2
This equation represents the accumulated uncertainty. In
Fig. 3: A simplified factor graph with a tree structure is presented. In Eq. (25), σ0 denotes the uncertainty at time 0, and ∆LM
this scenario, there is only one measurement from each pursuer and represents the correction from the LM optimizer.
evader.
To minimize the uncertainty, a measurement must be ob-
tained to correct the estimations. When a measurement arrives
F (xp01 , f0mp1 ) = f0p1 .f0mp1 (18) from the pursuers, the pose of the evader can be estimated as
follows:
By marginalizing over xp01 ,
impacted factors are: mpj
xqt = xqt−1 − (J T J + λI)−1 J T et (26)
d
X
f1 p1 = F (xp01 , f1dp1 ) (19)
p
x0 1 Now, the Jacobian is updated, incorporating richer information,
which improves the conditioning of the problem. This leads to
X more accurate estimations and reduced uncertainty. Similarly,
f0mp1 = F (xp01 , f0mp1 ) (20) the dynamic factor of pursuers (f dp ) and the measurement
p
x0 1 factors between pursuers and landmarks (f mo ) can also be
In the next steps, we can remove the nodes representing the modeled, contributing to the reduction of uncertainty.
evader’s position, starting by eliminating the node xq0 . The
updated factor is then only f0dq , which can be computed as IV. R ESULTS
follows:
Experiments are conducted in two environments: (i) a simu-
lated environment and (ii) a real-world hardware environment.
F (xq0 , f0dq ) = f0q .f1dq (21) The performance analysis is based on the average distance
By marginalizing over xq0 ,impacted factors will be: traveled by pursuers, time, and the number of time steps.
Parameters are set based on the robot web paper by Muraiet
d
X
f1 q = F (xq0 , f1dq ) (22) al. [26].
xq0 • Dynamic factor: In this problem, the uncertainties for
each step along the x-axis and y-axis are denoted as
The elimination step, continues for nodes xq1 , xq2 , xq3 , andxp32
σx = 0.1 and σy = 0.1, respectively. Additionally, the
in the same manner. The starting point is not important since
uncertainty in orientation is σθ = 0.01.
there is no cycle in the tree. In conclusion, if the graph does
• Measurement factor: The uncertainty of each measure-
not have a cycle, it surely converges. Conversely, if it has a
ment factor is represented by σrange and σbearing . In
cycle, MAP inference converges most of the time [10].
this context, σrange typically refers to the uncertainty
associated with the distance measurement, while σbearing
F. Uncertainty minimization analysis represents the uncertainty in the angle or direction mea-
In this section, we investigate the effects of different factors surement. For the simulation, we set σrange = 10 and
on the uncertainty of predictions and the impact of measure- σbearing = 0.05 for both ftmo and ftmp .
ments on that uncertainty. If we consider the evader to be • Prior factor: Since agents are in motion, it is necessary
moving around the search space, the new pose of the evader to designate certain factors as prior factors to establish
can be assessed using the following equation: reference points and enhance measurement accuracy. In
this context, ’prior factors’ refer to fixed reference points
xqt = f (xqt , ∆xqt ) + Σqt (23) or landmarks.
In Eq. (23), f is the motion model function that moves the • Collision avoidance factor: In order to prevent pursuers
evader by ∆xqt . Given a sequence of odometry readings, from having collisions, this factor is utilized. The un-
the LM optimizer aims to minimize the sum of the squared certainties for each step along the x-axis and y-axis are
residuals between the predicted and actual robot poses as denoted as σx = 1 and σy = 0.1, respectively.
shown in Eq. (5) [30]. The update step is performed using • Obstacle avoidance factor: In order to prevent pursuers
the following equation: having collision with obstacles, this factor is utilized. The
uncertainties for each step along the x-axis and y-axis are
xqt = xqt−1 − (J T J + λI)−1 J T eqd
t (24) denoted as σx = 1 and σy = 0.1 respectively.
6

OURS 500 500


1400 OURS
PP
Evader G.T
Pursuer 1
PP
400
Avg Distance

1200 GUT
400 400 Pursuer 2

Time steps
GUT

Y Position

Y Position
CB Pursuer 3
CB
1000 300
Pursuer 4
300 300 Evader
800
600 200 200 Evader G.T 200
Pursuer 1
400 Pursuer 2
100 100 Pursuer 3 100
200 Pursuer 4
Evader
0 0 0 0
2:1 1.5:1 1.05:1 0.9:1 2:1 1.5:1 1.05:1 0.9:1 0 100 200 300 400 500 0 100 200 300 400 500
Speed Rates Vp : Vq Speed Rates Vp : Vq X Position X Position
Fig. 5: The effect of different trajectories on the estimation over 500
Fig. 4: Average distance traveled by pursuers (top) and time to capture time-steps has been investigated. The circle shows the start position
the evader (down) for different approaches. Vp and Vq show the rate of the robots. Pursuers try to get closer to the evader and catch it.
of pursuer speed in comparison to evader speed. PP and CB methods
are not able to catch the evader at all speed rates, in which no values 160
1 Pursuer 1 Pursuer
are reported for them. 300 2 Pursuers
140 2 Pursuers

Avg Distance
3 Pursuers

Time Steps
3 Pursuers
250 120
4 Pursuers 4 Pursuers
200 100
150 80
A. Simulation 60
100
40
The simulated environment is developed in a way that the 50 20
number of pursuers, evaders, and obstacles in the scene can be 0
1 0.5 0.33 0.25 0.2 0.16 0.14
0
1 0.5 0.33 0.25 0.2 0.16 0.14
modified. The simulated environment is developed in a 3500× Measurement Freq Measurement Freq
3500 window size. Additionally, the simulated environment
is capable of simulating different trajectories of the evader.
Fig. 6: Average distance traveled by pursuers and time to capture the
The simulation is implemented in C++ and OpenCV library is
evader comparison among different frequency of measurements and
utilised for visualization purpose. number of pursuers.
1) Comparison: Various approaches have been explored in
pursuit of strategies for capturing and chasing evaders. Among
these, Qin et al. [35] introduced a hierarchical network-based 2000 200
Avg Distanace

Time steps
model aimed at devising a cooperative pursuit strategy. 150
1500
To ensure a fair comparison, the initial poses and speeds 100
1000
of both the evader and pursuers are kept the same for the 50
proposed and compared methods. Two different variations of 500
0
[35] are considered: Constant Bearing (CB) and Pure Pursuit 0
2 4 8 16 32 64 2 4 8 16 32 64
(PP) strategies. In the CB strategy, the bearing angle between a # of Pursuers # of Pursuers
pursuer and the evader remains fixed until the time of capture. Fig. 7: Average distance traveled by pursuers (left) and time to capture
Conversely, in the PP strategy, the pursuer’s velocity vector the evader (right) for a different number of pursuers (Np ).
aligns with the line of sight. Our proposed method operates at
a frequency of 20 Hz on average, while the compared methods
operate at a frequency of 30 Hz on average. evader (Vp = 2Vq ).
As can be seen in Fig. 4, the proposed method captures the 3) Number of Robots: The proposed method allows for flex-
evader faster in all speed rates that were tested, and also the av- ible measurement frequency, so interactions between pursuers
erage distance traveled by pursuers is much less in our method. and the evader do not need to occur at every time step. In
In speed rates (Vp =1.05:Vq =1) and (Vp = 0.9 : Vq = 1) Fig. 6, the effect of the frequency of measurement when we
pursuers fail to catch the evader in PP and CB methods, while have different numbers of pursuers is exhibited. Decreasing the
our method can pass. PP was not able to catch the evader in frequency of measurement results in a longer time to capture
half of the experiments and CB was not able to catch the evader the evader.
in 75% of experiments. A key factor in our method is the rate As can be seen in Fig. 7, different number of robots have
of measurement coming from pursuers. In the experiments, been used to evaluate the impact on evader capture time
it is set to 0.2 Hz for our proposed method, while in other and the distance pursuer traveled. With each combination
methods it is 1 Hz. This means that the proposed method does tested 10 times, it was found that increasing the number of
not need to have measurements in all time steps and can lead pursuers reduces both the capture time and the average distance
to less memory consumption and computation overhead. traveled.
2) Trajectory: To investigate the effect of different trajec- 4) Uncertainty Ellipse: In this subsection, the effectiveness
tories on the performance of the proposed method, various of the robot’s planning for catching the evader and decreasing
trajectories were tested. As shown in Fig. 5, the pursuers the uncertainty is investigated. Fig. 8 illustrates that pursuers
demonstrate accurate position estimation and effective pursuit reduce uncertainty by adjusting their movement. During the
of the evader. In the figure, the initial positions of the pursuers experiment and the planning of the pursuers, when a measure-
are indicated by circles, and there are four pursuers in the ment is taken from the pursuers of the evader, the uncertainty
scene. Notably, the speed of the pursuers is twice that of the decreases. While other approaches lack consideration of uncer-
7

20 Ground Truth 1e8 1e7


Estimated Trajectory Planned Trajectory Planned Trajectory
1.0 3.5
15 Fixed Pursuers Fixed Pursuers
3.0

Y Position
10 0.8
2.5

Area

Area
0.6 2.0
5
0.4 1.5
0 1.0
0.2
−5 0.5
0.0 0.0
−10 −5 0 5 10 15 0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140
X Position Time Steps Time Steps
Fig. 8: The uncertainty minimization of the estimated position. The Fig. 10: The comparison of the uncertainty ellipse area for the
green line shows the ground truth and the black line shows the scenarios where the pursuers are planned and fixed, with no landmark
estimated position. The frequency of measurement from pursuers to in the scene (left), and with one landmark in the scene (right).
evader is 0.01 Hz and the blue ellipse shows the uncertainty ellipse
that is scaled by 1/1000.
300
250

Time steps
tainty in their predictions, the proposed method accounts for 200
it and also minimizes it. In Fig. 9, the area of the uncertainty 150
ellipse for different measurement frequencies is shown. As can 100
be seen, decreasing the measurement frequency results in an 50
increase in the area of uncertainty. The area is shown in log 0
scale in Fig 9 0.0 0.5 0.66 0.75 0.8
Msg Drop Fraction

6 Fig. 11: The average time to catch the evader when different fraction
10
4
of messages are dropped.
10
Area (log scale)

2
10
0
1.8 Evader
Pursuer 1
Y Position (m)

10 1.6 Pursuer 2

−2
10 1.4
f=0.5
−4 f=0.25 1.2
10 f=0.125 1.0
0 5 10 15 20 25 30 35 40 0.8
Time Steps 0.6

Fig. 9: The area of the uncertainty ellipse for different frequencies of 1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.8 3.0
measurements arriving from the pursuers. The X Position (m)
Fig. 12: The ground-truth trajectory of the evader is represented in
In Fig. 10, the effectiveness of the proposed planning red. The positions of the pursuers are shown in blue and green. The
method is illustrated in environments with either no obstacles initial position of pursuers is marked by triangles. The dashed circle
shows the catching criteria.
or a limited number of obstacles. As shown, when there are
no obstacles (left plot), the area of the uncertainty ellipse is
smaller compared to the scenario where the pursuers remain The localization of the robots was based on the odometry of
stationary. In the case where one obstacle is present (right plot), each robot, which increases the uncertainty and difficulty in
the reduction in the area of the uncertainty ellipse becomes localization and prediction. As illustrated in Fig. 12, although
even more pronounced. the pursuers are far from the evader and the localization is
5) Dropped messages: In this experiment, four pursuers are uncertain, they are able to catch the evader. The speed of the
placed randomly in the simulation environment for 30 trials at pursuers in this experiment is twice the speed of the evader
each fraction of dropped messages. As shown in Fig. 11, as (Vp = 2Vq ).
the frequency of dropped messages increases, the average time
step required to catch the evader also increases. Additionally,
V. C ONCLUSIONS
these results demonstrate the reliability of the proposed method
in handling dropped messages. In this paper, we propose a factor graph-based approach
to solve the pursuit-evasion (PE) problem, which is capable
of both tracking and planning within dynamic environments.
B. Hardware Real-World Simulation The pursuit-evasion problem is a critical challenge in robotics
To demonstrate that the provided solution is applicable in and multi-agent systems, where multiple pursuers must effec-
real-world scenarios, an experiment was designed to test it tively track and capture an evader. Our proposed approach is
in practical situations. For this experiment, TurtleBots were robust against dropped messages, which is essential for real-
utilized, and the development was conducted in ROS2. It time applications where communication may be unreliable.
shows the stability of the proposed solution even in real- Additionally, it is designed to be scalable to accommodate
world scenarios. Two pursuers and one evader were considered. different numbers of robots and obstacles, allowing it to be
8

applicable in various operational contexts. [10] F. Dellaert and M. Kaess, Factor Graphs for Robot Perception, ser.
To reduce optimization time, we utilize a single optimizer, Foundations and Trends in Robotics Series. Now Publishers, 2017.
[11] J. Kshirsagar, S. Shue, and J. M. Conrad, “A survey of implementation
which enhances scalability by reducing dimensionality and of multi-robot simultaneous localization and mapping,” in SoutheastCon
computational overhead. This design choice ensures that the 2018. IEEE, 2018, pp. 1–7.
algorithm remains efficient, even as the complexity of the [12] H. Hu, R. Hachiuma, H. Saito, Y. Takatsume, and H. Kajita, “Multi-
camera multi-person tracking and re-identification in an operating room,”
environment increases. Furthermore, we incorporate additional Journal of Imaging, vol. 8, no. 8, p. 219, 2022.
factors into the factor graph to add more constraints, which [13] R. Tarjan, “Depth-first search and linear graph algorithms,” SIAM journal
enhances the accuracy of the estimation and tracking processes. on computing, vol. 1, no. 2, pp. 146–160, 1972.
[14] A. Bundy and L. Wallen, “Breadth-first search,” Catalogue of artificial
This flexibility allows the system to adapt to changing condi- intelligence tools, pp. 13–13, 1984.
tions and maintain performance under various scenarios. [15] F. Yan, J. Jiang, K. Di, Y. Jiang, and Z. Hao, “Multiagent pursuit-
The proposed method is rigorously compared with state-of- evasion problem with the pursuers moving at uncertain speeds,” Journal
of Intelligent & Robotic Systems, vol. 95, pp. 119–135, 2019.
the-art techniques, demonstrating a significant reduction in [16] C. Yu, Y. Dong, Y. Li, and Y. Chen, “Distributed multi-agent deep
both the time required to capture the evader and the average reinforcement learning for cooperative multi-robot pursuit,” The Journal
distance traveled by the pursuers. The conducted hardware of Engineering, vol. 2020, no. 13, pp. 499–504, 2020.
[17] S. Engin, Q. Jiang, and V. Isler, “Learning to play pursuit-evasion with
experiments confirm the robustness of our approach, show- visibility constraints,” in 2021 IEEE/RSJ International Conference on
casing its effectiveness in real-world applications. Further- Intelligent Robots and Systems (IROS). IEEE, 2021, pp. 3858–3863.
more, scalability tests reveal that computation time decreases [18] J. Kuck, S. Chakraborty, H. Tang, R. Luo, J. Song, A. Sabharwal, and
S. Ermon, “Belief propagation neural networks,” Advances in Neural
even as environmental complexity increases, indicating that Information Processing Systems, vol. 33, pp. 667–678, 2020.
our method can handle more demanding situations without [19] J. Selvakumar and E. Bakolas, “Min–max q-learning for multi-player
sacrificing performance. pursuit-evasion games,” Neurocomputing, vol. 475, pp. 1–14, 2022.
[20] L. M. Kirousis and C. H. Papadimitriou, “Searching and pebbling,”
In the future, we aim to integrate Gaussian processes into Theoretical Computer Science, vol. 47, pp. 205–218, 1986.
the solution to enable distributed message passing, which will [21] M. V. Ramana and M. Kothari, “Pursuit-evasion games of high speed
further decrease the time needed to capture the evader. This evader,” Journal of intelligent & robotic systems, vol. 85, pp. 293–306,
2017.
integration could enhance the collaborative capabilities of the [22] A. Von Moll, D. W. Casbeer, E. Garcia, and D. Milutinović, “Pursuit-
robots, allowing them to share information more efficiently and evasion of an evader by multiple pursuers,” in 2018 International
respond dynamically to changes in the environment. Another Conference on Unmanned Aircraft Systems (ICUAS). IEEE, 2018, pp.
133–142.
potential avenue for future work could involve solving the [23] X. An, C. Wu, Y. Lin, M. Lin, T. Yoshinaga, and Y. Ji, “Multi-robot
pursuit-evasion problem in the presence of dynamic obstacles systems and cooperative object transport: Communications, platforms,
and multiple evaders, which would significantly add complex- and challenges,” IEEE Open Journal of the Computer Society, vol. 4,
pp. 23–36, 2023.
ity and realism to the pursuit-evasion scenario. By addressing [24] J. G. Martin, F. J. Muros, J. M. Maestre, and E. F. Camacho, “Multi-
these challenges, we aim to further improve the applicability robot task allocation clustering based on game theory,” Robotics and
of our approach in real-world situations where conditions are Autonomous Systems, vol. 161, p. 104314, 2023.
[25] M. Xie, A. Escontrela, and F. Dellaert, “A factor-graph approach
unpredictable and ever-changing. for optimization problems with dynamics constraints,” arXiv preprint
arXiv:2011.06194, 2020.
R EFERENCES [26] R. Murai, J. Ortiz, S. Saeedi, P. H. Kelly, and A. J. Davison, “A robot
web for distributed many-device localisation,” IEEE Transactions on
[1] J. P. Queralta, J. Taipalmaa, B. C. Pullinen, V. K. Sarker, T. N. Gia, Robotics, 2023.
H. Tenhunen, M. Gabbouj, J. Raitoharju, and T. Westerlund, “Collabo- [27] A. Patwardhan and A. J. Davison, “A distributed multi-robot framework
rative multi-robot search and rescue: Planning, coordination, perception, for exploration, information acquisition and consensus,” arXiv preprint
and active vision,” IEEE Access, vol. 8, pp. 191 617–191 643, 2020. arXiv:2310.01930, 2023.
[2] A. Patwardhan, R. Murai, and A. J. Davison, “Distributing collaborative [28] D. Fox, W. Burgard, and S. Thrun, “The dynamic window approach to
multi-robot planning with gaussian belief propagation,” IEEE Robotics collision avoidance,” IEEE Robotics & Automation Magazine, vol. 4,
and Automation Letters, vol. 8, no. 2, pp. 552–559, 2023. no. 1, pp. 23–33, 1997.
[3] C. Lytridis, V. G. Kaburlasos, T. Pachidis, M. Manios, E. Vrochidou, [29] F. Dellaert, “Factor graphs and GTSAM: A hands-on introduction,”
T. Kalampokas, and S. Chatzistamatis, “An overview of cooperative Georgia Institute of Technology, Tech. Rep, vol. 2, p. 4, 2012.
robotics in agriculture,” Agronomy, vol. 11, no. 9, p. 1818, 2021. [30] D. W. Marquardt, “An algorithm for least-squares estimation of non-
[4] H. A. Jaafar, C.-H. Kao, and S. Saeedi, “MR.CAP: Multi-Robot Joint linear parameters,” Journal of the society for Industrial and Applied
Control and Planning for Object Transport,” IEEE Control Systems Mathematics, vol. 11, no. 2, pp. 431–441, 1963.
Letters, vol. 8, pp. 139–144, 2024. [31] G. J. McLachlan, “Mahalanobis distance,” Resonance, vol. 4, no. 6, pp.
[5] J. Alonso-Mora, P. Beardsley, and R. Siegwart, “Cooperative collision 20–26, 1999.
avoidance for nonholonomic robots,” IEEE Transactions on Robotics, [32] E. H. Bergou, Y. Diouane, and V. Kungurtsev, “Convergence and
vol. 34, no. 2, pp. 404–420, 2018. complexity analysis of a levenberg–marquardt algorithm for inverse
[6] R. Ramaithitima, S. Srivastava, S. Bhattacharya, A. Speranzon, and problems,” Journal of Optimization Theory and Applications, vol. 185,
V. Kumar, “Hierarchical strategy synthesis for pursuit-evasion problems,” pp. 927–944, 2020.
in Proceedings of the Twenty-second European Conference on Artificial [33] F. Dellaert, “Factor graphs: Exploiting structure in robotics,” Annual
Intelligence, 2016, pp. 1370–1378. Review of Control, Robotics, and Autonomous Systems, vol. 4, no. 1, pp.
[7] Y. Kartal, K. Subbarao, A. Dogan, and F. Lewis, “Optimal game 141–166, 2021.
theoretic solution of the pursuit-evasion intercept problem using on- [34] D. Koller and N. Friedman, Probabilistic graphical models: principles
policy reinforcement learning,” International Journal of Robust and and techniques. MIT press, 2009.
Nonlinear Control, vol. 31, no. 16, pp. 7886–7903, 2021. [35] Q. Yang and R. Parasuraman, “Game-theoretic utility tree for multi-robot
[8] Z. Zhou and H. Xu, “Decentralized optimal large scale multi-player cooperative pursuit strategy,” in ISR Europe 2022; 54th International
pursuit-evasion strategies: A mean field game approach with reinforce- Symposium on Robotics. VDE, 2022, pp. 1–7.
ment learning,” Neurocomputing, vol. 484, pp. 46–58, 2022.
[9] Y. Ho, A. Bryson, and S. Baron, “Differential games and optimal pursuit-
evasion strategies,” IEEE Transactions on Automatic Control, vol. 10,
no. 4, pp. 385–389, 1965.

You might also like