Towards an ASP-Based Architecture for Autonomous UAVs in Dynamic Environments (Extended Abstract)
Abstract
Traditional AI reasoning techniques have been used successfully in many domains, including logistics,
scheduling and game playing. This paper is part of a project aimed at investigating how such techniques
can be extended to coordinate teams of unmanned aerial vehicles (UAVs) in dynamic environments. Particularly
challenging are real-world environments where UAVs and other network-enabled devices must communicate
to coordinate, and where communication actions are neither reliable nor free. Such network-centric
environments are common in military, public safety and commercial applications, yet most research (even
multi-agent planning) usually takes communications among distributed agents as a given. We address this
challenge by developing an agent architecture and reasoning algorithms based on Answer Set Programming
(ASP). Although ASP has been used successfully in a number of applications, to the best of our knowledge
this is the first practical application of a complete ASP-based agent architecture. It is also the first practical
application of ASP involving a combination of centralized reasoning, decentralized reasoning, execution
monitoring, and reasoning about network communications.
To appear in Theory and Practice of Logic Programming (TPLP).
1 Introduction
Unmanned Aerial Vehicles (UAVs) promise to revolutionize the way in which we use our airspace.
From talk of automating the navigation for major shipping companies to the use of small helicopters
as “deliverymen” that drop your packages at the door, it is clear that our airspaces will
become increasingly crowded in the near future. This increased utilization and congestion has
created the need for new and different methods of coordinating assets using the airspace. Currently,
airspace management is mostly the job of human controllers. As the number of entities
using the airspace, many of which are autonomous, vastly increases, the need for improved
autonomy techniques becomes evident.
The challenge in an environment full of UAVs is that the world is highly dynamic and the communications
environment is uncertain, making coordination difficult. Communicative actions in
such a setting are neither reliable nor free.
The work discussed here is in the context of the development of a novel application of network-
aware reasoning and of an intelligent mission-aware network layer to the problem of UAV coor-
dination. Typically, AI reasoning techniques do not consider realistic network models, nor does
the network layer reason dynamically about the needs of the mission plan. With network-aware
reasoning, a reasoner (either centralized or decentralized) factors in the communications network
and its conditions.
2 M. Balduccini, W. Regli and D. Nguyen
In this paper we provide a general overview of the approach, and then focus on the aspect
of network-aware reasoning. We address this challenge by developing an agent architecture and
reasoning algorithms based on Answer Set Programming (ASP, (Gelfond and Lifschitz 1991;
Marek and Truszczynski 1999; Baral 2003)). ASP has been chosen for this task because it enables
high flexibility of representation, both of knowledge and of reasoning tasks. Although ASP has
been used successfully in a number of applications, to the best of our knowledge this is the first
practical application of a complete ASP-based agent architecture. It is also the first practical
application of ASP involving a combination of centralized reasoning, decentralized reasoning,
execution monitoring, and reasoning about network communications.
The next section describes relevant systems and reasoning techniques, and is followed by a
motivating scenario that applies to UAV coordination. The Technical Approach section describes
network-aware reasoning and demonstrates the level of sophistication of the behavior exhibited
by the UAVs using an example problem instance. Finally, we draw conclusions and discuss future
work.
2 Related Work
Incorporating network properties into planning and decision-making has been investigated in (Usbeck
et al. 2012). The authors’ results indicate that plan execution effectiveness and performance
increase with greater network-awareness during the planning phase. The UAV coordination
approach in the current work combines network-awareness during the reasoning processes
with a plan-aware network layer.
The problem of mission planning for UAVs under communication constraints has been ad-
dressed in (Kopeikin et al. 2013), where an ad-hoc task allocation process is employed to engage
under-utilized UAVs as communication relays. In our work, we do not separate planning from the
engagement of under-utilized UAVs, and do not rely on ad-hoc, hard-wired behaviors. Our ap-
proach gives the planner more flexibility and finer-grained control of the actions that occur in the
plans, and allows for the emergence of sophisticated behaviors without the need to pre-specify
them.
The architecture adopted in this work is an evolution of (Balduccini and Gelfond 2008), which
can be viewed as an instantiation of the BDI agent model (Rao and Georgeff 1991; Wooldridge
2000). Here, the architecture has been extended to include a centralized mission planning phase,
and to reason about other agents’ behavior. Recent related work on logical theories of intentions
(Blount et al. 2014) can be further integrated into our approach to allow for a more systematic
hierarchical characterization of actions, which is likely to increase performance.
Traditionally, AI planning techniques have been used (to great success) to perform multi-agent
teaming and UAV coordination. Multi-agent teamwork decision frameworks such as the
ones described in (Pynadath and Tambe 2002) may factor communication costs into the decision-making.
However, the agents do not actively reason about other agents’ observed behavior, nor
about the communication process. Moreover, policies are used as opposed to online reasoning
about models of domains and of agent behavior.
The reasoning techniques used in the present work have already been successfully applied to
domains ranging from complex cyber-physical systems to workforce scheduling. To the best of
our knowledge, however, they have never been applied to domains combining realistic commu-
nications and multiple agents.
Finally, high-fidelity multi-agent simulators (e.g., AgentFly (David Sislak and Pechoucek
2012)) do not account for network dynamism nor provide a realistic network model. For this
reason, we base our simulator on the Common Open Research Emulator (CORE) (Ahrenholz
2010). CORE provides network models in which communications are neither reliable nor free.
3 Motivating Scenario
To motivate the need for network-aware reasoning and mission-aware networking, consider a
simple UAV coordination problem, depicted in Figure 2, in which two UAVs are tasked with
taking pictures of a set of three targets, and with relaying the information to a home base.
Fixed relay access points extend the communications range of the home base. The UAVs can
share images of the targets with each other and with the relays when they are within radio range.
The simplest solution to this problem consists in entirely disregarding the networking component
of the scenario, and generating a mission plan in which each UAV flies to a different set of targets,
takes pictures of them, and flies back to the home base, where the pictures are transferred. This
solution, however, is not satisfactory. First of all, it is inefficient, because it requires that the UAVs
fly all the way back to the home base before the images can be used. The time it takes for the
UAVs to fly back may easily render the images too outdated to be useful. Secondly, disregarding
the network during the reasoning process may lead to mission failure — especially in the case of
unexpected events, such as enemy forces blocking transit to and from the home base after a UAV
has reached a target. Even if the UAVs are capable of autonomous behavior, they will not be able
to complete the mission unless they take advantage of the network.
Another common solution consists of acknowledging the availability of the network, and as-
suming that the network is constantly available throughout plan execution. A corresponding mis-
sion plan would instruct each UAV to fly to a different set of targets, and take pictures of them,
while the network relays the data back to the home base. This solution is optimistic in that it
assumes that the radio range is sufficient to reach the area where the targets are located, and that
the relays will work correctly throughout the execution of the mission plan.
This optimistic solution is more efficient than the previous one, since the pictures are received
by the home base soon after they are taken. Under realistic conditions, however, the strong as-
sumptions it relies upon may easily lead to mission failure—for example, if the radio range does
not reach the area where the targets are located.
In this work, the reasoning processes take into account not only the presence of the network,
but also its configuration and characteristics, taking advantage of available resources whenever
possible. The mission planner is given information about the radio range of the relays and determines,
for example, that the targets are out of range. A possible mission plan constructed by taking this
information into account consists in having one UAV fly to the targets and take pictures, while
the other UAV remains in a position to act as a network bridge between the relays and the UAV
that is taking pictures. This solution is as efficient as the optimistic solution presented earlier, but
is more robust, because it does not rely on the same strong assumptions.
Conversely, when given a mission plan, an intelligent network middleware service capable
of sensing conditions and modifying network parameters (e.g., modify network routes, limit
bandwidth to certain applications, and prioritize network traffic) is able to adapt the network to
provide optimal communications needed during plan execution. A relay or UAV running such a
middleware is able to interrupt or limit bandwidth given to other applications to allow the other
UAV to transfer images and information toward home base. Without this traffic prioritization,
network capacity could be reached, preventing image transfer.
4 Technical Approach
In this section, we formulate the problem in more detail, discuss the design of the agent architecture
and of the reasoning modules, and demonstrate the sophistication of the resulting behavior
of the agents in two scenarios. We assume familiarity with ASP, and refer the reader to (Gelfond
and Lifschitz 1991; Niemelä and Simons 2000; Baral 2003) for an introduction to the topic.
1 For simplicity, we assume that all the radio nodes use comparable network devices, and thus that ρ is unique throughout
the environment.
2 The tasks in the various boxes are executed only when necessary.
the UAVs may move in and out of range of each other and of the other network nodes. Un-
expected events, such as relays failing or temporarily becoming disconnected, may also affect
network connectivity. When that happens, each UAV reasons in a decentralized, autonomous
fashion to overcome the issues. As mentioned earlier, the key to taking into account, and hope-
fully compensating for, any unexpected circumstances is to actively employ, in the reasoning
processes, realistic and up-to-date information about the communications state.
The control loop used by each UAV is shown in Figure 1b. In line with (Gelfond and Lifschitz
1991; Marek and Truszczynski 1999; Baral 2003), the loop and the I/O functions are implemented
procedurally, while the reasoning functions (Goal_Achieved, Unexpected_Observations,
Explain_Observations, Compute_Plan) are implemented in ASP. The loop takes as input the
mission goal and the mission plan, which potentially includes courses of action for multiple
UAVs. Functions New_Observations, Next_Action, Tail, Execute, and Record_Execution perform basic
manipulations of data structures, and interface the agent with the execution and perception
layers. Functions Next_Action and Tail are assumed to be capable of identifying the portions
of the mission plan that are relevant to the UAV executing the loop. The remaining functions
occurring in the control loop implement the reasoning tasks. Central to the architecture is the
maintenance of a history of past observations and actions executed by the agent. Such history is
stored in variable H and updated by the agent when it gathers observations about its environment
and when it performs actions. It is important to note that variable H is local to the specific agent
executing the loop, rather than shared among the UAVs (which would be highly unrealistic in a
communication-constrained environment). Thus, different agents will develop differing views of
the history of the environment as execution unfolds. At a minimum, the difference will be due to
the fact that agents cannot observe each other’s actions directly, but only their consequences, and
even those are affected by the partial observability of the environment.
Details on the control loop can be found in (Balduccini and Gelfond 2008). With respect to
that version of the loop, the control loop used in the present work does not allow for the selection
of a new goal at run-time, but it extends the earlier control loop with the ability to deal with,
and reason about, an externally-provided, multi-agent plan, and to reason about other agents’
behavior. We do not expect run-time selection of goals to be difficult to embed in the control
loop presented here, but doing so is out of the scope of the current phase of the project.
[Figure 1a, information flow. Recoverable labels: Observations, Explain, Observations + Explanations, Local Planner, Plan, Execute, Network Node 1 to Network Node k, Network State, Plan-Aware Networking Component, Networking Decisions.]

[Figure 1b, agent control loop:]
    P := M;
    H := New_Observations();
    while ¬Goal_Achieved(H, G) do
        if Unexpected_Observations(H) then
            H := Explain_Observations(H);
            P := Compute_Plan(G, H, P);
        end if
        A := Next_Action(P);
        P := Tail(P);
        Execute(A);
        H := Record_Execution(H, A);
        H := H ∪ New_Observations();
    loop
Fig. 1: (a) Information flow (left); (b) Agent Control Loop (right).
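As an illustration only, the control loop of Figure 1b can be sketched procedurally in Python. This is a minimal sketch, not the authors' implementation: the reasoning functions, which the paper implements in ASP, are passed in as plain callables, and the toy one-dimensional world, the record formats, and the `max_steps` guard are all assumptions introduced for this example.

```python
def control_loop(goal, mission_plan, new_observations, goal_achieved,
                 unexpected_observations, explain_observations,
                 compute_plan, execute, max_steps=100):
    """Procedural skeleton of the agent control loop (sketch)."""
    plan = list(mission_plan)                  # P := M
    history = frozenset(new_observations())    # H := New_Observations()
    step = 0
    while not goal_achieved(history, goal):
        if unexpected_observations(history):
            history = explain_observations(history)
            plan = compute_plan(goal, history, plan)
        if not plan or step >= max_steps:
            break                              # guard added for this sketch only
        action, plan = plan[0], plan[1:]       # A := Next_Action(P); P := Tail(P)
        execute(action)
        history = history | {("hpd", action, step)}        # Record_Execution
        history = history | frozenset(new_observations())  # H := H ∪ New_Observations()
        step += 1
    return history


def run_toy_mission():
    """Hypothetical world: one agent on a line that must reach position 3."""
    world = {"pos": 0}

    def observe():
        return {("obs", "at", world["pos"])}

    def execute(action):
        if action == "move":
            world["pos"] += 1

    return control_loop(
        goal=3,
        mission_plan=["move", "move", "move"],
        new_observations=observe,
        goal_achieved=lambda h, g: ("obs", "at", g) in h,
        unexpected_observations=lambda h: False,   # nothing surprising in this toy run
        explain_observations=lambda h: h,
        compute_plan=lambda g, h, p: p,
        execute=execute,
    )
```

The history is local to the function call, mirroring the point that H is kept by each agent rather than shared among the UAVs.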
atom h(f, s). If f is false, this is expressed by ¬h(f, s). The occurrence of an action a ∈ A at
step s is represented as o(a, s).
The history of the environment is formalized in ASP by two types of statements: obs(f, true, s)
states that f was observed to be true at step s (respectively, obs(f, false, s) states that f was false);
hpd(a, s) states that a was observed to occur at s. Because in this paper other agents’ actions
are not observable, the latter expression is used only to record an agent’s own actions.
Objects in the UAV domain discussed in this paper are the home base, a set of fixed relays,
a set of UAVs, a set of targets, and a set of waypoints. The waypoints are used to simplify the
path-planning task, which we do not consider in the present work. The locations that the UAVs
can occupy and travel to are the home base, the waypoints, and the locations of targets and fixed
relays. The current location, l, of UAV u is represented by a fluent at(u, l). For each location,
the collection of its neighbors is defined by relation next(l, l′). UAV motion is restricted to occur
only from a location to a neighboring one. The direct effect of action move(u, l), intuitively
stating that UAV u moves to location l, and its executability condition are described by the
following rules:
h(at(U, L2), S + 1) ←
    o(move(U, L2), S),
    h(at(U, L1), S),
    next(L1, L2).
← o(move(U, L2), S),
    h(at(U, L1), S),
    not next(L1, L2).
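To make the intended reading of these two rules concrete, the following Python sketch applies a move to a simple state. It is an illustration under stated assumptions, not part of the ASP encoding: the state is a dictionary from UAV to location, the small map in `NEXT` is hypothetical, and a violated executability condition is reported with an exception rather than by eliminating an answer set.

```python
# Hypothetical neighbor relation next(l, l'), listed in both directions.
NEXT = {("home", "w1"), ("w1", "home"), ("w1", "t1"), ("t1", "w1")}


def apply_move(state, uav, dest, next_rel=NEXT):
    """Return the successor state after move(uav, dest).

    Mirrors the direct effect (the UAV is at dest at the next step) and the
    executability condition (dest must be a neighbor of the current location).
    """
    src = state[uav]
    if (src, dest) not in next_rel:
        # Corresponds to the constraint: the move is not executable.
        raise ValueError(f"{uav} cannot move from {src} to {dest}: not neighbors")
    new_state = dict(state)      # other UAVs keep their locations
    new_state[uav] = dest
    return new_state
```

For instance, `apply_move({"u1": "home"}, "u1", "w1")` places u1 at w1, while a direct move from home to t1 is rejected because the two locations are not neighbors in the assumed map.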
The fact that two radio nodes are in radio contact is encoded by fluent in_contact(r1, r2). The
next two rules provide a recursive definition of the fluent, represented by means of state constraints:
h(in_contact(R1, R2), S) ←
    R1 ≠ R2, ¬h(down(R1), S), ¬h(down(R2), S),
    h(at(R1, L1), S), h(at(R2, L2), S), range(Rg), dist2(L1, L2, D), D ≤ Rg².
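A small Python sketch may help show what such a recursive definition computes. The base case follows the rule above (two distinct nodes that are both up and whose squared distance is at most the squared range). The recursive case is assumed here to be the usual transitive closure over intermediate nodes; that assumption, the coordinate representation of locations, and the function names are introduced for illustration only.

```python
from itertools import combinations


def direct_contact(positions, down, radio_range):
    """Pairs of distinct, working nodes within radio range of each other."""
    pairs = set()
    for r1, r2 in combinations(sorted(positions), 2):
        if r1 in down or r2 in down:
            continue
        (x1, y1), (x2, y2) = positions[r1], positions[r2]
        dist2 = (x1 - x2) ** 2 + (y1 - y2) ** 2     # squared distance, as in dist2
        if dist2 <= radio_range ** 2:
            pairs.add((r1, r2))
            pairs.add((r2, r1))
    return pairs


def in_contact(positions, down, radio_range):
    """Direct contact closed under relaying through intermediate nodes (assumed)."""
    contact = direct_contact(positions, down, radio_range)
    changed = True
    while changed:
        changed = False
        for a, b in list(contact):
            for c, d in list(contact):
                if b == c and a != d and (a, d) not in contact:
                    contact.add((a, d))
                    changed = True
    return contact
```

With a home base at (0, 0), a relay at (5, 0), a UAV at (10, 0) and a range of 7 grid units, the UAV and the home base are not in direct contact but are in contact through the relay; if the relay is down, no pair is in contact.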
plan (for example, if the UAV malfunctions or is destroyed). Normally, the reasoning agent will
expect a UAV that aborts execution to remain in its latest location. In certain circumstances,
however, a UAV may need to deviate completely from the mission plan. To accommodate
this situation, the agent may hypothesize that a UAV began behaving in an unpredictable way
(from the agent’s point of view) after aborting plan execution. The following choice rule allows
an agent to consider all of the possible explanations:
{ hpd(break(R), S), hpd(aborted(U, S)), hpd(unpredictable(U, S)) }.
A constraint ensures that unpredictable behavior can be considered only if a UAV is believed to
have aborted the plan. If that happens, the following choice rule is used to consider all possible
courses of action from the moment the UAV became unpredictable to the current time step.
{ hpd(move(U, L), S′) : S′ ≥ S : S′ < currstep } ← hpd(unpredictable(U, S)).
In practice, such a thought process is important to enable coordination with other UAVs when
communications between them are impossible, and to determine the side-effects of the inferred
courses of action and potentially take advantage of them (e.g., “the UAV must have flown by
target t3. Hence, it is no longer necessary to take a picture of t3”). A minimize statement ensures
that only cardinality-minimal diagnoses are found:
#minimize[hpd(break(R), S), hpd(aborted(U, S)), hpd(unpredictable(U, S))].
An additional effect of this statement is that the reasoning agent will prefer simpler explanations,
which assume that a UAV aborted the execution of the mission plan and stopped, over
those hypothesizing that the UAV engaged in an unpredictable course of action.
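The effect of combining a choice rule with a cardinality-minimizing statement can be illustrated with a brute-force Python sketch. It is a simplification under stated assumptions: only relay failures are hypothesized (not aborted or unpredictable UAVs), time steps are ignored, the network is a hypothetical fixed set of links, and a hypothesis counts as an explanation when the nodes reachable by the observer match what was observed.

```python
from itertools import combinations


def reachable(links, start, broken):
    """Nodes reachable from start over undirected links, skipping broken nodes."""
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for a, b in links:
            for src, dst in ((a, b), (b, a)):
                if src == node and dst not in seen and dst not in broken:
                    seen.add(dst)
                    frontier.append(dst)
    return seen


def minimal_diagnoses(links, relays, observer, observed_reachable):
    """All smallest sets of broken relays consistent with the observations."""
    for size in range(len(relays) + 1):          # try smaller hypotheses first
        found = [set(c) for c in combinations(sorted(relays), size)
                 if reachable(links, observer, set(c)) == observed_reachable]
        if found:
            return found                          # cardinality-minimal explanations
    return []
```

With links u2-r1, r1-r5, r1-r6, r5-home and r6-home, an observer u2 that can reach only r1 obtains the single minimal diagnosis {r5, r6}. An ASP solver reaches the same kind of answer declaratively, without enumerating subsets explicitly.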
Function Compute_Plan, like the mission planner, computes a new plan using a rather
traditional approach, which relies on a choice rule for generation of candidate sequences of actions,
constraints to ensure the goal is achieved, and minimize statements to ensure optimality of
the plan with respect to the given metrics.
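The generate, constrain and minimize pattern described above can be mimicked in a few lines of Python. This is a sketch under strong simplifying assumptions, not the planner used in the paper: it handles a single UAV, the only action is a move to a neighboring location, the goal is to reach one location, the metric is plan length, and the map in the usage note is hypothetical.

```python
from itertools import product


def compute_plan(neighbors, start, goal, max_len=6):
    """Shortest sequence of moves from start to goal, or None if none is found.

    Candidate generation plays the role of the choice rule, the executability
    and goal checks play the role of the constraints, and trying shorter
    lengths first plays the role of the minimize statement.
    """
    locations = sorted(neighbors)
    for length in range(max_len + 1):
        for candidate in product(locations, repeat=length):
            here = start
            executable = True
            for dest in candidate:
                if dest not in neighbors[here]:   # move must go to a neighbor
                    executable = False
                    break
                here = dest
            if executable and here == goal:       # goal constraint
                return list(candidate)
    return None
```

For a map where home neighbors w1, w1 neighbors w2, and w2 neighbors t1, the call `compute_plan(neighbors, "home", "t1")` yields the three-move plan through w1 and w2. The exhaustive enumeration is exponential in the plan length and is meant only to show the structure of the search.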
Next, we outline a scenario demonstrating the features of our approach, including the ability
to work around unexpected problems autonomously.
Example Instance. Consider the environment shown in Figure 2. Two UAVs, u1 and u2, are
initially located at the home base in the lower left corner. The home base, relays and targets are
positioned as shown in the figure, and the radio range is set to 7 grid units.
The mission planner finds a plan in which the UAVs begin by traveling toward the targets.
While u1 visits the first two targets, u2 positions itself so as to be in radio contact with u1 (Fig-
ure 2a). Upon receipt of the pictures, u2 moves within range of the relays to transmit the pictures
to the home base. At the same time, u1 flies toward the final target, where it will be reached by
u2 to exchange the final picture.
Now let us consider the impact of unexpected events during mission execution: while u2 is
flying back to re-connect with the relays, it observes (“Observe” step of the architecture from
Figure 1) that the home base is unexpectedly not in radio contact (Figure 2b). Hence, u2 uses
the available observations to determine plausible causes (“Explain” step of the architecture). In
this instance, u2 observes that relays r5 , r6 , r7 and all the network nodes South of them are not
reachable via the network. Based on knowledge of the layout of the network, u2 determines that
the simplest plausible explanation is that those three relays must have stopped working while
u2 was out of radio contact (e.g., started malfunctioning or have been destroyed). Next, u2 re-
plans (“Local Planner” step of the architecture). The plan is created based on the assumption that
[Fig. 2: Relay node failure between steps 5 and 6 forces the UAVs to re-plan. (a) Step 5: u1 transmitting to u2. (b) Step 6: Nodes 5-7 have failed. (c) Step 7: u2 re-plans, moves closer to home base.]
u1 will continue executing the mission plan. This assumption can be later withdrawn if obser-
vations prove it false. Following the new plan, u2 moves further South towards the home base
(Figure 2c). Simultaneously, u1 continues with the execution of the mission plan, unaware that
the connectivity has changed and that u2 has deviated from the mission plan. After successfully
relaying the pictures to the home base, u2 moves back towards u1 . UAV u1 , on the other hand,
reaches the expected rendezvous point, and observes that u2 is not where expected (Figure 2d).
UAV u1 does not know the actual position of u2, but its absence is evidence that u2 must have
deviated from the plan at some point in time. Thus, u1 must now replan. Not knowing u2’s
state, u1’s plan is to fly South to relay the missing picture to the home base on its own. This
plan still does not deal with the unavailability of r5 , r6 , r7 , since u1 has not yet had a chance to
get in radio contact with the relays and observe the current network connectivity state. The two
UAVs continue with the execution of their new plans and eventually meet, unexpectedly for both
(Figure 2e). At that point, they automatically share the final picture. Both now determine that the
mission can be completed by flying South past the failed relays, and execute the corresponding
actions.
References
AHRENHOLZ, J. 2010. Comparison of CORE network emulation platforms. In IEEE Military Communications Conf.
BALDUCCINI, M. AND GELFOND, M. 2003. Diagnostic reasoning with A-Prolog. Journal of Theory and
Practice of Logic Programming (TPLP) 3, 4–5 (Jul), 425–461.
BALDUCCINI, M. AND GELFOND, M. 2008. The AAA Architecture: An Overview. In AAAI Spring
Symp.: Architectures for Intelligent Theory-Based Agents.
BARAL, C. 2003. Knowledge Representation, Reasoning, and Declarative Problem Solving. Cambridge
University Press.
BARAL, C. AND GELFOND, M. 2000. Reasoning Agents In Dynamic Domains. In Workshop on Logic-Based
Artificial Intelligence. Kluwer Academic Publishers, 257–279.
BLOUNT, J., GELFOND, M., AND BALDUCCINI, M. 2014. Towards a Theory of Intentional Agents. In
Knowledge Representation and Reasoning in Robotics. AAAI Spring Symp. Series.
DAVID SISLAK, PREMYSL VOLF, S. K. AND PECHOUCEK, M. 2012. AgentFly: Scalable, High-Fidelity
Framework for Simulation, Planning and Collision Avoidance of Multiple UAVs. Wiley Inc., Chapter 9,
235–264.
GELFOND, M. AND LIFSCHITZ, V. 1991. Classical Negation in Logic Programs and Disjunctive
Databases. New Generation Computing 9, 365–385.
GELFOND, M. AND LIFSCHITZ, V. 1993. Representing Action and Change by Logic Programs. Journal
of Logic Programming 17, 2–4, 301–321.
KOPEIKIN, A. N., PONDA, S. S., JOHNSON, L. B., AND HOW, J. P. 2013. Dynamic Mission Planning for
Communication Control in Multiple Unmanned Aircraft Teams. Unmanned Systems 1, 1, 41–58.
MAREK, V. W. AND TRUSZCZYNSKI, M. 1999. The Logic Programming Paradigm: a 25-Year Perspective.
Springer Verlag, Berlin, Chapter Stable Models and an Alternative Logic Programming Paradigm, 375–398.
MCCARTHY, J. 1998. Elaboration Tolerance.
NIEMELÄ, I. AND SIMONS, P. 2000. Logic-Based Artificial Intelligence. Kluwer Academic Publishers,
Chapter Extending the Smodels System with Cardinality and Weight Constraints.
PYNADATH, D. V. AND TAMBE, M. 2002. The Communicative Multiagent Team Decision Problem:
Analyzing Teamwork Theories and Models. JAIR 16, 389–423.
RAO, A. S. AND GEORGEFF, M. P. 1991. Modeling Rational Agents within a BDI-Architecture. In Proc.
of the Int’l Conf. on Principles of Knowledge Representation and Reasoning.
USBECK, K., CLEVELAND, J., AND REGLI, W. C. 2012. Network-centric IED detection planning.
IJIDSS 5, 1, 44–74.
WOOLDRIDGE, M. 2000. Reasoning about Rational Agents. MIT Press.