Flow-Based Synthesis of Reactive Tests for Discrete Decision-Making Systems with Temporal Logic Specifications

Josefine B. Graebener, Apurva S. Badithela, Denizalp Goktas, Wyatt Ubellacker, Eric V. Mazumdar, Aaron D. Ames, Richard M. Murray

This work was supported in part by the U.S. Air Force Office of Scientific Research (AFOSR) under Grant FA9550-22-1-0333 and Grant FA9550-19-1-0302.
* These authors contributed equally. Corresponding author: A.S. Badithela.
J.B. Graebener is with the Graduate Aerospace Laboratories of California Institute of Technology, Pasadena CA 91125 USA (e-mail: [email protected]).
A.S. Badithela, W. Ubellacker, A.D. Ames, R.M. Murray are affiliated with Control and Dynamical Systems, California Institute of Technology, Pasadena CA 91125 USA (e-mail: {apurva, wubellac, ames, murray}@caltech.edu).
D. Goktas is with the Department of Computer Science, Brown University, Providence RI 02912 USA (e-mail: [email protected]).
E.V. Mazumdar is with the Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena CA 91125 USA (e-mail: [email protected]).

Abstract

Designing tests to evaluate if a given autonomous system satisfies complex specifications is challenging due to the complexity of these systems. This work proposes a flow-based approach for reactive test synthesis from temporal logic specifications, enabling the synthesis of test environments consisting of static and reactive obstacles and dynamic test agents. The temporal logic specifications describe desired test behavior, including system requirements as well as a test objective that is not revealed to the system. The synthesized test strategy places restrictions on system actions in reaction to the system state. The tests are minimally restrictive and accomplish the test objective while ensuring realizability of the system’s objective without aiding it (semi-cooperative setting). Automata theory and flow networks are leveraged to formulate a mixed-integer linear program (MILP) to synthesize the test strategy. For a dynamic test agent, the agent strategy is synthesized for a GR(1) specification constructed from the solution of the MILP. If the specification is unrealizable by the dynamics of the test agent, a counterexample-guided approach is used to resolve the MILP until a strategy is found. This flow-based, reactive test synthesis is conducted offline and is agnostic to the system controller. Finally, the resulting test strategy is demonstrated in simulation and experimentally on a pair of quadrupedal robots for a variety of specifications.

Index Terms:
Test and Evaluation, Reactive Test Synthesis, Formal Methods, Network Flows, Optimization

I Introduction

Figure 1: Overview of the flow-based test synthesis framework which consists of three key parts: i) graph construction, ii) routing optimization, and iii) test environment synthesis (e.g., reactive test strategy / test agent strategy, static obstacles).

Safety is imperative for a wide range of autonomous systems, from self-driving vehicles, to autonomous flight and space missions, to assistive robotics, and medical devices. To ensure safety, various challenges need to be addressed [1]. For example, these systems need to be aware of their own state and adapt their behavior in response to the environment, which requires reasoning over both discrete and continuous inputs and states. Deployment of these safety-critical autonomous systems requires thorough testing, both in simulation and in the operating environment, which is crucial to validating the system’s performance. Typically, test cases are designed to uncover bugs and corner cases in the system design that lead to safety-critical errors. However, for these tests to be successful, executing them requires setting up a test environment that is consistent with the test case while also allowing for correct system implementations. To make this process efficient, it is equally important to automatically synthesize these test environments for the desired test case, i.e., automatically synthesize test environments that reveal corner cases.

In this work, we focus on synthesizing test environments (e.g., placement of obstacles, agent strategies) to test the discrete decision-making logic in an autonomous system. Tests are synthesized from temporal logic descriptions of the desired test behavior, which encode aspects of the test unknown to the system (test objective) in addition to the system requirements (system objective). The test objective is meant to capture the “challenging” aspect of the test in terms of high-level decision-making, and is not revealed to the system. The purpose of testing is to check that the system can make correct decisions despite being given opportunities to fail, i.e., to verify correct decision-making in the presence of “hard tests” or corner cases. Our framework routes the system to the test objective while also giving the system freedom to make decisions and ensuring that the test is fair (i.e., the system can satisfy its requirements if it makes correct decisions). Therefore, we synthesize tests that minimally restrict the system’s decision-making to realize the desired test behavior. Fig. 1 provides an overview of the flow-based test synthesis framework.

I-A Background on Test and Evaluation

Tests are typically manually designed by test engineers — identifying challenging test cases and manually constructing the test environments either from expert experience or failure reports. Examples of this include the qualification tests in the DARPA Urban Challenge, track testing by self-driving car companies [2, 3], and constructing test scenarios in simulation using tools such as CARLA [4] or Scenic [5], for which test engineers either partially specify the scenarios or recreate them from crash reports [6, 7, 3]. Due to the time-intensive nature of this endeavor, automatically finding challenging tests for safety-critical systems is an active area of research [8, 9]. For self-driving vehicles, there is ongoing effort to standardize the testing process [10, 11].

Black-box optimization algorithms [12] and reinforcement learning [13, 14] have been used to search over a specified input domain to find a falsifying input that leads to a trajectory that violates a metric of mission success. This metric can be derived from formal temporal logic specifications [15, 16, 17, 18, 19, 20, 21] or from control barrier functions [22]. However, falsification algorithms typically require a well-defined test environment, and find a falsifying trace by fine-tuning the parameters in that scenario. The framework proposed in this paper is complementary to these approaches — our focus is on synthesizing high-level strategies for the test environment, and continuous parameters of the synthesized test environment (e.g., continuous pose values of test agents, friction coefficients, exact timing of events) can be inputs to falsification algorithms for fine-tuning.

Typically, high-level choices of autonomous robotic systems exhibit discrete decision-making [23, 24]. The use of linear temporal logic (LTL) model checkers for testing has been explored in [25, 26, 27, 28]. In these works, counterexamples from model-checking are used to construct test cases for deterministic systems and are inconclusive if the system behavior deviates from the expected test case. However, since robotic systems are often reactive, and because we want to generate tests without specific knowledge of the system controller, the generated tests must be able to adapt or react to system behavior at runtime. Our test synthesis procedure is gray-box in the sense that it requires knowledge of a nondeterministic model of the system but is agnostic to the high-level controller of the system and is completely black-box to models and controllers at lower levels of abstraction.

Adaptive specification-based testing using discrete logics has been explored in [29, 30, 31, 32, 33]. Particularly in [29], an adaptive test strategy is synthesized using reactive synthesis [34] from LTL specifications of the system and the fault model, both of which are specified by the test engineer. This adaptive test strategy ensures that the resulting test trace demonstrates a fault if the system implementation is faulty according to the fault model. However, these fault models must be carefully specified over the outputs of the system. While this is incredibly useful for specifying and catching sub-system level faults, it becomes intractable for specifying complex system-level faults resulting from multiple outputs. Our test synthesis framework is also specification-based and adaptive, but we specify desired test behavior in the form of test objectives instead of specifying system-level faults. Furthermore, in [29], the adaptive test strategies are synthesized from fault models that are designed for coverage goals corresponding to specification coverage, without accounting for the freedom of the system to satisfy its own requirements. We seek to synthesize reactive test strategies that demonstrate the test objective while placing minimal restrictions on the system. The automata-theoretic tools used in this paper build on concepts used in correct-by-construction synthesis and model checking [35, 36]. This background is covered in Section II.

In [37], testing of reactive systems was introduced as a game between two players, where the tester and the system try to reveal and hide faults, respectively. Similarly, in [38] the test strategy is found by reasoning over a game graph to optimize reachability and coverage metrics. Testing in cooperative game settings has been explored in [39, 40]. However, the reactive test synthesis problem we consider is neither fully adversarial nor fully cooperative — a well-designed system is cooperative with the test environment in realizing the system objective, but since the system is agnostic to test objective, it need not cooperate with the test environment in realizing it.

We consider test environments that can consist of the following: static obstacles that restrict the system throughout the test, reactive obstacles, and a dynamic test agent that is reactive to system behavior at runtime. In particular, we leverage flow networks to pose the test synthesis problem as a mixed-integer linear program (MILP). In recent years, network flow optimization frameworks with tight convex relaxations have led to massive computational speed-ups in solving robot motion planning problems [41, 42]. Network flow-based mixed-integer programs have also been used to synthesize playable game levels in video games [43], an approach that was then applied to construct playable scenarios in robotics settings [44].

In previous work [45], we formulated this problem in a semi-cooperative setting as a min-max Stackelberg game with coupled constraints. Despite being defined over continuous variables with an affine objective and affine constraints, the prior formulation resulted in slow runtimes and did not guarantee that the optimal solution would realize the test objective. Furthermore, it could only reactively restrict system actions, and did not characterize how to translate these restrictions to the choice of a test agent strategy. In this work, we present a simpler formulation of the routing optimization as an MILP, which led to an improvement in runtime. Test strategies from optimal solutions of the MILP are guaranteed to realize the test objective in a least restrictive fashion. Finally, we present a formal approach to restricting system actions in the form of static/reactive obstacles and dynamic test agent strategies, including a counterexample-guided approach to synthesize a test agent strategy from the solution of the routing optimization.

I-B Contributions

In this work, we study the problem of synthesizing a reactive test strategy for a test environment for discrete decision-making systems given a formal test objective, unknown to the system under test. In particular, we ask whether such a test strategy exists without making it impossible for the system to meet its specification.

To obtain the main results of this paper, we first characterize system and test objectives using a variety of specification patterns commonly used in robotic missions [46]. We formalize both the restrictiveness and the feasibility of a test strategy, i.e., a system should have freedom to make decisions and a correct system should be able to pass the test. Second, these conditions are translated into a routing optimization on a flow network to capture the requirement that all test executions that satisfy the system objective should demonstrate the test objective. For each test environment, we set up an MILP to find cuts on the flow network, which correspond to restrictions on system actions. For static and reactive obstacles, the solution of the MILP is realized in the form of an obstacle placement test strategy. We prove that the optimal solutions to the MILPs solve the aforementioned routing requirement. Third, in the case of dynamic agents, we match the restrictions on system actions to a test agent strategy via GR(1) synthesis [35, 47]. Furthermore, we use a counterexample-guided approach to exclude unrealizable solutions from the MILP until we find a realizable test agent strategy. We prove that test agent strategies synthesized in this manner exactly correspond to the test strategy found from the MILP. In the extended version, we prove that the routing problem is NP-hard via a reduction from 3-SAT. Despite this, our framework can reliably handle medium-sized problems with thousands of integer variables. Empirical runtimes for parametrized problems are also provided.

Finally, the test synthesis framework is demonstrated on simulated grid world settings and on hardware with a pair of quadrupedal robots. For all experiments, our framework synthesizes test strategies that place the fewest possible restrictions on the system over the course of the test either by obstacle placement or a dynamic agent. In experiments with reactive obstacles and dynamic agents, the reactive test strategy results in a different test execution depending on system behavior. Despite this, the system is always routed through the test objective (e.g., being put in low-fuel state or having to walk over challenging terrain).

II Preliminaries

This section introduces concepts from automata theory and network flows that are relevant to this work.

II-A Automata Theory and Temporal Logic

Definition 1 (Finite Transition System).

A finite transition system (FTS) is the tuple

$$TS \coloneqq (S, A, \delta, S_0, AP, L),$$

where $S$ denotes a finite set of states, $A$ is a finite set of actions, $\delta: S \times A \rightarrow S$ the transition relation, $S_0$ the set of initial states, $AP$ the set of atomic propositions, and $L: S \rightarrow 2^{AP}$ denotes the labeling function. We denote the transitions in $TS$ as $TS.E := \{(s, s') \in S \times S \mid \exists a \in A \text{ s.t. } \delta(s, a) = s'\}$. We refer to the states of $TS$ as $TS.S$, and similarly denote the other elements of the tuple. An execution $\sigma$ is an infinite sequence $\sigma = s_0 s_1 \dots$, where $s_0 \in S_0$ and $s_k \in S$ is the state at time $k$. We denote the finite prefix of the trace $\sigma$ up to the current time $k$ as $\sigma_k$. A strategy $\pi$ is a function $\pi: (TS.S)^{*}\, TS.S \rightarrow TS.A$.
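
As an illustration of Definition 1, the following is a minimal Python sketch of a finite transition system and its edge set $TS.E$. The class name `FiniteTransitionSystem`, its field names, and the three-cell `corridor` example are assumptions made for this sketch and do not come from the paper. The transition map is stored as a partial function, so a missing `(state, action)` key is read as "action unavailable in that state".

```python
from dataclasses import dataclass


@dataclass
class FiniteTransitionSystem:
    """Container mirroring the tuple (S, A, delta, S0, AP, L)."""
    states: set
    actions: set
    delta: dict      # (state, action) -> next state; absent keys mean the action is unavailable
    initial: set
    atomic_props: set
    label: dict      # state -> set of atomic propositions that hold in that state

    def edges(self):
        """TS.E: all pairs (s, s') for which some action a gives delta(s, a) = s'."""
        return {(s, s_next) for (s, _action), s_next in self.delta.items()}

    def is_prefix(self, prefix):
        """Check that a finite state sequence starts in S0 and only follows edges of TS."""
        if not prefix or prefix[0] not in self.initial:
            return False
        allowed = self.edges()
        return all(step in allowed for step in zip(prefix, prefix[1:]))


# Hypothetical three-cell corridor: c0 <-> c1 -> goal, where goal is terminal.
corridor = FiniteTransitionSystem(
    states={"c0", "c1", "goal"},
    actions={"left", "right", "stay"},
    delta={
        ("c0", "stay"): "c0",
        ("c0", "right"): "c1",
        ("c1", "stay"): "c1",
        ("c1", "left"): "c0",
        ("c1", "right"): "goal",
    },
    initial={"c0"},
    atomic_props={"goal"},
    label={"c0": set(), "c1": set(), "goal": {"goal"}},
)
```

Calling `corridor.is_prefix(["c0", "c1", "goal"])` checks a finite prefix $\sigma_k$ of an execution against the edge set. Note that `goal` has no outgoing edges in this sketch, which matches the notion of a terminal state used in Definition 2.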

Definition 2 (System).

The system under test is modeled as a finite transition system $T_{\text{sys}}$ with a single initial state, that is, $|T_{\text{sys}}.S_0| = 1$. Furthermore, at least one of the system states is terminal (i.e., no outgoing edges).

The system designers provide the states $S$, actions $A$, transitions $\delta$, a set of possible initial conditions $S_0$, a set of atomic propositions $AP_{\text{sys}}$, and a corresponding label function $L_{\text{sys}}: S \rightarrow 2^{AP_{\text{sys}}}$. We require a unique initial condition $s_0 \in S_0$ to synthesize the test. If the test designer wishes to select an initial condition, then they can synthesize the test for each $s_0 \in S_0$ and choose accordingly. In addition to $AP_{\text{sys}}$, the test designer can choose additional atomic propositions $AP_{\text{test}}$ and define a corresponding labeling function $L: S \rightarrow 2^{AP}$, where $AP := AP_{\text{sys}} \cup AP_{\text{test}}$. For test synthesis, the system model $T_{\text{sys}} = (S, A, \delta, \{s_0\}, AP, L)$ is defined for the specific initial condition $s_0$ chosen by the test designer. The terminal state is used for defining test termination when the system satisfies its objective.

Assumption 1.

Except for sink states, transitions between states of the system are bidirectional: $\forall (s, s') \in T_{\text{sys}}.E$ where $s'$ is not a terminal state, we also have $(s', s) \in T_{\text{sys}}.E$.

This assumption is for a simpler presentation, and the framework can be extended to transition systems without this assumption (see Remark 7).
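
Assumption 1 can be checked mechanically on the edge set of a system model. The sketch below is one way to do so in Python, assuming that a terminal state is any state with no outgoing edges, as in Definition 2. The function names and the small example graph are hypothetical and are used only for illustration.

```python
def terminal_states(states, edges):
    """States with no outgoing edges."""
    has_outgoing = {s for (s, _t) in edges}
    return set(states) - has_outgoing


def bidirectionality_violations(states, edges):
    """Edges (s, s') into a non-terminal s' whose reverse edge (s', s) is missing."""
    terminal = terminal_states(states, edges)
    return {(s, t) for (s, t) in edges if t not in terminal and (t, s) not in edges}


# Hypothetical example: the edge a -> c has no reverse edge and c is not terminal.
example_states = {"a", "b", "c", "goal"}
example_edges = {("a", "b"), ("b", "a"), ("b", "goal"), ("a", "c"), ("c", "goal")}
violations = bidirectionality_violations(example_states, example_edges)
```

An empty result means the model satisfies the assumption. In the example, the only offending edge is `("a", "c")`, while the one-way edges into `goal` are permitted because `goal` is terminal.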

Definition 3 (Test Harness).

A test harness is used to constrain a state-action pair $(s, a)$ of the system in the sense that the system is prevented from taking action $a$ from state $s \in T_{\text{sys}}.S$. Let the actions $A_H \subseteq T_{\text{sys}}.A$ denote the subset of system actions that can be restricted by the test harness. The test harness $H: T_{\text{sys}}.S \rightarrow 2^{A_H}$ maps states of the transition system to actions that can be restricted from that state.

In the examples in this paper, every state of the system has a self-loop transition corresponding to a stay-in-place action, but the proposed framework does not require this. Note that in our examples, $A_H$ does not contain self-loop actions.

Definition 4 (Test Environment).

The test environment consists of one or more of the following: static obstacles, reactive obstacles, and dynamic test agents. A static obstacle on $(s, s') \in T_{\text{sys}}.E$ is a restriction on the system transition $(s, s')$ that remains in place for the entire duration of the test. A reactive obstacle on $(s, s') \in T_{\text{sys}}.E$ is a temporary restriction on the system transition $(s, s')$ that can be enabled/disabled over the course of the test. A dynamic test agent can occupy states in $T_{\text{sys}}.S$, thus restricting the system from entering the occupied state.

In this work, we synthesize tests for high-level decision-making components of the system under test and therefore model it as a discrete-state system. Linear temporal logic (LTL) has been effective in formally specifying safety and liveness requirements for discrete decision-making [48, 49, 50]. For our problem, we use LTL to capture the system and test objectives.

Definition 5 (Linear Temporal Logic [36]).

Linear temporal logic (LTL) is a temporal logic specification language that allows reasoning over linear-time trace properties. The syntax of LTL is given as:

$$\varphi ::= \text{True} \mid a \mid \varphi_1 \land \varphi_2 \mid \neg\varphi \mid \bigcirc\varphi \mid \varphi_1 \,\mathcal{U}\, \varphi_2,$$

with $a \in AP$, where $AP$ is the set of atomic propositions, $\land$ (conjunction) and $\neg$ (negation) are the Boolean connectors from which other Boolean connectives such as $\rightarrow$ can be defined, and $\bigcirc$ (next) and $\mathcal{U}$ (until) are temporal operators. Let $\varphi$ be an LTL formula over $AP$. We can define the operators $\lozenge$ (eventually) and $\square$ (always) as $\lozenge\varphi = \text{True} \,\mathcal{U}\, \varphi$ and $\square\varphi = \neg\lozenge\neg\varphi$. For an execution $\sigma = s_0 s_1 \ldots$ and an LTL formula $\varphi$, $s_i \vDash \varphi$ iff $\varphi$ holds at position $i \geq 0$ of $\sigma$. More formally, the semantics of LTL formula $\varphi$ are inductively defined over an execution $\sigma = s_0 s_1 \ldots$ as follows,

  • for $a \in AP$, $s_i \vDash a$ iff $a$ evaluates to True at $s_i$,

  • $s_i \vDash \varphi_1 \land \varphi_2$ iff $s_i \vDash \varphi_1$ and $s_i \vDash \varphi_2$,

  • $s_i \vDash \neg\varphi$ iff $\neg(s_i \vDash \varphi)$,

  • $s_i \vDash \bigcirc\varphi$ iff $s_{i+1} \vDash \varphi$, and

  • $s_i \vDash \varphi_1 \,\mathcal{U}\, \varphi_2$ iff $\exists k \geq i$, $s_k \vDash \varphi_2$ and $s_j \vDash \varphi_1$, for all $i \leq j < k$.

An execution/trace $\sigma = s_0 s_1 \ldots$ satisfies formula $\varphi$, denoted by $\sigma \models \varphi$, iff $s_0 \models \varphi$. A strategy $\pi$ is correct (satisfies formula $\varphi$), if the trace $\sigma_\pi$ resulting from the strategy satisfies $\varphi$.
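
To make the inductive semantics concrete, the following Python sketch evaluates an LTL formula on an ultimately periodic ("lasso") execution, i.e., a finite prefix followed by a loop that repeats forever. The lasso restriction, the nested-tuple encoding of formulas, and all function names are assumptions of this sketch and are not part of the paper. The until operator is computed as a least fixpoint over the finitely many positions of the lasso, and eventually and always are derived exactly as in Definition 5.

```python
def sat_positions(formula, labels, loop_start):
    """Set of lasso positions i at which the formula holds.

    labels[i] is the set of atomic propositions true at position i; the
    successor of the last position is loop_start, which closes the loop.
    """
    n = len(labels)
    everywhere = set(range(n))

    def succ(i):
        return i + 1 if i + 1 < n else loop_start

    def sat(f):
        op = f[0]
        if op == "true":
            return set(everywhere)
        if op == "ap":
            return {i for i in everywhere if f[1] in labels[i]}
        if op == "not":
            return everywhere - sat(f[1])
        if op == "and":
            return sat(f[1]) & sat(f[2])
        if op == "next":
            inner = sat(f[1])
            return {i for i in everywhere if succ(i) in inner}
        if op == "until":
            left, result = sat(f[1]), sat(f[2])
            while True:  # least fixpoint: grow backwards along successor edges
                new = {i for i in left if succ(i) in result} - result
                if not new:
                    return result
                result |= new
        raise ValueError(f"unknown operator: {op}")

    return sat(formula)


def eventually(f):
    return ("until", ("true",), f)


def always(f):
    return ("not", eventually(("not", f)))


def implies(f, g):
    return ("not", ("and", f, ("not", g)))


def satisfies(formula, prefix, loop):
    """True iff the execution prefix . loop^omega satisfies the formula at position 0."""
    if not loop:
        raise ValueError("the loop must contain at least one position")
    labels = list(prefix) + list(loop)
    return 0 in sat_positions(formula, labels, len(prefix))
```

For instance, `satisfies(eventually(("ap", "goal")), [{"a"}, set()], [{"goal"}])` asks whether an execution that visits `a`, then an unlabeled state, and then stays in `goal` forever satisfies $\lozenge\, goal$.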

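Every LTL formula can be transformed into an equivalent non-deterministic Büchi automaton [36]. Deterministic Büchi automata are strictly less expressive than non-deterministic ones, so this automaton cannot always be determinized; for the sub-task specification patterns considered in this work (Table I) and their conjunctions, however, an equivalent deterministic Büchi automaton exists.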

Definition 6 (Deterministic Büchi Automaton).

A non-deterministic Büchi automaton (NBA) [51, 36] is a tuple $\mathcal{B} \coloneqq (Q, \Omega, \delta, Q_0, F)$, where $Q$ denotes the states, $\Omega \coloneqq 2^{AP}$ is the alphabet for the set of atomic propositions $AP$, $\delta: Q \times \Omega \rightarrow 2^{Q}$ denotes the transition function, $Q_0 \subseteq Q$ represents the initial states, and $F \subseteq Q$ is the set of acceptance states. The automaton is a deterministic Büchi automaton (DBA) iff $|Q_0| \leq 1$ and $|\delta(q, A)| \leq 1$ for all $q \in Q$ and $A \in \Omega$. For a DBA, we write $\delta(q, A) = q'$ for the unique successor state.

Remark 1.

We use deterministic Büchi automata since each input word corresponding to an execution should have a unique run on the automaton. While there are several different automata representations, deterministic Büchi automata are a natural choice for LTL specifications.

A product of two deterministic Büchi automata, $\mathcal{B}_1$ and $\mathcal{B}_2$ over the alphabet $\Omega$, is defined as $\mathcal{B}_1 \otimes \mathcal{B}_2 \coloneqq (Q, \Omega, \delta, Q_0, F)$, with states $Q \coloneqq \mathcal{B}_1.Q \times \mathcal{B}_2.Q$, initial state $Q_0 \coloneqq \mathcal{B}_1.Q_0 \times \mathcal{B}_2.Q_0$, and acceptance states $F \coloneqq \mathcal{B}_1.F \times \mathcal{B}_2.F$. The transition relation $\delta$ is defined as follows: for all $(q_1, q_2) \in Q$ and all $A \in \Omega$, $\delta((q_1, q_2), A) = (q_1', q_2')$ where $\mathcal{B}_1.\delta(q_1, A) = q_1'$ and $\mathcal{B}_2.\delta(q_2, A) = q_2'$.
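
The product construction above can be sketched in a few lines of Python. The `DBA` class, the helper `visit_dba` (a two-state automaton for the visit pattern $\lozenge p$ with a single proposition), and `dba_product` are hypothetical names introduced for this illustration. The transition function is represented as a Python callable instead of an explicit table over $2^{AP}$, and `run` only tracks the state reached after a finite prefix; Büchi acceptance itself is a condition on infinite runs and is not checked here.

```python
from dataclasses import dataclass
from itertools import product as cartesian
from typing import Callable, Hashable


@dataclass
class DBA:
    states: frozenset
    initial: Hashable
    accepting: frozenset
    delta: Callable  # (state, letter) -> state, where a letter is a set of atomic propositions

    def run(self, word):
        """State reached after reading a finite word (a sequence of letters)."""
        q = self.initial
        for letter in word:
            q = self.delta(q, letter)
        return q


def visit_dba(p):
    """Two-state DBA for 'eventually p': move to 'done' once p is observed and stay there."""
    def delta(q, letter):
        return "done" if q == "done" or p in letter else "wait"
    return DBA(frozenset({"wait", "done"}), "wait", frozenset({"done"}), delta)


def dba_product(b1, b2):
    """Product with paired states, componentwise transitions, and F = B1.F x B2.F."""
    def delta(q, letter):
        return (b1.delta(q[0], letter), b2.delta(q[1], letter))
    return DBA(
        states=frozenset(cartesian(b1.states, b2.states)),
        initial=(b1.initial, b2.initial),
        accepting=frozenset(cartesian(b1.accepting, b2.accepting)),
        delta=delta,
    )


# Product tracking two visit sub-tasks at once: eventually p and eventually q.
both_visits = dba_product(visit_dba("p"), visit_dba("q"))
reached = both_visits.run([{"p"}, set(), {"q"}])
```

Here `reached` is a pair of component states, so the product state records progress on both sub-tasks simultaneously. This state-tracking role is what the sketch is meant to convey; it makes no claim about how the paper implements its automata.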

The desired test behavior can be captured via sub-tasks that are defined over atomic propositions $AP$. Table I lists the sub-task specification patterns that are considered. These specification patterns are commonly used to specify robotic missions [46]. The desired test behavior is characterized by the system and test objectives, defined over the set of atomic propositions $AP$ that can be evaluated on system states $T_{\text{sys}}.S$.

Table I: Sub-task specification patterns defined on atomic propositions.
| Name | Formula | Label |
|---|---|---|
| Visit | $\bigwedge_{i=1}^{m} \lozenge p^{i}$ | (s1) |
| Sequenced Visit | $\lozenge(p^{0} \land (\lozenge(p^{1} \land \ldots \lozenge p^{m})))$ | (s2) |
| Safety | $\square \neg p$ | (s3) |
| Instantaneous Reaction | $\square(p \rightarrow q)$ | (s4) |
| Delayed Reaction | $\square(p \rightarrow \lozenge q)$ | (s5) |

Definition 7 (Test Objective).

The test objective $\varphi_{\text{test}}$ consists of at least one visit or sequenced visit sub-task or a conjunction of these sub-tasks. The Büchi automaton $\mathcal{B}_{\text{test}}$ corresponds to the test objective $\varphi_{\text{test}}$.

Definition 8 (System Objective).

The system objective $\varphi_{\text{sys}}$ consists of at least one visit or sequenced visit sub-task. The final visit proposition should be a terminal state of the system. In addition, it can also contain a conjunction of safety, instantaneous and/or delayed reaction, and visit and/or sequenced visit sub-tasks. The Büchi automaton $\mathcal{B}_{\text{sys}}$ corresponds to the system objective $\varphi_{\text{sys}}$. We say that the system reaches its goal, or that the test execution satisfies the system objective, if the system trace is accepted by $\mathcal{B}_{\text{sys}}$.

Typically, some aspects of a test are not revealed to the system until test time, such as testing the persistence of a robot or prompting it to exhibit a difficult maneuver by placing obstacles in its path. This is formalized as a test objective which is not known to the system. In contrast, the system is aware of the system objective, which captures its requirements. For example, to test for safety, the system should know to avoid unsafe areas (s3). To test a reaction, $\square(p \rightarrow q)$, the system needs to be aware of the reaction requirement (s4), and the test objective needs to contain the corresponding visit requirement $\lozenge p$ to trigger the reaction. Furthermore, the test objective can contain standalone reachability (visit and/or sequenced visit) sub-tasks that are not associated with a system reaction sub-task, but require the system to reach/visit certain states. The test objective is accomplished by restricting system actions in reaction to the system state.

In addition to the system objective, the system must interact safely with the test environment. The system must also obey the initial condition set by the test designer. For each obstacle/agent of the test environment, the system controller must respect the corresponding restrictions on its actions (i.e., cannot crash into obstacles/agents). Furthermore, for a valid system implementation, all lower-level planners and controllers of the system must simulate transitions on $T_{\text{sys}}$.

Definition 9 (System Guarantees).

The system guarantees are a conjunction of the system objective, initial condition, safe interaction with the test environment, and a system implementation respecting the model $T_{\text{sys}}$.

Definition 10 (System Assumptions).

The system assumes that the test environment satisfies the following conditions:
A1. The test environment can consist of: i) static obstacles (e.g., wall), ii) reactive obstacles (e.g., door), and iii) test agents whose dynamics are provided to the system.
A2. The test environment will not take any action that will inevitably lead to unsafe behavior (e.g., not restricting a system action after the system has committed to it, test agents not colliding into the system).
A3. The test environment will not take any action that will inevitably block all paths for the system to reach its goal (e.g., restrictions will not completely enclose the system or block it from progressing to its goal).
A4. If the system and test environment are in a livelock, the system will have the option to break the livelock and take a different path toward its goal.

A correct system strategy satisfies the system guarantees when the test environment satisfies the system assumptions. This full system specification cannot always be expressed as an LTL formula. In an LTL synthesis setting, the system would assume that the test environment can behave in a worst-case manner, and synthesis would never return a satisfying controller. Here, however, the system can assume that the test environment will always ensure that a path to achieving the system specification remains. For many examples, expressing that a satisfying path exists is not possible in LTL.

Definition 11 (Specification Product).

The specification product is the product $\mathcal{B}_{\pi} := \mathcal{B}_{\text{sys}} \otimes \mathcal{B}_{\text{test}}$, where $\mathcal{B}_{\text{sys}}$ is the Büchi automaton corresponding to the system specification, and $\mathcal{B}_{\text{test}}$ is the Büchi automaton corresponding to the test objective. The states $(q_{\text{sys}}, q_{\text{test}}) \in \mathcal{B}_{\pi}.Q$, where $q_{\text{sys}} \in \mathcal{B}_{\text{sys}}.Q$ and $q_{\text{test}} \in \mathcal{B}_{\text{test}}.Q$, capture the event-based progression of the test and are referred to as history variables.

The system reaching its goal would typically mark the end of a test execution. However, the test engineer can also decide to terminate the test if the system appears to be stuck or enters an unsafe state. Tests that are terminated prematurely might result in inconclusive results [52], so we rely on the test engineer to determine the termination condition. We assume that the test engineer gives the system a reasonable amount of time to complete the test. Upon test termination in state $s_n$, we augment the trace $\sigma$ with the infinite suffix $s_n^{\omega}$ for evaluation purposes.

Remark 2.

As tests have a defined start and end point, we need to bridge the gap between the finiteness of test executions and the infinite traces that are needed to evaluate LTL formulae. Augmenting the trace with the infinite suffix allows us to leverage useful tools available for LTL. Other research on interpreting LTL over finite traces can be found in [29, 53, 54].
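
To illustrate why the stuttering suffix is convenient, the sketch below evaluates three of the patterns from Table I on a finite trace that is implicitly extended by repeating its last state forever. Because the suffix only repeats a state already in the trace, each check reduces to a scan over the finite prefix. The trace encoding (a list of sets of atomic propositions) and the function names are assumptions chosen for this example.

```python
def holds_eventually(trace, p):
    """Visit pattern (eventually p) on s_0 ... s_n s_n^omega.

    The repeated final state adds no new labels, so scanning the prefix suffices.
    """
    return any(p in labels for labels in trace)


def holds_always_not(trace, p):
    """Safety pattern (always not p) on the augmented trace."""
    return all(p not in labels for labels in trace)


def holds_delayed_reaction(trace, p, q):
    """Delayed reaction (always (p implies eventually q)) on the augmented trace.

    Every position labeled p must see q at that position or later. For the last
    state, 'later' is only the state itself, since the suffix repeats it.
    """
    return all(
        any(q in later for later in trace[i:])
        for i, labels in enumerate(trace)
        if p in labels
    )


# Hypothetical terminated test execution: start, free cell, intermediate, goal.
trace = [{"S"}, set(), {"I"}, {"T"}]
```

For this trace, `holds_eventually(trace, "T")` and `holds_delayed_reaction(trace, "I", "T")` both evaluate to true, while a trace that ends immediately after a `p`-labeled state without ever seeing `q` fails the delayed reaction check.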

Remark 3.

The states of the specification product automaton track the states of the individual Büchi automata, $\mathcal{B}_{\text{sys}}$ and $\mathcal{B}_{\text{test}}$, in the form of the Cartesian product to remember accepting states of the individual automata, which will be necessary for our framework (see Definitions 11 and 19).

Figure 2: Grid world layouts for examples. (a) Example 1. (b) Example 2.
Example 1.

The system under test can transition (N-S-E-W) on the grid world as illustrated in Fig. 2. The initial condition of the system is marked by $S$, and the system is required to visit one of the terminal goal states marked by $T$, $\varphi_{\text{sys}} = \lozenge T$. The test objective is to observe the system visit at least one of the $I$ states before the system reaches its goal, encoded as $\varphi_{\text{test}} = \lozenge I$.

Example 2.

In this example, the system under test can transition (N-S-E-W) on the grid world as illustrated in Fig. 2. The initial condition of the system is marked by $S$, and the system objective is to visit terminal state $T$, $\varphi_{\text{sys}} = \lozenge T$. The test objective is to observe the system visit states $I_1$ and $I_2$: $\varphi_{\text{test}} = \lozenge I_1 \wedge \lozenge I_2$. The corresponding Büchi automata are illustrated in Fig. 3.

The synchronous product operator is used to construct a product of a transition system and a Büchi automaton. In particular, we will use this operator to construct the virtual product graph and the system product graph (see Section III).

Definition 12 (Synchronous Product).

The synchronous product of a DBA $\mathcal{B}$ and an FTS $T_{\text{sys}}$, where the alphabet of $\mathcal{B}$ consists of the labels of $T_{\text{sys}}$, is the transition system $P \coloneqq T_{\text{sys}} \otimes \mathcal{B}$, where:

$$
\begin{aligned}
P.S &\coloneqq T_{\text{sys}}.S \times \mathcal{B}.Q, \\
P.\delta((s,q),a) &\coloneqq (s',q') \text{ if } \forall s, s' \in T_{\text{sys}}.S,\ \forall q, q' \in \mathcal{B}.Q, \\
&\quad \exists a \in T_{\text{sys}}.A \text{ s.t. } T_{\text{sys}}.\delta(s,a) = s' \text{ and } \mathcal{B}.\delta(q, T_{\text{sys}}.L(s')) = q', \\
P.S_0 &\coloneqq \{(s_0, q) \mid s_0 \in T_{\text{sys}}.S_0,\ \exists q_0 \in \mathcal{B}.Q_0 \text{ s.t. } \mathcal{B}.\delta(q_0, T_{\text{sys}}.L(s_0)) = q\}, \\
P.AP &\coloneqq \mathcal{B}.Q, \\
P.L((s,q)) &\coloneqq \{q\}, \quad \forall (s,q) \in P.S.
\end{aligned}
$$

We denote the transitions in $P$ as $P.E \coloneqq \{(s, s') \mid s, s' \in P.S \text{ if } \exists a \in P.A \text{ s.t. } P.\delta(s, a) = s'\}$. An infinite sequence on $P$ corresponds to a state-history trace $\vartheta = (s,q)_0, (s,q)_1, \dots (s,q)_n^{\omega}$. We refer to $(s,q) \in P.S$ as the state-history pair and define the corresponding path to be the finite prefix: $\vartheta_n = (s,q)_0, (s,q)_1, \ldots, (s,q)_n$.
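
As a rough illustration of Definition 12, the following sketch builds the synchronous product of a small transition system and a deterministic Büchi automaton, both stored as plain dictionaries. The dictionary layout, the three-cell corridor, and the two-state automaton for "eventually $T$" are assumptions made for the example. Note that the automaton reads the label of the successor state, as in the definition.

```python
def synchronous_product(ts, dba):
    """Build P = T_sys (x) B.

    ts:  dict with keys 'states', 'actions', 'delta', 'init', 'label' (assumed layout)
    dba: dict with keys 'states', 'delta', 'init' (assumed layout)
    """
    delta = {}
    for s in ts["states"]:
        for q in dba["states"]:
            for a in ts["actions"]:
                s_next = ts["delta"].get((s, a))
                if s_next is None:
                    continue  # action a is not enabled in s
                # The automaton moves on the label of the successor state s'.
                q_next = dba["delta"].get((q, ts["label"][s_next]))
                if q_next is not None:
                    delta[((s, q), a)] = (s_next, q_next)
    # Initial product states: the automaton first reads the label of s_0.
    init = {
        (s0, dba["delta"][(q0, ts["label"][s0])])
        for s0 in ts["init"]
        for q0 in dba["init"]
        if (q0, ts["label"][s0]) in dba["delta"]
    }
    edges = {(u, v) for (u, _action), v in delta.items()}
    return {"delta": delta, "init": init, "edges": edges}


# Hypothetical corridor 0 - 1 - 2 with the goal T in cell 2.
EMPTY, AT_T = frozenset(), frozenset({"T"})
ts = {
    "states": {0, 1, 2},
    "actions": {"E", "W"},
    "delta": {(0, "E"): 1, (1, "E"): 2, (1, "W"): 0, (2, "W"): 1},
    "init": {0},
    "label": {0: EMPTY, 1: EMPTY, 2: AT_T},
}
# Two-state automaton for 'eventually T'; q1 is absorbing.
dba = {
    "states": {"q0", "q1"},
    "delta": {("q0", EMPTY): "q0", ("q0", AT_T): "q1",
              ("q1", EMPTY): "q1", ("q1", AT_T): "q1"},
    "init": {"q0"},
}
P = synchronous_product(ts, dba)
```

Here the product node `(1, "q1")` differs from `(1, "q0")` only in its history variable: the system is in the same cell, but in the former it has already visited the goal.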

II-B Network Flows

Definition 13 (Flow Network [55]).

A flow network is a tuple $\mathcal{N} = (V, E, (V_s, V_t))$, where $V$ denotes the set of nodes, $E \subseteq V \times V$ the set of edges excluding self-loops, $V_s \subseteq V$ the source nodes, and $V_t \subseteq V$ the sink nodes. We assume unit capacity for all edges. On the flow network $\mathcal{N}$, we can define the flow vector $\mathbf{f} \in \mathbb{R}^{|E|}_{\geq 0}$ to satisfy the following constraints: i) the capacity constraint

$$0 \leq f^{e} \leq 1, \quad \forall e \in E, \tag{6}$$

ii) the conservation constraint

$$\sum_{u \in V} f^{(u,v)} = \sum_{u \in V} f^{(v,u)}, \quad \forall v \in V \setminus \{V_s, V_t\}, \text{ and} \tag{7}$$

iii) no flow into the source or out of the sink

$$f^{(u,v)} = 0 \text{ if } u \in V_t \text{ or } v \in V_s. \tag{8}$$

The flow value on the network $\mathcal{N}$ is defined as

$$F \coloneqq \sum_{\substack{(u,v) \in E, \\ u \in V_s}} f^{(u,v)}. \tag{9}$$
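
The constraints (6)-(8) and the flow value (9) can be checked directly on a candidate flow. The sketch below does so for a flow stored as a dictionary from edges to values. The function names, the tolerance parameter, and the diamond-shaped example network are assumptions introduced for illustration.

```python
def is_valid_flow(nodes, edges, sources, sinks, flow, tol=1e-9):
    """Check capacity (6), conservation (7), and source/sink condition (8)."""
    for e in edges:
        if not (-tol <= flow.get(e, 0.0) <= 1.0 + tol):      # (6), unit capacity
            return False
    for v in nodes:
        if v in sources or v in sinks:
            continue                                          # (7) skips V_s and V_t
        inflow = sum(flow.get((u, w), 0.0) for (u, w) in edges if w == v)
        outflow = sum(flow.get((u, w), 0.0) for (u, w) in edges if u == v)
        if abs(inflow - outflow) > tol:
            return False
    for (u, v) in edges:
        # (8): nothing leaves a sink and nothing enters a source.
        if (u in sinks or v in sources) and abs(flow.get((u, v), 0.0)) > tol:
            return False
    return True


def flow_value(edges, sources, flow):
    """Flow value (9): total flow on edges leaving the source nodes."""
    return sum(flow.get((u, v), 0.0) for (u, v) in edges if u in sources)


# Hypothetical diamond network with two edge-disjoint source-to-sink paths.
nodes = {"s", "a", "b", "t"}
edges = {("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")}
full_flow = {e: 1.0 for e in edges}
leaky_flow = {("s", "a"): 1.0, ("a", "t"): 0.5}
```

With unit capacities, the flow value of `full_flow` is 2, matching the two edge-disjoint paths in this toy network, while `leaky_flow` violates conservation at node `a`.
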
Figure 3: Automata for Example 2: (a) $\mathcal{B}_{\text{sys}}$, (b) $\mathcal{B}_{\text{test}}$, (c) $\mathcal{B}_{\pi}$. Yellow and blue nodes in $\mathcal{B}_{\text{sys}}$ and $\mathcal{B}_{\text{test}}$ are the respective accepting states. In the product $\mathcal{B}_{\pi}$, we continue to track these states for the system and test objectives. States in the product $\mathcal{B}_{\pi}$ that are accepting to both objectives (e.g., q1111) are also shaded yellow.

III Problem Statement

In this section, we will state the test environment synthesis problem. The test engineer provides a system objective and a test objective, which together describe the desired test behavior. Then, we find a reactive test strategy for which every test execution that satisfies the system objective also satisfies the test objective.

Definition 14 (Reactive Test Strategy).

A reactive test strategy $\pi_{\text{test}} : (T_{\text{sys}}.S)^{*} T_{\text{sys}}.S \rightarrow 2^{A_H}$ defines the set of restricted system actions at each state during its execution $\sigma$. For some finite prefix $s_0 \ldots s_i$ of execution $\sigma$ starting from initial state $s_0 \in T_{\text{sys}}.S_0$, $\pi_{\text{test}}(s_0 \ldots s_i) \subseteq H(s_i)$ is the set of actions that the system cannot take from state $s_i$. A test environment is said to realize a reactive test strategy $\pi_{\text{test}}$ if it restricts system actions according to $\pi_{\text{test}}$.

Let $\Sigma_{\text{fin}} := (T_{\text{sys}}.S)^{*} T_{\text{sys}}.S$ be the set of all finite prefixes of system traces. At each time step $k \geq 0$, a correct system strategy $\pi_{\text{sys}} : \Sigma_{\text{fin}} \rightarrow T_{\text{sys}}.A \setminus \pi_{\text{test}}(\Sigma_{\text{fin}})$ must pick from available actions at state $s_k$. The resulting execution is denoted as $\sigma(\pi_{\text{sys}} \times \pi_{\text{test}})$.

Remark 4.

Note that the test environment externally blocks system transitions, and as a consequence, restricts corresponding actions that the system can safely take. When actions are restricted by the test environment, the system strategy $\pi_{\text{sys}}$ should select from the available actions at each state. Since these restrictions can be placed during the test execution, the system might have to re-plan and choose a different action than originally planned.
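
The interaction between a reactive test strategy and a system strategy can be illustrated with a toy loop in which the test strategy maps the current prefix to a set of restricted actions and the system chooses among what remains. Everything in this sketch is hypothetical: the four-state transition table, the rule used by `pi_test`, and the greedy `pi_sys` are assumptions chosen to show re-planning, and the test harness $H$ is not modeled.

```python
# Hypothetical system: S can go to I or take a shortcut to M; M leads to goal T.
TRANSITIONS = {
    ("S", "to_I"): "I",
    ("S", "to_M"): "M",
    ("I", "to_M"): "M",
    ("M", "to_T"): "T",
}


def enabled_actions(state):
    """Actions the system model allows in a state, before any restriction."""
    return {a for (s, a) in TRANSITIONS if s == state}


def pi_test(prefix):
    """Toy reactive test strategy: block the shortcut S -> M until I was visited."""
    if prefix[-1] == "S" and "I" not in prefix:
        return {"to_M"}
    return set()


def available_actions(prefix):
    """Enabled actions minus the actions restricted for this prefix."""
    return enabled_actions(prefix[-1]) - pi_test(prefix)


def pi_sys(prefix, options):
    """Toy system strategy: prefer the shortcut, otherwise take any allowed action."""
    return "to_M" if "to_M" in options else sorted(options)[0]


def run(system_strategy, start="S", goal="T", max_steps=10):
    """Roll out the execution resulting from the two strategies."""
    prefix = [start]
    while prefix[-1] != goal and len(prefix) <= max_steps:
        options = available_actions(prefix)
        if not options:
            break  # the system is stuck; a feasible test strategy avoids this
        action = system_strategy(prefix, options)
        prefix.append(TRANSITIONS[(prefix[-1], action)])
    return prefix


execution = run(pi_sys)
```

In this rollout the system would have preferred the shortcut, finds it restricted at `S`, and re-plans through `I`, so the resulting execution visits `I` before reaching `T`.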

Definition 15 (Feasibility of a Test Strategy).

Given a test environment, system $T_{\text{sys}}$, system and test objectives, $\varphi_{\text{sys}}$ and $\varphi_{\text{test}}$, a reactive test strategy $\pi_{\text{test}}$ is said to be feasible iff: i) the test environment can realize $\pi_{\text{test}}$, ii) there exists a correct system strategy $\pi_{\text{sys}}$, and iii) any execution corresponding to a correct $\pi_{\text{sys}}$ satisfies the system and test objectives: $\sigma(\pi_{\text{sys}} \times \pi_{\text{test}}) \vDash \varphi_{\text{test}} \wedge \varphi_{\text{sys}}$.

Note that the test strategy is not aiding the system in achieving the system objective; it only restricts system actions such that the test objective is realized. That is, the system is free to choose an incorrect strategy, in which case there are no guarantees. Furthermore, the test strategy should allow the system to make multiple decisions at each step of the execution, if possible, as opposed to leaving a single allowed action. For any system trace $\sigma = s_0 s_1 \ldots$, every finite prefix of $\sigma$ maps to a history variable $q \in \mathcal{B}_{\pi}.Q$. For each $\sigma$, we can define a corresponding state-history trace $\vartheta = (s,q)_0, (s,q)_1, \ldots$, where the history variable $q$ at time step $i$ corresponds to the prefix $s_0 \ldots s_i$ of $\sigma$. From now on, we will refer to $\sigma$ and the associated $\vartheta$ as the test execution, and clarify the context if necessary.

Definition 16 (Restrictiveness of a Test Strategy).

State-history traces $\vartheta_1$ and $\vartheta_2$ are unique if they do not share any consecutive state-history pairs. For a feasible $\pi_{\text{test}}$, let $\Sigma$ be the set of all executions corresponding to correct system strategies, and let $\Theta$ be the set of all state-history traces corresponding to $\Sigma$. Let $\Theta_u \subseteq \Theta$ be a set of unique state-history traces. A test strategy $\pi_{\text{test}}$ is least restrictive if the cardinality of $\Theta_u$ is maximized.

Remark 5.

Note that the set of all state-history traces $\Theta$ can be infinite. However, the set $\Theta_u$ is finite because: i) the system has a finite number of states and the specification product has a finite number of history variables, and ii) every state-history trace in $\Theta_u$ is unique with respect to any other trace in $\Theta_u$.
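
The uniqueness condition of Definition 16 amounts to the two traces using disjoint sets of consecutive state-history pairs. A minimal sketch of this check is given below. It assumes that traces are compared on their finite prefixes, ignoring the repeated final state, and the example traces over hypothetical cells `a`, `b`, `c` are made up for illustration.

```python
def consecutive_pairs(trace):
    """Set of consecutive state-history pairs ((s,q)_i, (s,q)_{i+1}) in a finite prefix."""
    return {(trace[i], trace[i + 1]) for i in range(len(trace) - 1)}


def are_unique(trace_1, trace_2):
    """True if the two traces share no consecutive state-history pair."""
    return consecutive_pairs(trace_1).isdisjoint(consecutive_pairs(trace_2))


# Hypothetical state-history traces from S to T; q0/q1 are history variables.
theta_1 = [("S", "q0"), ("a", "q0"), ("T", "q1")]
theta_2 = [("S", "q0"), ("b", "q0"), ("T", "q1")]
theta_3 = [("S", "q0"), ("a", "q0"), ("c", "q0"), ("T", "q1")]
```

Here `theta_1` and `theta_2` are unique even though they share their first and last nodes, whereas `theta_1` and `theta_3` are not, since both use the pair from `("S", "q0")` to `("a", "q0")`. Under this reading, each trace in a set of pairwise unique traces uses at least one pair that no other trace uses, so the size of such a set is bounded by the finite number of possible pairs.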

Problem 1 (Finding a Reactive Test Strategy).

Given a high-level abstraction of the system model $T_{\text{sys}}$, test harness $H$, system objective $\varphi_{\text{sys}}$, and test objective $\varphi_{\text{test}}$, find a feasible, reactive test strategy $\pi_{\text{test}}$ that is least restrictive.

The restrictions on system actions placed by the test strategy can be realized in several ways in the test environment. For example, a dynamic test agent, together with any static obstacles, can be used to enforce the test strategy. This leads to the second problem of synthesizing a reactive strategy for a test agent to realize the test strategy. That is, at each time step of the test execution, the test environment consisting of an agent and static obstacles restricts the system actions according to $\pi_{\text{test}}$.

Problem 2 (Reactive Test Agent Strategy Synthesis).

Given a high-level abstraction of the system model $T_{\text{sys}}$, test harness $H$, system objective $\varphi_{\text{sys}}$, test objective $\varphi_{\text{test}}$, and a test agent modeled by transition system $T_{\text{TA}}$, find the test agent strategy $\pi_{\text{TA}}$ and the set of static obstacles $\mathtt{Obs}$ that: i) satisfy the system’s assumptions on its environment, and ii) realize a reactive test strategy $\pi_{\text{test}}$ that is least restrictive and feasible.

IV Graph Construction

To reason about executions of the system in relation to the system and test objectives, we leverage automata theory to construct the following graphs.

Definition 17 (Virtual Product Graph and System Product Graph).

A virtual product graph is the product transition system $G := T_{\text{sys}} \otimes \mathcal{B}_{\pi}$. Similarly, the system product graph is defined as $G_{\text{sys}} := T_{\text{sys}} \otimes \mathcal{B}_{\text{sys}}$.

The virtual product graph $G$ tracks the test execution in relation to both the system and test objectives, while the system product graph $G_{\text{sys}}$ tracks the system objective. We will find the restrictions on system actions on $G$, while $G_{\text{sys}}$ represents the system’s perspective concerning the system objective during the test execution. For each node $u=(s,q)\in G.S$, we denote the corresponding state $s\in T_{\text{sys}}.S$ as $u.s := s$. Similarly, the state corresponding to $v\in G_{\text{sys}}.S$ is denoted by $v.s \coloneqq s$. For practical implementation, we remove nodes on the product graphs that are not reachable from the corresponding initial states, $G.S_0$ or $G_{\text{sys}}.S_0$.
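As a rough illustration of the construction and the pruning step, the following sketch builds only the reachable part of a product graph by forward search. It assumes a deterministic automaton stored as a dictionary, a labeling function `label`, and the convention that the automaton reads the label of the successor state; all of these names and conventions are hypothetical choices for the example and are not taken from the paper.

```python
from collections import deque

def reachable_product(ts_succ, ts_init, label, delta, q_init):
    """Forward-search construction of the reachable part of a product graph.

    ts_succ: dict mapping a system state to its successor states
    label:   dict mapping a system state to the label observed there
    delta:   dict mapping (q, label) to the next automaton state; a missing
             entry is treated as a self-loop (an assumption of this sketch)
    """
    init = [(s0, q0) for s0 in ts_init for q0 in q_init]
    nodes, edges = set(init), set()
    queue = deque(init)
    while queue:
        s, q = queue.popleft()
        for s_next in ts_succ.get(s, ()):
            q_next = delta.get((q, label[s_next]), q)
            v = (s_next, q_next)
            edges.add(((s, q), v))
            if v not in nodes:  # unreachable nodes are never created
                nodes.add(v)
                queue.append(v)
    return nodes, edges, init

# Toy system: a -> b -> c, with b -> a; the automaton advances on "key".
ts_succ = {"a": ["b"], "b": ["a", "c"], "c": []}
label = {"a": "", "b": "key", "c": "goal"}
delta = {("q0", "key"): "q1"}
nodes, edges, init = reachable_product(ts_succ, ["a"], label, delta, ["q0"])
```

Because nodes are only created when the search reaches them, pairs such as `("c", "q0")` never appear, which mirrors the removal of unreachable nodes described above.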

Definition 18 (Projection).

We map states from $G$ to $G_{\text{sys}}$ using the projection $\mathcal{P}_{G\rightarrow G_{\text{sys}}}: G.S \rightarrow G_{\text{sys}}.S$ as

$$\mathcal{P}_{G\rightarrow G_{\text{sys}}}(s,(q_{\text{sys}},q_{\text{test}})) = (s,q_{\text{sys}}). \tag{10}$$

These projections help us to reason about how restrictions found on $G$ map to the system $T_{\text{sys}}$ and the system product graph $G_{\text{sys}}$. We can now define the edges on $G$ that we can restrict with the test harness as follows,

$$\begin{split}E_{H} = {} & \{((s,q),(s',q')) \in G.E \mid \forall s \in T_{\text{sys}}.S,\\ & \forall a \in H(s) \text{ s.t. } s' = T_{\text{sys}}.\delta(s,a)\}.\end{split} \tag{11}$$
Lemma 1.

For every path $(s,q_{\text{sys}})_0, (s,q_{\text{sys}})_1, \ldots, (s,q_{\text{sys}})_n$ on $G_{\text{sys}}$, there exists at least one corresponding path on $G$.

Proof.

Suppose there exists some $q_{\text{test}\,0},\ldots,q_{\text{test}\,n} \in \mathcal{B}_{\text{test}}.Q$ such that $(s,(q_{\text{sys}},q_{\text{test}}))_0,\ldots,(s,(q_{\text{sys}},q_{\text{test}}))_n$ is a path on $G$. Then, by construction, there exists a path on $G_{\text{sys}}$ where $(s,(q_{\text{sys}},q_{\text{test}}))_k$ maps to $(s,q_{\text{sys}})_k$ for all $0\leq k\leq n$. ∎

Paths on the virtual product graph $G$ correspond to possible test executions. We identify the nodes on $G$ that capture the acceptance conditions for the system and test objectives.

Definition 19 (Source, Intermediate, and Target Nodes).

The source node $\mathtt{S}$ represents the initial condition of the system. The intermediate nodes $\mathtt{I}$ correspond to system states in which the test objective acceptance conditions are met. Finally, the target nodes $\mathtt{T}$ represent the system states for which the acceptance condition for the system objective is satisfied. Formally, these nodes are defined as follows,

$$\begin{aligned}
\mathtt{S} &\coloneqq \{(s_0,q_0)\in G.S \mid s_0\in T_{\text{sys}}.S_0,\ q_0\in\mathcal{B}_{\pi}.Q_0\},\\
\mathtt{I} &\coloneqq \{(s,(q_{\text{sys}},q_{\text{test}}))\in G.S \mid q_{\text{test}}\in\mathcal{B}_{\text{test}}.F,\ q_{\text{sys}}\notin\mathcal{B}_{\text{sys}}.F\},\\
\mathtt{T} &\coloneqq \{(s,(q_{\text{sys}},q_{\text{test}}))\in G.S \mid q_{\text{sys}}\in\mathcal{B}_{\text{sys}}.F\}.
\end{aligned}$$

In addition, we define the set of states corresponding to the system acceptance condition on $G_{\text{sys}}$ as $\mathtt{T}_{\text{sys}} := \{(s,q)\in G_{\text{sys}}.S \mid q\in\mathcal{B}_{\text{sys}}.F\}$.
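The node sets above can be read off directly from the product nodes. The sketch below assumes each node of $G$ is stored as a tuple `(s, (q_sys, q_test))` and that the initial and accepting sets are given as plain Python sets; the function and argument names are hypothetical.

```python
def classify_nodes(nodes, sys_init, q_init, sys_accepting, test_accepting):
    """Split product nodes (s, (q_sys, q_test)) into source, intermediate, target."""
    source = {u for u in nodes if u[0] in sys_init and u[1] in q_init}
    target = {u for u in nodes if u[1][0] in sys_accepting}
    # Intermediate nodes satisfy the test acceptance condition but not the
    # system acceptance condition, so they never overlap with the targets.
    intermediate = {
        u for u in nodes
        if u[1][1] in test_accepting and u[1][0] not in sys_accepting
    }
    return source, intermediate, target

nodes = {
    ("s0", ("a0", "b0")),
    ("s1", ("a0", "b1")),
    ("s2", ("a1", "b1")),
}
S, I, T = classify_nodes(
    nodes,
    sys_init={"s0"},
    q_init={("a0", "b0")},
    sys_accepting={"a1"},
    test_accepting={"b1"},
)
```

Note that the node `("s2", ("a1", "b1"))` satisfies both acceptance conditions and is therefore placed in the target set only.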

Proposition 1.

Every test execution corresponds to a path $\vartheta_n = (s,q)_0,(s,q)_1,\ldots,(s,q)_n$ on $G$ where $(s,q)_0\in\mathtt{S}$. The corresponding system trace $\sigma_n$ satisfies the system objective, $\sigma\models\varphi_{\text{sys}}$, iff $(s,q)_n\in\mathtt{T}$. Furthermore, if $\sigma\models\varphi_{\text{test}}$, then the path $\vartheta_n$ contains a state-history pair $(s,q)_i\in\mathtt{I}$ for some $0\leq i\leq n$.

Provided that there exists a path on $G$ from $\mathtt{S}$ to $\mathtt{T}$, identifying a feasible reactive test strategy corresponds to identifying edges to cut on $G$. These edge cuts correspond to restricted system actions. In particular, these edge cuts are such that all paths on $G$ from source $\mathtt{S}$ to target $\mathtt{T}$ visit the intermediate $\mathtt{I}$.
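One way to check this routing property for a candidate set of cuts is by plain reachability: after removing the cut edges, the target should still be reachable from the source, but not once the intermediate nodes are also removed. The sketch below is a verification aid under that reading, not the synthesis procedure itself; the function names and the toy graph are hypothetical.

```python
from collections import deque

def reachable(edges, sources, blocked=frozenset(), cut=frozenset()):
    """Nodes reachable from `sources`, ignoring cut edges and blocked nodes."""
    succ = {}
    for u, v in edges:
        if (u, v) not in cut:
            succ.setdefault(u, []).append(v)
    seen = {x for x in sources if x not in blocked}
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for v in succ.get(u, ()):
            if v not in seen and v not in blocked:
                seen.add(v)
                queue.append(v)
    return seen

def routes_through_intermediate(edges, S, I, T, cut):
    """True if T stays reachable from S and every S-to-T path visits I."""
    still_reachable = bool(reachable(edges, S, cut=cut) & set(T))
    bypass = bool(reachable(edges, S, blocked=set(I), cut=cut) & set(T))
    return still_reachable and not bypass

# Toy graph: s -> a -> t bypasses the intermediate node i; s -> i -> t does not.
edges = [("s", "a"), ("a", "t"), ("s", "i"), ("i", "t")]
ok_without_cuts = routes_through_intermediate(edges, {"s"}, {"i"}, {"t"}, cut=set())
ok_with_cut = routes_through_intermediate(edges, {"s"}, {"i"}, {"t"}, cut={("a", "t")})
```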

V Network Flow Optimization for Identifying Restrictions on System Actions

Figure 4: Virtual product graph and system product graphs for Example 2. Panels: (a) virtual product graph $G$; (b) $G_{\text{sys}}^{(q0,s_3)}$; (c) $G_{\text{sys}}^{(q6,s_1)}$; (d) $G_{\text{sys}}^{(q7,s_{11})}$. Fig. 4(a) shows the virtual product graph $G$, with the source $\mathtt{S}$ (magenta), the intermediate nodes $\mathtt{I}$ (blue), and the target nodes (yellow). Edge cut values for each edge in $G$ are grouped by their history variable $q$ and projected to the corresponding copy of $G_{\text{sys}}$. Red dashed lines indicate edge cuts. Figs. 4(b)-4(d) show the copies of $G_{\text{sys}}$ with their source ($s_3$, $s_6$, or $s_{11}$, in orange) and target nodes (yellow). The graphs in Figs. 4(b)-4(d) correspond to the history variables q0, q6, and q7 from $\mathcal{B}_{\pi}$ shown in Fig. 3. The constraints (c6)-(c8) ensure that the edge cuts are such that a path from each source to the target node exists for each history variable $q$.

To identify which edges to cut on $G$, we use network flow optimization, a commonly used paradigm for flow-cut problems on graphs. On $G$, which characterizes all possible test executions, all paths from the initial condition $\mathtt{S}$ to the system goal $\mathtt{T}$ must be routed through the intermediate $\mathtt{I}$. Furthermore, the edge cuts should be least-restrictive and such that the system can satisfy the test objective and system objective. Maximum flow can be a proxy for the freedom of the system under test to make decisions: a higher network flow corresponds to more unique paths on $G$. Since we use flow networks with unit edge capacities, a realization of maximum flow corresponds to a set of paths that do not share an edge. Furthermore, this flow should be achieved with the fewest possible cuts to not unnecessarily restrict system actions. A high network flow with the minimum possible edge cuts corresponds to a least restrictive test for the system.
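To make the link between unit-capacity maximum flow and edge-disjoint paths concrete, the following sketch computes the maximum flow by repeatedly finding augmenting paths with breadth-first search (an Edmonds-Karp style routine). It is a generic textbook procedure written for illustration and is not claimed to be the implementation used in this work; the single source and sink and the toy graphs are assumptions of the example.

```python
from collections import deque

def max_flow_unit(edges, source, sink):
    """Maximum flow with unit capacities, i.e. the number of edge-disjoint paths."""
    if source == sink:
        raise ValueError("source and sink must differ")
    cap, adj = {}, {}
    for u, v in edges:
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap.setdefault((v, u), 0)  # residual (reverse) edge
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Push one unit of flow along the path that was found.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Two edge-disjoint routes from s to t.
diamond = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t"), ("a", "b")]
# Every route shares the edge (s, a), so only one unit of flow fits.
bottleneck = [("s", "a"), ("a", "b"), ("a", "c"), ("b", "t"), ("c", "t")]
```

On the `diamond` graph the flow value is 2, while on the `bottleneck` graph it is 1 even though two distinct paths exist, since both paths share an edge.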

V-A Optimization Setup

We define the flow network $\mathcal{G} \coloneqq (V,E,(\mathtt{S},\mathtt{T}))$, where $V \coloneqq G.S$, $E \coloneqq G.E$, source and target nodes correspond to $\mathtt{S}$ and $\mathtt{T}$, with the corresponding flow $\mathbf{f}\in\mathbb{R}^{|E|}$. For simplicity, we use the same notation to refer to nodes and edges on the graph and the corresponding flow network. The Boolean edge cut vector $\mathbf{d}\in\mathbb{B}^{|E|}$ represents whether edges are cut or not. That is, $d^e = 1$ refers to edge $e\in E$ being cut, and $d^e = 0$ implies that edge $e$ is not cut,

$$d^e\in\{0,1\},\quad\forall e\in E,\ \text{ and }\ d^e = 0,\quad\forall e\notin E_H. \tag{c1}$$

The edges into and out of the intermediate nodes $\mathtt{I}$ are denoted as $E(\mathtt{I}) := \{(u,v)\in E \mid u\in\mathtt{I}\text{ or }v\in\mathtt{I}\}$. To solve Problem 1, we formulate a mixed-integer linear program (MILP).

Objective. To find the least restrictive test, we want to maximize the system’s freedom in satisfying the test objective. To capture this, we optimize for edge cuts that maximize the flow value on $\mathcal{G}$. However, a realization of maximum flow on a network is not unique. To ensure that we do not cut any edges unnecessarily, we subtract the sum of the edge cuts from the flow value:

$$\sum_{\substack{(u,v)\in E,\\ u\in\mathtt{S}}} f^{(u,v)} - \frac{1}{|E|}\sum_{e\in E} d^e. \tag{12}$$

The regularizer $\frac{1}{|E|}$ on the sum of edge cuts is chosen such that it will not compete with the maximum flow value on the network. The weighted sum $\frac{1}{|E|}\sum_{e\in E} d^e$ is always between $0$ and $1$, and binary edge cuts and unit capacity will always result in maximum flow being integer-valued. Thus, the optimization will always favor increasing the maximum flow value rather than reducing edge cuts.
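A small numeric check illustrates why the penalty cannot outweigh a unit of flow. The sketch evaluates the objective for hypothetical flow values and cut vectors on a network with ten edges, using exact fractions to avoid rounding; the numbers are made up for illustration.

```python
from fractions import Fraction

def objective(flow_value, cuts, num_edges):
    """Flow value minus the cut penalty weighted by 1/|E|."""
    return flow_value - Fraction(sum(cuts), num_edges)

num_edges = 10
# More flow with four cuts versus less flow with no cuts at all.
more_flow = objective(2, [1, 1, 1, 1, 0, 0, 0, 0, 0, 0], num_edges)
less_flow = objective(1, [0] * num_edges, num_edges)
# Same flow, one cut fewer.
fewer_cuts = objective(2, [1, 1, 1, 0, 0, 0, 0, 0, 0, 0], num_edges)
```

Here `more_flow` evaluates to 8/5 and `less_flow` to 1, so the higher flow wins despite its cuts, while `fewer_cuts` (17/10) is preferred among solutions with equal flow. The penalty reaches 1 only if every edge is cut, in which case no flow is possible, so a solution with strictly larger integer flow always has a strictly larger objective.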

Network flow constraints. First, the network flow optimization is subject to the following standard constraints on flow $\mathbf{f}$:

$$\text{Flow constraints (6), (7), and (8) on flow network }\mathcal{G}. \tag{c2}$$

An edge that is cut restricts flow completely, while an edge that is not cut may or may not have flow,

$$\forall e\in E,\quad d^e + f^e \leq 1. \tag{c3}$$

Partition constraints. The following constraints ensure that all flow across the network will be routed through $\mathtt{I}$. To accomplish this, we adapt the partitioning conditions given in [56] as follows. Except for the $\mathtt{I}$ nodes, we divide the remaining nodes into two groups defined by the partition variable $\bm{\mu}\in\mathbb{R}^{|V\setminus\mathtt{I}|}$, and ensure that the nodes $\mathtt{S}$ belong to one group, and $\mathtt{T}$ belong to the other:

$$0\leq\mu^v\leq 1,\quad \mu^{\mathtt{S}}-\mu^{\mathtt{T}}\geq 1,\quad\forall v\in V\setminus\mathtt{I}. \tag{c4}$$

The two groups are partitioned by the edge cut vector $\mathbf{d}$, where this constraint is only defined over the edges that do not go into or out of nodes in $\mathtt{I}$,

$$d^{(u,v)} - \mu^u + \mu^v \geq 0,\quad\forall (u,v)\in E\setminus E(\mathtt{I}). \tag{c5}$$
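To see how the partition constraints force a cut on every path that bypasses the intermediate nodes, the sketch below checks (c4) and (c5) for a given assignment of the partition variable and the cut vector. It is only a constraint checker on a toy graph; the dictionary-based encoding of $\bm{\mu}$ and $\mathbf{d}$ and the function name are assumptions made for the example.

```python
def satisfies_partition(edges, I, S, T, mu, d):
    """Check constraints (c4) and (c5) for partition values mu and cuts d."""
    # (c4): values lie in [0, 1], and sources and targets are separated.
    if any(not (0 <= value <= 1) for value in mu.values()):
        return False
    if any(mu[s] - mu[t] < 1 for s in S for t in T):
        return False
    # (c5): only edges that do not touch an intermediate node are constrained.
    for u, v in edges:
        if u in I or v in I:
            continue
        if d.get((u, v), 0) - mu[u] + mu[v] < 0:
            return False
    return True

# Toy graph: s -> a -> t bypasses the intermediate node i; s -> i -> t does not.
edges = [("s", "a"), ("a", "t"), ("s", "i"), ("i", "t")]
mu = {"s": 1, "a": 1, "t": 0}
with_cut = satisfies_partition(edges, {"i"}, {"s"}, {"t"}, mu, {("a", "t"): 1})
without_cut = satisfies_partition(edges, {"i"}, {"s"}, {"t"}, mu, {})
```

With the edge `("a", "t")` cut, the assignment is valid. Without any cut, the drop of the partition value from 1 to 0 along the bypass path violates (c5) on some edge, whichever value is chosen for node `a`.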

Feasibility constraints. To ensure that the test is not impossible from the system’s perspective, we map restrictions found on $G$ to $G_{\text{sys}}$ via the following feasibility constraints. For each history variable $q\in\mathcal{B}_{\pi}.Q$, we define the set of state-history pairs that captures the possible first observations of the history variable in a test execution via the function $\mathtt{S}_G:\mathcal{B}_{\pi}.Q\rightarrow G.S$ defined as follows,

$$\begin{split}\mathtt{S}_G(q) := {} & \{(s,q)\in G.S \mid\\ & \forall((\bar{s},\bar{q}),(s,q))\in G.E,\ \bar{q}\neq q\}.\end{split} \tag{13}$$

These sets of states are mapped to $G_{\text{sys}}$ as follows:

$$\begin{split}\mathtt{S}_{G_{\text{sys}}}(q) := {} & \{u\in G_{\text{sys}}.S \mid u=\mathcal{P}_{G\rightarrow G_{\text{sys}}}(v),\\ & v\in\mathtt{S}_G(q),\text{ and }\exists\,\text{path}(u,\mathtt{T}_{\text{sys}})\},\end{split} \tag{14}$$

where this set is empty if no path from the node $u$ to $\mathtt{T}_{\text{sys}}$ exists on $G_{\text{sys}}$. For each $q\in\mathcal{B}_{\pi}.Q$ and each source $\mathtt{s}\in\mathtt{S}_{G_{\text{sys}}}(q)$, we define a flow network $\mathcal{G}_{\text{sys}}^{(q,\mathtt{s})}\coloneqq(V_{\text{sys}},E_{\text{sys}},c,(\mathtt{s},\mathtt{T}_{\text{sys}}))$, where $V_{\text{sys}}\coloneqq G_{\text{sys}}.S$ and $E_{\text{sys}}\coloneqq G_{\text{sys}}.E$, with the corresponding flow variable $\mathbf{f}_{\text{sys}}^{(q,\mathtt{s})}$. For each of these flow networks, we define a flow subject to the standard flow constraints:

$$\begin{split}&\forall q\in\mathcal{B}_{\pi}.Q,\ \forall\mathtt{s}\in\mathtt{S}_{G_{\text{sys}}}(q),\\ &\text{Flow constraints (6), (7), and (8) on network }\mathcal{G}_{\text{sys}}^{(q,\mathtt{s})}.\end{split} \tag{c6}$$
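The source sets used for these per-history flow networks can be computed by inspecting incoming edges and then projecting. The sketch below assumes nodes of $G$ are tuples `(s, (q_sys, q_test))` and nodes of $G_{\text{sys}}$ are tuples `(s, q_sys)`; the helper names are hypothetical. As in the set definition, a node with no incoming edges counts as a first observation.

```python
from collections import deque

def first_observations(nodes, edges, q):
    """Nodes with history q whose predecessors all carry a different history."""
    preds = {}
    for u, v in edges:
        preds.setdefault(v, []).append(u)
    return {
        v for v in nodes
        if v[1] == q and all(u[1] != q for u in preds.get(v, ()))
    }

def can_reach(edges, start, targets):
    """Breadth-first search: is some node in `targets` reachable from `start`?"""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        if u in targets:
            return True
        for v in succ.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

def sources_on_gsys(nodes, edges, q, sys_edges, sys_targets):
    """Project first observations to G_sys, keeping those that can reach T_sys."""
    projected = {(s, qq[0]) for (s, qq) in first_observations(nodes, edges, q)}
    return {u for u in projected if can_reach(sys_edges, u, sys_targets)}

n0, n1 = ("s0", ("a0", "b0")), ("s1", ("a0", "b1"))
n2, n3 = ("s2", ("a0", "b1")), ("s3", ("a1", "b1"))
nodes = {n0, n1, n2, n3}
edges = [(n0, n1), (n1, n2), (n2, n3)]
sys_edges = [(("s0", "a0"), ("s1", "a0")), (("s1", "a0"), ("s2", "a0")),
             (("s2", "a0"), ("s3", "a1"))]
sys_targets = {("s3", "a1")}
srcs = sources_on_gsys(nodes, edges, ("a0", "b1"), sys_edges, sys_targets)
```

In this chain, the history `("a0", "b1")` is first observed at `n1`, so the only projected source is `("s1", "a0")`; `n2` is excluded because its predecessor carries the same history.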

For each $\mathcal{G}_{\text{sys}}^{(q,\mathtt{s})}$, we map the edge cuts $\mathbf{d}$ and check that there is still a path from $\mathtt{s}$ to some node in $\mathtt{T}_{\text{sys}}$. This ensures that reactively placing restrictions on system actions does not make it impossible for a correct system strategy to make progress toward its goal. Intuitively, the edge cuts are grouped by the history variable $q$ and checked to ensure that the system has a feasible path when these restrictions are placed on system actions. The edges are grouped by their history variable using the mapping $\mathtt{Gr}:\mathcal{B}_{\pi}.Q\rightarrow 2^{G.E}$, defined as follows:

$$\mathtt{Gr}(q)\coloneqq\{((s,q),(s',q'))\in G.E\}. \tag{15}$$

The edge cuts are mapped onto the corresponding $\mathcal{G}_{\text{sys}}^{(q,\mathtt{s})}$ to cut the corresponding flow $\mathbf{f}_{\text{sys}}^{(q,\mathtt{s})}$ as follows:

$$\begin{split}\forall q\in\mathcal{B}_{\pi}.Q,\ \forall\mathtt{s}\in\mathtt{S}_{G_{\text{sys}}}(q),\ \forall(u,v)\in\mathtt{Gr}(q),\ \forall(u',v')\in E_{\text{sys}},\\ d^{(u,v)}+{f^{(q,\mathtt{s})}_{\text{sys}}}^{(u',v')}\leq 1,\ \text{ if } u'.s=u.s\text{ and }v'.s=v.s.\end{split} \tag{c7}$$

Since we are agnostic to the system controller, we need to ensure that a path to the system’s goal exists at all times during the test execution. To enforce this, we require a flow of at least 1 on each system flow network $\mathcal{G}_{\text{sys}}^{(q,\mathtt{s})}$,

$$\sum_{(\mathtt{s},v)\in E_{\text{sys}}}{f^{(q,\mathtt{s})}_{\text{sys}}}^{(\mathtt{s},v)}\geq 1,\quad\forall q\in\mathcal{B}_{\pi}.Q,\ \forall\mathtt{s}\in\mathtt{S}_{G_{\text{sys}}}(q). \tag{c8}$$
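For a fixed set of cuts, the feasibility requirement can be checked without a solver: with unit capacities, a flow of at least 1 from a source to the targets exists exactly when a path exists. The sketch below groups the cut edges of $G$ by the history variable of their tail node, blocks the matching system transitions on $G_{\text{sys}}$, and tests reachability. The data layout (nodes of $G$ as `(s, q)` and nodes of $G_{\text{sys}}$ as `(s, q_sys)`) and the function names are assumptions made for the example.

```python
from collections import deque

def can_reach(edges, start, targets):
    """Breadth-first search: is some node in `targets` reachable from `start`?"""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        if u in targets:
            return True
        for v in succ.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

def feasible_under_cuts(q, sources, cut_edges_G, sys_edges, sys_targets):
    """Path-based check of the feasibility requirement for one history variable."""
    # Cut edges of G grouped by history q, reduced to system-state transitions.
    blocked = {(u[0], v[0]) for (u, v) in cut_edges_G if u[1] == q}
    # A G_sys edge is blocked when its system states match a cut transition.
    remaining = [(u, v) for (u, v) in sys_edges if (u[0], v[0]) not in blocked]
    return all(can_reach(remaining, s, sys_targets) for s in sources)

sys_edges = [(("s0", "a0"), ("s1", "a0")), (("s0", "a0"), ("s2", "a0")),
             (("s1", "a0"), ("g", "a1")), (("s2", "a0"), ("g", "a1"))]
sys_targets = {("g", "a1")}
q = ("a0", "b0")
one_cut = [(("s0", q), ("s1", ("a0", "b1")))]
two_cuts = one_cut + [(("s0", q), ("s2", q))]
sources = {("s0", "a0")}
```

With `one_cut`, the system can still reach the goal through `s2`, so the check passes for history `q`. With `two_cuts`, both moves out of `s0` are blocked under the same history and the check fails.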

These feasibility cuts correspond to the reactive constraint setting since edge cuts are placed on 𝒢𝒢\mathcal{G}caligraphic_G and depend on the history variable q𝑞qitalic_q. For an illustrated explanation for Example 2, refer to Fig. 4. Finally, the optimization to identify edge cuts for the reactive test strategy is characterized by the following mixed-integer linear program (MILP) with the cuts 𝐝𝐝\mathbf{d}bold_d as the integer variables, and the flow and partition variables taking continuous values.

  MILP-reactive:

$$\max_{\begin{subarray}{c}\mathbf{f},\mathbf{d},\bm{\mu},\\ \mathbf{f}_{\text{sys}}^{(q,\mathtt{s})}\,\forall q\in\mathcal{B}_{\pi}.Q\,\forall\mathtt{s}\in\mathtt{S}_{\mathcal{G}_{\text{sys}}}(q)\end{subarray}}F-\frac{1}{|E|}\sum_{e\in E}d^{e}\tag{16}$$
$$\text{s.t.}\quad\text{(c1)-(c3)},\ \text{(c4)-(c5)},\ \text{(c6)-(c8)}.$$
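A candidate set of cuts can also be sanity-checked against (c7)-(c8) outside the MILP: with unit capacities, a flow of at least 1 from a source exists exactly when a target is still reachable after the mapped edges are removed. The following is a minimal sketch under assumed encodings that are not taken from the paper: nodes of $G$ are pairs `(s, q)`, nodes of $G_{\text{sys}}$ are pairs `(s, q_sys)`, and the group $\mathtt{Gr}(q)$ is taken to be the edges of $G$ whose tail carries history variable `q`. The function names are hypothetical.

```python
from collections import deque


def reachable(edges, source, targets):
    """Breadth-first search: True if some node in `targets` is reachable from `source`."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node in targets:
            return True
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def feasible_for_history(cuts, q, sys_edges, source, sys_targets):
    """Check (c7)-(c8) for a single (q, source) pair.

    cuts:      cut edges ((s, q), (s2, q2)) on G
    sys_edges: edges ((s, q_sys), (s2, q_sys2)) of G_sys
    """
    # Assumed grouping Gr(q): cuts whose tail carries history variable q,
    # projected onto system transitions (s, s2).
    blocked = {(u[0], v[0]) for (u, v) in cuts if u[1] == q}
    # (c7): a system edge cannot carry flow if its transition is blocked.
    open_edges = [(u, v) for (u, v) in sys_edges if (u[0], v[0]) not in blocked]
    # (c8): a unit of flow exists iff a target is still reachable.
    return reachable(open_edges, source, sys_targets)
```

Running this check for every history variable and every source mirrors the quantifiers in (c7) and (c8); it verifies a given solution and does not replace the flow variables in the optimization.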

 

Static Constraints. We can simplify the feasibility constraints in the case of static obstacles. This corresponds to the requirement that any transition that is restricted will remain restricted for the entire duration of the test. From the system's perspective, the restrictions will not change depending on the history variable $q$. That is, edges in $G$ corresponding to the same transition in $T_{\text{sys}}.E$ are grouped and share the same cut value:

$$\begin{split}&d^{(u,v)}=d^{(u',v')},\ \forall(u,v),(u',v')\in E,\\ &\text{if }u.s=u'.s\text{ and }v.s=v'.s.\end{split}\tag{c9}$$

Similarly, the optimization to find edge cuts in a static setting is as follows.
  MILP-static:

$$\max_{\mathbf{f},\mathbf{d},\bm{\mu}}F-\frac{1}{|E|}\sum_{e\in E}d^{e}\tag{17}$$
$$\text{s.t.}\quad\text{(c1)-(c3)},\ \text{(c4)-(c5)},\ \text{(c9)}.$$
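The grouping behind (c9) can be illustrated with a short sketch. It assumes, for illustration only, that each node of $G$ is encoded as a pair `(s, q)`, so that an edge projects to the system transition `(u[0], v[0])`; the function names are hypothetical.

```python
from collections import defaultdict


def static_groups(edges):
    """Group edges of G by the system transition (u.s, v.s) they project to."""
    groups = defaultdict(list)
    for u, v in edges:
        groups[(u[0], v[0])].append((u, v))
    return dict(groups)


def satisfies_c9(edges, d):
    """True if all edges in each group share one cut value, as (c9) requires."""
    return all(
        len({d[e] for e in group}) == 1
        for group in static_groups(edges).values()
    )
```

In an MILP model, one way to impose (c9) is to add an equality between the cut variables within each group, or to substitute a single variable per group.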

 

Lemma 2.

For the case of static constraints, due to (c9), ensuring feasibility from the system's perspective is guaranteed by checking $F>0$ on $G$. That is, $F>0$ on $G$ is equivalent to checking (c6)-(c8).

Proof.

Under (c9), the edge groupings $\mathtt{Gr}(q)$ become the same for all $q\in\mathcal{B}_{\pi}.Q$. Thus, the constraints (c6)-(c8) can be reduced onto a single flow network $\mathcal{G}_{\text{sys}}=(V_{\text{sys}},E_{\text{sys}},(\mathtt{S}_{\text{sys}},\mathtt{T}_{\text{sys}}))$, where $\mathtt{S}_{\text{sys}}:=G_{\text{sys}}.I$. Equation (c8) being satisfied on $\mathcal{G}_{\text{sys}}$ implies that there is a path on $G$ from $\mathtt{S}$ to $\mathtt{T}$ via Lemma 1. Additionally, if there is a path on $G$ from $\mathtt{S}$ to $\mathtt{T}$ with the static constraints (c9), then it must be that there exists a path from $\mathtt{S}_{\text{sys}}$ to $\mathtt{T}_{\text{sys}}$ on $G_{\text{sys}}$. ∎

Remark 6.

For the reactive constraint setting, we can replace the feasibility constraints (c6)-(c8) by several static constraints. That is, we introduce a copy of $\mathcal{G}$ for each history variable $q\in\mathcal{B}_{\pi}.Q$ and each source $\mathtt{s}\in\mathtt{S}_{G}(q)$, denoted $\mathcal{G}^{(q,\mathtt{s})}=(V,E,\mathtt{s},\mathtt{T})$, and require a path from $\mathtt{s}$ to $\mathtt{T}$ to exist under a static mapping of the edges in the group $\mathtt{Gr}(q)$ by constraint (c9). We choose the former formulation, constraints (c6)-(c8), since it reduces the number of variables and constraints in the optimization.

Mixed Constraints. In some cases, it might be desirable to define specific transitions $T_{\text{sys}}.E_{\text{static}}\subseteq T_{\text{sys}}.E$ which require static constraints. The mixed setting of reactive and static transition restrictions can be implemented by enforcing the feasibility constraints (c6)-(c8), and the static constraints (c9) on edges $(u,v)\in E$, where the corresponding transition $(u.s,v.s)\in T_{\text{sys}}.E_{\text{static}}$. Finally, the optimization for the mixed constraint setting is as follows.

  MILP-mixed:

$$\max_{\begin{subarray}{c}\mathbf{f},\mathbf{d},\bm{\mu},\\ \mathbf{f}_{\text{sys}}^{(q,\mathtt{s})}\,\forall q\in\mathcal{B}_{\pi}.Q\,\forall\mathtt{s}\in\mathtt{S}_{\mathcal{G}_{\text{sys}}}(q)\end{subarray}}F-\frac{1}{|E|}\sum_{e\in E}d^{e}\tag{18}$$
$$\text{s.t.}\quad\text{(c1)-(c3)},\ \text{(c4)-(c5)},\ \text{(c6)-(c8)},\ \text{(c9)}.$$

 

Auxiliary Constraints. Additional constraints can be added to the optimization depending on the test harness or the desired test setup. For example, it might be required to enforce that if an edge is cut, the transition will be blocked in both directions. This can be enforced as follows,

$$\begin{split}&d^{(u,v)}=d^{(u',v')},\ \forall(u,v),\,(u',v')\in E,\\ &\text{if }u.s=v'.s\text{ and }v.s=u'.s.\end{split}\tag{c14}$$
Algorithm 1 Finding the test strategy $\pi_{\text{test}}$
1: procedure FindTestStrategy($T_{\text{sys}},H,\varphi_{\text{sys}},\varphi_{\text{test}}$)
2: Input: transition system $T_{\text{sys}}$, test harness $H$, system objective $\varphi_{\text{sys}}$, test objective $\varphi_{\text{test}}$
3: Output: test strategy $\pi_{\text{test}}$
4:     $\mathcal{B}_{\text{sys}}\leftarrow\text{BA}(\varphi_{\text{sys}})$ ▷ System Büchi automaton
5:     $\mathcal{B}_{\text{test}}\leftarrow\text{BA}(\varphi_{\text{test}})$ ▷ Tester Büchi automaton
6:     $\mathcal{B}_{\pi}\leftarrow\mathcal{B}_{\text{sys}}\otimes\mathcal{B}_{\text{test}}$ ▷ Specification product
7:     $G_{\text{sys}}\leftarrow T_{\text{sys}}\otimes\mathcal{B}_{\text{sys}}$ ▷ System product
8:     $G\leftarrow T_{\text{sys}}\otimes\mathcal{B}_{\pi}$ ▷ Virtual Product Graph
9:     $\mathtt{S},\mathtt{I},\mathtt{T}\leftarrow$ IdentifyNodes$(G,\mathcal{B}_{\text{sys}},\mathcal{B}_{\text{test}})$
10:     $\mathcal{G}\leftarrow$ DefineNetwork$(G,\mathtt{S},\mathtt{T})$
11:     $\mathfrak{G}\leftarrow\mathtt{set()}$ ▷ System Perspective Graphs
12:     for $q\in\mathcal{B}_{\pi}.Q$ do
13:         for $\mathtt{s}\in\mathtt{S}_{G_{\text{sys}}}(q)$ do
14:              $\mathcal{G}_{\text{sys}}^{(q,\mathtt{s})}\leftarrow$ DefineNetwork$(G_{\text{sys}},\mathtt{s},\mathtt{T}_{\text{sys}})$
15:              $\mathfrak{G}\leftarrow\mathfrak{G}\cup\mathcal{G}_{\text{sys}}^{(q,\mathtt{s})}$
16:     $\mathbf{d}^{*}\leftarrow$ MILP$(\mathcal{G},T_{\text{sys}},\mathfrak{G},\mathtt{I},H)$ ▷ Reactive, static, or mixed.
17:     $C\leftarrow\{(u,v)\in G.E\,|\,{\mathbf{d}^{*}}^{(u,v)}=1\}$ ▷ Cuts on $G$
18:     $\pi_{\text{test}}\leftarrow$ Define test strategy according to equation (20)
19:     return $\pi_{\text{test}}$
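The product constructions in lines 7 and 8 of Algorithm 1 can be sketched as follows. This is a minimal illustration under assumptions that are not taken from the paper: the automaton is deterministic and given as a dictionary, it reads the label of the successor state on each move, and the function and argument names are hypothetical.

```python
def product_graph(ts_edges, label, delta, init_state, init_q):
    """Reachable part of the product of a transition system and an automaton.

    ts_edges: iterable of system transitions (s, s2)
    label:    dict mapping each system state to a hashable label
    delta:    dict mapping (q, label) to the next automaton state
    """
    succ = {}
    for s, s2 in ts_edges:
        succ.setdefault(s, []).append(s2)

    start = (init_state, init_q)
    nodes, edges, stack = {start}, set(), [start]
    while stack:
        s, q = stack.pop()
        for s2 in succ.get(s, []):
            # Assumed convention: the automaton reads the successor's label.
            q2 = delta.get((q, label[s2]))
            if q2 is None:  # undefined automaton move: skip this transition
                continue
            node = (s2, q2)
            edges.add(((s, q), node))
            if node not in nodes:
                nodes.add(node)
                stack.append(node)
    return nodes, edges
```

Each product node is a state-history pair `(s, q)`, which is the node encoding that grouping edges by history variable relies on.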

V-B Characterizing Optimization Results

The flow value (Eq. (9)) of the network is always integer-valued since the edge cuts are binary and edges have unit capacities, and therefore, any strictly positive flow value corresponds to at least one valid test execution. In the following cases, the problem data are inconsistent and a flow value $\geq 1$ cannot be found.
Case 1: There is no path from $\mathtt{S}$ to $\mathtt{T}$ on $G$ (and equivalently, no path from $\mathtt{S}_{\text{sys}}$ to $\mathtt{T}_{\text{sys}}$ on $G_{\text{sys}}$). In this case, the optimization will not have to place any cuts because the only possible maximum flow value is $0$.
Case 2: There is a path from $\mathtt{S}$ to $\mathtt{T}$ on $G$, but there is no path from $\mathtt{S}$ to $\mathtt{T}$ in $G$ visiting an intermediate node in $\mathtt{I}$. In this case, the partition constraints will cut all paths from $\mathtt{S}$ to $\mathtt{T}$, while by Lemma 1 the feasibility constraints require a path to exist from $\mathtt{S}$ to $\mathtt{T}$, which is a contradiction. The routing optimization is infeasible in this instance.

For each MILP, the set of edges that are cut is found from the optimal $\mathbf{d}^{*}$ as follows, $C\coloneqq\{(u,v)\in E\setminus E(\mathtt{I})\,|\,{d^{*}}^{(u,v)}=1\}$, resulting in the cut network $\mathcal{G}_{\text{cut}}=(V,E\setminus C,\mathtt{S},\mathtt{T})$. The bypass flow value is computed on the network $\mathcal{G}_{\text{byp}}\coloneqq(V_{\text{byp}},E_{\text{byp}},\mathtt{S},\mathtt{T})$, where $V_{\text{byp}}\coloneqq V\setminus\mathtt{I}$, and $E_{\text{byp}}\coloneqq E\setminus\big(E(\mathtt{I})\cup C\big)$. A strictly positive bypass flow value indicates the existence of a $Path(\mathtt{S},\mathtt{T})$ on $\mathcal{G}_{\text{cut}}$ that does not visit an intermediate node in $\mathtt{I}$.
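Because edges have unit capacities, the bypass flow value is strictly positive exactly when some source still reaches a target on the bypass network, so a returned cut set can be checked with a plain graph search. The sketch below assumes that $E(\mathtt{I})$ denotes the edges incident to a node in $\mathtt{I}$; the function name and argument layout are hypothetical.

```python
from collections import deque


def bypass_exists(edges, cuts, sources, intermediates, targets):
    """True if some source reaches a target without using cut edges or nodes in I."""
    cut_set = set(cuts)
    adj = {}
    for u, v in edges:
        # E_byp: drop cut edges and (assumed) all edges touching a node in I.
        if (u, v) in cut_set or u in intermediates or v in intermediates:
            continue
        adj.setdefault(u, []).append(v)

    seen = set(sources) - set(intermediates)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        if node in targets:
            return True
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

A return value of `False` on the optimal cuts corresponds to a bypass flow value of 0.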

Theorem 1.

For each MILP, the optimal cuts $C$ result in a bypass flow value of $0$.

Proof.

The partition constraints (c4) and (c5) partition the set of vertices $V\setminus\mathtt{I}$ into two groups: nodes with potential $\mu=0$ (e.g., $\mathtt{T}$) and nodes with potential $\mu=1$ (e.g., $\mathtt{S}$). On any path $v_{0}\ldots v_{k}$ on $\mathcal{G}_{\text{byp}}$, where $v_{0}=\mathtt{S}$ and $v_{k}=\mathtt{T}$, the difference in potential values can be expressed as a telescoping sum: $\sum_{i=0}^{k-1}(\mu^{i}-\mu^{i+1})=\mu^{\mathtt{S}}-\mu^{\mathtt{T}}$. Then, by partition constraints (c4) and (c5),

$$\sum_{i=0}^{k-1}d^{(v_{i},v_{i+1})}\geq\sum_{i=0}^{k-1}(\mu^{i}-\mu^{i+1})=\mu^{\mathtt{S}}-\mu^{\mathtt{T}}\geq 1.$$

Therefore, for at least one edge $(v_{i},v_{i+1})$ on the path, where $0\leq i\leq k-1$, the corresponding cut value is $d^{(v_{i},v_{i+1})}=1$. These edges belong to the set of cut edges $C$. Thus, the flow value on $\mathcal{G}_{\text{byp}}$ is zero. ∎

Theorem 2.

For each MILP, the optimal cuts $C$ are such that there always exists a path to the goal from the system's perspective.

Proof.

First, consider the MILP in the reactive setting. The optimal cuts $C$ satisfy the feasibility constraints (c6), (c7), and (c8). These constraints ensure that for each history variable $q\in\mathcal{B}_{\pi}.Q$, there exists a path for the system from each state $\mathtt{s}\in\mathtt{S}_{G_{\text{sys}}}(q)$ to $\mathtt{T}_{\text{sys}}$ on $G_{\text{sys}}$. The edge cuts $C$ are grouped by their history variable (see equation (15)) and mapped to the corresponding $\mathcal{G}_{\text{sys}}^{(q,\mathtt{s})}$ (see equation (c7)). Then, each copy $\mathcal{G}_{\text{sys}}^{(q,\mathtt{s})}$ represents all the cuts that can be simultaneously applied when the state of the test execution is at history variable $q$. Thus, all restrictions on system actions at history $q$ are captured by the cuts on $\mathcal{G}_{\text{sys}}^{(q,\mathtt{s})}$. Since this is true for every $q$ and every source state $\mathtt{s}$ at which the test execution enters into $q$, there always exists a path to the goal by equation (c8). The proof for the static and mixed settings follows similarly. ∎

Lemma 3.

For each MILP, the optimal cuts $C$ correspond to maximizing the cardinality of $\Theta_{u}$.

Proof.

By construction, a realization of the flow $\mathbf{f}$ on $\mathcal{G}$ corresponds to a set of unique state-history traces $\Theta_{u}$. The MILP objective maximizes the flow, and therefore the cardinality of $\Theta_{u}$ is maximized. ∎

(a) Static obstacles in black.
(b) Virtual product graph with edge cuts in dashed red.
Figure 5: Static obstacles in (a) corresponding to edge cuts found on the virtual product graph (b) for Example 1. States marked $S$, $I$, and $T$ illustrated in (a) correspond to states $\mathtt{S}$ (magenta), $\mathtt{I}$ (blue), and $\mathtt{T}$ (yellow) on $G$ as shown in (b).
(a) q0
(b) q6
(c) q7
Figure 6: Test environment implementation of a reactive test strategy for Example 2.

VI Test Strategy Synthesis

In this section, we will outline how to find the reactive test strategy from the optimization result in the different settings.

VI-A Test Environments with Static and/or Reactive Obstacles

For each setting (static, reactive, and mixed), the optimal cuts from solving the corresponding MILP are used to realize a test strategy with static and/or reactive obstacles. The optimal cuts $C$ for each MILP are parsed into a reactive map $\mathcal{C}:\mathcal{B}_{\pi}.Q\rightarrow T_{\text{sys}}.E$, where

$$\mathcal{C}(q)\coloneqq\{(s,s')\in T_{\text{sys}}.E\ |\ ((s,q),(s',q'))\in C\}.\tag{19}$$

The set $\mathcal{C}(q)$ captures cuts that will be used to restrict the system when the state of the test execution is at the history variable $q$. When the test execution $\vartheta$ reaches a state-history pair $(s,q)$ at time step $k\geq 0$, and $\mathcal{C}(q)$ contains a system transition $(s,s')\in T_{\text{sys}}.E$, then the reactive test strategy $\pi_{\text{test}}$ will restrict the system action corresponding to this transition. That is, the set of restrictions on the system is given by

$$\begin{split}\pi_{\text{test}}(\sigma_{k})&\coloneqq\{a\in T_{\text{sys}}.A\,|\,\\ &s'\in T_{\text{sys}}.\delta(s,a)\text{ and }(s,s')\in\mathcal{C}(q)\}.\end{split}\tag{20}$$
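Equations (19) and (20) amount to a projection of the cut set followed by a lookup. The sketch below assumes that nodes of $G$ are encoded as pairs `(s, q)` and that `delta(s, a)` returns the set of successor states of `s` under action `a`; both encodings and the function names are hypothetical.

```python
from collections import defaultdict


def reactive_map(cuts):
    """Eq. (19): map each history variable q to the system transitions cut at q."""
    cmap = defaultdict(set)
    for (s, q), (s2, _q2) in cuts:
        cmap[q].add((s, s2))
    return dict(cmap)


def restricted_actions(s, q, cmap, actions, delta):
    """Eq. (20): actions at state s with a successor transition in C(q)."""
    blocked = cmap.get(q, set())
    return {a for a in actions if any((s, s2) in blocked for s2 in delta(s, a))}
```

Because the map is keyed only by the history variable, the same system state can face different restrictions at different stages of the test.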

In practice, the reactive test strategy can be realized by the test environment by placing obstacles during the test execution. The set of active obstacles 𝙾𝚋𝚜(σk)𝙾𝚋𝚜subscript𝜎𝑘\mathtt{Obs}(\sigma_{k})typewriter_Obs ( italic_σ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) at time step k0𝑘0k\geq 0italic_k ≥ 0 is defined as the set of all state-action restrictions at time k𝑘kitalic_k. The test environment uses the test strategy πtestsubscript𝜋test\pi_{\text{test}}italic_π start_POSTSUBSCRIPT test end_POSTSUBSCRIPT to determine 𝙾𝚋𝚜𝙾𝚋𝚜\mathtt{Obs}typewriter_Obs in the following settings.
Instantaneous: In this setting, the test environment instantaneously places obstacles for the current history variable $q$. For any $k\geq 0$, let $(s,q)$ be the state-history pair at time step $k$ of the test execution. Therefore, the set of active obstacles at $\sigma_k$ is given as $\mathtt{Obs}(\sigma_k)=\{(s',a)\mid\forall s''\in T_{\text{sys}}.\delta(s',a)\text{ and }(s',s'')\in\mathcal{C}(q)\}$.
Accumulative: In this setting, the test environment accumulates obstacles according to the system state during the test execution. For any $k\geq 0$, let $(\bar{s},\bar{q})$ and $(s,q)$ be the state-history pairs at time steps $k-1$ and $k$ of the test execution, respectively. If $\bar{q}\neq q$, we set active obstacles to be $\mathtt{Obs}(\sigma_k)=\{(s,a)\mid\forall a\in\pi_{\text{test}}(\sigma_k)\}$. As the test execution progresses to state-history pair $(s',q)$ at time step $l>k$, any transition restricted by the test strategy is added to the set of active obstacles $\mathtt{Obs}(\sigma_l)=\bigcup_{i=k}^{l}\mathtt{Obs}(\sigma_i)$ and is restricted by the test environment. These obstacles remain in place until the test execution reaches a state-history pair $(s'',q')$ at time step $m>k$, where $q\neq q'$, at which point the test environment resets the set of active obstacles to be $\mathtt{Obs}(\sigma_m)=\{(s'',a)\mid\forall a\in\pi_{\text{test}}(\sigma_m)\}$ and restrictions are accumulated until a different history variable is reached.
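
As a rough illustration of the accumulative setting, the following Python sketch tracks the active obstacle set along a trace of state-history pairs. It assumes, for simplicity, that the restricted actions depend only on the current pair $(s,q)$. The function name `accumulate_obstacles` and the callback `restricted_actions` (standing in for $\pi_{\text{test}}$) are hypothetical and are not taken from the authors' implementation.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List, Set, Tuple

State = Hashable
Hist = Hashable
Action = Hashable
Restriction = Tuple[State, Action]

_NO_HISTORY = object()  # sentinel: no history variable seen yet


def accumulate_obstacles(
    trace: Iterable[Tuple[State, Hist]],
    restricted_actions: Callable[[State, Hist], Set[Action]],
) -> List[FrozenSet[Restriction]]:
    """Return the active obstacle set after each step of the trace.

    Assumption: restricted_actions(s, q) plays the role of pi_test and
    depends only on the current state-history pair.
    """
    active: Set[Restriction] = set()
    previous_q = _NO_HISTORY
    snapshots: List[FrozenSet[Restriction]] = []
    for s, q in trace:
        if q != previous_q:
            # The history variable changed: reset the accumulated obstacles.
            active = set()
        # Add every restriction the test strategy imposes at the current state.
        active |= {(s, a) for a in restricted_actions(s, q)}
        snapshots.append(frozenset(active))
        previous_q = q
    return snapshots
```

In the instantaneous setting no such bookkeeping is needed, since the active set is determined by $\mathcal{C}(q)$ alone.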

Remark 7.

Assumption 1 can be relaxed if we can ensure that cuts $C$ do not introduce any livelocks, in which the system has no path to the goal. The feasibility constraints ensure that there always exists a path to the goal from every source $\mathtt{s}\in\mathtt{S}_{G_{\text{sys}}}(q)$ for every history variable $q$, and under the bidirectional setting of Assumption 1, the system can navigate back to the corresponding source. Without Assumption 1, we need to check, for every cut that the MILP returns, that a path to the goal still exists. If that is not the case, we can exclude the solution and re-solve the MILP in a counterexample-guided search similar to the approach presented in Section VI-B.

Proposition 2.

In both the instantaneous and accumulative settings, as long as no new restrictions that are not in $\mathcal{C}(q)$ are introduced, the flow value $F$ remains the same.

Example 2 (Small Reactive (continued)).

Fig. 6 illustrates the test environment implementing a reactive test strategy. The reactive test strategy is constructed from the optimal cuts (as depicted in Fig. 4) on $\mathcal{G}$ found by solving MILP (reactive). The test starts in history variable $q0$ and the system transitions are restricted according to Fig. 6. If the system decides to visit $\mathtt{I}_1$ first, the test execution moves to history variable $q6$ shown in Fig. 6, whereas if the system decides to visit $\mathtt{I}_2$ first, the test execution moves to $q7$, as depicted in Fig. 6. This test environment can be implemented in either the instantaneous or the accumulative setting.

Static and Mixed Test Environments. The cuts found from MILP-static result in a reactive map $\mathcal{C}$ in which $\mathcal{C}(q)=\mathcal{C}(q')$ for all $q,q'\in\mathcal{B}_{\pi}.Q$. That is, restrictions on system actions remain in place for the entire duration of the test, and do not change depending on the history variable $q$. In this fully static setting, every edge is in the static area, that is $T_{\text{sys}}.E_{\text{static}}=T_{\text{sys}}.E$. Therefore, the test environment realizes the test strategy by restricting all system actions corresponding to any cut in $\mathcal{C}(q)$ for all $q\in\mathcal{B}_{\pi}.Q$ with static obstacles simultaneously,

$$\mathtt{Obs}\coloneqq\{(u.s,v.s)\in T_{\text{sys}}.E_{\text{static}}\mid(u,v)\in C\}. \tag{21}$$

In the mixed setting of static and reactive obstacles, the test strategy resulting from MILP-mixed is implemented similarly to the reactive setting, except for system transitions in $T_{\text{sys}}.E_{\text{static}}$ that are blocked by static obstacles.
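
The projection in Eq. (21) can be sketched in a few lines of Python. The sketch assumes that nodes of the product graph are represented as tuples `(s, q)`, so that `u[0]` plays the role of $u.s$. The function name `static_obstacles` and this data layout are illustrative assumptions. Because the history component is dropped, several cut edges can collapse onto the same system transition.

```python
from typing import Hashable, Set, Tuple

Node = Tuple[Hashable, Hashable]        # (s, q): system state and history variable
Transition = Tuple[Hashable, Hashable]  # (s, s'): transition of the system


def static_obstacles(
    cuts: Set[Tuple[Node, Node]],
    static_edges: Set[Transition],
) -> Set[Transition]:
    """Project cut edges (u, v) onto system transitions (u.s, v.s), as in Eq. (21).

    Only transitions inside the static area are kept.
    """
    obstacles: Set[Transition] = set()
    for u, v in cuts:
        transition = (u[0], v[0])  # drop the history-variable component
        if transition in static_edges:
            obstacles.add(transition)
    return obstacles
```

In the fully static setting, `static_edges` would simply be the full edge set of the system.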

Example 1 (continued).

For the grid world example, Fig. 5 illustrates the static test on the grid world, and Fig. 5 shows the corresponding cuts $C^{*}$ on the virtual product network $\mathcal{G}$. Here, the 14 cuts on $\mathcal{G}$ map to 4 static obstacles since multiple edges on $\mathcal{G}$ correspond to the same transition in $T_{\text{sys}}$. The optimal flow value is $F^{*}=3$ and there is no bypass flow. Thus, as the system navigates from source $\mathtt{S}$ to target $\mathtt{T}$, it must visit at least one of the intermediate nodes $\mathtt{I}$.

Remark 8.

The instantaneous and accumulative implementations of the test environment guide when the obstacles are placed by the test environment. However, this does not have to be the same as when the system senses or observes these restrictions on its actions. We assume that the system can observe all restricted actions on its current state before it commits to an action.

The graph construction, network flow optimization, and finding the reactive test strategy are summarized in Algorithm 1.

Theorem 3.

If the problem data are not inconsistent (see Section V-B), the reactive test strategy $\pi_{\text{test}}$ found by Algorithm 1 solves Problem 1.

Proof.

The test environment informs the choice of the MILP (static, reactive, or mixed). Therefore, the resulting $\pi_{\text{test}}$ will be realizable by the test environment. By construction of $G_{\text{sys}}$, any correct system strategy corresponds to a $\mathtt{Path}(\mathtt{S}_{\text{sys}},\mathtt{T}_{\text{sys}})$. By Theorem 2, at any point during the test execution, if the system has not violated its guarantees, there exists a path on $G_{\text{sys}}$ to $\mathtt{T}_{\text{sys}}$. Therefore, there exists a correct system strategy $\pi_{\text{sys}}$, and resulting trace $\sigma(\pi_{\text{sys}}\times\pi_{\text{test}})$, which corresponds to the path $\vartheta_{\text{sys},n}=(s,q)_0\ldots(s,q)_n$ on $G_{\text{sys}}$ from $(s,q)_0\in\mathtt{S}_{\text{sys}}$ to $(s,q)_n\in\mathtt{T}_{\text{sys}}$. By Lemma 1, any $\mathtt{Path}(\mathtt{S}_{\text{sys}},\mathtt{T}_{\text{sys}})$ on $G_{\text{sys}}$ has a corresponding $\mathtt{Path}(\mathtt{S},\mathtt{T})$ on $G$, and by Theorem 1, the cuts ensure that all such paths on $G$ are routed through the intermediate $\mathtt{I}$. Therefore, for a correct system strategy $\pi_{\text{sys}}$, the trace $\sigma(\pi_{\text{sys}}\times\pi_{\text{test}})\models\varphi_{\text{sys}}\wedge\varphi_{\text{test}}$. Thus, $\pi_{\text{test}}$ is feasible and by Proposition 2 and Lemma 3, $\pi_{\text{test}}$ is least-restrictive. Thus, Problem 1 is solved. ∎

Algorithm 2 Reactive Test Synthesis
1: procedure Test Synthesis($T_{\text{sys}}, T_{\text{TA}}, H, \varphi_{\text{sys}}, \varphi_{\text{test}}$)
2: Input: system $T_{\text{sys}}$, test agent $T_{\text{TA}}$, test harness $H$, system objective $\varphi_{\text{sys}}$, test objective $\varphi_{\text{test}}$
3: Output: test agent strategy $\pi_{\text{TA}}$
4:     $T_{\text{sys}}.E_{\text{static}}\leftarrow$ Define from $T_{\text{sys}}$, $T_{\text{TA}}$ $\triangleright$ Static area (Eq. (22))
5:     $\mathcal{G},\mathfrak{G},\mathtt{I},G\leftarrow$ Setup arguments $\triangleright$ Lines 2-13 in Alg. 1
6:     $\mathtt{C}_{\text{ex}}\leftarrow\emptyset$ $\triangleright$ Initialize empty set of excluded solutions
7:     while $\mathtt{True}$ do
8:         $\mathbf{d}^{*}\leftarrow$ Solve MILP-agent($\mathcal{G},\mathfrak{G},\mathtt{I},T_{\text{sys}},H,\mathtt{C}_{\text{ex}}$)
9:         if Status(MILP) $=\mathtt{infeasible}$ then
10:              return $\mathtt{infeasible}$
11:         $C\leftarrow\{(u,v)\in G.E\mid{\mathbf{d}^{*}}^{(u,v)}=1\}$ $\triangleright$ Cuts on $G$
12:         $\mathtt{Obs}\leftarrow$ Define from $C$ $\triangleright$ Static obstacles (Eq. (21))
13:         $\mathcal{R}\leftarrow$ Define from $C$ $\triangleright$ Reactive map (Eq. (23))
14:         $\mathbf{A}\leftarrow$ Assumptions (a1)–(a5) from $T_{\text{sys}}$, $T_{\text{TA}}$, $G$, $\varphi_{\text{sys}}$
15:         $\mathbf{G}\leftarrow$ Guarantees (g1)–(g7) from $T_{\text{sys}}$, $T_{\text{TA}}$, $\mathcal{R}$
16:         $\varphi\leftarrow(\mathbf{A}\rightarrow\mathbf{G})$ $\triangleright$ Construct GR(1) formula
17:         if Realizable($\varphi$) then
18:              $\pi_{\text{TA}}\leftarrow$ GR1Solve($\varphi$)
19:              return $\pi_{\text{TA}}$, $\mathtt{Obs}$
20:         $\mathtt{C}_{\text{ex}}\leftarrow\mathtt{C}_{\text{ex}}\cup C$
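
The control flow of Algorithm 2 can be sketched in Python as follows. This is only a skeleton under stated assumptions: `solve_milp` and `synthesize_agent` are hypothetical callbacks standing in for the MILP solver and the GR(1) synthesis step, each candidate cut is treated as one excluded solution, and the `max_iters` bound is an addition for safety that does not appear in the algorithm.

```python
from typing import Callable, FrozenSet, Optional, Set, Tuple

Edge = Tuple[object, object]
Cut = FrozenSet[Edge]


def synthesize_test(
    solve_milp: Callable[[Set[Cut]], Optional[Cut]],
    synthesize_agent: Callable[[Cut], Optional[object]],
    max_iters: int = 100,
) -> Optional[Tuple[object, Cut]]:
    """Counterexample-guided loop in the spirit of Algorithm 2.

    solve_milp(excluded) returns a set of cuts, or None if the MILP is infeasible.
    synthesize_agent(cut) returns a strategy, or None if the specification
    built from the cut is unrealizable. Both are hypothetical placeholders.
    """
    excluded: Set[Cut] = set()
    for _ in range(max_iters):
        cut = solve_milp(excluded)
        if cut is None:
            return None  # MILP infeasible: no test strategy exists
        strategy = synthesize_agent(cut)
        if strategy is not None:
            return strategy, cut
        excluded.add(cut)  # exclude this solution and re-solve
    raise RuntimeError("iteration budget exhausted before a strategy was found")
```

Passing the excluded set back into the solver mirrors the role of $\mathtt{C}_{\text{ex}}$; how the exclusion is encoded as MILP constraints is left to the solver callback.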

This framework results in a test that is not impossible (with respect to the system objective) for a correctly implemented system. On the other hand, a poorly designed system can still fail since the system is not aided in satisfying its guarantees.

VI-B Synthesizing a Dynamic Test Strategy

In some test scenarios, it might be beneficial to make use of an available dynamic test agent. Thus, the challenge is to find a test agent strategy that corresponds to $\mathcal{C}$ while ensuring that the system’s operational environment assumptions are satisfied. To accomplish this, we adapt MILP-mixed using information about the dynamic test agent. Then, we find the test agent strategy using reactive synthesis and counterexample-guided search. From the optimal cuts of MILP-mixed and the resulting reactive map $\mathcal{C}$, we can find states that the test agent must occupy in reaction to the system state. Then, we synthesize a strategy for the dynamic test agent using the Temporal Logic and Planning Toolbox (TuLiP) [57]. If we cannot synthesize a strategy, we use a counterexample-guided approach to exclude the current solution and re-solve the MILP to return a different set of optimal cuts until a strategy can be synthesized.

Suppose we are given a test agent whose dynamics are given by the finite transition system $T_{\text{TA}}$, where $T_{\text{TA}}.S$ contains at least one state that is not in $T_{\text{sys}}.S$, denoted as $\mathtt{park}$. During the test execution, the test agent can navigate to these $\mathtt{park}$ states, if necessary. These states are required to synthesize a test agent strategy. From the test agent’s transition system $T_{\text{TA}}$, we determine which states in $T_{\text{sys}}$ the test agent can occupy. From these states, we can define the static area as,

$$T_{\text{sys}}.E_{\text{static}}\coloneqq\{(u,v)\in T_{\text{sys}}.E\mid v\notin T_{\text{TA}}.S\}. \tag{22}$$

Adapting the MILP: Since an agent can only occupy a single state at a time, we incentivize solutions in which multiple edge cuts can be realized by occupying the same state. For this, we introduce the variable $\mathbf{d}_{\text{state}}\in\mathbb{R}^{|V|}_{+}$, which represents whether an incoming edge into a state is cut. This is captured by the constraint

$$\forall(u,v)\in E,\quad d^{(u,v)}\leq d_{\text{state}}^{v}, \tag{c10}$$

where $d_{\text{state}}^{v}\geq 1$ corresponds to at least one incoming edge being cut. The adapted objective is then defined as

$$F-\frac{1}{|E|}\sum_{v\in V}d^{v}_{\text{state}}-\frac{1}{|E|^{2}}\sum_{e\in E}d^{e}.$$

The objective is chosen such that the number of states that need to be blocked is minimized with the fewest possible edge cuts. The regularizers are chosen to reflect this order of priority. The optimal cuts from the resulting MILP are used to synthesize a reactive test agent strategy as follows. From the optimal cuts $C$, we find the set of static obstacles $\mathtt{Obs}\subseteq T_{\text{sys}}.E_{\text{static}}$ according to Eq. (21) and the reactive map $\mathcal{R}:\mathcal{B}_{\pi}.Q\rightarrow T_{\text{sys}}.E$ as follows:

$$\begin{split}\mathcal{R}(q)\coloneqq\{(s,s')\in T_{\text{sys}}.E\mid(s,s')\notin T_{\text{sys}}.E_{\text{static}}\text{ and }\\((s,q),(s',q'))\in C\}.\end{split} \tag{23}$$

The reactive map $\mathcal{R}$ is used to synthesize a strategy for the test agent. If no strategy can be found, a counterexample-guided approach is used to re-solve the MILP.
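
To make Eqs. (22) and (23) concrete, the following Python sketch computes the static area and the reactive map from a set of cuts. It assumes that product-graph nodes are tuples `(s, q)` and that transitions are tuples `(s, s_next)`. The function names `static_area` and `reactive_map` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, Hashable, Set, Tuple

Node = Tuple[Hashable, Hashable]        # (s, q)
Transition = Tuple[Hashable, Hashable]  # (s, s')


def static_area(sys_edges: Set[Transition], ta_states: Set[Hashable]) -> Set[Transition]:
    """Eq. (22): transitions whose target state the test agent cannot occupy."""
    return {(u, v) for (u, v) in sys_edges if v not in ta_states}


def reactive_map(
    cuts: Set[Tuple[Node, Node]],
    sys_edges: Set[Transition],
    e_static: Set[Transition],
) -> Dict[Hashable, Set[Transition]]:
    """Eq. (23): for each history variable q, the non-static transitions cut at q."""
    rmap: Dict[Hashable, Set[Transition]] = {}
    for (s, q), (s_next, _q_next) in cuts:
        edge = (s, s_next)
        if edge in sys_edges and edge not in e_static:
            rmap.setdefault(q, set()).add(edge)
    return rmap
```

History variables without any reactive cut are simply absent from the returned dictionary, which corresponds to $\mathcal{R}(q)=\emptyset$.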

Reactive Synthesis: From the solution of the MILP, we now construct the specification to synthesize the test agent strategy using TuLiP. In particular, we construct a GR(1) formula with assumptions being our model of the system and the guarantees capturing requirements on the test agent. Note that we are synthesizing a strategy for the test agent, where the environment is the system under test. The variables needed to define the GR(1) formula consist of variables capturing the system’s state $\mathtt{x}_{\text{sys}}\in T_{\text{sys}}.S$ and $\mathtt{q}_{\text{hist}}\in\mathcal{B}_{\pi}.Q$, which track how system transitions affect the history variable $q$. The test agent state is represented in the variable $\mathtt{x}_{\text{TA}}\in T_{\text{TA}}.S$.

First, we set up the subformulae constituting the assumptions on the system model. The initial conditions of the system are defined as

$$(\mathtt{x}_{\text{sys}}=s_{0}\land\mathtt{q}_{\text{hist}}=q_{0}), \tag{a1}$$

where $s_{0}\in T_{\text{sys}}.S_{0}$ and $q_{0}\in\mathcal{B}_{\pi}.Q_{0}$. We define the dynamics of the system and the history variable for each state $(s,q)\in G.S$ as follows:

$$\square\Big((\mathtt{x}_{\text{sys}}=s\land\mathtt{q}_{\text{hist}}=q)\rightarrow\bigvee_{\begin{subarray}{c}(s',q')\in\\ \mathtt{succ}(s,q)\end{subarray}}\bigcirc\big(\mathtt{x}_{\text{sys}}=s'\land\mathtt{q}_{\text{hist}}=q'\big)\Big), \tag{a2}$$

where $\mathtt{succ}(s,q)$ denotes the successors of state $(s,q)\in G.S$. For simplicity, we choose a turn-based setting, in which each player will only take their action if it is their turn. To track this, we introduce the variable $\mathtt{turn}\in\mathbb{B}$ as a test agent variable. For the system, this is encoded as remaining in place when $\mathtt{turn}=1$:

$$\bigwedge_{s\in T_{\text{sys}}.S}\square\Big((\mathtt{x}_{\text{sys}}=s\land\mathtt{turn}=1)\rightarrow\bigcirc(\mathtt{x}_{\text{sys}}=s)\Big). \tag{a3}$$

If a turn-based setup is not used, we need to synthesize a Moore strategy for the test agent since it should account for all possible system actions. The system objective $\varphi_{\text{sys}}$ can be encoded as the formula

$$\square\lozenge(\mathtt{x}_{\text{sys}}=x_{\text{goal}})\land\varphi_{\text{aux}}, \tag{a4}$$

where $x_{\text{goal}}$ is the terminal state of the system, corresponding to a reachability objective specified in $\varphi_{\text{sys}}$. The other objectives specified in $\varphi_{\text{sys}}$ are transformed to their respective GR(1) forms in $\varphi_{\text{aux}}$. This transformation of LTL formulae into GR(1) form is detailed in [58]. In addition, the system is expected to safely operate in the test agent’s presence. The set of states where collision is possible is denoted by $S_{\cap}:=T_{\text{sys}}.S\cap T_{\text{TA}}.S$. Thus, the safety formula encoding that the system will not collide with the tester is given as:

$$\bigwedge_{s\in S_{\cap}}\square\Big(\mathtt{x}_{\text{TA}}=s\rightarrow\bigcirc\neg(\mathtt{x}_{\text{sys}}=s)\Big). \tag{a5}$$

Equations (a1)–(a5) represent the test agent’s assumptions on the system model. Next, we describe the subformulae for the guarantees of the GR(1) specification. The initial conditions for the test agent are

$$\bigvee_{s\in T_{\text{TA}}.S_{0}}\mathtt{x}_{\text{TA}}=s. \tag{g1}$$

The test agent dynamics are represented by

$$\square\Big((\mathtt{x}_{\text{TA}}=s)\rightarrow\bigvee_{(s,s')\in T_{\text{TA}}.E}\bigcirc\big(\mathtt{x}_{\text{TA}}=s'\big)\Big). \tag{g2}$$

The test agent can also move only in its turn and will remain stationary when $\mathtt{turn}=0$:

$$\bigwedge_{s\in T_{\text{TA}}.S}\square\Big((\mathtt{x}_{\text{TA}}=s\land\mathtt{turn}=0)\rightarrow\bigcirc(\mathtt{x}_{\text{TA}}=s)\Big). \tag{g3}$$

The $\mathtt{turn}$ variable alternates at each step:

$$\begin{split}(\mathtt{turn}=1)\rightarrow\bigcirc(\mathtt{turn}=0)\>\land\\(\mathtt{turn}=0)\rightarrow\bigcirc(\mathtt{turn}=1).\end{split} \tag{g4}$$
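
One way to read the turn-based formulas (a3), (g3), and (g4) is as step-wise checks on a finite joint trace. The Python sketch below performs exactly these checks on a list of tuples `(x_sys, x_ta, turn)`. The function name `respects_turns` and the trace representation are assumptions made for illustration. The sketch lumps the assumption (a3) together with the guarantees (g3) and (g4), and it does not replace GR(1) synthesis.

```python
from typing import Hashable, List, Tuple

JointState = Tuple[Hashable, Hashable, int]  # (x_sys, x_ta, turn), turn in {0, 1}


def respects_turns(trace: List[JointState]) -> bool:
    """Check a finite trace against the turn-based rules (a3), (g3), (g4)."""
    for (x_sys, x_ta, turn), (x_sys_next, x_ta_next, turn_next) in zip(trace, trace[1:]):
        if turn == 1 and x_sys_next != x_sys:
            return False  # (a3): the system stays in place when turn = 1
        if turn == 0 and x_ta_next != x_ta:
            return False  # (g3): the test agent stays in place when turn = 0
        if turn_next != 1 - turn:
            return False  # (g4): the turn variable alternates at every step
    return True
```

For example, a trace in which the system moves while `turn` is 0 and the test agent moves while `turn` is 1 passes the check, whereas a trace in which both move in the same step does not.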

To satisfy the system assumptions (Def. 10), the test agent should not adversarially collide into the system. This is captured via the following safety formula,

$$\bigwedge_{s\in S_{\cap}}\square\Big(\mathtt{x}_{\text{sys}}=s\rightarrow\bigcirc\neg(\mathtt{x}_{\text{TA}}=s)\Big). \tag{g5}$$

Now, we enforce the optimal cuts found from the MILP. To enforce cuts reactively during the test execution, the states that the test agent must occupy are defined as follows,

$$\begin{split}\bigwedge_{q\in\mathcal{B}_{\pi}.Q}\bigwedge_{(s,s')\in\mathcal{R}(q)}\square\Big((\mathtt{x}_{\text{sys}}=s&\land\mathtt{q}_{\text{hist}}=q\land\mathtt{turn}=0)\\&\rightarrow(\mathtt{x}_{\text{TA}}=s')\Big).\end{split} \tag{g6}$$

Essentially, for some history variable q𝑞qitalic_q, if (s,s)(q)𝑠superscript𝑠𝑞(s,s^{\prime})\in\mathcal{R}(q)( italic_s , italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∈ caligraphic_R ( italic_q ) is an edge cut, then the test agent must occupy the state ssuperscript𝑠s^{\prime}italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT when the system is in the state s𝑠sitalic_s when the test execution is at history variable q𝑞qitalic_q. However, the test agent should not introduce any additional restrictions on the system, which is formulated as

$$\begin{split}\bigwedge_{q\in\mathcal{B}_{\pi}.Q}\;\bigwedge_{\substack{(s,s')\in T_{\text{sys}}.E\\(s,s')\not\in\mathcal{R}(q)}}\square\Big((\mathtt{x}_{\text{sys}}=s&\land\mathtt{q}_{\text{hist}}=q\land\mathtt{turn}=0)\\&\rightarrow\neg(\mathtt{x}_{\text{TA}}=s')\Big).\end{split} \tag{g7}$$

Intuitively, this corresponds to the requirement that the test agent shall not restrict system transitions that are not part of the reactive map $\mathcal{R}$. A test agent strategy that satisfies the above specifications is guaranteed to not restrict any system action unnecessarily. However, the test agent can occupy a state that is not adjacent to the system and block all paths to the goal from the system's perspective. This could prevent the system from making any progress towards the goal, resulting in a livelock. To avoid this, we characterize the livelock condition as a safety constraint that the test agent must satisfy (e.g., if it occupies a livelock state, it must not occupy it in the next step). The specific safety formula that captures the livelock depends on the example. We find the set of states $T_{\text{sys}}.S_{\text{block}}\subseteq T_{\text{TA}}.S$ in which the test agent would block the system from reaching its goal. The following condition ensures that the test agent only transiently occupies blocking states:

$$\bigwedge_{s\in T_{\text{sys}}.S_{\text{block}}}\square\Big(\mathtt{x}_{\text{TA}}=s\rightarrow\bigcirc\neg(\mathtt{x}_{\text{TA}}=s)\Big). \tag{g8}$$

Therefore, we synthesize a test agent strategy $\pi_{\text{TA}}$ for the GR(1) formula with assumptions (a1)–(a5) and guarantees (g1)–(g8).
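As a rough illustration of how the conjuncts of guarantees (g6) and (g7) could be enumerated before being handed to a synthesis tool, the sketch below emits them as plain-text formulas. The function name `cut_guarantees`, the variable names `x_sys`, `x_TA`, `q_hist`, `turn`, and the textual syntax (`[]` for "always", `->`, `&&`, `!`) are assumptions made for illustration; they are not the authors' implementation or the input format of any particular GR(1) solver.

```python
def cut_guarantees(reactive_map, sys_edges):
    """Return (g6, g7) as lists of formula strings.

    reactive_map: dict mapping a history variable q to the set R(q) of cut
                  edges (s, s').
    sys_edges:    iterable of all system transitions (s, s') in T_sys.E.
    """
    def pre(s, q):
        # Antecedent shared by (g6) and (g7).
        return f"(x_sys = {s} && q_hist = {q} && turn = 0)"

    g6, g7 = [], []
    for q in sorted(reactive_map):
        cuts = reactive_map[q]
        # (g6): the test agent occupies s' whenever (s, s') is cut at q.
        for (s, s_next) in sorted(cuts):
            g6.append(f"[]({pre(s, q)} -> (x_TA = {s_next}))")
        # (g7): the test agent stays off s' for every transition not cut at q.
        for (s, s_next) in sorted(sys_edges):
            if (s, s_next) not in cuts:
                g7.append(f"[]({pre(s, q)} -> !(x_TA = {s_next}))")
    return g6, g7
```

For example, with `reactive_map = {0: {(1, 2)}, 1: set()}` and two system transitions `(1, 2)` and `(1, 3)`, the sketch yields one (g6) conjunct and three (g7) conjuncts.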

Counterexample-guided Approach: The MILP can have multiple optimal solutions, some of which may not be realizable for the test agent. If the GR(1) formula is unrealizable, we exclude the solution and re-solve the MILP until we find a realizable GR(1) formula. In particular, every new set of optimal cuts $C$ that is unrealizable is added to the set $\mathtt{C}_{\text{ex}}$. Then, the MILP is re-solved with an additional set of affine constraints as follows,

$$\sum_{e\in C}d^{e}\leq|C|-1,\quad\forall C\in\mathtt{C}_{\text{ex}}. \tag{c15}$$

This corresponds to preventing all edges in an excluded solution $C$ from being cut at the same time. The adapted MILP is then defined as follows:

MILP-agent:

$$\begin{aligned}\max_{\substack{\mathbf{f},\,\mathbf{d},\,\mathbf{d}_{\text{state}},\,\bm{\mu},\\ \mathbf{f}_{\text{sys}}^{(q,\mathtt{s})}\;\forall q\in\mathcal{B}_{\pi}.Q,\\ \forall\mathtt{s}\in\mathtt{S}_{\mathcal{G}_{\text{sys}}}(q)}}\quad&F-\frac{1}{|E|}\sum_{v\in V}d^{v}_{\text{state}}-\frac{1}{|E|^{2}}\sum_{e\in E}d^{e}\\ \text{s.t.}\quad&\text{(c1)-(c9), (c10), (c15).}\end{aligned} \tag{24}$$

This process is repeated until a strategy is synthesized or the MILP-agent becomes infeasible.
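The exclusion constraint (c15) and the re-solve loop can be sketched as follows. To keep the sketch self-contained, the MILP is replaced by a scan over a pre-sorted list of candidate cut sets, and `is_realizable` is a hypothetical stand-in for the GR(1) realizability check; both are assumptions for illustration and not the authors' implementation.

```python
def satisfies_c15(d, excluded):
    """Check sum_{e in C} d[e] <= |C| - 1 for every excluded cut set C."""
    return all(sum(d.get(e, 0) for e in C) <= len(C) - 1 for C in excluded)


def counterexample_guided(candidates, is_realizable):
    """Return (cut set, excluded sets), or (None, excluded sets) if infeasible.

    candidates:    cut sets (frozensets of edges), sorted from best to worst
                   objective value; this list stands in for the MILP.
    is_realizable: hypothetical oracle for the GR(1) realizability check.
    """
    excluded = []
    while True:
        # Keep only the candidates that respect every exclusion constraint.
        feasible = [C for C in candidates
                    if satisfies_c15({e: 1 for e in C}, excluded)]
        if not feasible:
            return None, excluded          # the adapted problem is infeasible
        best = feasible[0]
        if is_realizable(best):
            return best, excluded
        excluded.append(best)              # add the counterexample to C_ex
```

Note that, as written, (c15) rules out every cut set that contains an excluded set $C$, not only $C$ itself, which the sketch reproduces.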

Lemma 4.

Let $\pi_{\text{TA}}$ be the test agent strategy and let $\mathtt{Obs}$ be the set of static obstacles synthesized from the optimal solution $C^{*}$ of MILP-agent according to the GR(1) formula with assumptions (a1)–(a5) and guarantees (g1)–(g8). Let $\pi_{\text{test}}$ be the reactive test strategy corresponding to the optimal cuts $C^{*}$. Then $\pi_{\text{TA}}$ and $\mathtt{Obs}$ realize $\pi_{\text{test}}$.

Proof.

By construction in Eqs. (19), (21), (23), we have that $\mathcal{C}(q)=\mathcal{R}(q)\cup\mathtt{Obs}$ for all history variables $q\in\mathcal{B}_{\pi}.Q$. Due to guarantee (g6), the synthesized test agent strategy restricts the transitions in $\mathcal{R}(q)$. The test agent is also prohibited from restricting any other transitions by the guarantee (g7). Therefore, at each step of the test execution, the system actions restricted as a result of $\pi_{\text{TA}}$ and static obstacles $\mathtt{Obs}$ exactly correspond to those restricted by the test strategy $\pi_{\text{test}}$. ∎

Theorem 4.

Algorithm 2 solves Problem 2.

Proof.

The test agent strategy is synthesized to satisfy guarantees (g1)–(g8). The guarantees (g1)–(g4) specify the dynamics of the test agent, which satisfies A1. The safety guarantee (g5) satisfies A2. Guarantees (g6) and (g7) realize the optimal cuts from MILP-agent. Due to constraint (c8), the optimal cuts ensure that there always exists a path on $G_{\text{sys}}$. Together with guarantee (g8), this results in $\pi_{\text{TA}}$ satisfying assumptions A3 and A4. By Lemma 4, $\pi_{\text{TA}}$ is a realization of a least-restrictive feasible $\pi_{\text{test}}$. ∎

The test agent strategy and obstacles, $\pi_{\text{TA}}$ and $\mathtt{Obs}$, correspond to the least-restrictive reactive test strategy $\pi_{\text{test}}$ possible for that test environment. Other test environments might result in different least-restrictive reactive test strategies.

VII Complexity

Our framework comprises three parts: i) graph construction, ii) routing optimization, and iii) reactive synthesis. For graph construction, we first need to construct Büchi automata from specifications. In the worst case, this construction has doubly-exponential complexity, $2^{2^{|\phi|}}$, in the length of the formula $\phi$ [36]. Then, graph construction involves computing a Cartesian product of two graphs, $T_{\text{sys}}$ and $\mathcal{B}_{\pi}$, and has a worst-case time complexity of $O(|T_{\text{sys}}.S|^{2}\cdot|\mathcal{B}_{\pi}.Q|^{2})$. For a more efficient implementation, we construct this product by expanding only into states that are reachable from the source $\mathtt{S}$. In the reactive synthesis part of the framework, we use GR(1) synthesis, which is known to have a complexity of $O(|N|^{3})$, where $N$ is the number of states required to define the formula. In this section, we establish the computational complexity of the routing optimization and show that the associated decision problem is NP-hard.
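The reachable-only expansion of the product mentioned above can be sketched as a breadth-first search. The sketch assumes a deterministic automaton whose state is updated on the label of the successor system state; this convention, along with the names `reachable_product`, `ts_edges`, `aut_delta`, and `label`, is an assumption for illustration and may differ from the product definition used earlier in the paper.

```python
from collections import deque


def reachable_product(ts_edges, aut_delta, label, init):
    """Build the part of the product graph reachable from `init` by BFS.

    ts_edges:  dict mapping a system state s to a list of successors s'.
    aut_delta: function (q, symbol) -> q' (deterministic; an assumption).
    label:     function s -> symbol read by the automaton.
    init:      initial product node (s0, q0).
    """
    nodes, edges = {init}, set()
    queue = deque([init])
    while queue:
        s, q = queue.popleft()
        for s_next in ts_edges.get(s, []):
            q_next = aut_delta(q, label(s_next))
            node = (s_next, q_next)
            edges.add(((s, q), node))
            if node not in nodes:          # expand each product node once
                nodes.add(node)
                queue.append(node)
    return nodes, edges
```

On a three-state chain with a two-state automaton, this explores three product nodes instead of the six in the full Cartesian product.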

Figure 7: Graphs constructed from a 3-SAT formula, where a truth assignment for the variables can be found using the network flow approach for static obstacles. (a) Graphs matching formulae with a single variable $x$. (b) Graph resulting from a reduction of the 3-SAT formula $F(x_{1},\dots,x_{5})$, where the resulting edge cuts correspond to the truth assignment of the variables $x_{1},\dots,x_{5}$.
Figure 8: Graphs $G$ and $G_{\text{sys}}$ constructed from a 3-SAT formula, where a truth assignment for the variables can be found using the flow approach for reactive obstacles. (a) Graph $G$ according to Construction 3 for the reactive case. (b) Graph $G_{\text{sys}}$ according to Construction 2.

To establish the computational complexity of finding the cuts on the graph, we first consider the special case of static obstacles. As defined in Sections IV and V, the problem data is a graph $G=(V,E)$ with node groups $\mathtt{S}$, $\mathtt{I}$, $\mathtt{T}$, and the corresponding flow network $\mathcal{G}$. For an edge $e\in E\setminus E(\mathtt{I})$, the binary variable $d^{e}$ indicates whether the edge is cut ($d^{e}=1$). The set $C\subset E$ represents the set of edges with $d^{e}=1$. For static obstacles, the edges are grouped by the corresponding transition in $T_{\text{sys}}$. The grouping $\mathtt{Gr}_{\text{static}}:T_{\text{sys}}.E\rightarrow G.E$ is defined as follows,

$$\mathtt{Gr}_{\text{static}}((s,s'))\coloneqq\{(u,v)\in G.E\;|\;u.s=s,\,v.s=s'\}. \tag{25}$$

For some $(s,s')\in T_{\text{sys}}.E$, all edges $e\in\mathtt{Gr}_{\text{static}}((s,s'))$ have the same $d^{e}$ value, i.e., if $d^{e}=1$ for some edge $e$ in the group, then all edges in this group will have $d^{e}$ set to 1. A bypass path on $G$ is some $Path(\mathtt{S},\mathtt{T})$ which does not visit the intermediate $\mathtt{I}$. The flow value $F$ on $\mathcal{G}$ is defined from the source $\mathtt{S}$ to target $\mathtt{T}$, with each edge having unit capacity. A valid set of edge cuts $C$ is such that i) there does not exist a bypass path, ii) there exists a path from $\mathtt{S}$ to $\mathtt{T}$, and iii) edges respect the grouping $\mathtt{Gr}_{\text{static}}$.
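The three validity conditions can be checked directly with graph searches, as in the following sketch. The function names, the representation of edges as pairs, and the representation of the grouping as a list of edge sets are assumptions for illustration. The sketch does not check that cut edges lie in $E\setminus E(\mathtt{I})$.

```python
def reachable(edges, sources, blocked=frozenset()):
    """Nodes reachable from `sources` along `edges`, never entering `blocked`."""
    seen = set(sources) - set(blocked)
    stack = list(seen)
    while stack:
        u = stack.pop()
        for (a, b) in edges:
            if a == u and b not in seen and b not in blocked:
                seen.add(b)
                stack.append(b)
    return seen


def is_valid_cut(edges, S, I, T, groups, cut):
    """Check conditions i)-iii) for a candidate set of edge cuts.

    edges:  iterable of directed edges (u, v) of G.
    S, I, T: sets of source, intermediate, and target nodes.
    groups: list of sets of edges that must share the same cut value.
    cut:    set of edges with d^e = 1.
    """
    remaining = [e for e in edges if e not in cut]
    # i) no S-T path that avoids the intermediate nodes
    no_bypass = not (reachable(remaining, S, blocked=I) & set(T))
    # ii) some S-T path survives the cuts
    has_path = bool(reachable(remaining, S) & set(T))
    # iii) every group is cut entirely or not at all
    respects_groups = all(g <= cut or not (g & cut) for g in groups)
    return no_bypass and has_path and respects_groups
```

On a three-node graph with edges $(s,i)$, $(i,t)$, $(s,t)$, cutting only $(s,t)$ is valid, whereas cutting nothing leaves a bypass path and cutting both edges out of $s$ leaves no path at all.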

Problem 3 (Static Obstacles Optimization Problem).

Given a graph $G$, find a valid set of edge cuts $C$ such that the resulting maximum flow $F$ is maximized over all possible sets of edge cuts, and such that $|C|$ is minimized for the flow $F$.

This corresponds to finding the valid set of edge cuts $C$ that, as first priority, maximizes the flow $F$ and, subsequently, has the smallest cardinality $|C|$ (i.e., breaking ties between all valid edge cuts that realize $F$). For static obstacles, Problem 3 corresponds to the following decision problem.

Problem 4 (Static Obstacles Decision Problem).

Given a graph $G$ and an integer $M\geq 0$, does there exist a valid set of edge cuts $C$ such that $|C|\leq M$?

Lemma 5.

Any solution to Problem 3 can be used to construct a solution for Problem 4 in polynomial time.

Lemma 5 implies that if there exists a polynomial-time algorithm to compute a solution to Problem 3, then there also exists a polynomial-time algorithm to solve Problem 4. Thus, if we can show that Problem 4 belongs to the class of NP-hard problems (i.e., there exists a polynomial-time reduction from any arbitrary problem in NP to Problem 4), that would imply that there exists a polynomial-time algorithm to solve Problem 4 only if $P=NP$. This in turn would support the MILP approach we provide to solve Problem 3. To show that Problem 4 is NP-hard, we construct a polynomial-time reduction from 3-SAT to Problem 4. This polynomial-time reduction maps any instance of 3-SAT to Problem 4 such that the solution of the constructed instance of Problem 4 corresponds to a solution of the instance of the 3-SAT problem.

Definition 20 (3-SAT [59]).

Let $f(x_{1},\ldots,x_{n})\coloneqq\bigwedge_{j=1}^{m}c_{j}$ be a propositional logic formula over Boolean propositions $x_{1},\ldots,x_{n}$ in conjunctive normal form (CNF) in which each clause $c_{j}$ is a disjunction of three Boolean propositions or their negations. A solution to the 3-SAT problem is an algorithm that returns True if there exists a satisfying Boolean assignment to $f(x_{1},\ldots,x_{n})$ and False otherwise.
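For concreteness, an exponential-time brute-force solver in the sense of this definition can be sketched as follows. The integer encoding of literals ($i$ for $x_{i}$, $-i$ for $\bar{x}_{i}$) and the function name are assumptions for illustration.

```python
from itertools import product


def brute_force_3sat(n, clauses):
    """Return (True, assignment) if the formula is satisfiable, else (False, None).

    n:       number of variables x_1, ..., x_n.
    clauses: list of 3-tuples of nonzero ints; i stands for x_i, -i for not x_i.
    """
    for bits in product([False, True], repeat=n):
        # A literal l is true when the value of x_|l| matches the sign of l.
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True, bits
    return False, None
```

The search enumerates all $2^{n}$ assignments, which is what makes a polynomial-time reduction from 3-SAT a meaningful hardness argument.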

We first introduce a construction which maps any clause in a propositional logic formula to some sub-graph of a graph. We will then connect these sub-graphs to obtain the graph which allows a reduction of any instance of 3-SAT to Problem 4. In turn, we will show that we can use any algorithm that solves Problem 4 to solve the 3-SAT problem, showing that Problem 4 is solvable in polynomial time only if there exists a polynomial-time algorithm to solve 3-SAT, implying $P=NP$.

Construction 1 (Clause to Sub-graph).

Given a 3-SAT clause $c_{j}$, we can construct a sub-graph representing this clause as follows. For each clause $c_{j}$, we introduce nodes $s_{j-1}$ and $s_{j}$. Then, we add the nodes $x_{1,j},\dots,x_{n,j}$ corresponding to variables $x_{1},\ldots,x_{n}$ in the 3-SAT formula. We add the following directed edges for each node $x_{i,j}$: an incoming edge from node $s_{j-1}$ to $x_{i,j}$, and an outgoing edge from $x_{i,j}$ to node $s_{j}$. Then we add two nodes, denoted by $\mathtt{I}_{\text{T},j}$ and $\mathtt{I}_{\text{F},j}$, to this sub-graph. If $x_{i}$ appears in the clause $c_{j}$, then we connect $\mathtt{I}_{\text{T},j}$ to bypass the edge from $x_{i,j}$ to $s_{j}$, and if $\bar{x}_{i}$ appears in $c_{j}$, then we connect $\mathtt{I}_{\text{F},j}$ to bypass the edge from $s_{j-1}$ to $x_{i,j}$ (as shown in Fig. 7).

Constructing a sub-graph for a clause $c_{j}$ via Construction 1 allows us to relate the edge cuts to the Boolean assignment for the variables $x_{1},\dots,x_{n}$. If the incoming edge into $x_{i,j}$ is cut, then the corresponding Boolean assignment to $x_{i}$ is False, and if the outgoing edge from $x_{i,j}$ is cut, then the corresponding Boolean assignment to $x_{i}$ is True. This ensures that a satisfying assignment for the clause corresponds to edge cuts such that all $Paths(s_{j-1},s_{j})$ are routed through intermediate nodes $\{\mathtt{I}_{\text{T},j},\mathtt{I}_{\text{F},j}\}$. An assignment that evaluates the clause $c_{j}$ to False corresponds to edge cuts in the sub-graph such that there is no path from $s_{j-1}$ to $s_{j}$.

Construction 2 (Reduction of 3-SAT to Problem 4).

Suppose we have an instance of the 3-SAT problem with $n$ variables $x_{1},\dots,x_{n}$ and $m$ clauses $c_{1},\dots,c_{m}$. First, we construct the sub-graphs for each clause according to Construction 1. Let $M\coloneqq m\times n$. We denote the node $s_{0}$ as the source $\mathtt{S}$, and $s_{m}$ as the sink $\mathtt{T}$. The resulting graph is a series of sub-graphs representing each clause $c_{j}$ of the 3-SAT formula. For every variable $x_{i}$ in the formula, we maintain two groups of edges: i) incoming edges $\{(s_{j-1},x_{i,j})\;|\;1\leq j\leq m\}$, and ii) outgoing edges $\{(x_{i,j},s_{j})\;|\;1\leq j\leq m\}$. All edges in a group share the same edge cut value, corresponding to $\mathtt{Gr}_{\text{static}}$. This ensures that every variable has the same Boolean assignment across clauses.

This allows us to construct a graph corresponding to a 3-SAT formula in polynomial time via the procedure outlined in Construction 2, also illustrated in Fig. 7.
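The sketch below builds such a graph for a small formula and maps a Boolean assignment to grouped edge cuts. It follows one reading of Constructions 1 and 2, in which $\mathtt{I}_{\text{T},j}$ sits on a detour around the outgoing edge $(x_{i,j},s_{j})$ and $\mathtt{I}_{\text{F},j}$ on a detour around the incoming edge $(s_{j-1},x_{i,j})$; the node encoding, the function names, and the integer encoding of literals are assumptions for illustration.

```python
def build_graph(n, clauses):
    """Return (edges, intermediates, groups) for a 3-SAT formula.

    clauses: list of tuples of nonzero ints (i means x_i, -i means not x_i).
    Nodes: ("s", j), ("x", i, j), ("IT", j), ("IF", j).
    groups[(i, "in")] and groups[(i, "out")] are the edge groups of x_i.
    """
    edges, inter, groups = set(), set(), {}
    for j, clause in enumerate(clauses, start=1):
        inter |= {("IT", j), ("IF", j)}
        for i in range(1, n + 1):
            e_in = (("s", j - 1), ("x", i, j))
            e_out = (("x", i, j), ("s", j))
            edges |= {e_in, e_out}
            groups.setdefault((i, "in"), set()).add(e_in)
            groups.setdefault((i, "out"), set()).add(e_out)
            if i in clause:     # x_i appears: I_T bypasses the outgoing edge
                edges |= {(("x", i, j), ("IT", j)), (("IT", j), ("s", j))}
            if -i in clause:    # not x_i appears: I_F bypasses the incoming edge
                edges |= {(("s", j - 1), ("IF", j)), (("IF", j), ("x", i, j))}
    return edges, inter, groups


def cuts_from_assignment(assignment, groups):
    """True cuts all outgoing edges of x_i; False cuts all incoming edges."""
    cut = set()
    for i, value in enumerate(assignment, start=1):
        cut |= groups[(i, "out" if value else "in")]
    return cut


def has_path(edges, src, dst, blocked=frozenset()):
    """Depth-first search from src to dst that never enters `blocked` nodes."""
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        for (a, b) in edges:
            if a == u and b not in seen and b not in blocked:
                seen.add(b)
                stack.append(b)
    return dst in seen
```

For a formula with $n=3$ and $m=2$, a satisfying assignment produces $m\times n=6$ cuts, leaves a source-to-sink path, and leaves no bypass path when the intermediate nodes are blocked; an assignment that falsifies a clause disconnects the source from the sink.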

Theorem 5.

Problem 4 is NP-complete.

Proof.

We will show that Problem 4 is NP-hard by showing that Construction 2 is a correct polynomial-time reduction of the 3-SAT problem to Problem 4, i.e., any polynomial-time algorithm to solve Problem 4 can be used to solve 3-SAT in polynomial time. Consider the graph constructed by Construction 2 for any propositional logic formula. A valid set of edge cuts $C$ on this graph with cardinality $|C|\leq M$ is a witness for Problem 4. A witness for the 3-SAT formula is an assignment of the variables $x_{1},\dots,x_{n}$. A witness to a problem is satisfying if the problem evaluates to True under that witness. Next, we show that a valid set of edge cuts $C$ is a satisfying witness for Problem 4 iff the corresponding assignment to variables $x_{1},\dots,x_{n}$ is a satisfying witness for the 3-SAT formula.

First, consider a satisfying witness for Problem 4. By Construction 2, the cardinality of the witness, $|C|=m\times n$, will be exactly $M$, which is the minimum number of edge cuts required to ensure no bypass paths on the constructed graph. This implies that each variable $x_{i}$ has a Boolean assignment. By Construction 1, a strictly positive flow on the sub-graph of clause $c_{j}$ implies that $c_{j}$ is satisfied. By Construction 2, a strictly positive flow through the entire graph implies that all clauses in the 3-SAT formula are satisfied. Therefore, a satisfying witness to the 3-SAT formula can be constructed in polynomial time from a satisfying witness for an instance of Problem 4.

Next, we consider a satisfying witness for the 3-SAT formula. The Boolean assignment for each variable $x_{i}$ corresponds to edge cuts on the graph (see Fig. 7). Any Boolean assignment ensures that there is no bypass path on the graph since either all incoming edges or all outgoing edges for each variable $x_{i}$ are cut. This also corresponds to the minimum number of edge cuts required to cut all bypass paths, corresponding to $|C|=m\times n$. By Construction 1, a satisfying witness corresponds to a $Path(s_{j-1},s_{j})$ on the sub-graph for each clause $c_{j}$. By Construction 2, observe that there exists a strictly positive flow on the graph. Thus, we can construct a satisfying witness to an instance of Problem 4 in polynomial time from a satisfying witness to the 3-SAT formula. Therefore, any 3-SAT problem reduces to an instance of Problem 4, and thus, Problem 4 is NP-hard. Additionally, Problem 4 is NP-complete since we can check the cardinality of $C$, and whether $C$ is a valid set of edge cuts, in polynomial time. ∎

Corollary 1.

Problem 3 is NP-hard [60].

Proof.

By Theorem 5, Problem 4 is NP-complete, and therefore by Lemma 5, Problem 3 is NP-hard. ∎

Additionally, we can identify the computational complexity for the reactive setting. For the reactive setting, a valid set of edge cuts is similar to the static setting, except in how edges are grouped, which is discussed in Remark 6. Fig. 8 illustrates the graphs used for establishing the computational complexity in this setting. The optimization problem and its corresponding decision problem can be stated as follows.

Problem 5 (Reactive Obstacles Optimization Problem).

Given a graph $G$, identify a valid set of edge cuts $C$ such that the resulting flow $F$ is maximized over all possible sets of edge cuts, and such that $|C|$ is minimized for the flow $F$.

Note that a valid set of edge cuts for the reactive problem is different from a valid set of edge cuts for the static problem.

Problem 6 (Reactive Obstacles Decision Problem).

Given a graph $G$ and an integer $M\geq 0$, does there exist a valid set of cuts $C$ such that $|C|\leq M$?

Once again, we prove a reduction from 3-SAT, but to an instance of Problem 6 with a single history variable $q$. Given a 3-SAT formula, the construction of the graph follows from the static setting, but with a few key differences.

Construction 3 (Reduction from 3-SAT to Problem 6 with single history variable $q$).

Suppose we have an instance of the 3-SAT problem with $n$ variables $x_{1},\dots,x_{n}$ and $m$ clauses $c_{1},\dots,c_{m}$. Let $M\coloneqq n$. Using Construction 2, set up two graphs: $G$ and a copy $G^{(q,\mathtt{S})}$ for source $\mathtt{S}$ and the single history variable $q$. The key difference is that $G^{(q,\mathtt{S})}$ follows Construction 2 exactly, while in $G$, edges in a group need not have the same cut value. Furthermore, for each group in $G^{(q,\mathtt{S})}$, the cut value is set to the maximum edge-cut value in the corresponding group in $G$.
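The coupling between the two graphs in Construction 3 amounts to a group-wise maximum, as in the following sketch. The function name `lift_cuts` and the dictionary-based encoding of cut values and groups are assumptions for illustration.

```python
def lift_cuts(d, groups):
    """Map per-edge cut values on G to group-wise cut values on the copy.

    d:      dict mapping an edge of G to its cut value (0 or 1); edges in the
            same group may carry different values.
    groups: dict mapping a group name to the set of edges in that group.
    Returns a dict mapping each group name to the maximum cut value in it.
    """
    return {g: max((d.get(e, 0) for e in es), default=0)
            for g, es in groups.items()}
```

Under this coupling, a single cut edge in a group of $G$ cuts the whole corresponding group in the copy, which is why $n$ cuts on $G$ suffice in the reactive setting.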

Theorem 6.

Problem 6 is NP-complete and Problem 5 is NP-hard.

Proof.

The proof follows similarly from Theorem 5. In this setting, a witness for Problem 6 comprises the maximum edge cut value of each group in $G$. Construction 3 relates edge cuts on $G$ and $G^{(q,\mathtt{S})}$. This implies that edge cuts on $G$ are found under the condition that there is a strictly positive flow on $G^{(q,\mathtt{S})}$ under a static mapping of edges. The minimum set of edge cuts which ensures no bypass paths on $G$ has cardinality $n$, corresponding to only one of the sub-graphs having edge cuts. Furthermore, for each $x_i$, there will be one edge cut in one of the two groups (incoming or outgoing edges). Therefore, for each $x_i$, only the incoming or the outgoing edge group will have a maximum edge cut value of 1, corresponding to the Boolean assignment for $x_i$. A minimum cut on $G$ found under the conditions of no bypass paths on $G$ and a positive flow on $G^{(q,\mathtt{S})}$ results in a Boolean assignment that is a satisfying witness to the 3-SAT formula. Thus, we have a polynomial-time construction of a satisfying witness to the 3-SAT formula from a satisfying witness to Problem 6. This follows similarly to Theorem 5.

Likewise, a satisfying witness to the 3-SAT formula can be mapped to edge cuts on one of the sub-graphs of $G$. These edge cuts will be such that there is no bypass path on $G$, and will be the minimum set of edge cuts to accomplish this task, corresponding to $|C| = n$. Additionally, by construction of the graphs, this will correspond to a strictly positive flow on $G^{(q,\mathtt{S})}$. Thus, we can construct a satisfying witness to Problem 6 in polynomial time from a satisfying witness of the 3-SAT formula. Therefore, any 3-SAT problem reduces to an instance of Problem 6. As a result, Problem 6 is NP-complete and, following similarly to Corollary 1, Problem 5 is NP-hard. ∎

VIII Experiments

In this section, we demonstrate our framework on simulated and hardware experiments, and include runtime analysis. In the following experiments, examples with static test environments solve the routing optimization MILP-static to find the test strategy. Similarly, examples with reactive test environments solve MILP-reactive, and those with reactive dynamic agents solve MILP-agent, unless otherwise stated. These optimizations are solved using Gurobipy [61]. The reactive test agent strategies are synthesized using the temporal logic planning toolbox TuLiP [47].

In simulations and hardware experiments, we utilize Unitree A1 quadrupeds as both the system and test agents. The low-level control of the quadruped is managed through a motion primitive layer, which abstracts the underlying dynamics and facilitates transitions between primitives as described in [62]. This includes behaviors such as lying down, standing, walking, jumping, and reduced-order model-based waypoint tracking using a unicycle or single integrator model. These behaviors can be directly commanded by the autonomy layer provided by TuLiP. Individual motion primitives are implemented within our C++ motion primitive framework, with control laws, sensing, and estimation executed at 1 kHz.

Figure 9: Simulated experiment results. (a) Beaver rescue. (b) Motion primitive example. (c) Maze 1. (d) Simulated alternative trace, Maze 2. Yellow boxes are obstacles indicating states that are not navigable in $T_{\text{sys}}$. The gray quadruped is the system, and the yellow quadruped in (c) and (d) is the test agent. In (b), the system demonstrates primitives in the order: stand (1), stand (2), jump (3), and lie (4), before advancing to the goal (5). In (c) and (d), the test agent chooses to navigate off-grid after the test objective is realized.

VIII-A Simulation

Reactive Test Environment: The following two reactive examples were demonstrated on hardware in previous work [45]. The updated framework in this paper resulted in simulated test traces (see Figs. 9(a) and 9(b)) that are qualitatively similar to the hardware demo in [45]. Additionally, using the updated framework reduced the time to solve the optimization by three orders of magnitude.

VIII-A1 Beaver Rescue

The quadruped’s task is to rescue the beaver from the hallway and return it to the lab. The system objective is given as $\varphi_{\text{sys}} = \lozenge(\text{beaver} \wedge \lozenge\,\text{goal})$, where ‘beaver’ corresponds to the quadruped reaching the beaver, and ‘goal’ corresponds to the quadruped and the beaver reaching the safe location in the lab. The test objective is given as $\varphi_{\text{test}} = \lozenge\,\text{door}_1 \land \lozenge\,\text{door}_2$, ensuring that the quadruped will use different doors on the way to the beaver and back into the lab. The resulting test execution first shows the quadruped using $\text{door}_2$ to exit the lab into the hallway; after it reaches the beaver, $\text{door}_2$ is shut and the quadruped walks to $\text{door}_1$ to finally return to the lab. The reactive aspect can be observed as follows: if the quadruped had chosen to enter the hallway through $\text{door}_1$, then the resulting test execution would constrain access to $\text{door}_1$ when the quadruped attempts to re-enter the lab with the beaver. The simulated test trace is shown in Fig. 9(a).
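The objectives and the reactive door restriction of this example can be sketched on finite traces. This is only an illustration under stated assumptions: the trace encoding (a list of sets of atomic propositions), the finite-trace reading of "eventually", and the helper names `eventually`, `satisfies_sys_objective`, `satisfies_test_objective`, and `door_to_shut` are all hypothetical and are not the synthesis procedure of the paper.

```python
from typing import List, Optional, Set

Trace = List[Set[str]]  # each step holds the atomic propositions true at that step


def eventually(trace: Trace, prop: str, start: int = 0) -> bool:
    """Finite-trace 'eventually prop', evaluated from index `start`."""
    return any(prop in step for step in trace[start:])


def satisfies_sys_objective(trace: Trace) -> bool:
    """eventually (beaver and eventually goal)."""
    return any(
        "beaver" in step and eventually(trace, "goal", i)
        for i, step in enumerate(trace)
    )


def satisfies_test_objective(trace: Trace) -> bool:
    """eventually door1 and eventually door2."""
    return eventually(trace, "door1") and eventually(trace, "door2")


def door_to_shut(history: Trace) -> Optional[str]:
    """Once the beaver is reached, shut the first door that was used."""
    if not eventually(history, "beaver"):
        return None
    for step in history:
        for door in ("door1", "door2"):
            if door in step:
                return door
    return None


trace = [set(), {"door2"}, {"beaver"}, {"door1"}, {"goal"}]
print(door_to_shut(trace[:3]))            # door shut after reaching the beaver
print(satisfies_sys_objective(trace))     # system objective on the full trace
print(satisfies_test_objective(trace))    # test objective on the full trace
```

Swapping the first door in the trace changes the restriction returned by `door_to_shut`, which is the history dependence described above.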

VIII-A2 Motion Primitive Example

In this example, we test the motion primitives of the quadruped, given as ‘lie’, ‘jump’, and ‘stand’. The goal for the quadruped is to reach the beaver in the hallway. The test objective is given as $\varphi_{\text{test}} = \lozenge\,\text{jump} \wedge \lozenge\,\text{lie} \wedge \lozenge\,\text{stand}$, which ensures that each motion primitive is tested at least once, and the system objective is $\varphi_{\text{sys}} = \lozenge\,\text{goal}$, where ‘goal’ corresponds to the beaver location. The test setup includes doors that might be unlocked by the system demonstrating specific motion primitives. Our framework decides whether the doors are locked or unlocked according to which motion primitives have already been observed during the test. This is where the reactivity of the framework becomes apparent: if the quadruped had chosen a different set of doors and motion primitives, the resulting test execution would have been different. The simulated test trace is shown in Fig. 9(b).

Test Environment with Dynamic Agent

VIII-A3 Maze 1

The system (gray quadruped) wants to reach its goal location in the top left corner of the grid, and the test agent wants to route it through a series of states, labeled $I_1$, $I_2$, and $I_3$, shown in Fig. 9(c). The system specification and test objective are given as $\varphi_{\text{sys}} = \lozenge\,\text{goal}$ and $\varphi_{\text{test}} = \lozenge I_1 \land \lozenge I_2 \land \lozenge I_3$. The test agent (yellow quadruped) can move up on the center column of the grid, and its strategy is found using the flow-based synthesis framework. Observe that it blocks specific cells such that the quadruped cannot directly navigate to its goal through the center of the grid. Instead, the system quadruped is forced to visit the labeled cells, and only then does the test agent move into the parking state off the grid so as not to excessively constrain the system. The resulting test execution is shown in Fig. 9(c).

VIII-B Hardware Experiments

Figure 10: Refueling example experiment trace with yellow boxes representing static obstacles $\mathtt{Obs}$.
Figure 11: Resulting test execution on the Unitree A1 quadruped for static test environments. (a) Mars exploration experiment trace. (b) Mars exploration experiment snapshots.

Static Test Environment:

VIII-B1 Running Example

For this experiment we implemented Example 1 on the quadruped. The resulting test trace is shown in Fig. 12.

Figure 12: Experiment trace for Example 1.

For the following two examples, the system state also contains the fuel level. Thus, the auxiliary bidirectional constraints in MILP-static are such that the fuel level is abstracted away, meaning if a transition is cut for a specific fuel level, it is cut for all fuel levels.
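As a minimal sketch of what abstracting away the fuel level means for a cut, the snippet below lifts a cut on a grid transition to every state-level edge that projects onto it. The state encoding `(x, y, fuel)` and the helper name `lift_cut` are assumptions made for illustration; in the paper this coupling is expressed through constraints in MILP-static rather than through explicit enumeration.

```python
from typing import Iterable, Set, Tuple

Cell = Tuple[int, int]
State = Tuple[int, int, int]  # (x, y, fuel), a hypothetical encoding
StateEdge = Tuple[State, State]


def lift_cut(grid_cut: Tuple[Cell, Cell], edges: Iterable[StateEdge]) -> Set[StateEdge]:
    """All state-level edges whose grid projection equals `grid_cut`."""
    src_cell, dst_cell = grid_cut
    return {
        (u, v)
        for (u, v) in edges
        if (u[0], u[1]) == src_cell and (v[0], v[1]) == dst_cell
    }


edges = [
    ((0, 0, 3), (0, 1, 2)),
    ((0, 0, 2), (0, 1, 1)),
    ((0, 1, 2), (0, 0, 1)),  # reverse direction; a bidirectional cut would lift it too
]
cut = lift_cut(((0, 0), (0, 1)), edges)
print(sorted(cut))
```

Cutting the grid transition from cell (0, 0) to cell (0, 1) therefore removes that move for every fuel level at once.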

VIII-B2 Refueling

This example highlights that intermediate nodes need not always represent poses of the system. In addition to the coordinates $\mathbf{x} = (x, y)$, the quadruped state also tracks the fuel level $f$. A full fuel tank consists of 10 units of fuel. Every move on the grid reduces the fuel level by 1, and reaching the refueling station (in the bottom right corner of the grid) resets the fuel tank to full. The desired test behavior is to have the system visit a state that is too far away to reach the goal state with its available fuel; specifically, we want to see the system in the lower three rows of the grid with a fuel level lower than 2. The system objective is given as $\varphi_{\text{sys}} = \lozenge\,\text{goal} \land \square \neg(f = 0)$ and the test objective is $\varphi_{\text{test}} = \lozenge(y < 4 \land f < 2)$. Note that this test objective also includes states where the fuel tank is empty, $f = 0$; however, the MILP does not route the test execution through these unsafe states and instead routes it only through states where $f = 1$. Snapshots and the trace of the test execution are shown in Fig. 10. The color of the trace corresponds to the fuel level, and we observe that the obstacle configuration is such that the quadruped must visit the refueling station to successfully reach its goal location.
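The fuel dynamics and the test-objective predicate described above can be sketched as follows. The refueling-station coordinates `REFUEL_CELL`, the state encoding `(x, y, f)`, and the function names are hypothetical; only the tank size of 10, the unit fuel cost per move, the reset at the refueling station, and the predicates $y < 4 \land f < 2$ and $f \neq 0$ come from the text.

```python
from typing import Tuple

State = Tuple[int, int, int]  # (x, y, fuel)
FULL_TANK = 10
REFUEL_CELL = (4, 0)  # hypothetical coordinates of the refueling station


def step(state: State, next_cell: Tuple[int, int]) -> State:
    """Move to `next_cell`: fuel drops by 1, or resets at the refueling station."""
    _, _, fuel = state
    if fuel == 0:
        raise ValueError("cannot move with an empty tank")
    next_fuel = FULL_TANK if next_cell == REFUEL_CELL else fuel - 1
    return (next_cell[0], next_cell[1], next_fuel)


def in_test_region(state: State) -> bool:
    """Propositional part of the test objective: y < 4 and f < 2."""
    _, y, fuel = state
    return y < 4 and fuel < 2


def is_safe(state: State) -> bool:
    """Safety part of the system objective: the tank is never empty."""
    return state[2] != 0


# Among the states matching the test objective, only f = 1 is also safe.
candidates = [(0, 2, fuel) for fuel in range(3)]
print([s for s in candidates if in_test_region(s) and is_safe(s)])
```

The last line reflects the remark above: states with $f = 0$ satisfy the test predicate but violate safety, so only states with $f = 1$ remain as admissible intermediate states.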

VIII-B3 Mars Exploration

In this example, the system is tested for a combination of reachability, reaction, and avoidance sub-tasks. This example is inspired by a planetary rover’s exploration of the Martian surface. Consequently, the grid world has states designated as ‘rock’, ‘ice’, and ‘drop-off’, denoting sample locations and the drop-off position, respectively. In addition to the coordinates $\mathbf{x}$, the quadruped state also contains the fuel level $f$ that decreases by 1 for every transition on the grid. The maximum fuel capacity is 10 units and is reset to full at the refueling locations labeled ‘R’. The system objective states that the quadruped must reach its goal location, labeled ‘T’, and if it picks up a sample, it shall drop it off at the drop-off location, while not running out of fuel. This is captured in the system objective

$$\varphi_{\text{sys}} = \lozenge T \land \square \neg(f = 0) \land \square(\text{ice} \lor \text{rock} \rightarrow \lozenge\,\text{drop-off}).$$

The test objective corresponds to the triggers of the reaction sub-task. Specifically, the quadruped is required to collect a ‘rock’ sample and an ‘ice’ sample, and is routed such that a successful run requires the quadruped to refuel:

$$\varphi_{\text{test}} = \lozenge\,\text{rock} \land \lozenge\,\text{ice} \land \lozenge(d > f),$$

where $d = |\mathbf{x} - \mathbf{x}_{\text{goal}}|$ is the distance to the goal and $f$ is the fuel level. The experiment trace and snapshots of the hardware test execution are shown in Figs. 11(a) and 11(b). From the experiment trace, the static obstacles are placed such that the quadruped has to pick up rock and ice samples, refuel twice, and then drop off samples before reaching its goal. The test environment for the hardware run in Fig. 11 corresponds to a sub-optimal solution of MILP-static with a flow of 1. This sub-optimal solution still ensures that the system is routed such that the test objective is satisfied. In Table V, we list the runtimes for obtaining the optimal solution for this example.

Figure 13: (a) Grid world layout with cells traversable by the test agent marked. Dark gray cells are not traversable by either agent. (b)-(d) Reactive cuts in q0, q6, and q7, respectively. Black edges indicate reactive cuts corresponding to the history variables for the Maze 2 experiment. Note that the cuts are not bidirectional. The history variable states q0, q6, and q7 can be inferred from $\mathcal{B}_{\pi}$ illustrated in Fig. 3, and correspond to the initial state, visiting $I_1$ first, and visiting $I_2$ first.
Figure 14: Resulting test execution for the Maze 2 experiment with a dynamic test agent. (a) Maze 2 trace. (b) Maze 2 experiment snapshots.

Reactive Dynamic Agent:

VIII-B4 Patrolling

This example is similar to the static refueling example, except that the test environment now consists of a test agent and static obstacles (see Fig. 1). The system (gray quadruped) starts in the lower right corner and must reach its goal in the lower left corner of the grid without running out of fuel, which is encoded in the system objective $\varphi_{\text{sys}} = \lozenge T \land \square \neg(f = 0)$. The refueling station is denoted ‘R’ in Fig. 1. Once again, the test objective routes the system through a state from which a successful test execution requires it to refuel,

$$\varphi_{\text{test}} = \lozenge(d > f),$$

where $d = |\mathbf{x} - \mathbf{x}_{\text{goal}}|$ is the distance to the goal. The test agent can move up and down the third column of the grid, and can leave the grid from the first and last rows to a parking state. As shown in the trace and hardware snapshots in Fig. 1, our framework chooses to place a static obstacle near the start state, and the test agent blocks the system from directly navigating to the goal (see panels 2, 3, and 4 in Fig. 1) until its fuel level is low enough, thus requiring it to refuel. For this experiment, we solve MILP-agent with the objective (12) for numerical stability in Gurobi.
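The proposition $d > f$ can be evaluated per state as in the following sketch. The text does not state which norm defines $d$; the Manhattan distance used here is an assumption that suits a grid with axis-aligned moves, and the names `manhattan` and `must_refuel` are hypothetical.

```python
from typing import Tuple

Cell = Tuple[int, int]
State = Tuple[int, int, int]  # (x, y, fuel)


def manhattan(a: Cell, b: Cell) -> int:
    """Grid distance under axis-aligned unit moves (assumed norm)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def must_refuel(state: State, goal: Cell) -> bool:
    """True when the goal is farther away than the remaining fuel allows."""
    x, y, fuel = state
    return manhattan((x, y), goal) > fuel


goal = (0, 0)
print(must_refuel((4, 0, 3), goal))  # distance 4, fuel 3
print(must_refuel((4, 0, 4), goal))  # distance 4, fuel 4
```

Because obstacles can only lengthen a path, the obstacle-free distance is a lower bound, so any state where this predicate holds cannot reach the goal without refueling.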

VIII-B5 Maze 2

In this example, the system quadruped starts in the bottom left corner of the grid, and must reach its goal location in the top right corner. The grid world is a $5 \times 5$ grid, with a symmetric obstacle configuration shown in yellow in Fig. 13(a). In this example, the test environment consists of a test agent that can traverse along the center row and center column of the grid. While the test environment can also place static obstacles, it realizes the test strategy entirely via the test agent. The system objective is given as $\varphi_{\text{sys}} = \lozenge T$. The test objective consists of two visit tasks in arbitrary order, encoded as

$$\varphi_{\text{test}} = \lozenge I_1 \land \lozenge I_2,$$

where $I_1$ and $I_2$ correspond to the designated locations on the grid. The specification product is the same as shown in Fig. 3. To route the test execution through the test objective acceptance states, we need to find cuts for the history variables q0, q6, and q7. The reactive cuts found by the flow-based synthesis procedure are shown in Figs. 13(b)-13(d). The trace and snapshots of the resulting test execution are shown in Figs. 14(a) and 14(b). We observe that the system quadruped decides to take the top path first, visits $I_2$ (see panel 2 in Fig. 14), and is blocked by the test agent (see panel 3). It then decides to try navigating through the center of the grid, and is again blocked by the test agent (see panel 4). Subsequently, it decides to try the bottom path, visits $I_1$ (see panel 5), and successfully reaches the goal without any further test agent intervention. If the system had decided to visit $I_1$ first, the adaptive test agent strategy would have blocked the system from reaching the goal directly from $I_1$ until it visited $I_2$. This is an example with a maximum flow of $F = 2$, corresponding to the two unique ways for the system to reach the goal. For an alternative system controller in which the system chooses to approach the goal through $I_1$, the simulated trace resulting from the test agent strategy is shown in Fig. 9(d).
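To make the link between the flow value and the number of distinct routes concrete, the following sketch computes a unit-capacity maximum flow with the Edmonds-Karp algorithm on a toy graph. The graph is hypothetical and is not the Maze 2 product graph; the sketch only illustrates that, with unit capacities, the maximum flow counts edge-disjoint source-to-sink paths and that cutting an edge lowers it.

```python
from collections import defaultdict, deque
from typing import Hashable, Iterable, Tuple


def max_flow_unit(edges: Iterable[Tuple[Hashable, Hashable]], source, sink) -> int:
    """Edmonds-Karp maximum flow with unit capacity on every directed edge."""
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v in edges:
        cap[(u, v)] += 1
        adj[u].add(v)
        adj[v].add(u)  # reverse arc for the residual graph
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Push one unit of flow along the path found.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1


# Toy graph: three routes from S to T, via I1, I2, or a middle node M.
edges = [("S", "I1"), ("I1", "T"), ("S", "I2"), ("I2", "T"), ("S", "M"), ("M", "T")]
remaining = [e for e in edges if e != ("S", "M")]  # cut the middle route

print(max_flow_unit(edges, "S", "T"))
print(max_flow_unit(remaining, "S", "T"))
```

In this toy graph, removing the middle route leaves two edge-disjoint routes, each through one of the labeled nodes.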

VIII-C Runtimes

Table V showcases runtimes for simulated and hardware experiments involving static or reactive obstacles. Table VI shows runtimes for simulated and hardware experiments with a dynamic test agent. The size of the automata and graphs reported in these tables corresponds to the tuple $(|V|, |E|)$, where $|V|$ is the number of nodes and $|E|$ the number of edges. To evaluate the scalability of this framework, we include runtimes on randomized grid worlds for specification sub-tasks in Table IV for static obstacles, and in Table III for reactive obstacles. These experiments were conducted on an Apple M2 Pro with 16 GB of RAM. The Mars exploration example corresponds to solving an MILP with over 13,000 binary variables, for which the solver takes 46.6 s to find the optimal solution. For examples involving a dynamic agent such as Maze 1 and Maze 2, our framework iterates through counterexamples that are not dynamically feasible for the test agent until it finds a solution.
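The counterexample-guided iteration for dynamic agents can be sketched as a simple loop. The callables `solve_milp` and `synthesize_agent`, the representation of a cut set as a `frozenset`, and the way excluded solutions are passed back are all hypothetical stand-ins; the actual MILP-agent formulation and the GR(1) synthesis call are not reproduced here.

```python
from typing import Callable, FrozenSet, Hashable, List, Optional, Tuple

Cuts = FrozenSet[Hashable]


def counterexample_guided_synthesis(
    solve_milp: Callable[[List[Cuts]], Optional[Cuts]],
    synthesize_agent: Callable[[Cuts], Optional[str]],
    max_iters: int = 50,
) -> Tuple[Optional[str], List[Cuts]]:
    """Re-solve the optimization until the agent can realize the cuts.

    Returns the agent strategy (or None) and the list of counterexamples,
    i.e. cut sets that were not dynamically feasible for the test agent.
    """
    excluded: List[Cuts] = []
    for _ in range(max_iters):
        cuts = solve_milp(excluded)
        if cuts is None:  # no further candidate solution
            break
        strategy = synthesize_agent(cuts)
        if strategy is not None:
            return strategy, excluded
        excluded.append(cuts)  # record the counterexample and re-solve
    return None, excluded


# Toy stand-ins: three candidate cut sets, only the last one is realizable.
candidates = [frozenset({"e1"}), frozenset({"e2"}), frozenset({"e3"})]


def toy_milp(excluded: List[Cuts]) -> Optional[Cuts]:
    remaining = [c for c in candidates if c not in excluded]
    return remaining[0] if remaining else None


def toy_agent(cuts: Cuts) -> Optional[str]:
    return "block e3" if cuts == frozenset({"e3"}) else None


strategy, counterexamples = counterexample_guided_synthesis(toy_milp, toy_agent)
print(strategy, len(counterexamples))
```

The length of the returned counterexample list plays the role of the counterexample count $|\mathtt{C}_{\text{ex}}|$ reported in Table VI.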

Table II: Graph Construction Runtimes (with mean and standard deviation) for Random Grid World Experiments. Entries are graph construction times in seconds.

| Experiment | $\lvert AP \rvert$ | $\lvert \mathcal{B}_{\pi} \rvert$ | $5 \times 5$ | $10 \times 10$ | $15 \times 15$ | $20 \times 20$ |
|---|---|---|---|---|---|---|
| Reachability | 2 | (4, 9) | 0.046 ± 0.001 | 0.224 ± 0.0056 | 0.554 ± 0.009 | 1.078 ± 0.011 |
| Reachability | 3 | (8, 27) | 0.344 ± 0.007 | 1.661 ± 0.022 | 4.004 ± 0.048 | 7.376 ± 0.061 |
| Reachability | 4 | (16, 81) | 1.997 ± 0.077 | 9.895 ± 0.109 | 23.512 ± 0.179 | 43.188 ± 0.454 |
| Reachability & Reaction | 3 | (6, 21) | 0.090 ± 0.001 | 0.424 ± 0.016 | 1.037 ± 0.004 | 2.044 ± 0.013 |
| Reachability & Reaction | 5 | (20, 155) | 1.628 ± 0.087 | 7.560 ± 0.023 | 18.019 ± 0.129 | 33.539 ± 0.144 |
| Reachability & Reaction | 7 | (68, 1065) | 44.809 ± 0.996 | 209.612 ± 1.732 | 488.611 ± 6.308 | 869.060 ± 16.870 |
| Reachability & Safety | 3 | (6, 18) | 0.102 ± 0.002 | 0.508 ± 0.010 | 1.278 ± 0.022 | 2.557 ± 0.023 |
| Reachability & Safety | 4 | (6, 18) | 0.116 ± 0.002 | 0.590 ± 0.009 | 1.485 ± 0.024 | 2.918 ± 0.046 |
| Reachability & Safety | 5 | (6, 18) | 0.179 ± 0.027 | 0.960 ± 0.037 | 2.329 ± 0.072 | 4.482 ± 0.116 |

Table III: Run Times (with mean and standard deviation) for Random Grid World Experiments solving MILP-reactive. Each entry lists the optimization time in seconds, followed by the success rate (%).

| Experiment | $\lvert AP \rvert$ | $\lvert \mathcal{B}_{\pi} \rvert$ | $5 \times 5$ | $10 \times 10$ | $15 \times 15$ | $20 \times 20$ |
|---|---|---|---|---|---|---|
| Reachability | 2 | (4, 9) | 5.63 ± 13.43 / 100 | 64.62 ± 38.75 / 100 | 67.38 ± 25.47 / 100 | 68.63 ± 31.12 / 100 |
| Reachability | 3 | (8, 27) | 23.36 ± 38.15 / 100 | 61.68 ± 35.12 / 100 | 91.54 ± 31.41 / 100 | 117.82 ± 34.89 / 100 |
| Reachability | 4 | (16, 81) | 22.49 ± 36.33 / 100 | 83.52 ± 29.25 / 100 | 171.49 ± 50.72 / 100 | 317.62 ± 89.08 / 100 |
| Reachability & Reaction | 3 | (6, 21) | 5.97 ± 13.21 / 100 | 61.06 ± 34.67 / 100 | 71.64 ± 41.03 / 100 | 85.20 ± 19.49 / 100 |
| Reachability & Reaction | 5 | (20, 155) | 17.19 ± 25.51 / 100 | 78.44 ± 34.71 / 100 | 159.91 ± 76.63 / 100 | 279.86 ± 148.23 / 90 |
| Reachability & Reaction | 7 | (68, 1065) | 52.71 ± 41.23 / 100 | 331.32 ± 187.28 / 90 | 585.21 ± 67.58 / 15 | 600.00 ± 0.00 / 0 |
| Reachability & Safety | 3 | (6, 18) | 0.76 ± 1.52 / 100 | 70.82 ± 89.70 / 100 | 63.68 ± 27.54 / 100 | 80.58 ± 20.79 / 100 |
| Reachability & Safety | 4 | (6, 18) | 0.15 ± 0.29 / 100 | 71.47 ± 80.61 / 100 | 59.59 ± 38.92 / 100 | 76.02 ± 27.11 / 100 |
| Reachability & Safety | 5 | (6, 18) | 0.12 ± 0.18 / 100 | 94.68 ± 88.04 / 100 | 71.34 ± 30.89 / 100 | 82.54 ± 22.69 / 100 |

Table IV: Run Times (with mean and standard deviation) for Random Grid World Experiments solving MILP-static. Each entry lists the optimization time in seconds, followed by the success rate (%).

| Experiment | $\lvert AP \rvert$ | $\lvert \mathcal{B}_{\pi} \rvert$ | $5 \times 5$ | $10 \times 10$ | $15 \times 15$ | $20 \times 20$ |
|---|---|---|---|---|---|---|
| Reachability | 2 | (4, 9) | 8.17 ± 13.14 / 100 | 54.07 ± 17.98 / 100 | 60.17 ± 0.12 / 100 | 60.17 ± 0.10 / 100 |
| Reachability | 3 | (8, 27) | 27.78 ± 21.71 / 100 | 60.17 ± 0.10 / 100 | 60.48 ± 0.86 / 100 | 74.02 ± 38.70 / 100 |
| Reachability | 4 | (16, 81) | 52.60 ± 14.05 / 100 | 60.42 ± 0.34 / 100 | 82.02 ± 41.26 / 100 | 265.41 ± 203.51 / 80 |
| Reachability & Reaction | 3 | (6, 21) | 10.62 ± 14.85 / 100 | 60.09 ± 0.06 / 100 | 60.23 ± 0.24 / 100 | 60.34 ± 0.46 / 100 |
| Reachability & Reaction | 5 | (20, 155) | 20.41 ± 19.21 / 100 | 67.77 ± 31.90 / 100 | 95.31 ± 116.65 / 95 | 268.50 ± 222.14 / 75 |
| Reachability & Reaction | 7 | (68, 1065) | 36.64 ± 23.34 / 100 | 110.63 ± 92.81 / 100 | 419.77 ± 214.30 / 55 | 556.38 ± 131.06 / 10 |
| Reachability & Safety | 3 | (6, 18) | 1.27 ± 1.47 / 100 | 60.08 ± 0.06 / 100 | 57.27 ± 12.61 / 100 | 60.32 ± 0.24 / 100 |
| Reachability & Safety | 4 | (6, 18) | 0.17 ± 0.23 / 100 | 60.06 ± 0.05 / 100 | 60.14 ± 0.10 / 100 | 60.30 ± 0.19 / 100 |
| Reachability & Safety | 5 | (6, 18) | 0.11 ± 0.16 / 100 | 54.15 ± 17.80 / 100 | 60.17 ± 0.09 / 100 | 60.29 ± 0.26 / 100 |

Table V: Runtimes for Simulated and Hardware Experiments showing sizes of the automata and graphs

| Experiment | $\lvert \mathcal{B}_{\pi} \rvert$ | $\lvert T_{\text{sys}} \rvert$ | $\lvert G \rvert$ | $G$ [s] | $\lvert\text{BinVars}\rvert$ | $\lvert\text{ContVars}\rvert$ | $\lvert\text{Constraints}\rvert$ | Opt [s] | Flow | $\lvert C \rvert$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Example 1 | (4, 9) | (15, 53) | (27, 96) | 0.0270 | 73 | 87 | 540 | 0.0003 | 3.0 | 14 |
| Refueling | (6, 18) | (265, 1047) | (332, 1346) | 0.6655 | 1014 | 1261 | 19819 | 0.8682 | 2.0 | 199 |
| Mars Exploration | (36, 354) | (376, 1522) | (4073, 17251) | 75.8313 | 13178 | 16604 | 1646480 | 46.6209 | 2.0 | 1641 |
| Example 2 | (8, 27) | (6, 17) | (20, 56) | 0.0452 | 25 | 115 | 409 | 0.0003 | 2.0 | 4 |
| Beaver Rescue | (12, 54) | (7, 19) | (15, 39) | 0.0470 | 8 | 154 | 441 | 0.0001 | 2.0 | 2 |
| Motion Primitives | (16, 81) | (15, 42) | (72, 207) | 0.4286 | 106 | 761 | 2606 | 0.0005 | 3.0 | 15 |

Table VI: Runtimes for Simulated and Hardware Experiments with Dynamic Agents

| Experiment | $\lvert \mathcal{B}_{\pi} \rvert$ | $\lvert T_{\text{sys}} \rvert$ | $\lvert G \rvert$ | $G$ [s] | $\lvert\text{BinVars}\rvert$ | Opt [s] | Controller [s] | $\lvert \mathtt{C}_{\text{ex}} \rvert$ | Flow | $\lvert C \rvert$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Maze 1 | (16, 81) | (26, 80) | (196, 604) | 1.6226 | 355 | 0.0007 | 68.7052 | 3 | 1.0 | 3 |
| Patrolling | (6, 18) | (386, 1539) | (210, 831) | 0.4573 | 621 | 6.0535 | 16.1191 | 0 | 1.0 | 13 |
| Maze 2 | (8, 27) | (21, 66) | (80, 252) | 0.2195 | 176 | 0.0160 | 5.0072 | 5 | 2.0 | 8 |

For randomized experiments, we time out if the Gurobi fails to find a feasible solution to the MILP within 10 min. If it finds a feasible solution within 10 minutes, we allocate an additional minute for the optimizer reach the optimal, otherwise terminating the optimization with a feasible solution. For these experiments, we increase the length of the system and test objective for the following three classes of specification patterns: i) reachability, ii) reachability and reaction, and iii) reachability and safety. For reachability patterns, the set AP comprises of atomic propositions needed to describe the system and test objectives as follows, φsys=p0subscript𝜑syssubscript𝑝0\varphi_{\text{sys}}=\operatorname{\rotatebox[origin={c}]{45.0}{$\Box$}}p_{0}italic_φ start_POSTSUBSCRIPT sys end_POSTSUBSCRIPT = □ italic_p start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and φtest=i=1npisubscript𝜑testsuperscriptsubscript𝑖1𝑛subscript𝑝𝑖\varphi_{\text{test}}=\bigwedge_{i=1}^{n}\operatorname{\rotatebox[origin={c}]{% 45.0}{$\Box$}}p_{i}italic_φ start_POSTSUBSCRIPT test end_POSTSUBSCRIPT = ⋀ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT □ italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, and the total number of atomic propositions are |AP|=|{p0,,pn}|=n+1𝐴𝑃subscript𝑝0subscript𝑝𝑛𝑛1|AP|=|\{p_{0},\ldots,p_{n}\}|=n+1| italic_A italic_P | = | { italic_p start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , … , italic_p start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT } | = italic_n + 1. 
Similarly, for reachability and reaction patterns (case ii), we have φsys=p1i=2n(piqi)subscript𝜑syssubscript𝑝1superscriptsubscript𝑖2𝑛subscript𝑝𝑖subscript𝑞𝑖\varphi_{\text{sys}}=\operatorname{\rotatebox[origin={c}]{45.0}{$\Box$}}p_{1}% \wedge\bigwedge_{i=2}^{n}\square(p_{i}\rightarrow\operatorname{\rotatebox[orig% in={c}]{45.0}{$\Box$}}q_{i})italic_φ start_POSTSUBSCRIPT sys end_POSTSUBSCRIPT = □ italic_p start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ∧ ⋀ start_POSTSUBSCRIPT italic_i = 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT □ ( italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT → □ italic_q start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) and φtest=p0i=2npisubscript𝜑testsubscript𝑝0superscriptsubscript𝑖2𝑛subscript𝑝𝑖\varphi_{\text{test}}=\operatorname{\rotatebox[origin={c}]{45.0}{$\Box$}}p_{0}% \wedge\bigwedge_{i=2}^{n}\operatorname{\rotatebox[origin={c}]{45.0}{$\Box$}}p_% {i}italic_φ start_POSTSUBSCRIPT test end_POSTSUBSCRIPT = □ italic_p start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∧ ⋀ start_POSTSUBSCRIPT italic_i = 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT □ italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, with |AP|=|{p0,,pn,q2,,qn}|=2n𝐴𝑃subscript𝑝0subscript𝑝𝑛subscript𝑞2subscript𝑞𝑛2𝑛|AP|=|\{p_{0},\ldots,p_{n},q_{2},\ldots,q_{n}\}|=2n| italic_A italic_P | = | { italic_p start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , … , italic_p start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT , italic_q start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_q start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT } | = 2 italic_n. 
In the reachability and safety case (iii), only the length of the system objective changes: $\varphi_{\text{sys}} = \lozenge p_1 \wedge \bigwedge_{i=2}^{n} \square \neg p_i$ and $\varphi_{\text{test}} = \lozenge p_0$, with $|AP| = |\{p_0, \ldots, p_n\}| = n+1$. Improving the runtimes for graph construction and controller synthesis subroutines is orthogonal to the focus of this paper. Since the test synthesis framework is carried out offline, we observe reasonable runtimes for medium-sized problems with hundreds to thousands of integer variables. An interesting direction for future research involves identifying good convex relaxations of the MILPs to further improve scalability.
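As an illustration of how the three pattern classes scale with $n$, the following minimal sketch builds the system and test objectives as strings and collects the corresponding set of atomic propositions. The function names and the ASCII operator syntax (`<>` for eventually, `[]` for always, `!` for negation) are assumptions made for this example only and are not taken from the implementation used in the experiments.

```python
def reachability(n):
    """Case (i): the system reaches p0; the test objective visits p1..pn."""
    phi_sys = "<>p0"
    phi_test = " & ".join(f"<>p{i}" for i in range(1, n + 1))
    ap = {f"p{i}" for i in range(n + 1)}
    return phi_sys, phi_test, ap


def reachability_reaction(n):
    """Case (ii): reaction sub-formulas [](p_i -> <>q_i) for i = 2..n."""
    reactions = [f"[](p{i} -> <>q{i})" for i in range(2, n + 1)]
    phi_sys = " & ".join(["<>p1"] + reactions)
    phi_test = " & ".join(["<>p0"] + [f"<>p{i}" for i in range(2, n + 1)])
    ap = {f"p{i}" for i in range(n + 1)} | {f"q{i}" for i in range(2, n + 1)}
    return phi_sys, phi_test, ap


def reachability_safety(n):
    """Case (iii): safety sub-formulas []!p_i for i = 2..n; test objective is fixed."""
    safety = [f"[]!p{i}" for i in range(2, n + 1)]
    phi_sys = " & ".join(["<>p1"] + safety)
    phi_test = "<>p0"
    ap = {f"p{i}" for i in range(n + 1)}
    return phi_sys, phi_test, ap


if __name__ == "__main__":
    for build in (reachability, reachability_reaction, reachability_safety):
        phi_sys, phi_test, ap = build(3)
        print(f"{build.__name__}: sys = {phi_sys} | test = {phi_test} | |AP| = {len(ap)}")
```

The size of the returned set matches the counts stated above: $n+1$ propositions in cases (i) and (iii), and $2n$ in case (ii).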

IX Comparison to Reactive Synthesis

We presented an approach to solve Problems 1 and 2 leveraging tools from automata theory and network flow optimization. In particular, for Problem 2, we rely on the optimization solution to construct a GR(1) specification to reactively synthesize a test agent strategy. One indication of the optimization step being necessary is the computational complexity of the problem. If the problem data are consistent, there exists a GR(1) specification for the test agent that would solve the problem, but directly expressing this specification is impractical. Essentially, the challenge is in finding the restrictions on system actions, which are then captured in the sub-formulae of the GR(1) specification. In this section, we argue that we cannot solve Problems 1 and 2 solely via synthesis from an LTL specification.

To the authors’ knowledge, directly capturing the different perspectives of the system and the test agent in this neither fully adversarial nor fully cooperative setting is not possible with current state-of-the-art approaches in GR(1) synthesis. Particularly in the reactive setting, the test strategy must ensure that from the system’s perspective, there always exists a path to the system goal. To capture this constraint, we reason over a second product graph that represents the system perspective. It is not obvious how this semi-cooperative setting can be directly encoded as a synthesis problem in common temporal logics.

In the static setting, the problem can be posed on a single graph. However, it is difficult to find the set of static obstacles directly from GR(1) synthesis. Every state in the winning set describes an edge-cut combination, but qualitative GR(1) synthesis cannot maximize the flow or minimize the cuts. Furthermore, the winning set can include states that vacuously satisfy the formula, i.e., not allowing the system any path to the goal. Finally, the combinatorial complexity of the problem would manifest as follows. Although the time complexity of GR(1) synthesis is $O(N^3)$ in the number of states $N$, we require an exponential number of states to characterize the GR(1) formula. For example, in Figure 15, this is illustrated for the GR(1) formula:

\[
\square \varphi^{\text{dyn}}_{\text{sys}} \wedge \square\lozenge \mathtt{T} \rightarrow \square \varphi^{\text{dyn}}_{\text{test}} \wedge \square \varphi^{\text{aux}\_\text{dyn}}_{\text{test}} \wedge \square\lozenge I_{\text{aux}},
\]

where $\varphi^{\text{dyn}}_{\text{sys}}$ captures the system transitions on the grid world, $\varphi^{\text{dyn}}_{\text{test}}$ captures the dynamics of the test environment, and $\varphi^{\text{aux}\_\text{dyn}}_{\text{test}}$ and $I_{\text{aux}}$ capture the $\lozenge I$ condition in GR(1) form. In this example, each edge in the system transition system $T_{\text{sys}}$ can take 0/1 values, and once an edge is cut, it remains cut and the system cannot take a transition that corresponds to a cut edge. Due to this, the number of states $N$ to describe the GR(1) formula includes the $2^{|T_{\text{sys}}.E|}$ states that characterize the edge cuts. As seen in Figure 15, the direct GR(1) synthesis approach returns a trivial solution corresponding to an impossible setting for the system. Finally, even when an acceptable solution is returned, the problem being at least NP-hard will result in the combinatorial complexity manifesting in the synthesis approach.
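To give a sense of scale for this blow-up, the short sketch below counts the edges of a grid world and the resulting number of 0/1 edge-cut assignments. It assumes a 4-connected grid with one directed edge per direction of travel and no self-loops, which may differ from the transition systems used in the experiments; the helper names are hypothetical. Since $N$ includes at least these $2^{|E|}$ states, the cube of this count is only a lower bound on the $N^3$ term.

```python
def num_grid_edges(rows, cols):
    """Directed edges of a 4-connected grid, counting both directions of travel."""
    undirected = rows * (cols - 1) + cols * (rows - 1)
    return 2 * undirected


def edge_cut_states(num_edges):
    """Number of 0/1 assignments to the edges, i.e., candidate edge-cut combinations."""
    return 2 ** num_edges


for k in (2, 3, 5):
    e = num_grid_edges(k, k)
    n_cut = edge_cut_states(e)
    # n_cut ** 3 lower-bounds the N^3 term, since N includes at least n_cut states.
    print(f"{k}x{k} grid: |E| = {e}, 2^|E| = {n_cut:.3e}, (2^|E|)^3 = {n_cut ** 3:.3e}")
```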

One key advantage of the network flow optimization is reasoning over flows as opposed to paths, which allows for tractable implementations. These insights from network flow optimization in this work can help in driving further research along these directions.
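The contrast between flows and paths can be seen on a small example. The sketch below is not the MILP from this paper. It is a textbook Edmonds-Karp max-flow computation on a unit-capacity grid (an assumed stand-in for a grid world), placed next to a brute-force count of simple paths between the same two corners. The max-flow value is obtained from a few breadth-first searches, whereas the number of simple paths grows combinatorially with the grid size. All function names are hypothetical.

```python
from collections import deque


def grid_edges(k):
    """Directed unit-capacity edges of a 4-connected k x k grid."""
    edges = []
    for r in range(k):
        for c in range(k):
            for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < k and 0 <= cc < k:
                    edges.append(((r, c), (rr, cc)))
    return edges


def max_flow(edges, source, sink):
    """Edmonds-Karp: repeatedly augment along shortest residual paths."""
    cap, adj = {}, {}
    for u, v in edges:
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap.setdefault((v, u), 0)  # residual arc
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Bottleneck residual capacity along the path found.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[(parent[v], v)])
            v = parent[v]
        # Push the bottleneck amount and update residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= bottleneck
            cap[(v, u)] += bottleneck
            v = u
        flow += bottleneck


def count_simple_paths(edges, source, sink):
    """Brute-force enumeration of simple paths (exponential in general)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)

    def dfs(u, visited):
        if u == sink:
            return 1
        total = 0
        for v in adj.get(u, []):
            if v not in visited:
                visited.add(v)
                total += dfs(v, visited)
                visited.remove(v)
        return total

    return dfs(source, {source})


for k in (2, 3, 4):
    e = grid_edges(k)
    s, t = (0, 0), (k - 1, k - 1)
    print(f"{k}x{k}: max flow = {max_flow(e, s, t)}, simple paths = {count_simple_paths(e, s, t)}")
```

By the max-flow min-cut theorem, the max-flow value computed here also equals the minimum number of unit-capacity edges whose removal disconnects the two corners, again without enumerating any paths.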

Figure 15: Solution returned by GR(1) synthesis and the network flow optimization in the case of static constraints

X Conclusion and Future Work

We presented a framework to synthesize least-restrictive strategies for test environments according to specified system and test objectives. To do this, we formulate a network flow-based MILP corresponding to the types of agents available in the test environment. In the case of a dynamic test agent, we parse the solution of the MILP to synthesize a test agent strategy via reactive synthesis. Furthermore, we use a counterexample-guided approach to find a realizable test agent strategy. Our problem is shown to be NP-hard, yet the MILP can handle medium-sized problem instances. Our test strategies are such that the system is minimally restricted while routing the test execution through the test objective without creating a livelock. Therefore, a test execution in which the system fails to meet the system objective is solely the fault of the system, and not due to the test environment.

There are several exciting future directions. First, we aim to extend this framework to automatically select dynamic test agents from a library. This selection can be optimized to meet user-defined metrics such as testing effort or cost. Second, we wish to improve the runtime of our algorithm by using symbolic methods to speed up graph construction and by exploring convex relaxations of the MILP. More broadly, we want to investigate how to incorporate test metrics such as coverage and difficulty into our framework.

Acknowledgment

The authors acknowledge Emily Fourney, Chris Umans, Scott Livingston, Joel Burdick, Ioannis Filippidis, Mani Chandy, and Lijun Chen for useful discussions.

References

  • [1] I. S. Organization, “Road vehicles: Safety of the intended functionality (ISO Standard No. 21448:2022),” 2022. https://www.iso.org/standard/77490.html, Last accessed on 2024-04-11.
  • [2] Zoox, “Putting Zoox to the test: preparing for the challenges of the road,” 2021. https://zoox.com/journal/structured-testing/, Last accessed on 2024-04-11.
  • [3] N. Webb, D. Smith, C. Ludwick, T. Victor, Q. Hommes, F. Favaro, G. Ivanov, and T. Daniel, “Waymo’s safety methodologies and safety readiness determinations,” 2020.
  • [4] A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, “CARLA: An open urban driving simulator,” in Conference on Robot Learning, pp. 1–16, PMLR, 2017.
  • [5] D. J. Fremont, T. Dreossi, S. Ghosh, X. Yue, A. L. Sangiovanni-Vincentelli, and S. A. Seshia, “Scenic: a language for scenario specification and scene generation,” in Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 63–78, 2019.
  • [6] A. Gambi, T. Huynh, and G. Fraser, “Generating effective test cases for self-driving cars from police reports,” in Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 257–267, 2019.
  • [7] C. Stark, C. Medrano-Berumen, and M. Akbaş, “Generation of autonomous vehicle validation scenarios using crash data,” in 2020 SoutheastCon, pp. 1–6, 2020.
  • [8] G. Lou, Y. Deng, X. Zheng, M. Zhang, and T. Zhang, “Testing of autonomous driving systems: where are we and where should we go?,” in Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 31–43, 2022.
  • [9] A. Corso, R. Moss, M. Koren, R. Lee, and M. Kochenderfer, “A survey of algorithms for black-box safety validation of cyber-physical systems,” Journal of Artificial Intelligence Research, vol. 72, pp. 377–428, 2021.
  • [10] H. Winner, K. Lemmer, T. Form, and J. Mazzega, “Pegasus—first steps for the safe introduction of automated driving,” in Road Vehicle Automation 5, pp. 185–195, Springer, 2019.
  • [11] L. Li, W.-L. Huang, Y. Liu, N.-N. Zheng, and F.-Y. Wang, “Intelligence testing for autonomous vehicles: A new approach,” IEEE Transactions on Intelligent Vehicles, vol. 1, no. 2, pp. 158–166, 2016.
  • [12] G. E. Mullins, P. G. Stankiewicz, R. C. Hawthorne, and S. K. Gupta, “Adaptive generation of challenging scenarios for testing and evaluation of autonomous vehicles,” Journal of Systems and Software, vol. 137, pp. 197–215, 2018.
  • [13] A. Corso, P. Du, K. Driggs-Campbell, and M. J. Kochenderfer, “Adaptive stress testing with reward augmentation for autonomous vehicle validatio,” in 2019 IEEE Intelligent Transportation Systems Conference (ITSC), pp. 163–168, IEEE, 2019.
  • [14] S. Feng, H. Sun, X. Yan, H. Zhu, Z. Zou, S. Shen, and H. X. Liu, “Dense reinforcement learning for safety validation of autonomous vehicles,” Nature, vol. 615, no. 7953, pp. 620–627, 2023.
  • [15] Y. Annpureddy, C. Liu, G. Fainekos, and S. Sankaranarayanan, “S-taliro: A tool for temporal logic falsification for hybrid systems,” in International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pp. 254–257, Springer, 2011.
  • [16] H. Abbas and G. Fainekos, “Linear hybrid system falsification through local search,” in International Symposium on Automated Technology for Verification and Analysis, pp. 503–510, Springer, 2011.
  • [17] G. E. Fainekos and G. J. Pappas, “Robustness of temporal logic specifications for continuous-time signals,” Theoretical Computer Science, vol. 410, no. 42, pp. 4262–4291, 2009.
  • [18] A. Donzé, “Breach, a toolbox for verification and parameter synthesis of hybrid systems,” in International Conference on Computer Aided Verification, pp. 167–170, Springer, 2010.
  • [19] D. J. Fremont, E. Kim, Y. V. Pant, S. A. Seshia, A. Acharya, X. Bruso, P. Wells, S. Lemke, Q. Lu, and S. Mehta, “Formal scenario-based testing of autonomous vehicles: From simulation to the real world,” in 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), pp. 1–8, IEEE, 2020.
  • [20] C. E. Tuncali, G. Fainekos, H. Ito, and J. Kapinski, “Simulation-based adversarial test generation for autonomous vehicles with machine learning components,” in 2018 IEEE Intelligent Vehicles Symposium (IV), pp. 1555–1562, IEEE, 2018.
  • [21] C. Innes and S. Ramamoorthy, “Automated testing with temporal logic specifications for robotic controllers using adaptive experiment design,” in 2022 International Conference on Robotics and Automation (ICRA), pp. 6814–6821, 2022.
  • [22] P. Akella, M. Ahmadi, R. M. Murray, and A. D. Ames, “Formal test synthesis for safety-critical autonomous systems based on control barrier functions,” in 2020 59th IEEE Conference on Decision and Control (CDC), pp. 790–795, 2020.
  • [23] T. Wongpiromsarn, M. Ghasemi, M. Cubuktepe, G. Bakirtzis, S. Carr, M. O. Karabag, C. Neary, P. Gohari, and U. Topcu, “Formal methods for autonomous systems,” arXiv preprint arXiv:2311.01258, 2023.
  • [24] G. Fainekos, H. Kress-Gazit, and G. Pappas, “Hybrid controllers for path planning: A temporal logic approach,” in Proceedings of the 44th IEEE Conference on Decision and Control, pp. 4885–4890, 2005.
  • [25] L. Tan, O. Sokolsky, and I. Lee, “Specification-based testing with linear temporal logic,” in Proceedings of the 2004 IEEE International Conference on Information Reuse and Integration, 2004. IRI 2004., pp. 493–498, IEEE, 2004.
  • [26] E. Plaku, L. E. Kavraki, and M. Y. Vardi, “Falsification of LTL safety properties in hybrid systems,” International Journal on Software Tools for Technology Transfer, vol. 15, no. 4, pp. 305–320, 2013.
  • [27] G. Fraser and F. Wotawa, “Using LTL rewriting to improve the performance of model-checker based test-case generation,” in Proceedings of the 3rd International Workshop on Advances in Model-Based Testing, pp. 64–74, 2007.
  • [28] G. Fraser and P. Ammann, “Reachability and propagation for LTL requirements testing,” in 2008 The Eighth International Conference on Quality Software, pp. 189–198, IEEE, 2008.
  • [29] R. Bloem, G. Fey, F. Greif, R. Könighofer, I. Pill, H. Riener, and F. Röck, “Synthesizing adaptive test strategies from temporal logic specifications,” Formal Methods in System Design, vol. 55, no. 2, pp. 103–135, 2019.
  • [30] J. Tretmans, “Conformance testing with labelled transition systems: Implementation relations and test generation,” Computer Networks and ISDN Systems, vol. 29, no. 1, pp. 49–79, 1996.
  • [31] B. K. Aichernig, H. Brandl, E. Jöbstl, W. Krenn, R. Schlick, and S. Tiran, “Killing strategies for model-based mutation testing,” Software Testing, Verification and Reliability, vol. 25, no. 8, pp. 716–748, 2015.
  • [32] R. Hierons, “Applying adaptive test cases to nondeterministic implementations,” Information Processing Letters, vol. 98, no. 2, pp. 56–60, 2006.
  • [33] A. Petrenko and N. Yevtushenko, “Adaptive testing of nondeterministic systems with FSM,” in 2014 IEEE 15th International Symposium on High-Assurance Systems Engineering, pp. 224–228, IEEE, 2014.
  • [34] A. Pnueli and R. Rosner, “On the synthesis of a reactive module,” in Proceedings of the 16th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pp. 179–190, 1989.
  • [35] R. Bloem, B. Jobstmann, N. Piterman, A. Pnueli, and Y. Sa’ar, “Synthesis of reactive (1) designs,” Journal of Computer and System Sciences, vol. 78, no. 3, pp. 911–938, 2012.
  • [36] C. Baier and J.-P. Katoen, Principles of model checking. MIT press, 2008.
  • [37] M. Yannakakis, “Testing, optimization, and games,” in Proceedings of the 19th Annual IEEE Symposium on Logic in Computer Science, 2004., pp. 78–88, IEEE, 2004.
  • [38] L. Nachmanson, M. Veanes, W. Schulte, N. Tillmann, and W. Grieskamp, “Optimal strategies for testing nondeterministic systems,” ACM SIGSOFT Software Engineering Notes, vol. 29, no. 4, pp. 55–64, 2004.
  • [39] A. David, K. G. Larsen, S. Li, and B. Nielsen, “Cooperative testing of timed systems,” Electronic Notes in Theoretical Computer Science, vol. 220, no. 1, pp. 79–92, 2008.
  • [40] E. Bartocci, R. Bloem, B. Maderbacher, N. Manjunath, and D. Ničković, “Adaptive testing for specification coverage in CPS models,” IFAC-PapersOnLine, vol. 54, no. 5, pp. 229–234, 2021.
  • [41] T. Marcucci, J. Umenberger, P. Parrilo, and R. Tedrake, “Shortest paths in graphs of convex sets,” SIAM Journal on Optimization, vol. 34, no. 1, pp. 507–532, 2024.
  • [42] T. Marcucci, M. Petersen, D. von Wrangel, and R. Tedrake, “Motion planning around obstacles with convex optimization,” Science Robotics, vol. 8, no. 84, p. eadf7843, 2023.
  • [43] H. Zhang, M. Fontaine, A. Hoover, J. Togelius, B. Dilkina, and S. Nikolaidis, “Video game level repair via mixed integer linear programming,” in Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, vol. 16, pp. 151–158, 2020.
  • [44] M. Fontaine, Y.-C. Hsu, Y. Zhang, B. Tjanaka, and S. Nikolaidis, “On the Importance of Environments in Human-Robot Coordination,” in Proceedings of Robotics: Science and Systems, (Virtual), July 2021.
  • [45] A. Badithela, J. B. Graebener, W. Ubellacker, E. V. Mazumdar, A. D. Ames, and R. M. Murray, “Synthesizing reactive test environments for autonomous systems: testing reach-avoid specifications with multi-commodity flows,” in 2023 IEEE International Conference on Robotics and Automation (ICRA), pp. 12430–12436, IEEE, 2023.
  • [46] C. Menghi, C. Tsigkanos, P. Pelliccione, C. Ghezzi, and T. Berger, “Specification patterns for robotic missions,” IEEE Transactions on Software Engineering, vol. 47, no. 10, pp. 2208–2224, 2019.
  • [47] T. Wongpiromsarn, U. Topcu, N. Ozay, H. Xu, and R. M. Murray, “TuLiP: a software toolbox for receding horizon temporal logic planning,” in Proceedings of the 14th International Conference on Hybrid Systems: Computation and Control, HSCC ’11, (New York, NY, USA), p. 313–314, Association for Computing Machinery, 2011.
  • [48] T. Wongpiromsarn, U. Topcu, and R. M. Murray, “Receding horizon temporal logic planning,” IEEE Transactions on Automatic Control, vol. 57, no. 11, pp. 2817–2830, 2012.
  • [49] H. Kress-Gazit, G. E. Fainekos, and G. J. Pappas, “Temporal-logic-based reactive mission and motion planning,” IEEE Transactions on Robotics, vol. 25, no. 6, pp. 1370–1381, 2009.
  • [50] C. Belta and S. Sadraddini, “Formal methods for control synthesis: An optimization perspective,” Annual Review of Control, Robotics, and Autonomous Systems, vol. 2, pp. 115–140, 2019.
  • [51] J. R. Büchi, On a Decision Method in Restricted Second Order Arithmetic, pp. 425–435. New York, NY: Springer New York, 1990.
  • [52] A. Bauer, M. Leucker, and C. Schallhart, “Runtime verification for LTL and TLTL,” ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 20, no. 4, pp. 1–64, 2011.
  • [53] K. Havelund and G. Rosu, “Monitoring programs using rewriting,” in Proceedings 16th Annual International Conference on Automated Software Engineering (ASE 2001), pp. 135–143, IEEE, 2001.
  • [54] A. Morgenstern, M. Gesell, and K. Schneider, “An asymptotically correct finite path semantics for LTL,” in International Conference on Logic for Programming Artificial Intelligence and Reasoning, pp. 304–319, Springer, 2012.
  • [55] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to algorithms. MIT press, 2022.
  • [56] V. V. Vazirani, Approximation algorithms, vol. 1. Springer, 2001.
  • [57] I. Filippidis, S. Dathathri, S. C. Livingston, N. Ozay, and R. M. Murray, “Control design for hybrid systems with tulip: The temporal logic planning toolbox,” in 2016 IEEE Conference on Control Applications (CCA), pp. 1030–1041, IEEE, 2016.
  • [58] S. Maoz and J. O. Ringert, “GR(1) synthesis for LTL specification patterns,” in Proceedings of the 2015 10th joint meeting on foundations of software engineering, pp. 96–106, 2015.
  • [59] S. A. Cook, “The complexity of theorem-proving procedures,” in Logic, Automata, and Computational Complexity: The Works of Stephen A. Cook, pp. 143–152, 2023.
  • [60] C. H. Papadimitriou, Computational complexity, p. 260–265. GBR: John Wiley and Sons Ltd., 2003.
  • [61] Gurobi Optimization, LLC, “Gurobi Optimizer Reference Manual,” 2023.
  • [62] W. Ubellacker and A. D. Ames, “Robust locomotion on legged robots through planning on motion primitive graphs,” in 2023 IEEE International Conference on Robotics and Automation (ICRA), preprint, 2023.
Josefine B. Graebener (Student Member, IEEE) received a B.Eng. in Aerospace Engineering in 2017 from the Aachen University of Applied Sciences (FH Aachen) in Aachen, Germany, and an M.S. in Space Engineering from California Institute of Technology in 2019. Currently, she is a Ph.D. candidate in Space Engineering with a minor in Computer Science at the California Institute of Technology. Her research interest lies in using formal methods for test and evaluation of autonomous systems, and system diagnostics.
Apurva Badithela (Student Member, IEEE) received a Bachelor’s degree in Aerospace Engineering and Mechanics in 2018 from the University of Minnesota, Twin-Cities. Currently, she is a Ph.D. candidate in Control and Dynamical Systems at the California Institute of Technology. Her dissertation work focuses on Formal Test Synthesis and System-level Evaluation for Safety-Critical Autonomous Systems.
Denizalp Goktas (Student Member, IEEE) is a Ph.D. candidate in Computer Science at Brown University. His research focuses on artificial intelligence, particularly how it intersects with economics and computer science. His research seeks to create algorithms for games and markets, aiming to use these to tackle practical problems such as those in economics and robotics. He is supported by a JP Morgan AI fellowship. Previously, Denizalp earned his BA in Computer Science and Statistics from Columbia University and another BA in Political Science and Economics from Sciences Po. His past research experience includes internships at Google DeepMind and JP Morgan, and a visiting scholar position at UC Berkeley’s Simons Institute.
Wyatt Ubellacker (Student Member, IEEE) earned his B.S. and M.S. degrees in Mechanical Engineering from the Massachusetts Institute of Technology in 2013 and 2016, respectively. Prior to joining Caltech in 2019, he was a Robotics Technologist at the Jet Propulsion Laboratory, where he wrote autonomy and control algorithms for the Mars Perseverance Rover. Currently, he is a Ph.D. candidate in Control and Dynamical Systems at Caltech. His research interests focus on control and autonomy for dynamic platforms, with a special emphasis on robotic morphologies that are capable of exhibiting a wide variety of behaviors.
Eric V. Mazumdar (Member, IEEE) received a B.S. degree in Computer Science from the Massachusetts Institute of Technology (MIT) in 2015, and a Ph.D. in Electrical Engineering and Computer Science from UC Berkeley in 2021. Currently, he is an Assistant Professor in Computing and Mathematical Sciences and Economics at Caltech. His research interests lie at the intersection of machine learning and economics, focusing on developing theoretical foundations and tools to confidently deploy machine learning algorithms into societal systems, particularly in settings with uncertain, dynamic environments in which learning algorithms interact with strategic agents. Dr. Mazumdar received the NSF CAREER Award in 2023 as well as a Research Fellowship for Learning in Games from the Simons Institute for Theoretical Computer Science.
Aaron D. Ames (Fellow, IEEE) received a B.S. degree in Mechanical Engineering and a B.A. degree in Mathematics from the University of St. Thomas in 2001, and an M.A. degree in Mathematics and a Ph.D. in Electrical Engineering and Computer Sciences from UC Berkeley in 2006. Currently, he is the Bren Professor of Mechanical and Civil Engineering and Control and Dynamical Systems at the California Institute of Technology. Prior to joining Caltech, he was an Associate Professor in Mechanical Engineering and Electrical & Computer Engineering at the Georgia Institute of Technology. He was a Postdoctoral Scholar in Control and Dynamical Systems at Caltech from 2006 to 2008, and began his faculty career at Texas A&M University in 2008. His research interests span the areas of robotics, nonlinear control and hybrid systems, with a special focus on applications to bipedal robotic walking, both formally and through experimental validation. Dr. Ames was the recipient of the 2005 Leon O. Chua Award for achievement in nonlinear science at UC Berkeley, and the 2006 Bernard Friedman Memorial Prize in Applied Mathematics. Dr. Ames received the NSF CAREER award in 2010, the Donald P. Eckman Award in 2015, and the Antonio Ruberti Young Researcher Prize in 2019.
Richard M. Murray (Fellow, IEEE) received the B.S. degree in Electrical Engineering from California Institute of Technology in 1985 and the M.S. and Ph.D. degrees in Electrical Engineering and Computer Sciences from the University of California, Berkeley, in 1988 and 1991, respectively. He is currently the Thomas E. and Doris Everhart Professor of Control & Dynamical Systems and Bioengineering at Caltech. Murray’s research is in the application of feedback and control to networked systems, with applications in synthetic biology and autonomy. Current projects include design and implementation of synthetic cells and design, verification, and test synthesis for discrete decision-making protocols for safety-critical, reactive control systems. Dr. Murray’s professional awards include the Richard P. Feynman-Hughes Faculty Fellowship in 1993, awarded annually to an outstanding young faculty member in Engineering and Applied Science at Caltech, the National Science Foundation Early Faculty Career Development (CAREER) Award in 1995, the Office of Naval Research Young Investigator Award in 1995, and the Donald P. Eckman Award in 1997. He is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) and holds an honorary doctorate from Lund University in Sweden. He is an elected member of the National Academy of Engineering (2013). Dr. Murray received the IEEE Bode Lecture Prize in 2016, the IEEE Control Systems Award in 2017, and the AACC John R. Ragazzini Education Award in 2019.