2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing

System Level Formal Verification via Distributed Multi-Core Hardware in the Loop
Simulation

Toni Mancini, Federico Mari, Annalisa Massini, Igor Melatti, Enrico Tronci
Computer Science Department, Sapienza University of Rome
Via Salaria 113, I-00198 Roma, Italy
Email: {tmancini,mari,massini,melatti,tronci}@di.uniroma1.it

Abstract: The goal of System Level Formal Verification (SLFV) is to show system correctness notwithstanding uncontrollable events (such as faults, variations in system parameters, external inputs, etc.). Hardware In the Loop Simulation (HILS) based SLFV attains such a goal by exhaustively considering all relevant simulation scenarios.

We present a distributed multi-core algorithm for HILS-based SLFV. Our experimental results on the Fuel Control System example in the Simulink distribution show that by using 64 machines with an 8-core processor each we can complete the SLFV activity in about 27 hours, whereas a sequential approach would require more than 200 days.

To the best of our knowledge this is the first time that a distributed multi-core algorithm for HILS-based SLFV is presented.

Keywords: Model Checking; Hybrid Systems; System Level Formal Verification; Distributed Multi-Core Hardware in the Loop Simulation;

I. INTRODUCTION

System Level Verification has the goal of verifying that the whole (i.e., software + hardware) system meets the given specifications. Model checkers for hybrid systems cannot handle System Level Formal Verification (SLFV) of actual systems. Thus Hardware In the Loop Simulation (HILS) is currently the main workhorse for system level verification and is supported by Model Based Design tools like Simulink (http://www.mathworks.com) and VisSim (http://www.vissim.com). In HILS, the actual software reads/sends values from/to mathematical models (simulation) of the physical systems (e.g., engines, analog circuits, etc.) it will be interacting with.

SLFV basically is an exhaustive HILS where all relevant simulation scenarios are considered. Unless the number of such scenarios is small, exhaustive HILS is infeasible.

The situation can be considerably improved by evenly splitting the simulation scenarios into disjoint slices and then effectively distributing the simulation of such slices on different machines. This has been done in [1].

Unfortunately the approach in [1] cannot exploit the availability of multi-core processors. In this paper we advance the state of the art by presenting a distributed multi-core approach to HILS-based SLFV.

A. Main Contribution

Our System Under Verification (SUV) is a Hybrid System (see, e.g., [2] and citations thereof) whose inputs belong to a finite set of uncontrollable events (disturbances) modelling failures in sensors or actuators, variations in the system parameters, etc. We focus on deterministic systems (the typical case for control systems), and model nondeterministic behaviours (such as faults) with disturbances. Accordingly, in our framework, a simulation scenario is just a finite sequence of disturbances and a simulation campaign is a sequence of simulation instructions (namely: save a simulation state, restore a saved simulation state, inject a disturbance, advance the simulation by a given time length).

A system is expected to withstand all disturbance sequences that may arise in its operational environment. Correctness of a system is thus defined with respect to such admissible disturbance sequences.

In such a framework (as in [1]) we address Bounded SLFV of safety properties. That is, given a time step τ (time quantum between disturbances) and a time horizon hτ (i.e., h multiplied by τ) we return PASS if there is no admissible disturbance sequence of length h and time step τ that violates the given safety property. We return FAIL, along with a counterexample, otherwise. Therefore, SLFV is an exhaustive (with respect to admissible simulation scenarios) HILS. In other words, we are aiming at (black box) bounded model checking where the SUV behaviour is defined by a simulator (Simulink in our examples).

To enable an effective distributed approach to SLFV, we split the verification process into two main phases. First, an off-line phase, where we: (a) generate the simulation scenarios to be considered and evenly split them into disjoint slices, by using the approach in [1]; (b) generate, for each slice, a highly optimised simulation campaign for the given simulator. Second, an on-line distributed phase where each simulator runs its simulation campaign independently and stops as soon as an error is found. The rationale is that the simulation phase is the heavier one from a computational point of view, thus our approach aims at parallelising such a phase.
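As an illustration only, the bounded PASS/FAIL check and the two-phase split can be sketched in Python on a toy system. Everything concrete here is an assumption made for the sketch and is not taken from the paper's case study: the tank model in `simulate`, the admissibility predicate `at_most`, the even slicing rule in `split` (trace i goes to slice ⌊i·n/|Δ|⌋), and the fact that the slices are checked one after the other instead of on separate cores.

```python
from itertools import product

def simulate(trace, level=3):
    """Toy stand-in for the simulator: monitor output after the trace (0 = ok, 1 = violated)."""
    monitor = 0
    for e in trace:
        # Inject disturbance e, then advance one time step: a leak (e == 1) drains
        # 2 units, otherwise the tank refills by 1 unit up to its capacity of 3.
        level = level - 2 if e == 1 else min(level + 1, 3)
        if level < 0:
            monitor = 1  # the monitor output is sustained once raised
    return monitor

def scenarios(h, d, admissible):
    """Off-line phase (a): admissible disturbance sequences of length h, in lexicographic order."""
    return [t for t in product(range(d + 1), repeat=h) if admissible(t)]

def split(traces, n_slices):
    """Evenly split the scenarios into disjoint slices (trace i goes to slice i * n_slices // n)."""
    slices = [[] for _ in range(n_slices)]
    for i, t in enumerate(traces):
        slices[i * n_slices // len(traces)].append(t)
    return slices

def verify_slice(traces):
    """On-line phase for one slice: stop at the first scenario violating the property."""
    for t in traces:
        if simulate(t) == 1:
            return t
    return None

def bounded_slfv(h, d, admissible, n_slices=4):
    """Return ("PASS", None) or ("FAIL", counterexample)."""
    # Slices are independent; here they are simply checked sequentially.
    results = [verify_slice(s) for s in split(scenarios(h, d, admissible), n_slices)]
    failures = [t for t in results if t is not None]
    return ("FAIL", failures[0]) if failures else ("PASS", None)

def at_most(k):
    """Hypothetical admissibility rule: at most k non-zero disturbances per scenario."""
    return lambda t: sum(e != 0 for e in t) <= k

print(bounded_slfv(4, 1, at_most(1)))
print(bounded_slfv(4, 1, at_most(2)))
```

In this toy setting a single leak never empties the tank, while two consecutive leaks do, so the answer depends on which disturbance sequences are considered admissible.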

1066-6192/14 $31.00 © 2014 IEEE 734


DOI 10.1109/PDP.2014.32
Authorized licensed use limited to: Univ of Calif Santa Cruz. Downloaded on October 15,2024 at 21:08:15 UTC from IEEE Xplore. Restrictions apply.
The on-line phase is supported by simulation tools (Simulink in our examples). Here we provide methods and tools to effectively carry out step (b) of the above off-line phase. Our main contributions can be summarised as follows.

Distributed multi-core SLFV: We present an optimisation algorithm that transforms a sequence of simulation scenarios into a very efficient simulation campaign that avoids revisiting already visited states by using simulator save/restore commands. During the simulation, the generated campaign stores at most h states (where hτ is the time horizon). This allows us to store such states using a small amount of RAM (about 15MB in our case study, where h = 100 and each simulator state takes about 150KB). Such a small RAM footprint allows running in parallel a simulation campaign for each of the available cores. Note that an efficient simulation campaign generated using the optimiser in [1] may need to store many states. For this reason simulation campaigns in [1] store states on the local disk. As a result, the algorithm in [1] cannot exploit availability of multi-core processors, since the local disk becomes the main bottleneck if more than one simulation campaign tries to use it. Devising a strategy for storing/restoring simulation states that uses a moderate amount of memory and can thus be implemented in RAM rather than on disk is indeed the main obstacle we had to overcome to achieve multi-core parallelism.

Experimental Results: We implemented our approach and present experimental results on its usage in the Fuel Control System (FCS) example in the Simulink distribution. In our experiments we set our time step τ to 1 second and our time horizon hτ to 100 seconds (i.e., h = 100). SLFV for this case study entails running more than 4 million simulation scenarios.

Each core runs an instance of our optimisation algorithm taking as input a different slice of the simulation scenarios. Our optimiser takes just a few minutes to generate a simulation campaign from a given slice. In our setting each machine has a single processor with c = 8 cores. We present experimental results with k = 8, 16, 32, 64 machines totalling kc = 64, 128, 256, 512 cores.

Our experimental results show that, with respect to the (distributed single-core) approach in [1], our distributed multi-core approach saves more than 65% of the computation time when the same hardware is used. For example, when using 64 machines we can complete SLFV for our case study in about 27 hours whereas using the approach in [1] requires about 81 hours. Note that a purely sequential approach would require more than 200 days.

Summing up: We present an effective distributed multi-core approach to HILS-based SLFV. To the best of our knowledge this is the first time that such an approach is presented.

B. Related Work

The paper closest to ours is [1], where a HILS-based distributed algorithm for SLFV has been presented. We note however that the algorithm of [1] cannot exploit availability of multi-core processors since the simulation campaigns generated by its optimiser rely heavily on the local disk, which becomes the main bottleneck if more than one simulation campaign tries to use it. The present paper advances the state of the art by presenting a novel optimisation algorithm that generates simulation campaigns requiring a small amount of RAM. This, in turn, enables use of multi-core parallelism.

HILS-based SLFV has also been investigated in [3] where the CMurphi [4] capability to call external C functions in a black box fashion has been used to drive the ESA satellite simulator SIMSAT in order to verify satellite operational procedures.

Statistical model checking, being basically black box, is also closely related to our approach. In such a setting, [5] is closely related to our paper since it addresses system level verification of Simulink models and presents experimental results on the very same Simulink case study we are using. Monte Carlo model checking methods (see, e.g., [6], [7], [8]) are also related to our approach. The main differences between the above statistical approaches and ours are the following: (i) statistical methods sample the space of admissible simulation scenarios, whereas we address exhaustive HILS; (ii) statistical methods do not address optimisation of the simulation campaign, which is our main concern here, since this is what makes exhaustive HILS viable.

Formal verification of Simulink models has been widely investigated; examples are in [9], [10], [11]. Such methods however focus on discrete time models (e.g., Stateflow or Simulink restricted to discrete time operators) with small domain variables. Therefore they are well suited to analyse critical subsystems, but cannot handle complex system level verification tasks (e.g., as our case study). This is indeed the motivation for the development of statistical model checking methods such as the one in [5] and for our exhaustive HILS-based approach.

Synergies between simulation and formal methods have been widely investigated in digital hardware verification. Examples are in [12], [13], [14], [15] and citations thereof. The main differences between the above approaches and ours are: (i) they focus on finite state systems whereas we focus on infinite state systems (namely, hybrid systems); (ii) they are white box (requiring availability of the system model) whereas we are black box. We note that the idea of speeding up the simulation process by saving and restoring suitably selected visited states is also present in [15].

Parallel algorithms for explicit state exploration have been widely investigated. Examples are in [16], [17], [18], [19], [20], [21]. The main difference with our approach is that all

the above ones focus on parallelising the state space exploration engine by devising techniques to minimise locking of the visited state hash table, whereas we leave unchanged the state space exploration engine (the simulator in our context) and use a Map-Reduce like strategy that splits (Map step) the set of simulation scenarios into equal sized subsets to be simulated on different cores and stops verification as soon as one of such cores finds an error (Reduce step). Note that we propose an embarrassingly parallel algorithm for (black box) formal verification of hybrid systems. Embarrassingly parallel verification algorithms have also been investigated in [22], as for finite state system verification, and in [23], as for symbolic testing of programs. Such approaches are close in spirit to ours, although they differ from ours as for the class of systems considered (we focus on hybrid systems whereas the above papers focus on discrete systems) as well as for the modelling approach (our black box algorithm rests on the disturbance model whereas the above papers both present white box algorithms resting on the system model).

II. BACKGROUND

In this section we give some background notions. Unless otherwise stated, all definitions are based on [24], [1].

Throughout the paper, we use R≥0 for the set of non-negative reals, R+ for the set of strictly positive reals, and Bool = {0, 1} for the set of Boolean values (0 for false and 1 for true). N+ denotes the set of positive natural numbers.

A. Modelling uncontrollable events

A discrete event sequence (Definition 1 and Fig. 1a) is a function associating to each (continuous) time instant a disturbance event (or, simply, disturbance). Disturbances, encoded by integers in the interval [0, d] (for a given d ∈ N+), represent exogenous events (e.g., faults). We use event 0 to represent the event carrying no disturbance. As no system can withstand an infinite number of disturbances within a finite time, we require that, in any time interval of finite length, a discrete event sequence differs from 0 only in a finite number of time points.

Definition 1 (Discrete event sequence): Let d ∈ N+. A discrete event sequence over integer interval [0, d] is a function u : R≥0 → [0, d] such that, for all t ∈ R≥0, the set {t̃ | 0 ≤ t̃ ≤ t and u(t̃) ≠ 0} has finite cardinality. We denote with U_d the set of discrete event sequences over [0, d].

B. Modelling the System Under Verification

We model (Definition 2) our System Under Verification (SUV) as a continuous time Input-State-Output deterministic dynamical system whose inputs are discrete event sequences.

Definition 2 (Discrete Event System): A Discrete Event System (DES) is a tuple H = (S, s_0, d, O, flow, jump, output) where:
• S is a set of states (finite, countable, continuous, or any combination thereof).
• s_0 ∈ S is the initial state.
• d ∈ N+ defines the input space as U_d (the set of discrete event sequences over [0, d]).
• O is the set of output values (finite, countable, continuous, or any combination thereof).
• flow : S × R≥0 → S. For all s ∈ S, t ∈ R≥0, flow(s, t) defines the state reached by H from state s after time t when no event occurs. Accordingly, we stipulate that for all s ∈ S, flow(s, 0) = s.
• jump : S × [0, d] → S. For all s ∈ S, e ∈ [0, d], jump(s, e) defines the state reached by H from state s upon occurrence of event e (no time flows). Accordingly, we stipulate that for all s ∈ S, jump(s, 0) = s.
• output : S → O. The value output(s) defines the output of H in state s.

The state, respectively output, reached after time t by a DES with a given input can be computed with the DES state, respectively output, function (Definition 3).

Definition 3 (DES state and output functions): The state function of DES H is a function φ : U_d × R≥0 → S, where φ(u, t) is the state reached at time t by H with input the discrete event sequence u. Function φ is defined inductively as follows:
• φ(u, 0) = jump(s_0, u(0)), where s_0 is the initial state of H;
• For each t > 0, φ(u, t) = jump(flow(φ(u, t*), t − t*), u(t)), where t* < t is the greatest value such that u(t*) ≠ 0, and we let t* = 0 if such a value does not exist (i.e., when u is always 0 before t).

The output function of H is the function ψ : U_d × R≥0 → O defined as ψ(u, t) = output(φ(u, t)). In other words, ψ computes the output (as a function of time) of H when the input to H is the discrete event sequence u. In general, ψ(u, t) is not a discrete event sequence (e.g., it may take a non-zero value an infinite number of times).

C. Modelling the property to be verified

We model the property to be verified with a continuous-time monitor which observes the state of the system to be verified and checks whether the property under verification is satisfied (Fig. 1b). A temporal logic specification can be transformed into a continuous-time monitor as in [25]. The output of our monitor is 0 as long as the property under verification is satisfied and becomes and stays 1 (sustain) as soon as the property fails. This non-decreasing property of the monitor output ensures that we never miss a property failure report, even when sampling the monitor output only at discrete time points (Fig. 1c). The use of monitors gives us a flexible approach to model the property to be verified. In particular, it is easy to model bounded safety and bounded liveness properties as monitors.

Since the monitor output is all we need to carry out our verification task, we can model our SUV along with the property to be verified as a DES with an embedded monitor

whose set of output values is Bool. We call such a DES a Monitored Discrete Event System (Definition 4 and Fig. 1b).

Figure 1: (a) a discrete event sequence (d = 3); (b) our SUV with an embedded monitor; (c) the SUV monitor output.

Definition 4 (Monitored Discrete Event System): A Monitored Discrete Event System (MDES) is a tuple H = (S, s_0, d, flow, jump, output) such that (S, s_0, d, Bool, flow, jump, output) is a DES whose output function ψ(u, t) is non-decreasing with respect to t. That is, for any input sequence u ∈ U_d, for all t, t′ ∈ R≥0, if t ≤ t′ then ψ(u, t) ≤ ψ(u, t′). In other words, an MDES is a DES with non-decreasing boolean outputs.

D. Modelling SUV operational scenarios

System level verification follows an Assume-Guarantee approach aimed at showing that the SUV meets its specification (Guarantee) as long as the SUV operational environment behaves as expected (Assume). As in Bounded Model Checking (BMC), we model (Definition 5) scenarios in the SUV operational environment as sequences of disturbances (disturbance traces) our SUV is expected to withstand. Each disturbance is an integer in [0, d] and disturbance traces are of finite length h. Given a time quantum τ ∈ R+, a disturbance trace can be associated to a discrete event sequence where all disturbances occur at time points that are multiples of τ.

Definition 5 (Disturbance trace): Let h, d ∈ N+. An (h, d) disturbance trace δ is a finite sequence δ : [0, h − 1] → [0, d]. Given τ ∈ R+ (time quantum), to an (h, d) disturbance trace δ we can univocally associate a discrete event sequence u^τ_δ, defined as follows (see also Fig. 2d, ignoring the letters in the disturbance traces). For all t ∈ R≥0, if there exists k ∈ [0, h − 1] such that t = τk then u^τ_δ(t) = δ(k), else u^τ_δ(t) = 0 (no disturbance).

Thus a disturbance trace δ defines an operational scenario (namely, u^τ_δ) for our SUV.

An (h, d) sequence of disturbance traces is a finite sequence Δ = δ_0, …, δ_{n−1} of (h, d) disturbance traces. Given τ ∈ R+, to each sequence of disturbance traces Δ = δ_0, …, δ_{n−1} is associated a sequence of discrete event sequences U^τ_Δ = u^τ_{δ_0}, …, u^τ_{δ_{n−1}}. Accordingly, we model our SUV operational environment as a sequence of disturbance traces Δ since U^τ_Δ defines the operational scenarios our SUV should withstand.

E. The System Level Formal Verification problem

A System Level Formal Verification (SLFV) problem is a tuple P = (h, d, τ, Δ, H) where: h, d ∈ N+, τ ∈ R+, Δ = δ_0, …, δ_{n−1} is an (h, d) sequence of disturbance traces, and H = (S, s_0, d, flow, jump, output) is an MDES.

The answer to SLFV problem P is FAIL if there exists a disturbance trace δ in Δ such that ψ(u^τ_δ, τh) = 1 (in such a case also the counterexample δ is returned), PASS otherwise.

Note that, notwithstanding the fact that the number of states of our SUV is infinite and we are in a continuous time setting, to answer an SLFV problem we only need to check a finite number of disturbance traces. This is because we are bounding: (a) our time horizon to T = hτ (i.e., h multiplied by τ), and (b) the set of time points at which disturbances can take place, by taking τ as the time quantum among disturbance events.

Thus, by taking h large enough (as in BMC) and τ small enough (to faithfully model our SUV operational scenarios), we can achieve any desired precision. On such considerations rests the effectiveness of the approach.

F. HILS-based System Level Formal Verification

We use a black-box approach where the MDES H defining our SUV and property to be verified is defined using the modelling language of a suitable simulator (e.g., MatLab and Stateflow for Simulink).

We compute the answer to an SLFV problem (h, d, τ, Δ, H) by simulating each operational scenario δ in the operational environment Δ. In other words, we are performing an exhaustive (with respect to Δ) HILS.

We drive a simulator for H (that is, a simulator running a model for H) using four basic commands: store, load, free, run. Command store(l) stores in memory the current state of the simulator and labels such a state with l. Command load(l) loads into the simulator the stored state labelled with l. Command free(l) removes from the memory the state labelled with l. Command run(e, t) (with e ∈ [0, d] and t ∈ R+) injects disturbance e and then advances the simulation by time t. A simulation campaign is a sequence of simulator commands.

Using the commands store and load we can avoid revisiting simulation states (much as in explicit model checking). Using command free we can remove from the memory states that will never be needed in the remaining part of the simulation campaign. This is needed since each state may require many KB of memory (150 KB in the case study presented in this paper). We will show how optimised simulation campaigns enable HILS-based distributed multi-core SLFV.

III. OVERALL APPROACH

Our overall approach is shown in Fig. 3. To define an SLFV problem (h, d, τ, Δ, H), we need to build the sequence of admissible disturbance traces Δ (operational environment).

Of course, it is typically infeasible to define operational environments by listing all their disturbance traces. In [1]

it is shown how operational environments can be easily defined using the modelling language of a finite state model checker (CMurphi [4]). We follow the approach in [1] and run CMurphi in Depth-First Search (DFS) mode to generate the operational environment Δ. DFS guarantees that the disturbance traces in Δ are generated in lexicographic order (see Fig. 2a–b and Fig. 3a).

Figure 2: (a) disturbance model; (b) CMurphi-based disturbance generator; (c) generated labelled admissible sequence of disturbance traces (d = 3, h = 6, integers from 0 to d denote disturbances, letters denote labels); (d) the discrete event sequence associated to the trace in the black rectangle in part (c), given time quantum τ.

Figure 3: Our approach to distributed multi-core SLFV (panels (a) to (e)).

Starting from Δ, we aim at generating highly optimised simulation campaigns, which exploit as much as possible the capabilities of modern simulators to store and restore simulation states. Given a disturbance trace δ = d_0, …, d_{h−1} ∈ Δ, any prefix of δ univocally identifies a simulation state, given that, to answer our SLFV problem, the simulator is intended to be run under input δ starting from its initial state.

When simulating all traces in Δ, often multiple traces, e.g., δ = d̂_0, …, d̂_p, d_{p+1}, …, d_{h−1} and δ′ = d̂_0, …, d̂_p, d′_{p+1}, …, d′_{h−1}, have a common prefix, e.g., d̂_0, …, d̂_p. In order to properly exploit the load/store capabilities of the simulator, we would like to proceed as follows. When verifying the first of such traces, e.g., δ, we: (i) run the simulator with input being the common prefix d̂_0, …, d̂_p; (ii) store under a given label, e.g., l, the state reached by the simulator so far; (iii) continue the simulation of δ by injecting the remaining disturbances of δ, i.e., d_{p+1}, …, d_{h−1}. When verifying δ′, we: (i) avoid the recomputation of the state that the simulator would reach when run on the common prefix of disturbances d̂_0, …, d̂_p by loading back the state previously stored under label l; (ii) continue the simulation of δ′ by injecting the remaining disturbances of δ′, i.e., d′_{p+1}, …, d′_{h−1}.

Unfortunately, a naive identification of common prefixes of traces in Δ would be computationally very expensive, given that Δ may contain a huge number of traces (about 4 million in our examples, which need about 3.5GB of memory to be stored). Following [1], we delegate to the CMurphi-based disturbance trace generator the production of a labelled lexicographically ordered sequence of disturbance traces Δ^λ. Each δ^λ in Δ^λ is of the form δ^λ = l_0, d_0, l_1, d_1, …, l_{h−1}, d_{h−1}, l_h, where δ = d_0, …, d_{h−1} is a disturbance trace in Δ and l_0, …, l_h belong to a countable set of labels L (e.g., N+). Labels are defined by an injective map λ from finite sequences of disturbances (including the empty sequence) to L. As a consequence, prefixes d̂_0, …, d̂_{p−1} common to multiple disturbance traces in Δ are followed, in Δ^λ, by the same label l̂_p = λ(d̂_0, …, d̂_{p−1}) (see Fig. 2c). Given that our CMurphi-based generator runs in DFS mode, traces are produced in lexicographic order and labelled at no additional computational cost during generation, as shown in [1]. This allows us to assume that Δ^λ is available as an input. Hence, in our setting, the SLFV problem can be regarded as (h, d, τ, Δ^λ, H) rather than (h, d, τ, Δ, H). This greatly simplifies the design of our simulation campaign optimiser (Section IV).

In order to enable parallel computation on k ∈ N+ machines with c ∈ N+ cores each, we evenly partition the labelled lexicographically ordered sequence of disturbance traces Δ^λ into kc (labelled) lexicographically ordered sequences of disturbance traces Δ^λ_0, …, Δ^λ_{kc−1}, by assigning the i-th trace (0 ≤ i < |Δ^λ|) to the ⌊i·kc/|Δ^λ|⌋-th slice (Fig. 3a). We use such kc slices to compute in parallel kc highly optimised simulation campaigns (Fig. 3b), which can be simulated in parallel using kc simulators, each one running on a different core of our k machines (Fig. 3c–d).

The answer to the SLFV problem is FAIL if one of the simulation campaigns raises the simulator output function to 1. The answer is PASS otherwise. In case the answer is FAIL, the driver of the simulator which raised the error can compute a disturbance trace δ (called counterexample) in the input slice such that the discrete event sequence associated to δ under time quantum τ (see, e.g., Fig. 2d) would lead the SUV to an error state (Fig. 3e).

IV. COMPUTATION OF SIMULATION CAMPAIGNS

In this section we describe our RAM-based simulation campaign optimiser which enables multi-core SLFV.

Given a labelled (h, d) lexicographically ordered sequence of disturbance traces Δ^λ = δ^λ_0, …, δ^λ_{n−1}, our optimiser computes a simulation campaign for any simulator of any DES H whose set of inputs is [0, d]. The computed

campaign is abstract in that, for all commands of the form run(e, t), t is a natural number and not an actual time duration. By providing a time step τ ∈ R+, χ can be instantiated into a concrete simulation campaign χ^τ, by replacing all run(e, t) commands by run(e, tτ).

The algorithm of our optimiser is shown as Algorithm 1. As the input sequence Δ^λ of labelled disturbance traces can be too big to be kept in main memory, the optimiser reads the input file sequentially twice. In the first scan of Δ^λ, the optimiser builds a data structure called Labels Branching Tree (LBT) as completely as possible within the available RAM. Afterwards, it reads Δ^λ again to produce the abstract simulation campaign from the LBT.

Algorithm 1: DFS-Optimiser pseudo-code
    Input: Δ^λ, a labelled lex-ordered sequence of disturbance traces
    Output: χ, the computed simulation campaign, initially empty
     1  LBT ← buildLBT(Δ^λ);
     2  let l_0 be the first label common to all traces in Δ^λ;
     3  stored ← empty set of labels;  /* inv: stored ⊆ LBT and |stored| ≤ h */
     4  append store(l_0) to χ and add l_0 to stored;
     5  i ← 0;
     6  foreach δ^λ = l_0, d_0, …, l_{h−1}, d_{h−1}, l_h in Δ^λ do
     7      i++;  /* δ^λ is the i-th trace in Δ^λ */
     8      t_load ← max t s.t. l_t ∈ stored;
     9      append load(l_{t_load}) to χ;
    10      foreach label l ∈ stored s.t. LBT[l].lastTrace ≤ i do
    11          append free(l) to χ;
    12          remove l from stored;
    13      d̂ ← d_{t_load}; steps ← 1;
    14      for t ← t_load + 1 to h − 1 do
    15          toBeStored ← (l_t ∈ LBT − stored and LBT[l_t].lastTrace > i);
    16          if toBeStored or d_t ≠ 0 then
    17              append run(d̂, steps) to χ; d̂ ← d_t; steps ← 1;
    18              if toBeStored then
    19                  append store(l_t) to χ and add l_t to stored;
    20          else steps++;
    21  return χ;
    22  function buildLBT(Δ^λ)
    23      LBT ← empty tree of labels;
            /* for each l ∈ LBT, LBT[l].lastTrace stores the index of the last trace where it is known to occur */
    24      watched ← empty array [0..h − 1] of labels;
    25      let l_0 be the first label common to all traces in Δ^λ;
    26      set l_0 as the root of LBT with LBT[l_0].lastTrace ← |Δ^λ|;
    27      watched[0] ← l_0;
    28      i ← 0;
    29      foreach δ^λ = l_0, d_0, …, l_{h−1}, d_{h−1}, l_h in Δ^λ do
    30          i++;  /* δ^λ is the i-th trace in Δ^λ */
    31          for t ← 0 to h − 1 s.t. l_t ∈ LBT do LBT[l_t].lastTrace ← i;
    32          t_lbt ← max t s.t. l_t ∈ LBT;
    33          t_w ← max t s.t. l_t ∈ watched;
    34          if t_lbt ≠ t_w then
                    /* label l_{t_w} ∉ LBT: add it */
    35              t_child ← min t > t_w s.t. watched[t] ∈ LBT (if any);
    36              add l_{t_w} to LBT as child of l_{t_lbt} with LBT[l_{t_w}].lastTrace = i;
    37              move l_{t_child} (if any) as to be child of l_{t_w} in LBT;
    38          foreach t ← t_w + 1 to h − 1 do watched[t] ← l_t;
                /* watched now contains labels of the last trace */
    39      return LBT;

A. LBT construction

The LBT is a tree of labels rooted at l_0, the first label of all traces (e.g., l_0 = a in Fig. 2c and Fig. 4). The LBT collects branching labels, i.e., labels l_i for which there exist at least two labelled disturbance traces δ^λ = l_0, d_0, …, l_i, d_i, …, l_h and δ′^λ = l_0, d_0, …, l_i, d′_i, …, l′_h in Δ^λ which are identical up to l_i and such that d_i ≠ d′_i. Branching labels represent simulator states whose storing may save simulation time (by loading them back later).

Label l_j is a child of l_i in the LBT iff, for all δ^λ = l_0, d_0, …, l_i, …, l_j, …, l_h ∈ Δ^λ, no l_k in δ^λ with i < k < j is in the LBT (note: all such δ^λ are identical at least up to l_j). For each label l in the LBT, the number of the last trace in Δ^λ where it occurs is kept.

The construction of the LBT is shown as function buildLBT() in Algorithm 1 (from line 22). The function scans the input slice in order to recognise branching labels, keeping in array watched the labels of the last processed trace. In fact, as the traces in Δ^λ are lexicographically ordered, these are the only labels that may become branching when processing a new trace. To see why, assume that the optimiser is processing, e.g., trace 2 in Fig. 4a (left). As this trace starts to be different with respect to the previous trace (trace 1) from the disturbance at step 2 (i.e., disturbance 2 right after label c), the optimiser infers that labels d, e, f, g of trace 1 will never occur in later traces of Δ^λ, and will never become branching.

As for the actual recognition of a new branching label and its addition to the LBT, assume that function buildLBT() is processing the i-th disturbance trace δ^λ = l_0, d_0, …, l_{t_lbt}, d_{t_lbt}, …, l_{t_w}, d_{t_w}, …, l_h (line 29). Variable t_lbt is set to the max index of a label in δ^λ already in the LBT, and t_w is the max index of a label in δ^λ which belongs also to array watched. As l_0 is put both in the LBT and in watched[0] at the beginning, both values are always defined. The algorithm infers that the current trace is identical to the previously processed one up to t_w, but differs from it starting from the disturbance to be injected at step t_w + 1. If t_w = t_lbt, label l_{t_w} is already branching, and nothing has to be done. Otherwise, the new label l_{t_w} is recognised as branching, and is added to the LBT as a child of l_{t_lbt} (as, given that the input traces are in lexicographic order, t_w ≠ t_lbt implies t_w > t_lbt). As l_{t_lbt} could already have children in the LBT, the tree may need to be rearranged to accommodate the new label l_{t_w}. Given that the input traces are in lexicographic order, the last task is very simple, as at most one child of l_{t_lbt} must be moved. This child, if it exists, must be a label that occurred in the previous trace, i.e., it belongs to the watched array (line 35).

Fig. 4a shows an example of LBT construction starting from a labelled lexicographically ordered sequence of disturbance traces Δ^λ consisting of 6 traces. Note that, out of 25 labels in Δ^λ, only 5 of them belong to the LBT. When

739

Authorized licensed use limited to: Univ of Calif Santa Cruz. Downloaded on October 15,2024 at 21:08:15 UTC from IEEE Xplore. Restrictions apply.
  
       
be safely freed (line 12).
                   
     
Fig. 4(b) shows the simulation campaign computed by
     
  
    
    
               
                
the optimiser on the slice in Fig. 4(a). Except for the first
 
                      command which stores a (the label common to all traces and
representing the simulator initial state), each line represents
(a)
the portion of the simulation campaign stemming from each
store(a) trace. Note that only the first trace is simulated entirely,
load (a) run(0,1) store(b) run(2,1) store(c) run(1,3) run(1,1)
load (c) run(2,2) store(i) run(0,2) while all the others are simulated starting from intermediate,
load (i) free(i) run(3,2) previously stored, states.
load (c) run(3,1) store(p) run(1,1) run(1,2)
load (p) free(p) free(c) run(2,1) run(2,2) C. Optimiser soundness and completeness
load (b) free(b) free(a) run(3,3) run(1,2)

(b) Given an SLFV problem P = (h, d, τ , Δλ , H), Algo-


Figure 4: Simulation campaign optimiser. (a) Construction rithm 1 computes a simulation campaign χτ which is sound
of an LBT from 6 traces (labels are shown as letters). (b) The and complete with respect to Δλ . That is: if the answer
computed optimised simulation campaign. to P is PASS , then the output of the simulator at the end
of the execution of the simulation campaign χτ will be 0
(soundness). On the other hand, if the answer to P is FAIL
and δ λ is the first counterexample in Δλ , then the output of
processing, e.g., trace 6, tlbt = 0 points to label a having the simulator will raise from 0 to 1 during the simulation of
index 0 in trace 6 and tw = 1 points to label b having index a command of χτ stemming from δ λ (completeness).
1 in trace 6 and occurring in array watched. Label b becomes The result above can be proved by formalising the notion
branching as there are previous traces (e.g., trace 5) identical of simulator for H along the lines of [1].
to trace 6 up to b and different from that point on. The new
V. E XPERIMENTAL R ESULTS
branching label b becomes a child of a in the LBT. The
previous child of a, i.e., c, becomes a child of b. In this section we evaluate the effectiveness of our dis-
tributed multi-core approach to SLFV (in short mcSLFV)
B. Computation of the abstract simulation campaign and compare it with the distributed single-core approach
(in short scSLFV) of [1]. For this reason we: (i) use the
Once the LBT is built, the optimiser reads the input slice
very same case study of [1], i.e., the Fuel Control System
a second time to compute the abstract simulation campaign,
(FCS) model included in the Simulink distribution; (ii) run
keeping track of which LBT labels are stored in simulator
experiments on very similar machines, i.e., multiple 3.0
memory at any moment (set stored, line 3 of Algorithm 1).
GHz, 8GB RAM Intel hyperthreaded Quad Core Linux PCs.
For each δ λ = l0 , d0 , . . . , lload , . . . , lh in Δλ , let lload be
The FCS has three sensors subject to faults (disturbances).
the right-most label of δ λ currently stored by the simulator
We verify one of the system level specifications for such
(line 8). The optimiser appends to the output campaign the
a model, namely: the fuel air model variable is never 0
following commands: (i) load(lload ) (line 9); (ii) free(l) for
for more than one second. Accordingly, our SUV consists
each label l ∈ LBT which represents a currently stored
of the Simulink FCS model along with a monitor for the
state that will never occur in future traces (line 12); (iii) a
ˆ steps) for each maximal sub- property under verification. In our setting, the complexity
command of the form run(d,
of the computation of an optimised simulation campaign
sequence of length steps in δ λ (starting from lload ) of the
ˆ li , 0, li , . . . , 0, li d, ˜ ˜l where either d˜ = 0 or label primarily depends on the number of disturbance traces to
form d, 1 2 steps
be simulated. Thus, the worst case for our approach is when
˜l needs to be stored (line 17). In the latter case, command
all disturbance traces have to be simulated, i.e., when the
store(˜l) is appended as well (line 19). Label ˜l needs to be answer to the SLFV problem is PASS . We know that this
stored (see line 15) if it is in the LBT but not yet stored and is the case when no more than one fault occurs within a
it will occur again in a later trace. second. Thus, this will be our disturbance model. We set the
The maximum number of states that the simulator must disturbance traces horizon h to 100 and τ (quantum between
keep stored at any moment is bounded by h (the horizon). disturbances) to 1 second.
This is because, when starting the simulation of the portion
of χτ stemming from any trace δ λ ∈ Δλ , the simulator A. Generation and splitting of simulation scenarios
executes command load (lload ) with lload being a label in trace As [1], we use CMurphi to generate a labelled lexico-
δ λ having index i. Given that the disturbance traces in Δλ graphically ordered sequence Δλ of 4,023,955 disturbance
are in lexicographic order, all labels occurred with indices traces. This takes about 28 minutes and produces a 3.5GB
> i in traces of Δλ before δ λ never occur in traces of Δλ file. We then split such a Δλ into kc slices, with k =
after δ λ . Hence, currently stored states identified by those 8, 16, 32, 64 and c = 8. Splitting takes a few seconds,
labels will not need to be loaded back in the future and can regardless of the value of kc.
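The splitting step can be illustrated with a minimal sketch. It assumes that slices are contiguous blocks of the ordered trace sequence (so that each slice stays lexicographically ordered), that every slice receives the same number of traces, and that the last slice absorbs the remainder. The function name slice_bounds and its interface are hypothetical and are not taken from the paper or its tooling.

```python
from typing import List, Tuple


def slice_bounds(n_traces: int, n_slices: int) -> List[Tuple[int, int]]:
    """Return half-open index ranges [start, end) of contiguous slices.

    Hypothetical helper: every slice gets n_traces // n_slices traces and
    the last slice also takes the remainder, so the original trace order
    is preserved inside each slice.
    """
    if n_slices <= 0 or n_traces < n_slices:
        raise ValueError("need 1 <= n_slices <= n_traces")
    base = n_traces // n_slices
    # All slices but the last have exactly `base` traces.
    bounds = [(s * base, (s + 1) * base) for s in range(n_slices - 1)]
    # The last slice runs to the end of the sequence.
    bounds.append(((n_slices - 1) * base, n_traces))
    return bounds


if __name__ == "__main__":
    total = 4_023_955  # number of disturbance traces reported in the text
    for k in (8, 16, 32, 64):
        b = slice_bounds(total, k * 8)  # c = 8 cores per machine
        first_size = b[0][1] - b[0][0]
        last_size = b[-1][1] - b[-1][0]
        print(k * 8, first_size, last_size)
```

Under these assumptions, only index ranges need to be computed up front; each worker can then read its own block of the trace file sequentially.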

#slices   #traces per slice   scSLFV optimiser   mcSLFV optimiser   time saving %
1         4,023,955           20:27:26           0:7:16             99.41%
2         2,011,977           3:47:57            0:9:43             95.74%
4         1,005,988           1:45:4             0:9:0              91.43%
8         502,994             0:44:27            0:5:27             87.74%
16        251,497             0:16:24            0:2:8              86.99%
32        125,748             0:4:50             0:0:57             80.34%
64        62,874              0:0:51             0:0:29             43.14%
128       31,437              0:0:35             0:0:17             51.43%
256       15,718              0:0:10             0:0:8              20.00%
512       7,859               0:0:5              0:0:4              20.00%

Table I: Comparison between scSLFV optimiser of [1] and our mcSLFV optimiser (time in h:m:s).

#mach   #slices   min        max         avg         stddev/avg %   speedup    efficiency
8       64        180:3:0    205:19:57   194:17:52   4.979%         54.63×     85.35%
16      128       70:6:4     100:17:53   87:49:56    13.772%        111.56×    87.15%
32      256       44:0:27    57:57:27    48:34:6     10.323%        192.38×    75.15%
64      512       18:32:36   26:49:4     23:2:19     11.110%        411.83×    80.43%

Table II: Statistics on the distributed (k = #mach(ines)) multi-core (c = 8) execution of simulation campaigns (time in h:m:s).

#machines   scSLFV #slices   scSLFV time   mcSLFV #slices   mcSLFV time   time saving %
8           8                711:3:33      64               205:49:20     71.05%
16          16               343:24:27     128              100:47:4      70.65%
32          32               167:6:9       256              58:26:29      65.03%
64          64               81:49:3       512              27:18:2       66.63%

Table III: Completion time of the parallel simulation (i.e., completion time of the longest campaign) with respect to the approach of [1] (time in h:m:s).

B. Computation of optimised simulation campaigns

Table I compares the performance of our mcSLFV optimiser with that of the scSLFV optimiser of [1]. Column #slices gives the number of slices into which the sequence of disturbance traces has been partitioned. Column #traces per slice shows the number of traces in any single slice (except the last slice, which may have up to #slices − 1 more traces, as the overall number of traces is not a multiple of #slices). Columns scSLFV optimiser and mcSLFV optimiser show the maximum time needed by, respectively, the optimiser of [1] and our optimiser to compute the simulation campaign from a slice. For each row in Table I, the entry in column time saving % is defined as (t_sc − t_mc)/t_sc, where t_sc and t_mc are, respectively, the entries in columns scSLFV optimiser and mcSLFV optimiser.

Note that, by exploiting the lexicographical order among traces in the input sequence, our mcSLFV optimiser is always faster than the scSLFV optimiser of [1], with time savings ranging from 20% to more than 99%.

C. Execution of the simulation campaigns

Table II shows some statistics on the execution time of the simulation campaigns generated by our mcSLFV optimiser. Note that the standard deviation of the simulation time is always below 15% of the average time (see column stddev/avg %). This shows that the computational load among cores is well balanced.

Column speedup shows the ratios t_1/t_kc, typically used in the evaluation of parallel algorithms. For each row (k = #mach(ines)) of Table II, time t_kc is the overall time needed to carry out the SLFV task with k c-core machines, i.e., the sum of the disturbance trace generation and splitting time (about 28 minutes), optimisation time (from Table I), and the max simulation time (column max) over all the kc = #slices slices. Time t_1 (serial time) is the overall time needed to carry out the SLFV task when only one core is used. Let t^avg_kc be the average time to simulate a slice when kc = #slices cores are used (row #mach(ines) = k, column avg). When using kc cores, the serial time can be estimated as kc × t^avg_kc. As this value changes slightly for different values of k, we estimated serial time t_1 as min{64 t^avg_64, 128 t^avg_128, 256 t^avg_256, 512 t^avg_512}. This leads to t_1 = 491.5 days. From such a huge value it follows that estimation is the only viable way to compute t_1. Note that in our computation we are slightly overestimating the serial time, since we are assuming that the first trace of each slice must be simulated from the initial state. In an actual 1-core execution of a simulation campaign, the optimiser may exploit stored simulator states to avoid simulation of such traces from the initial state. As the time to simulate a single trace is a few seconds, this is negligible with respect to the value of t_1.

Column efficiency in Table II is computed, as typically done in the evaluation of parallel algorithms, by dividing the speedup by the number of parallel processes kc = #slices. In the same fashion, we can estimate the serial time and efficiency of the scSLFV approach of [1]. Serial time is about 200 days, and efficiency is higher than ours, being almost 1. Our loss of efficiency stems from the fact that processes running on different cores of the same machine share the RAM.

Such an efficiency measure does not take into account the cost of the hardware. In fact, to enable kc parallel processes, the scSLFV approach needs kc machines, whereas our mcSLFV approach needs only k machines. Table III investigates this issue by showing the time saving realised by our mcSLFV approach. In particular, for each value of k (#machines), columns scSLFV and mcSLFV show the number of processes (#slices) that can be run on the given machines and the time needed (time) to complete the verification task with, respectively, the approach of [1] and ours. For each row, the entry in column time saving % is defined as (t_sc − t_mc)/t_sc, where t_sc and t_mc are, respectively, the entries in columns scSLFV time and mcSLFV time. Table III shows that, when using the same hardware, our distributed multi-core approach saves at least 65% of verification time. For example, using 64 machines, the verification task using the single-core approach of [1] would need more than 81 hours, while ours needs about 27 hours. Note that a serial approach to verification would need more than 200 days.

VI. CONCLUSIONS

We have presented a distributed multi-core approach to HILS-based SLFV. We have implemented our algorithms and run experiments on a large control system case study in the Simulink distribution, whose operational environment consists of more than 4 million simulation scenarios. Our distributed multi-core approach allows us to complete the verification of such a system in about 27 hours using 64 8-core machines, whereas a sequential computation would require more than 200 days.

To the best of our knowledge, this is the first time that a distributed multi-core algorithm for HILS-based SLFV is presented.

Acknowledgements: Work partially supported by FP7 projects SmartHG (317761) and PAEON (600773). We thank our reviewers for their valuable comments.

REFERENCES

[1] T. Mancini, F. Mari, A. Massini, I. Melatti, F. Merli, and E. Tronci, “System level formal verification via model checking driven simulation,” in Proc. CAV 2013, ser. LNCS, vol. 8044. Springer, 2013, pp. 296–312.

[2] R. Alur, “Formal verification of hybrid systems,” in Proc. EMSOFT 2011. ACM, 2011, pp. 273–278.

[3] F. Cavaliere, F. Mari, I. Melatti, G. Minei, I. Salvo, E. Tronci, G. Verzino, and Y. Yushtein, “Model checking satellite operational procedures,” in Proc. DASIA 2011, 2011.

[4] G. Della Penna, B. Intrigila, I. Melatti, E. Tronci, and M. Venturini Zilli, “Exploiting transition locality in automatic verification of finite state concurrent systems,” STTT, vol. 6, no. 4, pp. 320–341, 2004.

[5] P. Zuliani, A. Platzer, and E. Clarke, “Bayesian statistical model checking with application to simulink/stateflow verification,” in Proc. HSCC 2010, 2010, pp. 243–252.

[6] K. Sen, M. Viswanathan, and G. Agha, “On statistical model checking of stochastic systems,” in Proc. CAV 2005, ser. LNCS, vol. 3576. Springer, 2005, pp. 266–280.

[7] E. Tronci, G. Della Penna, B. Intrigila, and M. Venturini Zilli, “A probabilistic approach to automatic verification of concurrent systems,” in Proc. APSEC 2001. IEEE, 2001.

[8] R. Grosu and S. Smolka, “Monte carlo model checking,” in Proc. TACAS 2005, ser. LNCS, vol. 3440. Springer, 2005.

[9] S. Tripakis, C. Sofronis, P. Caspi, and A. Curic, “Translating discrete-time simulink to lustre,” ACM Trans. Emb. Comp. Syst., vol. 4, no. 4, pp. 779–818, 2005.

[10] B. Meenakshi, A. Bhatnagar, and S. Roy, “Tool for translating simulink models into input language of a model checker,” in Proc. ICFEM 2006, 2006, pp. 606–620.

[11] M. Whalen, D. Cofer, S. Miller, B. Krogh, and W. Storm, “Integration of formal analysis into a model-based software development process,” in Proc. FMICS 2007, 2007.

[12] C. Yang and D. Dill, “Validation with guided search of the state space,” in Proc. DAC 1998. ACM, 1998, pp. 599–604.

[13] P. Ho, T. Shiple, K. Harer, J. Kukula, R. Damiano, V. Bertacco, J. Taylor, and J. Long, “Smart simulation using collaborative formal and simulation engines,” in Proc. ICCAD 2000. IEEE, 2000, pp. 120–126.

[14] K. Nanshi and F. Somenzi, “Guiding simulation with increasingly refined abstract traces,” in Proc. DAC 2006. ACM, 2006, pp. 737–742.

[15] F. De Paula and A. Hu, “An effective guidance strategy for abstraction-guided simulation,” in Proc. DAC 2007. ACM, 2007, pp. 63–68.

[16] U. Stern and D. Dill, “Parallelizing the Murphi Verifier,” Form. Methods Syst. Des., vol. 18, no. 2, pp. 117–129, 2001.

[17] J. Barnat, L. Brim, I. Černá, P. Moravec, P. Ročkai, and P. Šimeček, “Divine: a tool for distributed verification,” in Proc. CAV 2006. Springer, 2006, pp. 278–281.

[18] I. Melatti, R. Palmer, G. Sawaya, Y. Yang, R. Kirby, and G. Gopalakrishnan, “Parallel and distributed model checking in eddy,” Int. J. Softw. Tools Technol. Transf., vol. 11, no. 1, pp. 13–25, 2009.

[19] B. Bingham, J. Bingham, F. De Paula, J. Erickson, G. Singh, and M. Reitblatt, “Industrial strength distributed explicit state model checking,” in Proc. PDMC-HIBI 2010. IEEE, 2010, pp. 28–36.

[20] G. Holzmann, “Parallelizing the SPIN model checker,” in Proc. SPIN 2012. Springer, 2012, pp. 155–171.

[21] A. Laarman, J. van de Pol, and M. Weber, “Boosting multi-core reachability performance with shared hash tables,” in Proc. FMCAD 2010. IEEE, 2010, pp. 247–255.

[22] A. Wijs, “Towards informed swarm verification,” in Proc. NFM 2011, ser. LNCS, vol. 6617. Springer, 2011.

[23] M. Staats and C. S. Pasareanu, “Parallel symbolic execution for structural test generation,” in Proc. ISSTA 2010. ACM, 2010, pp. 183–194.

[24] E. Sontag, Mathematical Control Theory: Deterministic Finite Dimensional Systems. Springer, 1998.

[25] O. Maler and D. Nickovic, “Monitoring temporal properties of continuous signals,” in Proc. FORMATS 2004 and FTRTFT 2004, ser. LNCS, vol. 3253, 2004, pp. 152–166.
