
Symbolic WCET Computation

Clément Ballabriga, Julien Forget, Giuseppe Lipari

To cite this version:

Clément Ballabriga, Julien Forget, Giuseppe Lipari. Symbolic WCET Computation. ACM Transactions on Embedded Computing Systems (TECS), 2017, 17 (2), pp.1-26. DOI: 10.1145/3147413. hal-01665076

HAL Id: hal-01665076


https://hal.science/hal-01665076v1
Submitted on 23 Feb 2021

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Symbolic WCET computation

CLÉMENT BALLABRIGA, JULIEN FORGET, and GIUSEPPE LIPARI, Univ. Lille, CNRS, Centrale Lille,
UMR 9189 - CRIStAL, France

Parametric worst-case execution time (WCET) analysis of a sequential program produces a formula that represents the worst-case execution time of the program, where the parameters of the formula are user-defined parameters of the program (such as loop bounds, values of inputs, or internal variables).
In this paper we propose a novel methodology to compute the parametric WCET of a program. Unlike other algorithms in the
literature, our method is not based on Integer Linear Programming (ILP). Instead, we follow an approach based on the notion of symbolic
computation of WCET formulae. After explaining our methodology and proving its correctness, we present a set of experiments to
compare our method against the state of the art. We show that our approach dominates other parametric analyses, and produces
results that are very close to those produced by non-parametric ILP-based approaches, while requiring very low computation times.

Additional Key Words and Phrases: Worst-case execution time, symbolic evaluation

1 INTRODUCTION
A real-time system is usually represented as a set of tasks. Tasks are subject to timing constraints: typically, the execution
of every instance of a periodic real-time task must be completed before its deadline. In order to guarantee that timing
constraints are met, a worst-case execution time (WCET) analysis is first performed off-line, which calculates an upper
bound on the execution time of each task. Then, this information is used to perform a schedulability analysis and
guarantee that every task will meet its deadline.
In this paper, we focus on WCET analysis. In WCET analysis, first the task code is analysed to model its set of
possible execution paths. Then, the impact of the hardware architecture is taken into account: local effects (timing
of basic blocks of code) and global effects (impact of processor pipeline, caches, and in general interactions between
basic blocks). Finally, an upper bound on the execution time is computed by calculating the worst-case path, taking into
account all effects. A popular technique for doing this, called Implicit Path Enumeration Technique (IPET), is to encode
the problem as an Integer Linear Programming (ILP) problem that is then solved with standard techniques [16].
With traditional WCET analysis, if any of the program parameters is changed, it is necessary to re-run the analysis.
Also, it is difficult to analyze the impact of different parameter values on the final WCET estimate. For example, the
developer may want to know the impact of the number of iterations of a certain loop on the WCET, the impact of
the cache size, etc. To answer these questions, it would be necessary to run the analysis several times with different
parameter values, which could be a very time consuming process.
An alternative approach is to calculate directly a parametric WCET formula instead of a constant value. If the
parameter changes, it is possible to recompute the WCET by simply substituting the parameter value into the formula.
Thus, it is possible to quickly explore the parameters space, which may be very useful in guiding developers at design


time. Similarly, parametric WCET simplifies the analysis process when third-party software is involved, since the
developer can provide a parametric WCET along with the component, that can be adapted to the target system.
In addition, if the obtained formula is simple enough, it can be used to efficiently implement an adaptive real-time
system. Indeed, many system parameters are only known at run-time: loop bounds that depend on input values, software
and hardware state changes, operating system interference, etc. With traditional WCET analysis, adaptive features
would rely on a pre-computed WCET table containing different WCET values for different parameter values. Instead,
with parametric WCET analysis, we can compute off-line a WCET formula that depends on these parameters and
instantiate this formula on-line, at which point parameter values become known. As a result, with low overhead,
we obtain a tighter estimate of the task’s WCET and take better scheduling decisions. This can for instance benefit
energy-aware scheduling techniques based on Dynamic Voltage and Frequency Scaling (DVFS) [18].
Finally, large execution time values may occur only very rarely, for instance for unlikely combinations of input
data. By using parametric WCET analysis, it is possible to design the system according to an upper bound that is safe
for the vast majority of executions of the system, and then evaluate a parametric WCET formula at run-time to trigger
an alternate, less time-consuming computation when the formula returns a value exceeding the safe bound, thus
remaining under that bound.
Contribution. In this paper, we propose a novel approach to parametric WCET analysis based on symbolic computation that greatly improves upon the state of the art on parametric WCET. Unlike the majority of existing WCET analysis
algorithms, our methodology is not based on ILP: instead, we follow an approach based on symbolic computation of
WCET formulae.
We start from a representation of the program as a Control-Flow Graph (CFG), where the nodes of the graph are basic blocks
of code (the notion of CFG is recalled in Section 3). We transform the CFG into a Control-Flow Tree (CFT) (Section 4),
because a tree is more amenable to transformation into arithmetic (symbolic) formulae. To represent global effects,
CFT nodes are annotated with context-sensitive annotations (Section 5): these annotations encode restrictions on the
number of iterations of basic blocks when executed inside loops. They may be considered as the equivalent of ILP
constraints in the IPET method [16]. We then move to the core method for generating a WCET formula. We first
introduce the notion of Abstract WCET (Section 5.2) and how to compute it starting from an annotated CFT in the
absence of parameters. Later, we introduce WCET parameters (Section 6) and we enunciate the rules for symbolic
computation and simplification of Abstract WCET formulae. Finally, in Section 7 we present experimental data that
compare our approach with the state of the art algorithms. We show that our algorithm produces results that are very
close to those of non-parametric ILP-based approaches, while requiring very little computation time. We also show that
simplified WCET formulae are very small, which implies low memory and execution time overhead in case of on-line
formula evaluation. Finally, we show that our approach dominates other parametric WCET analyses. This paper focuses
on the generic framework for symbolic WCET evaluation and only briefly outlines some applications in Section 6.2.
More complex applications (e.g. data-cache analysis) are out of the scope of this paper and are subject to future work.

2 RELATED WORK
Various existing works suggest using symbolic methods in WCET analysis. However, their goal differs from ours. For
example, [4, 6, 8] use symbolic execution as a method to reduce the duration of the WCET analysis. In [21], the authors
use symbolic states to model the effect of pipelines on the WCET. The objective of these papers is not to produce a
parametric WCET formula.

In [2], a technique is presented to perform a partial, composable WCET analysis. This work addresses mostly the
software and hardware modeling that occurs before the WCET computation proper. Results are presented for the
instruction cache and branch prediction analysis, and loop bounds estimation. However, no solution is provided to
perform the ILP computation parametrically.
Feautrier [11] presented a method for parametric ILP computation. The ILP solver presented in [11] (called PIPLib)
takes a parametrized ILP system as input, and produces a quast (quasi-affine selection tree). Once computed, this
tree can be evaluated for any valid parameter values, without having to re-run the solver. However, this approach
is computationally very expensive. Experiments [7] have shown that PIPLib does not scale well when applied in the
context of IPET. The MPA (Minimum Propagation Algorithm) [7] attempts to address these shortcomings. MPA takes
as input the results of the software and hardware modeling analysis, and produces directly a parametric WCET formula.
Compared with MPA, our method is significantly tighter because it takes into account various context-sensitive software
and hardware timing effects.
In the past, many tree-based WCET computation methods have been presented [17]. In [10], the authors suggest a
method to compute parametric WCETs using a tree-based approach. Our approach is also based on trees, but unlike [10]
it can work directly on the binary code. Furthermore, our method can model timing effects in a more generic and
accurate way thanks to context annotations (Section 5).
ParaScale [18] is an approach to exploit variability in execution time to save energy. By statically analyzing the tasks,
a parametric WCET formula is given for loops in terms of the loop iteration count. At run-time, before entering a loop,
the formula is evaluated and the system dynamically scales the voltage and frequency of the processor. In comparison,
the parameters in our method are not limited to loop bounds.

3 CONTROL-FLOW GRAPH
In this section we recall the definition of Control-Flow Graphs (CFG), the input model in our approach. The CFG is
extracted from the binary code of the task under analysis.

Definition 1. A Control-Flow Graph (CFG) is a directed graph G = <B, E>. The set of vertices B corresponds to the
set of basic blocks of the program represented by the CFG. A directed edge (bi, bj) ∈ E (where E ⊆ B × B) represents a
valid succession of two basic blocks in the program execution. We denote by time(b) the worst-case execution time (WCET)
of block b.

An entry node is a node without incoming edges, and an exit node is a node without outgoing edges. We assume,
without loss of generality, that a CFG has one single entry node and one single exit node (otherwise, it is always possible
to add fictive entry and exit nodes with the corresponding edges). We also assume that each node is reachable from the
entry node, and that the exit node is reachable from any node.
An execution path is a sequence of nodes (basic blocks): p.b denotes a path whose last node is b; p1@p2 denotes the
path consisting of path p1 followed by path p2. By abuse of notation, we also denote by b the path consisting only of node
b. ϵ denotes the empty path.

Definition 2. Let G = <B, E> be a CFG. Let p = b1 . . . bk be an execution path. We say that p is a valid path of G (or
simply a path of G) iff:

∀i ∈ {1, . . . , k}, bi ∈ B ∧ ∀i ∈ {1, . . . , k − 1}, (bi, bi+1) ∈ E



If b1 is an entry node of G and bk is an exit node of G, then p represents a complete execution of the program represented
by G.

Definition 3. Let p = b1 . . . bk be an execution path. We have: time(p) ≡ Σ_{i=1}^{k} time(bi)

We introduce now a set of additional definitions concerning the CFG topology that will allow us to manipulate the
CFG in the following sections.

Definition 4. Let G =< B, E >. Let bi , b j , h ∈ B and let h be a loop header (see definition below).

• We say that bi is a predecessor of bj, and denote bi → bj, iff (bi, bj) ∈ E;
• We say that bi dominates bj, and denote bi ≫ bj, iff all paths from the entry node to bj go through bi;
• bk is the immediate dominator of bi iff bk ≫ bi, bk ≠ bi, and there exists no bk′ such that bk′ ≠ bi, bk′ ≠ bk,
bk′ ≫ bi and bk ≫ bk′;
• h is a loop header if it has at least one predecessor bi such that h ≫ bi . We denote lh the loop associated to header h;
• An edge (bi , h) such that h ≫ bi is called a back-edge of lh ;
• An edge (bi , h) that is not a back-edge is called an entry-edge of lh .
• The body of the loop of header h, denoted body(h), is the set of all nodes bi such that bi belongs to a path P, where P
starts with h, ends with a back-edge of lh and does not go through any entry-edges of lh .
• An edge (bi, bj) such that bi ∈ body(h) and bj ∉ body(h) is called an exit-edge of lh;
• An execution path of a loop lh is a path p = h.b1 . . . bn , where there exists an exit edge (bn , bx ) of lh . Note that b1 ,
bn , h may actually not be distinct. The number of iterations of lh in p corresponds to the number of back-edges in p.
The maximum number of iterations of the loop lh , denoted by xh , is the maximum of the number of iterations of any
execution path of lh .
• Let lh, lh′ be two loops of G. We say that lh contains lh′, and denote lh′ ⊑ lh, iff h′ ∈ body(h);
• The loop lh immediately contains bi iff bi ∈ body(h) and there exists no loop lh′ ≠ lh such that bi ∈ body(h′) and
lh′ ⊑ lh.
• The set of loops of graph G is denoted LG .

We define two additional loops, that are not actually part of the represented program:

• ⊤ is such that for all l ∈ LG , l ⊑ ⊤. In other words, ⊤ is a fictive loop whose body is the whole CFG (body(⊤) = G);
• ⊥ is such that for all l ∈ LG , ⊥ ⊑ l. In other words, ⊥ is a fictive empty loop (body(⊥) = ∅).

Property 1. (L′G = LG ∪ {⊤, ⊥}, ⊑) is a lattice.

Proof. Trivial due to the definition of ⊤ and ⊥. □

In the following:

• ⊔ : L′G × L′G → L′G denotes the least upper bound, i.e. l1 ⊔ l2 is the least element of {l ∈ L′G | l1 ⊑ l ∧ l2 ⊑ l};
• ⊓ : L′G × L′G → L′G denotes the greatest lower bound, i.e. l1 ⊓ l2 is the greatest element of {l ∈ L′G | l ⊑ l1 ∧ l ⊑ l2}.

Figure 1a shows a simple CFG. Nodes b1 and b2 are loop headers. Loop lb1 contains b1, b2, b3, b4, b6, but it
immediately contains only b1, b3 and b6. (b3, b1) is a back-edge and (b1, b5) is an exit-edge for loop lb1. Loop lb2 is contained
within loop lb1. b1 dominates all the other nodes of the CFG. b1 is the immediate dominator of b3.
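For illustration, the definitions above can be prototyped directly (a minimal Python sketch, not part of the original formalization; the encoding of the CFG as a dictionary of successor lists, and the concrete edge set below, are our assumptions):

    def dominators(cfg, entry):
        # Iterative dataflow: dom(n) = {n} union (intersection of dom(p) over
        # all predecessors p of n), with dom(entry) = {entry}.
        nodes = set(cfg)
        preds = {n: set() for n in nodes}
        for n, succs in cfg.items():
            for s in succs:
                preds[s].add(n)
        dom = {n: set(nodes) for n in nodes}
        dom[entry] = {entry}
        changed = True
        while changed:
            changed = False
            for n in nodes - {entry}:
                ps = preds[n]
                new = ({n} | set.intersection(*(dom[p] for p in ps))) if ps else {n}
                if new != dom[n]:
                    dom[n], changed = new, True
        return dom

    def back_edges(cfg, dom):
        # (b, h) is a back-edge iff h dominates b (Definition 4);
        # each such h is a loop header.
        return [(b, h) for b, succs in cfg.items() for h in succs if h in dom[b]]

    # A CFG consistent with the description of Figure 1a (edge set partly assumed):
    cfg = {"b1": ["b2", "b6", "b5"], "b2": ["b4", "b3"], "b4": ["b2"],
           "b6": ["b3"], "b3": ["b1"], "b5": []}
    print(back_edges(cfg, dominators(cfg, "b1")))  # [('b4', 'b2'), ('b3', 'b1')]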

[Figure: (a) CFG representation; (b) CFT representation]

Fig. 1. A program with two nested loops.

4 CONTROL-FLOW TREE
We propose to translate the CFG into a Control-Flow Tree (CFT), which also represents the possible execution paths of a
program but, thanks to its tree structure, is more amenable to recursive WCET analysis than a CFG. A Control-flow Tree is
similar to the Abstract Syntax Trees used in programming language compilation, except that it represents the structure of
binary code. As such, it will be quite natural to represent the WCET of a CFT as an arithmetic expression (see Section 6).

4.1 Definition
The set of Control-flow Trees T is defined inductively as follows:

Definition 5. Let n ∈ N∗ and t1, . . . , tn ∈ T. A control-flow tree t ∈ T is one of:


• Leaf(b), which represents the execution of basic block b ∈ B;
• Alt(t 1, . . . , tn ), which represents an alternative between the execution of trees t 1 , . . ., tn ;
• Loop(h, t 1, n, t 2 ), which represents a loop with header h, that repeats the execution of tree t 1 , with a maximum
number of iterations n, and exits from the loop executing the tree t 2 ;
• Seq(t 1, . . . , tn ), which represents a sequential execution of trees t 1 , . . ., tn .

As an example, Figure 1b shows the tree corresponding to the CFG of Figure 1a. In the following sections, we will
use this example to describe the steps of the conversion from CFG to CFT. Our definition of loops considers that we
repeat a sub-tree and then execute a different sub-tree when finishing the loop. This makes it possible to represent a wide variety
of loops: for, while, do...while, etc.
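For illustration, the four node kinds map directly onto algebraic datatypes. Below is a minimal Python sketch (Python 3.10+ for the union syntax; not the authors' implementation), where the tree fig1b is our reading of Figure 1b, with placeholder loop bounds of 10 since the figure leaves them unspecified:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Leaf:
        block: str              # basic block b

    @dataclass
    class Alt:
        children: List["Tree"]  # alternative branches t1, ..., tn

    @dataclass
    class Loop:
        header: str             # loop header h
        body: "Tree"            # repeated sub-tree t1
        bound: int              # maximum number of iterations n
        exit: "Tree"            # sub-tree t2 executed when leaving the loop

    @dataclass
    class Seq:
        children: List["Tree"]  # sequential composition t1, ..., tn

    Tree = Leaf | Alt | Loop | Seq

    # Our reading of Figure 1b (loop bounds assumed equal to 10):
    fig1b = Seq([Loop("b1",
                      Seq([Leaf("b1"),
                           Alt([Loop("b2", Seq([Leaf("b2"), Leaf("b4")]), 10,
                                     Leaf("b2")),
                                Leaf("b6")]),
                           Leaf("b3")]),
                      10, Leaf("b1")),
                 Leaf("b5")])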

4.2 From CFG to Control-flow Tree


Algorithm 1 translates a loop of the CFG into a Directed Acyclic Graph (DAG) that represents the loop body. Algorithm 2
is the recursive procedure that generates the complete control-flow tree. It relies on Algorithm 1 to process the CFG
loops.
Our control-flow tree construction method works only for CFGs that contain no irreducible loops (i.e. loops with
multiple entries). In the general case, it is possible to transform CFGs with irreducible loops by using node splitting [14]

[Figure: (a) DAG for loop lb1; (b) DAG for loop lb2; (c) CFT for the body of loop lb1]

Fig. 2. From loop to DAG and CFT

algorithms. In [20] the authors show that it is possible to detect the set of irreducible loops in a CFG in O(n²). While the
complexity of the node-splitting algorithm is not reported, the algorithm is meant to be executed only on irreducible
loops, which usually constitute a small part of the analysed program.

4.2.1 Loop to DAG (Algorithm 1). The DAG produced for a loop lh represents its body. In this DAG, inner loops
are replaced by hierarchical nodes, which themselves correspond to separate DAGs. For instance, Figure 2a shows the
DAG produced for loop lb1 (the construction steps and the meaning of nodes exit and next are detailed below). Lb2 is a
hierarchical node representing loop lb2 . The DAG produced for loop lb2 is shown in Figure 2b. In the remainder of this
section, we use the example of Figure 2a to illustrate Algorithm 1.
Algorithm 1 constructs the DAG corresponding to a loop lh . At line 2 the algorithm adds all nodes immediately
contained in lh to the DAG nodes. Any edge in the CFG between these nodes is added to the DAG edges (line 3). In our
example, this corresponds to nodes b1 , b3 , b6 and to edges (b1, b6 ) and (b6, b3 ).
Virtual next and exit nodes are created to represent, respectively, transferring control to the next iteration and exiting
the loop (lines 4 and 5). For any back-edge (bi, bj) of lh in the CFG, we add a corresponding edge in the DAG, from bi to
the virtual next node (line 6). Similarly, for any exit edge (bi , b j ) of lh in the CFG we add a corresponding edge in the
DAG, from bi to the virtual exit node (line 7). In our example, we have an edge (b1, exit) and an edge (b3, next).
Inner loops are handled by the for loop in lines 9–14. For each loop lh′ directly contained in lh, we create a hierarchical node Lh′.
For each exit edge (bi , b j ) of lh′ , an edge (Lh ′ , b j ) is created (line 11) and for each entry edge (bi , b j ), an edge (bi , Lh ′ ) is
created (line 12). In our example, a hierarchical node Lb2 is created to represent the loop lb2 (which is directly in loop
lb1 ) and we also create edges (b1, Lb2 ) and (Lb2 , b3 ).
We assumed in Section 3 that the whole CFG is the body of a (fictive) loop ⊤. Therefore, the whole CFG can also be
transformed into a DAG using Algorithm 1. It produces a hierarchy of DAGs corresponding to the CFG containing only
reducible loops.
Note that similar algorithms have been proposed in [22]. However, the most notable difference between the work
presented in [22] and our approach is that, while our transformation may not preserve the semantics of the program,
we guarantee that it does not decrease the execution time. On the contrary, the method proposed in [22] guarantees the
preservation of the program semantics, but not the execution time.

Algorithm 1 Loop to DAG

1: function DAG(G = <B, E>, lh ∈ LG ∪ {⊤})
2:   Bd ← {n | lh immediately contains n}
3:   Ed ← {(bi, bj) | bi ∈ Bd ∧ bj ∈ Bd ∧ (bi, bj) ∈ E}
4:   n ← new virtual node (next)
5:   e ← new virtual node (exit)
6:   sn ← {(bi, n) | ∃bj, (bi, bj) back-edge of lh}
7:   se ← {(bi, e) | ∃bj, (bi, bj) exit-edge of lh}
8:   (v, i, o) ← (∅, ∅, ∅)
9:   for each loop lh′ directly in lh do
10:    Lh′ ← new hierarchical node
11:    ih′ ← {(Lh′, bj) | ∃bi, (bi, bj) exit-edge of lh′}
12:    oh′ ← {(bi, Lh′) | ∃bj, (bi, bj) entry-edge of lh′}
13:    (v, i, o) ← (v ∪ {Lh′}, i ∪ ih′, o ∪ oh′)
14:  end for
15:  d ← <Bd ∪ v ∪ {n, e}, Ed ∪ i ∪ o ∪ sn ∪ se>
16:  return (d, n, e)
17: end function
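For illustration, Algorithm 1 can be sketched in Python as follows. To keep the sketch self-contained, the quantities that the algorithm derives from the CFG (immediately contained nodes, back-/exit-/entry-edges, directly contained loops) are passed in as precomputed inputs; the string encoding of virtual and hierarchical nodes is an assumption of ours:

    def loop_to_dag(immediate_nodes, inner_edges, back_srcs, exit_srcs, inner_loops):
        # inner_loops maps each directly contained loop header h' to a pair
        # (sources of its entry-edges, targets of its exit-edges).
        nodes = set(immediate_nodes)              # line 2
        edges = set(inner_edges)                  # line 3
        nxt, ext = "next", "exit"                 # virtual nodes (lines 4-5)
        edges |= {(b, nxt) for b in back_srcs}    # line 6
        edges |= {(b, ext) for b in exit_srcs}    # line 7
        for h2, (entry_srcs, exit_dsts) in inner_loops.items():
            L = "L_" + h2                         # hierarchical node (line 10)
            nodes.add(L)
            edges |= {(L, b) for b in exit_dsts}  # line 11
            edges |= {(b, L) for b in entry_srcs} # line 12
        nodes |= {nxt, ext}
        return nodes, edges, nxt, ext             # lines 15-16

    # Loop lb1 of Figure 1a, reproducing the DAG of Figure 2a:
    nodes, edges, nxt, ext = loop_to_dag(
        {"b1", "b3", "b6"}, {("b1", "b6"), ("b6", "b3")},
        back_srcs={"b3"}, exit_srcs={"b1"},
        inner_loops={"b2": ({"b1"}, {"b3"})})
    # edges now also contain ('b3','next'), ('b1','exit'),
    # ('b1','L_b2') and ('L_b2','b3')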

4.2.2 Tree construction (Algorithm 2). First, we introduce the notion of forced passage nodes, upon which the recursive
structure of our algorithm relies. Intuitively, these correspond to the set of nodes that appear in every path to the end
node of a DAG.

Definition 6. Let D be a DAG. Let start be the start node and end an exit node of D. The set of forced passage nodes of D
towards end, denoted forced(D, end), is defined as:

forced(D, end) = {n ∈ D | start ≫ n ∧ n ≫ end} \ {start}

The function MakeCFT described by Algorithm 2 builds recursively a control-flow tree from a DAG. Notice that
this function takes as arguments a start node and an end node. This is because in some cases it is useful to build the
control-flow tree representing paths between two arbitrary nodes that are different from the entry and exit nodes of the
DAG (see the different recursive calls in the algorithm for details).
Function MakeCFT returns a Seq node. The list of children for this Seq node is contained in variable ch. We will call
this Seq node the current sequential node.
We denote by N the set of forced passage nodes towards end. In the while loop (lines 7 to 19), the algorithm goes
through N in reverse dominance order (i.e. from the end to the start). Since we must pass through all nodes in N, it is
clear that each node in N must be a leaf child of the current sequential node (line 18). As an example, consider the tree
obtained from the example of Figure 2a, which is represented in Figure 2c. During each iteration of loop lb1 , we are
forced to pass through b1 and b3 , so N = {b1, b3 }. Therefore, the control-flow tree has a Seq node as root, with children
b1 and b3 , as well as an Alt node whose construction is explained below.
If there exist multiple possible paths between two adjacent forced passage nodes (line 10), then an Alt node must
be added to the ch list. We construct a tree for each possible predecessor by recursively calling MakeCFT, and the Alt
node contains these trees as children (lines 13 to 15). In our example, the node b3 has two predecessors, Lb2 and b6 . The
control-flow trees corresponding to these two predecessors are respectively Leaf(Lb2 ) and Leaf(b6 ).
In lines 20 to 25, the algorithm deals with inner loops. Inner loops have previously been added to the ch list as
hierarchical Leaf nodes. Here, they are replaced by control-flow trees representing these loops. Such a tree is composed
of two parts, in sequence. The first part is the loop body (line 22), representing all the iterations of the loop. The second
part is the loop exit ex (line 23), which represents the paths from the last execution of the loop header, to the loop exit.

For instance, in Figure 1b the sub-tree depicted in gray replaces the hierarchical node Lb2 . The left part of this sub-tree
corresponds to the body of loop lb2 , while the right part (below the dashed edge) corresponds to the exit of loop lb2 .
We note that in the algorithm, a single basic block can sometimes be represented by several Leaf nodes. When
such duplication occurs, we rename the duplicated basic block(s) such that each Leaf node has a unique label. This
guarantees that two different paths in the tree are always identified by different sequences of Leaf nodes.

Algorithm 2 DAG to control-flow tree

1: function MakeCFT(D, start, end)
2:   ch ← ∅
3:   N ← forced(D, end)
4:   if start has no predecessors then
5:     ch ← {Leaf(start)}
6:   end if
7:   while N ≠ ∅ do
8:     Pick c from N such that ∀n ∈ N, n ≫ c
9:     N ← N \ {c}
10:    if c has at least 2 predecessors then
11:      br ← ∅
12:      ncd ← immediate dominator of c in N
13:      for p in predecessors(c) do
14:        br ← br ∪ MakeCFT(D, ncd, p)
15:      end for
16:      ch ← ch ∪ Alt(br)
17:    end if
18:    ch ← ch ∪ Leaf(c)
19:  end while
20:  for all Leaf(c) ∈ ch, c representing lh do
21:    (D′, n, e) ← DAG(lh)
22:    bd ← MakeCFT(D′, h, n)
23:    ex ← MakeCFT(D′, h, e)
24:    Replace Leaf(c) by Loop(h, bd, xh, ex)
25:  end for
26:  return Seq(ch)
27: end function

4.3 Execution paths in CFG and Control-flow Tree


We will now establish a correspondence between CFG execution paths and tree execution paths. This subsection
contains the general idea and definitions. For a complete proof, see Appendix A.
First, we denote gpaths(G, e) the function that, given a graph G and a node e, returns the set of execution paths
{p1, . . . , pk } from the graph entry to the node e.
Second, a tree execution path is defined as a sequence of leaf nodes of the tree. We use the same notation for paths in
the CFG and for paths in the tree, with the obvious correspondence between leaf nodes and basic blocks. The function
tpaths(t) returns the set of tree execution paths of control-flow tree t. It is defined as follows:

Definition 7. Let t be a Control-flow tree. The set of feasible execution paths of t, denoted tpaths(t), is defined
inductively as follows:

tpaths(Leaf(b)) = {b}
tpaths(Seq(t1, . . . , tn)) = {p | ∃p1 ∈ tpaths(t1), . . . , ∃pn ∈ tpaths(tn), p = p1@ . . . @pn}
tpaths(Loop(h, tb, n, te)) = {p | ∃p1, . . . , pn ∈ tpaths(tb), ∃pe ∈ tpaths(te), p = p1@ . . . @pn@pe}
tpaths(Alt(t1, . . . , tn)) = ⋃_{1≤i≤n} tpaths(ti)
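Definition 7 can be transcribed almost literally, reusing the CFT dataclasses sketched in Section 4.1 (paths are tuples of block names; annotations are ignored and the enumeration is exponential, so this sketch is for illustration only):

    from itertools import product

    def tpaths(t):
        if isinstance(t, Leaf):
            return {(t.block,)}
        if isinstance(t, Alt):    # union over the children
            return set().union(*(tpaths(c) for c in t.children))
        if isinstance(t, Seq):    # concatenations p1 @ ... @ pn
            return {sum(ps, ()) for ps in product(*(tpaths(c) for c in t.children))}
        if isinstance(t, Loop):   # n body paths followed by an exit path
            return {sum(ps, ()) + pe
                    for ps in product(tpaths(t.body), repeat=t.bound)
                    for pe in tpaths(t.exit)}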

Let us denote Ds and De respectively the start and exit nodes of DAG D. Let G e denote the exit node of G. The
following theorem states the correctness of our translation from a CFG to a Control-flow Tree: any execution path
in the CFG is also an execution path in the corresponding Control-flow Tree. However, some paths that are valid in
the tree may not be valid in the CFG, therefore, the two representations are not equivalent. Still, this is safe, since the
presence of additional paths in the CFT can only lead to an over-approximation of the WCET.

[Figure: (a) tree before annotation; (b) tree after annotation, where Leaf(b4) is split into Leaf(b4h) and Leaf(b4m), the latter carrying the annotation (b4m, lb2, 1)]

Fig. 3. Context annotations

Theorem 4.1. Let G be a CFG. Let D = DAG(G, ⊤) and let t = MakeCFT(D, Ds , De ). We have:

gpaths(G, G e ) ⊆ tpaths(t)

Proof. See Appendix A for details. □

5 CONTEXT-SENSITIVE EXECUTION TIME


We now enrich the control-flow tree with context annotations designed to represent the results of extra-CFG analyses,
which will help us reduce the pessimism in WCET estimation.

5.1 Context annotations


A context annotation constrains the conditions under which a sub-tree can be executed. In this work, annotations only
represent constraints related to loops, which is usually the main source of WCET variability. Note that with IPET-based
approaches, this information would be represented by an ILP constraint. We will detail the role of context annotations
in parametric WCET in Section 6.

Definition 8. A context annotation is a tuple (t, l, m), where t is a tree, l refers to an external loop (i.e., l =
Loop(h, tb, xh, te) is a loop such that t is contained within the loop body tb), and m is the maximum number of times t can
be executed each time l is entered. The null annotation is denoted by (t, ⊤, ∞).
Let ann(t) be the annotation on the root of tree t and let ann∗(t) be the set of annotations on all nodes in t (including
root t).
We define occ : 2^P × P → N as the function such that occ(P, p) returns the number of occurrences of any path of P inside
path p.
Let t be a control-flow tree with context annotations. The previous definition of feasible execution paths is altered as
follows:

tpaths(Loop(h, t1, n, t2)) = {p | ∃p1, . . . , pn ∈ tpaths(t1), pe ∈ tpaths(t2), p = p1@ . . . @pn@pe,
                              ∀(t, lh, m) ∈ ann∗(t1), occ(tpaths(t), p) ≤ m}

We motivate the need to represent context-sensitive information by using two examples. First, let us consider a
triangular loop: a for loop i = 1..10, containing an inner for loop j = i..10. The maximum iteration count for each loop
considered separately is 10, but the inner loop body can be executed at most Σ_{i=1}^{10} i times. Knowing this information

will enable us to produce a tighter WCET estimation. To model this example, we have a Leaf(b) node representing the
block inside the inner loop. This node has an annotation (Leaf(b), louter, 55), where louter represents the outer loop.
This annotation represents the fact that, due to the triangular loop, the block b can be executed at most
Σ_{j=1}^{10} j = 55 times in a complete execution of louter.


As a second example, we consider the instruction cache analysis by categorization. In this approach, blocks can be
categorized as persistent with respect to a loop (for the sake of simplicity, we assume that each basic block matches
exactly a cache block), meaning that the block will stay in the cache during the whole execution of the loop (only
the first execution results in a cache miss). For instance, in the control-flow tree of Figure 3a, let us assume that the
block corresponding to Leaf(b4 ) is persistent. For every complete execution of loop lb2 , b4 can only cause a cache miss
once. Thus the execution time of b4 must account for the cache miss only once per complete execution of loop lb2 . To
model this example, we proceed in two steps. First, we modify the CFT by splitting the block b4 from Figure 3a into two
(virtual) leaves, representing respectively the cache hit and cache miss cases. This is shown in Figure 3b: Leaf(b4m )
corresponds to the miss and Leaf(b4 h ) to the hit. Then, we add an annotation (b4m , lb2 , 1) to represent the fact that b4m
can be executed only once per execution of loop lb2 .
Due to context annotations, some structurally feasible paths are now infeasible. As an example, in the tree of
Figure 3b, path {b2, b4m, b2, b4m, b2} is feasible if we ignore annotations. However, taking context annotations into
account, this path is not.
Context annotations are intended to be a generic tool to model various WCET-related effects (hardware and software);
therefore, the exact way to generate those annotations will depend on the effect we want to model (and on the underlying
analysis). Furthermore, as shown with the cache example above, it may be necessary to modify the CFT to represent
some constraints. In the future, we might use other CFT transformations to represent other types of constraints (not
necessarily only duplication).

5.2 Abstract WCET


Due to context annotations, the WCET of a segment of code that is executed iteratively can vary at each iteration. We
introduce the concept of abstract WCET to represent the set of WCETs associated with a tree node. Abstract WCETs are
defined using multi-sets, a generalization of sets where multiple instances of the same element are allowed. The number
of instances of some element k in the multi-set is denoted m(k) and called its multiplicity. In our context, we consider that
the smallest element of the multi-set has an implicit infinite multiplicity. We recall below some definitions on multi-sets:

Definition 9. Let N# denote the set of multi-sets over N. Let η, η′ ∈ N# and let n ∈ N. The following operations are
defined on multi-sets:
• η[n] denotes the (n + 1)-th greatest element of η, i.e. |{k | k ∈ η, k > η[n]}| ≤ n < |{k | k ∈ η, k > η[n]}| + m(η[n]).
For instance, if η = {4, 3, 3} then η[0] = 4, η[1] = η[2] = 3, ...;
• η|n denotes the multi-set that contains the n greatest elements of η (i.e. η[0], . . . , η[n − 1]) and an infinite number of
zeros;
• η ⊎ η′ is a modified version of the traditional multi-set sum, which we will denote ⊎trad. Like ⊎trad, ⊎ sums
multiplicities. The difference is as follows. Let minη, minη′ denote respectively the smallest elements of η and η′.
Then, we have: η ⊎ η′ = (η ⊎trad η′) \ {k | k < max(minη, minη′)}. So for instance, {8, 8, 4} ⊎ {9, 8, 3, 2} = {9, 8, 8, 8, 4};
• η ⊗ k denotes the multi-set in which each member has k times the multiplicity it has in η;
• η′′ = η ⊕ η′ is the multi-set such that: ∀i ∈ N, η′′[i] = η[i] + η′[i].
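The operations of Definition 9 admit a compact encoding: keep the elements as a descending list whose last (hence smallest) element implicitly repeats forever. The sketch below is our illustrative reading of the definition, not the authors' implementation; the assertions reproduce the example of Definition 9 and Example 5.2:

    from math import inf

    class MSet:
        def __init__(self, elems):
            self.e = sorted(elems, reverse=True)     # e.g. [4, 3, 3]
        def __getitem__(self, n):                    # eta[n]: the (n+1)-th greatest
            return self.e[n] if n < len(self.e) else self.e[-1]
        def top(self, n):                            # eta|n: n greatest, then zeros
            return MSet([self[i] for i in range(n)] + [0])
        def merge(self, other):                      # the modified sum
            cut = max(self.e[-1], other.e[-1])
            return MSet([k for k in self.e + other.e if k >= cut])
        def add(self, other):                        # the rank-wise sum
            n = max(len(self.e), len(other.e))
            return MSet([self[i] + other[i] for i in range(n)])
        def times(self, k):                          # eta multiplied by k
            if k == inf:
                return MSet(self.e)                  # the tail already repeats forever
            return MSet([x for x in self.e for _ in range(k)])

    assert MSet([8, 8, 4]).merge(MSet([9, 8, 3, 2])).e == [9, 8, 8, 8, 4]
    assert MSet([5, 4]).add(MSet([2, 1])).e == [7, 5]  # cf. Example 5.2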

The notion of abstract WCET is now defined as follows:

Definition 10. For any tree t, its abstract WCET is a pair α = (l, η), where l is a loop and η is a multi-set over N. The
presence of an integer n in η means that the code associated with t may have an execution time n, but only once, each time l
is entered.

For instance, in our cache example from Figure 3b, the abstract WCET computed for the Alt node would be
(lb2 , {time(b4m ), time(b4 h ), time(b4 h ), time(b4 h ), ...}), meaning that the WCET of that node is time(b4m ) for the first
iteration of loop lb2 and then it is time(b4 h ) for all subsequent iterations of the loop. Note that, if we exit and re-enter
the loop, the WCET of the Alt node will again be time(b4m ), then time(b4 h ), time(b4 h ), etc.
The abstract WCET for an expression t ∈ T is computed by applying the evaluation function γ : T → L′G × N#,
defined below, using the helper function ω(t):
γ(t) = (l, η) where l = l1 ⊓ l2, η = η1|n, (l1, η1) = ω(t) and (t, l2, n) = ann(t).
ω(t) computes the abstract WCET without considering the annotation on the root node of t, and then γ(t) computes
the abstract WCET resulting from the application of the annotation over t (if any). Notice that, if no annotation is
defined over t, then ann(t) = (t, ⊤, ∞); as a consequence l ⊓ ⊤ = l, η|∞ = η, and γ(t) = ω(t).
We now define ω(t) for the different cases. First, when t = Leaf(b), the WCET of the basic block b is repeated an
infinite number of times. In formula:
ω(t) = (⊤, {time(b)} ⊗ ∞)
The idea behind the processing of Alt nodes is based on the following observation: the worst-case scenario for
multiple executions of the Alt node may involve execution of different children. Therefore, we need to merge the
multi-sets resulting from t 1, . . . , tn . In formula, when t = Alt(t 1, . . . , tn ):
ω(t) = (l 1 ⊓ · · · ⊓ ln , η 1 ⊎ · · · ⊎ ηn ) where (l 1, η 1 ) = γ (t 1 ), . . . , (ln , ηn ) = γ (tn ).

Example 5.1. Let us consider an Alt node with two children t 1 and t 2 , such that γ (t 1 ) = (l, {5, 4, 2, 1}) and γ (t 2 ) =
(l, {6, 2}). The first time the Alt node is executed, the WCET will be 6 (from t 2 ), the second time it will be 5 (from
t1), then 4, and so on. As such, we compute the abstract WCET for the Alt node by merging (⊎) the multi-set
components of the two children's abstract WCETs. Therefore, in our example, ω(t) = (l, {6, 5, 4, 2, 2}).

When t = Seq(t 1, . . . , tn ), we make the following observation: for any n, the worst-case time for n executions of the
Seq node is equal to the worst-case time for n executions of t 1 plus the worst-case time for n executions of t 2 and so on.
In formula:
ω(t) = (l 1 ⊓ · · · ⊓ ln , η 1 ⊕ · · · ⊕ ηn ) where (l 1, η 1 ) = γ (t 1 ), . . . , (ln , ηn ) = γ (tn ).

Example 5.2. Let us consider a Seq node with two children t 1 and t 2 , such that γ (t 1 ) = (l, {5, 4}) and γ (t 2 ) = (l, {2, 1}).
The first time the Seq node is executed, its WCET will be 5+2 = 7, the second time it will be 4+1 = 5. As such, we compute
the abstract WCET for the Seq node by adding elements of corresponding ranks. In the example, ω(t) = (l, {7, 5}).

When t = Loop(h, t1, xh, t2), let (l1, η1) = γ(t1) and (l2, η2) = γ(t2). Two different cases must be considered (notice
that, by definition of context annotation, it is not possible to have lh ≡ l2). If lh is the loop component of the abstract
WCET of t1 (case l1 ≡ lh), then the execution time of t1 is a fixed value. In this case, the worst-case time for one
execution of the Loop node is always the worst-case execution time for xh executions of the loop body t1.

Otherwise, l 1 represents a loop that contains the currently processed Loop node. As in the previous case, the worst-
case execution time for one execution of the Loop node is the worst-case execution time for xh executions of the loop
body t 1 . However, since l 1 refers to an outer loop, successive executions of the Loop node yield different execution
times, and these are summed together in groups of xh elements.
To summarize, in formula:

ω(t) = (l2, ({Σ_{i=0}^{xh−1} η1[i]} ⊗ ∞) ⊕ η2)   if lh ≡ l1
ω(t) = (l1 ⊓ l2, η ⊕ η2)                          otherwise

where (l1, η1) = γ(t1), (l2, η2) = γ(t2) and η[i] = Σ_{j=i·xh}^{i·xh+xh−1} η1[j].

Example 5.3. Let γ (t 1 ) = (lh , {5, 4, 3}) (case lh ≡ l 1 ), let the loop bound xh = 2 and let t 2 be empty. Then the execution
time for one execution of the loop is always 5 + 4 = 9 (the sum of the xh first ranks of the multi-set) and we have
ω(t) = (⊤, {9} ⊗ ∞).

Example 5.4. Let γ(t1) = (l1, {5, 4, 3, 2}) (case lh ≢ l1), let the loop bound xh = 2 and let t2 be empty. Then the first
execution of the loop will yield execution time 5 + 4 = 9 (the sum of the first xh ranks of the multi-set), while the
second execution will yield execution time 3 + 2 = 5 (the sum of the subsequent xh ranks of the multi-set). Therefore,
ω(t) = (l1, {9, 5}).

Notice that we make pessimistic simplifications concerning the loop component in the computation of ω and γ.
Consider the computation for t = Alt(t1, . . . , tn), for instance. The WCETs of t1, . . . , tn may depend on different loops,
but keeping track of all these loops in the WCET of t would be very complex. So, as a simplification, we only keep
track of the greatest lower bound of these loops (the loop that most immediately contains t). This is also true in other
cases. However, this approximation is safe (see the proof in Appendix B for details) and has a low impact on WCET
over-approximation (see Section 7).
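The evaluation rules above can be prototyped on top of the MSet and CFT sketches, under two explicit simplifying assumptions: annotations are ignored (so γ = ω), and the loops involved are totally ordered by nesting, so that the meet ⊓ reduces to picking the deeper of two loops through a depth map (with depth[TOP] = 0 for the fictive loop ⊤):

    TOP = "TOP"  # the fictive outermost loop

    def omega(t, time, depth):
        def meet(l1, l2):  # greatest lower bound, simplified to a depth comparison
            return l1 if depth[l1] >= depth[l2] else l2
        if t is None:                            # empty tree: theta = (TOP, {0})
            return (TOP, MSet([0]))
        if isinstance(t, Leaf):
            return (TOP, MSet([time[t.block]]))  # {time(b)} repeated forever
        if isinstance(t, (Alt, Seq)):
            results = [omega(c, time, depth) for c in t.children]
            l, eta = results[0]
            for l2, eta2 in results[1:]:         # merge for Alt, rank-wise sum for Seq
                eta = eta.merge(eta2) if isinstance(t, Alt) else eta.add(eta2)
                l = meet(l, l2)
            return (l, eta)
        if isinstance(t, Loop):
            l1, e1 = omega(t.body, time, depth)
            l2, e2 = omega(t.exit, time, depth)
            xh = t.bound
            if l1 == t.header:                   # case lh = l1 (arises with annotations)
                return (l2, MSet([sum(e1[i] for i in range(xh))]).add(e2))
            groups = [sum(e1[i * xh + j] for j in range(xh))  # sums of xh ranks
                      for i in range(max(1, len(e1.e)))]
            return (meet(l1, l2), MSet(groups).add(e2))

    t = Seq([Leaf("a"), Loop("h", Seq([Leaf("c")]), 3, None)])
    l, eta = omega(t, {"a": 2, "c": 5}, {"TOP": 0, "h": 1})
    print(l, eta.e)  # TOP [17]: 2, plus three iterations costing 5 each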

5.3 From abstract to concrete WCET


We will now detail how to evaluate the WCET of a tree t inside a loop l. Suppose that t is executed n times and that its
abstract WCET is γ(t) = (l, η). The execution time of each individual execution of t depends on the number of times
t was executed since the last time l was entered. Let e be the number of times l was entered, and let us assume that
the n executions of t are distributed uniformly across the e executions of l (this is a realistic assumption because our
computation method ensures that iterating every loop to the maximum results in the longest execution time).

Definition 11. Let t be a control-flow tree and let γ(t) = (l, η). The concrete WCET of t in the scenario where t is
executed n times and the loop l is executed e times, where e and n are strictly positive and n is a multiple of e, is computed
as: Σ_{i=1}^{n} (η ⊗ e)[i]

This definition applies to any node of the tree. To compute the WCET of a complete program represented by tree t,
we apply the formula with n = e = 1, since we are only interested in one execution of the program. The WCET of the
program is thus computed as Σ_{i=1}^{1} (η ⊗ 1)[i] = η[1]. The following theorem establishes the soundness of our WCET
evaluation method.

Theorem 5.5. Let G be a CFG. Let D = DAG(G, ⊤) and let t = MakeCFT(D, Ds, De). Let (l, η) = γ(t). We have:
∀p ∈ gpaths(G, Ge), time(p) ≤ η[1]

Proof. See Appendix B for details. □
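Under the 1-based indexing used by Definition 11 (so that the whole-program case n = e = 1 yields the greatest element of η; the MSet sketch of Section 5.2 is 0-based per Definition 9, hence the index shift), the evaluation is a one-liner:

    def concrete_wcet(eta, n, e):
        # Definition 11: worst case for n executions of t while its loop l
        # is entered e times (n a multiple of e).
        spread = eta.times(e)
        return sum(spread[i - 1] for i in range(1, n + 1))

    assert concrete_wcet(MSet([9, 5]), 1, 1) == 9  # whole program: greatest element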

6 SYMBOLIC COMPUTATION
In this section we study the problem of computing the abstract WCET of a tree when some parameters of the tree are
unknown (loop bounds for instance, but not only). We show that, using simple syntactic sugaring, our definition of ω(t)
produces formulae akin to arithmetic expressions. Then we rely on existing work on symbolic computation of arithmetic
expressions to simplify abstract WCET formulae. The simplification step is mainly useful in case of on-line formula
evaluation. It reduces memory overhead (since formulae must be part of the embedded code) as well as execution time
overhead (since formulae must be evaluated at each task instantiation).

6.1 Abstract WCET formulae


First, we introduce several operators on abstract WCET, which act as syntactic sugar, to be able to express WCET
computation as arithmetic computation.

Definition 12. Let t 1 and t 2 be two control-flow trees. We define a set of operations on abstract WCET such that:

ω(t 1 ) ⊕ ω(t 2 ) = ω(Seq(t 1, t 2 ))


ω(t 1 ) ⊎ ω(t 2 ) = ω(Alt(t 1, t 2 ))
(ω(t 1 ), ω(t 2 ), h)x h = ω(Loop(h, t 1, xh , t 2 ))
ω(t 1 ) ↓(h,n) = γ (t 1 ) (where ann(t 1 ) = (t 1, lh , n))
n ⊙ (l, η) = (l, η ′ ) (where ∀i, η ′ [i] = η[i] × n)
k ∞ = {k} ⊗ ∞

Furthermore, we let θ ≡ (⊤, 0∞ ). We define the following grammar to represent the set of formulae W corresponding
to the computation of the abstract WCET of a control-flow tree (w ∈ W):

w ::= const | id | w ↓(h,it ) | w ⊕ w | w ⊎ w | (w, w, b)it


h ::= b | id
it ::= i | id
The simplest formula is a constant abstract WCET value (const ∈ (LG × N#)). A formula can also be a variable
corresponding to an unknown WCET value (id). A formula can also be the sum (w ⊕ w), the merge (w ⊎ w) or the
repetition of two formulae ((w, w, b)it). Finally, a formula can also consist of the application of an annotation to a
formula (w ↓(h,it)). The factor of a repetition and the factor of an annotation (it) can either be a constant integer value
(i) or a variable (id). The loop header of an annotation (h) can either be a basic block name (b) or a variable (id).
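The grammar maps naturally onto an abstract syntax tree. Below is a sketch with illustrative class names of our own (including Mul for the ⊙ operator used by the rewriting rules of Section 6.3), together with one left-associative reading of the example formula of Section 6.3:

    from dataclasses import dataclass

    @dataclass
    class Const: val: tuple            # constant abstract WCET (loop, multi-set)
    @dataclass
    class Var:   name: str             # symbolic value (id)
    @dataclass
    class Plus:  l: object; r: object  # sum of two formulae
    @dataclass
    class Merge: l: object; r: object  # merge of two formulae
    @dataclass
    class Mul:   k: int; w: object     # integer multiple of a formula
    @dataclass
    class LoopW: body: object; exit: object; h: object; it: object  # repetition
    @dataclass
    class AnnW:  w: object; h: object; it: object                   # annotation

    # (x + 2x) + 3x + y, the example of Section 6.3:
    f = Plus(Plus(Plus(Var("x"), Mul(2, Var("x"))), Mul(3, Var("x"))), Var("y"))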

6.2 Symbolic values


As we can see, several elements of these formulae can be symbolic values (denoted by id), i.e. variable parameters:
symbolic WCET value (w), symbolic loop iteration bound (it), symbolic loop header (h). Let us now illustrate how these

symbolic values can be used to model various WCET variation sources. A simple example is the case where the number
of iterations of a loop depends on an input of the system. The WCET of the loop is statically evaluated to (ω 1, ω 2, h)n ,
where n is a symbolic value. The value of n is computed dynamically and the WCET of the loop is deduced from this
value.
As a second example, we discuss how to perform a modular WCET analysis, in the case where the program contains
a call to a dynamic library. Assume for instance that the library call is in the else branch of an if-then-else and
that the then branch has a constant WCET of 5. The WCET is statically evaluated to (⊤, {5} ⊗ ∞) ⊎ ω, where ω is a
symbolic value. We perform a separate analysis on the different programs the dynamic library call can correspond to,
so we obtain a different WCET for each possibility. At program execution, we replace ω by the WCET corresponding to
the library that is actually called and deduce the program WCET.
As a last example we discuss how to take into account the results of an instruction cache analysis. Let us consider
the execution of a multi-task system with a non-preemptive scheduler. In such a system, though the hardware provides
no means to consult the exact cache state, it can be approximated by an abstract cache state using the techniques of [2].
In some cases, the category of a block, that is to say whether the execution of the block will result in a miss or in a hit,
depends on the content of the cache at the beginning of the execution of the task containing it. As a consequence, the
block category cannot be determined statically, however it can be determined dynamically based on the abstract cache
state at the beginning of the task execution. In Figure 3b, we have shown how to use context annotations to model a
persistent block. Similarly, to model a block with a non-static category, we split the block into a hit and a miss alternative,
and add annotations on both alternatives. So the WCET formula for this block will be: ωhit ↓(h,n1) ⊎ ωmiss ↓(h,n2),
where n 1 and n 2 are symbolic values. At the beginning of the task execution, we determine the values of n 1 and n 2
based on the abstract cache content and deduce the task WCET. A similar approach can be used to take into account
data-cache analysis and branch prediction.
More generally, we believe that symbolic WCET evaluation is a powerful generic tool with many potential applications.
The focus of this paper however, is to present the general framework. Potential applications will be the subject of future
work.
Concerning the limitations of our approach, we currently cannot specify constraints relating different symbolic
values, which may prevent some simplifications of the WCET formula. For instance, a single parameter in the program
external context (e.g. the data-cache size) may introduce several separate symbolic values in the WCET formula (e.g. the
WCET of each basic-block whose WCET is impacted by the data-cache size will become a symbolic value). Handling
such related symbolic values is clearly also an important topic for future work.
A second limitation is that some extra-CFG analyses information may be difficult to represent using context-
annotations, such as for instance the results of CCG analysis [15].

6.3 Formula simplification


When variables appear in a WCET formula, we cannot reduce the formula to a constant abstract WCET value. However,
in many cases the formula can be transformed into a simpler, yet equivalent formula. For instance, we have:
(x ⊕ 2 ⊙ x) ⊕ 3 ⊙ x ⊕ y = 6 ⊙ x ⊕ y.
Figure 4 lists all the rewriting rules we use in order to simplify WCET formulae. Most of them are direct transpositions
of integer arithmetic simplification rules [9] to the case of WCET formulae. We make the following comments:

Associativity.
(w1 ⊕ w2) ⊕ w3 ↦ w1 ⊕ w2 ⊕ w3   (1)
w1 ⊕ (w2 ⊕ w3) ↦ w1 ⊕ w2 ⊕ w3   (2)
(w1 ⊎ w2) ⊎ w3 ↦ w1 ⊎ w2 ⊎ w3   (3)
w1 ⊎ (w2 ⊎ w3) ↦ w1 ⊎ w2 ⊎ w3   (4)
Commutativity.
(w1 ⊕ w2) ↦ (w2 ⊕ w1) if w2 ◁ w1   (5)
(w1 ⊎ w2) ↦ (w2 ⊎ w1) if w2 ◁ w1   (6)
Distributivity.
(cst1 ⊕ w3) ⊎ (cst2 ⊕ w3) ↦ (cst1 ⊎ cst2) ⊕ w3   (7)
Neutral element.
w1 ⊕ θ ↦ w1   (8)
w1 ⊎ θ ↦ w1   (9)
Multiplication.
0 ⊙ w1 ↦ θ   (10)
(ki ⊙ w1) ⊕ w1 ↦ (ki + 1) ⊙ w1   (11)
Annotation.
θ ↓(h,it) ↦ θ   (12)
w1 ↓(h,it) ⊕ w2 ↓(h,it) ↦ (w1 ⊕ w2) ↓(h,it)   (13)
Loop.
(w1, w2, b)it ↦ (w1, θ, b)it ⊕ w2   (14)

Fig. 4. Abstract WCET formula rewriting rules

• We rely on an order relation ◁ on formulae, so as to ensure that the commutativity rules can only be applied in
one direction for two given formulae. Classically, the order relation is defined based on the syntactic structure of
the formulae (see e.g. [9] for details);
• Distributivity is applied in reverse order and only to factor constant terms;
• Concerning the annotation rewriting rule, the strategy consists in reducing the number of annotation applications;
• Concerning the loop rule, since we have no rule for combining loops, we only extract the loop exit tree from the
loop;
• Combination of constant formulae is not detailed here but is applied as well. For instance, (l, 2∞ ) ⊕ (l, 3∞ ) is
simplified to (l, 5∞ ).
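A rewriting engine only needs a bottom-up pass applied until a fixed point is reached; by the convergence result below (Lemma 6.2), the result does not depend on the application order. The following sketch covers only rules 8, 13 and 14 on the formula AST of Section 6.1, with an assumed encoding of θ:

    THETA = Const(("TOP", None))  # stand-in for theta; encoding assumed

    def is_theta(w):
        return isinstance(w, Const) and w.val == THETA.val

    def rewrite(w):
        # One pass; returns (formula, changed?).
        if isinstance(w, Plus):
            l, cl = rewrite(w.l)
            r, cr = rewrite(w.r)
            if is_theta(r):                                # rule 8
                return l, True
            if (isinstance(l, AnnW) and isinstance(r, AnnW)
                    and (l.h, l.it) == (r.h, r.it)):       # rule 13
                return AnnW(Plus(l.w, r.w), l.h, l.it), True
            return Plus(l, r), cl or cr
        if isinstance(w, LoopW) and not is_theta(w.exit):  # rule 14
            return Plus(LoopW(w.body, THETA, w.h, w.it), w.exit), True
        return w, False

    def simplify(w):
        changed = True
        while changed:  # iterate to the fixed point
            w, changed = rewrite(w)
        return w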
Let R denote the rewriting system consisting of all of these rewriting rules. Let w1, w2 be two WCET formulae. We
write w1 ↦_R w2, or simply w1 ↦ w2, when w1 rewrites to w2 using a single rule of R. We write w1 ↦* w2 when w1
rewrites to w2 using a sequence of rules of R. Let ρ denote a variable mapping, that is to say a set of substitutions of the
form id → v, where id is an identifier and v is a value. Let ρ(w) denote the result of the substitution of the variables of
w by their values in ρ. We assume that ρ maps identifiers to values of the correct type, meaning that it maps WCET
identifiers to WCET values, loop identifiers to loop headers and integer identifiers to integer values. We say that ρ is a
complete mapping with respect to formula w when it maps all variables of w to a value.

Lemma 6.1. Let w1, w2 ∈ W. Let ρ be a complete variable mapping of w1. We have:

w1 ↦* w2 ⇒ ρ(w1) = ρ(w2)

Proof. We must prove that, for each rewriting rule, the formula on the left of the rule is equivalent to the formula
on the right. Most rules are trivial to prove and rely on arithmetic properties on integer multi-sets. We only detail the
proof for rules on annotations and loops.
Rule 13. Let (l 1, η 1 ) = w 1 and (l 2, η 2 ) = w 2 .

(w 1 ⊕ w 2 ) ↓(h,it ) = (l 1 ⊓ l 2, (η 1 ⊕ η 2 )|it ) = (l 1 ⊓ l 2, η 1 |it ⊕ η 2 |it )


= (l 1, η 1 |it ) ⊕ (l 2, η 2 |it ) = w 1 ↓(h,it ) ⊕ w 2 ↓(h,it )

Rule 14. Let (l 1, η 1 ) = w 1 and (l 2, η 2 ) = w 2 .


By definition of the ω function on Loop nodes, we see that the computation result for (w 1, w 2, b)it is of the form
(⟨loop⟩, ⟨expression⟩ ⊕ η 2 ). Therefore, let us define (l, η) such that (w 1, w 2, b)it = (l, η ⊕ η 2 ).
If l 1 = b then:

(w 1, w 2, b)it = (l 2, η ⊕ η 2 ) = (⊤, η ⊕ 0∞ ) ⊕ (l 2, η 2 ) = (w 1, θ, b)it ⊕ w 2

If l1 ≠ b then:

(w 1, w 2, b)it = (l 1 ⊓ l 2, η ⊕ η 2 ) = (l 1 ⊓ l 2, η ⊕ 0∞ ) ⊕ (l 2, η 2 )
= (l 1, η ⊕ 0∞ ) ⊕ (l 2, η 2 ) = (w 1, θ, b)it ⊕ w 2 .

This concludes the proof. □

The following lemma states that recursive applications of R to a given formula w eventually reach a fixed point and
always produce the same formula w′.

Lemma 6.2. R is convergent.

Proof. R is convergent if it terminates and it is confluent. The reader can refer to [1] for more detailed definitions
and proof strategies that we use here.
Termination. We note that for each rule l ↦ r of R, we have one of the following properties:
• Let op(w) denote the sum of the number of operators ⊕, ⊎, ⊙, ↓ in w. Then, we have op(r) < op(l) (for the following
rules: distributivity, neutral element, multiplication with an integer, annotation);
• The number of parentheses is smaller in r than in l (for associativity rules);
• r ◁ l (for commutativity rules);
• Let us extend op by defining op((w1, w2, h)k) = (k + 1) · (op(w1) + op(w2)). Then op(r) < op(l) (for the loop rule).
Based on these properties, we can define a strict order relation ≺ on formulae such that, for each rule l ↦ r, we have
r ≺ l. As a consequence, R terminates.
Confluence. As R terminates, we only need to prove that its overlapping rules are locally confluent. Two rules
l 1 7→ r 1 and l 2 7→ r 2 overlap if there exists a sub-term s 1 of l 1 (resp. s 2 of l 2 ) that is not a variable, and a unifier (a term
substitution) u such that u(s 1 ) = u(l 2 ) (resp. u(s 2 ) = u(l 1 )). Unification is applied after renaming variables such that
V ars(l 1 ) ∩ V ars(l 2 ) = ∅. For instance, rules 1 and 2 overlap: we have two different possible sequences of re-writings for
formula (w 1 ⊕ (w 2 ⊕ w 3 )) ⊕ w 4 :

(w1 ⊕ (w2 ⊕ w3)) ⊕ w4 ↦ (w1 ⊕ w2 ⊕ w3) ⊕ w4   (rule 2)
                       ↦ w1 ⊕ w2 ⊕ w3 ⊕ w4     (rule 1)

(w1 ⊕ (w2 ⊕ w3)) ⊕ w4 ↦ w1 ⊕ (w2 ⊕ w3) ⊕ w4   (rule 1)
                       ↦ w1 ⊕ w2 ⊕ w3 ⊕ w4     (rule 2)

As both sequences produce the same formula, these overlapping rules are locally confluent.

We do not detail the proof for the remaining overlapping rules, since it is very similar to the case we just presented.
We only list them below:

(w 1 ⊎ (w 2 ⊎ w 3 )) ⊎ w 4 (3 and 4)
(w 1 ⊕ w 2 ) ⊕ w 3 if w 2 ◁ w 1 (1 and 5)
w 1 ⊕ (w 2 ⊕ w 3 ) if w 3 ◁ w 2 (2 and 5)
(w 1 ⊎ w 2 ) ⊎ w 3 if w 2 ◁ w 1 (3 and 6)
w 1 ⊎ (w 2 ⊎ w 3 ) if w 3 ◁ w 2 (4 and 6)
(cst1 ⊕ (w3 ⊕ w4)) ⊎ (cst2 ⊕ (w3 ⊕ w4))   (2 and 7)
(cst 1 ⊕ w 3 ) ⊎ (cst 2 ⊕ w 3 ) if w 3 ◁ cst 1 ∨ w3 ◁ cst 2 (5 and 7)
(w 1 ⊕ θ ) ⊕ w 2 (1 and 8)
w 1 ⊕ (θ ⊕ w 2 ) (2 and 8)
(w 1 ⊎ θ ) ⊎ w 2 (3 and 9)
w 1 ⊎ (θ ⊎ w 2 ) (4 and 9)
w 1 ↓(h,it ) ⊕ w 2 ↓(h,it ) if w 2 ◁ w 1 (13 and 5)

This concludes the proof. □

To summarize, we enumerate below the steps of the computation of the WCET of a program with our approach. Steps
1 to 4 correspond to the computation of the parametric WCET formula. Steps 5 and 6 correspond to the computation of
the actual WCET for some specific parameter values:
(1) Translate the program CFG to a CFT t;
(2) Add extra-CFG analyses results as context annotations;
(3) Compute w = γ (t);
(4) Simplify w into w ′ using rewriting rules;
(5) Replace parameters by their values and obtain w ′′ , with w ′′ = (l, η);
(6) Return η[1].
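Put together, the six steps amount to a short off-line/on-line driver. In the sketch below, make_cft, gamma_formula, substitute and evaluate are hypothetical stand-ins for the machinery of Sections 4 to 6 (partly covered by the earlier sketches), and simplify is the fixed-point rewriter sketched in Section 6.3:

    def parametric_wcet(cfg, annotations, bindings):
        t = make_cft(cfg, annotations)  # steps 1-2: CFG -> annotated CFT
        w = gamma_formula(t)            # step 3: abstract WCET formula gamma(t)
        w = simplify(w)                 # step 4: rewriting rules of Figure 4
        w = substitute(w, bindings)     # step 5 (on-line): bind parameter values
        loop, eta = evaluate(w)         # reduce the formula to a constant (l, eta)
        return eta[0]                   # step 6: the greatest element of eta
                                        # (eta[1] in the paper's 1-based notation)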

7 EXPERIMENTS
The benchmarks we selected for our experiments are summarized in Table 1. For each benchmark, we mention its source
(ML for Mälardalen, TB for TACleBench, or PB for PapaBench), provide a short description of the kind of algorithm it
performs and specify the function whose WCET is analyzed. We only introduce one parameter per benchmark because
precision is independent of the number of parameters in our approach. The analyses have been executed on a PC with
an Intel Core i5-3470 at 3.2 GHz, with 8 GB of RAM. Every benchmark has been compiled with ARM crosstool-NG 1.20.0
(gcc version 4.9.1) with -O1 optimization level.
The results of our experiments are shown in Table 2. First, we detail the size of the WCET formulae computed by our
approach. Column CFG shows the number of basic blocks in the CFG. Column Initial shows the size (the number of

Bench Source Parameter Algorithm Function


matmult ML Matrix size Matrix multiplication Initialize (twice)
cnt ML Matrix size Matrix sum Sum
fft TB Number of samples FFT main
compress ML Data size Data compression main
lift TB Number of sensors Factory lift control main
adpcm ML Trigo. computation steps ADPCM encoding main
aes_enc TB Data size AES encryption main
powerwindow TB Sensor data input size Car window control main
fbw PB Task activation count fly-by-wire main
audiobeam TB Audio source count Audio beamforming main
mpeg2 TB Video resolution MPEG2 decoding main
Table 1. Benchmarks summary

Bench | CFG | Formula size: Initial, Final | Time (ms): Common, Us, ILP | Pessimism (%): Us, Min, Max, MPA
matmult 111 130 5 1105 1 0 0.01 0.00 3.88 0.31
cnt 153 284 3 2278 2 8 0.15 0.00 3.59 30.4
fft 391 453 8 2968 4 16 0.00 0.00 1.51 -
compress 694 906 3 4760 11 40 0.02 0.01 0.03 -
lift 814 1799 5 5130 19 40 1.51 0.05 2.29 -
adpcm 2032 2211 3 10688 67 272 0.01 0.01 0.33 -
aes_enc 2205 2651 2 4914 30 260 0.04 0.03 0.04 -
powerwindow 3738 4453 24 45702 224 4192 0.01 0.01 1.43 -
fbw 10612 27251 2 36940 1198 8960 2.62 0.03 7.05 -
audiobeam 12299 47248 37 56566 1222 12824 0.12 0.00 0.49 -
mpeg2 38612 1658109 3 267332 12221 > 1 week - - - -
Table 2. Benchmarking results

operands) of the WCET formula before simplification, while Column Final shows the formula size after simplification.
In most cases, the size of the non-simplified formula, which also corresponds to the size of the CFT, is close to the size
of the CFG. Differences are due to the presence of structure-breaking instructions (such as goto, break, continue, or return
in the middle of a function), which force basic block duplication in the CFG-to-CFT conversion algorithm. This is especially
true for the mpeg2, and to a lesser extent for lift, audiobeam, and fbw benchmarks. For all benchmarks, the size of the
simplified formula is very small and is related to the number of loops whose iteration count depends on the parameter.
Then, we compare our approach with an IPET approach. Comparison is performed according to two criteria: WCET
analysis time, and pessimism of the resulting WCET. The target hardware is an ARM processor with a set-associative
LRU instruction cache (the data cache is not taken into account). The processor pipeline is analyzed with the exegraph
method [19] and the instruction cache is modeled using cache categorization [12]. The target instruction cache used
in the analysis has 64 Kbytes, 16 ways, and blocks of 16 bytes. We chose a small cache to highlight the impact of the
cache on the execution time for such small benchmarks. The instruction cache miss latency was assumed to be 10
cycles. Each benchmark is analyzed as a standalone task, without any modeling of the operating system. To perform
the preliminary steps of the WCET analysis (program path analysis, CFG building, loop bounds estimation, pipeline
and cache modeling), we rely on OTAWA (version 1.0), an open source WCET computation tool [3]. These steps are
common to the IPET approach and to our approach. For the remaining steps, in the case of the IPET approach, we use
GNU lp_solve ILP solver [5]. Our approach was coded in Python and executed with PyPy 2.4.0. We report the mean time over 1000 executions of our algorithm, to compensate for PyPy's slow start-up. To compare the WCET estimates, we
instantiate our WCET formula by assigning to the parameter the constant value used in the IPET experiment.
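For reference, a minimal timing harness in the spirit of this protocol could look as follows; evaluate_formula is a placeholder workload standing in for formula instantiation, not the actual entry point of our tool.

    # Sketch of the timing protocol: mean over 1000 runs to smooth out PyPy's
    # JIT warm-up. `evaluate_formula` is a placeholder, not our tool's API.
    import timeit

    def evaluate_formula(n):
        return sum(range(n))  # stand-in for instantiating a WCET formula

    total_s = timeit.timeit(lambda: evaluate_formula(1000), number=1000)
    print(f"mean evaluation time: {total_s / 1000 * 1000:.3f} ms")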
The Common column represents the time spent by OTAWA for the preliminary steps (common to IPET and our
approach), while the Us (our approach) and ILP columns correspond to the time spent for the remaining steps. The WCET
evaluation time is essentially linear in the size of the CFT in our approach and noticeably lower than the evaluation
time for the IPET approach. Notice that lp_solve did not find a solution for mpeg2 after one week of execution time.
Furthermore, let us emphasize that computing the WCET for different parameter values with the IPET approach requires
to run the whole analysis (Common+ILP) for each parameter value, while we only need to do the analysis (Common+Us)
once and then instantiate the formula for each parameter value.
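To make this asymmetry concrete, the following back-of-the-envelope computation uses the fbw row of Table 2 (times in ms); the number of parameter values k is arbitrary, and the per-value instantiation cost is treated as negligible, as suggested by the small formula sizes.

    # Amortization sketch using the fbw timings of Table 2 (in ms).
    common, us, ilp = 36940, 1198, 8960
    k = 50                                 # arbitrary number of parameter values
    ipet_total = k * (common + ilp)        # IPET: full analysis re-run per value
    ours_total = common + us               # ours: one analysis, then k instantiations
    print(ipet_total, ours_total)          # 2295000 vs 38138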
WCET pessimism is measured in comparison with the IPET result. The Us column represents the value of the
pessimism with our approach for a fixed value of the parameter (the same value as the one used for the IPET approach).
The Min and Max columns represent respectively the minimum pessimism and maximum pessimism (in percentage)
for values of the parameter varying between 1 and 1000. We observed that, in general, the percentage of pessimism decreases as the parameter grows, approximately with a hyperbolic shape, which suggests a roughly constant absolute over-estimation spread over a WCET that grows with the parameter. The pessimism of our approach is
much lower than that of the MPA approach (results extracted from [7] are reported in column MPA). It is also extremely
low compared to the IPET approach. Pessimism in our approach can be attributed to the following causes: (1) the
reduced expressiveness of our CFT annotations (as opposed to ILP constraints) and (2) paths existing in the CFT but not
in the CFG. Experiments show that the amount of pessimism does not depend on the size of the CFG.

8 CONCLUSION
In this paper we have presented a novel technique for parametric WCET analysis, which follows a completely new
approach based on symbolic computation of WCET formulas. Experiments show very promising results: execution
time is lower than the traditional non-parametric IPET technique and over-approximation of the WCET (compared to
IPET) is extremely low.
Symbolic WCET computation has many advantages: it greatly reduces the time needed to explore the parameter space of a system; it is modular, and easily permits analysing different modules of a system separately; and it allows the WCET to be computed efficiently on-line, thus paving the way for the use of our methodology in adaptive systems.
One of the main limitations of our method is that it is not possible to specify constraints relating different parameters, which may prevent some simplifications in the formulae. Furthermore, some constraints used in IPET (e.g., some types of infeasible paths) cannot be easily represented with context annotations. We plan to extend context annotations in future work to solve these issues.

REFERENCES
[1] Franz Baader and Tobias Nipkow. Term rewriting and all that. Cambridge University Press, New York, NY, USA, 1998.
[2] Clément Ballabriga, Hugues Cassé, and Marianne De Michiel. A Generic Framework for Blackbox Components in WCET Computation. In 9th
International Workshop on Worst-Case Execution Time Analysis (WCET’09), volume 10, pages 1–12, Dagstuhl, Germany, 2009. Schloss Dagstuhl–
Leibniz-Zentrum fuer Informatik.
[3] Clément Ballabriga, Hugues Cassé, Christine Rochange, and Pascal Sainrat. OTAWA: An open toolbox for adaptive WCET analysis. In Software
Technologies for Embedded and Ubiquitous Systems, volume 6399 of Lecture Notes in Computer Science, pages 35–46. Springer Berlin Heidelberg,
Waidhofen/Ybbs, Austria, 2010.
[4] Bilel Benhamamouch, Bruno Monsuez, and Franck Védrine. Computing WCET using symbolic execution. In Proceedings of the Second International
Conference on Verification and Evaluation of Computer and Communication Systems, VECoS’08, pages 128–139, Swinton, UK, 2008. British Computer
Society.
[5] Michel Berkelaar, Kjell Eikland, and Peter Notebaert. lp_solve 5.5, open source (mixed-integer) linear programming system, May 1 2004.
[6] Armin Biere, Jens Knoop, Laura Kovács, and Jakob Zwirchmayr. The Auspicious Couple: Symbolic Execution and WCET Analysis. In 13th International
Workshop on Worst-Case Execution Time Analysis, volume 30, pages 53–63, Dagstuhl, Germany, 2013. Schloss Dagstuhl–Leibniz-Zentrum fuer
Informatik.
[7] S. Bygde, A. Ermedahl, and B. Lisper. An efficient algorithm for parametric WCET calculation. In 15th IEEE International Conference on Embedded and
Real-Time Computing Systems and Applications, RTCSA’09., pages 13–21, Beijing, China, Aug 2009. IEEE.
[8] Duc-Hiep Chu and Joxan Jaffar. Symbolic simulation on complicated loops for WCET path analysis. In Proceedings of the Ninth ACM International
Conference on Embedded Software, EMSOFT ’11, pages 319–328, New York, NY, USA, 2011. ACM.
[9] J.S. Cohen. Computer Algebra and Symbolic Computation: Mathematical Methods. Number vol. 1 in Ak Peters Series. Peters, Natick, MA, USA, 2002.
[10] Antoine Colin and Guillem Bernat. Scope-tree: A program representation for symbolic worst-case execution time analysis. In 14th Euromicro
Conference on Real-Time Systems (ECRTS), pages 36:1–36:53, Washington, DC, USA, 2002. IEEE.
[11] Paul Feautrier. Parametric integer programming. RAIRO Recherche Opérationnelle, 22:243–268, 1988.
[12] Christian Ferdinand, Florian Martin, Reinhard Wilhelm, and Martin Alt. Cache behavior prediction by abstract interpretation. Sci. Comput. Program.,
35(2):163–189, 1999.
[13] Matthew S Hecht and Jeffrey D Ullman. Flow graph reducibility. In Proceedings of the fourth annual ACM symposium on Theory of computing, pages
238–250, Denver, CO, USA, 1972. ACM.
[14] Johan Janssen and Henk Corporaal. Making graphs reducible with controlled node splitting. ACM Trans. Program. Lang. Syst., 19(6):1031–1052,
November 1997.
[15] Y.-T. S. Li, S. Malik, and A. Wolfe. Cache modeling for real-time software: Beyond direct mapped instruction caches. In Proceedings of the 17th IEEE
Real-Time Systems Symposium, pages 254–263, Washington, DC, USA, 1996. IEEE.
[16] Y.-T. S. Li, Sharad Malik, and Andrew Wolfe. Efficient microarchitecture modeling and path analysis for real-time software. In Proceedings of the 16th
IEEE Real-Time Systems Symposium, pages 298–307, Pisa, Italy, 1995. IEEE.
[17] Sung-Soo Lim, Young Hyun Bae, Gyu Tae Jang, Byung-Do Rhee, Sang Lyul Min, Chang Yun Park, Heonshik Shin, Kunsoo Park, Soo-Mook Moon,
and Chong Sang Kim. An accurate worst case timing analysis for RISC processors. IEEE Transactions on Software Engineering, 21(7):593–604, 1995.
[18] S. Mohan, F. Mueller, W. Hawkins, M. Root, C. Healy, and D. Whalley. Parascale: exploiting parametric timing analysis for real-time schedulers and
dynamic voltage scaling. In Proceedings of the 26th IEEE International Real-Time Systems Symposium, pages 232–242, San Antonio, TX, USA, Dec
2005. IEEE.
[19] Christine Rochange and Pascal Sainrat. A context-parameterized model for static analysis of execution times. In Per Stenström, editor, Transactions
on High-Performance Embedded Architectures and Compilers II, volume 5470 of Lecture Notes in Computer Science, pages 222–241. Springer-Verlag,
Berlin, Heidelberg, 2009.
[20] Tao Wei, Jian Mao, Wei Zou, and Yu Chen. A new algorithm for identifying loops in decompilation. In Proceedings of the 14th International Conference
on Static Analysis, SAS’07, pages 170–183, Berlin, Heidelberg, 2007. Springer-Verlag.
[21] Stephan Wilhelm and Björn Wachter. Symbolic state traversal for WCET analysis. In Proceedings of the Seventh ACM International Conference on
Embedded Software, EMSOFT ’09, pages 137–146, New York, NY, USA, 2009. ACM.
[22] Khaled Yakdan, Sebastian Eschweiler, Elmar Gerhards-Padilla, and Matthew Smith. No more gotos: Decompilation using pattern-independent
control-flow structuring and semantics-preserving transformations. In Network and Distributed System Security (NDSS), ISOC, San Diego, CA, USA,
2015. Internet Society.

A CFG TO CFT
In this appendix, we prove the correctness of our translation from a CFG to a CFT. Namely, we prove that any valid
path in the CFG is also a valid path in the CFT.

A.1 Execution paths in a hierarchical DAG


We have already defined the set of feasible execution paths for a CFG (gpaths(G, e)) and for a CFT (tpaths(t)). We will now define the function dpaths(D, e), which returns the set of feasible paths of a hierarchical DAG D. Since a DAG
is a particular case of graph, gpaths() can also be applied to a DAG; the important difference between the two functions is that dpaths() recursively explores the sub-paths of the hierarchical nodes appearing in the DAG.

Definition 13. Let D be a DAG. The set of execution paths of D is defined as:

    dpaths(D, e) = ⋃_{p ∈ gpaths(D, e)} spaths(p)

where

    spaths(p.n) = {q = q1@qn | q1 ∈ spaths(p) ∧ qn ∈ vpaths(n)}   (if n is hierarchical)
    spaths(p.n) = {q = q1.n | q1 ∈ spaths(p)}                     (otherwise)
    spaths(ϵ) = {ϵ}

and

    vpaths(Lh) = {p = p1@ . . . @pxh@pe | ∀i, 1 ≤ i ≤ xh, pi.hnext ∈ dpaths(Dh, hnext)
                                          ∧ pe.hexit ∈ dpaths(Dh, hexit)}

where Dh, hnext and hexit are respectively the DAG, the next node and the exit node corresponding to the hierarchical node Lh.
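To make the recursion concrete, here is a minimal executable reading of spaths() from Definition 13, for a toy encoding where a DAG path is a list of nodes and a hierarchical node carries its own set of inner paths; the encoding and names are assumptions made for illustration only, not our implementation.

    # Executable sketch of spaths() (toy encoding: a path is a list of nodes;
    # a hierarchical node is a pair ('H', vpaths) carrying its inner paths).
    def spaths(path):
        if not path:                    # spaths(eps) = {eps}
            return [[]]
        *prefix, n = path
        heads = spaths(prefix)
        if isinstance(n, tuple) and n[0] == 'H':      # hierarchical node
            return [q1 + qn for q1 in heads for qn in n[1]]
        return [q1 + [n] for q1 in heads]             # basic block

    # Example: b is hierarchical with two possible inner paths.
    b = ('H', [['b1', 'b2'], ['b1', 'b1', 'b2']])
    print(spaths(['a', b, 'c']))
    # [['a', 'b1', 'b2', 'c'], ['a', 'b1', 'b1', 'b2', 'c']]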

A.2 Transformation correctness

[Figure: an example DAG with nodes a to i; the forced passage nodes b1, b2, b3 are shown in gray, and the DAG is decomposed into sub-DAGs D1,1, D1,2, D2,1 and D2,2.]

Fig. 5. Decomposing the DAG

We will proceed in two steps: first we will establish a correspondence between DAG execution paths and tree
execution paths, then between CFG execution paths and DAG execution paths.
We will now present a graph decomposition technique on which our proof relies. Let N = {b1, . . . , bn} denote the set of forced passage nodes of D towards De. Then, D can be decomposed into a set of DAGs Di,j, with j = 1, . . . , ki, where ki is the number of predecessors of bi+1. DAG Di,j contains all nodes between bi (excluded) and the j-th predecessor of bi+1 (included), and all related edges. If bi is a hierarchical node, we denote by Dfi the DAG representing the corresponding loop (if bi is a basic block, Dfi is not defined).
Figure 5 shows such a decomposition. In this example, the forced passage nodes are shown in gray, and their
predecessors are represented by a striped pattern. The DAG is decomposed into sub-DAGs D1,1, D1,2, D2,1 and D2,2 (plus a single-node DAG for each forced passage node).
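As a companion to Figure 5, the following sketch checks the forced passage property on a small DAG by brute-force path enumeration; the edges below are invented to match the shape of the figure, and a real tool would use dominator analysis instead of enumeration.

    # Brute-force check of forced passage nodes on a toy DAG shaped like Fig. 5.
    # A forced passage node is taken here to be a node appearing on every path
    # from the source to the sink. Illustration only; edges are invented.
    def all_paths(succ, s, e):
        if s == e:
            return [[e]]
        return [[s] + p for n in succ.get(s, []) for p in all_paths(succ, n, e)]

    def forced_passage(succ, s, e):
        return set.intersection(*(set(p) for p in all_paths(succ, s, e)))

    succ = {'a': ['b1'], 'b1': ['c', 'e'], 'c': ['d'], 'e': ['f'],
            'd': ['b2'], 'f': ['b2'], 'b2': ['g', 'h'],
            'g': ['b3'], 'h': ['b3'], 'b3': ['i']}
    print(sorted(forced_passage(succ, 'a', 'i')))  # ['a', 'b1', 'b2', 'b3', 'i']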
Lemma A.1. Let D be a DAG. Let t = MakeCFT(D, Ds , De ). We have:

dpaths(D, De ) ⊆ tpaths(t)

Proof. The proof is done by induction on the graph decomposition presented above. The base of the induction
corresponds to the case where D consists only of a chain of forced passage basic blocks. Due to the definition of basic
blocks though, this chain would always consist of a single basic block. Thus proving the induction base is trivial.
Let us now prove the induction step. Let ti,j = MakeCFT(Di,j, Di,js, Di,je), for any appropriate values of i and j. Let tfbi = MakeCFT(Dfi, Dfis, Dfin), and let tfei = MakeCFT(Dfi, Dfis, Dfie).
Assuming Inclusions (15), (16) and (17) below, we prove Inclusion (18):

    ∀i, j, dpaths(Di,j, Di,je) ⊆ tpaths(ti,j)    (15)
    ∀i, dpaths(Dfi, Dfin) ⊆ tpaths(tfbi)         (16)
    ∀i, dpaths(Dfi, Dfie) ⊆ tpaths(tfei)         (17)
    dpaths(D, De) ⊆ tpaths(t)                    (18)

To simplify the notation, we will assume that each time a variable named i is introduced in some equation in the proof, it is constrained to 1, . . . , n. Similarly, when j is introduced, it is constrained to 1, . . . , ki.

For any path p in dpaths(D, De), p can be expressed as p = pf1@p1@pf2@ . . . @pn−1@pfn, where the pfi terms are the path segments corresponding to the execution of the forced passage nodes, and the pi terms are the path segments corresponding to the execution between these forced passage nodes.
For all i, if bi is a basic block, then let tfi = Leaf(bi). Otherwise, let tfi = Loop(tfbi, tfei).
Let us show that ∀i, pfi ∈ tpaths(tfi). If bi is a basic block, then we have pfi = {bi} ∈ tpaths(tfi). If bi is a hierarchical node, then we have pfi ∈ vpaths(bi). Thanks to the induction hypothesis, dpaths(Dfi, Dfin) ⊆ tpaths(tfbi) and dpaths(Dfi, Dfie) ⊆ tpaths(tfei). By the definition of vpaths(bi), pfi ∈ tpaths(tfi).
We have ∀i, ∃j, pi ∈ dpaths(Di,j, bi+1). Thus, thanks to the induction hypothesis, ∀i, ∃j, pi ∈ tpaths(ti,j). By the definition of the function tpaths() on Alt nodes, we have ∀i, pi ∈ tpaths(Alt(ti,1, . . . , ti,ki)).
As a consequence, we have p ∈ tpaths(Seq(tf1, Alt(t1,1, . . . , t1,k1), . . . , Alt(tn−1,1, . . . , tn−1,kn−1), tfn)).

Now, we must prove that this corresponds to the structure of the tree built by our algorithm. By examining the algorithm, we see that t is a Seq node, whose children list alternates between Leaf nodes representing the forced passage nodes, and Alt nodes (line 16) corresponding to the possible paths between forced passage nodes.
The tree representing the forced passage node bi is either Leaf(bi), if bi is a basic block (line 18), or Loop(tfbi, tfei) otherwise (lines 21-24). The definition of this tree is thus that of tfi.
Furthermore, each child tree of one of the Alt nodes represents the paths between a forced passage node and a predecessor of the next forced passage node (the test at line 4 prevents the double counting of the forced passage nodes).
Therefore, we have t = Seq(tf1, Alt(t1,1, . . . , t1,k1), . . . , Alt(tn−1,1, . . . , tn−1,kn−1), tfn). As a consequence, p ∈ tpaths(t), and finally dpaths(D, De) ⊆ tpaths(t). □

Now we can proceed to the final correctness theorem.


Theorem A.2. Let G be a CFG and let G e denote the exit node of G. Let D = DAG(G, ⊤) and let t = MakeCFT(D, Ds , De ).
We have:
gpaths(G, G e ) ⊆ tpaths(t)

Proof. Let D = DAG(G, ⊤). By Lemma A.1, dpaths(D, De) ⊆ tpaths(t), so all we need to prove is that gpaths(G, Ge) ⊆ dpaths(D, De). The problem of reducing the CFG into a hierarchy of DAGs is a classical problem in compiler theory. Our method is similar to the one described in [13], so we take its correctness for granted. □

B WCET CORRECTNESS
In this appendix, we show that the WCET obtained with our approach is greater than the execution time of any feasible
path in the CFT. Since we also proved that any path of the CFG is also a path of the CFT obtained by our translation,
these two properties ensure that the WCET computed by our approach is greater than the execution time of any feasible
path in the CFG, which establishes the correctness of our approach.
Let eval(η, e, n) ≡ Σ_{i=1}^{n} (η ⊗ e)[i]. We want to prove that the WCET estimation for the program, provided by the function eval, is an upper bound on the execution time of any path in the tree t. The proof strategy is the following:
• We first define a property of the abstract WCET on a control-flow tree. The property is verified only if the
abstract WCET is a valid representation of the tree’s many possible execution times;
• We then show that our function γ provides an abstract WCET which verifies the property mentioned above;
• Finally, we show that this property implies that the WCET estimation for the program is an actual upper bound.
We start by introducing a helper function prep (for path repetition). It is a generalization of tpaths() that computes all the paths in n repetitions of t, considering that an external loop l of t has been entered e times:

Definition 14. Let prep(t, e, n) be defined as follows:

    prep(t, e, n) = {p | ∃p1, . . . , pn ∈ tpaths(t), p = p1@ . . . @pn,
                     ∀(t′, l, m) ∈ ann∗(t), l ∉ t ⟹ occ(tpaths(t′), p) ≤ e · m}

If t is the whole program, then prep(t, 1, 1) = tpaths(t) (in that case, there is no loop containing t, so all annotations
in ann∗ (t) refer to loops inside t).
We are now ready to state our predicate.

Definition 15. V(t, η) is a predicate representing the fact that η is a valid abstract WCET for the control-flow tree t:

    V(t, η) ≡ ∀e, n ∈ N, ∀p ∈ prep(t, e, n), time(p) ≤ eval(η, e, n)

This property is actually a generalization of the property we want to prove, i.e. that eval(η, 1, 1) = η[1] is a correct
upper bound for any possible execution of a tree t.
Then, the following theorem states that the function γ computes an abstract WCET that satisfies the property V .

Theorem B.1. ∀t ∈ T, (l, η) = γ(t) ⟹ V(t, η).

First, we state a property on γ that will be useful later during the proof.

Lemma B.2. Let γ(t) = (l, η). Then:

    ∀(t′, l′, m) ∈ ann∗(t), l′ ∉ t ⟹ l ⊑ l′.


Proof. By definition of γ and ω, l is always computed as the intersection between external loops. So, it can never
happen that l refers to a loop that is more external than a loop contained within an annotation in t. □

We prove the theorem by induction on the structure of the control-flow tree. We start by proving that, if the property
is valid for the result of ω, then it is also valid for the result of γ .

Lemma B.3. Let t be a control-flow tree, and let ann(t) = (t, l1, k) be its annotation. Let t′ be the same tree in which the annotation on t has been replaced by the empty annotation (t′, ⊤, ∞). Let ω(t′) = (l′, η′) and let γ(t) = (l, η). Then:

    V(t′, η′) ⟹ V(t, η)

Proof. Clearly, γ(t′) = ω(t′) = ω(t), because function ω does not consider the annotation on the root of t.
For all e, n ∈ N, let M = max(e · k, n).
(1) by definition, prep(t, e, n) = prep(t′, e, M);
(2) by definition, eval(η, e, n) = eval(η′, e, M).
From item 1, it follows that ∀p ∈ prep(t, e, n), we also have p ∈ prep(t′, e, M). From V(t′, η′), it follows that time(p) ≤ eval(η′, e, M). From item 2, eval(η′, e, M) = eval(η, e, n), which proves the lemma. □

To prove Theorem B.1, we consider each case of the inductive definition of the CFT separately (Seq, Alt, Loop).

Lemma B.4. Let t = Seq(t1, t2), and let (l, η) = γ(t), (l1, η1) = γ(t1), and (l2, η2) = γ(t2). Then,

    ∀e, n ∈ N, V(t1, η1) ∧ V(t2, η2) ⟹ V(t, η)

Proof. Let t′ be the same tree as t but without the annotation on the root node, and let (l′, η′) = ω(t′). By definition of function eval, we have:

    eval(η′, e, n) = Σ_{i=0}^{n−1} (η′ ⊗ e)[i] = Σ_{i=0}^{n−1} ((η1 ⊕ η2) ⊗ e)[i]
                   = Σ_{i=0}^{n−1} (η1 ⊗ e)[i] + Σ_{i=0}^{n−1} (η2 ⊗ e)[i]
                   = eval(η1, e, n) + eval(η2, e, n).

By definition of predicate V:

    ∀p1 ∈ prep(t1, e, n), time(p1) ≤ eval(η1, e, n)
    ∀p2 ∈ prep(t2, e, n), time(p2) ≤ eval(η2, e, n)

Any path p ∈ prep(t, e, n) is a permutation of some p1@p2, hence

    time(p) ≤ eval(η1, e, n) + eval(η2, e, n) = eval(η′, e, n)

and this proves that V(t′, η′) holds. From Lemma B.3, it follows that V(t, η) also holds. □

Lemma B.5. Let t = Alt(t1, t2), (l, η) = γ(t), (l1, η1) = γ(t1), and (l2, η2) = γ(t2). Then,

    ∀e, n ∈ N, V(t1, η1) ∧ V(t2, η2) ⟹ V(t, η)

Proof. Let t′ be the same tree as t but without the annotation on the root node, and let (l′, η′) = ω(t′). By definition of the functions ω and γ, η′ = η1 ⊎ η2. It follows that

    η′ ⊗ e = (η1 ⊎ η2) ⊗ e = (η1 ⊗ e) ⊎ (η2 ⊗ e).

Among the n greatest elements of η′ ⊗ e, x elements come from η1 ⊗ e and y elements come from η2 ⊗ e (several valid choices of x and y may exist if η1 and η2 share time values). By definition, we have:

    eval(η′, e, n) = Σ_{i=0}^{n−1} (η′ ⊗ e)[i] = Σ_{i=0}^{n−1} ((η1 ⊗ e) ⊎ (η2 ⊗ e))[i]
                   ≥ Σ_{i=0}^{x−1} (η1 ⊗ e)[i] + Σ_{i=0}^{y−1} (η2 ⊗ e)[i]
The last inequality holds for any choice of x and y such that x + y = n, because we pick the x greatest elements from η1 ⊗ e and the y greatest elements from η2 ⊗ e: the sum of the n greatest elements of η′ ⊗ e is never smaller than the sum of the x greatest elements of η1 ⊗ e plus the y greatest elements of η2 ⊗ e.
Now, let pmax be the worst-case path of prep(t′, e, n). By the definition of prep on Alt nodes, we can find x and y such that p1 ∈ prep(t1, e, x), p2 ∈ prep(t2, e, y), and pmax is a permutation of the nodes of p1 and p2. Obviously, time(pmax) = time(p1) + time(p2). By the induction hypothesis, time(p1) ≤ eval(η1, e, x) and time(p2) ≤ eval(η2, e, y). Therefore time(pmax) ≤ eval(η′, e, n), and this proves that V(t′, η′) holds. From Lemma B.3, it follows that V(t, η) also holds. □
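The key inequality of this proof can be checked on a small numeric example. The encoding below (an abstract WCET as a list of times sorted in descending order, eval as the sum of the n greatest elements, and heapq.merge as the multiset union ⊎) is an illustrative reconstruction, not the paper's implementation.

    # Numeric check of the Alt inequality. Assumption: an abstract WCET is a
    # descending list of times; eval_n sums its n greatest elements.
    import heapq

    def union(eta1, eta2):
        # Multiset union of two descending sequences, preserving order.
        return list(heapq.merge(eta1, eta2, reverse=True))

    def eval_n(eta, n):
        return sum(eta[:n])

    eta1, eta2 = [9, 7, 2], [8, 3]
    # The n greatest elements of the union dominate any split x + y = n.
    assert eval_n(union(eta1, eta2), 3) >= eval_n(eta1, 2) + eval_n(eta2, 1)
    print(union(eta1, eta2), eval_n(union(eta1, eta2), 3))  # [9, 8, 7, 3, 2] 24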

Lemma B.6. Let t = Loop(h, tb, xh, te), (l, η) = γ(t), (lb, ηb) = γ(tb), (le, ηe) = γ(te). Then:

    ∀e, n ∈ N, V(tb, ηb) ∧ V(te, ηe) ⟹ V(t, η)

Proof. Let t′ be the same tree as t but without the annotation on the root node, and let (l′, η′) = ω(t′).
If lb = lh, from the definition of ω, it follows that the estimated time for one full execution of loop l is constant. Let us name this constant c = eval(ηb, 1, xh) = Σ_{i=0}^{xh−1} ηb[i]. By definition of γ and ω:

    eval(η′, e, n) = c · n + eval(ηe, e, n).

For all p ∈ prep(t′, e, n), ∃p1, . . . , pn ∈ prep(tb, 1, xh) and ∃pe ∈ prep(te, e, n) such that p is a permutation of p1@ . . . @pn@pe. We have time(p) = time(p1) + . . . + time(pn) + time(pe). Also, ∀k, time(pk) ≤ eval(ηb, 1, xh). Therefore,

    time(p) ≤ n · eval(ηb, 1, xh) + eval(ηe, e, n) = eval(η′, e, n).

Notice that we can rule out the case le = lh, by definition of context annotations.


If lb ≠ lh, then by definition of the γ and ω functions, we have

    η′[i] = Σ_{j=i·xh}^{i·xh+xh−1} ηb[j].

We know that n is a multiple of e. Let n = k · e. We have:

    eval(η′, e, n) = eval(η′, 1, k) · e
                   = e · Σ_{i=0}^{k−1} Σ_{j=i·xh}^{i·xh+xh−1} ηb[j] + eval(ηe, e, n)
                   = e · Σ_{i=0}^{k·xh−1} ηb[i] + eval(ηe, e, n)
                   = e · eval(ηb, 1, k · xh) + eval(ηe, e, n)
                   = eval(ηb, e, n · xh) + eval(ηe, e, n).

Since lb ≠ lh, and from Lemma B.2, we know that no annotation in t refers to the current loop. Therefore, for all p ∈ prep(t′, e, n), p can be expressed as a permutation of pb@pe, where pb ∈ prep(tb, e, n · xh) and pe ∈ prep(te, e, n). Then:

    time(p) = time(pb) + time(pe) ≤ eval(ηb, e, n · xh) + eval(ηe, e, n) = eval(η′, e, n).

This proves that V (t ′, η ′ ) holds. From Lemma B.3, it follows that V (t, η) also holds. □

We can now conclude on the validity of our complete WCET evaluation method.

Theorem B.7. Let G be a CFG. Let D = DAG(G, ⊤) and let t = MakeCFT(D, Ds, De). Let (l, η) = γ(t). We have:

    ∀p ∈ gpaths(G, Ge), time(p) ≤ eval(η, 1, 1)

Proof. Consequence of Theorem A.2 and Theorem B.1. □
