1 Introduction
Fig. 1. The CFGs of the original program (blocks A, B, C), of the program with an opaquely true predicate OPT , and of the program with an unknown opaque predicate OP? (with diversified branches B and B1 ).
In Fig. 1 we have the control flow graph of the program obtained by inserting an unknown
opaque predicate. An unknown opaque predicate OP? is a predicate that sometimes
evaluates to true and sometimes evaluates to false. These predicates are used to diversify
program execution by inserting in the true and false branches sequences of instructions
that are different but functionally equivalent (e.g. blocks B and B1 ) [6]. Observe that this
transformation adds confusion to dynamic analysis: a dynamic analyser has to consider
more execution traces in order to observe all possible program behaviours. Indeed, if
the dynamic analysis observes only traces that follow the original path A → OP? →
B → C it may not be sound as it misses the traces that follow A → OP? → B1 → C
(false negative).
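The effect of an unknown opaque predicate can be sketched in a few lines of Python (a hypothetical illustration, not code from the paper): two functionally equivalent blocks are selected by a predicate whose outcome varies across executions, so a dynamic analyser must observe both branches to see all behaviours.

```python
import random

def block_b(x):
    # block B: double x by addition
    return x + x

def block_b1(x):
    # block B1: functionally equivalent, double x by multiplication
    return x * 2

def obfuscated(x):
    # OP?: an unknown opaque predicate, simulated here by a coin flip;
    # it sometimes evaluates to true and sometimes to false
    if random.random() < 0.5:
        return block_b(x)    # true branch
    return block_b1(x)       # false branch
```

Both branches compute the same function, so the denotational semantics is preserved, but any single observed trace covers only one of the two paths.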
The abstract interpretation framework has been used to formally prove the efficiency
of code obfuscation in making static analysers imprecise [13]. Indeed, code obfuscation
hampers static analysis by exploiting its conservative nature, namely by increasing its
imprecision (false positives) while preserving the program intended behaviour. It has
been observed that adding false positives to the analysis can be formalised in terms
of incompleteness in the analysis of the transformed program [13, 14, 18, 21]. Observe
that, in general, the imprecision added by these obfuscating transformations in order
to confuse a static analyzer is not able to confuse a dynamic attacker that looks at the
real program execution and thus cannot be deceived by false positives. Indeed, dynamic
analysis observes only paths that are actually executed. For this reason common deobfuscation approaches often resort to dynamic analysis to understand obfuscated code [3,
7, 31, 38].
It is clear that to hamper dynamic analysis we need to develop obfuscation techniques that exploit the Achilles heel of dynamic analysis and increase the number of false negatives. In the literature, there are defense techniques that focus on hamper-
ing dynamic analysis [2, 25–27]. We would like to provide a formal framework where
it is possible to prove and discuss the efficiency of these techniques in complicating
dynamic analysis in terms of the imprecision (false negatives) that they introduce in
the analysis. This will allow us to better understand the potential and limits of code
obfuscation against dynamic program analysis. We start by providing a formalisation
of dynamic analysis and software protection techniques in terms of program semantics
and equivalence relations over semantic domains, and we characterise when a program
transformation hampers a dynamic analysis in terms of topological features.
The contributions of this work are: (1) a formal specification of dynamic analyses/attacks based on program semantics and equivalence relations; (2) a formal definition of software-based protection transformations against dynamic attacks that induce imprecision in dynamic analysis (false negatives); (3) a validation of the model on some known software-based defense strategies.
2 Preliminaries
Basic lattice and fix-point theory: Given two sets S and T , we denote with ℘(S) the powerset of S, with S × T the Cartesian product of S and T , with S ⊂ T strict inclusion, with S ⊆ T inclusion, and with S ⊆F T the fact that S is a finite subset of T . ⟨C, ≤, ∨, ∧, ⊤, ⊥⟩ denotes a complete lattice on the set C, with ordering ≤, least upper bound (lub) ∨, greatest lower bound (glb) ∧, greatest element (top) ⊤, and least element (bottom) ⊥. Let C and D be complete lattices. Then, C →m D and C →c D denote, respectively, the set of all monotone and the set of all (Scott-)continuous functions from C to D. Recall that f ∈ C →c D if and only if f preserves lubs of (nonempty) chains, if and only if f preserves lubs of directed subsets. Let f : C → C be a function on a complete lattice C; we denote with lfp(f) the least fix-point, when it exists, of function f on C. The well-known Knaster-Tarski theorem states that any monotone operator f : C →m C on a complete lattice C admits a least fix-point. It is known that if f : C →c C is continuous then lfp(f) = ∨i∈IN fi (⊥), where, for any i ∈ IN and x ∈ C, the i-th power of f in x is inductively defined as follows: f0 (x) = x; fi+1 (x) = f(fi (x)).
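On the powerset lattice ⟨℘(S), ⊆⟩, where ⊥ = ∅ and the lub is union, this iterative characterisation of the least fix-point can be sketched in Python (an illustrative fragment; the operator f below is a hypothetical continuous function computing graph reachability):

```python
def lfp(f, bottom=frozenset()):
    """Kleene iteration: compute f^0(bot), f^1(bot), ... until a fixpoint.
    Terminates when f is continuous and the iteration chain stabilises."""
    x = bottom
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt

# example operator: states reachable from node 1 in a small graph
edges = {1: {2}, 2: {3}, 3: {3}}
f = lambda X: frozenset({1}) | frozenset(t for s in X for t in edges[s])
```

Here lfp(f) yields the set of nodes reachable from node 1, obtained as the limit of the chain ∅ ⊆ f(∅) ⊆ f(f(∅)) ⊆ ...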
Given a relation R ⊆ C × D between two sets C and D, and two elements x ∈ C
and y ∈ D, then (x, y) ∈ R denotes that the pair (x, y) belongs to the relation R. A
binary relation R on a set C, namely R ⊆ C × C, is an equivalence relation if R is
reflexive ∀x ∈ C : (x, x) ∈ R, symmetric ∀x, y ∈ C : (x, y) ∈ R ⇒ (y, x) ∈ R
and transitive ∀x, y, z ∈ C : (x, y) ∈ R ∧ (y, z) ∈ R ⇒ (x, z) ∈ R. Given a set
C equipped with an equivalence relation R, we consider for each element x ∈ C the
subset [x]R of C containing all the elements of C in equivalence relation with x, i.e.,
[x]R = {y ∈ C | (x, y) ∈ R}. The sets [x]R are called equivalence classes of C wrt
relation R. Let eq(C) be the set of equivalence relations over the set C. The equivalence
classes of an equivalence relation R ∈ eq(C) form a partition of the set C, namely
∀x, y ∈ C : [x]R = [y]R ∨ [x]R ∩ [y]R = ∅ and ∪{[x]R | x ∈ C} = C. The partition
of C induced by the set of equivalence classes of relation R is called the quotient set
of C and it is denoted by C/R . A partition C/R1 is a refinement of a partition C/R2 ,
namely R1 is finer than R2 or R2 is coarser than R1 , if every equivalence class in C/R1 is a subset of some equivalence class in C/R2 . We denote with R1 ⊑ R2 the fact that
the equivalence relation R1 is finer than the equivalence relation R2 . Given a subset
S ⊆ C we denote with R(S) the set of equivalence classes of the elements of S, namely
R(S) = {[x]R | x ∈ S}, and with S/R the partition of the subset S induced by the
equivalence relation R, namely S/R = {[x]R ∩ S | x ∈ S}.
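These notions can be made concrete with a small Python sketch (illustrative only), where an equivalence relation is represented by the function mapping each element to a label of its class:

```python
from collections import defaultdict

def quotient(C, key):
    """C/R: the equivalence classes of the relation
    (x, y) in R  iff  key(x) == key(y)."""
    classes = defaultdict(set)
    for x in C:
        classes[key(x)].add(x)
    return {frozenset(c) for c in classes.values()}

def finer(P1, P2):
    """Partition P1 refines P2: every class of P1 is inside some class of P2."""
    return all(any(c1 <= c2 for c2 in P2) for c1 in P1)

C = set(range(10))
mod4 = quotient(C, lambda x: x % 4)   # four classes
mod2 = quotient(C, lambda x: x % 2)   # two classes: evens and odds
```

Here mod4 refines mod2 (every residue class modulo 4 sits inside a residue class modulo 2), but not vice versa.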
Program semantics: Let Prog be a set of programs ranged over by P. Let v ∈ I denote
a possible input and let I∗ denote the set of input sequences ranged over by I, let PP
denote the set of program points ranged over by pp, let Com denote the set of program
statements ranged over by C, and let Mem denote the set of memory maps m : Var → Values, ranged over by m, that associate values to variables. Σ = I∗ × PP × Com × Mem
is the set of program states. Thus, a program state s ∈ Σ is a tuple s = ⟨I, pp, C, m⟩
where I denotes the sequence of inputs that still needs to be consumed to terminate
the execution, pp denotes the program point of the next instruction C that has to be
executed, and m is the current memory. We denote with C1 ; C2 the sequential com-
position of statements and we refer to skip as the identity statement whose execution
has no effects on memory. Given a program P we denote with IP ⊆ I∗ the set of the
initial input sequences for the execution of program P, and with InitP = {s ∈ Σ | s = ⟨I, pp, C, m⟩, I ∈ IP } the set of its initial states. We use Σ∗ to denote the set of all finite
and infinite sequences or traces of states, where ε ∈ Σ∗ is the empty sequence and |σ| is the length of sequence σ ∈ Σ∗ . Σ+ ⊂ Σ∗ denotes the set of finite sequences of elements of Σ. We denote the concatenation of sequences σ, ν ∈ Σ∗ as σν. Given σ, ν ∈ Σ∗ , ν ⪯ σ means that ν is a subsequence of σ, namely that there exist σ1 , σ2 ∈ Σ∗ such that
σ = σ1 νσ2 . Given s ∈ Σ we write s ∈ σ when s is an element occurring in sequence
σ, and we denote with σ0 ∈ Σ the first element of sequence σ and with σf the final
element of the finite sequence σ ∈ Σ+ . Let R ⊆ Σ × Σ denote the transition relation
between program states, thus (s, s′ ) ∈ R means that state s′ can be obtained from state
s in one computational step. The (finite) trace semantics of a program P is defined, as
usual, as the least fix-point computation of function FP : ℘(Σ∗ ) → ℘(Σ∗ ) [9]:
FP (X) = InitP ∪ { σsi si+1 | (si , si+1 ) ∈ R, σsi ∈ X }

The trace semantics of P is [[P]] def= lfp(FP ) = ∪i∈IN FiP (∅). Den[[P]] denotes the denotational (finite) semantics of program P, which abstracts away the history of the computation by observing only the input-output relation of finite traces. Therefore we have Den[[P]] = {σ ∈ Σ+ | ∃η ∈ [[P]] : η0 = σ0 , ηf = σf }.
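The abstraction from [[P]] to Den[[P]] can be sketched as follows (illustrative Python, with traces modelled as tuples of states):

```python
def den(semantics):
    """Den[[P]]: keep only the input-output behaviour of finite traces,
    i.e. identify traces that share their first and final state."""
    return {(sigma[0], sigma[-1]) for sigma in semantics}

# two traces with different histories but equal endpoints are identified
traces = {('s0', 's1', 's2'), ('s0', 's3', 's2')}
```

Here den(traces) collapses both traces to the single pair ('s0', 's2'), forgetting the intermediate computation.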
In the literature there exists a formal investigation of the effects of code obfuscation
to the precision of static analysis [13, 14, 18, 21]. This has led to a better understand-
ing of the potential and limits of obfuscation, and it has been useful in the design of
obfuscation techniques that target specific program properties [14, 18, 19].
In the following we apply a similar approach to dynamic analysis. To this end we
formalise the absence of false negatives, namely the precision of dynamic analysis,
in terms of topological properties of program trace semantics and of the equivalence
relation R modelling the property to be observed. False negatives happen when the set
of traces considered by dynamic analysis misses some traces that would modify the
equivalence classes observed by property R. We show how to transform a program in
order to hinder the dynamic analysis of a property R, namely in order to make the
dynamic analysis of the transformed program not sound.
Dynamic analysis observes a finite subset of finite execution traces of a program and
from this partial observation tries to draw conclusions on the whole program behaviour.
Definition 1 (Dynamic Execution). The execution traces of program P with initial
states in TP ⊆F InitP and with time limit t ∈ N, are defined as:

Exe(P, TP , t) def= { σ ∈ [[P]] | |σ| ≤ t, σ = s0 σ′ , s0 ∈ TP }
Note that Exe(P, TP , t) is a finite set and that each trace in Exe(P, TP , t) is finite (it has
at most t states). This correctly implies that: Exe(P, TP , t) ⊆F [[P]]. The goal of dynamic
analysis is to derive knowledge of a semantic property of a program by observing a finite
subset Exe(P, TP , t) of its execution traces. Dynamic analysis is therefore specified as
the set of observed execution traces Exe(P, TP , t) and of an equivalence relation on
traces R ∈ eq(Σ∗ ).
Definition 2 (Dynamic Analysis). A dynamic analysis of property R ∈ eq(Σ∗ ) of pro-
gram P ∈ Prog, is defined as a pair ⟨R, Exe(P, TP , t)⟩.
Let us consider program P on the left of Fig. 2 where the block of code to execute
depends on the input value of x. Consider a property of traces R̄ ∈ eq(Σ∗ ) that observes
which block B1 , B2 or B3 of program P is executed. On the right of Fig. 2 we represent
the partition of the traces of program P induced by property R̄ where xInit denotes the
input value of variable x.
Dynamic analysis ⟨R, Exe(P, TP , t)⟩ can precisely observe property R of the seman-
tics of P (no false negatives) when Exe(P, TP , t) contains at least one trace for each one
of the equivalence classes of the traces of [[P]].
Definition 3 (Soundness). Given P ∈ Prog and R ∈ eq(Σ∗ ), a dynamic analysis ⟨R, Exe(P, TP , t)⟩ is sound if ∀x ∈ [[P]] : [x]R ∈ R(Exe(P, TP , t)).
When a dynamic analysis ⟨R, Exe(P, TP , t)⟩ is sound we have no false negatives, namely ∀y ∈ [[P]] : [y]R ∈ R(Exe(P, TP , t)). When this happens, all the behaviours of program
P that relation R is able to distinguish are taken into account by the partial observation
of program behaviour Exe(P, TP , t). In the example in Fig. 2 we have that a dynamic
analysis ⟨R̄, Exe(P, TP , t)⟩ is sound if Exe(P, TP , t) contains at least one execution trace
for each one of the three equivalence classes depicted on the right of Fig. 2.
Definition 4 (Covers). Given P ∈ Prog, and R ∈ eq(Σ∗ ), we say that S ⊆ [[P]] covers
P wrt R when: R(S) = R([[P]]).
It is clear that when S covers P wrt R we have that the partial observation S of the
behaviours of P is sound wrt R, since it allows us to observe all the equivalence classes
of R that we would observe by having access to all the traces in [[P]] (no false negatives).
Thus, in the example in Fig. 2 we have that the set of traces {σ1 , η1 } does not cover P wrt
R̄, while the set of traces {σ1 , η1 , η2 , µ2 } does. The following theorem comes straight
from the definitions.
Theorem 1. Given P ∈ Prog and R ∈ eq(Σ∗ ), if Exe(P, TP , t) covers P wrt R then the dynamic analysis ⟨R, Exe(P, TP , t)⟩ is sound (no false negatives).
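Definitions 3 and 4 and Theorem 1 translate directly into executable checks (an illustrative Python sketch; the relation R is represented by a hypothetical key function mapping each trace to its equivalence class):

```python
def observed_classes(key, traces):
    """R(S): the set of equivalence classes touched by the traces in S."""
    return {key(sigma) for sigma in traces}

def covers(S, semantics, key):
    """S covers P wrt R when R(S) = R([[P]])."""
    return observed_classes(key, S) == observed_classes(key, semantics)

# toy semantics in the spirit of Fig. 1: the third state is B or B1
semantics = {('A', 'OP?', 'B', 'C'), ('A', 'OP?', 'B1', 'C')}
key = lambda t: t[2]                  # property R: which block was run
partial = {('A', 'OP?', 'B', 'C')}    # observation missing the B1 class
```

covers(partial, semantics, key) is false, so the dynamic analysis based on partial has a false negative; observing both traces makes it sound.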
The goal of dynamic analysis of a property R on a program P, is to identify the set TP
of inputs, and the length t that induce a partial observation of program semantics that
makes the analysis sound (no false negatives) wrt R. Thus, a possible way to hamper
dynamic analysis is to transform programs in order to increase the number of traces that
it is necessary to observe to ensure soundness. Indeed, by tying the precision of dynamic
analysis to the observation of a wider set of traces (worst case being the observation of
all possible traces) we are limiting the advantages of using dynamic analysis.
In order to formalise this idea, in the following we provide a characterisation of the
set of traces that are needed to guarantee the soundness of the dynamic analysis of a
program P wrt a semantic property R. We use this characterisation to formalise what
it means for a software-based defense transformation to harm dynamic analysis. We
validate our model by showing how it naturally relates to the notion of code coverage
of dynamic analysis, and by showing how existing techniques for hindering dynamic
analysis fit in our framework.
This means that Core(R([[P]]), R) characterises the minimal sets of execution traces that
provide a sound dynamic analysis of property R for program P. In the example in Fig. 2
we have that Core([[P]], R̄) identifies those sets of traces that contain exactly three traces: one trace with xinit < 50, one trace with 50 ≤ xinit ≤ 100 and one trace with xinit > 100.
Corollary 1. Given P ∈ Prog and R ∈ eq(Σ∗ ) we have that:
Thus, a dynamic analysis ⟨R, Exe(P, TP , t)⟩ is sound if Exe(P, TP , t) observes at least
one execution trace for each one of the equivalence classes of the traces in [[P]] for the
relation R. In the worst case we have a different equivalence class for every execution
trace of P. When this happens, a sound dynamic analysis of property R on program
P has to observe all possible execution traces, which is unfeasible in the general case.
Thus, if we want to protect a program from a dynamic analysis that is interested in the
property R, we have to diversify property R as much as possible among the execution
traces of the program.
This allows us to define when a program transformation is potent wrt a dynamic
analysis, namely when a program transformation forces a dynamic analysis to observe
a wider set of traces in order to be sound. See [5] for the general notion of potency of a
program transformation, i.e., a program transformation that foils a given attack (in our
case a dynamic analysis).
Definition 6 (Potency). A program transformation T : Prog → Prog that preserves the
denotational semantics of programs is potent for a program P ∈ Prog wrt an observa-
tion R ∈ eq(Σ∗ ) if the following two conditions hold:
Fig. 3 provides a graphical representation of the notion of potency. On the left we have
the traces of the original program P partitioned according to the equivalence relation
R, while on the right we have the traces of the transformed program T(P) partitioned
according to R. Traces that are denotationally equivalent have the same shape (trian-
gle, square, circle, oval), but are filled differently since they are in general different
traces. The first condition means that the traces of T(P) that property R maps to the
same equivalence class (triangle and square), are denotationally equivalent to traces of
P that property R maps to the same equivalence class. This means that what is grouped
together by R on [[T(P)]] was grouped together by R on [[P]], modulo the denotational
equivalence of traces. The second condition requires that there are traces of P (circle
and oval) that property R maps to the same equivalence class and whose denotationally
equivalent traces in T(P) are mapped by R to different equivalence classes. This means
that a defense technique against dynamic analysis wrt a property R is successful when
it transforms a program into a functionally equivalent one for which property R is more
diversified among execution traces. This implies that it is necessary to collect more ex-
ecution traces in order for the analysis to be precise. At the limit we have an optimal
defense technique when R varies at every execution trace.
Example 2. Consider the following programs P and Q that compute the sum of natural
numbers from x > 0 to 49 (we assume that the input values for x are natural numbers).

P:
input x;
sum := 0;
while x < 50
  • X = [0, 49]
  sum := sum + x;
  x := x + 1;

Q:
input x;
n := select(N,x);
x := x * n;
sum := 0;
while x < 50 * n
  • X = [0, n ∗ 50 − 1]
  sum := sum + x/n;
  x := x + n;
x := x/n;
Consider a dynamic analysis that observes the maximal value assumed by x at pro-
gram point •. For every possible execution of program P we have that the maximal
value assumed by x at program point • is 49. Consider a state s ∈ Σ as a tuple ⟨I, pp, C, [valx , valsum ]⟩, where valx and valsum denote the current values of variables x and sum respectively. We define a function τ : Σ → N that observes the value assumed by x at state s when s refers to program point •, and a function Max : Σ∗ → N that observes the maximal value assumed by x at • along an execution trace:

τ(s) def= valx if pp = •, ∅ otherwise        Max(σ) def= max({τ(s) | s ∈ σ})
This allows us to define the equivalence relation RMax ∈ eq(Σ∗ ) that observes traces
wrt the maximal value assumed by x at •, as (σ, σ′ ) ∈ RMax iff Max(σ) = Max(σ′ ).
The equivalence classes of RMax are the sets of traces with the same maximal value
assumed by x at •. We can observe that all the execution traces of P belong to the
same equivalence class of RMax . In this case, a dynamic analysis hRMax , Exe(P, TP , t)i
is sound if Exe(P, TP , t) contains at least one execution trace of P. This happens because
the property that we are looking for is an invariant property of program executions and
it can be observed on any execution trace.
Let us now consider program Q. Q is equivalent to P, i.e., Den[[P]] = Den[[Q]], but
the value of x is diversified by multiplying it by the parameter n. The guard and the
body of the while are adjusted in order to preserve the functionality of the program.
When observing property RMax on Q, we have that the maximal value assumed by x at
program point • is determined by the parameter n generated in the considered trace. The
statement n:=select(N,x) assigns to n a value in the range [0, N] depending on the
input value x. We have that the traces of program Q are grouped by RMax depending on
the value assumed by n. Thus, R([[Q]]) contains an equivalence class for every possible
value assumed by n during execution. This means that the transformation that rewrites
P into Q is potent according to Definition 6. Dynamic analysis hRMax , Exe(Q, TQ , t)i
is sound if Exe(Q, TQ , t) contains at least one execution trace for each of the possible
values of n generated during execution.
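Example 2 can be replayed concretely (an illustrative Python sketch; select(N, x) is modelled here as a random choice in [1, N], avoiding the degenerate parameter n = 0):

```python
import random

def run_P(x):
    """Program P: collect the values of x observed at program point •."""
    vals, s = [], 0
    while x < 50:
        vals.append(x)          # program point •
        s += x
        x += 1
    return vals

def run_Q(x, N=5):
    """Program Q: the parameter n diversifies the values observed at •."""
    n = random.randint(1, N)    # models n := select(N, x)
    x, s, vals = x * n, 0, []
    while x < 50 * n:
        vals.append(x)          # program point •
        s += x // n
        x += n
    return vals, n
```

For every input x < 50, max(run_P(x)) is the invariant 49, so a single trace yields a sound analysis of RMax ; for Q the maximum is 49·n, so a sound analysis must observe one trace per reachable value of n.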
4 Model Validation
In this section we show how the proposed framework can be used to model existing code
obfuscation techniques. In particular we model the way these transformations deceive
dynamic analysis of control flow and data flow properties of programs. We also show
how the measures of code coverage used by dynamic analysis tools can be naturally
interpreted in the proposed framework.
Dynamic Extraction of the Control Flow Graph. The control flow graph CFG of a
program P is a graph CFGP = (V, E) where each node v ∈ V is a pair (pp, C) denot-
ing a statement C at program point pp in P, and E ⊆ V × V is the set of edges such
that (v1 , v2 ) ∈ E means that the statement in v2 could be executed after the statement in v1 when running P. Thus, we define the domain of nodes as Nodes def= PP × Com, and the domain of edges as Edges def= Nodes × Nodes. It is possible to dynamically
construct the CFG of a program by observing the commands that are executed and
the edges that are traversed when the program runs. Let us define η : Σ → Nodes
that observes the command to be executed together with its program point, namely
η(s) = η(⟨I, pp, C, m⟩) def= (pp, C). By extending this function to traces we obtain the function path : Σ∗ → ℘(Nodes) × ℘(Edges) that extracts the path of the CFG corresponding to the considered execution trace, abstracting from the number of times that an edge is traversed or a node is computed:

path(σ) def= ( {η(s) | s ∈ σ}, {(η(s), η(s′ )) | ss′ ⪯ σ} )

where s ∈ σ means that s is a state that appears in trace σ and ss′ ⪯ σ means that
s and s′ are successive states in σ. This allows us to define the equivalence relation RCFG ∈ eq(Σ∗ ) that observes traces up to the path that they define, as (σ, σ′ ) ∈ RCFG iff path(σ) = path(σ′ ). Indeed, RCFG groups together those traces that execute the
same set of nodes and traverse the same set of edges, abstracting from the number of
times that nodes are executed and edges are traversed.
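With traces modelled as sequences of (pp, C) nodes, path and RCFG can be sketched as follows (illustrative Python):

```python
def path(trace):
    """Nodes executed and edges traversed by a trace, forgetting
    how many times each node or edge occurs."""
    return (frozenset(trace), frozenset(zip(trace, trace[1:])))

# two runs unrolling a loop a different number of times define the same path
t1 = ((1, 'input'), (2, 'guard'), (3, 'body'), (2, 'guard'), (4, 'exit'))
t2 = ((1, 'input'), (2, 'guard'), (3, 'body'), (2, 'guard'),
      (3, 'body'), (2, 'guard'), (4, 'exit'))
```

Here path(t1) == path(t2), so (t1, t2) ∈ RCFG even though t2 iterates the loop once more.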
The CFG of a program P can be defined as the union of the paths of its execution traces, namely CFGP = ⊔{path(σ) | σ ∈ [[P]]}, where the union of graphs is defined as (V1 , E1 ) ⊔ (V2 , E2 ) = (V1 ∪ V2 , E1 ∪ E2 ). The dynamic extraction of the CFG of a program P from the observation of a set X ⊆F [[P]] of execution traces is given by ⊔{path(σ) | σ ∈ X}. In the general case we have ⊔{path(σ) | σ ∈ X} ⊆ CFGP .
Preventing Dynamic CFG Extraction. Control code obfuscations are program trans-
formations that modify the program’s control flow in order to make it difficult for an
adversary to analyse the flow of control of programs [5]. According to Section 3.2,
a program transformation T : Prog → Prog is a potent defence against the dynamic
extraction of the CFG of a program P when T diversifies the paths taken by the execu-
tion traces of T(P) wrt the paths taken by the traces of P. In the following, we show
how two known defence techniques for preventing dynamic analysis actually work by
diversifying program traces with respect to property RCFG .
A simple example is provided in Fig. 4 where on the left we have the CFG of the
original program P. P verifies the parity of the input value and then computes the integer
division. The second graph in Fig. 4 represents the CFG of program P transformed by
RD. The CFG of program RD(P) has four different paths depending on the value of the
input variable x. Each one of these paths is functionally equivalent to the corresponding
path in P (case 0 and case 2 are equivalent to the path taken when x is even, while
case 1 and case 3 are equivalent to the path taken when x is odd). We can easily
observe that in this case the paths of RD(P) have been diversified wrt the paths of P.
Indeed, a dynamic analysis has to observe two execution traces to precisely build the
CFG for P, while four traces are needed to precisely build the CFG of RD(P).
odd(x) → 1→2→5, 1→4→5        even(x) → 1→3→5, 1→6→5
We can easily observe that the paths of GD(P) have been diversified wrt the paths of P: while the dynamic construction of the CFG for P requires observing two execution traces, we need to observe four execution traces to precisely build the CFG of GD(P).
Code Coverage. Most dynamic algorithms use code coverage to measure the potential soundness of the
analysis [1]. Intuitively, given a program P and a partial observation Exe(P, TP , t) of
its execution traces, code coverage aims to measure the amount of program behaviour
considered by Exe(P, TP , t) wrt the set of all possible behaviours [[P]]. In the following
we describe some known code coverage measures.
Statement coverage considers the statements of the program that have been executed
by the traces in Exe(P, TP , t). This is formalised by a function st : Σ∗ → ℘(Nodes) that collects the commands, annotated with their program point, that are executed along a considered trace: st(σ) def= {η(s) | s ∈ σ}. This allows us to define the equivalence relation Rst ∈ eq(Σ∗ ) that groups together traces that execute the same set of statements.
Count-Statement coverage considers how many times each statement of the program
has been executed by the traces in Exe(P, TP , t). Thus, it can be formalised in terms of an equivalence relation R+st ∈ eq(Σ∗ ) that groups together traces that execute the same set of statements the same number of times. It is clear that relation R+st is finer than relation Rst , namely R+st ⊑ Rst .
Path coverage observes the nodes executed and edges traversed by the traces in
Exe(P, TP , t). This precisely corresponds to the observation of property RCFG ∈ eq(Σ∗ ) defined above, where the paths of the CFG are observed by abstracting from the number of times that edges are traversed. It is clear that relation RCFG is finer than relation Rst , namely RCFG ⊑ Rst .
Count-Path coverage considers the different paths in Exe(P, TP , t), where the num-
ber of times that edges are traversed in a trace is taken into account. This can be formalised in terms of an equivalence relation R+CFG ∈ eq(Σ∗ ) that groups together traces that execute and traverse the same nodes and edges the same number of times. It is clear that relation R+CFG is finer than relation RCFG , namely R+CFG ⊑ RCFG .
Trace coverage considers the traces of commands that have been executed abstract-
ing from the memory map. In this case we can define the code coverage in terms of a function trace : Σ∗ → Nodes∗ defined as trace(ε) def= ε and trace(sσ) def= η(s)trace(σ). The equivalence relation Rtrace ∈ eq(Σ∗ ) is such that (σ, σ′ ) ∈ Rtrace iff trace(σ) = trace(σ′ ). This equivalence relation is finer than R+CFG since it keeps track of the order of execution of the edges.
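The coverage observations above, and their relative precision, can be sketched as abstraction functions on traces (illustrative Python; traces are tuples of nodes as before):

```python
from collections import Counter

def st(t):         # statement coverage: set of nodes
    return frozenset(t)

def st_plus(t):    # count-statement coverage: nodes with multiplicities
    return frozenset(Counter(t).items())

def cfg(t):        # path coverage: nodes and edges, no multiplicities
    return (frozenset(t), frozenset(zip(t, t[1:])))

def cfg_plus(t):   # count-path coverage: edge multiplicities matter
    return (frozenset(Counter(t).items()),
            frozenset(Counter(zip(t, t[1:])).items()))

def trace_cov(t):  # trace coverage: the whole node sequence
    return tuple(t)

# a loop executed once vs twice: identified by cfg, separated by cfg_plus
a = ('A', 'L', 'B', 'L', 'E')
b = ('A', 'L', 'B', 'L', 'B', 'L', 'E')
```

The traces a and b are equivalent for st and cfg but distinguished by cfg_plus and trace_cov, matching the refinement chain of the relations above.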
In order to avoid false negatives, dynamic algorithms automatically look for in-
puts whose execution traces have to exhibit new behaviours with respect to the code
coverage metric used (e.g., they have to execute new statements or execute them a dif-
ferent number of times, traverse new edges or change the number of times edges are
traversed, or execute nodes in a different order). This can be naturally formalised in
our framework. Given a set Exe(P, TP , t) of observed traces, an automatically generated input increases the code coverage measured as Rst (or R+st , RCFG , R+CFG , Rtrace ) if the execution trace σ generated by the input is mapped to a new equivalence class of Rst , namely to an equivalence class that was not observed by the traces in Exe(P, TP , t), i.e., if [σ]Rst ∉ Rst (Exe(P, TP , t)) (analogously for R+st , RCFG , R+CFG , Rtrace ). We have seen above that some of the common measures for code coverage can be expressed in terms of semantic program properties with different degrees of precision: id ⊑ Rtrace ⊑ R+CFG ⊑ RCFG ⊑ Rst . This means, for example, that automatically generated inputs could add coverage for R+CFG but not for Rst . Indeed, whether a new input generates a new behaviour depends on the metric used for code coverage.
Fuzzing and dynamic symbolic execution are typical techniques used by dynamic
analysis to automatically generate inputs in order to extend code coverage. The metrics that fuzzing and symbolic execution use to measure code coverage are sometimes slight variations of the ones mentioned earlier.
Fuzzing: The term fuzzing refers to a family of automated input generating techniques
that are widely used in the industry to find vulnerabilities and bugs in all kinds of soft-
ware [35]. In general, a fuzzer aims at discovering inputs that generate new behaviours, thus one measure of success for a fuzzer is code coverage. Simple statement coverage is rarely a good choice, since crashes do not usually depend on a single program statement, but on a specific sequence of statements [39]. Most fuzzing algorithms choose to define their own code coverage metric. American Fuzzy Lop (AFL) is a state-of-the-art fuzzer that has seen extensive use in the industry in its base form, while new fuzzers are continuously built on top of it [32]. The measure used by AFL for code coverage lies between path and count-path coverage, as it approximates the number of times that edges are traversed by intervals of natural numbers ([1], [2], [3], [4 − 7], [8 − 15], [16 − 31], [32 − 127], [128, ∞)). Libfuzzer [30] and honggfuzz [36] employ count-statement coverage. To the best of our knowledge trace coverage is never used as it is infeasible in practice [16].
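The AFL-style approximation of edge hit counts can be sketched as a bucketing function (illustrative Python; real AFL instruments a shared-memory coverage map, which is omitted here):

```python
def afl_bucket(hits):
    """Map a positive edge hit count to its AFL-style interval."""
    buckets = [(1, 1), (2, 2), (3, 3), (4, 7), (8, 15),
               (16, 31), (32, 127)]
    for lo, hi in buckets:
        if lo <= hits <= hi:
            return (lo, hi)
    return (128, float('inf'))

# traces whose edge counts fall in the same buckets add no new coverage
```

Two traces traversing an edge 5 and 6 times map to the same bucket [4, 7], so AFL's metric sits between RCFG (all counts collapsed) and R+CFG (all counts kept).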
Dynamic Symbolic Execution: DSE is a well known dynamic analysis technique that
combines concrete and symbolic execution [22]. DSE typically starts by executing a
program on a random input and then generates branch conditions that take into account
the executed branches. When execution ends, DSE looks at the last branch condition
generated and uses a theorem prover to solve the negated predicate in order to explore
the branch that was not executed. This is akin to symbolic execution, but DSE can use
the concrete values obtained in the execution to simplify the job of the theorem prover.
The ideal goal of DSE is to reach path coverage, which is guaranteed when the conditions in the target program only contain linear arithmetic [22]. Thus, the efficacy of DSE in generating new inputs is measured in terms of path coverage, formalised as RCFG in our framework.
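The DSE loop can be sketched for a toy program over one integer input (illustrative Python; a brute-force search over a small domain stands in for the theorem prover, and every branch along a path is negated, in the style of generational DSE):

```python
def branches(x):
    """Branch outcomes of a hypothetical two-branch program."""
    return (x > 10, x % 2 == 0)

def dse_explore(seed, domain=range(-50, 50)):
    """Explore paths: for each explored path, negate each branch and
    search for an input matching the prefix and flipping that branch."""
    paths = {branches(seed): seed}
    worklist = [branches(seed)]
    while worklist:
        path = worklist.pop()
        for i in range(len(path)):
            prefix, flipped = path[:i], not path[i]
            for y in domain:
                p = branches(y)
                if p[:i] == prefix and p[i] == flipped and p not in paths:
                    paths[p] = y            # "solved" negated predicate
                    worklist.append(p)
                    break
    return paths
```

Starting from any seed, all four branch-outcome paths of branches are eventually discovered, i.e. path coverage (RCFG) is reached on this toy program.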
Let us denote with R ∈ eq(Σ∗ ) the equivalence relation modelling the code cov-
erage metric used either by fuzzing or symbolic execution or any other algorithm for
input generation. When Exe(P, TP , t) covers P wrt R, we have that the fuzzer or sym-
bolic execution algorithm has found all the inputs that allow us to observe the different
behaviours of P wrt R. In general, a dynamic analysis may be interested in a property
RA ∈ eq(Σ∗ ) that is different from the property R used to measure code coverage.
When R ⊑ RA we have that if Exe(P, TP , t) covers P wrt R, then Exe(P, TP , t) covers P also wrt RA ; this means that the code coverage metric R can help in limiting the number of false negatives of the dynamic analysis ⟨RA , Exe(P, TP , t)⟩. When R ⋢ RA then a different metric for code coverage should be used (for example RA itself).
Data obfuscation transformations change the representation of data with the aim of hid-
ing both variable content and usage. Usually, data obfuscation requires the program
code to be modified, so that the original data representation can be reconstructed at run-
time. Data obfuscation is often achieved through data encoding [5, 28]. More specifi-
cally, in [15, 23] data encoding for a variable x is formalised as a pair of statements: an encoding statement Cenc = x := f(x) and a decoding statement Cdec = x := g(x), for some functions f and g such that Cdec ; Cenc = skip. According to [15, 23] a program transformation T(P) = Cdec ; tx (P); Cenc is a data obfuscation for x, where tx adapts the statements of P that use x to the encoded representation.
Fig. 5 shows, from left to right, programs P, T(P), Tn (P) and TH (P):

P:
input x;
sum := 0;
while x < 50
  • X = [x, 49]
  sum := sum + x;
  x := x + 1;

T(P):
input x;
x := 2*x;
sum := 0;
while x < 2*50
  • X = [x, 2 ∗ 50 − 1]
  sum := sum + x/2;
  x := x + 2;
x := x/2;

Tn (P):
input x;
n := select(N,x);
x := n*x;
sum := 0;
while x < n*50
  • X = [x, n ∗ 50 − 1]
  sum := sum + x/n;
  x := x + n;
x := x/n;

TH (P):
input x;
n := select(N,x);
x := He (n,x);
sum := He (n,0);
while x <H He (n,50)
  • X = [x, He (n, 50) − 1]
  sum := sum +H x;
  x := x +H He (n,1);
x := Hd (x);
Data encoding confuses the static analysis of the values assumed by x at program point •. Indeed, the static analysis of the interval of values of x at program point • in T(P) is different from, and wider than (it contains spurious values), the interval of possible values of x at • in P. However, the dynamic analysis of
properties on the values assumed by x during execution at the different program points
(e.g., maximal/minimal value, number of possible values, interval of possible values)
has not been hardened in T(P). The values assumed by x at • in T(P) are different from
the values assumed by x at • in P but these properties on the values assumed by x are
precisely observable by dynamic analysis on T(P). Transformation T(P) changes the
properties of data values wrt P, but it does so in an invariant way: in every execution
of T(P), variable x is iteratively incremented by 2 and the loop guard is x < 2 ∗ 50,
and this is observable on any execution of T(P). This means that by dy-
namic analysis we could learn that the maximal value assumed by x is 99 (= 2 ∗ 50 − 1).
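This invariance can be sketched concretely. The following is a minimal Python rendering of P and T(P) from Fig. 5 (integer division stands for the exact division x/2, which is always exact here because the encoded x is even):

```python
def P(x):
    # original program: sums the integers from x up to 49
    s = 0
    while x < 50:
        s += x
        x += 1
    return s

def T_P(x):
    # data obfuscation of P: x is kept in encoded form (doubled)
    x = 2 * x             # encoding statement x := f(x), with f(x) = 2*x
    s = 0
    while x < 2 * 50:     # loop guard adapted to the encoding
        s += x // 2       # every use of x is decoded (x is always even here)
        x += 2            # increment adapted to the encoding
    x = x // 2            # decoding statement x := g(x), with g(x) = x/2
    return s

# functionally equivalent on every input
assert all(P(x0) == T_P(x0) for x0 in range(-10, 60))
```

Since the encoding is fixed, every execution of T_P increments x by 2 under the guard x < 100, so a single observed trace already reveals the bound on x: no diversification among traces takes place.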
Thus, transformation T is not potent wrt properties of data values according to Defini-
tion 6 since it does not diversify the properties of values assumed by variables among
traces. In order to hamper dynamic analysis we need to diversify data among traces,
thus forcing dynamic analysis to observe more execution traces to be sound. We could
do this by making the encoding and decoding statements parametric on some natural
number n as described by the third program Tn (P) = x := x/n; tx,n (P); x := n ∗ x in
Fig. 5 (which is the same as Q in Example 2). Indeed, the parametric transformation
Tn (P) is potent wrt properties that observe data values since it diversifies the values
assumed by x among different executions thanks to the parameter n. For example, to
observe the maximal value assumed by x in Tn (P) we should observe an execution for
every possible value of n.
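The parametric transformation can be sketched by making the encoding factor a run-time choice; in this Python sketch, the choice of n from a range N of factors is a hypothetical stand-in for select(N, x):

```python
import random

def Tn_P(x, n=None):
    # parametric data obfuscation: the encoding factor n varies among executions
    if n is None:
        n = random.randint(1, 10)    # select(N, x): pick some n from N
    observed = []                    # values of x seen at program point •
    x = n * x                        # encoding x := n*x
    s = 0
    while x < n * 50:                # guard adapted to the parametric encoding
        observed.append(x)
        s += x // n                  # decode on use (x is a multiple of n here)
        x += n
    x = x // n                       # decoding x := x/n
    return s, observed

# the result is invariant, but the values observed at • depend on n
s2, t2 = Tn_P(45, n=2)
s7, t7 = Tn_P(45, n=7)
assert s2 == s7 == sum(range(45, 50))
```

To soundly learn a property such as the maximal value assumed by x, a dynamic analyser must now observe one execution per possible value of n: with n = 2 the maximal observed value is 98, with n = 7 it is 343.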
This confirms what was observed in [28]: existing data obfuscations make static analysis
imprecise but are less effective against dynamic analysis. Interpreting data obfuscation
in our framework allows us to see that, in order to hamper dynamic analysis, data en-
coding needs to diversify among traces. This can be done by making the existing data
encoding techniques parametric.
5 Related Works
To the best of our knowledge we are the first to propose a formal framework for dy-
namic analysis efficacy based on semantic properties. Other works have proposed more
empirical ways to assess the impact of dynamic analysis.
Evaluating Reverse Engineering. Program comprehension guided by dynamic anal-
ysis has been evaluated with specific test cases, quantitative measures and the involve-
ment of human subjects [8]. For example, the effectiveness of static analysis and
dynamic analysis for the feature location task has been compared through experiments
involving two teams of analysts solving the same problem with a static analysis and a
dynamic analysis approach, respectively [37]. In order to compare the ef-
fectiveness of different reverse engineering techniques (which often employ dynamic
analysis), Sim et al. propose the use of common benchmarks [34]. The efficacy of pro-
tections against human subjects has been evaluated in a set of experiments by Ceccato et
al., finding that program executions are important to understand the behavior of obfus-
cated code [4]. Our approach characterizes dynamic attacks and protections according
to their semantic properties; this is orthogonal to, and can be complemented by, these
more empirical approaches.
Obfuscations Against Dynamic Analysis. One of the first works tackling obfus-
cations specifically geared towards dynamic analysis is by Schrittwieser and Katzen-
beisser [27]. Their approach adopts some principles of software diversification in order
to generate additional paths in the CFG that are dependent on program input (i.e. they
do not work for other inputs). Similar to this approach, Pawlowski et al. [26] generate
additional branches in the CFG but add non-determinism in order to decide the exe-
cuted path at runtime. Both of these works empirically evaluate their methodology and
classify it with potency and resilience, two metrics introduced by Collberg et al. [6].
Banescu et al. empirically evaluated some obfuscations against dynamic symbolic exe-
cution (DSE) [2], finding that DSE does not suffer from the addition of opaque branches
since they do not depend on program input. To overcome this limitation they propose
the Range Dividers obfuscation that we illustrated in Section 4. A recent work by Ol-
livier et al. refines the evaluation of protections against dynamic symbolic execution
with a framework that enables the optimal design of such protections [25]. All these
works share with us the intuition that dynamic analysis suffers from insufficient path
exploration and they prove this intuition with extensive experimentation. Our work aims
at enabling the formal study of these approaches.
Formal Systems. Dynamic taint analysis has been formalized by making the taint
information explicit in the program semantics [29]; that work focuses on writing correct
algorithms and highlights possible pitfalls of the various approaches. Ochoa et al.
[24] use game theory to quantify and compare the effectiveness of different probabilis-
tic countermeasures with respect to remote attacks that exploit memory-safety vulner-
abilities. In our work we model MATE attacks. Shu et al. introduce a framework that
formalizes the detection capability in existing anomaly detection methods [33]. Their
approach equates the detection capability to the expressiveness of the language used to
characterize normal program traces.
6 Conclusions
This work represents a first step towards a formal investigation of the precision of
dynamic analysis in relation to dynamic code attacks and defences. The results that
we have obtained so far confirm the initial intuition: diversification is the key to ham-
pering dynamic analysis. Dynamic analysis generalises what it learns from a partial ob-
servation of program behaviour; diversification makes this generalisation less precise
(dynamic analysis cannot consider what it has not observed). We think that this work
would be the basis for further interesting investigations. Indeed, there are many aspects
that still need to be understood for the development of a complete framework for the
formal specification of the precision of dynamic analysis (no false negatives), and for
the systematic development of program transformations that induce imprecision.
We plan to consider more sophisticated properties than the ones that can be ex-
pressed as equivalence relations. It would be interesting to generalise the proposed
framework wrt any semantic property that can be formalised as a closure operator
on trace semantics. The properties that we have considered so far correspond to the set
of atomistic closures where the abstract domain is additive. We would like to generalise
our framework to properties modelled as abstract domains and where the precision of
dynamic analysis is probably characterised in terms of the join-irreducible elements
of such domains. A further investigation would probably lead to a classification of the
properties usually considered by dynamic analysis: properties of traces, properties of
sets of traces, relational properties, hyper-properties, together with a specific charac-
terisation of the precision of the analysis and of the program transformations that can
reduce it. This unifying framework would provide a common ground on which to inter-
pret and compare the potency of different software protection techniques in hampering
dynamic analysis.
We can view dynamic analysis as a learner that observes properties of some execu-
tion traces (training set) and then generalises what it has observed, where the general-
isation process is the identity function. We wonder what would happen if we consider
more sophisticated generalisation processes such as the ones used by machine learning.
Would it be possible to define what is learnable? Would it be possible to formally de-
fine robustness in the adversarial setting? We think that this is an intriguing research
direction and we plan to pursue it.
Acknowledgments The research has been partially supported by the project “Diparti-
menti di Eccellenza 2018-2022” funded by the Italian Ministry of Education, Universi-
ties and Research (MIUR).
References
1. Paul Ammann and Jeff Offutt. Introduction to software testing. Cambridge University Press,
2016.
2. Sebastian Banescu, Christian Collberg, Vijay Ganesh, Zack Newsham, and Alexander
Pretschner. Code obfuscation against symbolic execution attacks. In Proceedings of the
32nd Annual Conference on Computer Security Applications, pages 189–200, 2016.
3. Tim Blazytko, Moritz Contag, Cornelius Aschermann, and Thorsten Holz. Syntia: Synthe-
sizing the semantics of obfuscated code. In 26th USENIX Security Symposium, USENIX
Security 2017, Vancouver, BC, Canada, August 16-18, 2017, pages 643–659. USENIX As-
sociation, 2017.
4. Mariano Ceccato, Massimiliano Di Penta, Paolo Falcarin, Filippo Ricca, Marco Torchiano,
and Paolo Tonella. A family of experiments to assess the effectiveness and efficiency of
source code obfuscation techniques. Empirical Software Engineering, 19(4):1040–1074,
2014.
5. C. Collberg and J. Nagra. Surreptitious Software: Obfuscation, Watermarking, and Tamper-
proofing for Software Protection. Addison-Wesley Professional, 2009.
6. C. Collberg, C. Thomborson, and D. Low. Manufacturing cheap, resilient, and stealthy
opaque constructs. In Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Prin-
ciples of programming languages (POPL ’98), pages 184–196. ACM Press, 1998.
7. Kevin Coogan, Gen Lu, and Saumya K. Debray. Deobfuscation of virtualization-obfuscated
software: a semantics-based approach. In Proceedings of the 18th ACM Conference on
Computer and Communications Security, CCS 2011, Chicago, Illinois, USA, October 17-
21, 2011, pages 275–284. ACM, 2011.
8. Bas Cornelissen, Andy Zaidman, Arie Van Deursen, Leon Moonen, and Rainer Koschke. A
systematic survey of program comprehension through dynamic analysis. IEEE Transactions
on Software Engineering, 35(5):684–702, 2009.
9. P. Cousot. Constructive design of a hierarchy of semantics of a transition system by abstract
interpretation. Theor. Comput. Sci., 277(1-2):47–103, 2002.
10. P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis
of programs by construction or approximation of fixpoints. In Conference Record of the 4th
ACM Symposium on Principles of Programming Languages (POPL ’77), pages 238–252.
ACM Press, 1977.
11. P. Cousot and R. Cousot. Systematic design of program analysis frameworks. In Conference
Record of the 6th ACM Symposium on Principles of Programming Languages (POPL ’79),
pages 269–282. ACM Press, 1979.
12. P. Cousot and R. Cousot. An abstract interpretation-based framework for software water-
marking. In Conference Record of the Thirtyfirst Annual ACM SIGPLAN-SIGACT Sympo-
sium on Principles of Programming Languages, pages 173–185. ACM Press, New York, NY,
2004.
13. M. Dalla Preda and R. Giacobazzi. Semantic-based code obfuscation by abstract interpreta-
tion. Journal of Computer Security, 17(6):855–908, 2009.
14. Mila Dalla Preda and Isabella Mastroeni. Characterizing a property-driven obfuscation strat-
egy. Journal of Computer Security, 26(1):31–69, 2018.
15. S. Drape, C. Thomborson, and A. Majumdar. Specifying imperative data obfuscations. In
ISC - Information Security, volume 4779 of Lecture Notes in Computer Science, pages 299
– 314. Springer Verlag, 2007.
16. Shuitao Gan, Chao Zhang, Xiaojun Qin, Xuwen Tu, Kang Li, Zhongyu Pei, and Zuoning
Chen. Collafl: Path sensitive fuzzing. In 2018 IEEE Symposium on Security and Privacy
(SP), pages 679–696. IEEE, 2018.
17. Craig Gentry and Dan Boneh. A fully homomorphic encryption scheme, volume 20. Stanford
university Stanford, 2009.
18. R. Giacobazzi. Hiding information in completeness holes - new perspectives in code obfus-
cation and watermarking. In Proc. of The 6th IEEE International Conferences on Software
Engineering and Formal Methods (SEFM’08), pages 7–20. IEEE Press., 2008.
19. R. Giacobazzi, N. D. Jones, and I. Mastroeni. Obfuscation by partial evaluation of distorted
interpreters. In O. Kiselyov and S. Thompson, editors, Proc. of the ACM SIGPLAN Symp.
on Partial Evaluation and Semantics-Based Program Manipulation (PEPM’12), pages 63 –
72. ACM Press, 2012.
20. R. Giacobazzi, F. Ranzato, and F. Scozzari. Making abstract interpretation complete. Journal
of the ACM, 47(2):361–416, March 2000.
21. Roberto Giacobazzi, Isabella Mastroeni, and Mila Dalla Preda. Maximal incompleteness as
obfuscation potency. Formal Asp. Comput., 29(1):3–31, 2017.
22. Patrice Godefroid, Nils Klarlund, and Koushik Sen. Dart: directed automated random testing.
In Proceedings of the 2005 ACM SIGPLAN conference on Programming language design
and implementation, pages 213–223, 2005.
23. A. Majumdar, S. J. Drape, and C. D. Thomborson. Slicing obfuscations: design, correctness,
and evaluation. In DRM ’07: Proceedings of the 2007 ACM workshop on Digital Rights
Management, pages 70–81. ACM, 2007.
24. Martı́n Ochoa, Sebastian Banescu, Cynthia Disenfeld, Gilles Barthe, and Vijay Ganesh. Rea-
soning about probabilistic defense mechanisms against remote attacks. In 2017 IEEE Euro-
pean Symposium on Security and Privacy, EuroS&P 2017, Paris, France, April 26-28, 2017,
pages 499–513. IEEE, 2017.
25. Mathilde Ollivier, Sébastien Bardin, Richard Bonichon, and Jean-Yves Marion. How to kill
symbolic deobfuscation for free (or: unleashing the potential of path-oriented protections).
In Proceedings of the 35th Annual Computer Security Applications Conference, pages 177–
189, 2019.
26. Andre Pawlowski, Moritz Contag, and Thorsten Holz. Probfuscation: an obfuscation ap-
proach using probabilistic control flows. In International Conference on Detection of Intru-
sions and Malware, and Vulnerability Assessment, pages 165–185. Springer, 2016.
27. Sebastian Schrittwieser and Stefan Katzenbeisser. Code obfuscation against static and dy-
namic reverse engineering. In International workshop on information hiding, pages 270–284.
Springer, 2011.
28. Sebastian Schrittwieser, Stefan Katzenbeisser, Johannes Kinder, Georg Merzdovnik, and
Edgar R. Weippl. Protecting software through obfuscation: Can it keep pace with progress
in code analysis? ACM Comput. Surv., 49(1):4:1–4:37, 2016.
29. Edward J Schwartz, Thanassis Avgerinos, and David Brumley. All you ever wanted to know
about dynamic taint analysis and forward symbolic execution (but might have been afraid to
ask). In 2010 IEEE symposium on Security and privacy, pages 317–331. IEEE, 2010.
30. Kosta Serebryany. Continuous fuzzing with libfuzzer and addresssanitizer. In 2016 IEEE
Cybersecurity Development (SecDev), pages 157–157. IEEE, 2016.
31. Monirul I. Sharif, Andrea Lanzi, Jonathon T. Giffin, and Wenke Lee. Automatic reverse
engineering of malware emulators. In 30th IEEE Symposium on Security and Privacy (S&P
2009), 17-20 May 2009, Oakland, California, USA, pages 94–109. IEEE Computer Society,
2009.
32. Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana.
Neuzz: Efficient fuzzing with neural program smoothing. In 2019 IEEE Symposium on
Security and Privacy (SP), pages 803–817. IEEE, 2019.
33. Xiaokui Shu, Danfeng Daphne Yao, and Barbara G Ryder. A formal framework for program
anomaly detection. In International Symposium on Recent Advances in Intrusion Detection,
pages 270–292. Springer, 2015.
34. Susan Elliott Sim, Steve Easterbrook, and Richard C Holt. Using benchmarking to advance
research: A challenge to software engineering. In 25th International Conference on Software
Engineering, 2003. Proceedings., pages 74–83. IEEE, 2003.
35. Michael Sutton, Adam Greene, and Pedram Amini. Fuzzing: brute force vulnerability dis-
covery. Pearson Education, 2007.
36. Robert Swiecki. Honggfuzz. Available online at: http://code.google.com/p/honggfuzz,
2016.
37. Norman Wilde, Michelle Buckellew, Henry Page, Vaclav Rajlich, and LaTreva Pounds. A
comparison of methods for locating features in legacy software. Journal of Systems and
Software, 65(2):105–114, 2003.
38. Babak Yadegari, Brian Johannesmeyer, Ben Whitely, and Saumya Debray. A generic ap-
proach to automatic deobfuscation of executable code. In 2015 IEEE Symposium on Se-
curity and Privacy, SP 2015, San Jose, CA, USA, May 17-21, 2015, pages 674–691. IEEE
Computer Society, 2015.
39. Michal Zalewski. Technical "whitepaper" for afl-fuzz. URL: http://lcamtuf.coredump.
cx/afl/technical_details.txt, 2014.