SMT Solvers For Software Security
proving to software is more recent thanks to global analysis techniques such as predicate abstraction [4]. Predicate abstraction is a potentially diverging but automated technique to infer constraints at procedure and loop boundaries. Software model-checking tools based on predicate abstraction (such as SLAM [4]) have brought considerable value for automated analysis of software by uncovering hundreds of software bugs in medium-sized drivers. To address the scalability issues arising from the state space explosion problem induced by model checking, the Houdini algorithm [17] has been devised to answer the problem of monomial predicate abstraction. Houdini is a simple yet powerful technique based on candidate contracts (or may be constraints): the user provides simple constraint templates, and the constraint solver is used in a fixed-point algorithm to determine whether or not those constraints always hold at function or loop boundaries. Houdini is implemented in the Boogie verification framework [24], for which HAVOC is a front-end. While Houdini is a terminating and deterministic algorithm, it is unable to answer existential queries, as candidate constraints are only persisted when they hold in every function context. Thus, Houdini cannot be used to answer the following question: is it feasible for parameter p to hold value v in some context? On the other hand, it can answer questions such as: is it provable that parameter p always has value v?

Section 2 illustrates the inference problem on C programs using a simple yet non-trivial loop program based on a Sendmail vulnerability [33], for which the SMT solver does an excellent job at deciding satisfiability of a set of constraints at given program points, but does not provide a mechanism to synthesize the required constraints automatically. While Houdini is able to reason about candidate loop invariants, automatically inferring such complex invariants is out of reach. As such, the constraints fed to the solver are often provided by an expert analyst or generated using limited strategies.

1.2 Exploit generation

Exploit generation is a more recent field of study than vulnerability checking. Work so far has primarily fallen into two categories: attempts at automated generation of inputs aimed at hijacking the control flow of a system [6, 20, 1] and attempts at automating the generation of malicious payloads [29, 34, 16, 32]. Work in the former category has relied on symbolic/concolic execution systems [35, 7, 10, 5] to perform constraint generation. Such systems track the semantic relationship and constraints between symbolic input data and all other bytes in program memory and CPU registers. Using this information it is then possible to generate SMT formulae that ask questions about the potential values of such bytes. Exploit generation systems have utilized this ability to check if data that correspond to potentially sensitive pointers can be controlled by an attacker. So far the problems tackled in this area have been simplified scenarios where commonplace operating system security measures and binary hardening techniques are disabled. The reasons these limitations have been necessary will be discussed in section 3.1.

The research performed on generation of malicious payloads has had more success at solving real world problem instances. For the purposes of this article we define the payload of an exploit to be code executed once the control flow of an application has been hijacked. The payloads generated as part of exploitation research have so far been sequences of return-oriented programming (ROP) gadgets. A gadget is a sequence of instructions, within a shared library or executable in the target program, that performs some useful computation and ends by transferring control flow to the next gadget in the sequence. A collection of such gadgets is usually chained together to accomplish a specific task, such as changing permissions of a memory segment, copying in a second stage payload and then executing that second stage. This approach to executing malicious code is necessary to deal with protection mechanisms that prevent one from executing code that has been placed on the stack or heap. By instead executing sequences of instructions within the program’s code this mechanism can be avoided and sometimes disabled entirely.

SMT solvers have been used as the reasoning component of systems that prove functional equivalence between a desired computation and a sequence of instructions [34]. They have also formed part of end-to-end ROP compilers [16, 32] that attempt to automatically chain together sequences of such gadgets so that the sequence is semantically equivalent to a model payload. These systems usually incorporate an SMT solver as a small part of a larger set of algorithms. Section 3.2 discusses both approaches and their integration of SMT solvers. As with many other successful applications of SMT solvers, there is a focus on reducing the number of queries that must be made and on pre-processing the input to a solver through other algorithms in order to reduce the complexity of each query.

1.3 Copy protection analysis

In the domain of copy protections, we consider the two problems of equivalence checking of obfuscated programs and automated cryptanalysis. We find, similarly to the aforementioned domains of vulnerability checking and exploit generation, that the main issue faced is not so much feeding a constraint system into an SMT solver, but rather how to generate the constraint system from the program (and which constraint system to generate) in the first place. For our application of equivalence checking to verify the proper working of a deobfuscator for a virtualization obfuscator, the use of an SMT solver is easily feasible. If we wish to apply the same methodology to simply verifying that a virtualized program is equivalent to its original version, we run into issues surrounding input-dependent branches in the obfuscated version. For our application of automated cryptanalysis, problems exist surrounding how properly to model code with input-dependent branches, particularly input-dependent loops over unbounded quantities. Other program-analytic techniques such as invariant inference can furnish solutions to these problems, but they fall under the inference phase and are orthogonal to the actual solving of the system.

2 SMT in vulnerability checking

In this section, we use the verification tool HAVOC [21] (a heap-aware verifier for C and C++ programs) to translate a program to the intermediate form Boogie [24], which then calls the SMT solver Z3 [26] to distinguish a vulnerable program from a non-vulnerable one. For conciseness, we do not make explicit the step of transforming the source code into an intermediate representation (IR) and go straight to the construction of the formula. The Boogie IR is based on the Static Single Assignment form (SSA [13]), which makes it easier to construct the final formula passed to the solver. We use the code snippet [15] in Figure 1, a simplified version of the Sendmail crackaddr vulnerability [33] published by Mark Dowd in 2003. This example contains a non-trivial loop program parsing an untrusted string parameter.

     1: #define BUFFERSIZE 25
     2:
     3: int copy_it(char *input)
     4: {
     5:     char lbuf[BUFFERSIZE];
     6:     char c, *p = input, *d = &lbuf[0];
     7:     char *ulimit = &lbuf[BUFFERSIZE-10];
     8:     int quotation = FALSE;
     9:     int rquote = FALSE;
    10:
    11:     memset(lbuf, 0, BUFFERSIZE);
    12:
    13:     while((c = *p++) != '\0')
    14:     {
    15:         if ((c == '<') && (!quotation))
    16:         {
    17:             quotation = TRUE;
    18:             ulimit--;
    19:         }
    20:         if ((c == '>') && (quotation))
    21:         {
    22:             quotation = FALSE;
    23:             ulimit++;
    24:         }
    25:         if (c == '(' && !quotation && !rquote)
    26:         {
    27:             rquote = TRUE;
    28:             // FIX: insert ulimit--; here
    29:         }
    30:         if (c == ')' && !quotation && rquote)
    31:         {
    32:             rquote = FALSE;
    33:             ulimit++;
    34:         }
    35:         if (d < ulimit)
    36:             *d++ = c;
    37:     }
    38:     if (rquote)
    39:         *d++ = ')';
    40:     if (quotation)
    41:         *d++ = '>';
    42: }

Figure 1: Essence of the Sendmail crackaddr vuln.
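The effect of the line marked FIX in Figure 1 can be observed by replaying the loop on integer offsets instead of pointers. The following is only a sketch under our own assumptions: the function name copy_it_offsets and the fixed flag are hypothetical, d and ulimit are modelled as offsets from lbuf, and only the loop body (lines 13 to 37) is simulated.

```python
BUFFERSIZE = 25


def copy_it_offsets(data: str, fixed: bool = False):
    """Replay the loop of Figure 1 on integer offsets into lbuf.

    Returns (max_write_offset, final_ulimit_offset); max_write_offset is the
    highest index of lbuf written inside the loop, or -1 if nothing is written.
    The `fixed` flag (our assumption) enables the decrement proposed at line 28.
    """
    d = 0                        # offset of pointer d from lbuf
    ulimit = BUFFERSIZE - 10     # offset of pointer ulimit from lbuf (line 7)
    quotation = False
    rquote = False
    max_write = -1
    for c in data:
        if c == '<' and not quotation:
            quotation = True
            ulimit -= 1
        if c == '>' and quotation:
            quotation = False
            ulimit += 1
        if c == '(' and not quotation and not rquote:
            rquote = True
            if fixed:
                ulimit -= 1      # line 28 enabled
        if c == ')' and not quotation and rquote:
            rquote = False
            ulimit += 1
        if d < ulimit:
            max_write = d        # models *d++ = c on line 36
            d += 1
    return max_write, ulimit


if __name__ == "__main__":
    hostile = "()" * 40
    print("without fix:", copy_it_offsets(hostile))
    print("with fix:   ", copy_it_offsets(hostile, fixed=True))
```

On an input made of repeated "()" pairs, the unfixed loop lets the limit drift upward by one per pair, so the highest written offset ends up beyond BUFFERSIZE, whereas with the decrement enabled the limit stays at or below its initial offset.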
A buffer overflow vulnerability exists at line 36 due to a missing decrement of the ulimit variable. The correct fix for this loop is to enable such a decrement at line 28. At first sight, this loop seems rather complex to verify. A few key remarks about the structure of the loop are fundamental in understanding the behavior of this function. First, the function contains two state variables, quotation and rquote, corresponding to the value of the currently processed character, held in variable c. The two state variables hold value False at the initial state of the loop. Only a small number of combinations of state values are possible due to the fact that the inner conditionals in the loop are mutually exclusive, since the value of variable c does not change within the same iteration of the loop. The second fundamental remark on this loop is about the ulimit variable. The initial value of this variable, as assigned on line 7, points to offset 15 of the local buffer (since BUFFERSIZE equals 25). Depending on the processed input string, this value can either be incremented or decremented. It is possible to model the expected behavior of the loop using a finite state automaton corresponding to the expected and valid results of its computations (with line 28 enabled). Here, we model the state of the loop as a triple of type (bool,bool,int) corresponding to the value of variables quotation, rquote and the value of ulimit - lbuf, as shown in Figure 2.

Note that ulimit is a pointer variable and not an integer offset, thus the numerical value in the third component of the triple should in fact be read lbuf + num where num is the relative offset from the beginning
of the buffer. We represent this value by only displaying num in the automaton for conciseness. The absence of line 28 in the studied loop introduces a problematic transition in this automaton, since the ulimit variable is not bounded anymore. It is then possible to assign a value to ulimit that is big enough to trigger an out-of-bounds write access to the local buffer on line 36. We can formalize a logical representation corresponding to this automaton by ignoring transitions and only retaining state values. The automaton then corresponds to the following formula P:

      (ulimit = lbuf + 15 ∧ quotation = F ∧ rquote = F)
    ∨ (ulimit = lbuf + 14 ∧ quotation = T ∧ rquote = F)
    ∨ (ulimit = lbuf + 14 ∧ quotation = F ∧ rquote = T)
    ∨ (ulimit = lbuf + 13 ∧ quotation = T ∧ rquote = T)

Figure 2: Automaton corresponding to loop in Fig. 1. States are triples (quotation, rquote, num): (F, F, 15), (T, F, 14), (F, T, 14) and (T, T, 13). Transitions: (F, F, 15) on '<' to (T, F, 14); (T, F, 14) on '>' to (F, F, 15); (F, F, 15) on '(' to (F, T, 14); (F, T, 14) on ')' to (F, F, 15); (F, T, 14) on '<' to (T, T, 13); (T, T, 13) on '>' to (F, T, 14).

The technique used by SMT solvers to verify such an invariant is known as proof by induction. In a nutshell, a proof by induction involves two steps:

1. Prove that the formula holds for the base case (at the entry point of the loop): P(0)

2. Prove that if the formula holds at iteration n of the loop, then it also holds at the next iteration: ∀n : P(n) ⇒ P(n + 1)

P(0) means invariant P holds at the entry state while P(n) means that P holds at the nth iteration. Such a formula is indeed inductive and easily solved by an SMT solver. We now give the full version of this proof.

Proof. The proof involves analysis of all possible loop states and valid transitions between states. It is trivial to notice that the formula indeed holds at the entry point of the loop since the entry state (ulimit = lbuf + 15 ∧ quotation = F ∧ rquote = F) exactly corresponds to the value of the variables at the loop entry. Thus P(0) holds.

The next step consists of the following: assuming that the loop is in one of the states described by the invariant at the beginning of the iteration, does the loop remain in a state described by the invariant? There are four cases to consider (assuming the loop starts in one of the four states) and four sub-cases for each case (assuming we take one of the four available transitions, corresponding to one of the four conditionals of the loop). In practice, not all transitions are available from all states. As such, the proof will be smaller than unrolling 16 different cases.

1. Assume that the loop iteration starts in state (ulimit = lbuf + 15 ∧ quotation = F ∧ rquote = F).
   (a) The loop enters state (ulimit = lbuf + 14 ∧ quotation = T ∧ rquote = F) if it executes the first conditional (lines 15-19)
   (b) The loop enters state (ulimit = lbuf + 14 ∧ quotation = F ∧ rquote = T) if it executes the third conditional (lines 25-29)
   (c) No other conditional can be entered from such entry state.

2. Assume that the loop iteration starts in state (ulimit = lbuf + 14 ∧ quotation = T ∧ rquote = F).
   (a) The loop enters state (ulimit = lbuf + 15 ∧ quotation = F ∧ rquote = F) if it executes the second conditional (lines 20-24)
   (b) No other conditional can be entered from such entry state.

3. Assume that the loop iteration starts in state (ulimit = lbuf + 14 ∧ quotation = F ∧ rquote = T).
   (a) The loop enters state (ulimit = lbuf + 15 ∧ quotation = F ∧ rquote = F) if it executes the fourth conditional (lines 30-34)
   (b) The loop enters state (ulimit = lbuf + 13 ∧ quotation = T ∧ rquote = T) if it executes the first conditional (lines 15-19)
   (c) No other conditional can be entered from such entry state.

4. Assume that the loop iteration starts in state (ulimit = lbuf + 13 ∧ quotation = T ∧ rquote = T).
   (a) The loop enters state (ulimit = lbuf + 14 ∧ quotation = F ∧ rquote = T) if it executes the second conditional (lines 20-24)
   (b) No other conditional can be entered from such entry state.

Thus, all possible states of the loop are correctly captured by the invariant. In other words, ∀n : P(n) ⇒ P(n + 1)
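The case analysis above is small enough to be replayed mechanically. The sketch below is not an SMT query and is not the HAVOC/Boogie/Z3 tool chain: it swaps in plain exhaustive enumeration over the abstract states (quotation, rquote, ulimit - lbuf), which is sufficient here because the state space is finite. The names step, counterexamples, is_inductive and the fixed flag are our own, and 'x' is an assumed stand-in for any character other than the four delimiters.

```python
# Invariant P as a set of (quotation, rquote, ulimit - lbuf) triples.
P = {(False, False, 15), (True, False, 14), (False, True, 14), (True, True, 13)}

ENTRY_STATE = (False, False, 15)
ALPHABET = "<>()x"   # 'x' stands for any character other than < > ( )


def step(state, c, fixed=True):
    """Apply the four conditionals of Figure 1 (lines 15-34) to one state."""
    quotation, rquote, ulimit = state
    if c == '<' and not quotation:
        quotation, ulimit = True, ulimit - 1
    if c == '>' and quotation:
        quotation, ulimit = False, ulimit + 1
    if c == '(' and not quotation and not rquote:
        rquote = True
        if fixed:
            ulimit -= 1          # line 28 enabled
    if c == ')' and not quotation and rquote:
        rquote, ulimit = False, ulimit + 1
    return (quotation, rquote, ulimit)


def counterexamples(invariant, fixed=True):
    """List (state, char, successor) triples whose successor leaves the invariant."""
    return [(s, c, step(s, c, fixed))
            for s in sorted(invariant)
            for c in ALPHABET
            if step(s, c, fixed) not in invariant]


def is_inductive(invariant, fixed=True):
    """Base case P(0) plus the inductive step P(n) => P(n + 1)."""
    return ENTRY_STATE in invariant and not counterexamples(invariant, fixed)


if __name__ == "__main__":
    print("line 28 enabled: ", is_inductive(P, fixed=True))
    print("line 28 disabled:", counterexamples(P, fixed=False))
```

With line 28 disabled, the only transition reported is the one leaving (F, F, 15) on '(' and reaching (F, T, 15), a state outside P. A solver-backed check would encode the same transition relation symbolically rather than enumerate it.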
On the other hand, the formula is violated when line 28 is absent from the loop. This shows that, when correctly queried, the solver is able to differentiate between a correct program and an invalid program even when subtle conditions are modelled. However, the loop invariant has to be provided manually. To our knowledge, there is no tool available to the public that would be able to infer such conditions automatically. Abstract interpretation techniques [12] based on control-flow partitioning [25] have been shown to be useful for synthesizing loop invariants, but synthesis of such complex invariants seems out of reach due to the over-approximation employed by abstract interpretation to keep full automation. Another strategy to infer simple loop invariants is to generate the candidates using invariant synthesis strategies based on simple grammars. Such an approach has been used to perform runtime invariant synthesis and discover likely invariants [27] in software programs based on witnessed executions. We believe that this latter approach would not bring the desired result on this example since the invariant is not respected in the presence of a security vulnerability and thus would not be discovered by executing the program trace on which the vulnerability is to be found.

In order to illustrate the inference problem better, let us try to verify a more abstract formula that is a relaxation of the real invariant. Such a simpler formula is an interesting candidate invariant for the loop as it contains fewer sub-formulae and thus is more likely to be generated automatically.

      (ulimit = lbuf + 15 ∧ quotation = F ∧ rquote = F)
    ∨ (ulimit = lbuf + 14 ∧ (quotation = T ∨ rquote = T))
    ∨ (ulimit = lbuf + 13 ∧ quotation = T ∧ rquote = T)

This second formula corresponds to the automaton in Figure 3. We do not indicate the input vocabulary of this second automaton as it does not correspond to a concrete representation of the loop, but a candidate abstraction of it. Unfortunately, this second invariant is not provable due to the introduced spurious state (T,T,14), which is not a real behavior of the loop. When starting the loop in such a spurious state and executing the second conditional of the loop (lines 20 to 24 in Figure 1), another spurious state (F,T,15) can be reached. Such a state violates invariant 2. Thus, invariant 2 does not hold at every iteration of the loop. This failed example shows the difficulty of abstracting information from an invariant without losing soundness.

Figure 3: Simpler automaton that fails to model loop. Nodes shown: (F, F, 15); (T, F, 14) ∨ (F, T, 14) ∨ (T, T, 14); (T, T, 13); (F, T, 15).

The case study of this section illustrates well the limits of SMT solvers in the absence of a proper constraint generation engine to feed them. It is possible to verify code invariants as long as those invariants are provided by the developer. Yet, automated analysis of such loop constructs is extremely challenging when no input from the developer is available, since generating the expected contract from a piece of code is not the role of the solver, and existing inference techniques are usually unable to cope with such complex loop invariants. Fortunately, there also exist many other properties [2] [23] [38] for which the contracts can be more easily guessed. We will see in the next section that such limitations are not specific to the scenario of vulnerability discovery.

3 SMT in Exploit Generation

Since 2008 there have been a number of papers [6, 20, 1] in which attempts have been made to develop systems for Automatic Exploit Generation (AEG) that rely on an SMT solver for constraint solving. This early work generally took the definition of an exploit to be an input to a program that, through leveraging memory corruption of some kind, results in the hijacking of the program counter and the execution of attacker controlled code. At their core, these systems are extensions of the input generation techniques that have successfully been applied to vulnerability detection [19, 18, 8, 7].

While they have had limited success in synthesizing exploits for simple vulnerabilities, with relaxed operating system security measures, there is a large theoretical and practical gap still to be bridged before they are applicable to real world problems. In this section we will explain how this gap results from primitive modeling of the
problem domain, rather than a limitation of SMT-based technologies. We will also discuss some of the more successful applications of SMT-based technologies to problems found in exploit development.

3.1 Restricted-Model Exploit Generation

Assume that we have a standard symbolic/concolic execution environment, such as S2E [10], BAP [5], TEMU [35], KLEE [7] or those described elsewhere [20, 19]. With any of these systems we can pause symbolic execution and for any memory address or register identifier retrieve the path condition for the data at that location. The path condition pc is a logical formula describing the constraints and manipulations performed on that data between its introduction from an attacker controlled source and the current point of execution. Consider the following sample x86 assembly code, with the assumption that the byte in the AL register is under attacker control:

    0: add al, al
    1: sub al, 0x0f
    2: test al, al
    3: jz 5
    4: ...
    5: jmp 7
    6: ...

If we represent the input byte as b0, and create a new variable bn on each write to a variable, then at address 5 (reached when the jz is taken) the path condition for the byte in AL is the following conjunction of clauses

    b1 = b0 + b0 ∧ b2 = b1 − 15 ∧ b2 = 0

whereas at address 4 the path condition is

    b1 = b0 + b0 ∧ b2 = b1 − 15 ∧ b2 ≠ 0

One can then use an SMT solver to ask queries about the states represented by these formulae by appending constraints and looking for satisfying assignments. For example, if we wanted to check at address 4 whether the value 11 can be in the AL register we would create the formula:

    b1 = b0 + b0 ∧ b2 = b1 − 15 ∧ b2 ≠ 0 ∧ b2 = 11

An SMT solver will then return a satisfying assignment, if one exists, such as b0 = 13 in this case.

Effectively, the sum total of knowledge possessed by the system can be expressed as a map K from a collection of register/memory identifiers (i0, i1, ..., in) to a path condition for each (pc0, pc1, ..., pcn). K has type K : I → F. I is the union of the set of register identifiers, one for each register and subregister, with the set of identifiers for all valid memory addresses, one for each address. F is the set of closed quantifier-free formulae over the theory of fixed-size bitvectors.

Exploit generation systems up to now have relied entirely on K in conjunction with a set of ad-hoc exploit templates to extend the work performed for input generation to produce exploits. We will refer to this approach as restricted-model exploit generation. The lack of success of such systems in tackling non-trivial exploits can be directly attributed to the restricted model of the execution environment used. Before discussing why this is, let us first define what we mean by an exploit template and then look at the two typical approaches to AEG.

3.1.1 Exploit Templates

A set of exploit templates T is a collection of algorithmic descriptions of basic strategies for taking advantage of vulnerabilities that meet a set of criteria, in order to execute malicious code. A template t ∈ T will take as input K and produce an SMT formula f. A satisfying assignment for f will be an exploit for P if the model encoded in f accurately models the constraints imposed on program inputs and any other relevant program state and environment properties. An AEG system will typically include several of these templates and select among them based on information derived from K and other information available about the type of vulnerability. As an example, if the AEG system detects, based on K, that on the execution of a ret instruction the memory pointed to by the ESP register is attacker controlled, it would likely select a template that expresses the following constraint:

    K(m) = v0 ∧ K(m + 1) = v1 ∧ K(m + 2) = v2 ∧ K(m + 3) = v3

where m is the value in the ESP register and (v0, v1, v2, v3) are each values in the range 0-255 specifying the address we wish to redirect control flow to. Overall the strategies encoded in exploit templates so far have been rudimentary. Due to the fact that the only information source for accurate constraints on the program’s state is K, the templates can only generate formulae that ask questions about the possible value ranges of bytes for which we have path conditions. Complex exploit strategies, such as those usually required to deal with modern binary hardening and OS protection mechanisms, require one to reason about a far richer domain than that described by K, for example the state of the heap and its relationship with user input. Similarly, the exploitation of other vulnerability categories requires further abstractions and models to be introduced into the
symbolic/concolic execution phases of AEG. Use-after-free issues, for example, require both a description of the heap state and also higher level abstractions that entail objects and their allocation status.

3.1.2 AEG with Restricted Models

Two similar mechanisms have been used for AEG, both of which can be classed as restricted-model exploit generation. In the first [20], we start with a program P and an input I that we know to be bad. For example, I might have been produced by a fuzzer and cause P to crash. Under this approach we execute P(I) within a concolic execution environment up until the point where a crash would occur during normal execution. At this point, K is available and we also have information on the cause of the crash, e.g. we know if it was due to an attempt to execute, read or write invalid memory.

Such a system will also include a library of exploit templates T as described above. Using K and the knowledge of the cause of the crash, one or more templates can be selected and a set of formulae is generated.

The second approach integrates the vulnerability checking process with AEG [1]. Under this approach a symbolic/concolic execution environment is used to execute the program on symbolic input data. A safety predicate φ is invoked at particular program points to check whether a potentially unsafe operation is occurring. For example, on a memcpy call the safety predicate might check that the size argument is sufficiently restricted to prevent an overflow of the destination buffer. Once a safety predicate returns false this approach again has access to K, but with the possibility of checking multiple possible exploit scenarios that depend on the value of elements of K.

For example, on a vulnerable memcpy assume that the destination buffer is n bytes in size but the size parameter can be m bytes, with m > n. Then there are m − n possible lengths that would violate the safety property. Under the first approach a bad input provides a single violation of the safety property. In this case, each different input length n < l ≤ m results in the corruption of a different amount of data. Thus, for each value of l a set of formulae may be generated using K and T. Depending on the value of l the data corrupted could lead to different possibilities for exploitation, e.g. corrupting 4 bytes might lead to control of a pointer used as the destination of a write, while corrupting 12 might lead to control of a function pointer. The second approach can therefore generate more exploit candidates using a wider variety of templates. It is important to note though that each of these candidates is generated using K and T and is thus limited as previously described.

Conceptually, both of these approaches are quite close to the technology required for vulnerability detection systems based on symbolic/concolic execution. As such it is instructive to ask: why does this approach succeed for vulnerability detection in real world scenarios but fail for exploit generation in real world scenarios? The answer can be found by considering the accuracy of the model used in reasoning with respect to the problem being modeled. Successful vulnerability detection systems have found a wealth of vulnerabilities as a result of unsafe arithmetic within programs. The information required to accurately decide whether a sequence of instructions is unsafe or not, in this context, is contained within the path condition for the output bytes. In this case the model K is quite close to an ideal model for deciding questions of the problem domain.

If we ignore binary hardening, such as stack canaries, and OS security measures, such as address space layout randomization (ASLR), no-execute (NX) permissions on memory regions and more secure memory allocators, then AEG is also a tractable problem using the model K for certain vulnerability classes. In this environment the primary factor impacting the success or failure of an exploit is the manipulations and constraints imposed by the executed instructions. As these factors are modeled by K, it is possible for a template to create an SMT formula that accurately describes the requirements for a working exploit.

Some AEG systems account for certain protection mechanisms, such as limited ASLR without NX [20] and NX with limited ASLR [32], in conjunction with support for a limited set of vulnerability types, such as stack-based overflows without functional stack hardening. The restrictions imposed by these systems on the problems they can handle effectively reduce the state of the exploited program’s environment to one that is sufficiently accurately modeled by K.

The accuracy of K as a model rapidly deteriorates once one begins to consider correctly implemented protection mechanisms as found in modern Linux and Windows operating systems. It also deteriorates once one begins to consider exploitation of vulnerabilities that require the manipulation of environmental factors such as the heap layout. In these scenarios a useful model must account for the effects of user input on the memory layout.

3.1.3 The Future of AEG

In order to become a practical solution much work remains to be done in AEG. SMT-backed approaches have shown promise but the constraint generation phase still restricts the applicability of these systems to modern software and operating systems. In particular the model of the program and its environment must be improved.
Alongside this, it is worth considering the changes that will have to be made to the template driven approach to AEG in order to fit in with the trend towards more application and vulnerability-specific exploitation techniques.

Closing the gap between the model of a system and the system can be done in two ways. The first is the approach that has been taken so far with AEG, to reduce the complexity of the system until the model is sufficiently accurate. The second is to increase the sophistication of the model so that it entails more information on the system.

The reason the former approach was taken in the early work on AEG is straightforward; it is relatively cheap to extend previous work on symbolic/concolic execution to produce the model K. It is apparent that this model is an unrealistic abstraction of the state that must be taken into account for modern exploit generation. This can be seen if we consider the lack of any information on the relationship between the program’s input and the layout of heap memory. Without such information we cannot generate constraints for which a satisfying assignment can manipulate the heap accurately. As a result, we cannot perform reliable AEG for any vulnerabilities that may be impacted by heap randomisation. This includes heap overflows but also use-after-free vulnerabilities, which are the most prolific form of security flaw in web browsers.

When considering future directions for AEG it is important to look at the latest developments in manual exploit creation. For quite some time, the era of generic exploitation techniques that take advantage of obvious flaws in allocators and protection mechanisms has been drawing to a close. While there will always be exceptions, it is more common than not for exploits to leverage application and vulnerability-specific methods to avoid protection mechanisms rather than attempt to defeat them. For example, an exploit for a heap overflow is far more likely to be successful if a controlled overwrite can be made to a pointer value within the same chunk that will later be called than if it attempts to corrupt heap metadata.

Modern exploits are also far more likely to leverage information leakage attacks, a topic that has so far received no attention in terms of AEG. While many may consider information leakage to result from different vulnerability types, it is common for certain vulnerabilities, such as use-after-free, double-free etc., to be leveraged for both information leakage and code execution.

Both of these issues combined call into question the potential of template driven AEG to be sufficiently general to be useful. We consider it likely that once sufficiently accurate models are available it will be more useful to allow user-driven constraint generation based on their knowledge of exploitation, the application and the vulnerability in combination with more limited general

3.2 Automated Payload Creation

While AEG systems have attempted to automate the control flow hijacking part of exploitation there has also been research on the application of SMT-based systems to the generation of ROP payloads [29, 34, 16, 32]. These systems have been developed to free an exploit developer from the tedious process of poring over the potentially hundreds of thousands of candidate gadgets that may be found within a large binary.

As mentioned in the introduction, the approaches taken have fallen into two categories: those that attempt to prove equivalence between a single gadget at a time and a model of some computation we wish to perform [34], and later systems that have attempted to provide a full compiler that can assemble multiple gadgets in sequence to achieve this computation [16, 32].

In the former approach, we first collect every valid sequence of instructions in the binary that ends in an instruction that can successfully transfer execution to the next gadget in sequence, e.g. a ret instruction if the gadget addresses are provided at the location pointed to by ESP. The system takes as input these candidate gadgets and the specification of a computation s we require to be performed, e.g. ESP <- EAX + 8, which specifies we are looking for a gadget that puts the value stored in the EAX register plus 8 in the ESP register.

First, the system will create an SMT representation s_smt of the computation specification. In other words, it will convert the specification to an SMT formula. The system will then perform a number of heuristic, but sound, reductions of the candidate gadget set, e.g. eliminating any gadgets that neither read EAX nor write ESP. At this point the system will iterate over the remaining candidate gadgets C and for each gadget g ∈ C create the conjunction of a set of formulae that express the semantics of g. For each g ∈ C the formula g ⇔ s_smt is then created and checked for validity. If the formula is valid it means that under all interpretations of the variables in g and s_smt their semantics are equivalent. This tells us that the gadget can be used to express the computation we require. If the formula is satisfiable but not valid it means that the gadget may work under some interpretations but may not under others. This would not be a desirable property of a component in a reliable exploit.

This approach can be quite useful in quickly discovering gadgets for simple computation specifications, such as the example given. However, such a system can only check if there is an exact correspondence between one gadget and the specification. Ideally, we would like to check whether the computation can be performed by chaining together n gadgets if necessary. This is often a requirement once our specification requires more com-
and application specific templates. plex data movement or arithmetic. A collection of m
gadgets can potentially be arranged in m! different ways. Discovering the
most useful potential combinations and then reasoning about their semantics
requires a combination of heuristics and formula solving.

Systems that use SMT technology for different purposes have been developed
and successfully applied to this problem. In [16], an SMT solver was used to
reason about gadgets that contained branches. For example, if the gadget
contained a branch to an instruction sequence that might result in a crash,
a solver would be employed to check whether it is possible that the branch
is never taken given the gadget’s semantics. In [32], an SMT solver was used
to look for arrangements of gadgets that meet the requirements of the
computation specification.

Both the single-gadget and gadget compiler approaches have been successful
at alleviating a certain amount of manual effort in the process of ROP
payload creation. In both approaches we can see the pattern that exists
throughout most successful integrations of SMT technology: minimizing the
number of queries that must be made to a solver, reducing the problem space
through the use of less computationally expensive algorithms, and ensuring
the constraints generated are a sufficiently accurate model of the problem
being reasoned about.

4 SMT in protection analysis

Software protection analysis is critically important in dealing with
malware, since most samples employ some sort of packing or obfuscation
technique in order to thwart analysis. It is also an area of economic
concern in protecting digital assets from piracy and intellectual property
theft. We present several areas in which SMT solvers have been practically
applied towards these problems.

4.1 Equivalence checking for verification of deobfuscation results

Virtualization obfuscators [31] are an especially complex category of
software protection tools that are commonly abused by malware. These tools
work by converting portions of the program-to-be-protected’s x86 machine
language into a randomly-generated language that is then executed at runtime
in an interpreter, which itself is also randomly-generated and obfuscated.
The original x86 machine code in these regions is then overwritten. Code is
generated within the binary such that, at run-time, when the program goes to
execute the protected code, the register state is saved to some location
(e.g., onto the stack), the interpreter executes, the register state is
reloaded, and then the unprotected portion of the program executes normally.
The end result is that the malware analyst must either possess a tool to
invert the translation, or reverse engineer the code as it is running inside
of the interpreter (rather than in the form to which he or she is more
accustomed, viz., x86 code).

Such tools make life difficult for the reverse engineer, and they are also
quite complex for the protection author to construct. The translation must
capture precisely the semantics of the instructions under consideration,
otherwise the virtualized program has the potential to produce different
behaviors than the original, which (by explicit design goal) would be very
difficult for the developer to diagnose. Given this complexity, it is not
surprising that these tools might have bugs in the form of improper
translations. Similarly, if the analyst were to construct an inversion tool
to deobfuscate a virtualization obfuscator, the complexity of such a tool
could easily lead to bugs in the form of improper deobfuscation.

Equivalence checking is a well-known technique for verifying the equivalence
of two pieces of code. The simplest case is when the code snippets are
straight-line (i.e. branchless). As an example, consider the C programs
x0 = y + y; and x1 = y << 1;. Both of these programs logically encode the
notion of doubling the variable y and storing the result in some other
variable (since shifting left by one corresponds to multiplying by two). To
determine whether these sequences produce equivalent results, we encode them
as SMT formulae and then query the decision procedure for the condition
x0 ≠ x1. If this formula is satisfiable, the SMT solver will return a
counterexample, namely, a value of y for which the sequences differ. If this
formula is unsatisfiable, this is a proof (assuming the soundness of the
solver) that the two sequences always produce the same output, given the
same input.

To apply this procedure to two branchless sequences of x86 instructions, we
convert both sequences to our intermediate representation, then put both
sequences in Static Single Assignment (SSA [13]) form, convert the SSA
version of the IR to SMT formulae, and query the decision procedure as to
whether the output variables (i.e., flags, registers, and memory) can ever
differ (i.e., eax_seq1 ≠ eax_seq2 ∨ ebx_seq1 ≠ ebx_seq2 ∨ ...). To compare
the contents of memory in this way, the solver must support the theory of
extensional arrays (which many modern solvers fortunately do).

SMT-backed equivalence checking provides a powerful primitive for ensuring
the correctness of a deobfuscation procedure on branchless sequences. One
simply generates some branchless program that falls within the purview of
the virtualization obfuscator, obfuscates it, deobfuscates it, and uses
equivalence checking to compare the resulting code against the original
code.

Applying this procedure helped discover potentially
incorrect translations in the Themida CISC VM [37] after having constructed
a deobfuscation procedure for this protection. The virtualization of certain
instructions such as ror and inc did not take some of the subtleties of
those instructions into account. In the case of ror and similar
instructions, the Intel manuals dictate that these instructions do not
modify the flags if the count operand is zero. Therefore, improper
maintenance of the flags prior to the execution of these instructions could
cause the flags to take different values in the obfuscated and deobfuscated
versions. Similarly, inc does not modify the carry flag, so any modification
to this flag induced by the obfuscator before the instruction executes would
result in incorrect machine state.

Figure 4 illustrates a more subtle example of potential incorrectness in
translation. This (deobfuscated) instruction sequence loads an address
(stored in enciphered form) from the memory location pointed at by the esi
register, deciphers the address in the next four instructions, then loads a
byte from that address and pushes it onto the stack. Since the ciphering
process is invertible, this code snippet enforces no restriction upon the
range of addresses from which the byte could potentially be loaded.
Therefore, the address could well point onto the stack, below the location
of the current stack pointer. Since the obfuscator introduces many spurious
writes to the stack, the value loaded in the deobfuscated world could differ
from the one in the obfuscated world. This translation error would be
unlikely to result in a runtime error in the real world, but it demonstrates
the exhaustive capabilities of SMT solvers towards the equivalence checking
problem.

    lodsd dword ptr ds:[esi]       ; eax = enciphered address read from [esi]
    sub   eax, ebx                 ; deciphering step
    xor   eax, 7134B21Ah           ; deciphering step
    add   eax, 2564E385h           ; deciphering step
    xor   ebx, eax                 ; fold the deciphered value into ebx
    movzx ax, byte ptr ds:[eax]    ; load one byte from the deciphered address
    push  ax                       ; push it onto the stack

Figure 4: Deobfuscated sequence

4.2 SMT-based input crafting for semi-automated cryptanalysis

When it comes to constructing licensing systems, best practices dictate that
only properly-vetted implementations of trusted cryptographic algorithms be
utilized as part of securely-designed cryptosystems. (Even then, this may be
insufficient to protect against certain types of attacks, such as those
where the attacker is able to replace private keys within a binary or
otherwise patch the program’s logic. As part of the never-ending
cat-and-mouse game in software protection, techniques such as [39] can be
used to obfuscate cryptographic keys, and techniques such as [36]
demonstrate that these techniques are not necessarily infallible.) Many
protection authors unfortunately still do not heed this advice, leading to a
glut of cracked software available on peer-to-peer networks.

SMT solvers can be utilized as a medium for manually modelling licensing
schemes. [14] focuses on one scheme in particular, which is partially
depicted in Figure 5.

    again:
        lodsb              ; al = next byte of the input at [esi]
        sub   al, bl
        xor   al, dl
        stosb              ; store al to the output at [edi]
        rol   edx, 1
        rol   ebx, 1
        loop  again        ; repeat while the ecx counter is non-zero

Figure 5: The main loop of the serial algorithm

First, the authors manually cryptanalyze the protection from an algebraic
standpoint and construct a highly efficient key generator. Next, the authors
demonstrate how to manually model the scheme in terms of an instance of the
SAT problem. Since modern SMT solvers subsume SAT solvers, the scheme can
also be manually modelled in terms of operations within the bitvector
theory, in a manner that is more succinct and natural than the low-level
bitwise manual CNF encoding. For instance, a multiplication operator can be
modelled natively as one term within many solvers, whereas modelling it in
terms of operations upon individual bits (as in a SAT instance) generates
complex circuits. One iteration of the loop (particularly, iteration i) as
shown in Figure 5 can be manually modelled in SMT as follows:
\[
\begin{aligned}
       & al_{0,i} = \mathit{activation\_code}[i] \\
\land\ & al_{1,i} = al_{0,i} - ebx_i[7:0] \\
\land\ & al_{2,i} = al_{1,i} \oplus edx_i[7:0] \\
\land\ & \mathit{output}[i] = al_{2,i} \\
\land\ & edx_{i+1} = \mathit{rotate\_left}(edx_i, 1) \\
\land\ & ebx_{i+1} = \mathit{rotate\_left}(ebx_i, 1)
\end{aligned}
\]

In this formula, activation_code corresponds to the memory region pointed at
by the esi register and is an input to the serial algorithm, output
corresponds to the memory region pointed at by the edi register, and
rotate_left is a built-in function in many SMT solvers for performing
leftward rotation.

The same technology that is used in other problem domains for more
conventional tasks in programming language theory, such as those discussed
above (i.e. vulnerability discovery and test-case/exploit generation), can
also be repurposed for the sake of solving problems such as this one
semi-automatically. The Pandemic binary program analysis framework was
employed [30] to automatically (statically) generate an execution trace of a
run of the algorithm, where the user’s input is treated as free variables.
We then manually constructed the postcondition that the output must satisfy,
and then fed the results to an SMT solver. The inputs derived by the solver
correctly break the scheme. Space considerations force us to refer the
interested reader to [30] for more details of the problem, the system
architecture of Pandemic, and the solutions.

These problems can be attacked purely statically, if the analyst is willing
to invest the time in manually modelling the state required to simulate the
execution of the serial algorithm, or in a concolic fashion (which allows
for greater automation). Static solutions may be preferred when the analyst
wishes to investigate the properties of some piece of code that he or she
might not know how to trigger; a static investigation may inform the analyst
whether such an undertaking is merited (i.e., whether the portion of code
exhibits some vulnerability; if this is not the case, then the broader
problem of driving execution to that location would be fruitless).

This particular problem instance has the nice property that the path that
the algorithm takes is not dependent upon the user’s input. Specifically,
the algorithm consists of a loop that executes for a fixed number of
iterations before comparing the output to a fixed value. Hence, the problem
is easier to solve than what might be the case if the path were
input-dependent, for example, if multiple checks lead to failure cases, or
if the input length were unbounded and the algorithm iterated over it in its
entirety (such conditions could potentially be mitigated through the use of
loop invariants, perhaps automatically-synthesized ones). We emphasize,
however, that there is nothing special about serial algorithms that places
them in a strictly restricted class of the general input-crafting problem:
any type of code (including obfuscated code) with unrestricted programming
language constructs might be utilized to implement a serial check, and in
fact the constraints might even be harder than the ordinary case due to the
prevalence of hard cryptographic operations. Hence, progress towards this
pursuit is tied to progress in binary program analysis and
verification/SMT solvers in general. Nevertheless, these early results are
encouraging.

5 Conclusion

SMT solvers are becoming an integral part of the security engineer’s
toolkit. We presented three applications of SMT solvers: vulnerability
discovery by static analysis, exploit generation (a specialization of input
crafting), and copy protection analysis. In these three applications,
solvers do a remarkable job of assisting the analysts in deciding whether
suggested solutions are valid in their respective problem space. Yet solvers
are not suited for generating domain-specific problem descriptions, as the
preliminary constraint generation step has to be performed outside the
solver. We expect that specialized constraint inference assistants will
improve in the future and help generate formal problem definitions for
non-trivial problems in the area of computer security.

6 Acknowledgements

The authors would like to thank Shuvendu Lahiri and Matt Miller for their
insights on this article.

References

[1] Avgerinos, T., Cha, S. K., Hao, B. L. T., and Brumley, D. AEG: Automatic
exploit generation. In Network and Distributed System Security Symposium
(Feb. 2011), pp. 283–300.

[2] Ball, T., Hackett, B., Lahiri, S. K., Qadeer, S., and Vanegue, J.
Towards scalable modular checking of user-defined properties. In Proceedings
of the Third international conference on Verified software: theories, tools,
experiments (Berlin, Heidelberg, 2010), VSTTE’10, Springer-Verlag, pp. 1–24.

[3] Ball, T., Lahiri, S., and Musuvathi, M. Zap: Automated theorem proving
for software analysis. In Logic for Programming, Artificial Intelligence,
and Reasoning, G. Sutcliffe and A. Voronkov, Eds., vol. 3835 of Lecture
Notes in Computer Science. Springer Berlin / Heidelberg, 2005, pp. 2–22.

[4] Ball, T., Levin, V., and Rajamani, S. K. A decade of software model
checking with SLAM. Commun. ACM 54, 7 (July 2011), 68–76.

[5] Brumley, D., Jager, I., Avgerinos, T., and Schwartz, E. J. BAP: a binary
analysis platform. In Proceedings of the 23rd international conference on
Computer aided verification (Berlin, Heidelberg, 2011), CAV’11,
Springer-Verlag, pp. 463–469.
[6] Brumley, D., Poosankam, P., Song, D., and Zheng, J. Automatic
patch-based exploit generation is possible: Techniques and implications. In
Proceedings of the 2008 IEEE Symposium on Security and Privacy (Washington,
DC, USA, 2008), SP ’08, IEEE Computer Society, pp. 143–157.

[7] Cadar, C., Dunbar, D., and Engler, D. KLEE: unassisted and automatic
generation of high-coverage tests for complex systems programs. In
Proceedings of the 8th USENIX conference on Operating systems design and
implementation (Berkeley, CA, USA, 2008), OSDI’08, USENIX Association,
pp. 209–224.

[8] Cadar, C., Ganesh, V., Pawlowski, P. M., Dill, D. L., and Engler, D. R.
EXE: Automatically generating inputs of death. ACM Trans. Inf. Syst. Secur.
12, 2 (2008).

[9] Chess, B. V. Improving computer security using extended static checking,
2002.

[10] Chipounov, V., Kuznetsov, V., and Candea, G. The S2E platform: Design,
implementation, and applications. ACM Trans. Comput. Syst. 30, 1 (2012), 2.

[11] Clarke, E. Model checking. In Foundations of Software Technology and
Theoretical Computer Science, S. Ramesh and G. Sivakumar, Eds., vol. 1346 of
Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 1997,
pp. 54–56. 10.1007/BFb0058022.

[12] Cousot, P., and Cousot, R. Abstract interpretation: a unified lattice
model for static analysis of programs by construction or approximation of
fixpoints. In Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on
Principles of programming languages (New York, NY, USA, 1977), POPL ’77,
ACM, pp. 238–252.

[13] Cytron, R., Ferrante, J., Rosen, B. K., Wegman, M. N., and Zadeck,
F. K. Efficiently computing static single assignment form and the control
dependence graph. ACM Trans. Program. Lang. Syst. 13, 4 (1991), 451–490.

[14] Dcoder, and andrewl. kaos "toy project" and algebraic cryptanalysis.

[15] Dullien, T. The future of exploitation revisited. Infiltrate
conference (2011).

[16] Dullien, T., Kornau, T., and Weinmann, R.-P. A framework for automated
architecture-independent gadget search. In Proceedings of the 4th USENIX
conference on Offensive technologies (Berkeley, CA, USA, 2010), WOOT’10,
USENIX Association, pp. 1–.

[17] Flanagan, C., and Leino, K. R. M. Houdini, an annotation assistant for
ESC/Java. FME 2001: Formal Methods for Increasing Software Productivity
(2001).

[18] Godefroid, P., Klarlund, N., and Sen, K. DART: Directed automated
random testing. In Proceedings of the 2005 ACM SIGPLAN conference on
Programming language design and implementation (New York, NY, USA, 2005),
PLDI ’05, ACM, pp. 213–223.

[19] Godefroid, P., Levin, M. Y., and Molnar, D. SAGE: Whitebox fuzzing for
security testing. Queue 10, 1 (Jan. 2012), 20:20–20:27.

[20] Heelan, S. Automatic generation of control flow hijacking exploits for
software vulnerabilities. MSc dissertation, University of Oxford, September
2009.

[21] Lahiri, S.K, Q. S. B. S. HAVOC: Heap aware verifier for C and C++
programs. http://research.microsoft.com/en-us/projects/havoc/.

[22] Lahiri, S. K. Unbounded system verification using decision procedures
and predicate abstraction. Tech. rep., PhD thesis, Carnegie Mellon
University, 2004.

[23] Lahiri, S. K., and Vanegue, J. ExplainHoudini: Making Houdini inference
transparent. In Proceedings of the 12th international conference on
Verification, model checking, and abstract interpretation (Berlin,
Heidelberg, 2011), VMCAI’11, Springer-Verlag, pp. 309–323.

[24] Leino, K. R. M. The Boogie 2 project. http://boogie.codeplex.com/.

[25] Mauborgne, L., and Rival, X. Trace partitioning in abstract
interpretation based static analyzers. In Programming Languages and Systems,
M. Sagiv, Ed., vol. 3444 of Lecture Notes in Computer Science. Springer
Berlin / Heidelberg, 2005, pp. 139–139.

[26] Nikolai Bjorner, L. D. M. The Z3 constraint solver.
http://research.microsoft.com/projects/z3/.

[27] Perkins, J. H., and Ernst, M. D. Efficient incremental algorithms for
dynamic detection of likely invariants. SIGSOFT Softw. Eng. Notes 29, 6
(Oct. 2004), 23–32.

[28] Pottier, F., and Remy, D. The essence of ML type inference. In Advanced
Topics in Types and Programming Languages, B. C. Pierce, Ed. MIT Press,
2005, ch. 10, pp. 389–489.

[29] Roemer, R., Buchanan, E., Shacham, H., and Savage, S. Return-oriented
programming: Systems, languages, and applications. ACM Trans. Info. & System
Security 15, 1 (Mar. 2012).

[30] Rolles, R. Semi-automated input crafting by symbolic execution, with an
application to automatic key generator generation.

[31] Rolles, R. Unpacking virtualization obfuscators. In Proceedings of the
3rd USENIX conference on Offensive technologies (Berkeley, CA, USA, 2009),
WOOT’09, USENIX Association, pp. 1–1.

[32] Schwartz, E. J., Avgerinos, T., and Brumley, D. Q: Exploit hardening
made easy. In Proceedings of the USENIX Security Symposium (Aug. 2011).

[33] Sendmail, P. Sendmail release notes for the crackaddr vulnerability.

[34] Sole, P. DEPLIB 2.0. Ekoparty 2010.

[35] Song, D., Brumley, D., Yin, H., Caballero, J., Jager, I., Kang, M. G.,
Liang, Z., Newsome, J., Poosankam, P., and Saxena, P. BitBlaze: A new
approach to computer security via binary analysis. In Proceedings of the 4th
International Conference on Information Systems Security. Keynote invited
paper. (Hyderabad, India, Dec. 2008).

[36] SysK. Practical cracking of white-box implementations.
http://www.phrack.com/issues.html?issue=68&id=8.

[37] Themida, P. Themida. http://www.oreans.com/themida.php.

[38] Vanegue, J. Zero-sized heap allocations vulnerability analysis. In
Proceedings of the 4th USENIX conference on Offensive technologies
(Berkeley, CA, USA, 2010), WOOT’10, USENIX Association, pp. 1–8.

[39] Wyseur, B. White-Box Cryptography. PhD thesis, Katholieke Universiteit
Leuven, 2009.
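
As an illustration of the equivalence query described in Section 4.1, the
following Python sketch checks the two doubling programs against each other.
It is not taken from any of the tools discussed in this article: instead of
an SMT solver it enumerates every value of an assumed 8-bit variable, which
answers the same question (can x0 ≠ x1 ever hold?) only because the width is
tiny. All function names, the width, and the deliberately faulty
rotate-based variant are assumptions made for the example.

```python
from typing import Callable, Optional

WIDTH = 8                    # assumed bit-vector width, small enough to enumerate
MASK = (1 << WIDTH) - 1


def double_by_add(y: int) -> int:
    """x0 = y + y, truncated to WIDTH bits."""
    return (y + y) & MASK


def double_by_shift(y: int) -> int:
    """x1 = y << 1, truncated to WIDTH bits."""
    return (y << 1) & MASK


def rotate_instead_of_shift(y: int) -> int:
    """Hypothetical faulty translation: rotates left where a shift was meant."""
    return ((y << 1) | (y >> (WIDTH - 1))) & MASK


def find_counterexample(f: Callable[[int], int],
                        g: Callable[[int], int]) -> Optional[int]:
    """Return some y with f(y) != g(y), or None if the two always agree.

    A returned value plays the role of a satisfying assignment for the
    query x0 != x1; None plays the role of an "unsatisfiable" answer.
    """
    for y in range(1 << WIDTH):
        if f(y) != g(y):
            return y
    return None


print(find_counterexample(double_by_add, double_by_shift))          # None
print(find_counterexample(double_by_add, rotate_instead_of_shift))  # 128
```

The second call reports y = 128, the smallest input whose top bit is set,
where the rotation wraps a 1 into the low bit and the shift does not.
Enumeration stops being practical at realistic register widths, which is
where a solver with bit-vector support becomes necessary.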
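
The per-iteration constraints given in Section 4.2 for the loop in Figure 5
can also be read as an executable model. The Python sketch below is an
illustration only: it simulates the loop for concrete inputs and then
inverts each iteration by hand to obtain an input that yields a chosen
output, which is the assignment a solver would be asked to find from the
same constraints plus a postcondition on output. The initial ebx and edx
values and the target bytes are hypothetical, since the article does not
state them, and the function names are chosen for the example.

```python
MASK32 = 0xFFFFFFFF


def rotate_left(value: int, count: int = 1) -> int:
    """32-bit leftward rotation, as performed by rol on edx and ebx."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32


def serial_loop(activation_code: bytes, ebx: int, edx: int) -> bytes:
    """Concrete model of the loop: one output byte per input byte."""
    output = bytearray()
    for byte in activation_code:           # lodsb
        al = (byte - (ebx & 0xFF)) & 0xFF  # sub al, bl  (bl is ebx[7:0])
        al ^= edx & 0xFF                   # xor al, dl  (dl is edx[7:0])
        output.append(al)                  # stosb
        edx = rotate_left(edx, 1)          # rol edx, 1
        ebx = rotate_left(ebx, 1)          # rol ebx, 1
    return bytes(output)


def craft_input(target: bytes, ebx: int, edx: int) -> bytes:
    """Invert each iteration to find the input producing `target`.

    Stands in for the solver: the path does not depend on the input,
    so every constraint can be solved for activation_code[i] directly.
    """
    code = bytearray()
    for wanted in target:
        code.append(((wanted ^ (edx & 0xFF)) + (ebx & 0xFF)) & 0xFF)
        edx = rotate_left(edx, 1)
        ebx = rotate_left(ebx, 1)
    return bytes(code)


EBX0, EDX0 = 0x12345678, 0x9ABCDEF0        # hypothetical initial register state
TARGET = b"SMT!"                           # hypothetical required output
crafted = craft_input(TARGET, EBX0, EDX0)
print(serial_loop(crafted, EBX0, EDX0) == TARGET)  # True
```

Manual inversion works here because subtraction and exclusive-or are
individually invertible. For schemes that are harder to invert by hand, the
same forward model would instead be expressed as bit-vector constraints and
handed to a solver.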