Cracklab - Team - Codisasm
ABSTRACT

Fighting malware involves analyzing large numbers of suspicious binary files. In this context, disassembly is a crucial task in malware analysis and reverse engineering. It involves the recovery of assembly instructions from binary machine code. Correct disassembly of binaries is necessary to produce a higher level representation of the code and thus allow the analysis to develop high-level understanding of its behavior and purpose. Nonetheless, it can be problematic in the case of malicious code, as malware writers often employ techniques to thwart correct disassembly by standard tools. In this paper, we focus on the disassembly of x86 self-modifying binaries with overlapping instructions. Current state-of-the-art disassemblers fail to interpret these two common forms of obfuscation, causing an incorrect disassembly of large parts of the input. We introduce a novel disassembly method, called concatic disassembly, that combines CONCrete path execution with stATIC disassembly. We have developed a standalone disassembler called CoDisasm that implements this approach. Our approach substantially improves the success of disassembly when confronted with both self-modification and code overlap in analyzed binaries. To our knowledge, no other disassembler thwarts both of these obfuscation methods together.

Categories and Subject Descriptors
D.4.6 [Security and Protection]: Invasive software

General Terms
Security

Keywords
Disassembler; Malware; Dynamic Analysis; Overlapping Instructions; Self-Modifying Codes

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
CCS'15, October 12–16, 2015, Denver, Colorado, USA.
© 2015 ACM. ISBN 978-1-4503-3832-5/15/10 ...$15.00.
DOI: http://dx.doi.org/10.1145/2810103.2813627.

1. INTRODUCTION

This paper focuses on malicious binary code, and more specifically x86-binaries. Nowadays there are two opposite core problems that we have to face in order to fight malicious binary code. On the one hand, each day a high volume of (executable) files are observed and processed. Google receives more than 300 000 files per day and has a collection of 400 million malware samples. All these files must be analyzed and classified in order to build defenses against malware threats. It is a necessity to devise tools that are able to correctly handle very large collections of machine code. On the other hand, malware is quite often well-crafted software that is heavily protected against analysis. As a result, accurate and automatic malware analysis represents a true challenge. Moreover, the tools currently available are not necessarily well adapted to process large amounts of code, because most of them are designed for reverse engineering and often involve complex computations. The main objective of this paper is to develop methods to disassemble and to construct control flow graphs of binary codes that are robust and able to process large amounts of binary code efficiently.

Disassembly and pitfalls. Disassembly is the first step in the analysis of malware binaries and it is an essential one, as all subsequent steps crucially depend on the accuracy of the disassembly. Indeed, it is from the disassembly of a binary that we can reconstruct the control flow graph (CFG) in order to perform further reverse engineering analysis tasks. It is also from the disassembly that we develop decompilers in order to extract relevant high-level semantic information. However, there are several inherent difficulties in devising a disassembly process. It has been reported [21] that up to 65% of the code is typically incorrectly disassembled. One difficulty is that it is almost impossible to separate machine code from data. Both are mixed in a long sequence of bytes. Instructions such as jmp may skip data, jumping from one piece of code to another one. Moreover, these jumps are not statically predictable. An illustration of this fact is an indirect jump like the instruction jmp eax. To pursue static analysis, it is then necessary to determine the range of values in the register eax, or at least a good approximation of the eax values. It is worth noting that determining the destination of an indirect jump is undecidable, which implies that separating code from data is also an uncomputable task. Most previous work [31, 22, 33, 20, 19] has tried to solve the problem of indirect jumps employing static analysis methods. That said, there are other significant issues. In this paper, we focus on two of them: (i) self-modifying code, and (ii) overlapping instructions. Both obfuscation techniques are designed to protect code against human and automated analysis, and are in fact widespread in malware.

    ; data size
    01006e62  mov ecx, 0x1dc2
    ; ebx is the pointer on the block to decrypt
    ; ebx=0x1005090
    01006e67  inc ebx
    01006e6f  loop: rol byte ptr [ebx+ecx], 0x5
    01006e73  add byte ptr [ebx+ecx], cl
    01006e76  xor byte ptr [ebx+ecx], 0x67
    01006e7a  inc byte ptr [ebx+ecx]
    01006e7d  dec ecx
    01006e80  jnle loop
    ; jump to the decrypted data
    01006e82  jmp 0x01005090

Figure 1: Decryption loop of tELock of data from address 0x01005090 to 0x01006e52

Nowadays, malware is almost always self-modifying. Generally, this kind of code protection consists of a sequence of complex and intertwined unpacking/decryption and protection routines. For example, the packer tELock 0.99 uses 18 layers to unpack and to protect the original code. In Figure 1, we present a simple (but commonly seen in malware) example of a decryption loop based on a one-time pad cipher inside a layer of the packer tELock. The encrypted code is run after decryption at address 0x01005090. Packers and malware authors protect the original code very effectively, mostly by evading the dynamic analyses that attempt to analyse malware behavior. Commonly found protection methods may be broadly classified in two categories. The first category combines anti-debugging, anti-virtualization, and anti-disassembly mechanisms in various forms in order to evade system monitoring. For example, the packer tELock contains several anti-debugging routines. The second category employs obfuscations and "code slicing" methods in order to conceal the original code. A packer has the ability to show just slices of the original code and to hide the rest of the code. For example, the packer ACProtect interleaves the original code with its code and unpacks library calls only when it is required.

Generic unpackers were not designed to deal with all these protections. Most of them [25, 17, 30, 13] perform dynamic analysis and have heuristics to find the unpacking layer that contains the original code. One exception is that of the static analysis-based unpacker proposed by Coogan et al. [9]. According to Ugarte et al. [32], generic unpackers can be deceived, and therefore fail, because (i) they rely on specific packer families, (ii) malware authors use handmade packers, and (iii) they are based on assumptions that are no longer valid. Indeed, nowadays packers combine the original code with their own code. As a result, the original code is often not totally available in memory and we cannot take a single memory snapshot to capture it. Moreover, packers may use several processes or threads to run the original code. For these reasons, we develop a dynamic analysis system to trace processes and threads. Thanks to our model of self-modifying code, we take a sequence of memory snapshots containing at least all the instructions of the original code that are executed.

The second issue concerns overlapping instructions, which is a typical feature of x86 machine code and a common anti-disassembling mechanism. Consider for instance the following execution sequence of bytes extracted from the packer tELock0.99

    fe 04 0b eb ff c9 7f e6 8b c1

occurring in the code snippet in Figure 2. The correct control flow graph is given in Figure 3. The instruction at the address 01006e7d is jmp +1. This instruction is encoded by two bytes and it jumps to the second byte of its opcode at address 01006e7d+1, which corresponds to an instruction dec ecx. The opcode of dec ecx is ff c9 which shares the byte ff at address 01006e7d+1 with the jmp +1 opcode. As a result, both instructions jmp +1 and dec ecx overlap each other.

    01006e7a  fe 04 0b  inc byte [ebx+ecx]
    01006e7d  eb ff     jmp +1
    01006e7e  ff c9     dec ecx
    01006e80  7f e6     jg 01006e68
    01006e82  8b c1     mov eax, ecx

Figure 2: Overlapping assembly in tELock0.99

[Figure: nodes inc byte [ebx+ecx], jmp +1, dec ecx, jg 01006e68 and mov eax,ecx drawn over the bytes fe 04 0b eb ff c9 7f e6 8b c1; Layer 1 holds inc and jmp, Layer 2 holds dec, jg and mov, and the overlapping instruction is marked.]

Figure 3: Control flow graph for the tELock sample

The overlap is just there to obfuscate the code. Figure 4 displays the disassembly output by IDA Pro (v6.3) [11], which is incorrect. The reason is that off-the-shelf disassemblers make the assumption that instructions do not overlap and so misinterpret the execution sequence above. There is one important exception: the Jakstab disassembler proposed by Kinder [18], which handles overlapping instructions but not self-modification.

It is worth mentioning that tELock combines self-modification and overlapping instruction obfuscation techniques. For explanatory reasons, we choose to separate and display them in two independent snippets in Figures 1 and 2.

Objectives. The first objective of our work is to devise a disassembler of x86-malware code. Inputs are stripped binary code, with no information of any kind, and are usually heavily obfuscated. In particular, we have focused on (i) self-modifying binaries, and (ii) binaries containing overlapping instructions. An important point is that we make the assumption that slices of the original code may be executed in any wave (i.e. unpacking layer) when the
analyzed code is packed. In other words, we do not assume that the original code is entirely visible at some point of the unpacking process. The second objective is to develop an effective complete disassembly architecture that is able to automatically process each binary code file in a reasonable amount of time.

    01006E7A  inc byte ptr [ebx+ecx]
    01006E7D  jmp loc_1006E7D+1
    01006E7D  ; -----------------
    01006E7F  db 0C9h ;
    01006E80  db 7Fh ;
    01006E81  db 0E6h ;
    01006E82  db 8Bh ;
    01006E83  db 0C1h ;

Figure 4: Disassembly of tELock example with IDA Pro (v6.3)

The concept of code waves. We consider that each input binary is self-modifying code. That is, the execution of a binary will usually deploy different waves of code. Thus, an execution might be viewed as a sequence of waves, where a wave is produced by previous waves. Most of the time, a wave is produced by unpacking or by decrypting some data. For example, tELock generates 18 waves and the misalignment given in Figure 3 occurs at Wave 3. Each wave is determined by an execution level. We begin with Wave 1 in which the starting code is run. Then, there is Wave 2 for which the executed code has been written by Wave 1. Next, the process repeats itself and switches from Wave k to k + 1 each time we run data written during Wave k. Notice that the code run at Wave k + 1 can be generated by several previous waves (not only by Wave k). We found such an example in the packer UPolyX, which we observed in a Hupigon sample: hupigon.eyf. The execution of UPolyX consists of a first wave that generates a second unpacking routine and part of the payload. The second wave starts with the execution of the second unpacking routine that calls the first unpacking routine and generates the remainder of the payload. Finally, the third wave is triggered and executes the payload. In this example, we see that the payload is written at Waves 1 and 2.

Our model is closely related to the one suggested by Guizani et al. [12], which is why we use the same terminology. The main difference is that we simplify the wave computation and we use a monotonic numbering, which allows us to take a memory snapshot at the right time to dump a wave. Debray and Patel [10] define the notion of phase. The definition of a phase is closely related to the notion of wave. Dalla Preda et al. [28] define a fixed-point semantics of self-modifying programs, which is also similar.

The method in a nutshell. The disassembly method proceeds in two steps. In the first step, we perform dynamic analysis. We instrument a binary (see Section 3.2), and run it in a sandbox. The code instrumentation is able to bypass some anti-analysis evasion mechanisms. We follow threads and processes created by the binary by instrumenting them on the fly. We collect execution traces of all threads and processes. Then, we determine a sequence of waves as explained in Section 3 and we take memory snapshots to disassemble the code of each wave in the second step.

In the second step, described in Section 4, we disassemble each wave. Each wave provides a memory snapshot and a (sub-)trace. This step consists in identifying and in disassembling the code in each wave with the executed trace as a hint. For this, we implement a recursive disassembler that follows the trace. The trace indicates a sequence of executed addresses but of course the memory snapshot of the wave also contains "dormant" instructions, that have not been executed but that will also be disassembled by CoDisasm. Nonetheless, this trace will be our guide to perform a recursive traversal disassembly. In fact, the instruction addresses gathered in the trace are starting points for disassembly. For example, the packer PE Spin has 58 indirect jumps which are immediately solved by using the trace.

We now have to face the second issue: overlapping instructions. Our approach is to split the memory analyzed in layers. Each layer corresponds to an overlap. To illustrate this idea, let us go back to the tELock example. The memory is constituted of 10 bytes: fe 04 0b eb ff c9 7f e6 8b c1. As shown in Figure 6, this defines two layers fe 04 0b eb ff and ff c9 7f e6 8b c1. Our approach can also thwart obfuscations such as those shown by Jämthagen et al. [16].

Other obfuscation techniques may be also resolved from the trace. For example, a trace gives the return address of a call even if the return address has been modified. Notice that this information is easily available in dynamic analysis, unlike in static analysis.

The results. We propose a simple model of self-modifying program executions, dubbed wave semantics, that allows us to reconstruct the original code. We also generalize the notion of control flow graph to deal with self-modifying code with overlapping instructions.

From this model, we have developed a two step disassembler called CoDisasm. In the first step, CoDisasm collects an execution trace of a stripped binary. This trace is analyzed and split into code waves. At the end, CoDisasm outputs a set of layers for each wave, where each layer contains a set of non-overlapping instructions. From this set of layers, we reconstruct an enhanced control flow graph (see Section 4.3). Next, we can apply other techniques to discover new pieces of code thus obtaining a speculative disassembly of the code (see Section 4.4).

CoDisasm overview. The overall architecture of our disassembler CoDisasm is shown in Figure 5. The CoDisasm disassembler performs a static disassembly along with a concrete execution with the aim of maximizing coverage. It includes two main components:

1. A dynamic analysis component that collects execution traces of the threads and processes of a binary run. For this, we developed Pin tracer to instrument code, which is based on Pin [24]. We recover each code wave by taking a memory snapshot at the beginning of the wave, as explained in Section 3. A Portable Executable (PE) file is then built for each snapshot.

2. From the execution traces, each memory snapshot is disassembled following the algorithm described in Section 4, taking care of overlapping instructions. At the end, we have a sequence of disassembled code waves, which corresponds to the code discovered thanks to the set of collected traces.
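To make the layer idea concrete, the following minimal Python sketch splits the ten tELock bytes quoted in the introduction into layers of non-overlapping instructions. It is an illustration under stated assumptions, not CoDisasm's implementation: `decode` is a hypothetical hand-written decoder that knows only the five encodings of this snippet, and `fits` and `build_layers` are hypothetical helper names. The traversal keeps an instruction in the layer it arrived from when it is aligned there, and otherwise moves it to another aligned layer or opens a new one.

```python
# Bytes of the tELock snippet, loaded at 0x01006e7a (values taken from the text).
BASE = 0x01006E7A
CODE = bytes.fromhex("fe040bebffc97fe68bc1")

def decode(off):
    """Hypothetical mini-decoder: returns (mnemonic, length, successor offsets)."""
    if CODE[off:off + 3] == b"\xfe\x04\x0b":
        return "inc byte [ebx+ecx]", 3, [off + 3]
    if CODE[off] == 0xEB:  # jmp rel8, displacement counted from the next instruction
        rel = int.from_bytes(CODE[off + 1:off + 2], "little", signed=True)
        return f"jmp {rel + 2:+d}", 2, [off + 2 + rel]
    if CODE[off:off + 2] == b"\xff\xc9":
        return "dec ecx", 2, [off + 2]
    if CODE[off] == 0x7F:  # jg rel8: fall-through and branch target
        rel = int.from_bytes(CODE[off + 1:off + 2], "little", signed=True)
        target = off + 2 + rel
        return f"jg {BASE + target:08x}", 2, [off + 2, target]
    if CODE[off:off + 2] == b"\x8b\xc1":
        return "mov eax, ecx", 2, [off + 2]
    raise ValueError(f"encoding at offset {off} not covered by this sketch")

def fits(layer, off, length):
    """True if [off, off+length) shares no byte with an instruction of the layer."""
    return all(off >= o + l or o >= off + length for o, (_, l) in layer.items())

def build_layers(entry=0):
    layers = [{}]               # each layer maps offset -> (mnemonic, length)
    seen = set()
    stack = [(entry, 0)]        # (offset, index of the layer we arrived from)
    while stack:
        off, cur = stack.pop()
        if off in seen or not 0 <= off < len(CODE):
            continue            # already disassembled, or outside the snapshot
        seen.add(off)
        mnem, length, succs = decode(off)
        if not fits(layers[cur], off, length):
            # Misaligned with the current layer: reuse an aligned layer or open one.
            cur = next((i for i, l in enumerate(layers) if fits(l, off, length)),
                       len(layers))
            if cur == len(layers):
                layers.append({})
        layers[cur][off] = (mnem, length)
        stack.extend((s, cur) for s in succs)
    return layers

if __name__ == "__main__":
    for i, layer in enumerate(build_layers(), start=1):
        print(f"Layer {i}:")
        for off in sorted(layer):
            print(f"  {BASE + off:08x}  {layer[off][0]}")
```

Run on these bytes, the sketch places inc and jmp in one layer and dec, jg and mov in a second one, which matches the split into fe 04 0b eb ff and ff c9 7f e6 8b c1 given in the text. The branch target of jg lies outside the ten-byte window and is simply skipped here.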
The overall architecture
We combine concrete path execution and static analysis
decompresses some piece of data. In our model, data are decompressed in a memory area and so each address of this area gets write level 1. Then, the unpacker transfers on-the-fly the control to the "decompressed" data. As a result, data at write level 1 is executed, thus triggering the second wave of execution, and we set the execution level to 2. In turn, Wave 2 may generate a third wave and this process may repeat.

We define Wave k as the whole set of instructions, executed or not, which are present when the execution level reaches k (see discussion in Section 6). As a result, we can see a run of a program as a sequence of waves. Notice that in this model, non-self-modifying code will only have one wave.

The rationale behind the model of a self-modifying code run as a sequence of waves is that we can extract a snapshot of the memory at the beginning of each wave from the execution of a binary. This snapshot contains all the instructions deployed by the binary to run this wave and possibly some silent code. Our objective is then to disassemble this memory snapshot in order to recover the assembly code contained in a wave.

The wave semantics that we propose is defined at the lowest possible level of abstraction in the sense that we see all computations inside the system through the eyes of the single core processor. Consequently, this model takes into consideration threads and processes.

3.2 Collecting execution traces

In practice, we focus exclusively on Windows/x86 binaries. To this end, we use Pin which is a dynamic instrumentation framework supported by Intel [24]. We developed and used a Pin tool, that we refer to as Pin tracer, to collect execution traces of x86-code. Pin tracer is able to trace newly created threads and processes. It also tries to detect code injection in a running process. If such an event occurs, it instruments the injected process. For example, the driver of Duqu illustrates this mechanism by being injected in memory within service.exe. In this case, Pin tracer traces service.exe. Code injections are detected by monitoring calls to the CreateRemoteThread and CreateRemoteThreadEx functions from the Windows API. When Pin tracer detects a new process, then a new pin tool is attached to this process. Thus, a new trace is generated. Finally, we collect all traces of the threads and processes detected.

Given the fact that many malware use anti-emulation, anti-debugging and anti-virtualization techniques, including on Pin, we built some anti-evasion functionality into Pin tracer. In particular, we attempt to cover the following evasion techniques:

• Time check, to verify whether or not the malware code is monitored.

• EIP (instruction pointer) check, to verify whether or not the malware code has been instrumented.

• Checksum checks (CRC) on parts of the code, to check whether or not it has been altered.

• Use of interrupt table manipulating instructions such as SIDT, SLDT, etc., in order to check whether or not the code runs in a virtual machine.

In each of these cases, the counter-measure implemented is to return the expected value, which is not always possible to determine.

Regardless of the method of collection, execution traces are important tools in reverse engineering that we use as an enabler to thwart code protections. We therefore need to formalize the notion of execution traces as a basis for reasoning on self-modification behaviors. An execution trace is a sequence of operations performed by a program, where at each step, we gather a sequence of information such as process IDs, register values and read/write memory addresses that we collectively refer to as a dynamic instruction. A dynamic instruction D is a tuple composed of:

• a memory address A[D],

• the machine instruction I[D] run at address A[D],

• the set W[D] of memory addresses written by the instruction I[D].

An execution trace is a finite sequence $D_1, D_2, \ldots, D_n$ of dynamic instructions. Figure 7 shows the dynamic trace of the program in Figure 1 after two iterations.

Figure 7: Trace execution of the tELock snippet shown in Figure 1

    A[D]      I[D]                                   W[D]
    01006e62  mov ecx, 0x1dc2
    01006e67  inc ebx
    01006e6f  loop: rol byte ptr [ebx+ecx], 0x5      0x01006e52
    01006e73  add byte ptr [ebx+ecx], cl             0x01006e52
    01006e76  xor byte ptr [ebx+ecx], 0x67           0x01006e52
    01006e7a  inc byte ptr [ebx+ecx]                 0x01006e52
    01006e7d  dec ecx
    01006e80  jnle loop
    01006e6f  loop: rol byte ptr [ebx+ecx], 0x5      0x01006e51
    01006e73  add byte ptr [ebx+ecx], cl             0x01006e51
    01006e76  xor byte ptr [ebx+ecx], 0x67           0x01006e51
    ...       ...

Algorithm 1: Computation of execution and write levels

    Update(X, W, D)
        X ← max(X, W(A[D]) + 1);
        foreach m ∈ W[D] do
            W(m) ← X;
        end
        return (X, W)

W(A[D]) is a shortcut for max(W(A[D]), ..., W(A[D] + k)) where k is the number of bytes encoding the instruction.

3.3 Execution and write levels

The goal of this section is to delineate waves inside an execution trace. A wave is determined from both (i) the execution level, and (ii) the write level of each memory address. The write level of each memory address is stored into a finite mapping W that we call the write level table.

Given an execution trace $D_1, D_2, \ldots, D_n$, we define a sequence of pairs composed of the execution level and the write level table $(X_0, W_0), (X_1, W_1), \ldots, (X_n, W_n)$ for each dynamic instruction that satisfies the following properties: (i) before executing the dynamic instruction $D_{i+1}$, the execution level is $X_i$ and the write level table is $W_i$, and (ii) after executing $D_{i+1}$, the execution level is $X_{i+1}$ and the write level table is $W_{i+1}$. We shall say that the execution level of the dynamic instruction $D_{i+1}$ is given by $X_{i+1}$. The sequence of execution levels and the write level tables are obtained by iteratively applying the function Update shown in Algorithm 1.

We have defined the list of pairs (execution level, write level table) for explanatory reasons, but in fact the execution level is shared by the entire memory: it is the same for any memory address executed in a wave. Thus, it is sufficient to keep track of the current execution level and the write level table. Consequently, we begin by setting all write levels to 0 and the execution level to 1. That is, W(m) = 0 for each memory address m and X = 1. Then, we apply the function Update on arguments (X, W) and D in order to determine the next execution level and the next write level table: (X, W) = Update(X, W, D).

3.4 Reconstructing waves from a trace

We are now ready to split an execution trace into sub-traces depending on their execution levels. From an execution trace $D_1, D_2, \ldots, D_n$, we have previously described how to compute the sequence of execution levels $X_1, X_2, \ldots, X_n$. It is not difficult to see that this sequence is weakly monotonic, that is $X_i \le X_{i+1}$ for all $i = 1, \ldots, n-1$. The number of waves observed in this execution trace is $K = \max_i(X_i) = X_n$. In other words, in our model of self-modification, there are $K - 1$ successive code self-modifications in this execution. As a result, we can extract $K$ sub-traces of dynamic instructions. Writing $\ell_i$ for the index of the first dynamic instruction of the i-th wave (so that $\ell_1 = 1$), they are defined as follows:

$$
\begin{aligned}
\mathrm{trace}(1) &= D_1, \ldots, D_{\ell_2 - 1} && \text{where } X_j = 1 \text{ for } j = 1, \ldots, \ell_2 - 1 \\
\mathrm{trace}(i) &= D_{\ell_i}, \ldots, D_{\ell_{i+1} - 1} && \text{where } X_j = i \text{ for } j = \ell_i, \ldots, \ell_{i+1} - 1 \\
\mathrm{trace}(K) &= D_{\ell_K}, \ldots, D_n && \text{where } X_j = K \text{ for } j = \ell_K, \ldots, n
\end{aligned}
$$

At the same time, we can also take a memory snapshot at the beginning of each wave. Thus, we have K memory snapshots. Let us call wave(i) the memory snapshot at the beginning of the i-th wave, that is before executing instruction $D_{\ell_i}$. Notice that the memory snapshot wave(1) is the snapshot of the starting code. As a result, our model of self-modification ensures that the memory snapshot wave(i) contains any instruction in the trace trace(i) and probably also other dormant instructions that we have to identify.

Algorithm 2: Wave recovery for self-modifying codes

    input : PE File
    output: The number of waves X and for each wave, a snapshot and a trace
            in the lists traceList and waveList
    Wave_recovery()
        foreach address m do
            W(m) ← 0
        end
        X ← 0;
        trace ← ∅;        // list of dynamic instructions
        traceList ← ∅;    // list of traces
        waveList ← ∅;     // list of memory snapshots
        wave ← Snapshot();
        Add(waveList, wave);
        // Computation of subtraces and memory snapshots
        while not at end do
            D ← Pin Tracer();
            Add(trace, D);
            (X, W) ← Update(X, W, D);
            ip ← Pin Next Instruction();
            if W(ip) ≥ X then
                // New wave
                Add(traceList, trace);
                trace ← ∅;
                wave ← Snapshot();
                Add(waveList, wave);
            end
        end
        return (X, waveList, traceList)

3.5 Overview of the wave recovery algorithm

The wave recovery algorithm is presented in Algorithm 2. The input is a PE file that is loaded into memory. The first step of the algorithm initializes the write level table and takes an initial snapshot of the memory wave(1). The first memory snapshot contains all the code and data that are inside the PE file sections.

In a second step, the Pin Tracer executes code one statement at a time as explained in Section 3.2. Pin Tracer runs one instruction and, at each step, gathers the corresponding dynamic instruction. Then, Pin Tracer computes the write level table as described in Section 3.3. The index of the current wave is given by X. We also gather all instructions executed during the current wave in the list trace(X).

Finally, we determine the address of the next instruction to be executed thanks to a Pin tool called Pin Next Instruction. If the execution level of the next instruction increases, then we know that Wave X ends there and that Wave X + 1 will start as soon as the next instruction is executed. Therefore, we take a memory snapshot wave(X + 1) of the memory before the beginning of Wave X + 1. Otherwise, we stay in the same wave and binary execution is resumed.
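The level computation of Algorithm 1 and the splitting of a trace into waves can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Pin-based tool: the `DynInstr` record, the `split_waves` helper and the three-instruction trace are hypothetical, memory snapshots are omitted, and the write level of an instruction is taken as the maximum over the bytes that encode it.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class DynInstr:
    """Hypothetical record for a dynamic instruction D."""
    addr: int                        # A[D]
    size: int                        # number of bytes encoding I[D]
    text: str                        # I[D], kept only for display
    writes: frozenset = frozenset()  # W[D], addresses written by I[D]

def update(x, w, d):
    """Algorithm 1: execution level and write level table after running d."""
    # Write level of the instruction = max write level over its encoding bytes.
    x = max(x, max(w[d.addr + k] for k in range(d.size)) + 1)
    for m in d.writes:
        w[m] = x                     # bytes written now carry the current level
    return x, w

def split_waves(trace):
    """Group the dynamic instructions of a trace by execution level."""
    w = defaultdict(int)             # every write level starts at 0
    x = 0                            # becomes 1 on the first instruction
    waves = {}
    for d in trace:
        x, w = update(x, w, d)
        waves.setdefault(x, []).append(d)
    return waves

if __name__ == "__main__":
    # Made-up trace: a stub writes one byte at 0x200, then jumps to it.
    trace = [
        DynInstr(0x100, 2, "mov byte [0x200], 0x90", frozenset({0x200})),
        DynInstr(0x102, 2, "jmp 0x200"),
        DynInstr(0x200, 1, "nop"),
    ]
    for level, instrs in sorted(split_waves(trace).items()):
        print(f"trace({level}):", [f"{d.addr:#x} {d.text}" for d in instrs])
```

In this made-up trace, the first two instructions run at execution level 1 and the byte at 0x200 receives write level 1, so executing it raises the execution level to 2 and opens a second wave. A full tool would also take a memory snapshot each time the level increases, as Algorithm 2 does.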
A memory snapshot combines (i) the code and data in of the wave. Recall that inside a wave, there is no code
a PE file, and (ii) all data stored in dynamically allocated self-modification. However, other obfuscations may occur,
memory areas (e.g. malloc). It is necessary to consider dy- in particular x86 overlapping instructions. In this section,
namic memory allocations because it is possible to jump into we address this issue. We present a recursive algorithm that
data that, for example, comes from a decryption loop. statically disassembles and correctly handles overlapping in-
structions.
3.6 Example
The introductory example (Figure 1) presents a decryp- Algorithm 3: Recursive disassembler, recovering over-
tion loop that generates two waves. The first wave mainly lapping instructions
consists of the loop and trace(1) is composed of the nine
input : The memory snapshot of a wave, its execution
dynamic instructions in the interval [01006e62, 01006e82].
trace and an empty set of layers
The second wave is triggered when the condition at address
output: A set of layers resulting from the disassembled
01006e80 is false and the control is transferred to the address
wave
01005090. trace(2) is composed of the dynamic instruction
disassembler(wave,trace)
in the interval [01005090, 01006e52]. Figure 8 illustrates the
L← New();
execution of this example and provides the execution level X
Set layers ← {L};
and the write level table W. For example, take instruction
foreach addr ∈ trace do
xor byte ptr [ebx+ecx], 0x67. The execution level is 1.
if addr 6∈ Set layers then
This instruction performs a memory write at the address
if the addr has not been processed ;
pointed by the value of ebx+ecx. Since the value of ebx+ecx
Set layers ← recursive traversal(wave, addr,
for that execution is 01006e82, we set W(01006e82) = 1.
Set layers,L);
3.7 Disassembly completeness end
A discussion on disassembly completeness may seem quite end
theoretical at first glance. Nevertheless, it is a necessary di- return Set layers
gression in order to be able to discuss disassembler evalua- recursive_traversal(wave, addr, Set layers,L)
tion criteria in Section 5. We now put forth a definition of a opcode ← disasm(wave, addr);
semantics for self-modifying programs. In Section 3.4, an ex- if ( addr,opcode) is aligned with L then
ecution trace D = D1 , . . . , Dn defines trace(i) correspond- Add the instruction to an aligned layer ;
ing to the instructions run in the i-th wave. We call each sub- Add( L,addr,opcode);
trace a code wave. The set of all code waves of a trace D is
else
trace(D) = {trace(i) | i = 1..K where K is the last wave}.
Create a new layer with the instruction;
We define the wave semantics of a given binary as a graph
Lnew = New() ;
G = (V, E) defined as follows. The set of vertices V is the
Add(Lnew,addr,opcode);
set of all code waves for any execution trace, that is
[ SetUnion(Set layers,Lnew);
V = trace(D) end
for all traces D foreach successor of ( addr,opcode) do
Set layers ← recursive traversal(wave,successor,
Two vertices W and W 0 are connected, that is (W, W 0 ) ∈ E
Set layers,L)
if W and W 0 are two consecutive subtraces of a trace. In
end
other words, there is an execution where the successor of the
return Set layers
last instruction of W is the first instruction of W 0 , or yet if
the wave denoted by W jumps to the wave denoted by W 0
in some execution.
As a result, the wave semantics is a graph G that represents all possible self-modifications
of a binary and encodes all possible execution paths. The wave semantics of a binary provides
the partially ordered list of all instructions that can be run. For that reason, the wave
semantics G could be used to measure the correctness of a disassembler (of self-modifying
programs), because a perfect disassembly of a binary should be able to reconstruct the graph G.
Of course, a perfect disassembler does not exist because the problem of disassembling is
undecidable; as a consequence, the wave semantics is uncomputable. Any disassembler provides an
approximation of the wave semantics. So, at least from a theoretical point of view, the distance
to the wave semantics may provide a metric to evaluate disassemblers.

4. OVERLAPPING INSTRUCTIONS

We now have all the necessary information to start the second phase, which consists in
disassembling the code of a wave from a snapshot of the memory together with the trace.

4.1 Layers

Two dynamic instructions overlap when they share at least one byte in memory. We will say that
a set of dynamic instructions is mis-aligned if at least two instructions overlap. Otherwise, we
will say that the instructions of this set are aligned. Take again the teLock snippet in
Figure 2 and look at Figure 6. The instructions jmp +1 and dec ecx have the byte at address
0x01006e7e in common. So, they are overlapping instructions. Both overlapping instructions
create two sequences of aligned dynamic instructions. Each sequence forms what we will call a
layer.

Before we define layers, we have to introduce the notion of connected instruction set. A set L
of instructions is connected if given two instructions D and D′, there is a path between D and
D′ composed of instructions in L. That is, there is a sequence D = D_1, ..., D_n = D′ of
instructions in L such that the instruction D_{i+1} is a successor of D_i. The successors of the
instruction D are all the reachable instructions from D that we can predict. For example, a sequential
After several iterations of the loop
A[D]        I[D]                                   W  X
01006e62    mov ecx, 0x1dc2                        0  1
01006e67    inc ebx                                0  1
01006e6f    loop: rol byte ptr [ebx+ecx], 0x5      0  1
01006e73    add byte ptr [ebx+ecx], cl             0  1
01006e76    xor byte ptr [ebx+ecx], 0x67           0  1
01006e7a    inc byte ptr [ebx+ecx]                 0  1
01006e7d    dec ecx                                0  1
01006e80    jnle loop                              0  1
01006e82    jmp 0x01005090                         0  1
0x01005090  decrypted byte                         1
0x01005091  decrypted byte                         1
Wave 1: trace(1) instruction in [01006e62, 01006e82]

After transferring the control to 01005090
A[D]        I[D]                                   W  X
01006e62    mov ecx, 0x1dc2                        0
01006e67    inc ebx                                0
01006e6f    loop: rol byte ptr [ebx+ecx], 0x5      0
01006e73    add byte ptr [ebx+ecx], cl             0
01006e76    xor byte ptr [ebx+ecx], 0x67           0
01006e7a    inc byte ptr [ebx+ecx]                 0
01006e7d    dec ecx                                0
01006e80    jnle loop                              0
01006e82    jmp 0x01005090                         0
0x01005090  decrypted byte                         1  2
0x01005091  decrypted byte                         1  2
Wave 2: trace(2) instructions in [01005090, 01006e52]
instruction like mov eax,ebx has one successor which is the next instruction, while jnz 100 has
two successors: the instruction at address 100 and the next one. On the other hand, we may not
be able to determine the successor of an instruction like jmp eax if we have no certain value
for the register eax.

We now come to the second key notion. A layer L is a set of dynamic instructions that satisfies
the following two properties: (i) two instructions in L never overlap, and (ii) the set L is
connected. Our objective is to construct a set of layers that approximates the code inside a
wave.

4.2 Disassembling algorithms

Algorithm 3 defines the disassembly procedure. Its inputs are a memory snapshot wave of a given
wave and its corresponding sub-trace trace. Both inputs come from the first phase that we have
presented in the previous section. The algorithm inspects recursively the memory snapshot wave
from each address in the trace. For this, we begin with a new empty layer. We disassemble
recursively and we add instructions to a layer in a consistent way. That is, we guarantee that
layers are always a sequence of aligned instructions. When an instruction cannot be added to a
layer in a consistent way, that is, if the instruction overlaps at least one other instruction
in one of the already computed layers, we create a new layer. We add the misaligned instruction
to the new layer. The new layer is added to the current set of layers. As a result, we maintain
during the disassembly a set of coherent layers, such that: (i) no instruction inside the layer
overlaps another instruction in the same layer, and (ii) if we take two layers in this set, then
at least two instructions, one from each layer, are mis-aligned. The output is a set of coherent
layers that together form an under-approximation of the complete disassembled code inside a
wave.

Notice that this algorithm follows all found execution paths. For example, when a conditional
instruction like jcc is encountered, we follow both successors. Moreover, the trace gives us
some valuable additional information. For example, Linn and Debray [23] propose to modify the
return address on the stack of a call as an obfuscation technique. In this case, the trace
immediately gives the correct return address and thus provides a correct answer to this common
technique.

4.3 Recovering an enhanced CFG

From each layer, we reconstruct a control flow graph (CFG) that we call pre-CFG. Since each
layer is a connected set of instructions, each pre-CFG is a connected graph. All pre-CFGs are
connected together. Indeed, there is at least one edge between a node of a pre-CFG and the root
of another pre-CFG, which comes from the instruction that creates the overlap. Finally, a node
can have multiple incoming edges, which correspond to a resynchronization of the code.

We illustrate and sum up the construction by an example coming from the packer UPX. In
Figure 9, we show the two layers created by the conditional jump jnz +9. Therefore, there are
two pre-CFGs that correspond to both layers generated by UPX. The dashed edge corresponds to the
instruction overlap due to the jnz +9 instruction. The code resynchronizes at the push ebp
instruction.

4.4 Speculative disassembly

At the end, we perform a speculative disassembly by running a linear sweep on unexplored pieces
of memory, byte by byte as Vigna [33] proposes. In order to identify valid layers, that is to
separate code from data, we apply well-known heuristics employing pre-determined scoring [22]
and statistical methods [21]. Finally, the trace is taken into account to evaluate the
probability of correctness of each reconstructed disassembly.

5. EVALUATION

5.1 Methodology

In order to evaluate a disassembler, it is necessary to define what we expect to be a correct
output of a disassembler. In the case of a regular binary code produced by a compiler, it is
sufficient to compare the disassembler output with the compiler assembly output. But in the case
of a heavily obfuscated binary code like a malware, the evaluation of a disassembler is a
non-trivial problem that presents complex challenges.

Before going further, we need to discuss what we mean by a “correct disassembler”. A correct
disassembler should only output instructions which are in a possible execution path, that is an
approximation of the wave semantics as already defined and discussed in Section 3.7. Recall that
the wave semantics of a binary code provides the set of all instructions that can be run. Thus,
it is important to measure the approximation obtained with respect to the wave semantics in
order to determine the code coverage.

This response is not completely satisfactory. For example, a malware may be packed and in this
case one might be in-
Addresses  0xf2  0xf3  ...  0xf9  0xfa  0xfb  0xfc  0xfd  0xfe  0xff
Bytes      79    07    ...  47    b9    57    48    f2    ae    55
Layer 1 @0xf2:  jns +9 (0xfb)  ...  inc edi;  mov ecx, aef24857;  push ebp
Layer 2 @0xfb:  push edi;  dec eax;  repne scasb

[Figure: pre-CFGs of Layer 1 and Layer 2; graph not recoverable from the extraction.
Visible node: 010059f0  89 f9  mov ecx,edi. Visible edge label: jnz +9.]
terested in reconstructing the assembly code of the malware. The assembly code reconstruction of
a packed malware is a different issue from the one studied in this work. Indeed we may for
example develop a packer in which the malware functionalities and the protection functionalities
are fully intertwined. In this case, malware functionality identification and reconstruction is
a research subject per se. For example, the work of Yadegari et al. [34] developed a
de-obfuscation method to extract a simplified code. For this, they combined a dynamic analysis
with concolic executions in order to collect several traces, which are simplified in order to
reconstruct a control flow graph.

Another approach is to compare disassemblers. The comparison between disassemblers is currently
difficult because there is no benchmark based on obfuscated binary codes. In particular, it
makes no sense to compare CoDisasm with off-the-shelf disassemblers because none deal with
self-modifying code and overlapping instructions.

As such, it is difficult to assess a disassembler and we recognize that we have not been able to
define a metric that allows us to adequately determine code coverage for CoDisasm, or for any
other disassembly tool, for that matter. That is why we propose a fourfold evaluation of
CoDisasm focusing on testing functionality and usefulness of the tool and showing that there are
no major operational problems with the tool or its approach. First in Section 5.2, we check that
CoDisasm correctly retrieves the code of a regular binary produced by a compiler. Second in
Section 5.3, while we cannot verify its correctness on malware for which source code is normally
not available, we verified the relevance of our approach by running the tool on 500 malware
families, and observing the number of waves and layers; more precisely, we were able to deduce
that tools not handling self-modification and code overlap simultaneously would have failed to
correctly disassemble the majority of those samples. Third, in Section 5.4, we successfully
benchmarked CoDisasm by packing known applications with 28 different readily available packers
and retrieving these known applications. Finally in Section 5.5, we illustrated CoDisasm’s
capacity with malware analysis by packing a known malware and showing that our approach may
considerably help malware analysis.

Program         #Inst.  #Inst.CoDisasm  Time (ms)
adpcm.exe       1191    1191            120
compress.exe    506     506             34
ns.exe          99      99              6
nsichneu.exe    5550    5550            1700
statemate.exe   1375    1375            155

#Inst. = number of instructions
#Inst.CoDisasm = number of instructions disassembled

Table 1: Precision of disassembly

5.2 Experimental validation of correctness

We consider regular binaries coming from a compiler. We show the correctness of CoDisasm on
regular programs in Table 1. These samples are taken from the Mälardalen WCET benchmark
programs [14]. This correctness is simply established by comparing assembly outputs.

5.3 Relevance of our approach on malware

We demonstrate that our approach is relevant by taking 500 malicious programs from the public
repository malware.lu. All these malware are detected by at least three well-known anti-virus
products. We verify our assumptions that (i) malware are self-modifying code by computing the
number of waves, and that (ii) malware use overlapping instructions by computing the number of
layers per sample.

Table 2 shows the number of waves generated by the samples. It can be seen that 93% are
self-modifying code. Half of them have only 2 waves. In this case, most of them could be
disassembled by using first a generic unpacker and then by running a disassembler on the
unpacked code. However, the remaining 40% of samples are more difficult to analyze. Generic
unpackers fail, while our approach works, thus confirming its usefulness with respect to
discovery and analysis of waves.

Table 3 shows the number of layers obtained on the same samples. As can be seen, about 70% of
the samples use at least one instruction overlapping technique.

5.4 Relevance of our approach on packers

In this section, the goal is to show that we are able to retrieve the original code of a packed
binary. For this, we take hostname.exe, which plays the role of a probe that we can easily
detect. Notice that the same experiment with
an unknown binary like a malware will not be conclusive because we do not know a priori its
assembly code (probably generated at runtime). Therefore, we packed hostname.exe with 28
different packers. The results are shown in Table 4. We display the number of processes, threads
and waves of the 28 packers. The last column indicates whether instructions are run in
dynamically allocated memory or not. What we immediately see is that packers massively use
waves, some of them being dynamically allocated. The cascade of waves may be as deep as 635. We
were dumbfounded to see that up to 20% of some of these waves were composed of overlapped
instructions. The case of Armadillo is astounding. We observed 132 overlapping instructions and
the packer creates 11 threads and has 2 processes. (The father process creates a new process
which contains the original code. Then the father process attaches to the son process like a
debugger.)

# Waves    1     2     3     4     5–10   > 10
           8%    53%   12%   6%    13%    9%

Table 2: Number of waves from 500 malware

# Layers   1     2     3     4     ≥5
           32%   35%   17%   11%   5%

Table 3: Number of layers from 500 malware

Packer name              #proc.  #thr.  #Wave  DM
ACProtect v2.0           1       1      635    N
Armadillo v9.64          2       11     165    Y
Aspack v2.12             1       1      3      N
BoxedApp v3.2            1       15     6      Y
EP Protector v0.3        1       1      2      N
Expressor                1       1      2      N
FSG v2.0                 1       1      2      N
JD Pack v2.0             1       1      3      N
MoleBox                  1       1      3      N
Mystic                   1       1      4      Y
Neolite v2.0             1       1      2      N
nPack v1.1.300           1       1      2      N
Packman v1.0             1       1      2      N
PE Compact v2.20         1       1      4      Y
PECrypt V1.02            1       4      99     Y
PE Lock                  1       1      15     Y
PE Spin v1.1             1       1      80     Y
Petite v2.2              1       1      3      N
RLPack                   1       1      2      N
Setisoft v2.7.1          1       5      32     Y
TELock v0.99             1       1      18     Y
Themida v2.0.3.0         1       28     106    Y
Upack v0.39              1       1      3      N
Upx v2.90                1       1      2      N
VM Protect v1.50         1       1      1      N
WinUPack                 1       1      3      N
Yoda’s Crypter v1.3      1       1      4      Y
Yoda’s Protector v1.02   1       1      6      N

#proc. = number of processes ; #thr. = number of threads
#Wave = number of waves ; DM = Code run in allocated memory

Table 4: Packer analysis

For all packers but Setisoft, we observed that the packed code behaves like hostname.exe. In
fact, Setisoft detects the presence of the Pin tracer and does not run hostname.exe. Thus, in
all but one case, we can state that we escape anti-debugging techniques and that we correctly
reach the “payload”. In all but three cases (PE Lock, PE Spin and VM Protect), we have been able
to manually find the original code of hostname.exe within the waves disassembly, sometimes
sliced into small pieces. The packer VM Protect is a code virtualizer, and thus it is expected
that we cannot see that original code, only its intermediate representation. PE Spin and PE Lock
are based on code transformations, and again it is not surprising that the original code cannot
be recovered. This sequence of tests shows that the Pin tracer is able to correctly instrument
many significant packers.

For completeness, we also repeated the experiment replacing hostname.exe with other software. No
significant differences in results were observed.

Finally, we determine for each packer the number of instructions per wave and also the number of
layers. To conduct these experiments, we packed again hostname.exe, which has 335 instructions.
Due to a lack of space, we just present the results for Aspack in Table 5 and TELock in Table 6.

Wave  Time (ms)  #Instructions  #Layers
1     62         1189           3
2     47         1115           3
3     20         357            1

Table 5: Aspack v2.12

Wave  Time (ms)  #Instructions  #Layers
1     1          85             4
2     1          67             1
3     1          20             2
4     1          43             2
5     13         693            4
6     1          18             1
7     1          28             1
8     1          16             1
9     1          51             1
10    1          36             1
11    1          23             1
12    1          49             1
13    2          134            3
14    9          496            3
15    5          333            3
16    17         799            2
17    3          172            1
18    8          431            1

Table 6: TELock v0.99

5.5 A malware writer scenario

In this last experiment, we sent the backdoor hupigon.eyf to the Virus Total Web service. From a
total of 57 antivirus products, 45 detected hupigon.eyf and correctly identified it. We then
packed hupigon.eyf with the Mystic packer and sent it back to Virus Total. This time, only 22
antivirus products detected that the file was malicious, but none were able to identify it.

We analyzed the same Mystic-packed file with CoDisasm. The Mystic packer generates 4 waves. The
last wave creates a new process, which in turn creates two new processes.
We traced the 4 waves and the 3 processes of the last wave. We verified from the disassembly
output of CoDisasm that Wave 4 contains the payload hupigon.eyf. Then, we sent to Virus Total
the PE file reconstructed from the last wave. This time, 20 antivirus products correctly
detected it as hupigon.eyf. We think that this relatively low rate of detection is due to the
fact that the antivirus products available on Virus Total are not made to scan the sent PE file,
which is just a memory dump at the last wave.

This experiment illustrates a typical scenario where a malware writer builds a new malware by
concealing a malicious code with a packer. The key point here is that the dynamic analysis of
CoDisasm correctly reconstructs the disassembly code generated by the packer, and so
successfully found hupigon.eyf. This has led us to think of the potential usefulness of a Virus
Total extension where each suspect binary is first disassembled by CoDisasm, which then produces
a set of waves that can then be parsed by each anti-virus. Exploring this idea will likely be
the object of future research.

6. DISCUSSION

In this work, we have just considered a single execution trace. We may wonder whether or not a
single trace is enough. From our experience, the sequence of waves generated by packers rarely
depends on inputs to the program and is almost blind to its execution environment. Our
assumption is supported by some of our previous experimental studies described in [4, 7]. To
conduct this work, we used 600 malware divided into six well-known families. We showed that less
than 2% of malware interact with the system environment in the middle waves. We found that (i)
most of the time payloads are in the last waves, and (ii) the wave structure is relatively
simple. That is why we were able to extract payloads in our experiments. But, as we have already
observed, we should quickly develop the ability to automatically generate a set of traces to
cope with code slices of the payloads triggered only when they are used.

It is even possible to think of an attack, where the malware writer generates a massive number
of waves in order to block and frustrate analysis with CoDisasm.

The situation is quite different when we deal with binary code in general. Take for example a
botnet. A botnet will try to connect in order to receive commands, but it may fail if it is run
in an isolated testbed [5]. As a result the trace obtained will not be relevant. A solution is
to extract message formats and then to forge messages to generate traces in order to cover the
botnet code [3, 6]. In other cases, an interesting direction would be to determine values to run
unexplored paths in a wave to see whether or not it will produce new waves. Take this toy
example:

    if (date() == ”Friday the 13th”) {
        unpack(); executePayload();
    }
    else print ”Hello world”;

If this code is found in a wave and if it is analyzed on any other day than Friday the 13th, it
will not produce the unpacked code.

7. CONCLUSION

We have developed a disassembler, called CoDisasm, that targets obfuscated malware x86 binary
files running on Windows. It comes with an IDA plug-in, called BinViz, to visualize code
unpacking waves, which was not described in this paper, but is available at www.lhs.loria.fr.
CoDisasm disassembles binaries that are both self-modifying and that employ overlapping
instructions as obfuscation techniques, something that is increasingly common in modern malware.

To accomplish this, the disassembler combines dynamic analysis of the binary and a static
recursive disassembly procedure. We have devised and implemented an array of novel techniques,
that we have dubbed concatic disassembly, to address challenges like the discovery of code waves
and code layers. From a technical point of view, the dynamic analysis of binaries relies on a
robust tracer taking into account anti-analysis mechanisms and tracing threads and processes.
From a theoretical and fundamental point of view, we provide an effective model of
self-modifying programs with overlapping instructions. CoDisasm is probably one of the first
tools to achieve these results.

CoDisasm was mainly designed as an automatic disassembler tool which outputs a sequence of
disassembled code waves. In turn, each wave may be analyzed by other tools. We illustrate this
approach with the example of the Hupigon malware in Section 5.5. Nonetheless, while it is able
to automatically and seamlessly process a moderate amount of binary code (i.e. 30 binaries per
minute in our lab), it would be necessary to speed up disassembly in order to face the large
number of binaries received and processed each day by anti-virus companies.

Acknowledgments

The authors would like to thank Juan Caballero, Saumya Debray and Tim Kornau with whom we
discussed this work, and who provided invaluable feedback. Work partially funded by French ANR
(project BINSEC, grant ANR-12-INSE-0002).

8. REFERENCES

[1] S. Bardin, P. Herrmann, and F. Védrine. Refinement-based cfg reconstruction from
    unstructured programs. In Proc. Int. Conf. Verification, Model Checking, and Abstract
    Interpretation (VMCAI), pages 54–69, 2011.
[2] J. Caballero, N. M. Johnson, S. Mccamant, and D. Song. Binary code extraction and interface
    identification for security applications. In Proc. ISOC Network and Distributed Systems
    Security Symp. (NDSS), 2010.
[3] J. Caballero, H. Yin, Z. Liang, and D. Song. Polyglot: Automatic Extraction of Protocol
    Message Format using Dynamic Binary Analysis. In Proc. ACM Comp. Communications Security
    Conf. (CCS), pages 523–529, 2007.
[4] J. Calvet. Analyse dynamique de logiciels malveillants. PhD thesis, École Polytechnique de
    Montréal and Université de Lorraine, 2013.
[5] J. Calvet, C. R. Davis, J. M. Fernandez, W. Guizani, M. Kaczmarek, J.-Y. Marion, and
    P.-L. St-Onge. Isolated virtualised clusters: Testbeds for high-risk security
    experimentation and training. In Proc. Usenix Cyber Security Experimentation and Testing
    (CSET), 2010.
[6] J. Calvet, C. R. Davis, J. M. Fernandez, J.-Y. Marion, P.-L. St-Onge, W. Guizani,
    P.-M. Bureau, and A. Somayaji. The case for in-the-lab botnet
    experimentation: creating and taking down a 3000-node botnet. In Proceedings of the 26th
    Annual Computer Security Applications Conference, pages 141–150. ACM, 2010.
[7] J. Calvet, F. Lalonde Lévesque, J. M. Fernandez, J.-Y. Marion, E. Traourouder, and
    F. Menet. WaveAtlas: surfing through the landscape of current malware packers. In Proc.
    Virus Bulletin Conf., 2015.
[8] C. Collberg and J. Nagra. Surreptitious Software - Obfuscation, Watermarking, and
    Tamperproofing for Software Protection. Addison-Wesley Software Security Series, 2009.
[9] K. Coogan, S. Debray, T. Kaochar, and G. Townsend. Automatic static unpacking of malware
    binaries. In Proc. IEEE Working Conf. on Reverse Engineering (WCRE), pages 167–176, 2009.
[10] S. Debray and J. Patel. Reverse engineering self-modifying code: Unpacker extraction. In
    Proc. IEEE Working Conf. on Reverse Engineering (WCRE), pages 131–140, 2010.
[11] I. Guilfanov. The ida pro disassembler and debugger. http://www.hex-rays.com/idapro/.
[12] W. Guizani, J.-Y. Marion, and D. Reynaud-Plantey. Server-side dynamic code analysis. In
    Proc. Int. Conf. Malicious and Unwanted Software (MALWARE), pages 55–62, 2009.
[13] F. Guo, P. Ferrie, and T.-C. Chiueh. A study of the packer problem and its solutions. In
    Proc. Int. Symp. Recent Advances in Intrusion Detection (RAID), pages 98–115, 2008.
[14] J. Gustafsson, A. Betts, A. Ermedahl, and B. Lisper. The Mälardalen WCET benchmarks – past,
    present and future. In Proc. Int. Work. on Worst-Case Execution Time Analysis (WCET),
    pages 137–147, 2010.
[15] N. M. Hai, O. Mizuhito, and Q. T. Tho. Pushdown model generation of malware. Technical
    report, Japan Advanced Institute of Science and Technology, Japan, 2014.
[16] C. Jämthagen, P. Lantz, and M. Hell. A new instruction overlapping technique for
    anti-disassembly and obfuscation of x86 binaries. In Proc. Workshop on Anti-malware Testing
    Research (WATeR), 2013.
[17] M. G. Kang, P. Poosankam, and H. Yin. Renovo: a hidden code extractor for packed
    executables. In Proc. ACM Workshop on Recurring Malcode (WoRM), pages 46–53, 2007.
[18] J. Kinder. Static analysis of x86 executables. PhD thesis, Technische Universität
    Darmstadt, 2010.
[19] J. Kinder and D. Kravchenko. Alternating control flow reconstruction. In Proc. Int. Conf.
    Verification, Model Checking, and Abstract Interpretation (VMCAI), pages 267–282, 2012.
[20] J. Kinder, F. Zuleger, and H. Veith. An abstract interpretation-based framework for control
    flow reconstruction from binaries. In Proc. Int. Conf. Verification, Model Checking, and
    Abstract Interpretation (VMCAI), pages 214–228, 2009.
[21] N. Krishnamoorthy, S. Debray, and K. Fligg. Static detection of disassembly errors. In
    Proc. IEEE Working Conf. on Reverse Engineering (WCRE), pages 259–268, 2009.
[22] C. Kruegel, W. Robertson, F. Valeur, and G. Vigna. Static disassembly of obfuscated
    binaries. In Proc. USENIX Security Symposium, pages 255–270, Berkeley, CA, USA, 2004.
[23] C. Linn and S. Debray. Obfuscation of executable code to improve resistance to static
    disassembly. In Proc. ACM Conf. Comp. Communications Security (CCS), pages 290–299, 2003.
[24] C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, K. Hazelwood, and
    V. J. Reddi. Pin: Building customized program analysis tools with dynamic instrumentation.
    In Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI), 2005.
[25] L. Martignoni, M. Christodorescu, and S. Jha. Omniunpack: Fast, generic, and safe unpacking
    of malware. In Proc. Annual Computer Security Applications Conference (ACSAC), 2007.
[26] A. Moser, C. Kruegel, and E. Kirda. Limits of static analysis for malware detection. In
    Proc. Annual Computer Security Applications Conference (ACSAC), 2007.
[27] S. Nanda, W. Li, L.-C. Lam, and T.-C. Chiueh. Bird: binary interpretation using runtime
    disassembly. In Proc. Int. Symp. Code Generation and Optimization (CGO), 2006.
[28] M. D. Preda, R. Giacobazzi, S. Debray, K. Coogan, and G. Townsend. Modelling metamorphism
    by abstract interpretation. In Proc. Int. Static Analysis Symposium (SAS), pages 218–235,
    2010.
[29] T. W. Reps and G. Balakrishnan. Improved memory-access analysis for x86 executables. In
    Compiler Construction, volume 4959 of Lecture Notes in Computer Science, pages 16–35.
    Springer, 2008.
[30] P. Royal, M. Halpin, D. Dagon, R. Edmonds, and W. Lee. Polyunpack: Automating the
    hidden-code extraction of unpack-executing malware. In Proc. Annual Computer Security
    Applications Conference (ACSAC), pages 289–300, 2006.
[31] B. Schwarz, S. Debray, and G. Andrews. Disassembly of executable code revisited. In Proc.
    IEEE Working Conference on Reverse Engineering (WCRE), pages 45–, 2002.
[32] X. Ugarte-Pedrero, D. Balzarotti, I. Santos, and P. G. Bringas. SoK: Deep packer
    inspection: A longitudinal study of the complexity of run-time packers. In Proc. IEEE Symp.
    Security and Privacy (S&P), 2015.
[33] G. Vigna. Static disassembly and code analysis. In M. Christodorescu, S. Jha, D. Maughan,
    D. Song, and C. Wang, editors, Malware Detection, volume 27 of Advances in Information
    Security, pages 19–41. Springer US, 2007.
[34] B. Yadegari, B. Johannesmeyer, B. Whitely, and S. Debray. A generic approach to automatic
    deobfuscation of executable code. In Proc. IEEE Symp. Security and Privacy (S&P), 2015.
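
As a supplementary illustration of the layer construction of Sections 4.1 and 4.2, the following minimal Python sketch groups overlapping instructions into layers. It is not Algorithm 3: it is a simplified greedy first-fit grouping that only enforces the no-overlap property and does not check that a layer is connected. The names Insn, overlap and build_layers are hypothetical, and the instruction sizes are read off the byte row of the UPX example (79 07, 47, b9 57 48 f2 ae, 55, 57, 48, f2 ae).

```python
from typing import List, NamedTuple

class Insn(NamedTuple):
    addr: int   # address of the first byte
    size: int   # length in bytes
    text: str   # mnemonic, for display only

def overlap(a: Insn, b: Insn) -> bool:
    """Two distinct instructions overlap when they share at least one byte."""
    return a != b and a.addr < b.addr + b.size and b.addr < a.addr + a.size

def build_layers(insns: List[Insn]) -> List[List[Insn]]:
    """Greedy first-fit grouping (assumption); connectivity is NOT checked."""
    layers: List[List[Insn]] = []
    for insn in insns:
        for layer in layers:
            # Consistent addition: no overlap with anything already in the layer.
            if not any(overlap(insn, other) for other in layer):
                layer.append(insn)
                break
        else:
            # The instruction is misaligned with every existing layer.
            layers.append([insn])
    return layers

# Instructions of the UPX example, in the order a recursive traversal might
# find them (fall-through path first, then the jump target at 0xfb).
upx = [
    Insn(0xF2, 2, "jns +9"),
    Insn(0xF9, 1, "inc edi"),
    Insn(0xFA, 5, "mov ecx, 0xaef24857"),
    Insn(0xFF, 1, "push ebp"),
    Insn(0xFB, 1, "push edi"),
    Insn(0xFC, 1, "dec eax"),
    Insn(0xFD, 2, "repne scasb"),
]
layers = build_layers(upx)
for i, layer in enumerate(layers, 1):
    print(f"Layer {i}:", ", ".join(x.text for x in layer))
```

In this sketch push ebp at 0xff lands in the first layer only, whereas in the example both layers resynchronize on it; modelling shared instructions and connectivity would require the successor relation of Section 4.1.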