Precise System-Wide Concatic Malware Unpacking
David Korczynski
[email protected]
Department of Computer Science
University of Oxford
ABSTRACT

Run time packing is a common approach malware uses to obfuscate its payload, and automatic unpacking is, therefore, highly relevant. The problem has received much attention, and so far, solutions based on dynamic analysis have been the most successful. Nevertheless, existing solutions lack in several areas, both conceptually and architecturally, because they focus on a limited part of the unpacking problem. These limitations significantly impact their applicability, and current unpackers have, therefore, experienced limited adoption.

In this paper, we introduce a new tool, called Minerva, for effective automatic unpacking of malware samples. Minerva introduces a unified approach to precisely uncover execution waves in a packed malware sample and produce PE files that are well-suited for follow-up static analysis. At the core, Minerva deploys a novel information flow model of system-wide dynamically generated code, precise collection of API calls and a new approach for merging execution waves and API calls. Together, these novelties amplify the generality and precision of automatic unpacking and make the output of Minerva highly usable. We extensively evaluate Minerva against synthetic and real-world malware samples and show that our techniques significantly improve on several aspects compared to previous work.

KEYWORDS

Malware analysis, Malware unpacking, Reverse engineering, Program analysis

1 INTRODUCTION

Conceptually, run time packers encode a binary with obfuscation techniques such as compression and encryption to harden analysis of their code. This hardening significantly increases the effort needed to reverse engineer a given sample, whether manually or automatically, because it requires inverting the anti-analysis techniques used by the packer to understand the full capabilities of the malware. Run time packing is a highly effective anti-analysis technique, and estimates show more than 80% of malware samples come packed [15]. The combination of needing to unpack samples before proper analysis is feasible, and the fact that most malware samples come packed, makes it desirable to develop approaches that automatically unpack malware.

Techniques and tools for automatic unpacking of malware have received a lot of attention in the literature [10, 17, 22, 23, 28, 29, 33, 42]. Despite this large amount of research, the vast majority of work relies on the same core principle, the "write-then-execute" heuristic. This heuristic deploys the key observation that in order to execute the encrypted code, it first must be decrypted and, therefore, be dynamically generated. The most common approach in previous work is, therefore, to execute a given sample, monitor all memory writes made by the malware, and whenever dynamically written memory executes, the unpacker identifies this memory as decrypted. The tools then dump this specific memory to enable follow-up inspection.

There are two main limitations to the approach of existing work. First, the "write-then-execute" heuristic is not well-suited for packers that perform system-wide unpacking. This is because the heuristic only captures code that is dynamically generated explicitly by the malware and not malicious code that is dynamically generated via benign code, which, unfortunately, is frequently the case in multi-process unpacking. Consequently, existing unpackers are mainly suitable for single-process malware, and new approaches to capture system-wide malware unpacking are needed. Second, the primary output of existing work is memory dumps or naively constructed PE files of the dynamically generated code. The output lacks structure, is often an unreasonable over- or under-approximation of the actual malware code, and many obfuscation techniques from the packing process, e.g. obfuscation of external dependencies, remain in the output. As such, the analysis that follows must overcome these obfuscation techniques to enable meaningful analysis of the code. This is a problem because the purpose of unpacking is to facilitate follow-up analysis and not to give any conclusive answer about the malware itself.

The limitations described above reoccur in existing work, and we argue that an essential reason for this is that existing work widely uses the same set of benchmark applications to validate their solutions. These benchmark applications consist of packers that are out-dated and built more than a decade ago. Consequently, the empirical assessment of novel tools occurs with old, and often similar, techniques that do not accurately reflect the challenges posed by modern-day malware packers. To ensure that our novel tools are relevant, we need new benchmark applications that can be used for profiling novel unpackers. These benchmarks must explore corner-cases of modern packing techniques and be easily accessible to anti-malware researchers.

The goal of this paper is to develop techniques that overcome the limitations of existing work highlighted above. We present a unified approach to precisely unpack malware samples with system-wide execution, dynamically generated code, custom IAT loading and API call obfuscations. The aim is to provide unpacked code that is well-suited for follow-up analysis via manual reverse engineering or off-the-shelf static analysis tools. To this end, Minerva deploys a combination of dynamic and static analysis to amplify the effectiveness of automatic unpacking. The novel techniques presented in Minerva rely on information flow, which makes it highly precise and capable of unpacking malware samples in a system-wide context. Minerva models execution waves on a per-process basis, and each process with malware execution operates within the context of
a single execution wave at any given moment. This provides for a clear wave model and implementation, but may result in duplicate content amongst waves, for example, when execution waves use code from an earlier execution wave.

Minerva takes as input a 32-bit Windows binary and outputs at least one Portable Executable (PE) file per execution wave. This has the benefit of mostly independent PE files, but also means the duplicate content of multiple waves will exist in multiple PE files. In order to produce output that is useful for follow-up analysis, Minerva captures how the malware uses external dependencies throughout the entire execution and maps this to each execution wave, resulting in PE files with valid import address tables and patched API calls. Finally, Minerva also performs static analysis to identify relevant malware code within each execution wave. In addition to our unpacker, we also propose a new benchmark suite with applications that combine code-injection techniques, dynamically generated code and obfuscation of external dependencies to overcome the limitations of empirical evaluation in existing work. We demonstrate our unpacker empirically against synthetic and real-world malware samples.

The main contributions of this paper are as follows.
• We present a novel approach that combines dynamic and static analysis techniques to automatically unpack malware that executes across the entire system. The approach focuses on precise analysis and outputs unpacked samples that are well-suited for follow-up static analysis.
• We present a new benchmark suite with samples exploring modern-day packing behaviours. To the knowledge of the author, this is the first benchmark suite that comprises synthetic applications aimed at evaluating unpackers.
• We implement the techniques into Minerva and present an extensive empirical evaluation based on synthetic applications and real-world malware samples.

2 BACKGROUND, MOTIVATION AND OVERVIEW

Packing is an umbrella term that refers to a set of various concrete obfuscation techniques, and there is no clear definition of the specific obfuscation techniques it encapsulates. This section clarifies the obfuscation techniques we treat in this paper and the limitations of existing work that motivate us. In total, we have compiled six core limitations across two general obfuscation techniques.

In the write-then-execute model used by existing work, the code of each layer L_{i+1} consists of the memory written by the instructions in layer L_i.

Limitation 1.1: the write-then-execute heuristic is unable to capture dynamically generated malicious code via benign code. The strict relationship that instructions of one layer must be dynamically generated explicitly by instructions from a previous layer severely limits the generality of existing work. Malware that uses benign code to dynamically generate its malicious code goes unnoticed by this model. The implications of this limitation are substantial for capturing dynamically generated code across multiple processes by way of code-reuse attacks or OS-provided APIs, since it is not the instructions of the malicious code that do the writing of memory. Rather, it is benign code that is manipulated by the malware into writing dynamically generated malicious code.

Limitation 1.2: existing work unreasonably approximates relevant dynamically generated memory. Whenever an unpacker observes dynamically generated code, it outputs the code for follow-up analysis. To do this, the unpacker must have a definition of what parts of dynamically generated memory are relevant to the unpacked code. This is because not all memory that is dynamically generated, e.g. the stack, is relevant for the unpacked output. However, this step of identifying relevant memory is highly overlooked by previous work. For example, neither Renovo [23] nor EtherUnpack [10] clearly describe the specific memory they extract during unpacking, and Mutant-X [17] dumps the entire memory image of a process when observing dynamically generated code. These are unreasonably imprecise and leave follow-up analysis with the task of identifying a needle in a haystack.

Limitation 1.3: existing work outputs raw memory scattered across many memory dumps. The majority of existing unpackers [6, 10, 23, 42] make little effort to output the unpacked code in a coherent data structure but rather output the unpacked malware in the shape of raw memory dumps. The problem is that when malware dynamically generates code, this may be scattered across several regions, and some of these may also be data-only sections. A precise unpacker should not output incoherent raw memory regions, but rather a suitable data structure that combines these memory regions in an appropriate manner, e.g. re-basing where needed, that enables meaningful follow-up analysis.

¹ sha256 078a122a9401dd47a61369ac769d9e707d9e86bdf7ad91708510b9a4584e8d43
Figure 3: Architecture of Minerva's automatic unpacker. (The malware sample is executed under full-system recording; the recording is replayed with taint analysis to drive the malware tracing, the wave collector and the API hooks; a static analysis stage with a disassembler, IAT builder, PE builder and binary patcher then produces the output PE files.)

in this case contains the base-offset of a custom IAT by the malware. Furthermore, Figure 2 shows an example from an application packed with the PEtite packer (https://www.un4seen.com/petite/) where the code calls a Windows API function by pushing a value on top of the stack, rotating that value and then transferring execution via a ret instruction to the rotated value on top of the stack.

The output of existing unpackers is not capable of resolving obfuscated API calls in the unpacked code. This is a problem because it is much harder, and sometimes impossible, to determine the destination of the branch instructions in follow-up analysis than it is for the unpacker. For example, without knowledge about the contents of EBX, the data at the address being read and the process layout, it is impossible to determine the destination of the given branch instructions and whether they are API calls.

2.3 Solution overview

The goal of this paper is to develop system-wide, precise and general unpacking techniques. Specifically, our goal is to input a malware binary into our Minerva tool and output PE files that precisely capture the malware code post-decryption and decompression, and also capture how the malware uses external dependencies. The aim is to output PE files that are well-suited for follow-up analysis by off-the-shelf static analysis tools and manual investigation.

To achieve our goal, we must overcome the limitations highlighted above. First, to overcome the limitations when dealing with dynamically generated code, we need a solution that can (limitation 1.1) identify dynamically generated memory across the system; (limitation 1.2) extract precisely the memory that is relevant to the malware; and (limitation 1.3) combine the relevant dynamically generated code into meaningful and related structures. Second, to overcome the limitations against malware that obfuscates external dependencies, the solution must also (limitation 2.1) precisely capture the use of API calls within the malware code; (limitations 2.2 and 2.3) do this in the context of custom API resolution and obfuscated API calls; and, finally, map these observations to the output, so it is readily available for follow-up analysis.

The solution we come up with, and implement into Minerva, deploys a two-step approach following the architecture shown in Figure 3. First, we use the dynamic analysis in Minerva to precisely extract packed code and the API calls of the malware, and then we use static analysis to construct PE files based on the unpacked code. Specifically, the first step is to capture the malware execution trace using dynamic taint analysis in a similar fashion to Tartarus presented by Korczynski and Yin [29]. Then, we abstract the malware execution trace into execution waves based on information flow analysis such that an execution wave is a process-level construct that represents dynamically generated code in the malware. During the run time analysis, Minerva also ensures precise identification of API calls by the instructions in the malware execution trace. From the first step, we get a set of execution waves consisting of memory dumps, the malware execution trace of each wave and more. The second step Minerva performs is to group related memory within each execution wave using disassembly techniques. Minerva then converts each group of related dumps into a new PE file with a new import address table, and patches API calls based on static analysis and the API calls observed during dynamic analysis. In the following sections, we detail these steps.

3 SYSTEM-WIDE MALWARE TRACING

A key component of our system is the ability to trace the malware throughout the entire operating system using dynamic taint analysis. We implement the techniques of Tartarus [29] to do this. In order to make this paper self-contained, we briefly summarise the idea in this section; however, for a complete description of the approach we refer to [29].

3.1 Abstract model of execution environment

We define a formal environment in which we can reason about executions in a sandbox. The model we present is an extension of work from Dinaburg et al. [10]. We consider execution at the machine instruction level, and since an instruction can access memory and CPU registers directly, we consider a system state as the combination of memory contents and CPU registers. Let M be the set of all memory states and C be the set of all possible CPU register states. We denote all possible instructions as I, where each instruction can be considered a machine-recognisable combination of opcode and operands stored at a particular place in memory.

A program P is modelled as a tuple (M_P, ε_P) where M_P is the memory associated with the program and ε_P is an instruction in M_P which defines the entry point of the program. There are often many programs executing on a system, and each of these may communicate with each other through the underlying OS. As such, we model the execution environment E as the underlying OS and the other programs running on the system.

We define a transition function δ_E : I × M × C → I × M × C to represent the execution of an instruction in the environment E. It defines
how execution of an instruction updates the execution state and determines the next instruction to be executed. The trace of instructions obtained by executing program P in execution environment E is then defined to be the ordered set T(P, E) = (i_0, ..., i_l) where i_0 = ε_P and δ_E(i_k, M_k, C_k) = (i_{k+1}, M_{k+1}, C_{k+1}) for 0 ≤ k < l. We note here that the execution trace does not explicitly capture which instructions are part of the program, with the exception of i_0, but rather all the instructions executed on the system, including instructions in other processes and the kernel. For any two elements in the execution trace i_j ∈ T(P, E) and i_k ∈ T(P, E) we write i_j < i_k if j < k, i_j > i_k if j > k and otherwise i_j = i_k. We use this to define an ordering between the instructions of the sequence.

3.2 Malware execution trace

We now introduce the concept of the malware execution trace. Suppose P is a malware program and P_A is some malware tracer that aims to collect P's execution trace. Malware program P is interested in evading analysis and gaining privilege escalation by using code-reuse attacks and code injections. As such, the execution trace of the malware may contain instructions that are not members of program P's memory M_P.

To monitor the malware across the environment, the malware monitor P_A maintains a shadow memory that allows it to label the memory and the CPU registers. This shadow memory is updated for each instruction in the execution trace. Let S ⊆ M × C be the set of all possible shadow memories. We then define the propagation function δ_A : S × I → S to be the function that updates the shadow memory when an instruction executes. The list of shadow memories collected by the malware tracer is now defined as the ordered set S_{T_A}(T(P, E)) = (s_0, ..., s_l) where δ_A(s_k, i_k) = s_{k+1} for 0 ≤ k < l.

The job of the malware tracer is to determine for each instruction in the execution trace whether the instruction belongs to the malware or not. To do this, the analyser uses the predicate Λ_A : S × I → {true, false}. The malware execution trace is now given as the sequence of instructions for which Λ_A is true, and we call Λ_A the inclusion predicate. We define the malware execution trace formally as follows:

Definition 1. Let T(P, E) be an execution trace and P_A a malware tracer. The malware execution trace is the ordered set Π_A = (m_0, ..., m_d) where:
• Π_A is a subsequence of T(P, E);
• for 0 ≤ j ≤ d there exists v such that m_j = i_v ∧ Λ_A(s_v, i_v).

The above definition says that the malware execution trace is a subsequence (ordering is preserved) of the entire whole-system trace, and that for each instruction in the malware execution trace there is a corresponding instruction in the whole-system trace for which the inclusion predicate is true.

The malware execution trace gives us a definition we can use to reason about the properties of malware tracers. In particular, for a given malware tracer it highlights the propagation function, δ_A, and the inclusion predicate, Λ_A, as the defining parts. Having constructed our model of malware tracers and identified the key aspects that determine how they collect the execution trace, we now move on to present how Minerva precisely captures system-wide propagation.
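To make the roles of the propagation function δ_A and the inclusion predicate Λ_A concrete, the following sketch instantiates the model on a toy instruction stream. It is a minimal illustration under assumed, simplified types (an instruction is just an address and the addresses it writes, and the shadow memory is a set of labelled addresses); it is not Minerva's implementation, which operates on a full-system emulator and also labels CPU registers.

# A toy instantiation of the tracer model: the shadow memory labels tainted
# addresses, delta_A propagates labels on writes, and Lambda_A includes an
# instruction in the malware execution trace when its own address is labelled.
from dataclasses import dataclass

@dataclass
class Instr:
    addr: int              # address the instruction executes from
    writes: tuple = ()     # memory addresses it writes

def delta_A(shadow: set, i: Instr) -> set:
    """Propagation function: writes made by labelled code become labelled."""
    if i.addr in shadow:
        return shadow | set(i.writes)
    return shadow

def Lambda_A(shadow: set, i: Instr) -> bool:
    """Inclusion predicate: does the instruction belong to the malware trace?"""
    return i.addr in shadow

def malware_trace(trace, initial_shadow):
    shadow, result = set(initial_shadow), []
    for i in trace:
        if Lambda_A(shadow, i):
            result.append(i)
        shadow = delta_A(shadow, i)
    return result

# Example: the malware module occupies 0x1000-0x1002 and writes code to
# 0x5000, which later executes; the benign instruction at 0x2000 is excluded.
trace = [Instr(0x1000, (0x5000,)), Instr(0x2000), Instr(0x5000)]
assert [i.addr for i in malware_trace(trace, {0x1000, 0x1001, 0x1002})] == [0x1000, 0x5000]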
3.3 Tracing the malware execution

The goal is to capture malware execution throughout the whole system in a precise and general manner. The overall idea is to use dynamic taint analysis to mark the malware under analysis as tainted and then capture its system-wide execution by following how the taint propagates through the system.

Algorithm 1 gives an overview of our approach to capturing the malware execution trace. Assuming the first instruction executed on the system is the entry point of the malware, the first step (line 1) is to taint the memory making up the malware. In particular, we taint the entire malware module, including data and code sections. Next, execution continues until there is no more taint or a user-defined timeout occurs, and for each instruction executed we check if the memory making up the instruction is tainted (line 7). We include the instruction in the malware execution trace if the instruction is tainted (line 8). For each instruction in the malware execution trace we taint all the output of the instruction, so as to follow memory generated by the malware independently of the initial state of the malware memory, as shown by the Update algorithm in Algorithm 2 (lines 3-5).
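The per-instruction bookkeeping that this tracing relies on can be summarised in a few lines. The sketch below mirrors the Update step of Algorithm 2 on simplified data structures (sets of addresses instead of emulator state); the instruction fields (address, outputs, written addresses, pid) are assumptions made for illustration rather than Minerva's actual structures.

# Simplified Update step: when a tainted instruction executes, all of its
# outputs become tainted, and tainted writes are recorded per process so the
# wave collector can later detect execution of dynamically generated code.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Instr:
    addr: int              # address of the instruction
    pid: int               # process it executes in
    outputs: tuple = ()    # memory/register outputs it produces
    writes: tuple = ()     # subset of outputs that are memory writes

def update(i: Instr, tainted: set, tainted_writes: defaultdict):
    """Propagate taint for one executed instruction (cf. Algorithm 2)."""
    if i.addr in tainted:                  # instruction itself is tainted
        tainted.update(i.outputs)          # taint everything it produces
        for w in i.writes:                 # remember tainted memory writes,
            tainted_writes[i.pid].add(w)   # keyed by the writing process
    return tainted, tainted_writes

# Example: a tainted instruction in process 4 writes a byte at 0x40a000.
tainted, tw = {0x401000}, defaultdict(set)
update(Instr(0x401000, pid=4, outputs=(0x40a000,), writes=(0x40a000,)), tainted, tw)
assert 0x40a000 in tainted and 0x40a000 in tw[4]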
4 INFORMATION FLOW EXECUTION WAVES

Given the malware execution trace, the next step is to partition it into execution waves. The goal of execution waves is to capture dynamically generated malicious code independently of who wrote the code and on the basis that the generated code must originate from the malware. However, we consider execution waves to be more than just a sequence of instructions. The set of execution waves gives an explicit representation of an entire application, including dynamically generated malicious code, and each execution wave may, therefore, include both executable and non-executable data.

In this section we give a semantics for execution waves (Section 4.1) and describe how we collect the waves in practice (Section 4.2).

4.1 Execution wave semantics

The goal of our execution wave semantics is to clearly define the conversion of a malware sample's execution into waves of dynamically generated malicious code. As such, we describe the waves in relation to an execution trace T(P, E) as described in Section 3.1.

We partition the malware execution into waves on a process-level basis. We map every instruction in the malware execution trace i ∈ Π_A to a process P_y and a wave W_x within this process. We denote P_yW_x to mean wave x within process y, and every process with malicious code execution contains a sequence of waves P_y.Ω = P_yW_0, ..., P_yW_n with |P_y.Ω| ≥ 1. We denote the initial wave in which malware execution begins as P_εW_ε, and the set Φ_Π contains all execution waves for a given malware execution trace Π_A. For each instruction in the malware execution trace, we first identify the process in which it executes and then the wave it belongs to within its respective process.

Formally, we define an execution wave as follows.

Definition 2. An execution wave is a tuple composed of:
• a sequence of instructions I = i_0, ..., i_n executed in the given wave, where i_0 is the entry point of the wave;
• a shadow memory S, which is a set of ordered pairs (m_addr, m_byte) that contains the tainted memory making up the wave, including both code and data memory;
• the tainted writes T, which is a set of ordered pairs (t_addr, t_byte) that holds the tainted memory written by instructions in P since i_0, where P is the process of the execution wave.

Next, we present a definition that formalises our requirements for partitioning a complete execution trace into a set of execution waves. The purpose of this definition is to capture every layer of dynamically generated malicious code and not restrict a minimal overlap between the content of each execution wave. In the following, we write for two instructions i, j that i < j if i comes before j in the malware execution trace, and vice versa.

Definition 3. Let T(P, E) be an instruction execution trace and Π_A the corresponding malware execution trace. The set of execution waves is then given as Φ_Π = {P_0, ..., P_n} where:
• ∀i ∈ Π_A ∃P_x ∈ Φ_Π | i ∈ P_x.I.
• For any P_yW_x and P_yW_z in Φ_Π where x < z we have that ∀i_x ∈ P_yW_x.I, ∀i_z ∈ P_yW_z.I | i_x < i_z. This says that there is a strict ordering in the malware execution trace between the instructions of any two waves in a given process P_y.Ω.
• ∀(m_addr, m_byte) ∈ P_wW_{w0}.S ∃(t_addr, t_byte) ∈ P_tW_{t0}.T | (m_addr, m_byte) ∈ P_εW_ε.S ∨ (m_addr, m_byte) = (t_addr, t_byte), where P_wW_{w0} ≠ P_tW_{t0} and ∀i_w ∈ P_wW_{w0}.I ∃i_t ∈ P_tW_{t0}.I | i_t < i_w. This says that the shadow memory of every execution wave must either exist in the shadow memory of the initial wave or be composed of tainted memory written by a wave that started earlier.
• For any wave P_yW_x ∈ Φ_Π we have that ∀i ∈ P_yW_x.I ∃(m_addr, m_byte) ∈ P_yW_x.S | i[A] = m_addr. This says the memory of any instruction in each execution wave must be present in the shadow memory of the given wave.

An important aspect of Definition 3 is that the second bullet enforces a strict ordering between instructions in the set of execution waves for each process. The effect of this is that we preclude instructions from any given execution wave from being used in any other execution wave. The reason we do this is that it creates a clear history of execution wave progress within each process, and it becomes easier to implement since it is only necessary to maintain one execution wave per process. The drawback is that when malware transfers execution to code from an earlier execution wave, we include some content of the earlier execution wave in the current execution wave. In this way, we may end up with waves that overlap in their shadow memory, but, naturally, this can be stripped during post-processing. We have found this to be no major issue and that the trade-off works well in practice. However, we leave the door open and encourage future work on other, e.g. more refined, models.

4.2 Collecting the execution waves

In practice, we only associate one wave with a given process at any given moment. Therefore, to collect the execution waves, it is sufficient to keep track of the current wave in each process. We initially only have one wave, which is the wave inside the process executing the malicious application. The shadow memory S of this wave is the malware module as loaded into memory, and the set of tainted writes is initially the empty set, T = ∅. We then update the set of tainted writes whenever an instruction writes tainted memory, following our Update function shown in Algorithm 2.

ALGORITHM 1: Wave collection
Data: (input) Malware sample B
Result: Logged malware execution waves and malware execution trace Π.
1  P ← init_taint(B)
2  T, S ← init_waves(B)   // initialise the shadow memories and tainted writes
3  // Full system instrumentation
4  i ← first_instr()
5  while P ≠ ∅ do
6      // is the instruction tainted?
7      if i[A] ∈ P then
8          Π ← Π ⌢ ⟨i⟩    // append i to the malware execution trace
9          if i[A] ∉ S_pid then
10             if i[A] ∉ T_pid then
11                 S_pid ← S_pid ∪ {(i[A], i[mem])}
12                 W_pid ← W_pid ∪ {i}
13             else
14                 S_pid, T_pid, W_pid ← dump_wave()
15         else
16             if i[A] ∈ T_pid ∧ S_pid[i[A]] ≠ i[mem] then
17                 S_pid, T_pid, W_pid ← dump_wave()
18             else
19                 W_pid ← W_pid ∪ {i}
20     i, P, T ← update(i, P, T)
21 return Π

To capture execution waves, we monitor for each process the relationship between the currently executing instruction, the shadow memory and the set of tainted writes, following Algorithm 1. Specifically, for every tainted instruction in the malware execution trace, there are four possible cases:
(1) The address of the instruction is not in the shadow memory and not in the tainted writes (line 10 of Algorithm 1);
(2) The address of the instruction is not in the shadow memory but is in the tainted writes (line 13 of Algorithm 1);
(3) The address of the instruction is in the shadow memory and in the tainted writes, but the content of the shadow memory is not identical to the current instruction (line 16 of Algorithm 1);
(4) The address of the instruction is in the shadow memory and in the tainted writes, and the content of the shadow memory is equivalent to the memory of the current instruction (line 18 of Algorithm 1).

Case (1) happens in two scenarios. The first is when tainted memory is transferred across processes via shared memory. For example, if tainted memory is written to memory shared by processes P_1 and P_2 and the instructions performing the writing are in P_1, then the tainted writes will not be in P_2.T or P_2.S, because we only populate P_2.T if instructions executing in P_2 are writing to the address space of P_2. The second scenario is when code from the current wave transfers execution to code that is part of an earlier wave. This is because the shadow memory of each wave does not propagate to the succeeding wave, but the memory remains tainted nonetheless. Whenever we observe case (1), we add the memory of the instruction to the shadow memory of the current process and
also append the instruction to the sequence of instructions in the current wave. In this case, we update the shadow memory of the current wave with the executing instruction.

In case (4) the current instruction is simply part of the current execution wave, and this is by far the most common case. In this case, we append the instruction to the instruction sequence of the current wave. In cases (2) and (3) we consider the current instruction to be the entry point of a new execution wave. Specifically, in case (2) the instruction is dynamically generated in a new memory region, and in case (3) the instruction is dynamically generated on top of already existing malware code.

In the event of a new wave, we log information about the current wave following Algorithm 3. First, we log the instructions executed in the current wave, the tainted writes and the shadow memory, which includes dumping every page in which there is a tainted write and also dumping the shadow memory. Then we set the shadow memory of the next wave to be the tainted writes of the current wave and set the tainted writes to be the empty set.

ALGORITHM 2: Update
Data: (input) Instruction i, memory propagation set P, tainted writes T
Result: Next instruction i_next, memory propagation set P, tainted writes T
1  P ← propagate_taint(i, P)
2  i_next ← exec_instr(i)
3  if i[A] ∈ P then
4      for o ∈ i[O] do
5          P ← P ∪ {o}
6  for w ∈ i[W] do
7      if w ∈ P then T[i.pid] ← T[i.pid] ∪ {w}
8  return i_next, P, T

ALGORITHM 3: dump_wave
Data: (input) Current wave W, shadow memories S, tainted writes T
Result: Updated S, T, W
1  LogInstrs(W_pid)
2  LogTaint(T)
3  LogShadowMem(S)
4  S_pid ← T_pid
5  T_pid ← ∅
6  W_pid ← {i}
7  return S_pid, T_pid, W_pid

The execution waves capture dynamically generated code independently of who wrote the code, including dynamically generated malicious code via benign code. We achieve this generality because the shadow memory is composed of tainted memory, and tainted memory propagates through both benign and malicious instructions. Since the tainted code originates from the malware itself, it is dynamically generated malicious code. This property distinguishes our technique from previous work and allows it to be more general without losing precision.

The output from collecting the execution waves is the sequence of waves executed during the malware execution. For each execution wave, we have memory dumps of the tainted memory during its execution and the list of instructions that belong to the wave. As such, we have an explicit representation of each instruction in the malware execution in the form of its raw bytes, and we also have memory dumps of any non-executed malicious (tainted) memory. All of this information will then be used to reconstruct PE files that are effective for follow-up static analysis.
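The four cases above reduce to a small dispatch over two per-process sets. The sketch below illustrates that dispatch in isolation; the data structures (byte-level dictionaries for the shadow memory and tainted writes) and the per-process record are simplifications assumed for the example, not Minerva's implementation on top of the emulator.

# Per-process wave bookkeeping: "S" maps address -> byte for the current
# wave's shadow memory, "T" maps address -> byte for tainted writes since the
# wave started, "wave" is the list of executed instruction addresses, and
# "logged_waves" collects finished waves.
def step_wave(proc, instr_addr, instr_byte):
    """Classify one tainted instruction into the four cases of Section 4.2."""
    S, T, wave = proc["S"], proc["T"], proc["wave"]
    if instr_addr not in S:
        if instr_addr not in T:
            # Case (1): e.g. shared memory across processes, or execution of
            # code belonging to an earlier wave; extend the current wave.
            S[instr_addr] = instr_byte
            wave.append(instr_addr)
        else:
            # Case (2): execution of code generated in a new memory region.
            start_new_wave(proc, instr_addr, instr_byte)
    elif instr_addr in T and S[instr_addr] != instr_byte:
        # Case (3): code was regenerated on top of existing malware code.
        start_new_wave(proc, instr_addr, instr_byte)
    else:
        # Case (4): the instruction is simply part of the current wave.
        wave.append(instr_addr)

def start_new_wave(proc, entry_addr, entry_byte):
    """Log the finished wave and seed the next one from the tainted writes."""
    proc["logged_waves"].append((proc["wave"], dict(proc["S"])))
    proc["S"] = dict(proc["T"])       # tainted writes become the new shadow memory
    proc["S"].setdefault(entry_addr, entry_byte)
    proc["T"] = {}
    proc["wave"] = [entry_addr]

# Example: execution inside the initial wave, then a jump to freshly written code.
proc = {"S": {0x1000: 0x90}, "T": {}, "wave": [], "logged_waves": []}
step_wave(proc, 0x1000, 0x90)         # case (4): runs inside the initial wave
proc["T"][0x5000] = 0xcc              # a tainted write recorded by update()
step_wave(proc, 0x5000, 0xcc)         # case (2): a new wave begins at 0x5000
assert len(proc["logged_waves"]) == 1 and proc["wave"] == [0x5000]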
5 PRECISE DEPENDENCY CAPTURE

In order for the PE files to be useful for follow-up static analysis, they must show how the malware uses external dependencies. As described in Section 2.2, we must consider custom API call resolution and obfuscated API calls. To this end, we capture the destination of every branch instruction in the malware execution trace and check if it corresponds to the beginning of a function in an external module.

To collect the addresses of functions in each process with malware execution, we iterate the export table of every module in the given process and capture the address of every function it exports. We put these functions in a per-process map that pairs function addresses with their respective function names. Minerva also comes with the possibility to speed up this process using pre-calculated function offsets for a given DLL. As such, with pre-calculated offsets, we only need to know the base address of a given imported module inside the malware process to compute the absolute addresses of its exported functions.

To capture the API functions that the malware calls, we obtain the destination of every branching instruction in the malware execution trace. If the branch destination is in the set of functions exported by any of the dynamically loaded modules within the execution trace, it means the malware performs an API call. We log every API call and, for some functions, the parameters as well. For many functions in the Windows API, the return value is also essential to understand the semantics of the call. To capture the return value and output parameters, we note the return address of the API call on the stack and read the output of the function whenever the return address executes. We also monitor functions like LoadLibrary to update our export table when processes load new modules.

Our approach to monitoring API calls precisely captures the API calls performed by instructions in the malware execution trace and does not capture API calls performed by benign code inside a process in which the malware executes. Furthermore, because we know the specific malicious instructions of each execution wave, it is trivial to map API calls to execution waves. This precise mapping highly improves the precision of the analysis in comparison to sandboxes that capture API calls globally within a process, since many of these calls are irrelevant to the malware (this is particularly true in code-injected processes).

Minerva currently does not take any measures when malware hides API usage by way of stolen bytes or copying of the Windows code. Furthermore, if malware deploys inlined library code or statically linked libraries, then Minerva will not consider these as external dependencies. This is a limitation we discuss further in Section 8.
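The export-map lookup at the heart of this step can be expressed compactly. The sketch below assumes the export tables have already been parsed into per-module lists of (offset, name) pairs, for example from precomputed offsets as described above; it only illustrates mapping branch destinations to API names and is not tied to Minerva's hook implementation. The module bases and offsets in the example are made up.

# Build a per-process map from absolute function addresses to API names and
# use it to resolve the destinations of tainted branch instructions.
def build_export_map(loaded_modules):
    """loaded_modules: list of (module_name, base_address, [(offset, func_name), ...])."""
    exports = {}
    for module, base, funcs in loaded_modules:
        for offset, name in funcs:
            exports[base + offset] = f"{module}!{name}"
    return exports

def resolve_api_calls(branches, exports):
    """branches: list of (wave_id, branch_destination) taken from the malware trace."""
    calls = []
    for wave_id, dest in branches:
        if dest in exports:                    # destination is an exported function
            calls.append((wave_id, exports[dest]))
    return calls

modules = [("kernel32.dll", 0x75b00000, [(0x14a30, "LoadLibraryA")]),
           ("user32.dll",   0x76190000, [(0x7e680, "MessageBoxA")])]
exports = build_export_map(modules)
print(resolve_api_calls([(2, 0x75b14a30), (2, 0x00401337)], exports))
# only the first branch resolves: [(2, 'kernel32.dll!LoadLibraryA')]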
6 STATIC RECONSTRUCTION OF EXECUTION WAVES

After collecting the execution waves and external dependencies, we need to combine these into PE files. For each execution wave we construct a set of PE files based on the content of their respective shadow memories, and for each PE file we need three ingredients: (1)
the specific memory pages of an execution wave that make up the PE file; (2) the PE's IAT; and (3) the entry point of the file.

The static analysis component of Minerva performs three main steps. First, it groups related memory dumps of each execution wave, then it identifies external dependencies in each of these memory groups and, finally, it builds new PE files based on the results of the two previous steps.

6.1 Merging over-approximated shadow memories

The output from collecting the execution waves includes, for each execution wave, page-level memory dumps of the shadow memory and tainted writes. Intuitively, it can seem appropriate to convert all of these memory dumps into one large PE file and use this for static analysis. However, we have found this to be imprecise in practice because it is rarely the case that all of the memory dumps are relevant to the malware. On the one hand, the shadow memory is a conservative approximation, as we capture some memory that is not executable code, and some memory is a result of over-propagated taint. On the other hand, we do not want to reconstruct PE files purely based on executed memory, since this will miss non-executed, yet still malicious, executable code, and also relevant data sections.

To avoid this imprecision, we divide the page-level memory dumps from the dynamic analysis into smaller groups, such that the pages of each group are related and no page in a given partition relates to any other partition. The goal of this is to capture the parts of the malware that are self-contained and represent the application timelessly. To this end, we create a PE file with multiple sections for each partition. Figure 4 shows an example of how we select the specific tainted pages that are relevant for the unpacked malware from a set of tainted pages output by Minerva's dynamic analysis component.

The first step is to identify the tainted pages with malicious code execution. To do this, we first iterate the sequence of instructions executed in a given execution wave and collect all pages that hold instructions from this sequence. Following this, we iteratively collect neighbouring pages until there are no more neighbouring pages, and the result is a set of page-level intervals where some pages hold executed code and other pages neighbour up to these. This corresponds to the first two steps in Figure 4.

Following this, we identify pages in the shadow memory that relate to each interval. To construct self-contained PE files, we capture data-dependencies and control-dependencies to other pages in the shadow memory for each interval. We do this by performing speculative disassembly on each memory dump to capture cross-references to other memory dumps. This step gives us cross-references for each interval, and we then iteratively merge related intervals such that no interval will have cross-references to other intervals. Following this approach, we end up with a set of groups of memory dumps, and we create a PE file for each of these groups. In the example in Figure 4 we end up with one group consisting of two intervals and will, therefore, create one PE file.
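The iterative merging by cross-references behaves like computing connected components over the page-level intervals. The following sketch shows that grouping step on its own, assuming the cross-references between intervals have already been extracted by speculative disassembly; it is an illustration of the idea rather than Minerva's code.

# Group page-level intervals into connected components based on which
# intervals cross-reference each other (union-find over interval indices).
def group_intervals(intervals, xrefs):
    """intervals: list of (start, end) pairs; xrefs: set of (i, j) index pairs."""
    parent = list(range(len(intervals)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path compression
            x = parent[x]
        return x

    for i, j in xrefs:                       # merge intervals that reference each other
        parent[find(i)] = find(j)

    groups = {}
    for idx, interval in enumerate(intervals):
        groups.setdefault(find(idx), []).append(interval)
    return list(groups.values())

# Example mirroring Figure 4: the interval at 0x5300000 references the one at
# 0x6200000, while another tainted interval stays in its own group.
intervals = [(0x5300000, 0x5303000), (0x6200000, 0x6201000), (0x7551000, 0x7557000)]
assert group_intervals(intervals, {(0, 1)}) == [
    [(0x5300000, 0x5303000), (0x6200000, 0x6201000)],
    [(0x7551000, 0x7557000)],
]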
6.2 Dependency reconstruction

To reconstruct external dependencies in our PE files, we need to rebuild the IAT of the binary and patch instructions to rely on this new IAT.

To construct the IAT, we first identify API calls made by instructions belonging to the pages of each memory group. We identify these by matching the API hooks collected during dynamic analysis to instructions of the respective code wave and the pages of the given memory group. We include each unique API function in the IAT of the reconstructed PE file.

Although we know which instructions branch to external APIs from the malware execution trace, the branch destinations may not be visible from the memory dumps themselves. The final step in constructing PE files is, therefore, to map the instructions that perform API calls to our newly generated IAT by patching them on the binary level. Unfortunately, binary patching is not an easy task, since some instructions may require us to rearrange the instructions in the binary, and this may subsequently break it. In practice, we patch branch instructions that are 6 bytes long, e.g. call [0xdeadbeef], because we can do this without rearranging instructions. We do not patch instructions that are less than 6 bytes, e.g. call eax. However, we still keep the cross-references so they can be used in more abstract representations in a follow-up analysis.
relate to each interval. To construct self-contained PE files, we cap-
ture data-dependencies and control-dependencies to other pages
in the shadow memory for each interval. We do this by perform- 7 EVALUATION
ing speculative disassembly on each memory dump to capture Having presented the core techniques of Minerva, we now move
cross-references to other memory dumps. This step gives us cross- on to evaluate Minerva using multiple benchmarks with respect to
references for each interval, and we then iteratively merge related the following research questions:
intervals such that no interval will have cross-references to other
intervals. Following this approach, we end up with a set of groups (1) Does Minerva precisely capture dynamically generated
of memory dumps, and we create a PE file for each of these groups. code and the malware’s API calls?
In the example in Figure 4 we end up with one group consisting of (2) Does Minerva improve results over previous work?
two intervals and will, therefore, create one PE file. (3) Is Minerva relevant for common malware analysis tasks?
8
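The section layout follows directly from these intervals: each interval becomes one section that keeps its dumped virtual address, while the raw data is packed sequentially in the file after the headers and the rebuilt IAT. The sketch below computes such section-header fields as plain dictionaries; the 0x200 file alignment and the assumption of a zero image base (so that VirtualAddress equals the dumped base address) are simplifications for illustration, and the actual PE emission, e.g. via a library such as pefile or LIEF, is omitted.

# Compute section-header fields for one group of dumped memory intervals.
FILE_ALIGNMENT = 0x200                       # assumed file alignment

def align(value, alignment):
    return (value + alignment - 1) // alignment * alignment

def build_sections(intervals, first_raw_offset):
    """intervals: list of (virtual_start, virtual_end) pairs of dumped memory."""
    sections, raw_offset = [], align(first_raw_offset, FILE_ALIGNMENT)
    for idx, (start, end) in enumerate(intervals):
        size = end - start
        sections.append({
            "Name": ".wave%d" % idx,
            "VirtualAddress": start,         # base address as dumped (image base assumed 0)
            "VirtualSize": size,
            "PointerToRawData": raw_offset,  # where the dump is placed in the file
            "SizeOfRawData": align(size, FILE_ALIGNMENT),
        })
        raw_offset += align(size, FILE_ALIGNMENT)
    return sections

# Example with the two intervals from Figure 4.
for section in build_sections([(0x5300000, 0x5303000), (0x6200000, 0x6201000)], 0x600):
    print(section)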
Figure 4: The process of identifying which tainted pages from dynamic analysis are relevant when reconstructing unpacked PE files. The steps are: collect pages with malicious execution; gather neighbouring pages; and perform speculative disassembly to collect pages by cross-referencing. In the example, the reconstructed PE file has two sections (0x5300000-0x5303000 and 0x6200000-0x6201000).

7 EVALUATION

Having presented the core techniques of Minerva, we now move on to evaluate Minerva using multiple benchmarks with respect to the following research questions:
(1) Does Minerva precisely capture dynamically generated code and the malware's API calls?
(2) Does Minerva improve results over previous work?
(3) Is Minerva relevant for common malware analysis tasks?
To facilitate the research questions above, we gather four sets of benchmark applications comprising synthetic applications as well as real-world malware applications:

(1) Benchmark #1: Ground truth data set. We develop a new benchmark suite that combines the use of dynamically generated code, code injection and obfuscation of external dependencies. In total, we have developed nine different applications, and they are all described in Table 2. The applications in the benchmark suite represent many of the challenges posed by real-world packers. To the knowledge of the authors, this is the first dedicated benchmark suite for challenging the attributes of unpackers where none of the samples relies on packers developed by third-party teams. The benefit of this benchmark suite is that each sample poses specific challenges that are clearly defined, the applications are easy to understand, and we have the complete source code of each example. As such, it becomes much more accessible to determine if an unpacker is successful because there is no need to reverse engineer large amounts of binary code.
(2) Benchmark #2: Selected malware samples. The second data set corresponds to several malware samples from the families CryptoWall, Tinba, Gapz and Ramnit. These samples perform many of the obfuscation techniques that Minerva aims to overcome, such as code injection combined with dynamically generated code and custom API resolution.
(3) Benchmark #3: Packed synthetic samples. We have taken a set of synthetic samples and packed them with well-known packers. In these applications, we know the applications' behaviours before packing because we design the applications; however, we do not know the exact changes the packers make to the code and, therefore, do not have ground truth about the packed applications.
(4) Benchmark #4: Real-world malware samples. This set comprises 119 malware samples from the real-world malware families listed in Table 1. We collected seven samples from each family to maintain a balanced data set, and the samples were collected from VirusTotal. In order to ensure the samples are indeed malicious, we required each sample to be detected by at least 15 anti-malware vendors. Furthermore, in order to ensure certainty that the samples belong to their respective families, we required at least two vendors to label them in the same family. On average, each sample had 52 anti-malware vendors report it as malicious, with a median of 54. We recorded each of these samples for 25 seconds and set a max replay time of 120 minutes.

Artemis     CTBLocker   Cerber       CoinMiner
CosmicDuke  Emotet      Kovter       Madangel
Mira        Natas       Nymaim       Pony
Shifu       Simda       TinyBanker   Urausy
Zbot
Table 1: The malware families in benchmark set #4. We collected a total of seven samples from each family.

In order to assess the techniques of Minerva, we must make a fair and meaningful comparison to existing work. One approach is to compare Minerva to recently proposed unpackers like Codisasm [6] or Aranchino [34]. However, we already showed in [29] that Codisasm is very limited due to its implementation in PIN, and Aranchino is also developed on top of PIN with no additional effort for analysing system-wide malware. Instead, we compare Minerva to the unpacker by Ugarte et al. [42].

Ugarte et al. [42] propose a malware unpacker, implemented on top of QEMU, that is capable of analysing multi-process malware. The unpacker supports multi-process unpacking by monitoring various system calls and also develops techniques for capturing memory mappings. The tool they present is only available as a web service (www.packerinspector.com), which forces us to treat their system as a black box. Furthermore, they do not mention which OS they support in their work; however, from experimenting with the service, we conclude the analysis environment is Windows XP. We determined this because the web service responds with "Error - The sample did not start executing." when faced with applications compiled for Windows 7 and later,
ID  Description
D1: Dynamically generates code and uses custom IAT resolution to resolve GetModuleHandle, GetProcAddress and ExitProcess, and exits.
D2: Dynamically generates code and uses custom IAT resolution to resolve GetModuleHandle, GetProcAddress and MessageBoxA, and then displays a message box.
D3: Dynamically generates code that further dynamically generates code and then uses custom IAT resolution to resolve GetModuleHandle, GetProcAddress and MessageBoxA, and then displays a message box.
D4: Dynamically generates code that further dynamically generates code and then uses custom IAT resolution to resolve GetModuleHandle, GetProcAddress and ExitProcess, and then exits.
C1: Opens the Windows process explorer.exe using OpenProcess, WriteProcessMemory and CreateRemoteThread, then inside the target process dynamically resolves the address of GetModuleHandle, GetProcAddress and ExitProcess, and calls each of them to exit.
C2: Opens the Windows process explorer.exe using OpenProcess, WriteProcessMemory and QueueUserAPC, then inside the target process dynamically resolves the address of GetModuleHandle, GetProcAddress and ExitProcess, and calls each of them to exit.
C4: Uses the PowerLoaderEx injection that relies on a global memory buffer and code-reuse attacks to hijack execution of explorer.exe. Inside explorer.exe, code-reuse attacks transfer execution to shellcode that calls LoadLibraryA.
C5: Uses the AtomBombing injection technique that relies on the global atom tables to execute within explorer.exe. Inside explorer.exe it uses code-reuse attacks to execute a piece of shellcode that launches calc.exe.
M1: Injects code into explorer.exe similarly to A1, then inside the target process dynamically generates code that then dynamically resolves the address of GetModuleHandle, GetProcAddress and ExitProcess, and calls each of them to exit.
Table 2: Description of the samples in data set #1 and how they perform code injection.

but runs normally with Windows XP applications. As such, we wrote the samples in our data set to make sure they all execute correctly on both Windows XP and Windows 7. We will refer to the unpacker by Ugarte et al. [42] as PackerInspector.

7.1 Implementation

Minerva is built on top of PANDA [11], which is a dynamic analysis framework based on full-system emulation that utilises a record-and-replay infrastructure. All of the code on top of PANDA is built in C/C++, and the majority of our tools that process the output of the sandbox are in Python. Most of the code in Minerva's dynamic analysis is on top of PANDA; however, we have had to modify the main taint analysis plugin that comes with PANDA to be less resource intensive. Specifically, PANDA's taint2 plugin can quickly use 40+ GB of memory, and to limit this, we removed support for taint labels and made some data structures more simplistic.

7.2 Experimental set up

We conduct all of our Minerva experiments on a 4-core Intel i7 CPU at 4.2 GHz with a Windows 7, 32-bit guest architecture. The guest is in a closed network and connected to another virtual machine that performs network simulation using INetSim [18]. As such, malware samples that connect back to some C&C server will be able to resolve DNS names, connect to every IP and also receive content. However, the content itself is the default data provided by INetSim. We executed the applications on the guest machine with a local admin account and User Account Control (UAC) enabled. We perform no user stimulation during the analysis, and there were no applications running apart from the generic Windows processes in the guest machine itself.

7.3 Empirical evaluation of correctness

In our first experiment, we match Minerva and PackerInspector with the ground-truth samples in benchmark set #1. For each of the samples, we capture the number of execution waves, the number of processes involved in the execution, the number of API calls observed from the last wave of each sample and the number of functions in the IAT of the unpacker's output. We match the results from the output of Minerva and PackerInspector with our ground truth data, and Table 3 shows our results.

For the samples that execute in a single process, both Minerva and PackerInspector capture the number of processes and waves accurately. In two of these four samples, Minerva captures the five expected API calls accurately, and in the other two Minerva captures slightly more than the expected number. PackerInspector, however, attributes about 200x more API calls than the expected number to the final wave of the execution. Furthermore, Minerva builds the IAT for two of the samples accurately and a slightly larger IAT for the other two. PackerInspector is unable to produce any output with an IAT, and there is no sign of API usage in the output of PackerInspector. The reason Minerva captures slightly more API calls than expected is that the compiler, naturally, adds various function calls around the source code. PackerInspector successfully identifies the correct number of execution waves but fails to attribute API calls accurately to unpacked code and also fails to produce any output with an IAT. Minerva, however, succeeds at both.

For the samples that perform multi-process execution, we observe that Minerva captures all processes, execution waves and API calls, and rebuilds PE files with the expected IAT. The reason Minerva does not capture slightly more API calls than the expected amount in these samples is that the final wave occurs within an injected process and does not contain the added functions from the compiler. Surprisingly, PackerInspector fails to detect multi-process execution in any of the samples, and we suspect this is because PackerInspector only monitors for multi-process execution via memory-mapped files, which none of the samples uses. From our multi-process samples, we observe the limitations of the original write-then-execute heuristic, in that it is unable to handle system-wide unpacking in a general and precise manner. However, the novel techniques introduced by Minerva are successful at this.

When matched with our ground-truth samples, it is clear that PackerInspector over-approximates the API usage of the applications, is unable to output unpacked code that shows API usage when faced with obfuscations of external dependencies and under-approximates the system-wide malware execution. These observations verify our hypothesis that state-of-the-art unpackers are unable to deal with many challenges posed by system-wide packing and that the techniques in Minerva overcome these limitations.

7.4 Empirical evaluation against selected malware

In our second experiment, we match Minerva and PackerInspector with the malware samples in data set #2. The goal of this experiment
Sample    #Procs   #Waves   #API calls in final wave   #IAT size
(1) D1    1,1,1    2,2,2    5,8,1007                   3,5,0
(1) D2    1,1,1    2,2,2    5,5,1007                   3,3,0
(1) D3    1,1,1    3,3,3    5,5,1006                   3,3,0
(1) D4    1,1,1    3,3,3    5,10,1007                  3,6,0
(1) C1    2,2,1    2,2,1    6,6,†                      3,3,†
(1) C2    2,2,1    2,2,1    6,6,†                      3,3,†
(1) C4    2,2,1    2,2,1    1,1,†                      1,1,†
(1) C5    2,2,1    2,2,1    5,5,†                      3,3,†
(1) M1    2,2,1    3,3,1    6,6,†                      3,3,†
Table 3: The evaluation results from matching Minerva and PackerInspector with the ground-truth samples of data set #1. Values are given as (Ground Truth, Minerva, PackerInspector). † means not available because PackerInspector failed to reach the last wave.

Sample        Procs          Waves   API calls             Unique APIs
CryptoWall⁴   5(†2), 4(†1)   8,4     7371, 21050           148, 354
CryptoWall⁵   3, 4(†1)       6,4     9945, 23580           135, 388
Tinba⁶        3,2            4,3     557, 34076            54, 477
Tinba⁷        3,2            4,3     667, 49260            55, 549
Tinba⁸        3,2            4,3     704, 49262            55, 550
Gapz⁹         2,1            7,5     36509156, 15850000    140, 336
Gapz¹⁰        2,1            4,3     36504908, 15845670    125, 226
Gapz¹¹        3,1            5,2     36506063, 15844113    186, 251
Ramnit¹²      3, 4(†1)       8,5     6908, 56720           116, 479
Ramnit¹³      12(†1), 5      30,6    16185, 209828         153, 489
Ramnit¹⁴      3, 5(†1)       8,8     3189, 115943          115, 621
Table 4: The evaluation results from matching Minerva and PackerInspector with the malware samples of data set #2. Values are given as (Minerva, PackerInspector). † indicates the number of processes we determined to be false positives.
1 2 3 4-5 6-11 7.5 Relevance on malware
66% 15% 9% 5% 5% In this experiment, we match Minerva with benchmark set #4.
Table 5: Number of process executions per malware sample. In total we run 119 samples through Minerva and collect (1) the
number of process executions; (2) the number of waves; (3) the
1 2 3 4 5 5¡ number of generated PE files; (4) the number of imports in the IAT
52% 18% 7% 9% 2% 12% of each PE file and (5) the number of sections in each PE file.
Table 6: Number of waves per malware sample. Table 5 shows the number of processes and Table 6 the number
of waves in our data set. We find that a third of the samples per-
1 2 3 4 5 5¡ form multi-process execution and that roughly half have multiple
51% 17% 8% 3% 8% 13% execution waves, which means that a large part of all the samples
Table 7: Number of PE files constructed per malware sample. with single-process execution have multi-wave execution.
Table 7 shows the distribution of reconstructed PE files. We
construct more PE files than the number of captured waves, which
1 2 3-5 5¡
shows that some waves contain several regions that are non-related.
56% 15% 18% 11%
Finally, Table 8 shows the number of sections reconstructed in each
Table 8: Number of Sections reconstructed per PE file. PE file and Table 9 shows the number of reconstructed imports.
For roughly 20% of the PE files, we do not monitor any API calls
0 1-10 11-15 16-20 21-25 26 ¡ in the code, and this is due to some PE files being a result of small
18% 28% 7% 5% 2% 40% amounts of taint in minor code regions.
Table 9: Number of imports per PE file.
7.6 Relevance on packers
In this experiment, we show that Minerva is relevant against publicly available packers from benchmark set #3. This experiment is common practice for unpacking engines [6, 33, 38, 42] and, therefore, natural for us to perform. We construct a simple application that will get the name of the current user and report back to us so we can verify that the behaviour occurred correctly. We pack this application with 13 publicly known packers and analyse the samples in Minerva.

We show the results of our experiment in Table 10. The table shows the number of processes, waves and PE files, whether we found the original code or a derivative thereof, and whether we observed the original behaviour. For most of the packers, Minerva produced PE files that are very similar to the original code, including correct API calls. In general, these packers are rather simple in comparison to some of the techniques we observe in malware from the wild. For example, all but one of the packers are single-process packers. This makes sense since the packers are not necessarily meant to be used by malicious software, but may be used by benign applications, which are not meant to inject into other applications. Furthermore, many of these packers rely on similar approaches for compression, e.g. the Lempel–Ziv–Markov chain algorithm, and the majority of the packers used in these experiments are rather old.
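For illustration, the test application described at the start of this experiment only needs to obtain the current user name and expose it through an observable channel. The following is a minimal sketch under that assumption; the reporting mechanism shown here (printing to standard output) is a placeholder and not necessarily the channel used in our experiments.

    /* Minimal test payload: obtain the current user name and report it,
       so correct behaviour of the packed binary can be verified.
       Build as a console application and link with advapi32.lib. */
    #include <windows.h>
    #include <stdio.h>

    int main(void) {
        char user[257];              /* UNLEN + 1 */
        DWORD len = sizeof(user);
        if (!GetUserNameA(user, &len)) {
            return 1;
        }
        /* Placeholder reporting channel: any observable sink (file,
           pipe or network beacon) works for verification. */
        printf("current user: %s\n", user);
        return 0;
    }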
7.7 Tinba case study
We now investigate in depth a case study of a real-world malware sample from the Tinba malware family15. Minerva outputs four PE files with sizes 12KB, 12KB, 16KB and 24KB, respectively. We manually reverse engineered the sample to fully understand the system-wide propagation and where the sample exposes its unpacked code. The malware first decrypts memory from its data section and then transfers execution to this code. The decrypted code injects code into the Windows process Winver.exe and from Winver.exe it further injects into explorer.exe.

10 md5sum 0ed4a5e1b9b3e374f1f343250f527167
11 md5sum e5b9295e0b147501f47e2fcba93deb6c
12 md5sum 448ce1c565c4378b310fa25b4ae3b17f
13 md5sum 33cd65ebd943a41a3b65fa1ccfce067c
14 md5sum 3bb86e6920614ed9ac5d8fbf480eb437
15 md5sum 08ab7f68c6b3a4a2a745cc244d41d213
Packer      #proc  #wave  #PE  U  OB
BoxedApp    1      1      1    Y  Y
Enigma      2      11     1    Y  Y
FSG packed  1      2      2    Y  Y
mew11       1      3      4    Y  Y
MoleBox     1      4      9    Y  Y
mpress      1      2      2    Y  Y
PackMan     1      2      2    Y  Y
PECompact   1      4      4    Y  Y
PEtite      1      4      4    Y  Y
tElock      0      0      0    N  N
UPX         1      2      2    Y  Y
WinUpack    1      2      w    Y  Y
XComp       1      2      2    Y  Y
Table 10: The results from matching Minerva with known packers. OB indicates if we observed the original behaviour of the packed application. U indicates if we found the original code in Minerva's output.

[Figure 5: The average number and standard deviation of instructions replayed relative to time, for instruction counts where we have more than 3 samples executing the given number of instructions. The plot shows replay time in seconds against instructions replayed (·10^9) for the configurations "LLVM + taint + Minerva" and "LLVM + taint".]

To inject code into Winver.exe, Tinba launches a new instance of Winver.exe in a suspended state. Then, Tinba allocates memory on the heap of the newly started Winver.exe and copies some malicious code into this specific memory. Tinba then overwrites six bytes of the Start function in Winver.exe with the instructions push ADDR; ret, where ADDR is some address inside the dynamically generated malicious code. Effectively, Tinba ensures execution of its malicious code in Winver.exe by overwriting an initial function in Winver.exe to hijack execution.

Minerva captures one execution wave and outputs two unpacked PE files for the code in Winver.exe, one PE file based on a single execution wave in explorer.exe and also one PE file from the unpacked code in the initial process. The PE files produced by Minerva have 0, 11, 12 and 26 imports reconstructed. These results capture the execution perfectly because the PE file with 0 imports is purely the push ADDR; ret instructions of the malware execution trace in Winver.exe and the rest of the PE files contain various other stages with more payload content.

Minerva correctly identifies the malicious code, both the patched code of Winver.exe and also the code on the heap that contains the core of the malware code. More importantly, the PE files precisely capture the malware execution, and from the execution trace output by Minerva, we can see the exact instructions push ADDR; ret. Minerva also catches the exact malware code inside of the explorer.exe process. The PE file captured from the second execution wave in the original malware process contains 11 imports in its reconstructed IAT, five of which are ResumeThread, CreateProcessA, WriteProcessMemory, VirtualAllocEx and VirtualProtectEx. A novice analyst can quickly determine that the execution performs a code injection based on these API calls.
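To make the pattern behind these imports concrete for an analyst, the following is a hedged sketch of the suspended-process injection sequence described above. It is not code recovered from Tinba: the payload bytes, the target path and the patched address are placeholders, and the sketch assumes a 32-bit target process.

    #include <windows.h>
    #include <string.h>

    /* Illustrative sketch of the injection pattern: start winver.exe
       suspended, copy a payload into it, overwrite six bytes of an
       early-running function with "push ADDR; ret", and resume. */
    static unsigned char PAYLOAD[] = { 0x90 };     /* placeholder payload */

    int injection_pattern_sketch(void) {
        STARTUPINFOA si = { sizeof(si) };
        PROCESS_INFORMATION pi;
        if (!CreateProcessA("C:\\Windows\\System32\\winver.exe", NULL, NULL,
                            NULL, FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi))
            return 0;

        /* Allocate remote memory and copy the payload into it. */
        LPVOID remote = VirtualAllocEx(pi.hProcess, NULL, sizeof(PAYLOAD),
                                       MEM_COMMIT | MEM_RESERVE,
                                       PAGE_EXECUTE_READWRITE);
        WriteProcessMemory(pi.hProcess, remote, PAYLOAD, sizeof(PAYLOAD), NULL);

        /* Overwrite six bytes of a function that runs early in the target
           with "push <remote>; ret" (0x68 imm32, 0xC3). start_addr is a
           placeholder for that function's address. */
        LPVOID start_addr = (LPVOID)0x00400000;    /* placeholder address */
        unsigned char patch[6] = { 0x68, 0, 0, 0, 0, 0xC3 };
        DWORD addr32 = (DWORD)(ULONG_PTR)remote;   /* 32-bit target assumed */
        memcpy(patch + 1, &addr32, 4);
        DWORD old;
        VirtualProtectEx(pi.hProcess, start_addr, sizeof(patch),
                         PAGE_EXECUTE_READWRITE, &old);
        WriteProcessMemory(pi.hProcess, start_addr, patch, sizeof(patch), NULL);

        ResumeThread(pi.hThread);
        return 1;
    }

Seen from the reconstructed IAT alone, exactly this chain of CreateProcessA, VirtualAllocEx, WriteProcessMemory, VirtualProtectEx and ResumeThread is what signals the code injection.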
7.8 Performance evaluation
In the final part of our evaluation, we monitor the performance of Minerva. The authors of PANDA report that recording gives a 1.85x slowdown in comparison to QEMU alone and replaying incurs a 3.57x slowdown [11]. This is expensive in comparison to systems that rely on hypervisor-based virtualisation for recording, e.g. AfterSight [8]. However, we consider PANDA's performance good enough for malware analysis, in particular because the plugins that we deploy will have far more impact on the total analysis time. Naturally, the performance overhead in the recording stage can be used by the malware to evade analysis, and we discuss this further in Section 8. In this performance evaluation, we focus on the overhead of Minerva's analysis when replaying the recorded execution, and the numbers we report in this section are based on analysis of the 25 malware samples from the Ramnit, Gapz, CryptoWall and Tinba families.

The blue curve of Figure 5 shows the number of instructions replayed relative to the time taken in each of the analyses. On average, the replay time is 3166 seconds, resulting in a 126x slowdown of the recording time, and we analysed on average 432365 instructions per second. In comparison, the developers of PANDA report a 24.7x slowdown when tainting data sent over the network and a 67.7x slowdown for tainting a 1KB file and encrypting it with AES-CBC-128 [11]. Additionally, Figure 6 shows the number of instructions that it took to replay the samples in our data set, and we observe that for about 90% of the samples this required less than 2 billion instructions.

Another interesting metric is the specific overhead incurred by Minerva-only code. Specifically, some share of the overhead is due to the translation of QEMU TCG instructions to LLVM instructions, and some is specific to the taint implementation of PANDA. Neither of these is a strict requirement for Minerva, in that we are not reliant on LLVM specifically, and PANDA's taint analysis does not focus on performance. Several systems focus on fast taint analysis [7, 16, 35] and, conceptually, the techniques of Minerva can be implemented on top of these taint libraries as well. To understand the overhead of Minerva's code, we ran the samples through a replay with LLVM translation and taint analysis enabled, and no Minerva-specific analysis code. This gives us a reasonable estimate of how much of the analysis time was spent in the code specific to Minerva. The black curve in Figure 5 shows these numbers. On average, each execution took 1275 seconds with the overhead of LLVM translation and PANDA's taint library. This corresponds to an average 51x slowdown, meaning that Minerva's code takes up a bit more than half of the total 126x slowdown.
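To make the relationship between these averages explicit, a back-of-the-envelope check (not an additional measurement, and assuming the 25-second recordings used for these samples) gives

\[
\frac{3166\,\mathrm{s}}{25\,\mathrm{s}} \approx 126\times, \qquad
\frac{1275\,\mathrm{s}}{25\,\mathrm{s}} = 51\times, \qquad
\frac{3166 - 1275}{3166} \approx 0.60,
\]

i.e. roughly 60% of the average replay time is attributable to Minerva-specific analysis code, and an average replay covers about \(3166 \times 432365 \approx 1.4 \times 10^{9}\) instructions, which is consistent with the range shown in Figure 5.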
[Figure 6: The number of instructions needed to complete the replay of each sample. The plot shows the percentage of samples (y-axis) against instructions to complete replay, ·10^9 (x-axis), for the LLVM + taint + Minerva configuration.]

[Figure 7: The time taken to explore the unique instructions in each malware sample. The plot shows time (y-axis, 0–4,000) against the percentage of samples (x-axis) for the 100% instruction coverage series.]

The biggest performance bottleneck we found in Minerva is when malware makes the code execute longer via stalling loops. An example of a stalling loop from a Kovter sample16 is shown in Figure 8. In total, the loop does 20 million iterations with sixteen calls to functions from the Windows API in each iteration. The loop has no real effect and is purely garbage code. In total, our 25-second recording of this sample reaches 17 million iterations before the recording is over and incurs a replay time of 3300 seconds.

16 md5 of sample 147330a7ec2e27e2ed0fe0e921d45087

The sample we observed with the longest replay time is from the Nymaim family, which has a stalling loop with 1.4 billion iterations, and after the stalling loop, it calls the Sleep function from the Windows API to further stall the execution. In total, our 25-second recording of this sample took 170,000 seconds to replay.
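For illustration, a stalling loop of this kind has roughly the following shape. This is a hedged sketch, not code taken from the Kovter or Nymaim samples; the iteration count and the particular API calls are arbitrary stand-ins for the benign-looking calls such loops make.

    #include <windows.h>

    /* Illustrative stalling loop: millions of iterations of cheap Windows
       API calls whose results are discarded. The loop has no real effect,
       but it inflates the cost of whole-system replay and taint analysis. */
    void stalling_loop_sketch(void) {
        char buf[MAX_PATH];
        for (unsigned long i = 0; i < 20000000UL; i++) {
            GetTickCount();
            GetCurrentProcessId();
            GetWindowsDirectoryA(buf, sizeof(buf));
            /* ... a real sample made sixteen such calls per iteration. */
        }
    }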
In a more general setting, malware can copy entire functions or modules from the Windows API and then rely on the copied code rather than calling the original Windows code. In this context, Minerva will still capture whenever system calls happen, but new measurements should be taken for identifying the copying of Windows code. One approach is to mark library code with a specific taint label and then monitor whether library code is propagated. Naturally, this solution is subject to the limitations of taint analysis. Another approach is to incorporate forensic techniques that determine the similarity between the code in a given process and a set of external libraries, which we discuss further in the following paragraph.

Inlining and statically linked binaries. A limitation in Minerva in terms of identifying external dependencies is when malware deploys inlined or statically linked code. The difference between this and copying external libraries as described above is that inlining and statically linking occur at compile time, whereas copying occurs at run time. Minerva is not capable of identifying inlined or statically linked external dependencies, and we consider this to be a slightly different problem, namely similarity analysis of the malware code with library implementations.
However, inlining and statically linking can, of course, be used in combination with obfuscation techniques and similar, and, therefore, the problem becomes determining program equivalence in the general case, which is a well-known undecidable problem. Nonetheless, efforts can still yield positive and practical results, as shown by previous work in areas such as library fingerprinting [13, 20], structural comparison of binary code [12, 14, 30] and, most recently, similarity detection via machine learning [32, 41].

Performance limitations. There are currently two main performance limitations in Minerva. First, malware can detect the presence of the recording component due to the 3.56x slowdown, and second, the replaying component limits the throughput of Minerva due to its performance cost. Stalling loops seem to pose a core limitation in this context. There is, however, previous work on how to deal with stalling loops in the context of full system emulation. Kolbitsch et al. implemented several features into the Anubis analysis system [27]. Their approach is to implement heuristics that detect when stalling loops occur and then either disable heavy instrumentation until the stalling loop exits or force execution out of the loop. The first approach is certainly possible to implement in Minerva, but it may run into issues if the stalling loop is also responsible for propagating executable malicious code, since the taint analysis would likely be disabled. The second approach is more challenging to implement because replaying is not able to change the execution path in the guest system, as the execution is fixed to the replay log.

We consider five main avenues for improving performance. First, we can use hardware-assisted virtualisation during recording and only full system emulation during replay, as suggested in AfterSight [8]. Second, we can implement various on-and-off analyses during the replay, similar to Kolbitsch et al. Third, we can add light anti-analysis monitoring during the recording, for example, to limit the effectiveness of calls to functions like Sleep. In this case, however, the implementation must use some form of approximation to determine the malware execution trace, since taint analysis will not be available. Fourth, we can improve the speed of various parts in PANDA, such as the taint analysis plugin. Instead of converting instructions to LLVM and performing taint analysis on the LLVM code, we can adopt the taint system of DECAF, which operates directly on the QEMU TCG instructions [16]. Finally, an interesting avenue is implementing a feedback loop between record and replay that, based on the analysis in the replay, sends information to the recording about where a delay in execution occurs and how to handle it. In this way, it is possible to incrementally build up a complete execution trace of the malware without anti-analysis tricks.

9 RELATED WORK
Automatic unpacking. There are many works in automatic unpacking of malware and we have already discussed several of these throughout the paper [10, 17, 21, 23, 33, 39, 42]. Some of this work considers the concept of IAT destruction [21, 28, 39], and IAT reconstruction has also been considered on a more general basis [24]. The work by Ugarte et al. [42] highlights several missing gaps in existing unpackers and proposes a system-wide approach to unpacking. However, as we observed in this paper, their approach is severely limited. In some aspects, Ugarte et al. provide a more refined model for dynamically generated code in that they assign various labels to the memory written by the malware based on whether it is executed and alike. These labels can easily be integrated into Minerva. In addition to this, they also highlight that several limitations in existing unpackers exist due to missing reference data sets, which indeed also motivated the construction of our synthetic benchmark set #1.

The work that is closest to ours is Tartarus [29], and the ideas of this paper are heavily inspired by their work. We deploy a similar approach to tracing the malware throughout the whole system; however, we deploy a different model of dynamically generated malicious code and also propose novel algorithms for making the output suitable for follow-up analysis. In particular, the post-processing we describe in this paper is novel and our model of dynamically generated code is explicitly connected to previous waves, whereas Tartarus simply dumps the whole of tainted memory whenever a new wave executes. As such, our model is more precise and also formally defined.

System-wide malware execution. Several works have closely considered the concept of malware executing throughout the whole system. In particular, Panorama [43], DiskDuster [1], Tartarus [29] and API Chaser [25] use dynamic taint analysis to capture this. Barabosch et al. have also investigated the problem of code injection by analysing memory dumps [4] and at run time [5]. Minerva relies on the same techniques as Tartarus to trace malware execution through the system. An interesting approach at the other end of the spectrum is explored by Ispoglou and Payer in malWASH [19], where they propose to write complex malware using exactly the paradigm of system-wide execution.

Malware disassembly. The work in this paper is closely related to techniques that focus on disassembling malicious software. An accurate description of our work within this domain, rather than unpacking, is a system-wide malware disassembler. We gather a precise instruction-level execution trace of the malware and then gather more content to include in the reconstructed PE file with speculative disassembly. Traditionally, disassembly techniques are split between linear sweep, as used in GNU's Objdump, and recursive traversal [9, 40] algorithms. However, there are several pieces of previous work on disassembly that specifically target malware, and these move beyond the traditional approaches. Kruegel et al. present an approach that combines a variety of techniques from control-flow analysis and statistical methods in order to statically disassemble obfuscated binaries [31]. Kinder and Veith present an approach based on abstract interpretation that statically disassembles binaries and also resolves indirect branch instructions [26]. Rosenblum et al. present a classification approach to identify function entry points [37], and Bao et al. [3] follow the same path and use machine learning and static analysis to identify functions within binaries. They train a weighted-prefix tree that recognises function starting points in a binary file and then use value-set analysis [2] with an incremental control-flow recovery algorithm to identify function boundaries.
10 CONCLUSIONS
In this paper, we proposed a system called Minerva that focuses on generic and precise malware unpacking. From a technical point of view, Minerva deploys a concatic approach with both dynamic and static analysis and partitions the malware execution trace into execution waves based on information flow analysis. Minerva precisely monitors the API calls of the malware code and accurately correlates these to the unpacked code. Based on the output of the dynamic analysis, Minerva performs static analysis on the execution waves to output a set of reconstructed PE files with valid import address tables and patched API calls.

From a theoretical point of view, Minerva deploys a precise model of execution waves based on an information flow model that captures dynamically generated malicious code independently of who wrote the code. We came up with several novel algorithms that combine these execution waves with other artefacts collected from the dynamic analysis to carefully produce PE files that are well-suited for follow-up static analysis.

Finally, we proposed a new set of benchmark applications that exhibit unpacking behaviours with various forms of dynamically generated code, system-wide execution and import-address table destruction in order to address a missing gap in terms of ground-truth samples for testing unpackers. This benchmark suite is the first of its kind in that previous benchmark data sets for testing automatic unpackers rely on third-party applications to perform the packing.

We evaluated Minerva against our synthetic applications and real-world malware samples and also performed a comparative evaluation. Our results show that Minerva is significantly more precise than previous work and outputs unpacked code that shows external dependencies, which previous work does not. Our results also show that Minerva captures system-wide unpacking in many cases where previous work fails.

REFERENCES
[1] Andrei Bacs, Remco Vermeulen, Asia Slowinska, and Herbert Bos. 2013. System-Level Support for Intrusion Recovery. In Detection of Intrusions and Malware, and Vulnerability Assessment, Ulrich Flegel, Evangelos Markatos, and William Robertson (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 144–163.
[2] G. Balakrishnan, T. Reps, D. Melski, and T. Teitelbaum. 2008. WYSINWYX: What You See Is Not What You eXecute. Springer Berlin Heidelberg, Berlin, Heidelberg, 202–213. https://doi.org/10.1007/978-3-540-69149-5_22
[3] Tiffany Bao, Jonathan Burket, Maverick Woo, Rafael Turner, and David Brumley. 2014. BYTEWEIGHT: Learning to Recognize Functions in Binary Code. In Proceedings of the 23rd USENIX Security Symposium, San Diego, CA, USA, August 20-22, 2014. 845–860. https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/bao
[4] Thomas Barabosch, Niklas Bergmann, Adrian Dombeck, and Elmar Padilla. 2017. Quincy: Detecting Host-Based Code Injection Attacks in Memory Dumps. In Detection of Intrusions and Malware, and Vulnerability Assessment, Michalis Polychronakis and Michael Meier (Eds.). Springer International Publishing, Cham, 209–229.
[5] Thomas Barabosch, Sebastian Eschweiler, and Elmar Gerhards-Padilla. 2014. Bee Master: Detecting Host-Based Code Injection Attacks. In Detection of Intrusions and Malware, and Vulnerability Assessment, Sven Dietrich (Ed.). Springer International Publishing, Cham, 235–254.
[6] Guillaume Bonfante, Jose Fernandez, Jean-Yves Marion, Benjamin Rouxel, Fabrice Sabatier, and Aurélien Thierry. 2015. CoDisasm: Medium Scale Concatic Disassembly of Self-Modifying Binaries with Overlapping Instructions. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 745–756. https://doi.org/10.1145/2810103.2813627
[7] Erik Bosman, Asia Slowinska, and Herbert Bos. 2011. Minemu: The World's Fastest Taint Tracker. In Proceedings of the 14th International Conference on Recent Advances in Intrusion Detection (RAID'11). Springer-Verlag, Berlin, Heidelberg, 1–20. https://doi.org/10.1007/978-3-642-23644-0_1
[8] Jim Chow, Tal Garfinkel, and Peter M. Chen. 2008. Decoupling Dynamic Program Analysis from Execution in Virtual Environments. In USENIX 2008 Annual Technical Conference (ATC'08). USENIX Association, Berkeley, CA, USA, 1–14. http://dl.acm.org/citation.cfm?id=1404014.1404015
[9] Cristina Cifuentes and K. John Gough. 1995. Decompilation of Binary Programs. Softw. Pract. Exper. 25, 7 (July 1995), 811–829. https://doi.org/10.1002/spe.4380250706
[10] Artem Dinaburg, Paul Royal, Monirul Sharif, and Wenke Lee. 2008. Ether: Malware Analysis via Hardware Virtualization Extensions. In Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS '08). ACM, New York, NY, USA, 51–62. https://doi.org/10.1145/1455770.1455779
[11] Brendan Dolan-Gavitt, Josh Hodosh, Patrick Hulin, Tim Leek, and Ryan Whelan. 2015. Repeatable Reverse Engineering with PANDA. In Proceedings of the 5th Program Protection and Reverse Engineering Workshop (PPREW-5). ACM, New York, NY, USA, Article 4, 11 pages. https://doi.org/10.1145/2843859.2843867
[12] Thomas Dullien and Rolf Rolles. 2005. Graph-based comparison of executable objects (english version). SSTIC 5 (01 2005).
[13] Mike Van Emmerik. 1994. Signatures for Library Functions in Executable Files.
[14] Halvar Flake. 2004. Structural Comparison of Executable Objects. In Detection of Intrusions and Malware & Vulnerability Assessment, GI SIG SIDAR Workshop, DIMVA 2004, Dortmund, Germany, July 6-7, 2004, Proceedings. 161–173. http://subs.emis.de/LNI/Proceedings/Proceedings46/article2970.html
[15] Fanglu Guo, Peter Ferrie, and Tzi-cker Chiueh. 2008. A Study of the Packer Problem and Its Solutions. In Recent Advances in Intrusion Detection, Richard Lippmann, Engin Kirda, and Ari Trachtenberg (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 98–115.
[16] Andrew Henderson, Lok-Kwong Yan, Xunchao Hu, Aravind Prakash, Heng Yin, and Stephen McCamant. 2017. DECAF: A Platform-Neutral Whole-System Dynamic Binary Analysis Platform. IEEE Trans. Softw. Eng. 43, 2 (Feb. 2017), 164–184. https://doi.org/10.1109/TSE.2016.2589242
[17] Xin Hu, Sandeep Bhatkar, Kent Griffin, and Kang G. Shin. 2013. MutantX-S: Scalable Malware Clustering Based on Static Features. In Proceedings of the 2013 USENIX Conference on Annual Technical Conference (USENIX ATC'13). USENIX Association, Berkeley, CA, USA, 187–198. http://dl.acm.org/citation.cfm?id=2535461.2535485
[18] Thomas Hungenberg and Matthias Eckert. 2018. http://www.inetsim.org/.
[19] Kyriakos K. Ispoglou and Mathias Payer. 2016. malWASH: Washing Malware to Evade Dynamic Analysis. In 10th USENIX Workshop on Offensive Technologies (WOOT 16). USENIX Association, Austin, TX. https://www.usenix.org/conference/woot16/workshop-program/presentation/ispoglou
[20] Emily R. Jacobson, Nathan Rosenblum, and Barton P. Miller. 2011. Labeling Library Functions in Stripped Binaries. In Proceedings of the 10th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools (PASTE '11). ACM, New York, NY, USA, 1–8. https://doi.org/10.1145/2024569.2024571
[21] Sébastien Josse. 2007. Secure and advanced unpacking using computer emulation. Journal in Computer Virology 3, 3 (01 Aug 2007), 221–236. https://doi.org/10.1007/s11416-007-0046-0
[22] S. Josse. 2014. Malware Dynamic Recompilation. In 2014 47th Hawaii International Conference on System Sciences. 5080–5089. https://doi.org/10.1109/HICSS.2014.624
[23] Min Gyung Kang, Pongsin Poosankam, and Heng Yin. 2007. Renovo: A Hidden Code Extractor for Packed Executables. In Proceedings of the 2007 ACM Workshop on Recurring Malcode (WORM '07). ACM, New York, NY, USA, 46–53. https://doi.org/10.1145/1314389.1314399
[24] Yuhei Kawakoya, Makoto Iwamura, and Jun Miyoshi. 2018. Taint-assisted IAT Reconstruction against Position Obfuscation. JIP 26 (2018), 813–824. https://doi.org/10.2197/ipsjjip.26.813
[25] Yuhei Kawakoya, Eitaro Shioji, Makoto Iwamura, and Jun Miyoshi. 2019. API Chaser: Taint-Assisted Sandbox for Evasive Malware Analysis. Journal of Information Processing 27 (2019), 297–314. https://doi.org/10.2197/ipsjjip.27.297
[26] Johannes Kinder and Helmut Veith. 2008. Jakstab: A Static Analysis Platform for Binaries. In Proceedings of the 20th International Conference on Computer Aided Verification (CAV '08). Springer-Verlag, Berlin, Heidelberg, 423–427. https://doi.org/10.1007/978-3-540-70545-1_40
[27] Clemens Kolbitsch, Engin Kirda, and Christopher Kruegel. 2011. The Power of Procrastination: Detection and Mitigation of Execution-stalling Malicious Code. In Proceedings of the 18th ACM Conference on Computer and Communications Security (CCS '11). ACM, New York, NY, USA, 285–296. https://doi.org/10.1145/2046707.2046740
[28] David Korczynski. 2016. RePEconstruct: reconstructing binaries with self-modifying code and import address table destruction. In IEEE 11th International Conference on Malicious and Unwanted Software, MALWARE 2016, Fajardo, PR, USA, October 18-21, 2016. IEEE Computer Society, 31–38. https://doi.org/10.1109/MALWARE.2016.7888727
[29] David Korczynski and Heng Yin. 2017. Capturing Malware Propagations with Code Injections and Code-Reuse Attacks. In Proceedings of the 2017 ACM SIGSAC
Conference on Computer and Communications Security, CCS 2017, Dallas, TX,
USA, October 30 - November 03, 2017, Bhavani M. Thuraisingham, David Evans,
Tal Malkin, and Dongyan Xu (Eds.). ACM, 1691–1708. https://doi.org/10.1145/
3133956.3134099
[30] Christopher Kruegel, Engin Kirda, Darren Mutz, William Robertson, and Gio-
vanni Vigna. 2006. Polymorphic Worm Detection Using Structural Information of
Executables. In Proceedings of the 8th International Conference on Recent Advances
in Intrusion Detection (RAID’05). Springer-Verlag, Berlin, Heidelberg, 207–226.
https://doi.org/10.1007/11663812_11
[31] Christopher Kruegel, William Robertson, Fredrik Valeur, and Giovanni Vigna.
2004. Static Disassembly of Obfuscated Binaries. In Proceedings of the 13th Confer-
ence on USENIX Security Symposium - Volume 13 (SSYM’04). USENIX Association,
Berkeley, CA, USA, 18–18. http://dl.acm.org/citation.cfm?id=1251375.1251393
[32] Yujia Li, Chenjie Gu, Thomas Dullien, Oriol Vinyals, and Pushmeet Kohli. 2019.
Graph Matching Networks for Learning the Similarity of Graph Structured
Objects. https://openreview.net/forum?id=S1xiOjC9F7
[33] L. Martignoni, M. Christodorescu, and S. Jha. 2007. OmniUnpack: Fast, Generic,
and Safe Unpacking of Malware. In Twenty-Third Annual Computer Security
Applications Conference (ACSAC 2007). 431–441. https://doi.org/10.1109/ACSAC.
2007.15
[34] Mario Polino, Andrea Continella, Sebastiano Mariani, Stefano D’Alessio, Lorenzo
Fontana, Fabio Gritti, and Stefano Zanero. 2017. Measuring and Defeating Anti-
Instrumentation-Equipped Malware. In Detection of Intrusions and Malware,
and Vulnerability Assessment, Michalis Polychronakis and Michael Meier (Eds.).
Springer International Publishing, Cham, 73–96.
[35] Georgios Portokalidis, Asia Slowinska, and Herbert Bos. 2006. Argos: an Emula-
tor for Fingerprinting Zero-Day Attacks. In Proc. ACM SIGOPS EUROSYS’2006.
Leuven, Belgium.
[36] Symantec Security Response. 2015. W32.Ramnit analysis.
[37] Nathan E. Rosenblum, Xiaojin Zhu, Barton P. Miller, and Karen Hunt. 2008.
Learning to Analyze Binary Computer Code. In Proceedings of the Twenty-Third
AAAI Conference on Artificial Intelligence, AAAI 2008, Chicago, Illinois, USA, July
13-17, 2008. 798–804. http://www.aaai.org/Library/AAAI/2008/aaai08-127.php
[38] Paul Royal, Mitch Halpin, David Dagon, Robert Edmonds, and Wenke Lee. 2006.
PolyUnpack: Automating the Hidden-Code Extraction of Unpack-Executing
Malware. In Proceedings of the 22Nd Annual Computer Security Applications
Conference (ACSAC ’06). IEEE Computer Society, Washington, DC, USA, 289–300.
https://doi.org/10.1109/ACSAC.2006.38
[39] Monirul Sharif, Vinod Yegneswaran, Hassen Saidi, Phillip Porras, and Wenke Lee.
2008. Eureka: A Framework for Enabling Static Malware Analysis. In Computer
Security - ESORICS 2008, Sushil Jajodia and Javier Lopez (Eds.). Springer Berlin
Heidelberg, Berlin, Heidelberg, 481–500.
[40] Richard L. Sites, Anton Chernoff, Matthew B. Kirk, Maurice P. Marks, and Scott G.
Robinson. 1993. Binary Translation. Commun. ACM 36, 2 (Feb. 1993), 69–81.
https://doi.org/10.1145/151220.151227
[41] Wei Song, Heng Yin, Chang Liu, and Dawn Song. 2018. DeepMem: Learning
Graph Neural Network Models for Fast and Robust Memory Forensic Analysis. In
Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications
Security, CCS 2018, Toronto, ON, Canada, October 15-19, 2018. 606–618. https:
//doi.org/10.1145/3243734.3243813
[42] Xabier Ugarte-pedrero, Davide Balzarotti, Igor Santos, and Pablo G. Bringas.
[n.d.]. SoK: Deep Packer Inspection: A Longitudinal Study of the Complexity of
Run-Time Packers.
[43] Heng Yin, Dawn Song, Manuel Egele, Christopher Kruegel, and Engin Kirda.
2007. Panorama: Capturing System-wide Information Flow for Malware De-
tection and Analysis. In Proceedings of the 14th ACM Conference on Computer
and Communications Security (CCS ’07). ACM, New York, NY, USA, 116–127.
https://doi.org/10.1145/1315245.1315261