Metamorphic Malware
Yu Fu, [email protected], The Pennsylvania State University, University Park, PA 16802, USA
Dinghao Wu, [email protected], The Pennsylvania State University, University Park, PA 16802, USA
ABSTRACT
As the underground industry of malware prospers, malware developers consistently attempt to camouflage malicious code and undermine malware detection with various obfuscation schemes. Among them, metamorphism is known to have the potential to defeat the popular signature-based malware detection. A metamorphic malware sample mutates its code during propagation so that each instance of the same family exhibits little resemblance to another variant. Especially with the development of compiler and binary rewriting techniques, metamorphic malware will become much easier to develop and may eventually break out. To fully understand the metamorphic engine, the core part of metamorphic malware, we attempt to systematically study the evolution of metamorphic malware over time. Unlike previous work, we do not require any prior knowledge about the metamorphic engine in use. Instead, we perform trace-based semantic binary diffing to compare mutation code iteratively and memoize semantically equivalent basic blocks. We have developed a prototype, called MetaHunt, and evaluated it with 1,400 metamorphic malware variants. Our experimental results show that MetaHunt can accurately capture the semantics of unknown metamorphic engines, and all of the comparisons converge in a reasonable time. Besides, MetaHunt identifies several metamorphic engine bugs, which lead to semantics-breaking transformations. We summarize the experience learned from our empirical study, hoping to stimulate the design of mutation-aware solutions that defend against this threat proactively.

CCS CONCEPTS
• Security and privacy → Intrusion/anomaly detection and malware mitigation.

KEYWORDS
Malware detection, metamorphic virus, binary diffing, binary code semantics analysis

ACM Reference Format:
Li Wang, Dongpeng Xu, Jiang Ming, Yu Fu, and Dinghao Wu. 2019. MetaHunt: Towards Taming Malware Mutation via Studying the Evolution of Metamorphic Virus. In 3rd Software Protection Workshop (SPRO’19), November 15, 2019, London, United Kingdom. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3338503.3357720

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
SPRO’19, November 15, 2019, London, United Kingdom
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-6835-3/19/11. . . $15.00
https://doi.org/10.1145/3338503.3357720

1 INTRODUCTION
The malicious software (malware) underground market has evolved into a multi-billion dollar industry [6]. Driven by the rich profit, there has been consistent growth in the number and diversity of malware. According to a Panda Security Lab annual report [40], in 2017 alone, the total number of malware samples in circulation was as high as 75 million, 1.4 times the number of malware found in 2016. Relentless malware developers typically apply various obfuscation schemes (e.g., packer, polymorphism, and metamorphism) [37, 45] to camouflage arresting features, circumvent malware detection, and impede reverse engineering attempts. Among these obfuscation techniques, metamorphism is widely believed to be a panacea to thwart the signature-based malware scanning approaches [1, 47, 56], which are still by far the most widely used anti-malware solution in practice [8]. The core of metamorphic malware is a metamorphic engine (i.e., morphing engine). Each time a metamorphic malware sample executes or propagates, the metamorphic engine mutates the instructions that are loaded into memory by various methods such as register swapping, instruction substitution, instruction reordering, and junk code insertion. As a result, the old version is transformed into a syntactically different but semantically equivalent variant. In this way, a metamorphic malware sample becomes a moving target for analysis as the archetype (the initial un-mutated version, from which the mutation starts) evolves from generation to generation. Consequently, the signature-based anti-malware approaches become insufficient to capture the numerous ostensibly different variants for a particular instance of malware, as illustrated in Figure 1. A striking example is from Leder et al.’s study [27] in
2009. They report that only 12.6% of the files infected by the metamorphic malware Lexotan32 are detected by a total of 40 malware scanners in VirusTotal (https://www.virustotal.com/) and no single scanner can identify all the infected samples.

The prototype of metamorphic malware first emerged in the DOS days [48] and has been constantly evolving on Windows platforms [3, 17]. However, compared with packing and polymorphism, metamorphic obfuscation was not widely adopted in the past. The major reason is that developing a full-fledged metamorphic engine is highly complicated, especially for self-propagating malware, which typically carries the metamorphic engine in its own code. For example, the relatively sophisticated sample MetaPHOR (a.k.a. W32.Simile and W32.Etap) has about 14,000 lines of assembly code, and more than 90% of its code is occupied by the metamorphic engine [19]. In recent years, with the advent of automated development toolkits such as LLVM [26], SecondWrite [2], and Uroboros [51, 52], developing a powerful metamorphic engine has become relatively easy. For example, LLVM has been actively employed to facilitate malware mutation and diversification [24, 42, 49]. Therefore, we estimate that new malware variants with an advanced metamorphic engine will break out in the foreseeable future. To keep ahead in the malware defense arms race, we have to measure the risks of metamorphic malware and develop effective countermeasures.

A major challenge in metamorphic malware analysis is to design a general and automatic technique to capture all possible mutations [44]. Previous research work relies on studying the similarities before/after metamorphism and can be classified into three categories. The first category measures the similarity of static features such as control flow graph [4], opcode statistical signatures [10], instruction hidden Markov model [55], and characteristic value set [27]. Chouchane et al. [9] introduce an engine-specific scoring signature to match metamorphic engines. These approaches are geared toward quickly filtering out simple metamorphic malware, but they are too brittle to defeat sophisticated samples whose code is even encrypted [43]. Besides, the metamorphic engine can be decoupled from the malicious code to mutate non-propagating malware offline. For example, for the highly metamorphic malware created by NGVCK (Next Generation Virus Creation Kit) [55], the engine is separated from the malicious body. In that case, the “engine signature” approach is futile. The second category is based on the idea that the malicious behavior is not changed during code mutation. These approaches detect metamorphic malware by measuring the similarity of API call sequences or graphs [28, 34, 56]. Their main drawback is that they regard the code mutation as a “black box” and so offer no insight into the metamorphic engine.

The third category, also the most advanced one, aims at capturing the metamorphic engine’s semantics [11]. These approaches model the metamorphism by semantic juice [25], algebraic specification [53], or abstract interpretation [13, 14]. The key design of a metamorphic engine is a set of morphing rules (e.g., equivalent instruction substitution patterns), which guide how to transform instructions to equivalent ones with different syntax. A common assumption in the third category is that the metamorphic transformation rules are well known. Since several prototypes of metamorphic malware have been well studied or open sourced [19, 38], it seems that the prior knowledge about morphing rules can be collected easily. However, such an optimistic assumption does not always hold in practice. It is always possible for an expert malware developer to design an alternative mutation method [39]. Unfortunately, manually tracing metamorphic mutations often costs several days or even weeks of tedious work, and the results are incomplete and error-prone as well.

In this work, we present MetaHunt to study the evolution of metamorphic malware mutation over time. Our purpose is to understand the diversity of the metamorphic transformations comprehensively and to provide insight into the mutation mechanism behind the metamorphic malware, which further helps stimulate the development of mutation-insensitive malware protection solutions. Different from the previous work, we do not assume knowledge about the specific metamorphic engine in use. Instead, we study how a metamorphic engine mutates the code by iteratively comparing input-related mutation code and memoizing equivalent basic blocks. There are two key observations behind our approach. The first one is that the metamorphic mutation is a semantics-preserving transformation. Therefore, ostensibly different code pairs with the same function can be matched by state-of-the-art semantics-based binary diffing techniques [21, 30, 35]. The second one is that, compared to other metamorphic transformation methods, the effect of equivalent instruction substitution is harder to reverse (e.g., via code normalization [5]) because of the cumbersome x86 instruction set architecture. Meanwhile, the sets of pure equivalent instruction substitution patterns are also limited [33, 50]. For example, the code substitution table of MetaPHOR consists of 94 alternative instruction sequences [19]. Consequently, after our preprocessing removes some mutation methods such as junk code and opaque predicates, the iteration of comparing metamorphic mutations is not endless: it converges when no new semantically equivalent code is discovered.

More specifically, given two metamorphic mutations, we first identify the basic blocks that can be affected by inputs via multi-tag taint analysis. Next, we perform normalization to reverse the mutation methods that may affect the scope of a basic block. After that, we represent the semantics of a basic block as a set of logical formulas by symbolic execution. Then we compare these logical formulas with a theorem prover to find semantically equivalent basic block pairs. The semantically equivalent basic blocks are then memoized in a union-find set [12], an efficient tree-based data structure. During successive comparisons, we continue to compare metamorphic variants and maintain the corresponding union-find sets until reaching a fixed point, that is, there is little or no increase in the size of the union-find sets. At that point, we say that we have explored the metamorphic malware mutation evolution. Although theoretically the attempt to find all the metamorphic mutations is equal to solving the halting problem [25], the collected information has many interesting implications from the practical point of view. For example, a mutation-insensitive signature can be generated to capture all possible metamorphic variants; malware lineage information [23] can even be recovered as well.
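As a rough illustration of the iterate-until-fixed-point idea described above, the following Python sketch folds variants into equivalence classes and stops once several consecutive variants contribute no new basic block. It is a toy under stated assumptions, not MetaHunt's implementation: the names `equivalent`, `explore`, and `patience` are hypothetical, basic blocks are modeled as Python functions over 8-bit values, block names stand in for block hashes, and exhaustive testing stands in for symbolic execution with a theorem prover.

```python
def equivalent(f, g, bits=8):
    """Toy oracle (assumption): compare two block functions on every `bits`-wide input."""
    mask = (1 << bits) - 1
    return all(f(x) & mask == g(x) & mask for x in range(1 << bits))


def explore(variants, patience=2):
    """Group blocks into equivalence classes; stop after `patience` variants add nothing new."""
    classes = []   # each class is a list of (name, function) pairs
    seen = set()   # block names already memoized
    idle = 0       # consecutive variants that contributed no new block
    processed = 0
    for variant in variants:
        grew = False
        for name, fn in variant:
            if name in seen:
                continue  # memoized earlier, no comparison needed
            seen.add(name)
            grew = True
            for cls in classes:
                if equivalent(cls[0][1], fn):
                    cls.append((name, fn))
                    break
            else:
                classes.append([(name, fn)])
        processed += 1
        idle = 0 if grew else idle + 1
        if idle >= patience:
            break  # fixed point: the recorded sets stopped growing
    return classes, processed


# Hypothetical blocks: two ways to negate a value, two ways to add one.
xor_add = lambda x: (x ^ -1) + 1
not_not_neg = lambda x: -((x ^ -1) ^ -1)
inc = lambda x: x + 1
sub_neg1 = lambda x: x - (-1)

variants = [
    [("xor_add", xor_add), ("inc", inc)],
    [("not_not_neg", not_not_neg), ("inc", inc)],
    [("xor_add", xor_add), ("sub_neg1", sub_neg1)],
    [("not_not_neg", not_not_neg), ("sub_neg1", sub_neg1)],
    [("xor_add", xor_add), ("inc", inc)],
    [("not_not_neg", not_not_neg), ("inc", inc)],
]

classes, processed = explore(variants)
grouped = [sorted(name for name, _ in cls) for cls in classes]
```

With these inputs the loop stops before the last variant, because the fourth and fifth variants add no new block. The `patience` threshold is one possible reading of "little or no increase"; a real system would choose its own stopping rule.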
[Figure 1: image not recovered. Surviving labels: Variant A, Variant B, Variant C, Variant D, Variant E, more mutations; signature matching; no matching signature; instance-specific signature.]

We have implemented a prototype of MetaHunt on top of the BitBlaze [46] binary analysis platform. MetaHunt improves the semantics-based binary diffing technique in both its resilience to highly obfuscated binary code and its performance. We perform an empirical study with 1,400 metamorphic malware samples, which are generated by nine metamorphic engines, including two advanced malware mutation tools based on LLVM [24, 41]. The evaluation shows that the iteration of comparing metamorphic malware variants converges in a reasonable time. Compared to manual reverse engineering of malware, MetaHunt’s exploration result provides a comprehensive understanding of the mechanism of a metamorphic engine. In addition, MetaHunt identifies several buggy metamorphic engine implementations that ignore subtle side effects of x86 instructions. Our MetaHunt prototype gives a method to record and compare the semantics of metamorphic malware, which provides some feasible hints for mutation-insensitive anti-malware solutions. The result demonstrates that MetaHunt is an appealing complement to existing metamorphic malware defenses.

In summary, the contributions of this paper are as follows.
(1) To the best of our knowledge, we are the first to study metamorphic malware evolution systematically.
(2) Instead of being metamorphic engine specific, our approach is a generalized solution that automatically compares the possible mutations and memoizes semantically equivalent basic blocks. Our exploration results provide a comprehensive understanding of the metamorphic engine semantics.
(3) We present MetaHunt, a novel approach to comparing the similarities before/after metamorphic mutation. MetaHunt integrates the advanced semantics-based binary diffing technique in metamorphic malware analysis and improves it with better accuracy and performance.

The rest of the paper is organized as follows. Section 2 provides background information and related work. Section 3 and Section 4 present our system design and implementation in detail. We evaluate MetaHunt in Section 5. Discussions and limitations are presented in Section 6. We conclude the paper in Section 7.

2 BACKGROUND AND RELATED WORK
2.1 Metamorphic Malware
Metamorphic malware mutates its code during each generation so that the newly generated version contains different instructions from the previous one while the semantics is preserved. This differs from polymorphic malware (e.g., via binary packing), which does not rewrite its own code [37]. The constantly changing property makes it difficult for signature-based anti-malware approaches to recognize all the mutations of the same metamorphic malware. The core of metamorphic malware is a metamorphic engine, which performs a set of transformations to mutate the code. The commonly used code morphing methods are register swapping, instruction substitution, instruction reordering, junk code insertion, and control flow obfuscation (e.g., opaque predicates and control flow flattening). We refer the reader to the literature [37, 47] for more detailed information. As shown in Figure 2, MetaPHOR [19] substitutes one instruction with a set of semantics-preserving instructions; Lexotan32 mutates its code by inserting junk code (the instructions are in italics) and reordering instructions. Note that after mutation, the original single basic block in Figure 2(b) has been divided into multiple basic blocks.

Among the multiple mutation methods, instruction substitution is the most sophisticated one. Due to the cumbersome x86 ISA, checking whether two instruction sequences are semantically equivalent is challenging. The advanced semantics-based binary diffing has to rely on symbolic execution and theorem proving techniques to match equivalent instructions. Typically, a metamorphic engine performs code substitution by comparing instructions against a fixed table containing alternative sequences, and then randomly chooses one. Figure 2(a) presents a part of the MetaPHOR code substitution table. On the other hand, the pure equivalent instruction substitution rules are not unlimited either [33, 50]; that is, the length of the code substitution table is fixed. All of these observations form the basis of our approach.

In Section 1, we have introduced the existing metamorphic malware analysis work. However, a systematic study of metamorphic malware evolution is still missing. Understanding how a morphing engine mutates code over time without a priori knowledge is an
interesting and challenging research problem. In this paper, we propose MetaHunt to explore this problem.

[Figure 2: listing not recovered. Surviving fragment: start: / push ebp / mov ebp, esp / jmp loc_0003]

2.2 Semantics-based Binary Diffing
Since most malware spreads in binary form, the techniques to detect the difference between two binaries (binary diffing) have been widely applied to malware reverse engineering. Conventional binary diffing tools identify syntactical differences such as instruction sequences, byte N-grams, and basic block hashing [36]. However, they can be easily evaded by various obfuscation methods. The core method of the advanced semantics-based binary diffing [21, 29, 30, 35] is to first identify semantically equivalent basic block pairs. It uses symbolic values to represent inputs to a basic block and then simulates the function of each instruction by updating the corresponding symbolic formula. The output of symbolic execution is a set of formulas that represent the behavior of the basic block. After that, we try to find whether there is an equivalent mapping between the output formulas of two basic blocks. If yes, those two basic blocks are equivalent in semantics. Figure 3 presents two semantically equivalent basic blocks. Their output symbolic formulas are verified as equivalent by a constraint solver (e.g., STP [20]). Note that due to obfuscation such as register renaming, basic blocks could use different registers or variables to implement the same functionality. As a result, current approaches exhaustively try all possible pairs to find if there exists a bijective mapping between output formulas. However, due to the slow symbolic execution and the frequent invocation of a constraint solver, semantics-based binary diffing suffers from significant performance slowdown [29].

The most relevant work to MetaHunt is the memoized binary diffing method [32], another trace-oriented binary diffing tool for matching basic block pairs. However, MetaHunt is designed for comparing a large number of obfuscated metamorphic malware variants, whereas the binary diffing method [32] is used for comparing different versions of normal programs. Compared to it, MetaHunt is augmented with better resilience to various code obfuscation methods (e.g., call/return obfuscation and opaque predicates) and a set of optimizations. Therefore, MetaHunt has better accuracy and performance when analyzing metamorphic malware.

3 SYSTEM DESIGN
3.1 Overview
The architecture of MetaHunt is illustrated in Figure 4. It mainly comprises two parts: online trace logging and offline comparison. The online part produces a sequence of executed basic blocks together with their associated taint tags, and then passes them to the offline part for comparison. MetaHunt’s offline stage consists of three components: normalization, basic block comparison with the semantics-based binary diffing technique, and a union-find set structure to record semantically equivalent basic blocks. The normalization component performs several transformations to remove
obfuscation effects. After that, the normalized basic blocks are compared by a symbolic execution based method. Finally, the equivalent basic blocks are inserted into the same union-find set. The details of each component are discussed in the following sections.

    Basic block 1                    Basic block 2
    Symbolic input: eax = i;         Symbolic input: ebx = j;
    xor eax, -1                      not ebx
    add eax, 1                       not ebx
    jmp loc_0022                     neg ebx
                                     jmp loc_0022
    Output:                          Output:
    eax = (i ^ -1) + 1;              ebx = ((j ^ -1) ^ -1) × -1;

Figure 3: Example: basic block symbolic execution.

3.2 Trace Logging
The online trace logger records the basic blocks executed during runtime. In general, not all of the executed instructions are of interest, such as the code from packers or standard libraries. We want to compare the basic blocks that represent the virus behavior. Our online stage supports recording the execution trace that comes from the real payload instead of various unpacking routines [45]. When a packed binary starts running, the generic unpacking plug-in is invoked to monitor whether the original code is recovered; if so, the trace logging plug-in is activated to record the execution trace. Moreover, different metamorphic variants usually still call standard libraries, but the basic blocks in these libraries should not be compared. Our trace logger only records the code from the metamorphic virus and ignores the standard library calls.

In addition to ignoring the unrelated basic blocks during run time, we also limit our comparison to the input-related code. The insight is that the basic blocks related to inputs implement the core function of a virus, so these basic blocks should be recorded and compared. To this end, we utilize multi-tag forward taint tracking to record input-related code, which also reduces the number of possible basic block matches. We not only take multiple system calls that are used to receive outside input as different taint seeds but also consider the system calls that are commonly used to fulfill malicious behavior, such as download and execution, replication, and remote injection. For example, when a MetaPHOR version executes, it invokes about 20 Windows Native API calls (the system calls in Windows are named the Native API) for replicating and displaying its messages. Note that for file-infecting metamorphic viruses (e.g., MetaPHOR and W32.Evol), multi-tag taint tracking can also distinguish the host file code from the virus body code. The input-related basic blocks together with their associated taint tags are passed to MetaHunt’s offline stage for comparison.

3.3 Basic Block Normalization
After logging the execution trace, MetaHunt first lifts x86 instructions to an intermediate representation (IR), which facilitates the following analysis. The comparison unit of most semantics-based binary diffing work is the basic block [21, 29, 30]. However, many obfuscation methods can split a single basic block into multiple basic blocks. As a result, direct comparison between the split basic blocks and the original block leads to false negatives. Moreover, too many extra basic block comparisons increase the performance cost. Therefore, a normalization pass is performed to reverse these obfuscation effects. Currently, we consider three major obfuscation methods: instruction reordering, call/return obfuscation, and opaque predicate obfuscation. The effect of instruction reordering is to split one basic block into multiple new basic blocks, which are connected through direct jumps. Call/return obfuscation involves non-standard use of the call and ret instructions [45]. For example, push ADDR; ret is equivalent to jmp ADDR. Reverting the effect of instruction reordering or call/return obfuscation is straightforward. We merge all adjacent basic blocks that have only one predecessor and one successor into a single basic block.

Our normalization also removes opaque predicate obfuscation. An opaque predicate is a predicate whose value is known to the obfuscator at obfuscation time but is difficult for an analyst to figure out afterward. For example, the predicate (x^3 − x ≡ 0 (mod 3)) in Figure 5 is true for all integers x. Opaque predicates have been widely used to introduce redundant branches for the purpose of control flow obfuscation [31]. To handle opaque predicates, we submit a branch condition to a constraint solver to verify whether it is always true or always false. If yes, we conclude that the branch condition is an opaque predicate. After that, as shown in Figure 5, the unreachable paths and redundant predicates are discarded, and the basic blocks split by the opaque predicate are merged.

In addition, we also normalize basic blocks to ignore offsets that may change due to code relocation and some nop instructions. Binary code compiled from the same source code often has different address values caused by memory relocation during compilation. What’s more, malware authors may intentionally insert some instruction idioms like nop and xchg eax, eax to mislead the following hash value calculation (see Section 3.4). The purpose of normalization is to ignore these effects and make the hash value more general.

3.4 Basic Block Comparison and Memoization
The basic blocks tainted by the same taint tags are the candidates to be compared. Our basic block comparison is based on semantics-based binary diffing with improvements in several ways. First, we introduce a union-find set structure that records semantically equivalent basic blocks. Maintaining the union-find structure during successive comparisons allows direct reuse of previously computed results rather than comparing them again. Specifically, after basic block normalization, we first calculate the MD5 value of the byte sequence of each basic block. Then, we dynamically maintain a set of union-find subsets to record semantically equivalent basic blocks, which are represented by their MD5 values. The basic blocks within the same subset are all semantically equivalent to each other. To avoid a highly unbalanced search tree, we adopt an improved path compression and weighted union algorithm [12]. In addition to the union-find set, we also maintain a DiffMap to record pairs of subsets that have been verified as not equivalent. If two
[Figure 4: MetaHunt architecture, image not recovered. Surviving labels: Online (TEMU): Variant, generic unpacking, multi-tag taint. Offline: normalization (merge basic blocks, remove opaque predicate, reverse code relocation effects), basic block pairs comparison, equivalent basic block union-find set.]
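The memoization described in Section 3.4 can be sketched as follows. This is a minimal illustration under stated assumptions rather than MetaHunt's code: the class `BlockMemo` and its methods are hypothetical names, blocks are identified by the MD5 of their normalized bytes via Python's `hashlib`, the union-find uses path compression with union by size, and the DiffMap is modeled as a set of root pairs already verified as not equivalent.

```python
import hashlib


class BlockMemo:
    """Hypothetical sketch: union-find over block hashes plus a DiffMap of unequal subsets."""

    def __init__(self):
        self.parent = {}
        self.size = {}
        self.diff = set()  # frozenset({root_a, root_b}) pairs verified as not equivalent

    @staticmethod
    def key(block_bytes):
        """Represent a normalized basic block by the MD5 of its byte sequence."""
        return hashlib.md5(block_bytes).hexdigest()

    def find(self, k):
        if k not in self.parent:
            self.parent[k] = k
            self.size[k] = 1
        root = k
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[k] != root:  # path compression
            self.parent[k], k = root, self.parent[k]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:  # weighted union: attach the smaller tree
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        # Re-point DiffMap entries that mention the absorbed root.
        self.diff = {frozenset(ra if r == rb else r for r in pair) for pair in self.diff}
        return ra

    def record(self, a, b, is_equivalent):
        """Store the verdict of one (expensive) semantic comparison."""
        if is_equivalent:
            self.union(a, b)
        else:
            self.diff.add(frozenset((self.find(a), self.find(b))))

    def lookup(self, a, b):
        """True or False if the answer is already memoized, None if a comparison is needed."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return True
        if frozenset((ra, rb)) in self.diff:
            return False
        return None


memo = BlockMemo()
a, b, c, d, e = (BlockMemo.key(x) for x in (b"blk-a", b"blk-b", b"blk-c", b"blk-d", b"blk-e"))
memo.record(a, b, True)    # a and b verified equivalent
memo.record(b, c, False)   # b and c verified different
memo.record(c, d, True)    # c and d verified equivalent
```

After these three verdicts, `memo.lookup(a, d)` can be answered without another solver call, since a is in the subset of b and d is in the subset of c. Only pairs that return `None` would need symbolic execution and a theorem prover.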
5 EVALUATION
We evaluate MetaHunt with several objectives in mind. First, we want to evaluate whether our iterative comparison of metamorphic variants converges in a reasonable time, that is, whether MetaHunt is capable of exploring the morphing code evolution. At the same time, we verify that MetaHunt’s exploration results are comprehensive and accurate. We provide a case study of the metamorphic engines in MetaPHOR and W32.Evol to show more details about each engine’s mechanism and how MetaHunt explores the variants it generates. We also test the optimization methods for speeding up the malware comparison. Finally, we report some interesting findings from our evaluation.
Table 1: Metamorphic engine statistics and various code mutation methods adopted.
[The table has 17 numbered columns. Recovered headers: Engine; Type; # Mutations; call/return obfus.; Opaque constant; CFG flattening; Funct. inlining; Reg. renaming; Indirect jump; Opaque pred.; Instr. reorder; Decryption; Dead code; # UF subsets. Three headers and the column position of each X mark were lost in extraction, so the marks and the three trailing values are listed per row in their extracted order.]

    Engine            Type        # Mutations   Method marks        Trailing values
    Lexotan32         attached    100           X X X X X X         1.5   90    8
    MetaPHOR          attached    100           X X X X X X         2.2   132   12
    W32.Evol          attached    100           X X X X X           1.0   52    6
    NGVCK             decoupled   200           X X X X X X X X     4.7   346   16
    G2                decoupled   200           X X X X             1.4   115   8
    VCL32             decoupled   200           X X X X             1.8   130   10
    MPCGEN            decoupled   200           X X X X             2.2   96    8
    MalDiv            decoupled   150           X X X X X X X X     6.8   522   34
    Obfuscator-LLVM   decoupled   150           X X X X X X X X     4.6   304   18
[Figure (number not recovered): code permutation, image not recovered. P: the original program; P’: the permuted program based on P. Surviving labels: code frame size randomization, code frame shuffling; instructions 1 to 10 grouped into Frame 1 to Frame 4, reordered in P’ and connected by jumps.]

    Before                After
    mov eax, 1            lea eax, [ecx+1]
    add eax, ecx

    push 3                mov eax, 3
    pop eax

    mov eax, ebx          lea eax, [ebx+8]
    add eax, 8

    mov [eax], 3          push 3
    push [eax]

    mov [eax], ebx        add ebx, ecx
    add [eax], ecx
    mov ebx, [eax]

    mov [eax], 2          add ecx, 2
    add [eax], ecx
    mov ebx, [eax]

    or eax, 0             nop

Figure 9: Code compressing examples in MetaPHOR

explains our experiment results. Since the metamorphic engine uses fixed transformation tables (in the shrinker and expander) and the reserved space for the virus body is fixed, we can conclude that although MetaPHOR employs a full metamorphism engine, it only has a finite length of evolution, which can be studied by MetaHunt.

5.3.2 W32.Evol. The W32.Evol virus was first discovered in July 2000 (https://www.symantec.com/security-center/writeup/2000-122010-0045-99). It is the first virus to utilize a ‘true’ 32-bit metamorphic engine instead of a polymorphic engine, which is susceptible to AV scanners that can trace virus decryption in memory. The metamorphic engine is used to transform the executable code: it implements some sort of internal disassembler to parse input code, and then transforms the program code and produces new, different code while retaining its functionality.

The instruction transformations supported by the engine can be divided into two parts. Inter-engine transformations are inlined inside the engine as a part of the engine’s core. External transformations take place outside the main engine function, yet they act as if they are inside the engine itself and jump back to the engine when they are finished. The engine’s decision on whether or not to transform a given instruction is based upon a random factor. The engine asks for a random number between 0 and 7, and the transformation is applied only if it is 0. Hence there is a probability of 1/8 = 12.5% that an instruction is transformed. Furthermore, the engine only disassembles the instructions that the author had included.

Figure 10 shows the disassembly of the virus’ code before transformation in the left column and the corresponding transformed code in the right column. We can see that for the first row, the transformation is semantics-preserving unconditionally. However, for the last two rows, we can see that eax is given the value 0x04 and 0x09 respectively. Therefore, the last two transformations
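[Figure (number not recovered): bar chart of speedup (times), y-axis from 1 to 6, series O1, O2, O3, O4; image and x-axis labels not recovered.]

[Figure 13 listing (Before / After columns) not recovered.]
Figure 13: Example: buggy metamorphic engine implementation (add instruction may modify the value of carry flag).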
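The carry-flag issue named in the Figure 13 caption can be reproduced with a small model. The sketch below is an illustration only: `inc32` and `add32` are hypothetical helper names, registers are 32-bit, and only the carry flag (CF) and zero flag (ZF) are modeled, with the other EFLAGS bits omitted. It relies on the documented x86 behavior that inc leaves CF untouched while add sets CF on unsigned overflow.

```python
MASK = 0xFFFFFFFF  # 32-bit register width


def add32(value, imm, flags):
    """Model `add reg, imm`: returns (result, flags) with CF and ZF updated."""
    total = (value & MASK) + (imm & MASK)
    result = total & MASK
    new_flags = dict(flags)
    new_flags["CF"] = int(total > MASK)   # add sets CF on unsigned overflow
    new_flags["ZF"] = int(result == 0)
    return result, new_flags


def inc32(value, flags):
    """Model `inc reg`: same arithmetic as adding 1, but CF is left untouched."""
    result = (value + 1) & MASK
    new_flags = dict(flags)
    new_flags["ZF"] = int(result == 0)
    return result, new_flags


start = {"CF": 0, "ZF": 0}
inc_result, inc_flags = inc32(0xFFFFFFFF, start)
add_result, add_flags = add32(0xFFFFFFFF, 1, start)

# Same register value, different carry flag: a later adc or jc would diverge.
same_value = inc_result == add_result
same_flags = inc_flags == add_flags
```

Here both instructions leave 0 in the register, yet CF is 0 after inc and 1 after add, so replacing inc reg with add reg, 1 preserves semantics only when no later instruction reads CF before it is overwritten.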
5.4 Optimization
The binary comparison component in MetaHunt is optimized for quickly checking the equivalence of two basic blocks. Various methods in MetaHunt contribute to improving the performance of comparison. First, the preprocessing normalizes the trace and removes the obfuscations. Second, the union-find set and DiffMap keep the checked basic blocks in memory so as to accelerate future comparisons. Third, concretizing the formulas in symbolic execution
Table 2: Conditionally equivalent instructions (reg, imm and random stand for register, immediate value and random number, respectively).

Instruction     Substitution                              Condition
inc reg         add reg, 1                                carry flag is not set
dec reg         sub reg, 1                                carry flag is not set
pop reg         mov reg, [esp]; add esp, 4                no EFLAGS bit is set
push reg        sub esp, 4; mov [esp], reg                no EFLAGS bit is set
add reg, imm    sub reg, -imm                             overflow and carry flags are not set
mov reg, imm    mov reg, random; add reg, imm - random    no EFLAGS bit is set

is complicated as well, which makes the design of metamorphic transformation rules very difficult. In particular, certain instructions have implicit side effects: they exhibit different semantics when the value of the EFLAGS register varies. If a metamorphic engine neglects such subtleties of x86 instructions, it is very likely that semantics-breaking mutations will happen. Table 2 lists instructions and substitutions that are only conditionally equivalent, namely when certain EFLAGS register bits are dead. For example, the Intel manual indicates that “inc/dec” do not affect the carry flag while “add/sub” do; the instruction “pop reg” (the third row in Table 2) does not modify any EFLAGS bits while “add” may set as many as six bits. Unfortunately, the substitutions shown in Table 2 are misused by many of the metamorphic engines we tested. Figure 13 shows a possible semantics-breaking mutation we found in NGVCK. The instruction “rcr” rotates right using the carry flag as the “extra” bit. Therefore, any modification to the carry flag before the “rcr” instruction may lead to an incorrect rotation result, and the “add” instruction in the new version may modify the value of the carry flag. Since MetaHunt also traces the symbolic execution for each EFLAGS register bit, we can find metamorphic engine bugs in terms of conditionally equivalent transformations. In our evaluation, we found 62 semantics-breaking bugs in total. These metamorphic engine bugs lead to fatal runtime errors in many cases.

6 DISCUSSIONS AND LIMITATIONS

The power of MetaHunt is limited by imperfect path coverage. This is mainly due to the limitation of dynamic malware analysis. We can leverage automatic input generation techniques [22] to explore more paths. Since MetaHunt depends on multi-tag taint analysis to reduce the number of basic block comparisons, MetaHunt shares the general limitations of taint analysis, e.g., implicit information flow evasions [7]. One possible solution is to leverage statistical binary similarity comparison [15, 16] to reduce the number of constraint-solving invocations on multiple paths. Another threat to dynamic malware analysis is environment-sensitive malware. Since we analyze metamorphic malware in TEMU, a malware sample can detect that it is running in an emulator instead of a physical machine and then quit immediately. To defeat such sandbox environment checks, a possible countermeasure is to analyze malware in a transparent analysis platform via hardware virtualization (e.g., Ether [18]). Currently, MetaHunt’s detection of opaque predicates focuses on invariant opaque predicates, whose values remain the same for all possible inputs. The most recent work can detect more advanced cases such as contextual and dynamic opaque predicates [31]. Although we did not see such complicated opaque predicates in our evaluation, we will extend our work to handle the advanced opaque predicates proactively.

Another argument against studying the evolution of metamorphic malware is the relatively high cost. In fact, compared to the number and diversity of the malware samples in circulation, the metamorphic engine evolves rather slowly because of the great development complexity. A successful metamorphic engine tends to be reused and shared by malware authors. For example, NGVCK [55] is widely applied to generate metamorphic viruses, and Obfuscator-LLVM is also used to mutate both desktop and Android applications [24, 54]. Therefore, our one-time efforts to approximate the semantics of nontrivial metamorphic engines are worthwhile. Furthermore, considering that manually tracing metamorphic mutations usually takes several days to weeks of hard work, the degree of MetaHunt’s overhead is acceptable.

7 CONCLUSION

Metamorphic malware relies on its morphing engine to mutate the malicious code from generation to generation so that each variant is different in syntax. Metamorphic malware has been demonstrated to evade conventional signature-based malware detection successfully. The mutation engine itself is also constantly evolving. In this paper, we attempt to tame the metamorphic mutation by systematically chasing the morphing code evolution. We apply trace-based semantic binary diffing to compare possible mutation variants iteratively and memoize equivalent basic blocks. Without prior knowledge about a particular metamorphic engine, our exploration result can approximate its mutation mechanism. We have implemented our approach, called MetaHunt, and performed empirical evaluations on a large set of metamorphic malware. Our generalized approach can be seen as a first step towards designing mutation-insensitive anti-malware solutions.

ACKNOWLEDGMENTS

We thank the anonymous reviewers for their valuable feedback. This research was supported in part by the National Science Foundation (NSF) grant CNS-1652790, and the Office of Naval Research (ONR) grants N00014-16-1-2265, N00014-16-1-2912, and N00014-17-1-2894. Jiang Ming was also supported by the University of Texas System STARs Program.

REFERENCES

[1] Shahid Alam, Issa Traore, and Ibrahim Sogukpinar. 2014. Current Trends and the Future of Metamorphic Malware Detection. In Proceedings of the 7th International Conference on Security of Information and Networks (SIN’14).
[2] Kapil Anand, Matthew Smithson, Khaled Elwazeer, Aparna Kotha, Jim Gruen, Nathan Giles, and Rajeev Barua. 2013. A Compiler-level Intermediate Representation Based Binary Analysis and Rewriting System. In Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys’13).
[3] Philippe Beaucamps. 2007. Advanced Metamorphic Techniques in Computer Viruses. In Proceedings of the 2007 International Conference on Computer, Electrical, and Systems Science, and Engineering (CESSE’07).
[4] D. Bruschi, L. Martignoni, and M. Monga. 2006. Detecting Self-mutating Malware Using Control-Flow Graph Matching. In Proceedings of Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA’06).
[5] Danilo Bruschi, Lorenzo Martignoni, and Mattia Monga. 2007. Code Normalization for Self-Mutating Malware. IEEE Security and Privacy 5, 2 (2007).
[6] Lorenzo Cavallaro. 2014. Malicious Software and its Underground Economy. https://www.coursera.org/course/malsoftware.
[7] L. Cavallaro, P. Saxena, and R. Sekar. 2008. On the Limits of Information Flow Techniques for Malware Analysis and Containment. In Proceedings of the GI International Conference on Detection of Intrusions & Malware, and Vulnerability Assessment (DIMVA’08).
[8] Sang Kil Cha, Iulian Moraru, Jiyong Jang, John Truelove, David Brumley, and David G. Andersen. 2010. SplitScreen: Enabling Efficient, Distributed Malware Detection. In Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation (NSDI’10).
[9] Mohamed R. Chouchane and Arun Lakhotia. 2006. Using Engine Signature to Detect Metamorphic Malware. In Proceedings of the 4th ACM Workshop on Recurring Malcode (WORM’06).
[10] Mohamed R. Chouchane, Andrew Walenstein, and Arun Lakhotia. 2007. Statistical Signatures for Fast Filtering of Instruction-substituting Metamorphic Malware. In Proceedings of the 2007 ACM Workshop on Recurring Malcode (WORM’07).
[11] M. Christodorescu, S. Jha, S. Seshia, D. Song, and R. Bryant. 2005. Semantics-aware malware detection. In Proc. of the IEEE Symposium on Security and Privacy.
[12] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2001. Introduction to Algorithms (Second ed.). MIT Press, Chapter 21: Data structures for Disjoint Sets, 498–524.
[13] Mila Dalla Preda, Roberto Giacobazzi, and Saumya Debray. 2015. Unveiling metamorphism by abstract interpretation of code properties. Theoretical Computer Science 577 (2015), 74–97.
[14] Mila Dalla Preda, Roberto Giacobazzi, Saumya Debray, Kevin Coogan, and Gregg M Townsend. 2010. Modelling metamorphism by abstract interpretation. In International Static Analysis Symposium. 218–235.
[15] Yaniv David, Nimrod Partush, and Eran Yahav. 2016. Statistical Similarity of Binaries. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI).
[16] Yaniv David, Nimrod Partush, and Eran Yahav. 2017. Similarity of Binaries Through Re-optimization. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI).
[17] Priti Desai and Mark Stamp. 2010. A highly metamorphic virus generator. International Journal of Multimedia Intelligence and Security 1, 4 (2010).
[18] A. Dinaburg, P. Royal, M. Sharif, and W. Lee. 2008. Ether: Malware Analysis via Hardware Virtualization Extensions. In Proceedings of the ACM Conference on Computer and Communications Security (CCS’08).
[19] The Mental Driller. last reviewed, 04/14/2015. Metamorphism in practice or How I made MetaPHOR and what I’ve learnt. http://vxheaven.org/lib/vmd01.html.
[20] Vijay Ganesh and David L. Dill. 2007. A Decision Procedure for Bit-vectors and Arrays. In Proceedings of the 2007 International Conference in Computer Aided Verification (CAV’07).
[21] Debin Gao, Michael K. Reiter, and Dawn Song. 2008. BinHunt: Automatically finding semantic differences in binary programs. In Proceedings of the 10th International Conference on Information and Communications Security (ICICS’08).
[22] P. Godefroid, M. Y. Levin, and D. Molnar. 2008. Automated Whitebox Fuzz Testing. In Proceedings of the 15th Annual Network and Distributed System Security Symposium (NDSS’08).
[23] Jiyong Jang, Maverick Woo, and David Brumley. 2013. Towards Automatic Software Lineage Inference. In Proceedings of the 22nd USENIX Security Symposium.
[24] Pascal Junod, Julien Rinaldini, Johan Wehrli, and Julie Michielin. 2015. Obfuscator-LLVM – Software Protection for the Masses. In Proceedings of the IEEE/ACM 1st International Workshop on Software Protection (SPRO’15).
[25] Arun Lakhotia, Mila Dalla Preda, and Roberto Giacobazzi. 2013. Fast Location of Similar Code Fragments Using Semantic ’Juice’. In Proceedings of the 2nd ACM SIGPLAN Program Protection and Reverse Engineering Workshop (PPREW’13).
[26] Chris Lattner and Vikram Adve. 2004. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In Proceedings of the International Symposium on Code Generation and Optimization (CGO’04).
[27] Felix Leder, Bastian Steinbock, and Peter Martini. 2009. Classification and detection of metamorphic malware using value set analysis. In Proceedings of the 4th International Conference on Malicious and Unwanted Software (MALWARE’09).
[28] Jusuk Lee, Kyoochang Jeong, and Heejo Lee. 2010. Detecting Metamorphic Malwares using Code Graphs. In Proceedings of the 2010 ACM Symposium on Applied Computing (SAC’10).
[29] Lannan Luo, Jiang Ming, Dinghao Wu, Peng Liu, and Sencun Zhu. 2014. Semantics-based Obfuscation-resilient Binary Code Similarity Comparison with Applications to Software Plagiarism Detection. In Proc. of the 22nd ACM SIGSOFT Int’l Symposium on Foundations of Software Engineering (FSE’14).
[30] Jiang Ming, Meng Pan, and Debin Gao. 2012. iBinHunt: Binary Hunting with Inter-Procedural Control Flow. In Proceedings of the 15th Annual International Conference on Information Security and Cryptology (ICISC’12).
[31] Jiang Ming, Dongpeng Xu, Li Wang, and Dinghao Wu. 2015. LOOP: Logic-Oriented Opaque Predicates Detection in Obfuscated Binary Code. In Proceedings of the 22nd ACM Conference on Computer and Communications Security (CCS’15).
[32] Jiang Ming, Dongpeng Xu, and Dinghao Wu. 2015. Memoized Semantics-Based Binary Diffing with Application to Malware Lineage Inference. In Proc. of the 30th IFIP Int’l Information Security and Privacy Conference (IFIP SEC’15).
[33] Vishwath Mohan and Kevin W Hamlen. 2012. Frankenstein: Stitching Malware from Benign Binaries. WOOT 12 (2012), 77–84.
[34] Vinod P. Nair, Harshit Jain, Yashwant K. Golecha, Manoj Singh Gaur, and Vijay Laxmi. 2010. MEDUSA: MEtamorphic Malware Dynamic Analysis Using Signature from API. In Proceedings of the 3rd International Conference on Security of Information and Networks (SIN’10).
[35] Beng Heng Ng and Atul Prakash. 2013. Exposé: Discovering Potential Binary Code Re-use. In Proceedings of the 37th IEEE Annual Computer Software and Applications Conference (COMPSAC’13).
[36] Jeong Wook Oh. 2009. Fight against 1-day exploits: Diffing Binaries vs Anti-diffing Binaries. In Proceedings of the 2009 Black Hat USA.
[37] Philip OKane, Sakir Sezer, and Kieran McLaughlin. 2011. Obfuscation: The Hidden Malware. IEEE Security and Privacy 9, 5 (2011).
[38] Orr. last reviewed, 04/14/2015. The Molecular Virology of Lexotan32: Metamorphism Illustrated. http://www.openrce.org/articles/full_view/29.
[39] Rodney Owens and Weichao Wang. 2011. Non-normalizable Functions: a New Method to Generate Metamorphic Malware. In Proceedings of the 2011 IEEE Military Communications Conference (MILCOM’11).
[40] Panda Security. 2017. PandaLabs Annual Report 2017. https://www.pandasecurity.com/mediacenter/src/uploads/2017/11/PandaLabs_Annual_Report_2017.pdf.
[41] Mathias Payer. 2014. Embracing the new threat: towards automatically, self-diversifying malware. Symposium on Security for Asia Network (SyScan’14).
[42] Mathias Payer, Stephen Crane, Per Larsen, Stefan Brunthaler, Richard Wartell, and Michael Franz. 2014. Similarity-based matching meets Malware Diversity. arXiv Technical Report (2014).
[43] Frédéric Perriot, Peter Ferrie, and Péter Ször. 2003. Striking Similarities: Win32/Simile and Metamorphic Virus Code. Symantec Security Response.
[44] Mila Dalla Preda. 2012. The Grand Challenge in Metamorphic Analysis. In Proceedings of the 6th International Conference on Information Systems, Technology and Management (ICISTM’12).
[45] Kevin A. Roundy and Barton P. Miller. 2013. Binary-code Obfuscations in Prevalent Packer Tools. Comput. Surveys 46, 1 (2013).
[46] Dawn Song, David Brumley, Heng Yin, Juan Caballero, Ivan Jager, Min Gyung Kang, Zhenkai Liang, James Newsome, Pongsin Poosankam, and Prateek Saxena. 2008. BitBlaze: A New Approach to Computer Security via Binary Analysis. In Proceedings of the 4th International Conference on Information Systems Security (ICISS’08).
[47] Peter Szor. 2005. The Art of Computer Virus Research and Defense. Addison-Wesley Professional.
[48] Péter Ször and Peter Ferrie. 2001. Hunting For Metamorphic. Symantec White Paper.
[49] Teja Tamboli, Thomas H. Austin, and Mark Stamp. 2014. Metamorphic code generation from LLVM bytecode. Computer Virology and Hacking Techniques 10, 3 (2014), 177–187.
[50] Andrew Walenstein, Rachit Mathur, Mohamed R. Chouchane, and Arun Lakhotia. 2008. Constructing malware normalizers using term rewriting. Computer Virology 4, 4 (2008), 307–322.
[51] Shuai Wang, Pei Wang, and Dinghao Wu. 2015. Reassembleable Disassembling. In Proceedings of the 24th USENIX Security Symposium (USENIX Security ’15). USENIX Association.
[52] Shuai Wang, Pei Wang, and Dinghao Wu. 2016. Uroboros: Instrumenting Stripped Binaries with Static Reassembling. In Proceedings of the 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER ’16). USENIX Association.
[53] Matt Webster and Grant Malcolm. 2009. Detection of metamorphic and virtualization-based malware using algebraic specification. Computer Virology 5, 3 (2009), 221–245.
[54] Ryan Welton. 2015. Obfuscating Android Applications using O-LLVM and the NDK. http://fuzion24.github.io/.
[55] Wing Wong and Mark Stamp. 2006. Hunting for metamorphic engines. Computer Virology 2, 3 (2006), 211–229.
[56] Qinghua Zhang and Douglas S. Reeves. 2007. MetaAware: Identifying Metamorphic Malware. In Proceedings of the 23rd Annual Computer Security Applications Conference (ACSAC’07).