Tang 2017
Abstract—Increasingly sophisticated code obfuscation techniques are quickly adopted by malware developers to escape malware detection and to thwart the reverse engineering efforts of security analysts. State-of-the-art de-obfuscation approaches rely on dynamic analysis, but face the challenge of low code coverage, as not all software execution paths and behaviors are exposed in specific profiling runs. As a result, these approaches often fail to discover hidden malicious patterns. This paper introduces SEEAD, a novel and generic semantic-based de-obfuscation system. When building SEEAD, we try to rely on as few assumptions about the structure of the obfuscation tool as possible, so that the system can keep pace with fast-evolving code obfuscation techniques. To increase code coverage, SEEAD dynamically directs the target program to execute different paths across different runs. This dynamic profiling scheme is combined with taint and control dependence analysis to reduce the search overhead, and with a carefully designed protection scheme that brings the program back to an error-free state should any error occur during dynamic profiling runs. As a result, the increased code coverage enables us to uncover hidden malicious behaviors that are not detected by traditional dynamic analysis based de-obfuscation approaches. We evaluate SEEAD on a range of benign and malicious obfuscated programs. Our experimental results show that SEEAD is able to successfully recover the original logic from obfuscated binaries.

Index Terms—Malware Analysis, De-obfuscation, Multiple Execution Paths Exploration

*Corresponding authors:
Xiaoqing Gong, Email address: [email protected]
Zheng Wang, Email address: [email protected]

I. INTRODUCTION

Code obfuscation [1] methods like control flow flattening, garbage code insertion, instruction deformation, binary code encryption and packing [2], and virtualization obfuscation [3] are now commonplace in malware. These code obfuscation techniques make it more difficult to uncover the true logic of the program, which makes the work of security analysts considerably harder.

Most existing de-obfuscation approaches [4], [5] only target a limited set of specific obfuscation techniques. They work under the assumption that security analysts have a priori knowledge of the structure of the obfuscation tools (obfuscators) used by the malware developer. This means that these approaches require heavy human involvement (which often takes a lot of time and effort) and can only be applied to known obfuscation methods.

The work presented by Coogan et al. [6] is among the first attempts to automate malware code de-obfuscation without human involvement. Their approach does not require security analysts to manually analyze and identify the obfuscation techniques used by the malware. As a result, the time spent in malware analysis is reduced greatly. While promising, Coogan's method can only deal with malware that uses virtualization-based obfuscation tools such as VMProtect [7] and Virtualizer [8].

In this work, we aim to extend the reach of existing malware de-obfuscation techniques. We present SEEAD, a novel and generic automated code de-obfuscation system. SEEAD is a semantic-based de-obfuscation approach. It makes few assumptions about the structure of obfuscators. Therefore, SEEAD can be applied to existing and unknown obfuscation methods. SEEAD works by first identifying the semantically relevant instructions with dynamic taint analysis and control dependency analysis, and then simplifying the instruction traces of the target binary with these analytical results. Because the whole de-obfuscation process of SEEAD does not require any human involvement, it significantly reduces the time spent in malware analysis.

Similar to most de-obfuscation approaches [9], [10], SEEAD also uses dynamic analysis to characterize the program behavior. However, profiling based dynamic analysis suffers from poor code coverage because the program execution path during profiling runs only represents the application behavior for a given set of inputs. As a result, existing dynamic analysis based de-obfuscation techniques can miss some of the malware behaviors that are only triggered under specific cases (e.g., when a particular file is present, or when a certain command is received). Our approach to the problem is to dynamically adjust the program control logic to direct the program to execute different paths during different profiling runs to increase the code coverage. Our carefully designed recovery scheme ensures that the program can roll back to an error-free state if the logic change leads to invalid program execution or corrupted data. To reduce the search space and profiling overhead, we combine taint and control dependence analysis to change only the execution branches that depend on the program input and to ignore those that do not. As a result, our scheme achieves higher code coverage with reasonable overhead compared to the state-of-the-art dynamic analysis based approaches. The increase in code coverage allows us to uncover more hidden behaviors of malware.

We have evaluated SEEAD with a range of benign and
[Figure 1 diagram, recovered labels: Obfuscated Binary File; Executed information extraction; analysis; Control Dependency Analysis; Multiple Execution Path Exploration; Intermediate Data (PC, Index, PreBlock, SuccBlock, PostdomSets); Optimization Process; CFG and FCG Construction; Optimized instr traces; CFG, FCG]
Figure 1: Overview of SEEAD. The figure shows the framework of SEEAD: the components in yellow are the key functions of SEEAD, and the components in blue are its input and output. The remaining components are intermediate details produced during the de-obfuscation process.
malicious obfuscated binary programs. Experimental results show that SEEAD is able to eliminate on average 76.8% of the obfuscation instructions and 87.4% of obfuscation instructions generated using virtualization-based obfuscation techniques.

The main contributions of this work are:
• We present SEEAD, a novel and generic semantic-based automated code de-obfuscation system that can be applied to unknown obfuscation methods without any human involvement;
• SEEAD is a low-cost solution but provides wider code coverage compared to state-of-the-art dynamic analysis based code de-obfuscation tools;
• Our evaluation, performed on a range of benign and malicious obfuscated binaries, shows that SEEAD is effective at removing obfuscation instructions.

II. SEEAD OVERVIEW

SEEAD is a generic and semantic-based de-obfuscation system. An overview of SEEAD is presented in Figure 1. The input of SEEAD is the obfuscated binary file; the output is the simplified instruction traces, the CFG (Control Flow Graph) and the FCG (Function Call Graph), which can be easily analyzed and understood.

To perform the code de-obfuscation, SEEAD goes through the following steps:
• Extract the executed information of the obfuscated file using a dynamic binary instrumentation tool.
• Identify semantically relevant instructions with dynamic taint analysis and control dependence analysis (Section III).
• Explore multiple execution paths with a low-cost solution to increase the code coverage (Section IV).
• Perform the inter-block and intra-block optimization (Section V). SEEAD then constructs the CFG and FCG for the optimized instruction traces.

In order to identify the semantically relevant instructions, we need to extract the executed information (i.e., assembly instruction traces, values of registers and memory) of the obfuscated binary file. It is difficult for common debuggers (e.g., Ollydbg [11], IDA Pro [12]) to handle packed and obfuscated malware, because malware developers usually use various anti-reverse engineering strategies to increase the difficulty of malware analysis. In addition, the cost of such analysis is high. Dynamic binary instrumentation tools are effective against these anti-reverse engineering strategies. Thus, we build SEEAD on top of a dynamic binary instrumentation tool called PIN.

III. SEMANTICALLY RELEVANT INSTRUCTION IDENTIFICATION

In this paper, we use dynamic taint analysis to identify values obtained through input operations and instructions influenced by these input-tainted values directly and indirectly. This computation can only capture the explicit information flow from inputs to outputs of the program and does not consider the implicit information flow [13], so it is possible that some behaviors will be missed and the semantics of the program will be changed. To this end, we combine control dependence analysis with the explicit data dependencies identified earlier to capture implicit as well as explicit information flow from inputs to outputs.

A. Dynamic Taint analysis

Dynamic taint analysis [14] is widely applied to program security analysis. The basic idea of dynamic taint analysis is to mark the users' sensitive data or untrusted input data as the taint source and track the taint source's propagation path during execution.

Similar to prior work [15], SEEAD uses a one-bit tag (0 for "untaint" data and 1 for "taint" data) for each value in memory or general registers in the taint propagation process. If necessary, the one-bit tag can be easily extended to a multiple-bit tag for each value.

At the beginning of the taint propagation process, all tags are set to 0. Based on the taint propagation policy, taint sources (e.g., data read from the network or standard input) will be tagged with 1 as "taint". As the program executes, the dynamic taint scheduler propagates the tag information from one instruction to another. It does this by dynamically tracking instructions with information flow. Some other data may be tagged with 1 via information flow. Of course, "taint" data can become "untaint" if its value is reassigned from some

[Figure residue: a listing with columns "Address" and "Assembly Instruction"; recoverable instructions: mov ebx, eax; cmp ecx, eax; jnz ...; mov ecx, ... (addresses and constants lost in extraction)]
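The one-bit tagging scheme described in this subsection can be sketched as follows. This is a minimal illustration, not SEEAD's implementation: the `TaintTracker` class, its method names, and the propagation rule (a destination becomes tainted when any source operand is tainted, and is cleared when it is overwritten from untainted data) are assumptions chosen to match the tags described in the text.

```python
from dataclasses import dataclass, field


@dataclass
class TaintTracker:
    """Hypothetical one-bit taint tracker over named registers/memory cells."""

    # location name -> tag (1 = "taint", 0 = "untaint"); missing means 0,
    # which mirrors "all tags are set to 0" at the start of propagation.
    tags: dict = field(default_factory=dict)

    def taint_source(self, loc: str) -> None:
        """Mark a value read from an input operation as tainted."""
        self.tags[loc] = 1

    def is_tainted(self, loc: str) -> bool:
        return self.tags.get(loc, 0) == 1

    def propagate(self, dst: str, srcs: list) -> None:
        """Assumed policy: dst gets the OR of its source tags.

        An empty source list models an assignment from a constant,
        which clears the tag of dst.
        """
        self.tags[dst] = int(any(self.is_tainted(s) for s in srcs))


tracker = TaintTracker()
tracker.taint_source("eax")               # eax holds data read from input
tracker.propagate("ebx", ["eax"])         # mov ebx, eax   -> ebx tainted
tracker.propagate("ecx", ["ecx", "ebx"])  # add ecx, ebx   -> ecx tainted
tracker.propagate("ebx", [])              # mov ebx, 0x10  -> ebx untainted again

for reg in ("eax", "ebx", "ecx"):
    print(reg, tracker.tags[reg])
```

Extending the tag to multiple bits, as the text mentions, would amount to replacing the 0/1 value with a small bit set and the `any` with a bitwise OR.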
[Figure 3 diagram: a source snippet (int x; x = read_input(); two "if (x > ...)" tests; printf("Pass"); exit) shown next to its assembly code (mov [local], eax; cmp [local], ...; jle short ...; a second cmp/jle pair; print and exit blocks), partitioned into basic blocks B1 to B4. Addresses and constants were lost in extraction.]

Figure 3: An example of multiple execution paths exploration

the first execution as usual by an arbitrary input is B1, B2, B4; the blocks saved in the snapshot list are B2, B1. When the current process wishes to terminate, we replace the current process address space with the saved snapshot B2 first, and then B1.

B. Exception Recovery Mechanism

A process runs normally until it either exits normally or an exception happens. However, SEEAD does not allow the process to terminate: because the operating system would remove the process-related entries and free its memory, we would not be able to recover the current image to a saved snapshot. Moreover, the program input merely allows the program to execute along the normal execution flow, rather than along different execution paths, so forcing other paths may cause exceptions because the input is incorrect for them. Thus, we adopt an exception recovery mechanism to handle any such exceptions.

In SEEAD, the obfuscated binary program is first executed as usual by providing arbitrary input. Its recovery mechanism prevents the program from terminating. For a program that exits normally, we hook the system API function NtTerminateProcess() of the ntdll.dll library to monitor whether the process wishes to terminate. Similarly, for a program that crashes, we hook the system API function KiUserExceptionDispatcher() of the ntdll.dll library. Whenever the process invokes this API, we know that a program exception has occurred. In either case, if there are unexplored paths left, we revert the program's current image to a previous state.

V. OPTIMIZATION PROCESS

The optimization process is mainly divided into two parts: inter-block and intra-block optimization. For the inter-block optimization, we discard those blocks without taint marks, which are semantically irrelevant. For the intra-block optimization, we make as few assumptions as possible about the structure of obfuscators. Thus, we present a set of general but simple semantics-preserving transformations as follows:
• Stack optimization: There are two cases: a useless push-pop pair, and an element A that is pushed onto the stack and then popped into an element B.
• Dead code removal: Dead code consists of the instructions whose execution does not modify the program's final state or control flow. This covers every instruction of a block in which all taints get overwritten before being used.
• Invalid instruction combination: An invalid instruction combination is a group of instructions that together have no functional effect or cancel each other out (e.g., add eax, 0xF; sub eax, 0xF).

In order to better analyze and understand the logic of the original program, we construct the CFG and FCG for the optimized results.

CFG and FCG Construction: Construction of the CFG and FCG is a basic and highly challenging task for obfuscated binaries, especially the identification of indirect jump targets and API identification. Since SEEAD is based on dynamic analysis, the targets of indirect jump instructions are precise addresses that we obtain during dynamic execution. However, for API identification, there is no standard approach in the literature.

System calls play a vital role in malware detection. To some extent, the API function sequence is a special representation of malware behavior. Thus, in order to prevent security analysts from extracting the API call sequence and analyzing the program behavior, malware developers usually use various API protection techniques to obfuscate system calls. To construct the FCG, we have to develop API identification techniques against API obfuscation to reveal the information of API calls (e.g., the addresses of API calls and their details).

Common API obfuscation techniques can be roughly classified into import table encryption, API hooking and API rewriting. The first two techniques are ineffective against dynamic analysis, because the entry point of the API function can always be traced during dynamic execution. However, it is challenging to reveal the API sequence of a program obfuscated by API rewriting. API rewriting usually copies the first few instructions of the API function to user space for execution, so we cannot easily identify the entry point of the API function during execution. In this paper, we combine code injection and API hooking to monitor the API calls and record their invocation information. Finally, we use the collected information about API calls to construct the FCG. Since these two are standard techniques, we omit their details.

VI. EFFECTIVENESS EVALUATION

A. Effectiveness Analysis

In this subsection, we demonstrate the effectiveness of our de-obfuscation approach by showing that analysts have only a negligible probability of obtaining the same results as our de-obfuscation approach by hand.

Let $t_{ins}$ denote the average time of analyzing an instruction, and let $N_{obf}$ and $N_{simp}$ denote the instruction number of the obfuscated program and the simplified traces respectively. $P_{time}$ measures how much time we have been able to save when analyzing an obfuscated program. It is defined as:

$$P_{time} = 1 - \frac{N_{simp} \times t_{ins}}{N_{obf} \times t_{ins}} = \frac{N_{obf} - N_{simp}}{N_{obf}} \qquad (1)$$

For the $N_{obf}$ instructions of an obfuscated program, if we want to simplify them into $N_{simp}$ instructions, this yields a total of $\frac{N_{obf}!}{(N_{obf} - N_{simp})!}$ combinations. The security analysts' probability of correctly getting these $N_{simp}$ instructions is $\frac{(N_{obf} - N_{simp})!}{N_{obf}!}$.

For a 1536B ($N_{obf} = 751$ instructions) obfuscated program, the instructions can be simplified to 31 instructions after calculation. The probability of getting the same results as our de-obfuscation approach therefore is:

$$P[\mathrm{Analysis}] = \frac{(N_{obf} - N_{simp})!}{N_{obf}!} = \frac{(751 - 31)!}{751!} = 9.68 \times 10^{-87}$$

The time we have been able to save when analyzing this obfuscated program is:

$$P_{time} = \frac{N_{obf} - N_{simp}}{N_{obf}} = \frac{751 - 31}{751} = 95.872\%$$
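The two quantities of this subsection can be evaluated numerically with a short sketch. The function names are hypothetical, and the factorial ratio is computed through a sum of logarithms (an implementation choice, not something the paper specifies) so that the very small probability does not underflow.

```python
import math


def time_saving(n_obf: int, n_simp: int) -> float:
    """Eq. (1): fraction of analysis time saved, (N_obf - N_simp) / N_obf."""
    return (n_obf - n_simp) / n_obf


def log10_guess_probability(n_obf: int, n_simp: int) -> float:
    """log10 of (N_obf - N_simp)! / N_obf!.

    The ratio equals 1 / (N_obf * (N_obf - 1) * ... * (N_obf - N_simp + 1)),
    so its logarithm is minus the sum of the logs of those N_simp factors.
    """
    return -sum(math.log10(k) for k in range(n_obf - n_simp + 1, n_obf + 1))


p_time = time_saving(751, 31)
log_p = log10_guess_probability(751, 31)
print(f"P_time            = {p_time:.3%}")
print(f"log10 P[Analysis] = {log_p:.2f}")
```

The sketch takes the formula literally, with all $N_{simp}$ factors in the product, so the exponent it prints need not coincide exactly with the rounded figure quoted in the text. Either way the probability is vanishingly small, which is the point of the argument.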
B. Experimental Results

non-virtualization obfuscations (e.g., control flow flattening, instruction deformation, encryption, etc.), so we do not know whether they are also able to handle programs obfuscated by these non-virtualization obfuscations. As far as we know, none of the existing de-obfuscation approaches can be applied to most obfuscation techniques. Thus, we present SEEAD, which is effective for most obfuscation techniques. In this subsection, we demonstrate the power of SEEAD with four common obfuscation tools: CF Obfuscator, MEMP [17], Code Virtualizer (CV) [18] and VMprotect (VMP) [19]. CF Obfuscator is a binary control flow flattening tool that implements the control flow algorithm OBFWHKD [20]. MEMP combines equivalent deformation, control flow obfuscation and dynamic encryption and decryption. We present the results of evaluating SEEAD with 6 programs, all of which are common obfuscated samples [18]. Because this paper does not discuss these non-virtualization obfuscations, we obfuscate these six programs with CF Obfuscator and MEMP.

Let $N_{orig}$, $N_{obf}$ and $N_{simp}$ denote the number of instructions for the original program, the obfuscated program and the simplified traces respectively. The simplification score measures how much obfuscation code we have been able to eliminate. It is defined as:

$$\text{Simplification Score} = \frac{N_{obf} - N_{simp}}{N_{obf}} \qquad (2)$$

[Figure 4 and Figure 5 plots: bar charts over the obfuscated samples bin_search, bubble_sort, huffman, matrix_mult, fibonacci and factorial, with series CFObfuscator, MEMP, VMP and CV; x-axis "Obfuscated Samples", y-axis of Figure 4 "Simplification Score". Plot data not recoverable.]

Figure 5: Comparison results of difference scores

The difference score measures the instruction number difference between the original program and the simplified traces. It is defined as:

$$\text{Difference Score} = \frac{|N_{orig} - N_{simp}|}{N_{orig}} \qquad (3)$$

Analysis results of programs obfuscated with these four obfuscation tools are in Table I, Table II, Table III and Table IV respectively. The first column shows the name of the sample. In the next three columns, we report the number of instructions in the original program, the obfuscated program and the simplified traces. The next two columns show the number of total blocks and input-taint blocks respectively. Finally, we present the simplification score and the difference score in the last two columns.

Figure 4 shows the comparison results of simplification scores in all samples. The simplification score for MEMP is on average about 0.65, which means that SEEAD is able to eliminate about 65% of the obfuscation instructions introduced by MEMP. The simplification score for CV is over 0.94 on average. The simplification scores for CF Obfuscator and VMP lie in the middle. They are about 0.67 and 0.81 on average respectively.

Similarly, the comparison results of difference scores are shown in Figure 5. The highest difference score is about 689 on average, which is introduced by VMP, and the lowest score is over 0.5 on average, which is introduced by CF Obfuscator. The difference scores for MEMP and CV lie in the middle. They are about 2.6 and 26 on average respectively.

Overall, these results are encouraging, especially for virtualization obfuscations, as SEEAD only identifies those instructions that are semantically relevant to the original code, and discards those that are semantically irrelevant. Our evaluation results show that we can straightforwardly reconstruct the logic of the original program and analyze it correctly with the functionality we have traced.

We observed that most of the "missed" instructions fell into two categories: on the one hand, instructions that performed some preparatory work like allocating memory or initializing data structures; on the other hand, instructions that performed some invalid actions that were semantically irrelevant, such as garbage instructions and invalid instruction combinations. All of these instructions were identified by our analysis.

We examined the results by hand, and found that the reason for the higher simplification scores is that all the test cases we used are toy programs. We believe that this paper is just an initial step toward developing an advanced and functionally powerful de-obfuscation tool.

The results in Table I, Table II, Table III and Table IV show an extraordinary increase in the number of executed instructions for these four obfuscation tools. For example, bin_search executes 166 instructions in the original program. However, the numbers of executed instructions of the program obfuscated by CF Obfuscator, MEMP, CV and VMP are 221, 1325, 163599 and 659226 respectively.

Traditional dynamic analysis typically represents partial program behavior, and its coverage heavily relies on good inputs, which may not be available. We compare the results of SEEAD with traditional dynamic analysis and demonstrate the effectiveness of our approach on multiple execution paths exploration. The comparison results are presented in Table V. Columns 2-5 present the comparison results of the

Table I: Results for programs obfuscated with CF Obfuscator

Samples      Original    Obfuscated  Simplified  Total basic  Input-taint   Simplification  Difference
             trace size  trace size  trace size  blocks       basic blocks  Score           Score
bin_search   166         221         108         21           19            0.511312        0.3494
bubble_sort  316         641         263         22           6             0.589704        0.16772
huffman      4367        7226        833         59           31            0.884722        0.80925
matrix_mult  651         936         479         44           28            0.488248        0.26421
fibonacci    2930        2950        781         20           12            0.735254        0.73345
factorial    132         174         39          13           10            0.775862        0.70455
Table V: Evaluation of multiple execution paths exploration

             CF Obfuscator                                  MEMP
Samples      Branch  Input-taint    Dynamic   SEEAD    Branch  Input-taint    Dynamic   SEEAD
             blocks  branch blocks  analysis           blocks  branch blocks  analysis
bin_search   5       5              203       221      5       5              1301      1325
bubble_sort  3       2              641       641      3       2              4529      4529
huffman      15      9              4383      7226     12      9              27325     34410
matrix_mult  15      12             840       936      9       7              7119      8230
fibonacci    4       2              2943      2950     4       2              2992      2999
factorial    2       2              141       174      2       1              189       255
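The coverage difference in Table V can be summarized with a few lines of code. This sketch assumes that the "Dynamic analysis" and "SEEAD" columns are executed-instruction counts, which the SEEAD values matching the obfuscated trace sizes reported elsewhere in the evaluation suggests; the `coverage_gain` helper is hypothetical.

```python
# (dynamic_analysis, seead) counts per sample, copied from Table V.
TABLE_V = {
    "CF Obfuscator": {
        "bin_search": (203, 221),
        "bubble_sort": (641, 641),
        "huffman": (4383, 7226),
        "matrix_mult": (840, 936),
        "fibonacci": (2943, 2950),
        "factorial": (141, 174),
    },
    "MEMP": {
        "bin_search": (1301, 1325),
        "bubble_sort": (4529, 4529),
        "huffman": (27325, 34410),
        "matrix_mult": (7119, 8230),
        "fibonacci": (2992, 2999),
        "factorial": (189, 255),
    },
}


def coverage_gain(dynamic: int, seead: int) -> tuple:
    """Return (extra count, gain relative to single-run dynamic analysis)."""
    extra = seead - dynamic
    return extra, extra / dynamic


for tool, rows in TABLE_V.items():
    for name, (dynamic, seead) in rows.items():
        extra, rel = coverage_gain(dynamic, seead)
        print(f"{tool:13s} {name:12s} +{extra:5d} ({rel:.1%})")
```

Samples with few input-tainted branch blocks, such as bubble_sort, show no gain, while huffman gains the most under both obfuscators.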
these techniques in terms of code coverage, the capability of handling packing and obfuscation and scalability. Static analysis usually has good code coverage, and which is very

[Figure residue: plot with legend "Packed Program", "Dynamic Analysis", "SeeAD" and axis label "Number of API Calls"; plot data not recoverable.]
b) Multiple Execution Paths Exploration: Early approaches to multiple execution paths exploration usually rely on profiling information, such as source code, software testing and debugging information, to construct concrete program inputs [27], [28]. Unfortunately, in practice, such information is generally not available. Hence, for malware, the assumption can be considered unrealistic. The work in [29] first requires concrete inputs and then mutates them to explore different paths, which incurs high overhead.

The approach in [23] explores multiple execution paths by forcing branch outcomes to be reversed in order to construct control flow graphs. However, some of the paths it explores are infeasible. Similar techniques have been proposed to expose hidden behavior in Android apps [30], [31]. These techniques randomly determine each branch's outcome and face the challenge of excessive infeasible paths.

IX. CONCLUSIONS

This paper has presented SEEAD, a novel, generic framework for code de-obfuscation, targeting malware detection. SEEAD employs dynamic taint analysis and control dependency analysis to carefully direct the program execution path across profiling runs to increase the code coverage. It then simplifies the instruction traces of the target binary to perform code de-obfuscation. SEEAD is fully automatic and requires little human involvement. We evaluate SEEAD on a range of benign and malicious obfuscated programs. Experimental results show that SEEAD can successfully recover the original logic from obfuscated binaries.

X. ACKNOWLEDGMENT

This work was partially supported by projects of the National Natural Science Foundation of China (No. 61672427, No. 61572402), the International Cooperation Foundation of Shaanxi Province, China (No. 2015KW-003), the Research Project of Shaanxi Province Department of Education (No. 15JK1734), the Service Special Foundation of Shaanxi Province Department of Education (No. 16JF028), and the Research Project of NWU, China (No. 14NW28). This work was also supported by Tencent.

REFERENCES

[1] C. Collberg, C. Thomborson, and D. Low, “A taxonomy of obfuscating transformations,” Department of Computer Science, The University of Auckland, New Zealand, Tech. Rep., 1997.
[2] R. Langner, “Stuxnet: Dissecting a cyberwarfare weapon,” Security & Privacy, IEEE, vol. 9, no. 3, pp. 49–51, 2011.
[3] M. Sharif, A. Lanzi, J. Giffin, and W. Lee, “Automatic reverse engineering of malware emulators,” in Security and Privacy, 2009 30th IEEE Symposium on. IEEE, 2009, pp. 94–109.
[4] R. Rolles, “Unpacking virtualization obfuscators,” in 3rd USENIX Workshop on Offensive Technologies (WOOT), 2009.
[5] M. G. Kang, P. Poosankam, and H. Yin, “Renovo: A hidden code extractor for packed executables,” in Proceedings of the 2007 ACM workshop on Recurring malcode. ACM, 2007, pp. 46–53.
[6] K. Coogan, G. Lu, and S. Debray, “Deobfuscation of virtualization-obfuscated software: a semantics-based approach,” in Proceedings of the 18th ACM conference on Computer and communications security. ACM, 2011, pp. 275–284.
[7] “Vmprotect - new-generation software protection,” http://www.vmprotect.ru/, Tech. Rep.
[8] “Code virtualizer: Total obfuscation against reverse engineering,” Oreans Technologies, http://www.oreans.com/codevirtualizer.php, Tech. Rep., 2008.
[9] C. Cadar, D. Dunbar, and D. R. Engler, “Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs,” in OSDI, vol. 8, 2008, pp. 209–224.
[10] V. Chipounov, V. Kuznetsov, and G. Candea, S2E: a platform for in-vivo multi-path analysis of software systems. ACM, 2012, vol. 47, no. 4.
[11] O. Yuschuk, “Ollydbg 1.1: A 32-bit assembler level analysing debugger for microsoft windows, june 2004.”
[12] “Ida pro: a windows, linux or mac os x hosted,” https://www.hex-rays.com/products/ida/index.shtml, Tech. Rep.
[13] G. E. Suh, J. W. Lee, D. Zhang, and S. Devadas, “Secure program execution via dynamic information flow tracking,” in ACM Sigplan Notices, vol. 39, no. 11. ACM, 2004, pp. 85–96.
[14] P. Saxena, R. Sekar, and V. Puranik, “Efficient fine-grained binary instrumentation with applications to taint-tracking,” in Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization. ACM, 2008, pp. 74–83.
[15] A. Lakhotia and E. U. Kumar, “Abstracting stack to detect obfuscated calls in binaries,” in Source Code Analysis and Manipulation, 2004. Fourth IEEE International Workshop on. IEEE, 2004, pp. 17–26.
[16] A. V. Aho, R. Sethi, and J. D. Ullman, Compilers, Principles, Techniques. Addison Wesley, 1986.
[17] W. H. Fang Dingyi, Li Guanghui, “Research on deformation based binary,” Journal of Sichuan University (Engineering Science Edition), 2014,1:003.
[18] “Obfuscated samples,” http://www.cs.arizona.edu/projects/lynx/Samples/, Tech. Rep.
[19] C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi, and K. Hazelwood, “Pin: building customized program analysis tools with dynamic instrumentation,” in ACM Sigplan Notices, vol. 40, no. 6. ACM, 2005, pp. 190–200.
[20] J. Nagra and C. Collberg, Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection. Pearson Education, 2009.
[21] M. Christodorescu and S. Jha, “Static analysis of executables to detect malicious patterns,” DTIC Document, Tech. Rep., 2006.
[22] J. Zeng, Y. Fu, K. A. Miller, Z. Lin, X. Zhang, and D. Xu, “Obfuscation resilient binary code reuse through trace-oriented programming,” in Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. ACM, 2013, pp. 487–498.
[23] F. Peng, Z. Deng, X. Zhang, D. Xu, Z. Lin, and Z. Su, “X-force: Force-executing binary programs for security applications,” in Proceedings of the 2014 USENIX Security Symposium, San Diego, CA (August 2014), 2014.
[24] S. K. Udupa, S. K. Debray, and M. Madou, “Deobfuscation: Reverse engineering obfuscated code,” in Reverse Engineering, 12th Working Conference on. IEEE, 2005, pp. 10–pp.
[25] C. Wang, J. Davidson, J. Hill, and J. Knight, “Protection of software-based survivability mechanisms,” in Dependable Systems and Networks, 2001. DSN 2001. International Conference on. IEEE, 2001, pp. 193–202.
[26] N. D. Jones, C. K. Gomard, and P. Sestoft, Partial evaluation and automatic program generation. Peter Sestoft, 1993.
[27] X. Zhang, N. Gupta, and R. Gupta, “Locating faults through automated predicate switching,” in Proceedings of the 28th international conference on Software engineering. ACM, 2006, pp. 272–281.
[28] S. Lu, P. Zhou, W. Liu, Y. Zhou, and J. Torrellas, “Pathexpander: Architectural support for increasing the path coverage of dynamic bug detection,” in Microarchitecture, 2006. MICRO-39. 39th Annual IEEE/ACM International Symposium on. IEEE, 2006, pp. 38–52.
[29] A. Moser, C. Kruegel, and E. Kirda, “Exploring multiple execution paths for malware analysis,” in Security and Privacy, 2007. SP’07. IEEE Symposium on. IEEE, 2007, pp. 231–245.
[30] R. Johnson and A. Stavrou, “Forced-path execution for android applications on x86 platforms,” in Software Security and Reliability-Companion (SERE-C), 2013 IEEE 7th International Conference on. IEEE, 2013, pp. 188–197.
[31] Z. Wang, R. Johnson, R. Murmuria, and A. Stavrou, “Exposing security risks for commercial mobile devices,” in Computer Network Security. Springer, 2012, pp. 3–21.