0% found this document useful (0 votes)
1 views

Multi-stage Binary Code Obfuscation using Improved Virtual Machine

This paper introduces a multi-stage software obfuscation algorithm utilizing improved virtual machine techniques to enhance code security and efficiency. The method involves iteratively obfuscating a program with different interpretations, requiring adversaries to crack multiple intermediate results to understand the original code structure. The proposed approach aims to provide better resistance against reverse engineering while maintaining the functionality of the software.

Uploaded by

rtloweb
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
1 views

Multi-stage Binary Code Obfuscation using Improved Virtual Machine

This paper introduces a multi-stage software obfuscation algorithm utilizing improved virtual machine techniques to enhance code security and efficiency. The method involves iteratively obfuscating a program with different interpretations, requiring adversaries to crack multiple intermediate results to understand the original code structure. The proposed approach aims to provide better resistance against reverse engineering while maintaining the functionality of the software.

Uploaded by

rtloweb
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 14

Multi-stage Binary Code Obfuscation Using

Improved Virtual Machine

Hui Fang1 , Yongdong Wu1 , Shuhong Wang2 , and Yin Huang2


1
Institute for Infocomm Research,
1 Fusionpolis Way, 21-01, Singapore 138632
{hfang,wydong}@i2r.a-star.edu.sg
2
Sumavision Soft Tech Co., Ltd.,
15 Kaituo Road, Shangdi District, Beijing, 100085, China
{wangshuhong,huangyin}@sumavision.com

Abstract. A software obfuscator transforms a program into another


executable one with the same functionality but unreadable code imple-
mentation. This paper presents an algorithm of multi-stage software ob-
fuscation method using improved virtual machine techniques. The key
idea is to iteratively obfuscate a program for many times in using differ-
ent interpretations. An improved virtual machine (VM) core is appended
to the protected program for byte-code interpretation. Adversaries will
need to crack all intermediate results in order to figure out the structure
of original code. Compared with existing obfuscators, our new obfuscator
generates the protected code which performs more efficiently, and enjoys
proven higher level security.

1 Introduction

Software obfuscation refers to transformations on the code which becomes hard


to understand while preserving all functionalities. It plays an importance role
in protecting confidential data and algorithms from reverse engineering or virus
modification [12,11,22,8]. Ideally, an adversary possessing a well-obfuscated pro-
gram should be only able to learn program input/output like a black-box access.
Due to this, software obfuscation has received many research interests for the
last ten years [3,33,28,39,21,24,2,4,10].
The challenge in software obfuscation lies in whether or not guaranteed
security and fair performance can be provided for obfuscated binary code. Specif-
ically, code security implies resistance to static analysis and even dynamic anal-
ysis, and code efficiency implies that the obfuscated code should not run much
slower than the original code. Up to now, some practical metrics for software
obfuscation have been proposed in the literature [25,21,22,27,2,9]. Meanwhile,
obfuscation on Turing machine programs with formal definitions has been re-
searched intensively as well [3,28,15,42,6,5,17,7]. Unfortunately most practical
obfuscation techniques lack a well-founded theoretical base, and thus it is un-
clear how effectively they perform. We take consideration of both practical and

X. Lai, J. Zhou, and H. Li (Eds.): ISC 2011, LNCS 7001, pp. 168–181, 2011.

c Springer-Verlag Berlin Heidelberg 2011
Multi-stage Binary Code Obfuscation Using Improved Virtual Machine 169

theoretical obfuscation metrics, and design our obfuscation algorithm align to


theoretical definitions in principle.
We address the challenge by presenting an algorithm of multi-stage software
obfuscation using improved virtual machine. The key idea is to obfuscate a soft-
ware for many times while each time applying different interpretations in order
to improve security. To fulfil the purpose, an improved virtual machine core
responsible for byte-code interpretation is appended to the protected software.
Under this design, an adversary must crack all intermediate results in order to
figure out the structure of original code. Compared with existing obfuscators,
our new obfuscator creates obfuscated code which performances more efficiently,
and enjoys a higher security level.
The paper is organized as follows. Section 2 introduces the related work on
software obfuscation and virtual machine. Section 3 describes our approach
in two steps: block-to-byte virtual machine and multi-stage code obfuscation.
Section 4 analyzes the security of our new software obfuscation algorithm. Sec-
tion 5 provides experimental results. Finally, Section 6 draws a conclusion.

2 Related Work
Most existing obfuscation techniques on binary code fall into three categories:
– data transformation, such as name renaming and string encryption.
– instruction transformation, which replaces binary instructions using a library
of equivalent instructions.
– control flow transformation, which transforms the graph structure of pro-
gram control flow.
Data transformation does not alter program controls. Even the encrypted data
will have to be decrypted inside the program for use. The code for decryption
again faces the attack from reverse engineering. Therefore data obfuscation is
usually applied together with other complicated obfuscation techniques to in-
crease security [26,16,35].
Control flow transformation is relatively complicated [41,18,14,30,1]. Typi-
cally a control flow flattening method puts all basic blocks into a single switch
statement which maintains whole control flow. It obfuscates the order in which
the computations are carried out, in order to stand against static analysis. How-
ever, constant propagation on the switch variable will expose the next block
to be executed. Besides, one large switch statement will generate many jumps
which decreases program performance. Opaque predicates are boolean expres-
sions whose values are known to the obfuscator but difficult for adversary to de-
duce. Junk codes are usually inserted into the dead path of an opaque predicate.
However, for the same reason as above, there still exists risk that an adversary
may figure out the value of an opaque predicate by static analysis.
Instruction transformation refers to replacement of protected binary instruc-
tion with a block of instructions which is functionally equivalent [20,19,23,29,32].
The introduced blocks representing native instruction are written as byte-codes
170 H. Fang et al.

into the program. Those byte-codes are often maintained by a virtual machine
integrated with the obfuscated program. In practice, instruction transformation
works well against static analysis except for runtime disassembly. However, lit-
tle theoretical work has been carried out to show guarantee on its security and
performance on obfuscated software.
Virtual machine (VM) based obfuscation recently becomes popular for soft-
ware obfuscation, and it is probably the most sophisticated in the literature
[36,34,32]. It usually integrates several obfuscation techniques including data per-
mutation, instruction institution, and control flow transformation. As a result,
VM obfuscation is fairly good against dynamic analysis in practice [40,37,31].
We observe the common way how VM obfuscator works, and summarize a gen-
eral code structure for the program before and after obfuscation as shown in
Figure 1. Generally speaking, a VM section will be appended to the original
program, and the protected binary code will be transformed to byte-code, which
is interpreted by a VM core. Finally, the entry point of the program will be
redirected into VM code. To fulfil the byte-code fetching, VM core still needs
to save all registers and flags in its own context, and to restore upon exiting
byte-code interpretation.
Classical VM obfuscators suffer two drawbacks. Firstly, they generate ob-
fuscated software which runs much slower than the original one. It is largely
because of byte-code interpretation working style [37,40]. Secondly, the security
of VM obfuscated program relies merely on an uncustomized VM core inte-
grated with program rather than each individual program. VM does not restore
byte-codes to original instructions any more. Therefore success of attacking ob-
fuscated program requires two steps: understanding VM code, and decoding
mapping between binary instructions and byte-codes. One round VM obfusca-
tion will output relatively intelligible mapping, which allows an adversary to
perform instruction level analysis, and further to reconstruct the structure of
original software [34,32].

Fig. 1. Virtual machine based obfuscation


Multi-stage Binary Code Obfuscation Using Improved Virtual Machine 171

The existing works are promising under certain situations. However, the dan-
ger of software cracking is always changing and increasing [38,24]. Therefore we
propose a new approach on software obfuscation in next section, introducing a
more light-weighted obfuscator which generates harder understanding codes.

3 Our Approach

In this section we firstly introduce the concept of black box security, then present
new design of block-to-byte virtual machine, and describe a framework of multi-
stage code obfuscation based on improved virtual machine.
A program obfuscator is often regarded as a processor on computer programs,
which outputs a new program of the same functionality but with unreadable
code structure [28,10]. More precisely, a program obfuscator O is theoretically
defined to be a probabilistic Turing machine or Boolean circuit, which satisfies
three requirements [3]:

– (Functionality Equivalence) For every TM/circuit P and for every input


x : P (x) = O(P )(x).
– (Polynomial Slowdown) There exists a polynomial q(.) such that for every
TM/circuit P , |O(P )| ≤ q(|P |). TMs are additionally required that for every
input x, if P halts after t steps on x then O(P ) halts within q(t) steps on x.
– (Virtual Black Box) For any PPT A, there is a PPT oracle machine S and
a negligible function negl(.) such that for all TM/circuit P : |P r[A(O(P )) =
1] − P r[S P (1|P | ) = 1]| < negl(|P |).

Although Barak et al. [3] further proved that this kind of universal black box
obfuscator does not exist, the theoretical concept is still useful in evaluating
performance of code obfuscators. In other words, a good obfuscator shall as best
as possible promise three properties: function equivalence, code efficiency, and
black box security. In light of these requirements we present our customized VM
obfuscator below.

3.1 Block-to-Byte Virtual Machine

The core of a virtual machine(VM) is a dispatcher which transforms byte-code


to an implementation of binary instructions. To adapt to the purpose of program
obfuscation, virtual machine must have byte-codes populated in and contain the
implementations of all byte-codes for the program to protect. Specifically, a
virtual machine will fetch byte-code one by one, position the target address in
its jump table, and give control to the instruction in that address. So a complete
virtual machine to be appended to the obfuscated program will be

V := {Bytecodes, Impl, Jmptable, Dispatcher}.

Classical VM obfuscator will map each binary instruction to a byte-code, to-


gether with its implementation (as described in Algorithm 1). We revise the
172 H. Fang et al.

design and present a block-to-byte VM obfuscation algorithm, as shown in Al-


gorithm 2. The major difference lies in that a control flow graph (CFG) of the
program is set up in prior, and then the obfuscator maps each basic block of the
graph into a byte-code based on which the obfuscation is carried out.

Input: Original program P .


Output: Obfuscated program Q.
1 create a virtual machine V for P ;
2 V.Impl = {};
3 V.Bytecodes = {};
4 for binary instruction b ∈ P do
5 translate b into byte-code B with implementation I(b);
6 b = instruction “jump to V ”;
7 I(b)’s last instruction = “jump to next to b”;
8 V.Jmptable[B] = I(b);
9 V.Bytecodes+ = B;
10 V.Impl+ = I(b);
11 end
12 output P + V ;
Algorithm 1. Classical VM based obfuscation

Input: Original program P .


Output: Obfuscated program Q.
1 construct control flow graph, CF G(G);
2 create a virtual machine V for P ;
3 V.Impl = {};
4 V.Bytecodes = {};
5 for block BL ∈ CF G(P ) do 
6 translate BL into byte-code B with I(BL) = b∈BL I(b);
7 BL’s first instruction = “jump to V ”;
8 I(BL)’s last instruction = “jump to last of BL”;
9 V.Jmptable[B] = I(BL);
10 V.Bytecodes+ = B;
11 V.Impl+ = I(BL);
12 end
13 output P + V ;
Algorithm 2. Block-to-byte VM based obfuscation

Figure 2 shows the format for binary instructions and VM byte-codes respec-
tively. It also gives an example how a binary instruction was transformed into
byte-code together with an implementation.
VM dispatcher works on stack based style: it saves registers for native code
and create own VM stack. The return value of last execution for each byte-code
was saved in VM registers (var RegEip and var RegDI in Figure 3) for next byte-
code execution. VM dispatcher then obtains the target address by searching a
jump table using byte-code as index. Target address is the location that current
instruction will transfer to. VM obfuscator retrieves all target addresses of the
Multi-stage Binary Code Obfuscation Using Improved Virtual Machine 173

Fig. 2. Format of VM byte-code instruction and an example of implementation

original program in four different ways: for direct jump, target address is specified
in the original instruction; for conditional jump, there are two target addresses
with a predicate; for call instruction, one target address is set for called function,
and another one for return address; and for return instruction, target address is
stored on the stack.

3.2 Multi-staged Code Obfuscation

In this section we extend the technique of block-to-byte virtual machine to a


multi-stage obfuscation. The idea of multi-stage obfuscation algorithm is de-
scribed as follows. Given an original program P , we choose a random number n

Fig. 3. VM byte-codes are executed by a dispatcher


174 H. Fang et al.

to be the number of obfuscation stages, a one-way function f , and an obfuscation


function Obf . Then we calculate multiple copies {P0 , P1 , ..., Pn } of the program
together with the keys {K0 , K1 , ..., Kn } for each obfuscation stage, as shown in
Figure 4.
We iteratively obfuscate program P for n times. The obfuscation key Ki is
generated from each intermediate program Pi of the previous obfuscation stage,
and Ki is again applied to Pi to compute Pi+1 .

Ki = f (Pi ),
Pi+1 = Obf (Pi , Ki ).

The function f maps any program into a key in binary string, satisfying that: f
must have one-way hardness, and the output key can characterize the program.
The examples of this type of function include: MD5 hash value of program where
the program is feed as data, or the number of nodes in program’s control flow graph.

Fig. 4. The multi-stage obfuscation algorithm. Pn is output

The obfuscation of program requires to hide program’s data and/or control


flow while preserving all the functionalities. In other words, each copy Pi of
the program must be executable and function normally. Our idea is to extract
all jmp/jcc/call points of P , and transform such information into a jump ta-
ble. Then the jump table is obfuscated given a particular K and some dummy
codes. Original program P is thus modified accordingly to jump table to preserve
correct control. In other words, a separate hidden jump table will take control
over program’s running. Adversaries need to crack all intermediate obfuscated
programs in order to recover original code’s control flow.
For intra-block instructions or a single instruction, we use a revised tree struc-
ture to describe the whole process of multi-stage obfuscation. In this tree struc-
ture, each node represents a list of binary instructions (as shown in example
of Figure 5). The root node x1 refers to only one binary instruction, denoted
by a circle. It links to its three children, V1 , V2 , V3 , which are different imple-
mentations of x1 . The children are called byte-codes, drawn in rectangles. Each
byte-code, e.g. V1 , contains a list of binary instructions, e.g. y1 → y2 → y3 .
In Stage-1 obfuscation, x1 is assumed to be mapped into byte-code V2 ; further
in Stage-2, y4 and y5 of V2 are mapped into V5 and V6 respectively. The path
Multi-stage Binary Code Obfuscation Using Improved Virtual Machine 175

Fig. 5. Tree structure used in multi-stage obfuscation


selection from an earlier stage to next stage is determined by Ki . In the example
case, a formal induction of resulted code would be
x1 = V2
= y4 → y5
= V5 → V6
= (z3 → z4 → z5 ) → (z6 → z7 )
= z3 → z4 → z5 → z6 → z7 .

4 Security Analysis
This section analyzes the security of multi-stage obfuscated program in two as-
pects: code efficiency and black box security. Specifically we strengthen the black
box security by introducing code polymorphism during multi-stage obfuscation,
and improve the code efficiency by removing unnecessary jump instructions dur-
ing block-to-byte VM obfuscation.

4.1 Multi-stage Polymorphism


Polymorphism refers to that one binary instruction could have many byte-code
interpretation with equivalent function. It is often used in code obfuscation to
improve the difficulty in reversing program to original status.
When one instruction was obfuscated over twice, the mapping relationships
from binary to byte codes become unrecognizable, due to many possible instruc-
tion combinations. Given an instruction sequence z3 → z4 → z5 → z6 → z7 ,
an adversary needs to separate them into byte-codes to understand the original
program structure. In other words, one cannot easily split a sequence of instruc-
tions into correct {V5 , V6 }, and further obtain byte code V2 which refers to x in
first stage. Generally speaking, the fan-out width W of each binary node and
the block size L of byte-code node for each stage determine the obfuscation com-
plexity. In addition, the number n of stages is randomly chosen to control the
complexity. The complexity of guessing increases exponentially with the num-
ber of stages. In this sense, multi-stage polymorphism makes the obfuscation of
software more secure than the one obfuscated by single VM obfuscation. This
claim is proved in Theorem 1.
176 H. Fang et al.

Theorem 1. An n-stage polymorphism tree provides C(n) possible implemen-


n−1
tations for root node given constant W and L, where C(n) = W L +...+L+1 .

Proof. Use mathematical induction. When n = 1, root node links to W children


which are all available choices. So C(1) = W satisfies the equation. Assume
k−1
C(k) = W L +...+L+1 , and consider the case when n = k + 1. Firstly we notice
that the number of choices owned by a binary component of each stage-1 node
is C(k). Since each node has L components, there will be C(k)L choices for
solution passing through this node. Secondly we notice that the root node can
choose path from its W children. So the total possible paths will be

C(k + 1) = W ∗ C(k)L
k−1
= W ∗ (W L +...+L+1 L
)
L +...+L2 +L
k
= W ∗ (W )
Lk +...+L2 +L+1
=W ,

which completes the proof. 

4.2 Improved Execution Efficiency

The classical VM obfuscator transforms protected code into byte-codes. The


resulted obfuscated program then interprets byte-codes sequentially, and runs
the implementation of byte-codes accordingly. However, the program control
will be unconditionally switched to VM dispatcher every time when one byte-
code interpretation is completed. The number of jmps inserted for byte-code
interpretation is proportional to the number of binary instructions. It is well
known that the jump operations block the instruction streamline for execution.
In contrast, our block-to-byte VM obfuscation chooses a “basic block” to exe-
cute before jumping back to VM dispatcher. There will be no new jmp/jcc/call
instruction inserted inside one basic block. The obfuscated program only needs
to interpret bytes representing basic blocks and follows the original control flow
of the program. So the number of jmps inserted for byte-code interpretation is
only proportional to the number of nodes in program control flow graph. By
interpreting a block of instructions into only one byte-code, our multi-stage VM
obfuscator is able to reduce those unnecessary jumps during code obfuscation.
The number of jmp instructions in the program plays a heavy part in slowing
down the program execution time. Given an average block size L of control flow
graph of the program, our block-to-byte VM obfuscator will generate only L1 the
number of jmp instructions by the classical one.

5 Experiments

The testing experiment on our multi-stage VM obfuscation module was carried out
on WinXP 2.4GHz CPU and 1G RAM platform. A demo of obfuscation out is given
Multi-stage Binary Code Obfuscation Using Improved Virtual Machine 177

in Appendix A. Three parameters are take into consideration: structure of con-


trol flow graph, program size, and running time of obfuscated program. We adopt
IDApro [13], a disassembly tool to facilitate view on IA-32 executables. VMprotect
[40], a popular VM obfuscation software, was chosen for empirical comparison.

5.1 Control Flow Graph


The complexity of a program’s control flow graph reflects program intelligibility
to certain extent. We capture the number of nodes and edges in graph as an indi-
cator of graph complexity. Accordingly, the obfuscation level is hereafter defined
as the ratio of number of nodes or edges in CFG before and after obfuscation.
Table 1 presents the obfuscation level for programs using multi-stage VM obfus-
cation. It implies that the control flow graph becomes interleaved which leads to
high obfuscation level of program.
Table 1. The number of nodes and edges of control flow graph before and after
obfuscation

Original Obfuscated Obfuscation Level


Program
#nodes,N #edges,E #nodes,N2 #edges,E2 N2 /N E2 /E
md5 437 164 581 353 1.33 2.15
calc 458 175 746 308 1.63 1.76
draw 397 96 1439 258 3.62 2.69
crc32 151 47 354 125 2.34 2.66
aes 1908 517 3465 1392 1.82 2.70

5.2 Program Size


Program size is measured in two parameters: the number of instructions, and
the size of program sections in bytes. Table 2 shows the program size of several
programs before and after obfuscation. It tells that the number of instructions
will normally increase at least four times after obfuscation, which implies the
slowdown of obfuscated program.

5.3 Running Time


Table 3 provides the execution time of several x86 programs on average of 10000
times. It shows that our block obfuscator generates more efficient obfuscated
Table 2. Program size before and after obfuscation

Original Obfuscated Increment Factor


Program
#instr, I bytes #instr, I2 bytes I2 /I
md5 675 1776 2837 9456 4.20
calc 485 825 2051 9559 4.23
draw 983 2109 8012 2935 8.15
crc32 231 583 1143 5665 4.95
aes 12302 32369 77748 314572 6.32
178 H. Fang et al.

Table 3. Execution time (secs) of obfuscated programs

Original VMprotect BlockVM MultiBlockVM(n = 2) Slowdown


Program
T T0 T1 T2 T2 /T
md5 0.34 3.85 2.67 6.03 17.73
calc 0.12 3.40 2.34 8.73 72.75
draw 0.58 6.81 6.21 15.95 27.50
crc32 0.15 2.54 2.31 8.59 57.27
aes 0.23 4.59 5.43 11.15 48.48

code than classical VM obfuscator in one stage. However when given multi-stage
obfuscation, the execution time of obfuscated program increases quickly due to
more complicated obfuscation.

6 Conclusion
We have presented a new method to obfuscate code in multiple stages to protect
software from reverse engineering. The key idea is to implement a block-to-byte
virtual machine to interpret byte-codes, while modifying program structure itera-
tively. Block obfuscation hides the binary details into byte-codes while improving
the program execution efficiency; multi-stage obfuscation hides the control flow
of program in a more complicated level by using a polymorphism tree. Literally,
an adversary will have to decode all n variants of program to obtain the struc-
ture of original program. Meanwhile compared with classical byte-code virtual
machine obfuscation, block obfuscation makes the program run more efficiently
by removing unnecessary jump instructions.

Acknowledgements. This paper is sponsored by the joint research project of


MOST(2010DFA11110). We are grateful to Huang Xinyi for very helpful discus-
sions and comments.

References
1. Abadi, M., Plotkin, G.: On protection by layout randomization. In: 23rd IEEE
Computer Security Foundations Symposium, pp. 337–351 (2010)
2. Anckaert, B., Madou, M., De Sutter, B., De Bus, B., De Bosschere, K., Preneel,
B.: Program obfuscation: a quantitative approach. In: ACM Workshop on Quality
of Protection, pp. 15–20 (2007)
3. Barak, B., Goldreich, O., Impagliazzo, R., Rudich, S., Sahai, A., Vadhan, S., Yang,
K.: On the (Im)possibility of obfuscating programs. In: Kilian, J. (ed.) CRYPTO
2001. LNCS, vol. 2139, pp. 1–18. Springer, Heidelberg (2001)
4. Beaucamps, P., Filiol, E.: On the possibility of practically obfuscating programs
towards a unified perspective of code protection. Journal in Computer Virology 3,
3–21 (2007)
5. Bitansky, N., Canetti, R.: On Strong Simulation and Composable Point Obfusca-
tion. In: Rabin, T. (ed.) CRYPTO 2010. LNCS, vol. 6223, pp. 520–537. Springer,
Heidelberg (2010)
Multi-stage Binary Code Obfuscation Using Improved Virtual Machine 179

6. Canetti, R., Dakdouk, R.R.: Obfuscating Point Functions with Multibit Output.
In: Smart, N.P. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, pp. 489–508. Springer,
Heidelberg (2008)
7. Canetti, R., Tauman Kalai, Y., Varia, M., Wichs, D.: On Symmetric Encryption
and Point Obfuscation. In: Micciancio, D. (ed.) TCC 2010. LNCS, vol. 5978, pp.
52–71. Springer, Heidelberg (2010)
8. Cappaert, J., Preneel, B., Anckaert, B., Madou, M., De Bosschere, K.: Towards
tamper resistant code encryption: Practice and experience. In: Chen, L., Mu, Y.,
Susilo, W. (eds.) ISPEC 2008. LNCS, vol. 4991, pp. 86–100. Springer, Heidelberg
(2008)
9. Ceccato, M., Di Penta, M., Nagra, J., Falcarin, P., Ricca, F., Torchiano, M.,
Tonella, P.: The effectiveness of source code obfuscation -an experimental assess-
ment. In: The 17th IEEE International Conference on Program Comprehension
(ICPC), pp. 178–187. IEEE Computer Society, Los Alamitos (2009)
10. Collberg, C.: Tutorial: code transformation techniques for software protection. In:
ACM SIGPLAN 2009 Conference on Programming Language Design and Imple-
mentation, PLDI 2009 (2009)
11. Collberg, C., Thomborson, C.: Watermarking, tamper-proofing, and obfuscation
- tools for software protection. IEEE Transactions on Software Engineering 28,
735–746 (2002)
12. Collberg, C., Thomborson, C., Low, D.: A taxonomy of obfuscating transforma-
tions. Technical report (1997)
13. DataRescue. The ida pro disassembler and debugger (2005),
http://www.hex-rays.com/idapro/
14. Ge, J.: Control flow based obfuscation. In: Proceedings of the 5th ACM Workshop
on Digital Rights Management (DRM), pp. 83–92. ACM Press, New York (2005)
15. Goldweisser, S.: On the impossibility of obfuscation with auxiliary input, pp. 553–
562. IEEE Computer Society, Los Alamitos (2005)
16. Hohenberger, S., Rothblum, G.N., Shelat, A., Vaikuntanathan, V.: Securely Ob-
fuscating Re-encryption. In: Vadhan, S.P. (ed.) TCC 2007. LNCS, vol. 4392, pp.
233–252. Springer, Heidelberg (2007)
17. Hohenberger, S., Waters, B.: Constructing Verifiable Random Functions with Large
Input Spaces. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 656–
672. Springer, Heidelberg (2010)
18. Jhala, R., Majumdar, R.: Path slicing. In: Proceedings of ACM SIGPLAN Con-
ference on Programming Language Design and Implementation, PLDI 2005, pp.
38–47. ACM, New York (2005)
19. Kanzaki, Y., Monden, A., Nakamura, M.: A software protection method based
on instruction camouflage. IEICE Transactions on Fundamentals of Electronics,
Communications and Computer Sciences (Japanese Edition) J87-A(6):755-767, 47–
59 (2004)
20. Linn, C., Debray, S.: Obfuscation of executable code to improve resistance to
static disassembly. In: ACM Conference on Computer and Communications Se-
curity (CCS), pp. 290–299. ACM Press, New York (2003)
21. Lynn, B., Prabhakaran, M., Sahai, A.: Positive Results and Techniques for Obfusca-
tion. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027,
pp. 20–39. Springer, Heidelberg (2004)
22. Madou, M., Anckaert, B., De Bus, B., De Bosschere, K.: On the effectiveness of
source code transformations for binary obfuscation. In: Proc. of the Int’l Conf. on
Software Engineering Research and Practice (SERP 2006), pp. 527–533 (2006)
180 H. Fang et al.

23. Madou, M., Anckaert, B., Moseley, P., Debray, S.K., De Sutter, B., De Bosschere,
K.: Software protection through dynamic code mutation. In: Song, J.-S., Kwon, T.,
Yung, M. (eds.) WISA 2005. LNCS, vol. 3786, pp. 194–206. Springer, Heidelberg
(2006)
24. Madou, M., Van Put, L., De Bosschere, K.: Understanding obfuscated code. In:
14th IEEE Int’l Conf. on Program Comprehension (ICPC), pp. 268–274 (2006)
25. Mit, M.E., Ernst, M.D.: Static and dynamic analysis: synergy and duality. In:
WODA 2003: ICSE Workshop on Dynamic Analysis, pp. 24–27 (2003)
26. Monden, A., Monsifrot, A., Thomborson, C.: Security improvements for encrypted
interpretation. In: Proc. 3rd Workshop on Application Specific Processors (WASP)
Digest, pp. 19–26 (2004)
27. Naeem, N.A., Batchelder, M., Hendren, L.: Metrics for measuring the effectiveness
of decompilers and obfuscator. In: 15th IEEE Int’l. Conf. on Program Comprehen-
sion, pp. 253–258 (2007)
28. Ogiso, T., Sakabe, Y., Soshi, M., Miyaji, A.: Software obfuscation on a theoretical
basis and its implementation. IEICE Transactions on Fundamentals of Electronics,
Communications and Computer Sciences E86-A(1), 176–186 (2003)
29. Popov, I.V., Debray, S.K., Andrews, G.R.: Binary obfuscation using signals. In:
USENIX Security Symposium (2007)
30. Dalla Preda, M., Madou, M., De Bosschere, K., Giacobazzi, R.: Opaque Predicates
Detection by Abstract Interpretation. In: Johnson, M., Vene, V. (eds.) AMAST
2006. LNCS, vol. 4019, pp. 81–95. Springer, Heidelberg (2006)
31. Rolles, R.: X86 virtualizer (2008), http://rewolf.pl/
32. Rolles, R.: Unpacking virtualization obfuscators. In: Proceedings of the 3rd
USENIX Conference on Offensive Technologies, WOOT 2009, p. 1. USENIX As-
sociation (2009)
33. Schwarz, B., Debray, S.K., Andrews, G.R.: Disassembly of executable code revis-
ited. In: 10th Working Conference on Reverse Engineering, pp. 45–54 (2002)
34. Sharif, M., Lanzi, A., Giffin, J., Lee, W.: Automatic reverse engineering of malware
emulators. In: Proceedings of the 30th IEEE Symposium on Security and Privacy,
pp. 94–109. IEEE Computer Society, Los Alamitos (2009)
35. Sivadasan, P., Sojan Lal, P.: Jconsthide: a framework for java source code constant
hiding. CoRR (2009)
36. Smith, J.E., Nair, R.: Virtual machines: versatile platforms for systems and pro-
cesses. Morgan Kaufmann, San Francisco (2005)
37. Oreans Technologies. Code virtualizer, http://oreans.com/codevirtualizer.php
38. Udupa, S.K., Debray, S.K., Madou, M.: Deobfuscation: reverse engineering obfus-
cated code. In: 12th Working Conference on Reverse Engineering, pp. 45–54 (2005)
39. van Oorschot, P.C.: Revisiting Software Protection. In: Boyd, C., Mao, W. (eds.)
ISC 2003. LNCS, vol. 2851, pp. 1–13. Springer, Heidelberg (2003)
40. VMPsoft. Vmprotect software, http://www.vmprotect.ru/
41. Wang, C., Hill, J., Knight, J.C., Davidson, J.W.: Protection of software-based
survivability mechanism. In: Proceedings of the International Conference on De-
pendable Systems and Networks (formerly: FTCS), DSN 2001, pp. 193–202. IEEE
Computer Society, Los Alamitos (2001)
42. Wee, H.: On obfuscating point functions. In: Proceedings of the 37th Annual ACM
Symposium on Theory of Computing, STOC 2005, pp. 523–532. ACM, New York
(2005)
Multi-stage Binary Code Obfuscation Using Improved Virtual Machine 181

A Sample Output of Obfuscation

A function named modexp is to be obfuscated:

// modular exponentiation = base^exp % mod


int modexp (int base, int exp, int mod)
{
int c = 1, expNum = 0;
do
{
expNum++;
c = (base * c) % mod;
}
while (expNum < exp);
return c;
}

Fig. 6. CFG of obfuscated modexp function

You might also like