Inside Code Virtualizer
by scherzo - [email protected]
Contents
1 Introduction
1.1 About Code Virtualizer
5.3 Acknowledgments
DISCLAIMER
1 Introduction
protect their sensitive code areas against Reverse Engineering. Code Virtualizer
has been designed to enact high security for your sensitive code while requiring
Code Virtualizer will convert your original code into Virtual Opcodes that
and the Virtual Machine itself are different for every protected application,
Code Virtualizer can protect your code in any x32 and x64 native PE files,
like executable files (EXEs), system services, DLLs, OCXs, ActiveX controls,
This article aims to explain how Code Virtualizer works. During the last
month, I spent all my free time analysing the Code Virtualizer Demo 1.0.1.0
that this is the best software I have ever seen. Not the best in terms of
protection, but in terms of organization. This was the most pleasing
Three important things to notice are that the description and explanation
Most things that I am going to say apply only to the 1.0.1.0 version
of Code Virtualizer. For comments on new versions, see "Hopes for the Future
This article is divided into three parts. First I am going to talk about how the
Virtual Machine is generated and why Oreans[4] says that each Virtual Machine
has its own characteristics. Secondly I use the concepts described before to
explain how the Virtual Opcodes are generated, how they are executed and
why they emulate the original code of an application. The last part is a bonus:
you are going to learn how to make an unpacked version of Code Virtualizer
full.
Enjoy this article; I hope you learn something from reading it.
tually, not me but the Oreans developers did that, probably referring to the Themida
Virtual Machine.
Basically each Virtual Machine has 150 handlers and a main handler. By
handler, I mean a kind of function that will deal with the Virtual Opcodes.
In general, they are small (one to six lines of assembly code) and it is really
Next I will show the first structure that I called Handler_Information and
• DWORD start // the address of the start of this handler in the Code
Virtualizer file
• DWORD end // the address of the end of this handler in the Code
Virtualizer file
• DWORD address // the address of the start of the handler in the protected
file
• WORD order // random number (from 0Eh to A4h) that indicates the
place of the handler in the protected file
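As a rough illustration, the record described above can be modelled as a packed structure. This is only a sketch in Python: the `id` field (referred to elsewhere as Handler_Information.id) is assumed here to be a WORD placed first, and the field order, the packing and the helper `layout_by_order` are guesses made for illustration, not the layout actually used by Code Virtualizer.

```python
import random
import struct
from dataclasses import dataclass

# Assumed layout: WORD id, DWORD start, DWORD end, DWORD address, WORD order,
# little-endian, no padding. Only the field names come from the article.
HANDLER_INFO = struct.Struct("<HIIIH")


@dataclass
class HandlerInformation:
    id: int       # handler identifier (assumed WORD)
    start: int    # start of the handler in the Code Virtualizer file
    end: int      # end of the handler in the Code Virtualizer file
    address: int  # start of the handler in the protected file
    order: int    # random value in 0Eh..A4h giving the placement

    def pack(self) -> bytes:
        return HANDLER_INFO.pack(self.id, self.start, self.end,
                                 self.address, self.order)

    @classmethod
    def unpack(cls, raw: bytes) -> "HandlerInformation":
        return cls(*HANDLER_INFO.unpack(raw))


def layout_by_order(handlers):
    """Return the handlers in the order they would be placed (hypothetical)."""
    return sorted(handlers, key=lambda h: h.order)


# Hypothetical values, for illustration only.
rng = random.Random(0)
orders = rng.sample(range(0x0E, 0xA5), 3)  # distinct values in 0Eh..A4h
handlers = [HandlerInformation(i, 0x1000 * i, 0x1000 * i + 8, 0, o)
            for i, o in enumerate(orders)]
placed = layout_by_order(handlers)
```

Sorting by `order` is one simple way to picture how a random value per handler decides where each handler ends up in the protected file.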
This is the main structure used to generate the VM. I will not show
you each of the 150 handlers. That would be tedious, but if you want to study Code
Virtualizer in more depth, you must read and understand them one by one. I will show you
just the handler I showed in the figure above (figure 1; id = 0000h; start =
There is a particularity in the main handler: you can see three times the
The first DWORD is the address of the seventh line of the main handler in the
protected file. The second one is the "image base" of the Virtual Machine. The
and the Virtual Machine itself are different for every protected application,
avoiding a general attack over Code Virtualizer." is not a very important feature.
The first step done by Code Virtualizer is to write the main handler. Next the
This piece of code looks for LODS instructions. This does not apply to
handlers 0154h and 0156h. But why these checks? Well, the LODS instruction
the security, Oreans developers insert random code after the LODS instruction.
To do that, they use another structure that I have called Special_Handler. Here
you are:
So before those operations, the handler 0000h (figure 2) will be like this:
Figure 6: Handler 0000h before addition of 4 instructions
The next step is another security feature. Some kinds of instructions are
means that the code of each handler will be obfuscated and, moreover, this mutation
this option only changes the complexity of the mutated opcode. This is really
low or highest in a general attack on Code Virtualizer (for more comments, see
Before that, a JUMP to the main handler is written so the next handler will
be called.
The next security feature is quite fun to see: all 150 handlers are mixed
So in the end there will be completely obfuscated, unique and difficult code
those handlers work but I promise that it will be clear in section 3.3.
The figure below shows a macro that is not virtualized. The code that will be
Next a PUSH 0040108Dh and RET will be added to the original code so the
After that, the exported function Oreansf1.F1 disassembles the original code
as you can see below. It was really a surprise to me when I saw that; I expected
that Code Virtualizer would treat the code through the bytes of the original
code, not through strings. It uses Delphi functions to handle strings and I think
the main and most complex work of assembling the assembly code into Code
Virtualizer syntax and generating the most important structure, which I have called
OreansX2.
OreansX2 structure:
• WORD unknown // unknown use
OreansX2.instruction instruction
00 LOAD
01 STORE
02 MOVE
03 IFJMP
04 EXTRN
05 UNDEF
06 IMULC
07 ADC
08 ADD
09 AND
0A CMP
0B OR
0C SUB
0D TEST
0E XOR
0F MOVZX
10 MOVZX_W
11 LEA
12 INC
13 RCL
14 RCR
15 ROL
16 ROR
17 SAL
18 SAR
19 SHL
1A SHR
1B DEC
1C NOP
1D MOVSX
1E MOVSX_W
1F CLC
20 CLD
21 CLI
22 CMC
23 STC
24 STD
25 STI
26 HLT
Table 3: Table of possible instructions for OreansX2 structure (cont.)
27 BT
28 BTC
29 BTR
2A BTS
2B SBB
2C MUL
2D IMUL
2E DIV
2F IDIV
30 BSWAP
31 NEG
32 NOT
33 RET
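The instruction codes listed above map naturally onto a lookup table. The following Python sketch only transcribes those codes; the names `MNEMONICS` and `mnemonic` are made up for this example and are not taken from Code Virtualizer.

```python
# Mnemonics in table order; the list index is the OreansX2.instruction value.
MNEMONICS = [
    "LOAD", "STORE", "MOVE", "IFJMP", "EXTRN", "UNDEF", "IMULC", "ADC",  # 00-07
    "ADD", "AND", "CMP", "OR", "SUB", "TEST", "XOR", "MOVZX",            # 08-0F
    "MOVZX_W", "LEA", "INC", "RCL", "RCR", "ROL", "ROR", "SAL",          # 10-17
    "SAR", "SHL", "SHR", "DEC", "NOP", "MOVSX", "MOVSX_W", "CLC",        # 18-1F
    "CLD", "CLI", "CMC", "STC", "STD", "STI", "HLT", "BT",               # 20-27
    "BTC", "BTR", "BTS", "SBB", "MUL", "IMUL", "DIV", "IDIV",            # 28-2F
    "BSWAP", "NEG", "NOT", "RET",                                        # 30-33
]


def mnemonic(code: int) -> str:
    """Return the Code Virtualizer mnemonic for an instruction code."""
    if not 0 <= code < len(MNEMONICS):
        raise ValueError(f"unknown OreansX2.instruction value: {code:#04x}")
    return MNEMONICS[code]
```

A tool that dumps OreansX2 structures could use such a table to print readable listings instead of raw numbers.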
OreansX2.suffix suffix
00
01 ADDR
02 %sADDR, %d
03 %sADDR, %.8x%h
04 BYTE PTR %s[ADDR]
05 WORD PTR %s[ADDR]
06 DWORD PTR %s[ADDR]
07 QWORD PTR %s[ADDR]
08 %sBYTE PTR [%.8x%h]
09 %sWORD PTR [%.8x%h]
0A %sDWORD PTR [%.8x%h]
0B %sQWORD PTR [%.8x%h]
0C ADDR, BYTE PTR %s[%.8x%h]
0D ADDR, WORD PTR %s[%.8x%h]
0E ADDR, DWORD PTR %s[%.8x%h]
0F ADDR, QWORD PTR %s[%.8x%h]
10 %s%d
11 %s%.8x%h
12 reserved
13 reserved
14 reserved
15 reserved
Table 5: Table of possible suffixes (cont.)
16 reserved
17 reserved
18 BYTE
19 WORD
1A DWORD
1B QWORD
1C reserved
1D reserved
1E FLAGS
1F %s[ADDR]
20 %sBYTE %d
21 %sWORD %d
22 %sDWORD %d
23 %sQWORD %d
As you can see, the syntax is quite logical. It uses XOR, ADD, etc. for
well known instructions and obvious names like MOVE, STORE, LOAD for
"special" instructions; the suffixes use a single variable ADDR and well known
from the disassembled original code but I think that this is not a problem if
you do some tests to see the pattern. Next I show you one assembly instruction
respective OreansX2 structure (see the file [5] for more examples).
Figure 10: Example of Code Virtualizer syntax
I do not know if you have noticed it, but the first parameter of the first
OreansX2 structure above is 80000002h. 02 means MOVE, as you can see in
Table 2, and the 80 means that this instruction has a relative address. That is,
the address F0000028h is relative to the image base of the Virtual Machine.
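The way the value 80000002h is read above can be sketched as a small helper. This is a minimal Python sketch under two assumptions that the text does not confirm: that the instruction code occupies only the low byte, and that the relative-address marker is exactly the value 80h in the high byte. The name `split_first_parameter` is hypothetical.

```python
RELATIVE_FLAG = 0x80  # high byte seen in 80000002h (assumed to be the only marker)


def split_first_parameter(value: int):
    """Split the first OreansX2 DWORD into (instruction code, is_relative).

    Assumption: low byte = instruction code, high byte = relative marker.
    """
    high_byte = (value >> 24) & 0xFF
    instruction = value & 0xFF
    return instruction, high_byte == RELATIVE_FLAG


instruction, is_relative = split_first_parameter(0x80000002)
```

For 80000002h this yields instruction code 02 (MOVE) with the relative-address marker set, matching the reading given in the text.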
be done to reach the next structure that I have called Pre_Handler. The size
• BOOL is_special // True if the original opcode is any kind of call, jump,
conditional jump and others. In this case, a special structure will be
generated for those instructions
• 7 bytes unknown
So now the principal structure that is directly related with the Virtual Op-
• WORD handler // this is the main parameter: it determines which handler
must be called. It is equivalent to Handler_Information.id
• BYTE type_of_handler // 0 if the handler does not read Virtual Opcodes
through the LODS instruction. 1, 2, 4, 8 if the handler reads 1, 2, 4, 8 Virtual
Opcodes
• DWORD data1 // data for the Code Virtualizer instruction (for example,
for LOAD 18h, data1 will be 18h)
• DWORD data2 // data for the case of a 64-bit Code Virtualizer instruction
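To make the fields above concrete, here is a small Python model of one Handler record. It is only a sketch: it covers just the four fields listed here, the validation rule is inferred from the description of type_of_handler, and the handler id used for LOAD in the example (0010h) is a made-up value, not the real one.

```python
from dataclasses import dataclass

# type_of_handler values described in the text: 0 (no LODS read) or 1, 2, 4, 8.
VALID_TYPES = (0, 1, 2, 4, 8)


@dataclass
class Handler:
    handler: int          # WORD, equivalent to Handler_Information.id
    type_of_handler: int  # BYTE, see VALID_TYPES
    data1: int = 0        # DWORD, e.g. 18h for LOAD 18h
    data2: int = 0        # DWORD, used for 64-bit instructions

    def __post_init__(self):
        if self.type_of_handler not in VALID_TYPES:
            raise ValueError(f"bad type_of_handler: {self.type_of_handler}")

    def reads_virtual_opcodes(self) -> bool:
        """True if the handler fetches data through a LODS instruction."""
        return self.type_of_handler != 0


# Hypothetical record for "LOAD 18h"; the handler id 0x0010 is invented.
load_18h = Handler(handler=0x0010, type_of_handler=1, data1=0x18)
```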
This is not so complicated, but if I put each case here, this article would be
too long. So I will just comment on how this works; if you want more details, see
[6].
Basically each vector of Handler structures starts with the handler 015Bh
and ends with the handlers 0161h and 015Ch. The handlers 015Bh and 015Ch
do not actually exist. They are there just to tell Code Virtualizer that special
initiated and when it is finished. This special code will be shown shortly.
tures is generated for each of the cases: MOVE, LOAD, STORE, SHL, ADD,
SUB, IFJMP, RET, UNDEF and default case (for the other Code Virtualizer
instructions). You can see more details about those sequences in [6].
can finally understand the brilliant part of Code Virtualizer: how the Virtual
The first thing to say is about when Code Virtualizer finds the handlers
015Bh and 015Ch. There is pre-built virtualized code (this means that the
Code Virtualizer instructions and the other structures are not there) that is
responsible for initializing and uninitializing the Virtual Machine, for example, capturing
or restoring the registers and flags before the protected application executes its
Virtual Opcodes.
So now I am going to talk about the generation of Virtual Opcodes given the
Handler structure. The first thing that Code Virtualizer does is quite surprising.
is, those Virtual Opcodes are going to be executed but they will not change
anything in the program (like a sequence of NOPs) and so they are useful to
obfuscate the real Virtual Opcodes. Besides, there are five different sequences of
"fake" Virtual Opcodes, making the analysis of the program even more difficult.
Moreover, the option Virtual Opcode Obfuscation (low, normal, high, highest)
is strictly related (I mean only related) to these "fake" Virtual Opcodes.
Depending on that option, the chance that the random number generator allows the
recursive execution of the specific CALL more than once can be increased
there can be a lot of "fake" Virtual Opcodes. They can increase the size of the
Apart from the "fake" Virtual Opcodes, you can say that the Virtual Opcodes
would be identical if you protect an application twice and compare the Virtual
Opcodes. What makes them different is a global variable in the Code Virtualizer
So if the handler 0010h must be called, given the Handler_Information.order
and the Special_Handler structure (see sections 2.1 and 2.2 for the explanation
of these structures), the inverse operations of the ones described in Table 1 (that
is ADD, SUB, XOR) will be executed to reach the correct Virtual Opcode. The
Machine and the execution of the Virtual Opcodes. To do that, I will use a
file that I prepared and that does not have fragmented handlers or the mutation
engine [7].
The value pushed is the address of the first Virtual Opcode and the jump is
Figure 14: Virtual Opcodes
The code starting at 004072D8h is always called before the execution of every
The key is initialized with the address of the first Virtual Opcode and it
is stored in the EBX register. The ESI register holds the address of the
Virtual Opcode currently being read and the EDI register holds the image base of the Virtual
Machine. The stack is used to store values and the EAX register is used for
So when the code reaches the address 004072D8h, the registers are like this:
Now the byte 62h is read and after some operation with the key (those
random operations explained in section 2.2; see figure 15), when the code
As you can see, the key was changed and the ESI register was updated. Now
the code jmp dword ptr ds:[edi+eax*4] seems obvious: as EDI has the image
base of the Virtual Machine, the EAX value obtained from the Virtual Opcode
plus some operation is very important to call the handler if you notice that there
Figure 18: Piece of table of pointers to handlers
By now, you know how every handler is called and it is possible to explain
why the Virtual Opcodes are unique for every protected application: because
of the key. The key is changed many times and it is address dependent. As the
Virtual Opcodes depend on the key (see section 3.2 for explanation) and the
size of the Virtual Machine is not constant, the Virtual Opcodes are unique.
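The dispatch described above (read a byte, combine it with the key, update the key, jump through the table of pointers) can be pictured with a toy model. This Python sketch is not the real algorithm: the concrete operations (XOR with the low byte of the key, SUB 21h, adding the index to the key), the one-byte handler index and the names `decode_byte`, `encode_byte`, `encode` and `run` are all assumptions chosen for illustration, since the real operations are picked at random as explained in the text.

```python
MASK32 = 0xFFFFFFFF


def decode_byte(raw: int, key: int):
    """Main-handler step (hypothetical operations): byte -> handler index."""
    index = ((raw ^ (key & 0xFF)) - 0x21) & 0xFF  # XOR with the key, then SUB
    return index, (key + index) & MASK32          # the key changes every step


def encode_byte(index: int, key: int):
    """Protector side: the inverse operations, applied in reverse order."""
    raw = ((index + 0x21) & 0xFF) ^ (key & 0xFF)  # ADD, then XOR with the key
    return raw, (key + index) & MASK32


def encode(indices, first_opcode_address: int) -> bytes:
    key = first_opcode_address  # key starts as the address of the first opcode
    out = bytearray()
    for index in indices:
        raw, key = encode_byte(index, key)
        out.append(raw)
    return bytes(out)


def run(stream: bytes, first_opcode_address: int, table) -> list:
    key = first_opcode_address  # plays the role of EBX
    esi = 0                     # offset of the Virtual Opcode being read
    trace = []
    while esi < len(stream):
        raw = stream[esi]       # LODSB: read one byte and advance ESI
        esi += 1
        index, key = decode_byte(raw, key)
        trace.append(table[index])  # stands in for jmp dword ptr [edi+eax*4]
    return trace


table = {0x00: "handler_0000", 0x10: "handler_0010"}  # made-up handler names
program = [0x10, 0x00, 0x10]
stream_a = encode(program, 0x004072F4)  # hypothetical start addresses
stream_b = encode(program, 0x00408123)
```

Encoding the same handler sequence at two different start addresses gives two different byte streams that still decode to the same handlers, which is the address dependence described above.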
The first two instructions of the Main Handler (PUSHAD and PUSHFD)
push onto the stack the registers EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI
and the Flags. Afterwards, the pre-built Virtual Opcodes that I have talked about are
responsible for popping those registers into the first 38 bytes of the Virtual Machine.
Now an instruction like XOR ECX, ECX will change the value at the address
00407014h. At the end of the execution of the virtualized code, the registers
are restored to their correct positions, allowing the application to continue its
execution.
Now it is your turn. I will not comment on every executed line. I gave you
the basics and I hope that things are clearer now. So trace the example
program [7] and understand how the other handlers are executed.
Virtualizer Full
This section is here because (I do not know if you remember) fly from
unpack.cn released a cracked version of Code Virtualizer Demo, but his release
only allows the use of more Virtual Machines per application and does not allow
you to protect two or more macros with the same Virtual Machine. To clear
The first thing you must know is where information about the protection
options is stored and how it is changed to restrict the use of Code
Virtualizer (yes, maybe you will not believe me but the custom options are there in
So here I show you the structure that keeps information about the protection
options:
respectively
respectively
• Last Section Name: 1 if the name of the last section will not be changed
or 0 otherwise
Below you can see how the demo version works (by the way, a patch of
So now I can say that the restrictions to only normal Virtual Opcode
Obfuscation and Virtual Machine Complexity are broken. There are still some patches
Change the JE to a JMP to protect your application more than once:
NOP this block of code to avoid the MessageBox Demo Limitation and the
Change the JLE to JMP to allow you to increase the number of Virtual Ma-
knowledge that I have learned to you. Besides that, I need to say that I intended
to write a tool instead of this article. I even started it, but as I am not
a programmer I saw that with the amount of free time that I will have I would
So what I hope is that someone gets interested in writing this tool (just e-mail
me). I can help and even provide the source code of what I have written so far.
An important thing to say is that this is a very condensed article. I mean
there are a lot of details that I omitted (no time, and too tired now to describe them)
and other details that I did not notice. If you have any questions, if you see
something wrong in my article, or if you want to improve it, just e-mail
me.
And my main hope is to see a similar article about the Themida Virtual
Machine. Let me say, this would not be too difficult now, after this article, mainly
because Themida uses the same DLLs as Code Virtualizer and because Oreans
areas but as you said not 100% safe (nothing is 100% safe). I think
there is no similar product on the market as good as this one. Keep up your good
• Preprocessing
∗ Find jumps to the macro. This is the place where the original
code was
• Analysis
∗ Find the "fake" Virtual Opcodes and eliminate them
• Postprocessing
The two most difficult things are to find and identify each handler in the Virtual
Machine and to retrieve the original code from the block of Code Virtualizer
instructions.
For the first, you have two options: study the mutation engine
and reverse engineer it (very difficult); or, as the mutation engine does not
mutate all the opcodes, I noticed that it is almost 100% possible to find each
For the second, you have two options: study how the Code
Virtualizer instructions are generated from the original disassembled code and do
tions and see the pattern. By the way, a hint is that a very recognizable
handler is always used for every original instruction: the STORE FLAGS. This
makes the work of finding the number of original instructions easier.
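The STORE FLAGS hint can be turned into a very small counting helper. This Python sketch assumes that a recovered listing is available as one Code Virtualizer instruction per string and that STORE FLAGS closes the group belonging to one original instruction; both the listing format and that grouping rule are assumptions, and the sample listing is invented.

```python
def count_original_instructions(listing) -> int:
    """Count STORE FLAGS entries, one per original instruction (per the hint)."""
    return sum(1 for line in listing if line.strip().upper() == "STORE FLAGS")


def split_by_store_flags(listing):
    """Group the listing, assuming STORE FLAGS ends each original instruction."""
    groups, current = [], []
    for line in listing:
        current.append(line)
        if line.strip().upper() == "STORE FLAGS":
            groups.append(current)
            current = []
    if current:  # trailing instructions without a closing STORE FLAGS
        groups.append(current)
    return groups


# Invented listing, for illustration only.
listing = ["LOAD 18h", "XOR", "STORE FLAGS",
           "LOAD 1Ch", "ADD", "STORE FLAGS"]
count = count_original_instructions(listing)
groups = split_by_store_flags(listing)
```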
This tool must support different versions of Code Virtualizer. As its
structure does not change, you need to adapt only a few things, for example new
A fun example: commands like ADD, XOR, SHL, etc. have in general three
handlers; one for the byte operation, one for the word and one for the dword.
But when I rst saw the three handlers for the SHL instruction I saw something
very strange:
Figure 27: Code Virtualizer bug
But only in version 1.2.0.0 did we see: "[!] Fixed Virtualization of "SHL
5.3 Acknowledgments
I must say a big thanks to people who helped me directly and indirectly to write
• softworm: well... what can I say? Without his really good work, this article
References
[2] http://www.unpack.cn/viewthread.php?tid=5802&fpage=1&highlight=code%2Bvirtualizer
[4] http://www.oreans.com/
[7] ..\Annex\handler.exe - this file is included in the file Inside Code Virtializer.rar
[8] http://www.oreans.com/CodeVirtualizerWhatsNew.php