https://exploitreversing.com

Malware Analysis Series (MAS):


Article 10 | Linux
by Alexandre Borges
release date: JAN/15/2025 | rev: A.1

0. Quote

“No thinking - that comes later. You must write your first draft with your heart. You rewrite with your
head. The first key to writing is... to write, not to think”
(William Forrester played by Sean Connery | “Finding Forrester” movie - 2000)

1. Introduction

Welcome to the tenth (and last) article of the Malware Analysis Series, where we are reviewing concepts,
techniques and practical steps used for analyzing ELF malware binaries. If readers have not read my
previous articles yet, all of them are available on the following links:
▪ ERS_02: https://exploitreversing.com/2024/01/03/exploiting-reversing-er-series-article-02/
▪ ERS_01: https://exploitreversing.com/2023/04/11/exploiting-reversing-er-series/
▪ MAS_9: https://exploitreversing.com/2025/01/08/malware-analysis-series-mas-article-09/
▪ MAS_8: https://exploitreversing.com/2024/08/07/malware-analysis-series-mas-article-08/
▪ MAS_7: https://exploitreversing.com/2023/01/05/malware-analysis-series-mas-article-7/
▪ MAS_6: https://exploitreversing.com/2022/11/24/malware-analysis-series-mas-article-6/
▪ MAS_5: https://exploitreversing.com/2022/09/14/malware-analysis-series-mas-article-5/
▪ MAS_4: https://exploitreversing.com/2022/05/12/malware-analysis-series-mas-article-4/
▪ MAS_3: https://exploitreversing.com/2022/05/05/malware-analysis-series-mas-article-3/
▪ MAS_2: https://exploitreversing.com/2022/02/03/malware-analysis-series-mas-article-2/
▪ MAS_1: https://exploitreversing.com/2021/12/03/malware-analysis-series-mas-article-1/
This article offers an introductory analysis of ELF binaries on Linux, and we will move slowly and
carefully to avoid getting into unnecessary details that, on a first approach, do not contribute to building
knowledge. The main goal of this article is to keep it short, simple, and informative, avoiding the many
details associated with the ELF format and examining only the most important aspects of the malware.

2. Acknowledgments

The year is 2025. Four years ago, I started drafting articles with the sole purpose of helping the hacking and
information security community, and as I could already imagine at that time, it would be challenging to
find time to continue producing content, and indeed this side effect has been confirmed over time.

As I have been using IDA Pro for a long time, I needed a license of my own, and that was when Ilfak
Guilfanov (@ilfak) and Hex-Rays SA (@HexRaysSA) decided to help me, and since then they have provided
continuous and decisive support to write this Malware Analysis Series (MAS), which is focused on malware
analysis, and the Exploiting Reversing series (ERs), which is my current and long-term series on internals,
vulnerability research and, eventually, exploitation in critical topics such as Windows, kernel drivers,
macOS, browser and hypervisors.
Time flies, and companies around the world have become more demanding in terms of technical skills, but
I still believe that one of the most effective ways to help these professionals is to write articles because
such content can offer a solid method to learn details that would be a bit more difficult in live conferences
or even other media. I still face serious time constraints to write, but I keep trying to do it because, in some
way, I know that these series have been important for people's careers.
Life may be short, but every moment is worth it. Enjoy the journey and keep exploiting it!

3. Lab Setup

In this article and the next ones, I will be using the following lab configuration:

▪ Virtual machine running Ubuntu 24.04.x LTS: https://ubuntu.com/download/desktop
▪ IDA Pro or IDA Home version (@HexRaysSA): https://hex-rays.com/ida-pro/
▪ Malwoverview: https://github.com/alexandreborges/malwoverview

4. Retrieving malware samples

We need to establish a starting point for analyzing malware threats for Linux, and one of the possible
approaches is to search for samples and retrieve one of them to start our work. As readers already
know, Linux binaries are represented in ELF format, so we can use the Malwoverview tool to search for
such samples. One of the recommended sources to list and download samples is Malware Bazaar, which
readers can do by executing the following commands (I specified “-o 0” because I use a light
background for the terminal, but if you are using a dark background, you should omit this option):

▪ malwoverview -b 2 -B elf -o 0 (list last fifty samples reported as ELF format)


▪ malwoverview -b 2 -B linux -o 0 (list last fifty samples reported as Linux)
▪ malwoverview -b 5 -B <sha256 hash> -o 0 (download sample represented by the given hash)

Once you have downloaded the sample, you can retrieve reports from different services such as Virus
Total, Triage, InQuest, URL Haus, Alien Vault, Polyswarm, Hybrid Analysis and Malware Bazaar:

▪ malwoverview -v 2 -V <sha256 hash> -o 0 (Virus Total)
▪ malwoverview -a 5 -A <sha256 hash> -o 0 (Hybrid-Analysis)
▪ malwoverview -j 2 -J <sha256 hash> -o 0 (URL Haus)
▪ malwoverview -p 1 -P <sha256 hash> -o 0 (Polyswarm)
▪ malwoverview -n 4 -N <sha256 hash> -o 0 (Alien Vault)
▪ malwoverview -i 2 -I <sha256 hash> -o 0 (InQuest)
▪ malwoverview -b 1 -B <sha256 hash> -o 0 (Malware Bazaar)
▪ malwoverview -x 1 -X <sha256 hash> -o 0 (Triage)
▪ malwoverview -x 2 -X <triage_id> -o 0 (Triage)
Occasionally, samples are written in Golang, and even though it is more interesting to examine
binaries written in C/C++, readers can do a quick triage using Yara with a basic (and flawed) rule:

[Figure 1]: Basic Yara rule


It is important to highlight that I am not interested in being precise here, and certainly this rule is far from
being good (the condition clause is horrible, by the way). The idea is that we want to be able to distinguish
between pure C/C++ and Golang binaries when it is necessary.
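As a complementary sketch (this is not the Yara rule from Figure 1), the same rough distinction can be attempted in Python by searching for byte markers that the Go toolchain commonly leaves in a binary. The marker list and the function name looks_like_go_elf are assumptions chosen for illustration and, like the rule, the check is deliberately imprecise:

```python
# Rough heuristic: byte markers commonly found in Go-built ELF binaries.
# The marker list is an assumption for illustration, not an exhaustive signature.
GO_MARKERS = (b"Go build ID:", b"Go buildinf:", b".gopclntab", b"runtime.main")


def looks_like_go_elf(data):
    """Return True if data starts with the ELF magic and contains a Go marker."""
    if not data.startswith(b"\x7fELF"):
        return False
    return any(marker in data for marker in GO_MARKERS)


if __name__ == "__main__":
    import sys

    # Usage: python go_check.py sample1.bin sample2.bin ...
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            is_go = looks_like_go_elf(handle.read())
        print(f"{path}: {'likely Golang' if is_go else 'no Go marker found'}")
```

A negative answer only means that none of the markers was found, so the result should be confirmed with other tools before drawing conclusions.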

5. Gathering basic information

I will be using different tools to collect properties of our malicious binary, and I am going to focus on static
analysis. As usual, quite a few concepts and fundamentals will be provided to support and improve the
understanding of the topics exposed. I am going to use a simple example, whose hash is the following one:

▪ SHA256: f864922f947a6bb7d894245b53795b54b9378c0f7633c521240488e86f60c2c5

[Figure 2]: Triage report

Download and unzip it: malwoverview.py -x 5 -X 230524-q61ecacg54 -o 0 ; 7z e 230524-q61ecacg54.bin -pinfected

[Figure 3]: Virus Total report


Apparently, the sample is Sodinokibi ransomware. We can run a brief sequence of commands to collect
any available evidence about the binary:

[Figure 4]: Getting binary’s information (1)

We can extract some information from these first two commands:

▪ The binary is 64-bit and targets the amd64 architecture.
▪ It is dynamically linked, so it depends on shared libraries.
▪ There is a unique identification (BuildID), which is normally useful when analyzing core files, for
example.
▪ The binary is stripped, so there is no symbol information inside.
▪ The binary’s entry point is 0x401650, and execution usually follows this sequence of functions:
_start → __libc_start_main → main.
▪ The program header table, which provides the loader with the information needed to load the binary
into memory and, as expected, is focused on memory mapping, has 9 entries.
▪ The section header table, which describes the binary itself and is naturally focused on its
sections, has 28 entries.

For now, we have not found anything really weird or that represents an issue, and the information shown
so far was already expected.
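To make the origin of these fields concrete, the sketch below decodes a 64-bit little-endian ELF header with Python's struct module, following the Elf64_Ehdr layout. It is only an illustration of what tools such as readelf report: the function name parse_elf64_header and the returned dictionary keys are assumptions, and other classes or byte orders are intentionally rejected.

```python
import struct

# Elf64_Ehdr (little-endian): e_ident[16], e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
# e_shnum, e_shstrndx. Total size: 64 bytes.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

ET_NAMES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
EM_NAMES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def parse_elf64_header(data):
    """Decode the main fields of a little-endian ELF64 header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELFCLASS64, little-endian")
    (_ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff, _flags,
     _ehsize, e_phentsize, e_phnum, _shentsize, e_shnum, _shstrndx) = \
        ELF64_HEADER.unpack_from(data)
    return {
        "type": ET_NAMES.get(e_type, hex(e_type)),
        "machine": EM_NAMES.get(e_machine, hex(e_machine)),
        "entry": e_entry,
        "phoff": e_phoff,
        "phentsize": e_phentsize,
        "phnum": e_phnum,
        "shoff": e_shoff,
        "shnum": e_shnum,
    }
```

Calling parse_elf64_header(open("mas_10.bin", "rb").read()) on the sample should report the same entry point and table sizes listed above, assuming the file name used later in this article.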

Thus, we can list the program header table by running the following command:

[Figure 5]: Getting binary information (2)


It is also recommended to retrieve the dynamic segment by running the following command:

[Figure 6]: Getting binary information (3)


According to the last two outputs, we can make the following observations:

▪ This binary has 9 program headers, and readers can observe which sections are associated with
each segment; segments are the units that the loader uses at runtime.

▪ The main program headers (check Figure 5) have the following description:

o PHDR: this header is a meta-segment, which contains the program header table and some
metadata.
o INTERP: this header indicates which system loader (interpreter) should be used to load
this binary into memory. In this case, it is /lib64/ld-linux-x86-64.so.2.
o LOAD: this header tells the loader how to map a part of the binary into memory, providing
information such as memory size, permissions and alignment.
o DYNAMIC: this header provides information about the shared library dependencies and
relocations.
o NOTE: this header usually stores meta-information provided by the vendor.
o GNU_PROPERTY: this header is usually generated by the linker, and it is used to locate
the .note.gnu.property section.
o GNU_EH_FRAME: this header provides the memory address of the stack unwind tables,
which are used by exception handlers (throw or try/catch/finally).
o GNU_RELRO: this header, named Relocation Read-Only, is used for exploit mitigations
such as -znow (full RELRO mitigation) or -zrelro (partial RELRO mitigation).

o GNU_STACK: this header, created by the linker, indicates whether the stack is executable
or not. Thus, the key information presented by this header is the memory protection
granted to the stack.
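The header types described above are numeric values stored in each Elf64_Phdr entry. The following sketch walks a program header table and converts the type and permission flags to names. The function names and the dictionary keys are assumptions made for illustration, and only the types mentioned in this article are named:

```python
import struct

# Elf64_Phdr (little-endian): p_type, p_flags, p_offset, p_vaddr, p_paddr,
# p_filesz, p_memsz, p_align. Total size: 56 bytes.
PHDR64 = struct.Struct("<IIQQQQQQ")

PT_NAMES = {
    1: "LOAD", 2: "DYNAMIC", 3: "INTERP", 4: "NOTE", 6: "PHDR",
    0x6474E550: "GNU_EH_FRAME", 0x6474E551: "GNU_STACK",
    0x6474E552: "GNU_RELRO", 0x6474E553: "GNU_PROPERTY",
}


def flags_to_str(p_flags):
    """Render PF_R (4), PF_W (2) and PF_X (1) as a three-character string."""
    return ("R" if p_flags & 4 else "-") + \
           ("W" if p_flags & 2 else "-") + \
           ("E" if p_flags & 1 else "-")


def parse_program_headers(data, e_phoff, e_phnum, e_phentsize=56):
    """Return one dictionary per program header found at offset e_phoff."""
    headers = []
    for index in range(e_phnum):
        p_type, p_flags, p_offset, p_vaddr, _paddr, p_filesz, p_memsz, _align = \
            PHDR64.unpack_from(data, e_phoff + index * e_phentsize)
        headers.append({
            "type": PT_NAMES.get(p_type, hex(p_type)),
            "flags": flags_to_str(p_flags),
            "offset": p_offset,
            "vaddr": p_vaddr,
            "filesz": p_filesz,
            "memsz": p_memsz,
        })
    return headers
```

The e_phoff, e_phnum and e_phentsize arguments come from the ELF header, so for this sample the table would contain 9 entries.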

▪ The dynamic section has 28 entries, and a few of them are:

o NEEDED: it indicates the main library dependencies, which are used by the dynamic linker to
load shared libraries when the program starts. In this case, the binary has a direct
dependency on three shared libraries.
o STRTAB: it holds the address of a string table, which contains strings used by the dynamic
linker, as expected.
o SYMTAB: it holds the address of a symbol table, which contains information about symbols
such as functions (FUNC), global data variables (OBJECT), sections (SECTION), thread-local
data variables (TLS) and even symbols that do not have a specific type (NOTYPE).
Additionally, symbols can be shared with external programs and libraries (GLOBAL), not
accessible to other programs and libraries (LOCAL), or provide a default implementation
that can be overridden later by another definition (WEAK).
o STRSZ: it represents the size of the string table.
o SYMENT: it represents the size of one entry of the symbol table.

▪ There are many other dynamic entries, but they are not important for now.
▪ The symbol table referred to above can be checked by running the following command:

[Figure 7]: Listing a symbol table (truncated)
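The type and binding shown for each symbol are packed into a single byte (st_info) of the symbol table entry: the upper four bits hold the binding and the lower four bits hold the type. A minimal sketch of this decoding follows, where the function name decode_st_info is an assumption and only the most common values are named:

```python
# Values from the ELF specification; anything else is reported as a number.
ST_BIND = {0: "LOCAL", 1: "GLOBAL", 2: "WEAK"}
ST_TYPE = {0: "NOTYPE", 1: "OBJECT", 2: "FUNC", 3: "SECTION", 4: "FILE", 6: "TLS"}


def decode_st_info(st_info):
    """Split st_info into (binding, type) names, as listed by readelf."""
    bind = st_info >> 4      # ELF64_ST_BIND
    sym_type = st_info & 0xF  # ELF64_ST_TYPE
    return ST_BIND.get(bind, str(bind)), ST_TYPE.get(sym_type, str(sym_type))
```

For instance, an imported libc function is typically stored with st_info equal to 0x12, which decodes to a GLOBAL binding and a FUNC type.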


Although we have a list of sections associated with segments using readelf -lW, they are not shown in an
organized way, and a better approach would be using the following command:

[Figure 8]: Section list

▪ A concise description of the main sections listed above follows:

o .text section: it contains instructions, and it is marked as read-only (as we usually see on PE
binaries too) and as code (executable).
o .data section: it contains global and initialized variables.
o .bss section: it contains global and non-initialized variables.
o .rodata section: it contains global data that should not be changed (const variables).
o .dynamic section: it contains information used by the loader to link the binary and, mainly,
make it ready to be executed. Additionally, it references its own string and symbol tables.
o .init section: it stores the init function, which works as a constructor and is called before
the main function, similar to a constructor in a C++ program.
o .fini section: it stores the fini function, which works as a destructor and is called right
before the program exits or the library is unloaded, similar to a destructor in a C++
program.
o .jcr section: it stores information about Java classes that are required to be registered.
o string table (.dynstr): this section contains the strings needed for the dynamic linking of the
ELF binary (symbol and library names, for example), but it does not contain anything
related to string literals used by the program.
o symbol table (.dynsym): this section contains a table of symbols (name, address
and respective size of functions and variables, in general), which are usually named
locations or even references to external libraries used by the program.
If we are interested in examining potentially interesting sections such as .rodata of this binary, execute:

[Figure 9]: Examining first bytes from .rodata section

Through the output we found strings related to a command to be executed (esxcli vm process kill
--type=force) and to encryption scripts, which tell us a bit about the malware/ransomware operation.
Repeating the same command, but for .data section, we have:

[Figure 10]: Checking first bytes from .data section


Of course, there is something within this section, but we do not have further details. Apparently it is an
exceedingly long string that includes a series of keys and associated values, like a dictionary. In this case,
the obvious move is to extract it using simple commands like strings. Additionally, we can use the same
strings command to search for anything else that could be useful for us:
▪ strings -a -n200 mas_10.bin | jq

[Figure 11]: Extracting the long string and formatting it


The nbody field can be decoded using the base64 -d <Base64 message> command, and the result is the following:

[Figure 12]: Decoded ransomware message


Both extracted and decoded strings show that apparently the encrypted files have extension .vemar, the
website is on the Darknet (as usual), there is a KEY (probably dynamically generated) to submit the form,
and there is an associated UID.
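The extraction and decoding steps above (strings piped to jq, followed by base64 -d) can also be sketched in Python. Only the nbody field name comes from the sample's output; the function names, the 200-character threshold reused from the strings command, and the assumption that the configuration is a single JSON object are illustrative choices:

```python
import base64
import json
import re


def long_strings(data, min_len=200):
    """Approximate 'strings -a -n<min_len>': runs of printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def extract_config(data, min_len=200):
    """Return the first long string that parses as a JSON object, or None."""
    for text in long_strings(data, min_len):
        try:
            config = json.loads(text)
        except ValueError:
            continue  # long string, but not JSON
        if isinstance(config, dict):
            return config
    return None


def decode_note(config):
    """Decode the Base64 ransom note stored in the 'nbody' field."""
    return base64.b64decode(config["nbody"]).decode("utf-8", errors="replace")
```

With the sample loaded as bytes, decode_note(extract_config(data)) would be expected to return the text shown in Figure 12, provided that the embedded configuration is valid JSON.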

[Figure 13]: Further strings found in the binary

The only note is that the ransomware looks for VMware ESXi installations and kills any associated
process before executing any operation.

It is time to install Radare2 to perform a quick triage:


▪ git clone https://github.com/radareorg/radare2
▪ cd radare2
▪ sys/install.sh
Executing a few basic commands, we have:

[Figure 14]: Radare2 | output 1


There is not any string related to http in data section (iz) or even the whole binary (izz), as expected. At
this point, we can search for additional indicators such as:
▪ encrypt
▪ esxcli
▪ crypto
▪ download (or upload)

[Figure 15]: Radare2 | output 2

Get a compact list of imported functions, the number of imports and also the number of the functions:

[Figure 16]: Radare2 | output 3


Compared to other binaries, and mainly to other ransomware threats, there are not too many imported
or defined functions, which is good because it makes the analysis less complicated and extensive.
Additionally, there is no evidence of C++ functions (mainly those using templates). For
example, a first and very trivial method for searching for C++ functions would be to check for “::”.
To expand the usage of Radare2, you can use its package manager to install plugins such as r2ghidra,
r2dec and r2frida. However, my advice is that you use the stable release (5.9.4 or a close version) because
the version from GitHub might be unstable with plugins. Therefore, the task is reduced to the following
steps, which allow us to visualize code and its interconnections:

▪ curl -Ls https://github.com/radareorg/radare2/releases/download/5.9.4/radare2-5.9.4.tar.xz | tar xJv
▪ radare2-5.9.4/sys/install.sh
▪ apt install ninja
▪ apt install make
▪ apt install meson
▪ r2pm -Uci r2ghidra r2dec r2frida
Files related to radare2 are installed in /root/.local/share/radare2 directory. For example, you can follow
from this point using r2dec plugin for decompiling functions from binary:

[Figure 17]: Radare2 | output 4

I am not going to proceed using Radare2, but certainly many readers might follow the analysis using it
because messages and instructions there are truly clear.
Check for libraries loaded dynamically at loading time:

[Figure 18]: Shared libraries


This ransomware, like most other ones, uses multiple threads, and there is nothing new here.
If you want to install Detect It Easy (DiE) on your system, I recommend that you read the respective page:
https://github.com/horsicq/Detect-It-Easy/blob/master/docs/BUILD.md

[Figure 19]: Checking further binary information and packers


We see that it is a typical binary:
▪ 64-bit for Ubuntu AMD64.
▪ Compiled using GCC.
▪ Using typical GLIBC.
▪ No further external libraries.
Readers could use signsrch (wget https://aluigi.altervista.org/mytoolz/signsrch.zip), which can be easily
compiled as shown below, to detect any algorithms being used by the malware:
▪ unzip signsrch.zip -d signsrch_dir
▪ cd signsrch_dir/src ; make
▪ cp ../signsrch.sig .

[Figure 20]: Signsrch output


You can also check the file’s entropy, and there are diverse ways to do it, where one of them is using the
simple ent command (apt -y install ent), as shown below:

[Figure 21]: ent command
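The entropy value reported by ent is the Shannon entropy of the byte distribution, measured in bits per byte (from 0 to 8). A minimal sketch of the same calculation follows, where the function name shannon_entropy is an assumption:

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(data).values())
```

Values close to 8 usually indicate compressed or encrypted content, which is why a packed binary tends to show a much higher value than a regular compiled one.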


Another tool that might produce some information to be used later is binwalk (apt -y install binwalk), but
this time the information is not relevant, as shown below:

[Figure 22]: binwalk output


Based on indicators collected by the ldd, DiE, signsrch, ent and binwalk commands, a list of comments
follows below:
▪ The binary is 64-bit and targets the amd64 architecture.
▪ It is dynamically linked, so it depends on shared libraries.
▪ There is a series of cryptographic constants used by the ransomware, which suggests that it might
be using Salsa20 (stream cipher) and AES for symmetric encryption. Neither is an absolute
certainty, and both should be checked during the analysis.
▪ The binary is stripped, so there is no symbol information inside.
▪ The binary was compiled using GCC, which could be useful information while reversing it.
▪ The entropy (5.66) is low, which suggests a non-packed binary.
▪ The program header table, which provides the loader with the information needed to load the binary
into memory and, as expected, is focused on memory mapping, has 9 entries.
▪ The section header table, which describes the binary itself and is naturally focused on its
sections, has 28 entries.
▪ The .init section stores the init function, which works as a constructor and is called before
the main function. Likewise, the .fini section stores the fini function, which works as a destructor
and is called right before the program exits or the library is unloaded.
The malicious binary uses a series of shared libraries, as expected in most cases, so many external
symbols come from these libraries too. Once a symbol is found, the loader writes the
symbol’s address into a location (actually a slot within a table) indicated by the relocation entry (check
readelf and objdump outputs).
Later, when such a symbol needs to be used by the malware, a call is performed through the same slot
of the table, which contains the symbol’s address and remains constant for later occurrences. The
mentioned table used for dynamic relocations is named GOT (Global Offset Table), and readers can see a
reference to it by looking for the .got section in the output of previous commands.
There is another table used by dynamic linking that is called PLT (Procedure Linkage Table), which is
associated with the .plt section (as expected, of course). This table is typically associated with the lazy
symbol binding concept: not all symbols are used when a program launches, so the resolution of a
symbol is delayed until it is really used.
Thus, functions always call the stub stored in the PLT and, when a function is called for the first
time, the PLT stub forwards the call to a resolver function that finds the real address of the symbol and
updates the GOT with it. The PLT stub itself is not modified: subsequent calls still go through the PLT,
which jumps directly to the real address now stored in the GOT.
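The lazy binding flow can be pictured with a toy model in Python. This is not loader code: the class and method names are hypothetical, a dictionary stands in for the GOT, and an empty slot stands in for the initial GOT value that sends the first call to the resolver:

```python
class LazyBinder:
    """Toy model of PLT/GOT lazy binding (illustrative only)."""

    def __init__(self, libraries):
        self.libraries = libraries  # symbol name -> callable (the "shared library")
        self.got = {}               # GOT slots: symbol name -> resolved callable
        self.resolver_calls = 0     # how many times the resolver had to run

    def _resolve(self, name):
        """Stand-in for the dynamic linker's resolver: fill the GOT slot."""
        self.resolver_calls += 1
        self.got[name] = self.libraries[name]
        return self.got[name]

    def call_via_plt(self, name, *args):
        """Stand-in for a PLT stub: jump through the GOT, resolving on first use."""
        target = self.got.get(name)
        if target is None:
            target = self._resolve(name)
        return target(*args)


if __name__ == "__main__":
    binder = LazyBinder({"strlen": len})
    print(binder.call_via_plt("strlen", "abc"), binder.resolver_calls)
    print(binder.call_via_plt("strlen", "hello"), binder.resolver_calls)
```

In this model the resolver runs only on the first call for each symbol, and every later call finds the address already stored in the GOT slot.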
These concepts are used by different threats and exploits, and a few examples are:

▪ Exploits can use GOT to find API functions’ addresses such as system, setuid, memcpy, strcpy and
many other ones. However, such an approach is only feasible if the GOT can be found in a
static (predictable) location, which happens only with non-PIE programs.

▪ Readers may remember that PIE (Position Independent Executable) programs can be loaded at any
address in memory, which makes finding the GOT harder. Obviously, it would not be a concern for
malicious binaries.

▪ There are exploits that replace entries in GOT with addresses pointing to a malicious function or
even a payload.
In the same way that we have seen on the Windows operating system, malware threats have injected
shared libraries into processes to read and extract information, escalate privileges, hook functions and
even establish persistence. All of these attacks start by compromising the process structure (represented
by task_struct), which holds memory maps, opened file descriptors, scheduling information and so on.
This structure is related, through the slab allocator (and variants), to kmem_cache, which caches kernel
objects and related information such as respective pointers and meta-information.
Finally, we have the main tool for any malware triage, which is capa (from Mandiant). One of its recent
versions can be downloaded from https://github.com/mandiant/capa/releases/tag/v8.0.1. Personally, I copy
the capa binary to the /usr/local/bin directory, but readers can adopt any approach to put it in a valid
path. Executing the capa binary is straightforward, as shown below:

[Figure 23]: capa output


Based on information offered by capa, we have an initial direction to proceed with our analysis:
▪ The ransomware apparently uses RC4 and AES as symmetric cryptography.
▪ There are pieces of code in Base64 (we already know this fact).
▪ There is a file and directory discovery portion as performed by any ransomware.
▪ The ransomware removes (of course), moves, reads, and writes files (nothing new).
▪ Threads are created, as expected, for performance reasons.
It is a pretty standard ransomware.
Although it is not important for a malicious binary, you should remember that Linux provides a series of
standard binary protections such as:
▪ NX (no-execute bit): there are regions of memory marked as non-executable.
▪ Stack Canaries: a protection to detect stack buffer overflows.
▪ ASLR: system, application processes and libraries are loaded at random addresses.
▪ PIE: executable binaries are loaded at random addresses every time they are launched.
▪ Fortify-Source: additional checks added to detect buffer overflows and potential runtime
exploitation.
▪ CFI (Control Flow Integrity): it checks that the control flow of the program has not been changed,
which makes the usage of exploitation techniques such as ROP and JOP harder.
▪ Relocation Read-Only (RELRO): the GOT is made read-only after the initial linker resolution, which
prevents the GOT from being overwritten by attackers, who aim to detour the execution flow to a
malicious payload.
▪ RTLD_NOW: this flag, used by functions such as dlopen( ), forces the symbol resolution and related
relocations for shared libraries to happen at load time.
Of course, there are other protections (most of the time, variations from these ones presented above).
Additionally, there are multiple tools to check some of these binary protections, and such tools can be
installed using the following commands:
▪ binary-security-check: cargo install binary-security-check (of course, the system must have Rust
and associated packages installed).
▪ checksec:
o git clone https://github.com/slimm609/checksec
o go build
▪ hardening-check: apt -y install devscripts


Using checksec and binary-security-check tools, we have the following:

[Figure 23]: Checking protections using multiple tools


As expected, binary security is not a goal of malware or ransomware, but the information shown above
suggests that the code is probably not well-built.
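Some of these verdicts can be derived from ELF metadata already discussed in this article. The sketch below is a simplified approximation of that reasoning and does not reproduce the logic of checksec or binary-security-check: the function name, its parameters and the returned keys are assumptions. The segments argument maps a program header name to its p_flags value, and bind_now indicates whether the dynamic section requests immediate binding (DT_BIND_NOW, DF_BIND_NOW or DF_1_NOW).

```python
PF_X = 0x1   # executable permission bit in p_flags
ET_DYN = 3   # e_type used by PIE executables (and by shared libraries)


def summarize_protections(e_type, segments, bind_now):
    """Approximate NX, PIE and RELRO verdicts from ELF header information."""
    stack_flags = segments.get("GNU_STACK")
    # NX: a GNU_STACK header exists and does not request an executable stack.
    nx_enabled = stack_flags is not None and not (stack_flags & PF_X)

    # RELRO: partial when GNU_RELRO exists, full when binding is also immediate.
    if "GNU_RELRO" not in segments:
        relro = "none"
    elif bind_now:
        relro = "full"
    else:
        relro = "partial"

    # PIE: approximated by e_type alone; shared libraries are ET_DYN as well.
    return {"nx": nx_enabled, "pie": e_type == ET_DYN, "relro": relro}
```

For example, a binary of type EXEC with a non-executable GNU_STACK header and no GNU_RELRO header would be reported as NX enabled, no PIE and no RELRO.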

6. Reversing

Open our sample in IDA Pro (my current version is 8.4 SP2). If we check for signatures (View → Open
Subviews → Signatures or SHIFT+F5), we will not see anything there. If you want, you can right-click on
the background and pick Apply new signature to add signatures such as elf64, libc or even go_std_abi0
(if it was a Golang file), but in this case you will not have success and no function signature will be added,
unfortunately. Using the same approach, we can add Type Libraries (SHIFT+F11). There is already an added
Type Library (gnulnx_x64), and we do not have further useful libraries to include. Finally, and as usual, we
must decompile the whole binary by going to File → Produce File → Create C File, where IDA will
suggest mas_10.bin.c as the filename, and we can accept it.

The disassembly of the main function (not start function) is shown below:

[Figure 24]: IDA disassembly window | main function


Once we have the binary entirely decompiled, we can use the pseudo-code representation of the binary
and, when necessary, check it against the Assembly code to clear up any doubts or
misunderstandings. Anyway, one of the most frequent questions about any malware analysis is how to
start the analysis and proceed from the chosen piece of code. There are a few possibilities here:
▪ starting from beginning (main function).
▪ getting orientation by strings.
▪ getting orientation by functions.
▪ starting from important and known functions.
▪ checking for called cryptographic functions.
▪ checking for random parts of the code and, from there, investigating cross references.
▪ checking for functions that manipulate data from sections such as .rodata or .data (check sections
by using CTRL+S on IDA)
Before making decisions, you should remember that the questions to be answered (all or part of them)
are almost the same as for any other malicious binary, with rare variations according to the purpose of
the threat as, for example, a ransomware:

▪ communication method and C2


▪ cryptography algorithms
▪ persistence
▪ algorithm used
▪ targeted file types and/or folders.
In this article, our focus will be cryptography algorithms. A method to find important functions in the
disassembly code is to use Capa Explorer. To install it on Linux, you have to follow the steps below:
▪ python -m pip install -U flare-capa
▪ Download capa rules from https://github.com/mandiant/capa-rules/releases
▪ wget https://raw.githubusercontent.com/mandiant/capa/master/capa/ida/plugin/capa_explorer.py
▪ From the home directory, execute: mkdir .idapro/plugins
▪ cp capa_explorer.py .idapro/plugins/
Close and open IDA Pro again, load the respective mas_10.idb file, go to Edit → Plugins → FLARE capa
explorer (ALT+F5) and indicate the extracted capa-rules directory. Once the analysis has been concluded,
you will see something like:

[Figure 25]: FLARE Capa Explorer

A few obvious observations can be made about the code:
▪ Like the signsrch tool, Capa Explorer has also identified several functions involved with cryptography.
▪ AES and RC4 are the main symmetric algorithms used by this ransomware.
▪ Base64 encoding is also present, and it was expected according to what we have seen previously.
▪ For ransomware, operations such as reading, writing, moving, and deleting are completely usual.
However, the information offered by Capa Explorer is limited, and there are multiple parts of the code
involved with cryptography that deserve some attention.
The sub_40E2AA routine performs Base64 encoding and decoding, and there are some indications such as
the explicit alphabet and constants. The piece of code below is the decoding part:

[Figure 25]: sub_40E2AA routine: it performs Base64 encoding and decoding


Another indication is the sub_40E001 routine, which presents the following very characteristic code
related to Base64 character verification:

[Figure 26]: sub_40E001 routine: it performs Base64 character verification
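For comparison with the decompiled routine, a Base64 character verification typically tests membership in the 64-character alphabet (A-Z, a-z, 0-9, + and /). The sketch below is a generic version of such a check and not a transcription of sub_40E001; the function names are assumptions:

```python
import string

# Standard Base64 alphabet: the position of a character is its 6-bit value.
BASE64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def is_base64_char(ch):
    """Return True if ch is a single character of the standard Base64 alphabet."""
    return len(ch) == 1 and ch in BASE64_ALPHABET


def base64_value(ch):
    """Return the 6-bit value (0 to 63) encoded by a Base64 character."""
    if not is_base64_char(ch):
        raise ValueError("not a Base64 alphabet character")
    return BASE64_ALPHABET.index(ch)
```

The padding character (=) is not part of the alphabet, so decoders usually handle it separately from this check.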


The sub_401ED8 routine is the setup of Salsa20’s initial state (known as load/store little-endian in the
Daniel Bernstein implementation and other ones -- the paper is here: https://cr.yp.to/snuffle/spec.pdf),
which has, in very imprecise words, initialization, quarter round, column and row rounds operations,
addition of the original state into the processed one, and keystream generation phases.

[Figure 27]: Setup of Salsa20
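As a rough illustration of what such a setup routine does, the sketch below builds the 16-word Salsa20 state following the public specification. It assumes a 256-bit key, an 8-byte nonce and a 64-bit block counter; the article does not confirm which key size the sample uses, and the function name salsa20_initial_state is hypothetical.

```python
import struct

# Constant used by Salsa20 with 256-bit keys (assumption: the sample uses
# 256-bit keys; 128-bit keys use "expand 16-byte k" instead).
SIGMA = b"expand 32-byte k"


def salsa20_initial_state(key: bytes, nonce: bytes, counter: int) -> list:
    """Return the 16 little-endian 32-bit words of the initial state."""
    if len(key) != 32 or len(nonce) != 8:
        raise ValueError("expected a 32-byte key and an 8-byte nonce")
    c = struct.unpack("<4I", SIGMA)   # constants, loaded little-endian
    k = struct.unpack("<8I", key)     # key words
    n = struct.unpack("<2I", nonce)   # nonce words
    ctr = (counter & 0xFFFFFFFF, (counter >> 32) & 0xFFFFFFFF)
    # The constants sit on the diagonal of the 4x4 state matrix.
    return [
        c[0], k[0], k[1], k[2],
        k[3], c[1], n[0], n[1],
        ctr[0], ctr[1], c[2], k[4],
        k[5], k[6], k[7], c[3],
    ]
```

The four constant words (0x61707865, 0x3320646e, 0x79622d32, 0x6b206574) are usually the quickest way to spot this routine in a disassembly.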


The sub_40173D routine (shown below) performs the Salsa20 byte manipulation: judging by the rotation
constants, it appears to be the quarter-round function and, at the end of the routine, there is the
double-round part (columns and rows). The constants are not equal to those in the reference
(https://cr.yp.to/snuffle/salsafamily-20071225.pdf) because the decompiled code uses the __ROR4__
macro, which is equivalent to return __ROL__((uint32)value, -count); in other words, a right rotation
by n bits is the same as a left rotation by 32 - n bits. Additionally, readers can consult this
Hex-Rays reference (https://hex-rays.com/blog/igors-tip-of-the-week-67-decompiler-helpers) about
decompiler helpers:

[Figure 28]: sub_40173D: Salsa20 quarter-round and double-round operations
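The relationship between the reference constants and the ones seen in the decompiler output can be checked with a short sketch. It implements the Salsa20 quarter-round twice: once with left rotations by 7, 9, 13 and 18 (as in the specification) and once with right rotations by 25, 23, 19 and 14, which is what a __ROR4__-based listing would be expected to show. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def rol32(value: int, count: int) -> int:
    """Rotate a 32-bit value left; negative counts rotate right."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32


def ror32(value: int, count: int) -> int:
    """Mirror of the decompiler helper: a right rotate is a left rotate by -count."""
    return rol32(value, -count)


def quarterround(y0: int, y1: int, y2: int, y3: int) -> tuple:
    """Salsa20 quarter-round written with left rotations (specification form)."""
    z1 = y1 ^ rol32((y0 + y3) & MASK32, 7)
    z2 = y2 ^ rol32((z1 + y0) & MASK32, 9)
    z3 = y3 ^ rol32((z2 + z1) & MASK32, 13)
    z0 = y0 ^ rol32((z3 + z2) & MASK32, 18)
    return z0, z1, z2, z3


def quarterround_ror(y0: int, y1: int, y2: int, y3: int) -> tuple:
    """Same quarter-round written with right rotations (32 - n)."""
    z1 = y1 ^ ror32((y0 + y3) & MASK32, 25)
    z2 = y2 ^ ror32((z1 + y0) & MASK32, 23)
    z3 = y3 ^ ror32((z2 + z1) & MASK32, 19)
    z0 = y0 ^ ror32((z3 + z2) & MASK32, 14)
    return z0, z1, z2, z3
```

Both functions return the same words for any input, so constants such as 25, 23, 19 and 14 in a listing should not discourage the Salsa20 identification.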


The sub_401F94 routine performs the remaining Salsa20 operations:

[Figure 29]: Finalizing Salsa20 phases
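For reference, the final Salsa20 steps described earlier (adding the original state to the processed one and emitting the keystream) can be sketched as follows. This is a generic illustration based on the public specification rather than a transcription of sub_401F94, and the function names are hypothetical.

```python
import struct


def salsa20_finalize(initial: list, worked: list) -> bytes:
    """Add the original state to the processed state, word by word (mod 2^32),
    and serialize the result as 64 little-endian bytes of keystream."""
    if len(initial) != 16 or len(worked) != 16:
        raise ValueError("both states must hold 16 words")
    summed = [(a + b) & 0xFFFFFFFF for a, b in zip(initial, worked)]
    return struct.pack("<16I", *summed)


def xor_with_keystream(data: bytes, keystream: bytes) -> bytes:
    """Encrypt or decrypt one block: Salsa20 is a stream cipher, so both are XOR."""
    return bytes(d ^ k for d, k in zip(data, keystream))
```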


The sub_40CC2B routine is responsible for hashing strings using CRC32, where we can see the (reversed)
polynomial 0xEDB88320 (https://en.wikipedia.org/wiki/Cyclic_redundancy_check#table). If readers want
to read about CRC32, a good write-up is https://github.com/Michaelangel007/crc32. The routine code is
shown below:

[Figure 30]: CRC32 hashing
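A bitwise CRC32 built around the same polynomial looks like the sketch below. It assumes the standard parameters (initial value and final XOR of 0xFFFFFFFF), which is why it can be cross-checked against zlib; whether the sample uses exactly these parameters is an assumption.

```python
import zlib

POLY = 0xEDB88320  # reversed representation of the CRC-32 polynomial


def crc32_bitwise(data: bytes) -> int:
    """Compute CRC32 one bit at a time, without a lookup table."""
    crc = 0xFFFFFFFF                      # standard initial value (assumption)
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ POLY   # shift out a 1 bit: apply the polynomial
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF               # standard final XOR (assumption)


# The result matches the standard library for the usual check string.
assert crc32_bitwise(b"123456789") == zlib.crc32(b"123456789")
```

In a binary, the same algorithm often appears as a 256-entry table precomputed from this polynomial, so either the constant or the table is a useful marker.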


The sub_40E7F5 routine prepares for AES encryption: its code implements the initialization phase
(key setup), which depends on the key size:


[Figure 31]: AES setup (cipher key expansion)


The sub_40F370 routine performs the AES encryption itself, using the expanded key and a table-based
implementation whose tables are dword_411EC0, dword_411AC0, dword_4116C0 and dword_4112C0, as shown below:

[Figure 32]: AES encryption (table-based implementation)
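Table-based AES implementations can often be confirmed by checking the first entries of such tables. The sketch below shows how an entry of the usual first encryption table (commonly named Te0 in public implementations) is derived from an S-box value, and how the remaining tables are byte rotations of it. Which of the four dword tables in the sample corresponds to which rotation is not established here, and the function names are hypothetical.

```python
def xtime(b: int) -> int:
    """Multiply by 2 in GF(2^8) using the AES reduction polynomial 0x11B."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B
    return b & 0xFF


def te0_entry(s: int) -> int:
    """Build one 32-bit table entry from an S-box output byte s:
    the bytes are (2*s, s, s, 3*s) from most to least significant."""
    s2 = xtime(s)
    s3 = s2 ^ s
    return (s2 << 24) | (s << 16) | (s << 8) | s3


def rotate_right_8(word: int) -> int:
    """The other three tables are successive 8-bit rotations of the first one."""
    return ((word >> 8) | (word << 24)) & 0xFFFFFFFF


# S-box[0] is 0x63, so the first entry of the first table is 0xC66363A5.
first = te0_entry(0x63)
```

Searching the binary for 0xC66363A5 and its rotations (for example 0xA5C66363) is a quick way to confirm that a group of 256-entry dword tables belongs to AES.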


The sub_40EEE9 routine prepares and sets the AES key for decryption. Likewise, the sub_40FA4C
routine is responsible for the AES decryption itself. As the entire process mirrors the
encryption phase, I am not going to show it here.
The sub_405801 routine seems to be related to curve25519-donna. Curve25519 is an elliptic curve
(used with ECDH, Elliptic-curve Diffie-Hellman) developed by Dan Bernstein, and Adam Langley (@agl__)
wrote the donna version. Further information can be found here:
▪ Original code by Adam Langley: https://github.com/agl/curve25519-donna
▪ File used to identify the algorithm: https://github.com/agl/curve25519-donna/blob/master/curve25519-donna.c
▪ Wikipedia: https://en.wikipedia.org/wiki/Curve25519
Within the sub_405801 routine, I found sub_403792, which is responsible for converting the input to
its reduced coefficient form, and that was my hint about the elliptic curve mentioned:

[Figure 33]: curve25519-donna: coefficient reduction
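To show what Curve25519 is used for in an ECDH exchange, here is a compact sketch of the X25519 function following the Montgomery ladder described in RFC 7748. It uses Python big integers instead of the limb (reduced coefficient) arithmetic of curve25519-donna, it is not constant-time, and it is meant only as a reading aid; none of the names come from the sample.

```python
P = 2**255 - 19   # field prime
A24 = 121665      # (486662 - 2) / 4, curve constant used by the ladder


def _clamp(scalar: bytes) -> int:
    """Clamp a 32-byte private scalar as required by X25519."""
    b = bytearray(scalar)
    b[0] &= 248
    b[31] &= 127
    b[31] |= 64
    return int.from_bytes(b, "little")


def x25519(scalar: bytes, u_bytes: bytes) -> bytes:
    """Scalar multiplication on Curve25519 (u-coordinate only)."""
    k = _clamp(scalar)
    u = bytearray(u_bytes)
    u[31] &= 127                       # ignore the top bit of the u-coordinate
    x1 = int.from_bytes(u, "little")
    x2, z2, x3, z3 = 1, 0, x1, 1
    swap = 0
    for t in reversed(range(255)):
        bit = (k >> t) & 1
        swap ^= bit
        if swap:                       # not constant-time: illustration only
            x2, x3 = x3, x2
            z2, z3 = z3, z2
        swap = bit
        a = (x2 + z2) % P
        aa = a * a % P
        b = (x2 - z2) % P
        bb = b * b % P
        e = (aa - bb) % P
        c = (x3 + z3) % P
        d = (x3 - z3) % P
        da = d * a % P
        cb = c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3 = x3, x2
        z2, z3 = z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


BASE_POINT = (9).to_bytes(32, "little")
```

Each side computes x25519(private, BASE_POINT) as its public value and x25519(private, peer_public) as the shared secret; both sides arrive at the same 32 bytes.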


The sub_40C6E4 routine does not seem very meaningful, as shown below:

[Figure 34]: sub_40C6E4 routine


However, the sub_40C62D routine, which is called by the sub_40C6E4 routine, shows that a random value
is generated, which helps the sub_40C6E4 routine produce a 32-byte random value:

[Figure 35]: sub_40C62D routine


The sub_408C7A routine implements the Keccak algorithm, which is the basis for SHA-3.

[Figure 36]: sub_408C7A routine: Keccak


The sub_40E7F5 routine sets the AES key for encryption (128, 192 and 256 bits), and the prototype and
respective arguments follow:
__int64 __fastcall sub_40E7F5(
    int *a1,
    unsigned __int8 *a2,
    int a3)
▪ a1: rk | round key buffer (its size depends on the number of rounds: 4 * (Nr + 1) words)
▪ a2: cipherKey | input cipher key
▪ a3: the length of the key in bits (128, 192, 256)
The routine is given as:

[Figure 37]: sub_40E7F5 routine
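The relationship between the key length (a3), the number of rounds (Nr) and the size of the round key buffer (rk) can be summarized with a small helper. The values follow the AES standard; the function name is hypothetical.

```python
def aes_schedule_sizes(key_bits: int) -> tuple:
    """Return (Nr, number of 32-bit round key words) for an AES key length."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192 or 256 bits long")
    nk = key_bits // 32          # key length in 32-bit words: 4, 6 or 8
    nr = nk + 6                  # number of rounds: 10, 12 or 14
    return nr, 4 * (nr + 1)      # round key words: 44, 52 or 60
```

For example, a 256-bit key gives 14 rounds and a buffer of 60 words (240 bytes), which helps when sizing the structure behind a1 in the decompiler.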


The sub_40C810 routine ties together the routines we have quickly (actually, very quickly) analyzed
so far and, in general terms, it:
▪ Checks whether any data has been provided to encrypt.
▪ Relies on curve25519-donna and Keccak, which provide an efficient and more secure method for key
exchange. Thus, in this case, the asymmetric algorithm (curve25519-donna) is used to encrypt the
symmetric key (AES) for exchange.
▪ Encrypts the data with the session key.
▪ Calculates a CRC32 hash of the encrypted data and appends it to the final encrypted data.
The piece of code putting all routines together is shown below:

[Figure 38]: sub_40C810 routine
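The overall flow can be approximated with the sketch below, which is deliberately loose and full of assumptions. It assumes the session key is obtained by hashing a shared secret (SHA3-256 from hashlib stands in for the sample's Keccak usage), it replaces AES with a clearly labeled XOR placeholder because the Python standard library has no AES, and it assumes the CRC32 is appended as four little-endian bytes. All names are hypothetical.

```python
import hashlib
import struct
import zlib


def derive_session_key(shared_secret: bytes) -> bytes:
    """Assumption: hash the ECDH shared secret into a 32-byte session key."""
    return hashlib.sha3_256(shared_secret).digest()


def placeholder_cipher(key: bytes, data: bytes) -> bytes:
    """NOT AES: XOR with a SHAKE-256 keystream, used only as a stand-in."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


def seal(shared_secret: bytes, data: bytes) -> bytes:
    """Mimic the described steps: check input, encrypt, append CRC32."""
    if not data:
        raise ValueError("no data to encrypt")
    key = derive_session_key(shared_secret)
    ciphertext = placeholder_cipher(key, data)
    # Assumption: the checksum is stored little-endian after the ciphertext.
    return ciphertext + struct.pack("<I", zlib.crc32(ciphertext))


def crc_matches(blob: bytes) -> bool:
    """Verify the trailing CRC32 of a sealed blob."""
    ciphertext, tail = blob[:-4], blob[-4:]
    return struct.unpack("<I", tail)[0] == zlib.crc32(ciphertext)
```

Note that the trailing CRC32 only detects accidental corruption of the encrypted data; it offers no cryptographic integrity protection.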


From this point onward, all routines are simpler than those involving cryptographic operations. The
first one is the sub_405913 routine, but first we need to learn the values associated with off_616228,
format and off_616238, which are shown below:

[Figure 39]: values associated with off_616228, format and off_616238


Even though the representation is not so clear, readers can see that the command line aims to kill any
instance of ESXi. As expected, the sub_405913 routine uses all of these elements to accomplish the
objective of killing ESXi instances:


[Figure 40]: sub_405913 routine | killing ESXi instances


The sub_405ED8 routine starts the file encryption operation, as shown below:

[Figure 41]: sub_405ED8 routine | file encryption


The sub_40CF44 routine generates a JSON file, which probably contains information about the target
system:

[Figure 42]: sub_40CF44 routine | JSON file generation


The sub_40D53F routine loads the JSON file:

[Figure 43]: sub_40D53F routine | loads the JSON file


The sub_4064FD routine walks through each directory, encrypts its files and drops a ransom note
(sub_4059B3 routine). Additionally, note that the routine is recursive (line 30):

[Figure 44]: sub_4064FD routine | encrypt files within a directory
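The recursive structure of such a routine can be modeled with a harmless dry run that only records what would be touched: it neither encrypts nor writes anything. The sketch assumes that symbolic links are not followed and that one note is planned per visited directory; both are assumptions, and the names are hypothetical.

```python
import os
import tempfile


def walk_directory(path: str, plan: dict = None) -> dict:
    """Recursively list regular files and the directories visited (dry run)."""
    if plan is None:
        plan = {"files": [], "note_dirs": []}
    plan["note_dirs"].append(path)             # one note would be dropped here
    with os.scandir(path) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if entry.is_dir(follow_symlinks=False):
                walk_directory(entry.path, plan)   # recursion into subdirectory
            elif entry.is_file(follow_symlinks=False):
                plan["files"].append(entry.path)   # file that would be processed
    return plan


# Small self-contained demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "vm1", "disks"))
    for name in (("vm1", "vm1.vmx"), ("vm1", "disks", "vm1.vmdk")):
        with open(os.path.join(root, *name), "w") as handle:
            handle.write("dummy")
    plan = walk_directory(root)
```

A model like this is also handy for predicting which paths a sample should touch when its behavior is later compared with monitoring logs from the lab.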

7. Conclusion

In this article we have reviewed a few basic concepts of analyzing an ELF binary and performed a quick
and superficial analysis of a ransomware sample.
Finally, I have fulfilled my promise of producing a ten-article series! This was the last article of the
Malware Analysis Series (MAS), and I hope you have enjoyed reading all the articles over the years. As you
already know, I moved to another area (vulnerability research) a bit more than a couple of years ago, and
now I really do something I am passionate about. Therefore, that is my advice: follow your heart.
Just in case you want to stay connected:
▪ Twitter: @ale_sp_brazil
▪ Blog: https://exploitreversing.com
Keep learning, reversing, and exploiting everything, and I will see you next time!
Alexandre Borges
