Hodur Recon2024

The document discusses techniques for reversing and emulating malware command and control (C2) protocols, specifically focusing on the Hodur protocol. It outlines the challenges posed by compiler-level obfuscations, such as control flow flattening and mixed Boolean arithmetic expressions, and presents strategies for overcoming these obstacles. Additionally, it details the development of a scanner and a fake C2 server for validating the protocol's request and response data.


1

THE ART OF MALWARE C2 SCANNING - HOW TO REVERSE
AND EMULATE PROTOCOL OBFUSCATED BY COMPILER

TAKAHIRO HARUYAMA
BINARLY
2
WHO AM I?
• Takahiro Haruyama (@cci_forensics)
• Principal Security Researcher at Binarly
• Previously Staff Threat Researcher at Carbon Black TAU
• Past Research
• Scalable RE automation (e.g., hunting vulnerable drivers)
• Anti-Forensics (e.g., firmware acquisition MitM attack)
• Malware Analysis (e.g., Internet-wide C2 scanning)
3
AGENDA
BACKGROUND
PEELING HODUR: DEFEATING
COMPILER-LEVEL
OBFUSCATIONS
HODUR PROTOCOL REVERSING
HODUR PROTOCOL EMULATION
WRAP-UP
4

BACKGROUND
5
WHY MALWARE C2 SCANNING?
• IP reputation is not effective for catching fresh C2s
• Internet-wide C2 scanning is beneficial from both
detection and threat intel perspectives
6
HOW MALWARE C2 SCANNING?

Protocol reversing
• Identify
  • Data format
  • Encoding/encryption algorithm
Protocol emulation
• Develop PoC scanner
• Validate request/response with fake/real C2
7
CASE: PLUGX
• Long used, but still many variants in the wild
• Most variants have almost the same C2 protocol except the
packet encoding algorithm
• The “Hodur” variants (aka MiniPlug) were obfuscated
with multiple methods likely applied at compile time
• EclecticIQ and Check Point reported the latest variants last
year, but no one had described the updated C2 protocol
details
• I focus on the Hodur de-obfuscations, then explain the
protocol reversing and emulation briefly
8

PEELING HODUR:
DEFEATING
COMPILER-LEVEL
OBFUSCATIONS
9

CONTROL FLOW
FLATTENING
DEFEATING COMPILER-LEVEL OBFUSCATIONS
10
WHAT’S CONTROL FLOW FLATTENING?
• Control flow flattening (CFF) transforms a program's
control flow to make it much harder to understand,
while preserving the original functionality
(Figure labels: First Block(s), Control Flow Dispatcher(s), Flattened Blocks)
http://tigress.cs.arizona.edu/transformPage/docs/flatten/index.html
11
HOW CFF WORKS
• Control flow dispatchers decide which block to
execute next based on a state variable
• The state variable is updated in first/flattened blocks
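As a rough illustration of the two bullets above, the following sketch flattens a small loop by hand. The function names and the state constants are made up for this example and are not taken from any Hodur sample; real obfuscators emit the same shape at the machine-code level.

```python
def gcd_original(a: int, b: int) -> int:
    """Plain loop, before flattening."""
    while b != 0:
        a, b = b, a % b
    return a


def gcd_flattened(a: int, b: int) -> int:
    """Same logic after a hand-made control flow flattening."""
    # Hypothetical state constants; obfuscators typically use random 32-bit values.
    STATE_CHECK, STATE_BODY, STATE_EXIT = 0x5A17C3E1, 0x1F0B9D42, 0xC4E2207B
    state = STATE_CHECK              # first block: initialise the state variable
    while True:                      # control flow dispatcher
        if state == STATE_CHECK:     # flattened block: loop condition
            state = STATE_BODY if b != 0 else STATE_EXIT
        elif state == STATE_BODY:    # flattened block: loop body
            a, b = b, a % b
            state = STATE_CHECK
        elif state == STATE_EXIT:    # flattened block: return
            return a
```

Every block now returns to the dispatcher, so the original loop structure is only visible through the values written to `state`.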
12
CONTROL FLOW UNFLATTENING: BASIC STRATEGY
1. Identify control flow dispatchers and state variables
2. Trace back the state variable values from the end of
flattened blocks
3. Associate the values with the block IDs
4. Re-order the code flow based on the associations
• I use IDA Pro microcode for the unflattening task
• Intermediate representation used by Hex-Rays decompiler
• We can implement the algorithm in the optblock_t callback
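The four steps can be sketched on a toy control flow graph, independent of IDA. This is only a simplified model under stated assumptions: the dictionaries stand in for information that a real implementation would recover from microcode, and `unflatten`, `dispatch_table`, and `state_writes` are hypothetical names.

```python
def unflatten(dispatch_table, state_writes, first_block, initial_state):
    """Rebuild direct edges between blocks of a flattened function.

    dispatch_table: state value -> block the dispatcher jumps to (step 1)
    state_writes:   block -> state values assigned at the end of it (step 2)
    """
    # The first block sets the initial state, so it leads to that state's block.
    edges = {first_block: [dispatch_table[initial_state]]}
    for block, states in state_writes.items():
        # Step 3: associate each written value with a block ID.
        # Step 4: the direct edges replace the jump back to the dispatcher.
        edges[block] = [dispatch_table[s] for s in states]
    return edges


# Toy input: a loop with a condition block, a body block, and an exit block.
dispatch_table = {0x10: "check", 0x20: "body", 0x30: "exit"}
state_writes = {"check": [0x20, 0x30], "body": [0x10], "exit": []}
recovered = unflatten(dispatch_table, state_writes, "entry", 0x10)
```

Here `recovered` maps `check` to both `body` and `exit`, and `body` back to `check`, which is the original loop.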
14
CONTROL FLOW UNFLATTENING: IDA MICROCODE TOOL HISTORY


• HexRaysDeob (2018)
• The first implementation breaking CFF
• Ported to IDAPython by Hex-Rays (2019)
• Tested on only one binary, so several modified versions were implemented
• APT10 ANEL (2019), Emotet (2022)
• D-810 (2020)
• Effective for not only OLLVM but also Tigress Flatten
• Works reliably with different binaries
15
D-810 ISSUES
• D-810 worked for most functions of the Hodur
samples, but some key functions related to the C2
protocol were still flattened
• Additional CFF settings?
• Two issues
1. The control flow dispatcher detections failed
2. The block state variable tracking failed
16
ISSUE1: CONTROL FLOW DISPATCHER DETECTION FAILURE
• The dispatcher detection algorithm misses dispatchers
whose predecessors are conditional jumps by the state variable
• The genmc plugin was useful for troubleshooting
(Figure labels: dispatcher, predecessor)
17
ISSUE1: FIX
• I added another dispatcher detection algorithm
• The algorithm simply guesses a dispatcher block based on
the biggest number of predecessors
• The dispatcher will be validated based on the entropy
value of the state variable (only effective for OLLVM)
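The heuristic can be sketched on a plain successor map. The slides do not say how the entropy is computed, so the bit-level Shannon entropy below is an assumption, and `guess_dispatcher` and `bit_entropy` are hypothetical names rather than D-810 functions.

```python
import math


def guess_dispatcher(successors):
    """Return the block with the most predecessors (CFG given as block -> successors)."""
    pred_count = {}
    for block, succs in successors.items():
        pred_count.setdefault(block, 0)
        for succ in succs:
            pred_count[succ] = pred_count.get(succ, 0) + 1
    return max(pred_count, key=pred_count.get)


def bit_entropy(constants, width=32):
    """Shannon entropy (0.0 to 1.0) of the bits across the given state constants."""
    total = len(constants) * width
    if total == 0:
        return 0.0
    ones = sum(bin(c & ((1 << width) - 1)).count("1") for c in constants)
    p = ones / total
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
```

Random-looking 32-bit state values, as OLLVM tends to produce, score close to 1.0, while small counters such as 1, 2, 3 score low. That is one plausible reason such a check would only be effective for OLLVM.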
19
ISSUE2: BLOCK STATE VARIABLE TRACKING FAILURE
• The state variable tracking fails if the value is assigned
in the first blocks
• D-810 only traces in the flattened blocks and doesn’t
recognize the dispatcher has been reached -> loop :(
(Figure labels: "The value is assigned", "Tracking fails")
D810.emulator - WARNING - Can't evaluate instruction: ..Variable '%var_depend_on_a10_1.4{24}' is not defined
D810.tracker - DEBUG - Computing: ['ebx.4'] for path [8, 22, 44, 45, 46, 47, 48, 49, 50, 8, 9, 35, 36, 109, 110, 111, 112]
20
ISSUE2: FIX
• The added code detects dispatchers in tracking and
resumes the tracking from the end of the first blocks
• The unflattening performance is also improved
22

MIXED BOOLEAN
ARITHMETIC
EXPRESSIONS
DEFEATING COMPILER-LEVEL OBFUSCATIONS
23

• Mixed Boolean Arithmetic (MBA) expressions transform a
simple expression into a complex but semantically
equivalent form
(Figure label: "The same encoded string is decoded in different expressions")
24
SIMPLIFYING MBA EXPRESSIONS
1. Find an obfuscation pattern and hypothesize for simplification
2. Validate the hypothesis by equivalence checking
• e.g., using Z3 or Arybo
3. Replace the pattern with the simplified one

    $ ipython
    In [1]: import z3
    In [2]: x, y = z3.BitVecs("x y", 8)
    In [3]: s = z3.SolverFor("QF_BV")
    In [4]: s.add((~(x ^ ~y)) != (x ^ y))
    In [5]: s.check()
    Out[5]: unsat

    $ iarybo 8
    In [1]: ~(x ^ ~y) == x ^ y
    Out[1]: True
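For 8-bit operands, the same hypothesis can also be checked without a solver by trying every input pair. This brute-force variant is an alternative to the Z3/Arybo approach, not something described in the talk, and `equivalent` is a hypothetical helper name.

```python
from itertools import product


def equivalent(f, g, bits=8):
    """Exhaustively compare two binary expressions over all bits-wide inputs."""
    mask = (1 << bits) - 1
    return all((f(x, y) & mask) == (g(x, y) & mask)
               for x, y in product(range(1 << bits), repeat=2))


def obfuscated(x, y):
    return ~(x ^ ~y)      # pattern from the slide


def simplified(x, y):
    return x ^ y          # hypothesized simplification
```

The check covers 65,536 pairs for 8 bits. It does not scale to 32-bit operands, which is where a solver-based proof becomes necessary.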
25
SIMPLIFICATION ON IDA + D-810
• D-810 uses a custom AstNode class to represent an
(abstract) microcode instruction
• I could easily define several new replacement patterns
• genmc is useful to show microcode instruction structures
27
LIMITATION
• More functions, more complicated patterns :(
• It was difficult to defeat all MBA expressions perfectly
• I only handled interesting patterns, especially related to
the string decoding used by the samples
28

POLYMORPHIC
STACK STRINGS
DEFEATING COMPILER-LEVEL OBFUSCATIONS
29
STACK STRINGS
• All strings are constructed and decoded in the stack area
• After defeating CFF and MBA expressions, the decoding
algorithm was identified
• enc[i] ^= (i + Const) ^ Const
• The constant value is different per function
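A minimal sketch of the decoding formula above, assuming byte-wide arithmetic (the slide does not state the operand width) and a hypothetical function name. The constant in the usage is arbitrary, not one taken from a sample.

```python
def decode_stack_string(enc: bytes, const: int) -> bytes:
    """Apply enc[i] ^= (i + Const) ^ Const to every byte, truncated to 8 bits."""
    return bytes(b ^ ((i + const) & 0xFF) ^ (const & 0xFF)
                 for i, b in enumerate(enc))


# XOR is its own inverse, so the same routine also produces test input.
encoded = decode_stack_string(b"kernel32.dll", 0x37)
decoded = decode_stack_string(encoded, 0x37)
```

Because the constant differs per function, a static decoder has to recover both the byte sequence and the constant for each function before calling a routine like this.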
30
COPYING THE ENCODED STRING BYTES INTO STACK
• Sometimes the Hex-Rays decompiler partially recognizes the
copy or only shows the assignments
• For static decoding, we need to
• Construct the bytes from the assigned variables
• Detect the length and constant value used in the decoding algorithm
(Figure labels: "Combination of global variable and hard-coded bytes", "Length and constant value")
VARIOUS ACCESS PATTERNS
(Figure labels: "Additional XORs before decoding", "Referencing another variable (enc is decoded)")
Defeating MBA expressions is not perfect
I decided to take an emulation approach

32
EMULATION ISSUE IN GENERAL
• Unicorn-based flare-emu library provides users with a flexible
interface for scripting emulation tasks on IDA
• The iterateAllPaths API emulates all basic block paths in a
function
• Looked to be useful to de-obfuscate stack strings (e.g., ironstrings)
• This API emulates only once per basic block
• I modified the code to reproduce xor loops detected by CAPA
33
EMULATION ISSUE IN THIS SAMPLE
• The flare-emu API takes only one path in CFF functions
• The code simply tracks basic block successors
• The search ends when revisiting the CFF dispatchers
• Microcode-based solutions
• Emulate x86 code in an unflattened microcode block order
• Extend D-810 microcode emulation functionality
• I tried both a little bit, but I realized that they are not
straightforward :(
34
SOLUTION
• I utilized another flare-emu API (emulateRange) that
emulates the code as is, without changing the code flow
• Some quick hacks added to flare-emu (e.g.,
LoadLibrary/GetProcAddress hook, infinite loop detection, etc.)
• The created script worked for 58% of the tested functions
• I also implemented a script based on the IDA debug hook
class (DBG_Hooks) to handle the failed functions
• Not elegant, but the combination covers most strings
quickly
35
SOLUTION (CONT.)
• Both scripts recover argument strings on call instructions in
emulation/debugging
• The information such as calling convention and argument type is
taken through the Hex-Rays decompiler APIs
• The sample dynamically resolves all API addresses except
GetProcAddress after decoding the API name strings
• When an address assignment is detected, the script applies the
API function type to the local variable pointer
• GetTypeSignature() written by Rolf Rolles
36

Set type to the local variable by


ida_hexrays.modify_user_lvars()

Set type to the operand of the call instruction


by ida_nalt.set_op_tinfo()
37
SOLUTION (CONT.)
• The scripts still don’t
cover all strings
• A semi-automatic
script handles minor
cases individually
• flare-emu
emulateSelection +
static decoding
38
IDA_CALLSTRINGS SCRIPTS
Used library and API           | Static decoding        | flare-emu iterateAllPaths                  | flare-emu emulateRange          | flare-emu emulateSelection | IDA DBG_Hooks
Automated?                     | Yes                    | Yes                                        | Yes                             | No                         | Yes
Effective for another malware? | No                     | Yes                                        | Yes                             | No                         | Yes
Effective in CFF funcs?        | Yes                    | No                                         | Yes                             | -                          | Yes
API func type set?             | No                     | Yes                                        | Yes                             | No                         | Yes
Limitation                     | Strings used by memcpy | Modifications needed to flare-emu and CAPA | All execution paths not covered | Manual selection required  | Strings used during debugging
39

HODUR PROTOCOL
REVERSING
40
PROTOCOL OVERVIEW
• The latest Hodur samples only support HTTP/HTTPS
• Two header values (Sec-Dest/Sec-Site) used to
authenticate clients
• GET request for the initial handshake
• An RC4 key is returned
• Periodic POST requests to receive C2 commands
after the handshake
• The request/response data are encrypted with the key
41
AUTHENTICATION HEADERS
• Sec-Dest: %2.2X%ws (e.g., “7BnqmmCg”)
  • A random byte (0x64-0x99)
    • 0x64 + 0-0x35 by QueryPerformanceCounter
  • Six random characters
    • The checksum depends on the method
    • GET = 99, POST = 88

    In [2]: sum(b for b in b'nqmmCg') & 0xff
    Out[2]: 99

• Sec-Site: %2.2X%2.2X%ws (e.g., “896B2AC144C9E2E09836”)
  • Two random bytes (0x64-0x99)
  • 8-byte victim ID generated by time-related APIs

42
INITIAL HANDSHAKE
• GET request with the authentication headers
• An RC4 key is returned if the header values are valid
• If not valid, no content returned
• The Hodur sample code checks if the Content-Type is
application/octet-stream
• The Content-Length was unknown at static analysis but
revealed during the scanner development
43
AFTER HANDSHAKE
• The sample receives a C2 command by POST requests
• The POST request and response data are encrypted
using RC4
• The POST data header is the same as the PlugX variants,
but the head key is not used
• The C2 response body also has the same header
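For reference, RC4 itself is short enough to write out in pure Python. The sketch below is the standard algorithm only; it does not model the PlugX custom header, whose layout is not given in these slides.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):                       # key-scheduling algorithm
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:                          # pseudo-random generation
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)
```

Encryption and decryption are the same operation, so one function is enough for both the POST data and the C2 response once the key from the handshake is known.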
44
POST DATA PAYLOAD
45

HODUR SCANNER
DEVELOPMENT
46
FAKE C2 SERVER FOR VALIDATION
• Developed a fake C2 server to validate the request
data of the PoC scanner and other recent samples
• fakenet (IP diverter) + Python HTTPS server

POST request validation:
[*] Validating Sec-Dest..
[+] Prefix number 0x95 is valid
[+] The hash of the random bytes b'xbsYpB' matches 88
[*] Validating Sec-Site..
[+] Prefix numbers 0x7f/0x8e is valid
[+] victim_id='F4EB6EF3A8882016'
..
[+] The decrypted POST data is saved as dec_post_data.bin
[*] Responding with PlugX custom header data.. (C2 command = 0x7002)
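The header checks in this log can be approximated on the server side as below. It is not the author's fake C2 code; `validate_headers` is a hypothetical helper, and the Sec-Site layout (two prefix bytes followed by 16 hex digits) is inferred from the examples in the slides.

```python
import re

CHECKSUMS = {"GET": 99, "POST": 88}   # per-method checksum from the slides


def prefix_ok(hex_byte: str) -> bool:
    """Prefix bytes must fall in 0x64-0x99."""
    return 0x64 <= int(hex_byte, 16) <= 0x99


def validate_headers(method: str, sec_dest: str, sec_site: str):
    """Return the victim ID if both headers look valid, else None."""
    m = re.fullmatch(r"([0-9A-F]{2})([\x21-\x7e]{6})", sec_dest)
    if not m or not prefix_ok(m.group(1)):
        return None
    if sum(m.group(2).encode("ascii")) & 0xFF != CHECKSUMS.get(method):
        return None
    m = re.fullmatch(r"([0-9A-F]{2})([0-9A-F]{2})([0-9A-F]{16})", sec_site)
    if not m or not (prefix_ok(m.group(1)) and prefix_ok(m.group(2))):
        return None
    return m.group(3)
```

With the values shown in the slides, a GET carrying `7BnqmmCg` and `896B2AC144C9E2E09836` yields the victim ID `2AC144C9E2E09836`, while the same Sec-Dest sent with POST is rejected because the checksum is 99 rather than 88.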
47
HUNTING RECENT SAMPLES
• VT-retrohunted using yara_fn
(Figure labels: fixup, o_imm, o_mem, o_displ, o_near)

{ 55 8B EC 6A ?? 68 ?? ?? ?? ?? 64 A1 ?? ?? ?? ?? 50 81 EC ?? ?? ?? ??
53 56 57 A1 ?? ?? ?? ?? 33 C5 50 8D 45 ?? 64 A3 ?? ?? ?? ?? 89 65 ??
8B 45 ?? 50 8D 8D ?? ?? ?? ?? E8 }
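To test a hex pattern like this against local files before a retrohunt, the `??` wildcards can be translated into a byte regex. This sketch only handles plain bytes and full-byte `??` wildcards, not jumps or nibble wildcards, and `hex_pattern_to_regex` is a hypothetical helper, not part of yara_fn.

```python
import re


def hex_pattern_to_regex(pattern: str) -> re.Pattern:
    """Convert a YARA-style hex string with ?? wildcards into a bytes regex."""
    parts = []
    for token in pattern.strip("{} \n").split():
        if token == "??":
            parts.append(b".")                         # any single byte
        else:
            parts.append(re.escape(bytes.fromhex(token)))
    # DOTALL so that the wildcard also matches the 0x0A byte.
    return re.compile(b"".join(parts), re.DOTALL)


prologue = hex_pattern_to_regex("{ 55 8B EC 6A ?? 68 ?? ?? ?? ?? 64 A1 }")
```

The wildcarded positions correspond to operands that change between builds, which is why the rule keeps matching recompiled samples.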
48
HUNTING RECENT SAMPLES (CONT.)
• One of the rules hit the
latest sample in Dec last
year
• CFF was not applied to
the sample
• The C2 included in the
sample was active :)
• I could check the
Content-Length and the
format of the GET
response
49
APPROACH BASED ON VALIDATION
• All recent samples had exactly the same C2 protocol
encryption and data format
• Every sample’s C2 protocol/port is HTTPS/443
• No need to send the POST request after handshake
• The C2 likely responded without content until commands
are specified by operators
• I started to implement a scanner just checking the
difference between GET requests with/without the
authentication headers
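The decision at the core of such a scanner can be written as a pure function over the two GET responses. This is a sketch of the idea only: the `Response` type and `looks_like_hodur` are hypothetical, and the expected Content-Length is left out because the slides do not state it.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    content_type: str
    body: bytes


def looks_like_hodur(with_auth: Response, without_auth: Response) -> bool:
    """True if content is returned only when the auth headers are sent."""
    key_returned = (
        with_auth.content_type.lower().startswith("application/octet-stream")
        and len(with_auth.body) > 0
    )
    silent_otherwise = len(without_auth.body) == 0
    return key_returned and silent_otherwise
```

Comparing two requests to the same host filters out servers that return binary content to everyone, which a single probe could not distinguish.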
50
TLS HANDSHAKE ISSUE
• OpenSSL caused an internal error during the TLS
handshake
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS header, Unknown (21):
* TLSv1.2 (OUT), TLS alert, internal error (592):
* error:0800006A:elliptic curve routines::point at infinity
* Closing connection 0
curl: (35) error:0800006A:elliptic curve routines::point at infinity
51
TLS HANDSHAKE ISSUE (CONT.)
• I tested major open source TLS clients
• Only LibreSSL (pylibtls) worked for the TLS handshake

TLS client     | OpenSSL              | Mbed TLS (python-mbedtls) | wolfSSL (wolfssl-py) | LibreSSL (pylibtls)
Tested version | 1.1.1k, 3.0.2, 3.2.0 | 2.28.6                    | 5.6.0                | 3.8.2
Worked?        | No                   | No                        | No                   | Yes
52
DETECTION BY THIRD PARTY SCANS
• Shodan hasn't been able to recognize the port since at least last Dec
• Censys can detect the port but the protocol is UNKNOWN (not HTTPS)
53
INTERNET-WIDE SCANNING WORKFLOW
• Automate with Python (Use asynchronous I/O for OpenSSL/JARM scans)
• Exclude as much as possible before the pylibtls scan

1. ZMap
• Get the list of hosts open at TCP/443
2. OpenSSL
• Try TLS handshake
• Cause an internal error?
3. JARM
• Match the JARM fingerprint value of the Hodur C2?
4. pylibtls
• GET request with/without auth headers
• Get an RC4 key-like string only when sending with the headers?
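The filtering order can be sketched as an asynchronous funnel in which each host is dropped at the first stage it fails, so the expensive final check only runs on a handful of candidates. The stage functions are placeholders supplied by the caller; nothing here performs real network I/O, and `funnel` is a hypothetical name.

```python
import asyncio


async def funnel(hosts, stages, concurrency=100):
    """Run each host through the stages in order; keep hosts passing all of them."""
    sem = asyncio.Semaphore(concurrency)       # cap simultaneous checks

    async def check(host):
        async with sem:
            for stage in stages:               # cheapest stage first
                if not await stage(host):
                    return None                # excluded early
            return host

    results = await asyncio.gather(*(check(h) for h in hosts))
    return [h for h in results if h is not None]
```

In a real run the stages would be the OpenSSL handshake check, the JARM match, and the pylibtls GET comparison, in that order, applied to the ZMap output.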
54
RESULT
• Two C2 servers were found late last December
• 149[.]104.12.64 and 45[.]83.236.105
• Two months later, Trend Micro mentioned the C2s in its
blog
• But they are still active
55
DEMO
56

WRAP-UP
57
WRAP-UP
• Defeating compiler-level obfuscations is easier than
before
• 2-3 months for APT10 ANEL -> 3-4 weeks for Hodur
• We still need to improve or create tools when RE requires
de-obfuscating code precisely
• Code will be available online after the conference
• The developed scanner keeps tracking the malware
C2s on the Internet
• We can respond proactively using the intel
