Forecasting Malware Capabilities From Cyber Attack Memory Images

Omar Alrawi, Moses Ike, Matthew Pruett, Ranjita Pai Kasturi, Srimanta Barua, Taleb Hirani, Brennan Hill, and Brendan Saltaformaggio, Georgia Institute of Technology
https://www.usenix.org/conference/usenixsecurity21/presentation/alrawi-forecasting

This paper is included in the Proceedings of the 30th USENIX Security Symposium.
August 11–13, 2021
978-1-939133-24-3

Open access to the Proceedings of the 30th USENIX Security Symposium is sponsored by USENIX.
Forecasting Malware Capabilities From Cyber Attack Memory Images

Omar Alrawi*, Moses Ike*, Matthew Pruett, Ranjita Pai Kasturi, Srimanta Barua,
Taleb Hirani, Brennan Hill, Brendan Saltaformaggio
Georgia Institute of Technology

*Authors contributed equally.

Abstract

The remediation of ongoing cyber attacks relies upon timely malware analysis, which aims to uncover malicious functionalities that have not yet executed. Unfortunately, this requires repeated context switching between different tools and incurs a high cognitive load on the analyst, slowing down the investigation and giving attackers an advantage. We present Forecast, a post-detection technique to enable incident responders to automatically predict capabilities which malware have staged for execution. Forecast is based on a probabilistic model that allows Forecast to discover capabilities and also weigh each capability according to its relative likelihood of execution (i.e., forecasts). Forecast leverages the execution context of the ongoing attack (from the malware’s memory image) to guide a symbolic analysis of the malware’s code. We performed extensive evaluations, with 6,727 real-world malware and futuristic attacks aiming to subvert Forecast, showing the accuracy and robustness in predicting malware capabilities.

1 Introduction

Cyber attack response requires countering staged malware capabilities (i.e., malicious functionalities which have not yet executed) to prevent further damages [1], [2]. Unfortunately, predicting malware capabilities post-detection remains manual, tedious, and error-prone. Currently, analysts must repeatedly carry out multiple triage steps. For example, an analyst will often load the binary into a static disassembler and perform memory forensics, to combine static and dynamic artifacts. This painstaking process requires context switching between binary analysis and forensic tools. As such, it incurs a high cognitive load on the analyst, slowing down the investigation and giving the attackers an advantage.

To automate incident response, symbolic execution is promising for malware code exploration, but lacks the prior attack execution state which may not be re-achievable after-the-fact (e.g., concrete inputs from C&C activity). Environment-specific conditions, such as expected C&C commands, limit dynamic and concolic techniques (e.g., [3]–[14]) from predicting inaccessible capabilities. In addition, these techniques depend on dissecting a standalone malware binary or running it in a sandbox. However, malware are known to delete their binary or lock themselves to only run on the infected machine (hardware locking). Worse still, researchers found that fileless malware incidents (i.e., only resides in memory) continue to rise [1], [15], [16].

Having access to the right execution context is necessary to guide malware into revealing its capabilities. Malware internally gather inputs from environment-specific sources, such as the registry, network, and environment variables, in order to make behavior decisions [11], [17], [18]. Therefore, an ideal and practical input formulation for malware can be adapted from this internal execution state in memory bearing the already-gathered input artifacts. It turns out that anti-virus and IDS already collect memory images of a malicious process after detecting it [19]–[21]. A malware memory image contains this internal concrete execution state unique to the specific attack instance under investigation.

During our research, we noticed that if we can animate the code and data pages in a memory image, and perform a forward code exploration from that captured snapshot, then we can re-use these early concrete execution data to infer the malware’s next steps. Further, by analyzing how these concrete inputs induce paths during code exploration, we can predict which paths are more likely to execute capabilities based on the malware’s captured execution state. Based

USENIX Association 30th USENIX Security Symposium 3523


on this idea, we propose seeding the symbolic exploration of a malware’s pre-staged paths with concrete execution state obtained via memory image forensics. Through this, we overcome the previous painstaking and cognitively burdensome process that an analyst must undertake.

We present Forecast, a post-detection technique to enable incident responders to forecast what capabilities are possible from a captured memory image. Forecast ranks each discovered capability according to its probability of execution (i.e., forecasts) to enable analysts to prioritize their remediation workflows. To calculate this probability, Forecast weighs each path’s relative usage of concrete data. This approach is based on a formal model of the degree of concreteness (or $D_C(s)$) of a memory image execution state ($s$). Starting from the last instruction pointer (IP) value in the memory image, Forecast explores each path by symbolically executing the CPU semantics of each instruction. During this exploration, Forecast models how the mixing of symbolic and concrete data influences path generation and selection. Based on this mixing, a “concreteness” score is calculated for each state along a path to derive forecast percentages for each discovered capability. $D_C(s)$ also optimizes symbolic analysis by dynamically adapting loop bounds, handling symbolic control flow, and pruning paths to reduce path explosion.

To automatically identify each capability, we developed several modular capability analysis plugins: Code Injection, File Exfiltration, Dropper, Persistence, Key & Screen Spying, Anti-Analysis, and C&C URL Connection. Each plugin defines a given capability in terms of API sequences, their arguments, and how their input and output constraints connect each API. Forecast plugins are portable and can easily be extended to capture additional capabilities based on the target system’s APIs. It is worth noting that Forecast’s analysis only requires a forensic memory image, allowing it to work for fileless malware, making it well-suited for incident response.

We evaluated Forecast with memory images of 6,727 real-world malware (including packed and unpacked) covering 274 families. Forecast renders accurate capability forecasts compared to reports produced manually by human experts. Further, we show that Forecast is robust against futuristic attacks that aim to subvert Forecast. We show that Forecast’s post-detection forecasts are accurately induced by early concrete inputs. We empirically compared Forecast to S2E [6], angr [22], and Triton [23] and found that Forecast outperforms them in identifying capabilities and reducing path explosion. Forecast is available online at: https://cyfi.ece.gatech.edu/.

2 Overview

This section presents the challenges and benefits of combining the techniques of memory image forensics and symbolic analysis. Using the DarkHotel incident [2] as a running example, we will show how incident responders can leverage Forecast to expedite their investigation and remediate a cyber attack.

Running Example - DarkHotel APT. DarkHotel is an APT that targets C-level executives through spear phishing [2]. Upon infection, DarkHotel deletes its binary from the victim’s file system, communicates with a C&C server, injects a thread into Windows Explorer, and ultimately exfiltrates reconnaissance data. When an IDS detects anomalous activities on an infected host, an end-host agent captures the suspicious process memory (i.e., DarkHotel’s), terminates its execution, and generates a notification. At this point, incident responders must quickly understand DarkHotel’s capabilities from the different available forensic sources (network logs, event logs, memory snapshot, etc.) to prevent further damages.

Dynamic techniques [11]–[14] may require an active C&C, which may have been taken down, to induce a malware binary to reveal its capabilities. Because DarkHotel only resides in memory, these techniques, which work by running the malware in a sandbox, cannot be applied.¹ With only the memory image, an analyst can use a forensic tool, such as Volatility [24], to “carve out” the memory image code and data pages. Based on the extracted code pages, symbolic analysis can simulate the malware execution in order to explore all potential paths. Unfortunately, existing symbolic tools require a properly formatted binary and are not optimized to work with memory images [7], [22], [23].

Ideally, an analyst can manually project these code fragments into symbolic analysis and source concrete values from the data pages to tell which code branch leads to a capability. However, this back-and-forth process of “stitching up” code with extracted memory artifacts, involves context switching between symbolic execution and the forensic tool. This places a very high cognitive burden on the analyst. An analyst must also handle challenges such as path explosion, API call simulation [4], [22], [25]–[27], and concretizing API arguments (e.g., attacker’s URL), which may not be statically accessible in the memory image. Lastly, an analyst must manually inspect APIs along each path to infer high-level capabilities.

¹ Forensic memory images are not re-executable due to being “amputated” from the original operating system and hardware.



[Figure 1 diagram. Phases: Context-Aware Memory Forensics; Probabilistic Symbolic Analysis; Capability Forecast. Components shown across six numbered stages: Memory Image; Parser (code, data, and CPU state: EAX, EBX, ECX, EDX, EIP, ESP, ESI, EDI, EFLAGS); Augmented Execution Context; Symbolic Exploration; Probability Assignment; Capability-Relevant Paths; API & Argument Constraints; Capability Plugin Analysis. Reported forecasts: Code Injection 31%, File Exfiltration 15%, C&C URL 54%.]

Figure 1: Forecast workflow. A memory image is used to reconstruct the original execution state. Concrete data is utilized to explore code paths while API constraints are analyzed against plugins to forecast capabilities.

2.1 Hybrid Incident Response

Incident responders rely on memory forensics to identify attack artifacts in memory images. However, memory forensics alone, which is largely based on signatures, misses important data structures due to high false negatives [21]. On the other hand, symbolic execution can explore code in the forward direction, but suffers from issues such as path explosion [22]. To address these limitations, Forecast combines symbolic execution and memory forensics through a feedback loop to tackle the shortcomings of both techniques.

Context-Aware Memory Forensics. Symbolic analysis provides code exploration context to accurately identify data artifacts that are missed by memory forensics. For example, traditional forensic parsing of DarkHotel’s memory image missed C&C URL strings because they are obfuscated via a custom encoding scheme. However, subsequent symbolic analysis of the instructions that reference those bytes as arguments, such as a strncpy API, allowed Forecast to correctly identify and utilize these data artifacts in the memory image. Moreover, targeted malware may employ tactics that aim to subvert Forecast, using anti-forensics and anti-symbolic-analysis, which we carefully considered in our design and evaluation.

Memory image forensics provides concrete inputs that can help symbolic analysis perform address concretization, control flow resolution, and loop bounding. In addition, memory forensics identifies loaded library addresses in memory which allows Forecast to perform library function simulation.

Path Probability. Given a memory image, the goal is to utilize available concrete data to explore potential code paths and forecast capabilities along them. By analyzing how different paths are induced by concrete memory image data, Forecast can derive the probability that a path will reach a capability relative to other paths. Forecast computes this probability based on modeling how concrete and symbolic data operations are influencing path generation and selection. Forecast also leverages this probability metric as a heuristic in pruning paths with the least concrete data.

2.2 Incident Response with Forecast

Forecast identifies capabilities originating from a malware memory image in an automated pipeline. To demonstrate this, we simulated DarkHotel’s attack and memory capture, which involved setting up an IDS with DarkHotel’s network signature and executing the Advanced Persistent Threat (APT). Following detection, the IDS signals the end host agent to capture the DarkHotel process memory. We then input this memory image to Forecast for analysis. In 459 seconds, Forecast reveals DarkHotel’s capabilities: a C&C communication (i.e., mse.vmmnat.com), a file exfiltration (i.e., of host information), and a code injection (i.e., into Windows Explorer).

There are six stages for processing a forensic memory image shown in Figure 1. (1) Forecast forensically parses the memory image and reconstructs the prior execution context by loading the last CPU and memory state into a symbolic environment for analysis. In analyzing the memory image, Forecast inspects the loaded libraries to identify the exported function names and addresses. Next, (2) Forecast proceeds to explore the possible paths, leveraging available concrete data in the memory image to concretize path constraints. (3) Forecast models and weighs how each path is induced by concrete data and assigns a probability to each generated path. (4) Forecast then uses this probability as a weight to adapt loop bounds and prune false paths, allowing Forecast to narrow-in on the induced capability-relevant paths. (5) Forecast matches identified APIs to a repository of capability analysis plugins to report capabilities to an analyst. Finally, (6) Forecast identifies three capabilities and derives their forecast percentages from the path probabilities as 31%, 15%, and 54%, respectively.

The first path matches the Code Injection plugin. This path contains the APIs: VirtualAllocEx, WriteProcessMemory, and CreateRemoteThread, which are used in process injection. Analyzing the argument constraints leading to these APIs reveals explorer.exe as the target process. The second path matches the File Exfiltration plugin. This path contains APIs



getaddrinfo, SHGetKnownFolderPath, WriteFile, Socket, and Send. Forecast inspects their arguments’ constraints to determine that the malware writes host information to a file, which it sends over the network. The File Exfiltration plugin concretizes the argument of SHGetKnownFolderPath to reveal the file location identifier: FOLDERID_LocalAppData. The third path matches the C&C Communication plugin, which reveals a sequence of network APIs including InternetOpenUrlA. The plugin queries the API constraints and concretizes InternetOpenUrlA’s argument then reports that DarkHotel makes an HTTP request to the mse.vmmnat.com domain.

Given these forecast reports, an incident responder learns from the captured memory snapshot, that DarkHotel will communicate with mse.vmmnat.com, steal host data, and inject into Windows Explorer. This will prompt the analyst to block the URL and clean up the affected Explorer process mitigating further damages. Forecast empowers the analyst to quickly and efficiently respond to threats by alleviating the cognitive burden and context switching required to manually obtain the same results.

3 System Architecture

Forecast is a post-detection cyber incident response technique for forecasting capabilities in malware memory images. It only requires a memory image as input. The output of Forecast is a text report of each discovered capability (e.g., code injection), a forecast percentage, and the target of the capability (e.g., injected process).

Reconstructing Execution Context. Forecast parses the memory image to extract the execution state (e.g., code pages, loaded APIs, register values, etc.) to be used to reconstruct the process context. Static analysis of the code pages is used to initialize symbolic exploration. It explores each path beginning from the last IP in the reconstructed process context. Forecast symbolically executes the CPU semantics of the disassembled code pages until an undecidable control flow is encountered. To resolve this, Forecast recursively follows the code blocks to resolve new CFG paths. When a library call is reached, Forecast simulates and symbolizes the call (discussed in §3.2). Library call simulation introduces symbolic data for each explored state, thus increasing the possibility of state explosion. However, the $D_C(s)$ model (discussed next) provides optimization metrics that enable Forecast to dynamically adapt parameters for loop bounding, symbolic control flow, and path pruning.

3.1 Modeling Concreteness to Guide Capability Forecasting

Forecast models how available concrete data in a memory image induces capability-relevant paths using the degree of concreteness model ($D_C(s)$). Degree of concreteness is a property of execution states which encapsulates the “mixing” of symbolic and concrete operations. Symbolic operations (Sym_Ops) make use of symbolic variables such as arithmetic involving symbolic operands. Concrete operations (Con_Ops) do not make use of symbolic variables. Sym_Ops and Con_Ops are intrinsic to every state transition. A state transition happens each time a basic block is executed along an explored path. Based on the ratio of Sym_Ops to Con_Ops, there exists an associated degree of concreteness ($D_C(s)$) value, which measures how concrete or symbolic the current execution state is.

Forecasting is based on malware’s use of pre-staged concrete data to execute a set of capabilities. Under $D_C(s)$, paths that increasingly utilize concrete states are more likely to reach a set of capabilities. As a result, Forecast assigns $D_C(s)$ scores to states by modeling their cumulative usage of concrete data. This $D_C(s)$ score is then used to derive the probability, $P_{prob}(s)$, that a path will reach a capability relative to other paths. At the end of exploration, the paths where capabilities are found are analyzed based on their $P_{prob}(s)$, to compute forecast percentages of identified capabilities.

In addition to deriving forecasts, $D_C(s)$ detects conditions that trigger path explosion (e.g., rapid path splitting due to symbolic control flows), and makes performance improvements including pruning false states based on the degree of concreteness of every active state (discussed in §3.2).

Formulation of $D_C(s)$. For $D_C(s)$ to forecast capabilities, it must summarize two key features: (1) the rate of change in the ratio of symbolic operations to all operations, with respect to state transitions, and (2) the cumulative state conditions from a starting exploration state $j$ to a target state $n$. We normalize $D_C(s)$ with respect to the number of states explored in our model. This bounds its value between 0.0 and 1.0, which describes the current state mixing. Formally, we define a state transition set $\tau_n$, which is a set of ordered states from $s_j$ to $s_n$:

$$\tau_n := \{s_j, s_{j+1}, s_{j+2}, \ldots, s_n\} \tag{1}$$

where state $s_j$ is the first state generated from a memory image and $0 \le j \le n$, $n \in \mathbb{Z}$. Transitioning from state $s_{i-1}$ to $s_i$ involves executing every operation ($\mathit{All\_Ops}_{i-1}$) in the basic block $BB_{i-1}$ at state $s_{i-1}$. The states in $\tau_n$ are ordered based on the basic block ordering, i.e., the basic block $BB_i$ maps to state $s_i$, and executing $BB_i$



transitions the program’s context to $BB_{i+1}$ and state $s_{i+1}$. The set $\mathit{All\_Ops}_i$ is partitioned into 2 disjoint sets, $\mathit{Sym\_Ops}_i$ and $\mathit{Con\_Ops}_i$, such that:

$$\mathit{Sym\_Ops}_i \cup \mathit{Con\_Ops}_i = \mathit{All\_Ops}_i \tag{2}$$

and

$$\mathit{Sym\_Ops}_i \cap \mathit{Con\_Ops}_i = \emptyset \tag{3}$$

For a state $s_n$, we define the $D_C(s_n)$ function as follows:

$$D_C(s_n) = 1 - \frac{\sum_{i=j}^{n} \frac{|\mathit{Sym\_Ops}_i|}{|\mathit{All\_Ops}_i|}}{|\tau_n|} \tag{4}$$

where $|\mathit{Sym\_Ops}_i|$ is the cardinality of the Sym_Ops performed to reach state $s_i$ and $|\mathit{All\_Ops}_i|$ is the cardinality of All_Ops performed to reach state $s_i$. Further, $|\tau_n|$ is the cardinality of the state transitions from state $s_j$ to $s_n$.

Tracking the cumulative ratio of $\mathit{Sym\_Ops}_i$ to $\mathit{All\_Ops}_i$ for each state transition enables us to calculate $D_C(s)$ instantaneously without iterating through the previous states $s_j$ to $s_n$. An extended form of $D_C(s)$ that allows us to calculate its instantaneous value is given as follows:

$$D_C(s_n) = 1 - \frac{\delta}{\delta T}\,\mathit{Cumul\_Ratio}(s_n) \tag{5}$$

where, for all transition states $T$, $\mathit{Cumul\_Ratio}(s_n)$ is the sum of the states’ ratio for states $s_j$ to $s_n$, and defined as:

$$\left\{\forall s_i \in T : \mathit{Cumul\_Ratio}(s_n) := \sum_{i=j}^{n} \frac{|\mathit{Sym\_Ops}_i|}{|\mathit{All\_Ops}_i|}\right\} \tag{6}$$

[Figure 2, panel (a): symbolic exploration for the control-flow graph, memory, and register values from the memory image]
    BB1 (0x403280): mov eax, ecx; mov ecx, 5; jmp 0x40374D
    BB2: mov edx, [0x732460]; cmp edx, 0; jnz 0x403787
    BB3: mov edx, eax; add edx, 3; mov eax, 0x732470; mov eax, [eax]; add eax, 1
    BB4: push eax; xor esi, esi; push esi; call 0x4042AD
    BB5: lea eax, [0x732468]; mov eax, [eax]; jmp 0x40385B
    Memory view: 0x732460: AA 23 BF CA; 0x732464: SYMBOLIC; 0x732468: SYMBOLIC; 0x73246C: F1 EC 2B 32; 0x732470: SYMBOLIC
    Register view: EAX: 0x732468; EBX: SYMBOLIC; ECX: SYMBOLIC; EDX: 0x2000; EIP: 0x403280; ESP: 0x28FECC; ESI: 0x4000
[Figure 2, panel (b): value derivation for degree of concreteness ($D_C(s)$)]
    $D_C(s_1) = 1 - \frac{1/3}{1} = 1 - \frac{0.333}{1} = 0.67$
    $D_C(s_2) = 1 - \frac{1/3 + 0/3}{2} = 1 - \frac{0.333}{2} = 0.83$
    $D_C(s_3) = 1 - \frac{1/3 + 0/3 + 4/5}{3} = 1 - \frac{1.133}{3} = 0.62$
    $D_C(s_4) = 1 - \frac{1/3 + 0/3 + 4/5 + 1/4}{4} = 1 - \frac{1.383}{4} = 0.65$
[Figure 2, panel (c): plot of cumulative ratio vs. states]
[Figure 2, panel (d): plot of $D_C(s)$ vs. states]

Figure 2: Forecast recovers context from the process memory image, including the memory values and register values for the captured state in (a). Using the degree of concreteness ($D_C(s)$) formula, (b) calculates the values for each transition state. Figure (c) plots the cumulative ratio of Sym_Ops to All_Ops accumulated across state transitions. Figure (d) plots the degree of concreteness ($D_C(s)$) across state transitions in the symbolic exploration.

An Example of $D_C(s)$ Computation. Figure 2 is a working example to show the computation of $D_C(s)$. Figure 2a depicts a recovered CFG and memory and register values from the memory image. Symbolic execution starts at basic block $BB_1$ and ends at $BB_4$. We annotate each basic block to show which instructions are Sym_Ops based on the register or memory values when the basic block is being executed. Notice that because register edx at $BB_2$ and memory address 0x732460 at $BB_2$ have concrete values, only one branch is taken by the conditional jump instructions at the end of $BB_2$. For this reason, $BB_5$ is not explored. Symbolic data can be introduced by I/O-related function calls and calls to functions that are simulated based on Forecast’s function models. Such function calls create symbolic variables within the memory dump which causes a mixing of symbolic and concrete data.

Following along with Figure 2a, Figure 2b computes $D_C(s)$ for each state (basic block) transition. For example, $D_C(s_1) = 0.67$ when we transition to state $s_2$, then it increases to 0.83 as we transition from $s_2$ to $s_3$. For each $D_C(s_i)$ value derived in Figure 2b, we plot them against the transition states in Figure 2d. Figure 2c plots the $\mathit{Cumul\_Ratio}(s_i)$ for each state (shown in black). The instantaneous $\mathit{Cumul\_Ratio}(s_n)$ function is a straight line ($\mathit{Cumul\_Ratio}(s_n) = mT$) drawn from origin to the point $s_n \in T$, where $m$ is the slope. The derivative of $\mathit{Cumul\_Ratio}(s_n) = mT$ gives the instantaneous $D_C(s_n)$ (Equation 5).

Path Probability. Given $m$ current states, the path probability of a path $p$, with current state $s$, is derived by dividing $s$’s $D_C(s)$ by the summation of the $D_C(s)$



of all $m$ states. This bounds its value between 0.0 and 1.0, and is given as follows:

$$\left\{P_{prob}(s_x) = \frac{D_C(s_x)}{\sum_{i=1}^{m} D_C(s_i)},\quad m = |\{\mathit{AllCurrentStates}\}|\right\} \tag{7}$$

Algorithmic Approach to $D_C(s)$. In order to derive $D_C(s)$, Forecast uses Algorithm 1. Cumul_Ratio is the cumulative ratio of symbolic operations to all operations, and T is the total state transitions in terms of basic blocks. For each explored path p in the memory image, $D_C(s)$ is calculated for every state s generated and executed along the path p.

Algorithm 1 The Degree of Concreteness ($D_C(s)$)
Input: PATHS: Explored program paths in a memory image
Output: $D_C(s)$: ∀s ∈ path, ∀path ∈ PATHS
    ▷ Initialize Cumul_Ratio for each explored path p
    for path p ∈ PATHS do
        Cumul_Ratio ← 0
        T ← 0
        ▷ Compute $D_C(s)$ for each state s generated along p
        for State s ∈ SuccessorStates(p) do
            ▷ Get Sym_Ops and All_Ops
            Num_all_ops ← GetNumAllOps(s)
            Num_sym_ops ← GetNumSymOps(s)
            ▷ Calculate the ratio of Sym_Ops to All_Ops for state s
            Sym_Ratio ← Num_sym_ops / Num_all_ops
            ▷ Update Cumul_Ratio along the explored path
            Cumul_Ratio ← Cumul_Ratio + Sym_Ratio
            ▷ Count this transition first, so that T equals the number of transitions in Equation 4 and is never zero
            T ← T + 1
            ▷ Compute $D_C(s)$ for the considered state s
            $D_C(s)$ ← 1 − (Cumul_Ratio / T)
        end for
    end for

3.2 $D_C(s)$-Guided Symbolic Analysis

Forecast uses $D_C(s)$ to optimize symbolic execution multi-path exploration by bounding loops, concretizing addresses for symbolic control flow, and pruning paths. Neglecting these parameters impacts soundness and performance [27], [28]. State-of-the-art tools [6], [22], [23] rely on hard-coded thresholds to balance the trade-off between coverage and soundness. These techniques mostly focus on finding bugs in non-malicious code. Choosing an informed threshold is application-specific and may require a manual investigation. Yet, unlike finding bugs, malware employ adversarial means to vary these issues at run-time, hence a hard-coded or manual threshold will be limiting. However, by modeling the changing concrete state of an exploration, Forecast can dynamically adapt these (otherwise application-specific) thresholds at run-time. $D_C(s)$ embodies this automated adaptability to optimize exploration. We evaluate these features against adversarial symbolic analysis tactics in §4.

Adapting Loop Bounds. Forecast optimizes loops by forcing a bound only when $D_C(s)$ indicates a heavy symbolic state over time (specifically, when $D_C(s)$ drops below 0.10 after 10 state transitions). This optimization precisely measures how much a loop is affecting a state to decide when to bound it. We observed that unlike harmless loops, explosion-causing loops converge $D_C(s)$ to 0.10 after two or more transitions.

On-Demand State Pruning. When performance is overwhelmed by heavy state symbolism, Forecast prioritizes states for pruning by selecting the worst performers. Under $D_C(s)$, this selection is trivial since every state has a $D_C(s)$ score, which is used to prune states with heavy symbolic footprints. In §4.6, we found on-demand pruning drove Forecast toward more concrete paths than tools which prune paths via a hard-coded threshold, leading to Forecast exploring deeper in selected paths.

Stack Backtrace Analysis. False successor paths often arise in symbolic analysis. Forecast examines the return addresses on the stack in a memory image to identify false paths, i.e., function returns which do not conform to previously established targets in the call stack. Specifically, the stack backtrace enables Forecast to verify flow-correctness by comparing the stack pointer and return addresses in the backtrace with those computed after executing a return instruction.

Address Concretization. Forecast uses the memory image data space to concretize symbolic indices to a tractable range. In addition, we observed that false states perform illegal index accesses (indices beyond the mapped code/data space of a process). Forecast uses this indicator to prune such states. Further, Forecast’s analysis is transparent to address space layout randomization (ASLR) because ASLR is done at process load, before execution.

Library Function Simulation. Forecast analyzes the libraries present in the memory image to identify the exported functions. Identified functions are hooked to redirect the symbolic exploration to a simulated procedure. Forecast also handles dynamic library loading by calls to the LoadLibrary functions. If a library is loaded during symbolic exploration, Forecast creates a new section in memory for the loaded library. Once a call to GetProcAddress is reached, a new address is allocated in the library’s memory section and hooked, then this address is returned. Any calls made to this address will be redirected to the correct simulated procedure.
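The degree-of-concreteness bookkeeping (Equation 4 and Algorithm 1), the path probability (Equation 7), and the two DC(s)-driven heuristics of this subsection can be illustrated with a small, self-contained sketch. It is not Forecast's implementation: the function names, the `(num_sym_ops, num_all_ops)` input format, the reading of the loop-bounding rule as "score below 0.10 once at least 10 transitions have been seen", and the `keep` parameter for pruning are all assumptions made for illustration. The sample input reuses the per-block operation counts from the Figure 2 example.

```python
from typing import List, Sequence, Tuple

# Thresholds quoted in the loop-bounding paragraph; how they are combined
# below is an assumption of this sketch.
DC_LOOP_THRESHOLD = 0.10
MIN_TRANSITIONS = 10


def degree_of_concreteness(ops: Sequence[Tuple[int, int]]) -> List[float]:
    """Return the D_C score after each state transition along one path.

    `ops` holds one (num_sym_ops, num_all_ops) pair per executed basic block.
    """
    cumul_ratio = 0.0
    scores = []
    for t, (num_sym, num_all) in enumerate(ops, start=1):
        cumul_ratio += num_sym / num_all      # running sum from Equation 6
        scores.append(1.0 - cumul_ratio / t)  # Equation 4 with |tau_n| = t
    return scores


def path_probabilities(current_dc: Sequence[float]) -> List[float]:
    """Normalize the D_C scores of all current states (Equation 7)."""
    total = sum(current_dc)
    if total == 0:
        return [0.0 for _ in current_dc]
    return [dc / total for dc in current_dc]


def should_bound_loop(scores: Sequence[float]) -> bool:
    """Assumed reading of the rule: bound once D_C is below 0.10 after 10 transitions."""
    return len(scores) >= MIN_TRANSITIONS and scores[-1] < DC_LOOP_THRESHOLD


def keep_most_concrete(current_dc: Sequence[float], keep: int) -> List[int]:
    """Return the indices of the `keep` states with the highest D_C scores."""
    ranked = sorted(range(len(current_dc)), key=lambda i: current_dc[i], reverse=True)
    return sorted(ranked[:keep])


# Symbolic/total operation counts for BB1, BB2, BB3, BB4 in Figure 2.
figure2_ops = [(1, 3), (0, 3), (4, 5), (1, 4)]
figure2_scores = degree_of_concreteness(figure2_ops)
print([round(s, 2) for s in figure2_scores])  # [0.67, 0.83, 0.62, 0.65]
```

Keeping only the running sum and a transition counter mirrors the point made in §3.1 that DC(s) can be updated per transition without revisiting earlier states.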



3.3 Forecasting Malware Capabilities

To characterize high-level capabilities, we focus on contextualizing a malware’s API functionality by analyzing the constraints on their input and output parameters. Forecast analyzes the symbolic constraints on the input and output parameters of each API to “connect the dots” between APIs. Analyzing APIs used by malware is useful for identifying its capabilities because a malware’s behavior stems from its API calls and data flow [11]–[13], [29]. Specifying a unique trace involves identifying the first (source) and last (sink) API in the sequence. While analyzing API data flow is not novel [30], previous work relies on dynamic taint-tracking [11], [14], [29], which can hardly be applied here. To tackle this, we leverage a constraint matching technique introduced by [5] to model malware’s decision making. Our approach is based on the formulation that for a given API trace to embody a capability, the path constraints on the input of each succeeding API, starting from the sink, can be matched to the output constraints of at least one preceding API.

When a sink is encountered, Forecast performs a call-based backward slice to record all call instructions such that, for each instruction, there is a data flow from at least one of its operands to the input argument of the sink. If the extracted slice includes a corresponding source, Forecast proceeds to match the constraints on the input of every succeeding call, starting from the sink, to the output of any preceding call. Note that traditional system call/API tracing often misses malware capabilities due to a lack of contextual connection between observed APIs. Instead, Forecast uses the constraints on the API parameters in this call-based backward slice to precisely connect the data flow between the APIs to infer capabilities. Put simply: the constraints encapsulate only the relevant data flow between sources and sinks.

Figure 3 illustrates this analysis on AveMaria, a Trojan that steals Firefox cookie files. AveMaria infects by replacing the code of Svcshost, a Windows service, with its own code, a code injection capability known as process hollowing. AveMaria also takes screenshots to spy on the user’s screen. The shaded boxes are the relevant APIs in the trace and their key arguments. The dotted line matches the input constraints on an argument of a later API to the output constraints of at least one preceding API. The analysis starts when a sink is identified (e.g., SetThreadContext for AveMaria’s Code Injection) and the entire trace is recovered by a call-based backward slice. The numbers, 1, 2, etc., show the constraint matching steps, starting from the sink [...] exfiltrated by send are matched with the constraints on buf_2, an output argument of ReadFile. Next, the constraints on the file handle (hFile) of ReadFile are matched with the constraints on the output of OpenFile. When these constraints are matched from a send to socket, Forecast reports a File Exfiltration.

Figure 3: API Constraints Analysis of AveMaria.

Capability Analysis Plugin. A plugin specifies different ways that a given capability is to be identified.² It lists one or more API sequences, their key arguments, and how constraints on their input and output parameters connect each other. We develop plugins to identify 7 specific malware capabilities. Analysts can easily extend these plugins to specify additional capabilities by reviewing the API documentation of the target operating system. Next, we describe each capability, showing how a plugin can specify them.

² Several plugins could be defined for one capability to capture different possible ways that malware exhibit that capability.

1. File Exfiltration. Malware sends stolen information from an infected host by uploading a file to its drop site. This is done by using OpenFile and ReadFile APIs to copy data into a buffer followed by use of the send or HttpSendRequest network API.³ The plugin matches the constraints on the buffer written to by ReadFile with the buffer of data sent by send or HttpSendRequest. Figure 3 shows Forecast’s analysis of AveMaria’s file exfiltration.

2. Code Injection. Malware injects its code into a victim process to run under the target process ID. This is done by the OpenProcess or CreateProcess APIs, followed by WriteProcessMemory (process hollowing) and/or CreateRemoteThread (PE or DLL Injection).
and walking backwards to a source. In AveMaria’s File 3 We refer to APIs with multiple variants (A, ExA, W, and

Exfiltration, the constraints on the input file (buf _3) ExW ) by the base API name but our plugins cover all variants.

USENIX Association 30th USENIX Security Symposium 3529


The plugin matches the input constraints on the process handle used by these APIs. Figure 3 shows Forecast’s analysis of AveMaria’s code injection.

3. Dropper. Malware writes a file to disk and changes its attributes for execution. The plugin matches the constraints on the file handle returned by CreateFile with the file handle input passed to WriteFile, as well as the file name passed to CreateFile, SetFileAttributes, and CreateProcess.

4. Key & Screen Spying. Malware records keystrokes and screenshots of a user’s computer. To detect key spying, the plugin matches the constraints on the window handle passed to RegisterHotKey and GetMessage and checks if WH_KEYBOARD was passed to SetWindowsHook to monitor keystrokes. For screenshots, the plugin checks if a device context handle returned by GetDC or GetWindowDC is passed to CreateCompatibleBitmap. Figure 3 shows this analysis for AveMaria’s screen spying.

5. Persistence. Malware makes registry entries to maintain persistence across reboots. The persistence plugin compares the constraints on the registry key handle returned by RegCreateKey or RegSetValue with the input to RegSetValue. We also specify the keys and subkeys that malware commonly use with these APIs, such as HKLM, HKCU, Run, and ControlSet.

6. Anti-analysis. Malware checks for analysis environments and tools to determine if it should hide its behavior. This can be done by checking for debuggers with OutputDebugString, IsDebuggerPresent, or CheckRemoteDebuggerPresent. VM checks look for running services by using CreateToolhelp32Snapshot or EnumProcesses, or invoke cpuid to check for virtual CPUs. The plugin checks for usage of these APIs.

7. C&C Communication. This plugin checks the arguments of socket (af is an IP address), InternetOpenUrl (lpszUrl is a domain), and IWinHttpRequest::Open (lpszServerName is a domain or IP) to determine which servers are contacted. For domains that are represented by constant values or stored in memory (e.g., obtained from an external source such as a file or socket), the plugin can successfully extract the domain. If the domain is from an external source and had not been stored in memory at the time of the memory capture, the plugin is unable to determine its concrete value. In the case of domains generated algorithmically, Forecast builds constraints on the bytes of the domain, seeds Z3 with the concrete execution data, and attempts to solve the constraints.

To develop these plugins, we manually analyzed 50 samples and compiled many relevant API traces and their key arguments, similar to what an analyst would do. Since there are a finite number of ways malware can exhibit a given capability, we can expect to model most of those methods. In doing this, we observed that there could be variations in API traces for the same capability, but the key APIs are always present. In addition, some APIs perform the same function, and hence can be interchanged. For example, WriteVirtualMemory can be interchanged for WriteProcessMemory in the process hollowing example in Figure 3. Furthermore, this approach is resilient to noisy API calls that malware authors may mix into their capability function. We provide additional details about the constraints for each plugin in Appendix A Table 7.

Capability Forecast Percentages. The paths where capabilities are found are known as capability paths or C_Paths. Forecast considers these paths to derive forecast percentages for discovered capabilities. For each capability c_x along a path x, Forecast reports a forecast C_cast(c_x) as a percentage. C_cast(c_x) is derived from the path probabilities of all C_Paths, and measures the probability that c_x will be executed relative to other capabilities. Let the cardinality of C_Paths be m. A forecast is given as follows:

$$\left\{ \forall i \in C_{Paths} : C_{cast}(c_x) = \frac{P_{prob}(x)}{\sum_{i=1}^{m} P_{prob}(i)} \times 100 \right\} \qquad (8)$$

4 Evaluation

Forecast builds upon several angr [22] features, including exploration techniques, SimProcedures, and state plugins. Our focus is on Windows malware since it is the most prevalent, but our methodology could be ported to other platforms.

Experiment Setup. Our experiment mimics a real-world deployment where a host-based security tool captures a memory image of malware once an IDS detects malicious network activity. Our testbed is comprised of (1) an Ubuntu 14.04 machine (with 40GB RAM and a 4-core 2.7GHz CPU) running Forecast, (2) a Windows 7 machine executing malware, and (3) an IDS running SNORT. We collected the alert network signatures of each malware to configure SNORT. IDS alerts during the malware’s execution trigger the capture of a process memory image⁴, which is then sent to Forecast. We profiled all captured memory images and observed that 83% were taken while the malware was polling on I/O, such as a network socket.

⁴ WinDBG memory capture also collects pages swapped to disk.
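The plugin-driven constraint matching described above (walk backwards from a sink and require that each input constraint matches the output constraint of some preceding API) can be illustrated with a minimal Python sketch. Everything here is a simplifying assumption made for illustration: the `Call` record, the plugin tuple layout, the argument names, and the use of plain string labels in place of real symbolic constraints are hypothetical and are not Forecast's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Call:
    """One API call in a recovered trace (hypothetical, simplified model)."""
    api: str
    inputs: dict = field(default_factory=dict)    # argument name -> constraint label
    outputs: dict = field(default_factory=dict)   # argument name -> constraint label

# Hypothetical plugin: each link is (consumer API, input arg, producer API, output arg),
# listed from the sink backwards to the source.
FILE_EXFILTRATION = [
    ("send", "buf", "ReadFile", "lpBuffer"),
    ("ReadFile", "hFile", "OpenFile", "return"),
]

def chain_matches(trace, idx, links):
    """Check that trace[idx] satisfies links[0], then recurse toward the source."""
    if not links:
        return True
    consumer, in_arg, producer, out_arg = links[0]
    call = trace[idx]
    if call.api != consumer or in_arg not in call.inputs:
        return False
    wanted = call.inputs[in_arg]
    # Only calls that precede the consumer can have produced its input.
    for j in range(idx - 1, -1, -1):
        prev = trace[j]
        if prev.api == producer and prev.outputs.get(out_arg) == wanted:
            if chain_matches(trace, j, links[1:]):
                return True
    return False

def has_capability(trace, plugin):
    """Start at every occurrence of the sink API and walk backwards."""
    sink = plugin[0][0]
    return any(
        chain_matches(trace, i, plugin)
        for i, call in enumerate(trace)
        if call.api == sink
    )

# Toy trace: equal labels stand in for matched constraints.
trace = [
    Call("OpenFile", outputs={"return": "h1"}),
    Call("GetTickCount", outputs={"return": "t0"}),   # unrelated noise call
    Call("ReadFile", inputs={"hFile": "h1"}, outputs={"lpBuffer": "b2"}),
    Call("send", inputs={"buf": "b2"}),
]
print(has_capability(trace, FILE_EXFILTRATION))
```

Because matching is driven only by the links named in the plugin, the unrelated `GetTickCount` call is ignored, which mirrors the claim that the approach tolerates noisy API calls mixed into a capability function.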

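The forecast percentage of Equation 8 is a normalization of one capability path's probability by the sum over all capability paths. A minimal sketch follows, assuming the path probabilities have already been computed; the function name and the sample probabilities are hypothetical and chosen only so the shares are easy to read.

```python
def forecast_percentages(path_probs):
    """Return each capability path's share of the total path probability, in percent."""
    total = sum(path_probs.values())
    if total <= 0:
        raise ValueError("at least one capability path needs a non-zero probability")
    return {cap: 100.0 * p / total for cap, p in path_probs.items()}

# Hypothetical path probabilities for three capability paths (m = 3).
probs = {
    "Code Injection": 0.114,
    "C&C Communication": 0.076,
    "File Exfiltration": 0.010,
}

for capability, pct in forecast_percentages(probs).items():
    print(f"{capability}: {pct:.0f}%")
```

By construction the percentages of all capability paths in one memory image sum to 100, so a forecast is only meaningful relative to the other capabilities found in the same image.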


Malware      C&C Comm      File Exfil.   Code Inject.  Dropper       Key&Scr. Spy  Persistence   Anti-analysis OFP  OFN
             PF   OM OF    PF   OM OF    PF   OM OF    PF   OM OF    PF   OM OF    PF   OM OF    PF   OM OF
Bokbot       38%  2  2     5%   3  3     57%  1  1     -             -             -             -             0    0
AcridRain    23%  3  3     19%  4  4     -             28%  2  2     -             30%  1  1     -             0    0
AthenaGo     -             11%  4  4     -             22%  3  2     -             33%  2  3     34%  1  1     2    0
Rokrat       30%  1  1     26%  2  2     22%  3  3     -             17%  4  4     -             15%  5  5     0    0
AdamLocker   22%  3  3     0    4  ∅     45%  1  1     -             -             33%  2  2     -             0    1
Marap        -             46%  3  3     40%  1  1     -             14%  2  2     -             -             0    0
ATI          -             -             -             41%  2  2     -             42%  1  1     17%  3  3     0    0
TeslaAgent   11%  4  4     14%  3  3     32%  1  1     -             13%  2  3     30%  3  3     -             0    0
Andromeda    25%  2  2     -             14%  3  3     -             -             61%  1  1     -             0    0
AveMaria     28%  3  4     29%  2  2     28%  4  3     -             25%  1  1     0    3  ∅     -             2    1
Aveo         22%  3  3     -             -             40%  1  1     -             38%  2  2     0    4  ∅     0    1
7Honest      -             16%  3  3     51%  1  1     11%  4  4     -             22%  2  2     -             0    0
Abaddon      -             26%  2  2     -             -             -             84%  1  1     -             0    0
AVCrypt      51%  1  1     -             -             -             -             19%  3  3     30%  2  2     0    0

Table 1: Capability Forecasts of 14 Select Recent Samples. PF: Forecast percentage, OM: Ground truth manual ordering, OF: Forecast ordering, OFP: Ordering false positive, OFN: Ordering false negative.
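The OFP and OFN columns of Table 1 can be tied to the 86.5% accuracy that Section 4.1 reports for these 14 samples. The sketch below assumes accuracy is computed as validated forecasts divided by validated forecasts plus false positives plus false negatives; that formula is an inference from the reported numbers rather than something the paper states, and the variable names are hypothetical.

```python
def accuracy(validated, false_pos, false_neg):
    """Accuracy in percent, assuming validated / (validated + FP + FN)."""
    return 100.0 * validated / (validated + false_pos + false_neg)

# OFP and OFN columns of Table 1, in row order (Bokbot ... AVCrypt).
ofp = [0, 0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0]
ofn = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0]

forecasts = 49                  # distinct capability forecasts reported in Section 4.1
false_pos = sum(ofp)            # forecasts that manual analysis did not validate
false_neg = sum(ofn)            # capabilities Forecast missed (the ∅ cells)
validated = forecasts - false_pos

print(validated, false_pos, false_neg)
print(round(accuracy(validated, false_pos, false_neg), 1))
```

Under this assumption the column sums reproduce the 45 validated forecasts, 4 FPs, and 3 FNs quoted in the text.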

4.1 Evaluating Capability Forecasts

Table 1 presents the capability forecasts of 14 recent samples⁵ for which we manually collected ground truth. Forecast output 49 distinct capability forecasts. Manual analysis validated 45 of them; we found 4 false positives (FP) and 3 false negatives (FN), giving an accuracy of 86.5%. FPs were due to over-approximating symbolic constraints when simulating undocumented APIs such as RtlCreateUserThread. The FNs were due to rare unresolved symbolic targets.

⁵ Their hashes are presented in Table 8 in Appendix A.

Ground Truth. Validating each forecast involves 2 checks: (1) the presence or absence of the identified capability, and (2) the accuracy of the forecast percentage. For ground truth on the presence or absence of a capability, we leveraged malware reports from security vendors [31], [32] and our own manual analysis. We also used the MITRE ATT&CK Framework [33] for our initial ground truth mappings. To validate our ground truth forecast percentages (i.e., rank each outcome according to the “difficulty” or “constraints required” of arriving at an outcome), we modeled the difficulty metric of executing capabilities from the memory image capture point based on the number of branch constraints to reach a given capability. We can obtain this metric via manual analysis of the memory image since we know the addresses of the individual capabilities. Using Bokbot as an example, Table 1 shows its 3 capabilities: Code Injection, C&C Communication, and File Exfiltration. For these, Forecast reports forecast percentages of 57%, 38%, and 5% respectively (listed in the PF column of Table 1). Based on manual analysis of its memory image, the numbers of branch constraints to reach these capabilities are 166, 195, and 257, respectively. Thus, Code Injection is less difficult to reach and hence has the highest forecast.

Next, we validate capability ordering. We assign an increasing number, starting at 1, to each capability identified by manual checking (defined as OM), ordered by increasing difficulty. We assign an increasing number to each capability identified by Forecast (OF) up to the number of identified capabilities. For Bokbot, both manual checking and Forecast report an ordering of 1, 2, and 3 for Code Injection, C&C Communication, and File Exfiltration respectively. As shown in Table 1, because Forecast’s forecast for Bokbot’s Code Injection is the highest (i.e., 57%), Code Injection’s ordering OF is 1. Similarly, the ordering by manual checking, OM, is also 1, which validates Forecast’s forecast for Bokbot’s Code Injection. In another example, Forecast’s prediction for AthenaGo’s Dropper is 22%, which is the second highest forecast (i.e., OF is 2). However, manual checking shows Persistence as the second highest instead, resulting in FPs for AthenaGo (listed in the OFP column of Table 1). Forecast missed Aveo’s Anti-analysis capability, resulting in an FN (listed in the OFN column) and a forecast of 0 (PF column).

Overall, Persistence reported the highest forecast percentages, as high as 84% for Abaddon. We found that most malware persist by modifying the registry. Conversely, File Exfiltration reports the lowest forecasts, as low as 5% for Bokbot. Reasonably, File Exfiltration can be seen as an “end goal” capability, which malware deploy in deep code under several constraints. By integrating capability analysis plugins, Forecast was able to automatically identify them.

C&C Communication. Table 1 shows 7 C&C domains identified with 1 FP. We focused on WinINET’s APIs such as InternetOpenUrl and socket. In particular, we concretized their domain and IP address arguments. Forecast revealed Rokrat and
AVCrypt’s usage of dropbox.com and TOR (bxp44w3qwwrmuupc.onion), respectively. TeslaAgent uses a hardcoded IP address (45.77.35.239) and a gmail account ([email protected]) to communicate externally. Aveo communicates with a .it domain, vacanzaimmobiliare.it. We found that this server is hosting a vacation website and is likely compromised.

Code Injection. Forecast reports 8 Code Injection forecasts with 1 FP. Explorer and Svchost are the most common Windows programs injected into. 7Honest, Bokbot, and AveMaria hollow into Svchost by invoking CreateProcess with a CREATE_SUSPENDED flag, and thereafter swap the code pages with WriteProcessMemory and SetThreadContext. TeslaAgent and Andromeda inject into Explorer using the VirtualAlloc and WriteProcessMemory API sequences.

Dropper. Forecast reports 5 Dropper forecasts with no FP and no FN. 7Honest and AthenaGo drop additional files in the AppData and ProgramData directories and manipulate their permissions using SetFileAttributes. AcridRain drops a WinDDecode.exe executable in AppData. We determined it was a custom decoder for its C&C. Aveo drops .dat executables in system32.

Key & Screen Spying. We focused on detecting keyloggers and screen captures based on the Key Hooks and GDI API toolkit. Forecast reported 4 Key & Screen Spying forecasts with 1 FP. RokRat and TeslaAgent used the GetAsyncKeyState and RegisterHotKey APIs to obtain key presses. AveMaria invoked screen capture using a sequence of GetDesktopWindow, GetWindowDC, and CreateCompatibleBitmap.

Anti-Analysis. Forecast reports 4 Anti-analysis forecasts with 1 FN. RokRat and AthenaGo performed network checks via InternetCheckConnectionA. AVCrypt uses IsDebuggerPresent, OutputDebugString, and CheckRemoteDebuggerPresent to check for debuggers. To check for a VM, ATI issues cpuid calls to obtain hardware platform information.

4.2 Packed Malware

We evaluated Forecast’s robustness against packers using the taxonomy proposed by Ugarte-Pedrero et al. [34]. In fact, 3 of the 14 samples from Table 1 are packed by UPX, which is a Type-I packer. We include those three samples in our packer robustness evaluation as a reference, as shown in Table 2. Our evaluation also looked at three additional families using two different types of packers, namely ASPack (Type-III) and Armadillo (Type-VI), giving us a total of 9 samples.

Type-I through Type-IV packers fully unpack the malware code in memory before executing the malicious code [34]. For completeness, we evaluate Forecast against ASPack, a Type-III packer, where layered unpacking routines are not sequential, leaving junk code and data in memory from earlier layers. In Table 2, Forecast explores an average of 894 paths per sample with a high final DC(s) (mostly concrete). Additionally, Forecast identifies almost every capability found in Table 1, except for the exfiltration (Exfil.) and persistence (Persist) capabilities of RokRat and AcridRain, respectively. We mark those missed capabilities as false negatives (FN) in Table 2.

Type-V and Type-VI packers unpack malicious code incrementally using different memory frames. We evaluate Forecast against Armadillo with CopyMem-II protection, which incrementally unpacks and executes code at a memory-page granularity. Forecast explores an average of 590 paths per sample with an average of 0.73 for the final DC(s), which is lower than for Type-I and Type-III packers. Moreover, Forecast identifies all the capabilities in Table 1 except the code injection (Inject), persistence (Persist), and dropper (Drop) capabilities found in RokRat, AcridRain, and AthenaGo, respectively. These results empirically show the effect of incremental unpacking on Forecast’s capability to analyze malware, which is rooted in the memory artifacts that are visible during malware capture. We discuss these limitations in §6.

Group                Packer              Malware    Paths  Steps   Const.  Leaves  Time (s)  DC(s)  C&C  Exfil.  Inject  Drop  Spy  Persist  Anti-Analy.  FP  FN
Packed From Table 1  UPX (Type-I)        Marap      227    465.95  25.74   3.01    97.39     0.94   -    ✓       ✓       -     ✓    -        -            0   0
Packed From Table 1  UPX (Type-I)        AVCrypt    59     184.69  23.53   2.00    27.91     0.84   ✓    -       -       -     -    ✓        ✓            0   0
Packed From Table 1  UPX (Type-I)        ATI        115    179.44  19.89   3.17    56.90     0.83   -    -       -       ✓     -    ✓        ✓            0   0
Stress Test Packers  ASPack (Type-III)   RokRat     595    265.68  14.05   1.99    143.54    0.93   ✓    ✗       ✓       -     ✓    -        ✓            0   1
Stress Test Packers  ASPack (Type-III)   AcridRain  1410   330.39  26.82   2.84    247.47    0.88   ✓    ✓       -       ✓     -    ✗        -            0   1
Stress Test Packers  ASPack (Type-III)   AthenaGo   677    371.39  26.48   2.03    193.44    0.92   -    ✓       -       ✓     -    ✓        ✓            0   0
Stress Test Packers  Armadillo (Type-VI) RokRat     732    56.39   18.19   2.96    139.31    0.68   ✓    ✓       ✗       -     ✓    -        ✓            0   1
Stress Test Packers  Armadillo (Type-VI) AcridRain  338    226.30  23.70   3.42    93.34     0.84   ✓    ✓       -       ✓     -    ✗        -            0   1
Stress Test Packers  Armadillo (Type-VI) AthenaGo   701    55.21   18.17   2.66    107.42    0.67   -    ✓       -       ✗     -    ✓        ✓            0   1

Table 2: Packed malware evaluation results based on packer taxonomy found in Ugarte-Pedrero et al. [34].
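The per-packer averages quoted in Section 4.2 can be re-derived from the Paths and DC(s) columns of Table 2. The short sketch below does only that arithmetic; the variable names are hypothetical and the values are copied from the table rows for RokRat, AcridRain, and AthenaGo.

```python
from statistics import mean

# Paths explored per sample, from Table 2.
aspack_paths = [595, 1410, 677]        # ASPack (Type-III)
armadillo_paths = [732, 338, 701]      # Armadillo (Type-VI)

# Final DC(s) per sample for the Armadillo-packed samples, from Table 2.
armadillo_dc = [0.68, 0.84, 0.67]

print(round(mean(aspack_paths)))       # average paths for ASPack samples
print(round(mean(armadillo_paths)))    # average paths for Armadillo samples
print(round(mean(armadillo_dc), 2))    # average final DC(s) for Armadillo samples
```

These reproduce the 894 paths, 590 paths, and 0.73 final DC(s) stated in the text, which suggests the reported averages are taken over the three stress-test samples of each packer.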



Malware Family All Samples browsefox coinminer xtrat autoit expiro bifrose darkkomet rebhip dprotect llac delf
Total Samples 6,727 200 161 57 161 3428 69 163 80 398 68 65
C&C URL 30.5% 51% 47% 39% 32% 17%
File Exfiltration 11.3% 12% 17% 8%
Code Injection 32.7% 25% 44%
Dropper 41.0% 37% 23% 23% 11% 44% 33% 37% 26% 10%
Persistence 55.2% 52% 60% 67% 61% 63% 57% 46%
Key&Screen Spy 24.4% 40% 33%
Anti-analysis 29.4% 29% 34% 39%
Avg. Explore time(s) 291 218 234 196 124 310 128 326 285 227 134 420
Avg. APIs per path 26 21 18 12 9 17 29 13 45 28 13 11
Avg. States generated 1638 1196 1267 3450 950 1471 4601 670 897 1136 823 1568
DC (s) of final states 0.21 0.34 0.21 0.29 0.39 0.28 0.18 0.43 0.43 0.32 0.41 0.31

Table 4: Average Capability Forecasts and Metrics, featuring the 11 most prevalent malware families.

4.3 Tactics To Subvert Forecast

Category      Samples  Paths  Steps  C/P    Leaves  Flags
No Hash       10       2.00   21.50  3.00   16.00   100%
Hash-Guarded  10       74.70  45.15  19.00  3.40    100%
Tigress       2311     4.02   58.65  8.47   3.84    97%

Table 3: Averaged results of symbolic obfuscation evaluation. C/P denotes constraints per path.

Malware authors who are aware of Forecast may try to adopt advanced tactics to subvert our capability exploration. To evaluate Forecast against targeted attacks, we follow the set of obfuscation benchmarks proposed by Banescu et al. [35], [36]. These anti-analysis benchmarks are broken into two sets: a set of 10 hash-guarded programs that simulate license checking (Hash-Guarded) and a set of 2,311 Tigress-obfuscated programs (Tigress). Table 3 presents the results for three experiments, namely baseline (No Hash), Hash-Guarded, and Tigress. For the Hash-Guarded programs, we created a Forecast plugin that triggers when the license check is correct (captured Flag). For the No Hash programs, Forecast found the flag and explored 2 paths with an average of 21.5 steps per path, 3 constraints, and 16 leaf nodes per constraint AST. For the Hash-Guarded programs, Forecast found all flags and explored an average of 74.7 paths, with 45.15 steps and 19 constraints per path, and 3.4 leaves per constraint. For the Tigress-obfuscated programs, the code performs various transformations on the input and compares the derived value against an expected value that represents the correct license key.⁶ The results show that 97% of the flags were found and an average of 4.02 paths were explored, with an average of 58.65 steps per path, 8.47 constraints per path, and 3.84 leaves per constraint. These results empirically demonstrate that Forecast is resilient against adversarial obfuscation attacks targeting symbolic execution.

⁶ We excluded Tigress programs which crashed or did not print the flag during a natural execution with the correct argument.

4.4 Large-Scale Analysis

We show that Forecast is effective when applied to a larger set of memory images from 6,727 malware samples (covering 274 different malware families). Table 4 summarizes Forecast’s capability forecasts for the 6,727 samples and highlights metrics for the 11 most prevalent malware families in our dataset. The highest capability forecasts were recorded for Persistence and Dropper. We observed that over 70% of all 6,727 samples have Dropper and Persistence capabilities. When averaged, Persistence reports a 55.2% overall forecast, peaking at 67% for the Bifrose family. Our experiment revealed that the Bifrose family enters several registry Run keys in both the HKLM* and HKCU* registry directories, an aggressive means to force persistence across reboots compared to other families. Bifrose samples also drop a .dat executable file in Windows\System32 and connect to a no-ip.com C&C domain. The Dropper capability reported 41.0% overall, peaking at 44% for the expiro family. The lowest forecasts were for File Exfiltration, with 11.3% overall.

We observed fairly low variance between the highs and lows of forecasts within each family. Digging deeper, this is due to samples in the same family reusing the same features (e.g., dropper filenames). Samples in the Browsefox adware family drop an executable with a consistent file name format of “<random>Expance.exe” in the ProgramData directory. Our investigation found that it installs extensions to browsers to display ads, earning the attacker ad revenue. The Xtrat family of remote access trojans displays similar patterns of C&C domain names, namely <random>to.org. Concrete examples are zapto.org and hopto.org.

Exploration Metrics. Table 4 reveals interesting observations about the metrics reported by each malware family. The average exploration time for one memory image is 291 seconds, which shows that Forecast is efficient as an offline investigation tool. Forecast revealed an average of 26 unique APIs per
memory image and generated 1,638 states on average per sample. The Bifrose family reported the largest number of states per sample (4,601), while Darkkomet generated the lowest (670 states per sample).

The average DC(s) for end states was 0.21, which indicates that states toward the end were more symbolic than concrete. Bifrose reported the lowest DC(s) of 0.18, indicating a very symbolic ending. This was the general observation for most C&C-based malware, since simulating socket calls introduces more symbolic data, causing DC(s) to drop. Darkkomet and Rebhip tied for the highest DC(s) with 0.43. This confirms the correlation between DC(s) and cumulative states coverage. Samples in the Delf family reported the maximum exploration time (420 seconds on average), which explains their high average states (1568).

4.5 Pre-Staged Concrete Input

Recall that when no concrete input data exists, pure symbolic analysis will explore all paths. The DC(s) model assumes that following paths that involve pre-staged concrete data in the memory image focuses Forecast on the most urgent payloads. We empirically evaluated this assumption with controlled experiments on 2 malicious and 3 benign programs: (1) LokiRAT, a remote access trojan, (2) xTBot, an IRC-based malware, (3) netstat, (4) ipconfig, and (5) arp. These were chosen because their source code is publicly available and their behavior for concrete inputs can be determined.⁷ We analyzed their source code and compiled binaries to establish the ground truth set of paths that selected concrete inputs will cause the program to take. For LokiRAT and xTBot, we determined all specific paths that the malware could take when it receives certain commands from its C&C server. For netstat, ipconfig, and arp, we determined all specific paths that the programs could take when executed with a set of command-line flags. Figure 4 illustrates an example.

Table 5 shows these programs and each of the concrete data we investigated. For netstat, ipconfig, and arp, we executed each program with the command-line flags shown in Table 5 and took a memory image when main was called to ensure the flags exist in the memory image as concrete data. For these experiments, we obtained 9 memory images (3 command-line flags for each of the 3 programs). For LokiRAT and xTBot, we executed each sample, injected each selected C&C command, and captured memory images as soon as they received each C&C command (6 in total). The intuition here is that Forecast should produce the same paths as each ground truth set for the corresponding memory images.

⁷ Forecast did not have access to the ground truth.

Figure 4: LokiRAT ground truth. P_TRUTH−regnewkey, P_TRUTH−message, and P_TRUTH−rename represent the ground truth set of paths for each LokiRAT C&C command (regnewkey, message, rename).

                          Ground Truth  Forecast Results
Malware   C&C Cmds        Paths         Paths  TP  FP  FN  Acc(%)
LokiRAT   regnewkey       4             5      4   1   0   80
LokiRAT   message         4             4      4   0   0   100
LokiRAT   rename          2             2      2   0   0   100
xTBot     .ntstats        1             1      1   0   0   100
xTBot     .netinfo        2             2      2   0   0   100
xTBot     .sysinfo        28            30     27  2   1   90.0
Benign    Argument        Paths         Paths  TP  FP  FN  Acc(%)
netstat   -a              3             3      3   0   0   100
netstat   -e              3             3      3   0   0   100
netstat   -r              2             2      2   0   0   100
ipconfig  -release        4             4      4   0   0   100
ipconfig  -renew          6             5      5   0   1   83.3
ipconfig  -no-flag        19            18     16  2   1   84.2
arp       -a              6             6      6   0   0   100
arp       -d 10.1.1.1     8             7      7   0   1   87.5
arp       -s :cf:b8:20    11            12     10  1   1   83.3

Table 5: Exploration Based on Pre-Staged Input.

Table 5 shows that, for the malware, Forecast discovered 40 out of 41 ground truth paths, with 3 FP and 1 FN, giving an accuracy of 95.0%. For the benign programs, Forecast discovered 56 out of 62 ground truth paths, with 3 FP and 4 FN, giving an accuracy of 93.1%. We found that the FP results were caused by short paths that were pruned when they accessed illegal memory. FNs were caused by symbolic IP values due to unconstrained jump targets. Overall, Forecast attained an accuracy of 94.0%. This shows that Forecast’s exploration of memory images using pre-staged inputs is accurate.

4.6 Comparing Existing Techniques

We empirically compared Forecast with S2E [6], angr [22], and Triton [23]. We found that Forecast outperforms them at identifying malware capabilities based on the coverage of capability paths (i.e., code paths where at least one capability is found). Since they cannot take a memory image as input, with the exception of angr, we provided the malware binary and configured them to start from an equivalent IP as
Forecast. We used 50 samples for this experiment.⁸ As shown in Table 6, Forecast identified more than twice the capabilities compared to angr, S2E, and Triton. Forecast explored as many as 877 paths per sample on average. By leveraging prior execution state to optimize paths, only 28 paths were terminated due to path explosion, compared to 521 by angr and 57 by S2E. Although angr explored the most paths (1292), it terminated 521 due to path explosion. We observed that angr could not concretize paths when faced with early symbolic control flow, causing state explosion. The exploration time for angr was relatively low (236s) because many paths quickly became unconstrained and terminated. Forecast reported a higher runtime of 301s due to the overhead of computing probability scores for each state.

S2E requires symbolic variables to be manually induced for multi-path exploration. When we initially tested S2E with malware, we traced only a single path. However, to enable S2E to explore multiple paths, we symbolized the arguments of the malware’s local functions and only traced paths that originated from the malware code. This led to an exploration of 602 paths, where 57 became unconstrained and terminated. S2E had the fastest average runtime (98s) because it executes code natively on the CPU. Triton uses a per-input iterative approach to code exploration, hence the path explosion metric is not applicable. To trace multiple paths with Triton, we manually pushed new constraints to each path predicate, but Triton was heavily hindered by input requirements to explore new paths. Triton traced 229 paths on average, 3 of which identified capabilities. Due to its iterative nature and instruction-level emulation, it incurred the highest runtime of 522 seconds.

Tools        Exploration    Explored  Identified    Path explosion  Explore  Basic blocks
             techniques     paths     capabilities  instances       time(s)  covered
Forecast     Data-Guided    877       32            28              301      12488
angr [22]    Pure Symbolic  1292      11            521             236      14567
S2E [6]      Concolic       602       7             57              98       10007
Triton [23]  Concolic       229       3             N/A             522      4309

Table 6: Forecast Compared to Existing Techniques.

⁸ Hashes and capability addresses are in Table 9 in Appendix A.

5 Related Work

Prior work uses symbolic execution for various applications including test case generation [8]–[10], [27], [37]–[40], vulnerability detection [9], [41], [42], and enhancing dynamic malware analysis [3], [5], [43], but often relies on simplistic heuristics to optimize symbolic execution. FuzzBall [44] initializes the program states to concretize constraints, while MAYHEM [27] applies on-line and off-line concolic execution to manage path exploration. However, Forecast reduces path explosion by using the DC(s) framework to identify capability-relevant paths. Additionally, Forecast does not require an intact binary file or prior knowledge of a program’s input and environment, which avoids restrictive assumptions for symbolic execution.

For malware applications, prior works use full-system emulation [4], dynamic analysis [5], [45], and Win API simulation [46] to identify malware capabilities. Yadegari et al. [47] study the robustness of symbolic analysis techniques against malware obfuscation. In contrast, Forecast is a post-detection approach that combines both symbolic analysis and memory forensics to identify staged malware capabilities. Prior work on memory forensics focuses on kernel objects [48], [49], access patterns to kernel objects [50]–[54], and dynamic memory traces [55], [56] to detect and remediate rootkit malware. DSCRETE [57] leverages memory image code reuse for interpreting single data structures. Similarly, for mobile security, prior works [58]–[61] analyze a mobile application’s memory to recover artifacts related to recent activities. However, Forecast relies on memory artifacts to contextualize malware behavior through symbolic analysis and surgically analyzes a single target malicious process.

Provenance-based investigation techniques are also related to Forecast. NoDoze [62] and Hassan et al. [63] utilize Windows and Linux system events to prioritize alerts through a network diffusion approach using temporal ordering. Similarly, HOLMES [64] correlates suspicious events by examining information flows, and TARDIS [65] identifies compromised websites through a spatial-temporal approach to present attack tactics for analysts. Attack2Vec [66] uses system event embedding to derive emerging attack tactics. Forecast uses the DC(s) model to predict in-progress malware capabilities using a similar network diffusion approach [62]–[64] but instead identifies relevant paths based on the execution context of a malware.

6 Limitations and Discussion

Subverting Symbolic Analysis. An adversary may target the symbolic execution component of Forecast by exploiting path explosion, path divergence, and constructing complex constraints. In §4.3, we turned to the published literature on symbolic analysis benchmarks [35] and found that Forecast is robust against these attacks. However, we acknowledge that a novel attack, not considered in the literature, may subvert Forecast’s results.



Subverting Memory Artifacts. An adversary may target memory acquisition or memory artifacts to subvert Forecast. The memory acquisition depends on the IDS, which Forecast has no control over. It is reasonable to assume that the IDS will detect and capture a malware while the malware is executing malicious routines (which produced the detected signature). To tamper with memory artifacts, an adversary can obfuscate code segments, use a non-standard stack layout, or insert junk code/data. Forecast was shown to be resilient to junk code/data produced by Type-III packers in §4.2. If Forecast is affected by an attack that subverts code analysis, Forecast could be extended to handle specific memory manipulation attacks by porting IDA microcodes to flatten obfuscated code structures [67].

Virtualization-Based (VM) Packing. Generally, like any symbolic exploration framework, Forecast cannot explore capabilities in packed code unless it is unpacked. As our evaluation in §4.2 shows, Forecast can handle Type-I, Type-III, and Type-VI packers as outlined in Ugarte-Pedrero et al. [34]. Some packers use virtualization to convert programs into bytecode and use an interpreter to run the bytecode. Due to the complexity of virtualization, Forecast cannot handle such techniques, which account for less than 2% of packed malware [34].

7 Conclusion

Forecast overcomes the high cognitive burden on an analyst by forecasting future malware capabilities. Forecast integrates memory forensics and symbolic analysis in a feedback loop to efficiently explore malware with context. Our evaluation has shown that Forecast produces accurate forecasts of capabilities.

Acknowledgments

The authors would like to thank the anonymous reviewers for their constructive comments and feedback. We also thank Dr. Nolen Scaife for his guidance while shepherding this paper. This work was supported, in part, by ONR under Award N00014-19-1-2179 and NSF under Award 1755721. Any opinions, findings, and conclusions in this paper are those of the authors and do not necessarily reflect the views of our sponsors or collaborators.

References

[1] Fileless attacks against enterprise networks, https://securelist.com/fileless-attacks-against-enterprise-networks/77403/, 2017.
[2] The Darkhotel APT: A Story of Unusual Hospitality, https://media.kasperskycontenthub.com/wp-content/uploads/sites/43/2018/03/08070903/darkhotel_kl_07.11.pdf, 2014.
Adversarial Aware Attacks. An adversary that is [3] U. Bayer, A. Moser, C. Kruegel, and E. Kirda, “Dynamic
aware of Forecast can influence the analysis via two analysis of malicious code,” Journal in Computer Virology,
factors: memory frame replacement and pointer vol. 2, no. 1, 2006.
obfuscation. First, memory frame replacement can [4] D. Brumley, C. Hartwig, M. G. Kang, Z. Liang,
J. Newsome, P. Poosankam, D. Song, and H. Yin,
subvert Forecast for specific samples using Armadillo
“Bitscope: Automatically dissecting malicious binaries,”
with CopyMem-II protection due to the iterative Technical Report, School of Computer Science, Carnegie
unpacking and execution sequence. This unpacks code Mellon University, vol. CS-07-133, 2007.
at a memory frame-level granularity, which limits the [5] A. Moser, C. Kruegel, and E. Kirda, “Exploring Multiple
visibility of the malicious code to the most recent Execution Paths for Malware Analysis,” in Proceedings of
unpacked memory frame. This artifact is evident from the 28th Symposium on Security and Privacy (Oakland),
Oakland, CA, May 2007.
our evaluation in Table 2. Second, pointer obfuscation
[6] V. Chipounov, V. Kuznetsov, and G. Candea, “S2E: A
creates additional overhead for the symbolic execution
platform for in-vivo multi-path analysis of software
engine, which drops the degree of concreteness (DC (s)) systems,” in Proceedings of the 16th ACM International
metric. An attacker can heavily utilize pointer Conference on Architectural Support for Programming
obfuscation by relying on a unique seed in memory to Languages and Operating Systems (ASPLOS), Newport
Beach, CA, Mar. 2011.
deobfuscate pointers.
Heavy obfuscation of memory artifacts can and does [7] I. Yun, S. Lee, M. Xu, Y. Jang, and T. Kim, “QSYM: A
Practical Concolic Execution Engine Tailored for Hybrid
affect the performance and stability of the malware, Fuzzing,” in Proceedings of the 27th USENIX Security
which may not be in the favor of the malware operator. Symposium (Security), Baltimore, MD, Aug. 2018.
Not surprisingly, Ugarte-Pedrero et al. [34] finds such [8] K. Sen, D. Marinov, and G. Agha, “CUTE: A concolic unit
heavy obfuscation in only 1.8% of in-the-wild malware. testing engine for C,” in Proceedings of the ACM SIGSOFT
Finally, we emphasize that the quality of the memory Software Engineering Notes, Lisbon, Portugal, Sep. 2005.
capture is dependent on the detection tool, independent [9] C. Cadar, V. Ganesh, P. M. Pawlowski, D. L. Dill, and
of Forecast. Forecast is a post-detection approach D. R. Engler, “EXE: Automatically generating inputs of
death,” ACM Transactions on Information and System
that relies on a forensic memory capture to perform Security, vol. 12, no. 2, 2008.
capability prediction.



[10] P. Godefroid, N. Klarlund, and K. Sen, “DART: Directed automated random testing,” in Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Chicago, IL, Jun. 2005.
[11] P. M. Comparetti, G. Salvaneschi, E. Kirda, C. Kolbitsch, C. Kruegel, and S. Zanero, “Identifying dormant functionality in malware programs,” in Proceedings of the 31st Symposium on Security and Privacy (Oakland), Oakland, CA, May 2010.
[12] L. Martignoni, E. Stinson, M. Fredrikson, S. Jha, and J. C. Mitchell, “A layered architecture for detecting malicious behaviors,” in International Workshop on Recent Advances in Intrusion Detection, Springer, 2008, pp. 78–97.
[13] K. A. Roundy and B. P. Miller, “Hybrid analysis and control of malware,” in Proceedings of the 13th International Symposium on Research in Attacks, Intrusions and Defenses (RAID), Ottawa, Canada, Sep. 2010.
[14] C. Kolbitsch, T. Holz, C. Kruegel, and E. Kirda, “Inspector gadget: Automated extraction of proprietary gadgets from malware binaries,” in Proceedings of the 31st Symposium on Security and Privacy (Oakland), Oakland, CA, May 2010.
[15] Non-Malware Attacks and Ransomware Take Center Stage in 2016, https://www.carbonblack.com/wp-content/uploads/2016/12/16_1214_Carbon_Black-_Threat_Report_Non-Malware_Attacks_and_Ransomware_FINAL.pdf, 2016.
[16] Q. Wang, W. U. Hassan, D. Li, K. Jee, X. Yu, K. Zou, J. Rhee, Z. Chen, W. Cheng, C. Gunter, and H. Chen, “You are what you do: Hunting stealthy malware via data provenance analysis,” in Proceedings of the 2020 Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb. 2020.
[17] M. I. Sharif, A. Lanzi, J. T. Giffin, and W. Lee, “Impeding malware analysis using conditional code obfuscation,” in Proceedings of the 15th Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb. 2008.
[18] D. Balzarotti, M. Cova, C. Karlberger, E. Kirda, C. Kruegel, and G. Vigna, “Efficient detection of split personalities in malware,” in Proceedings of the 17th Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb. 2010.
[19] FireEye: Endpoint Forensics, https://www.fireeye.com/products/mir-endpoint-forensics.html, [Accessed: 2018-02-28].
[20] B. D. Carrier and J. Grand, “A hardware-based memory acquisition procedure for digital investigations,” Digital Investigation, vol. 1, 2004.
[21] S. Vömel and F. C. Freiling, “A survey of main memory acquisition and analysis techniques for the windows operating system,” Digital Investigation, 2011.
[22] Y. Shoshitaishvili, R. Wang, C. Salls, N. Stephens, M. Polino, A. Dutcher, J. Grosen, S. Feng, C. Hauser, C. Kruegel, and G. Vigna, “SoK: (State of) The Art of War: Offensive Techniques in Binary Analysis,” in Proceedings of the 37th Symposium on Security and Privacy (Oakland), San Jose, CA, May 2016.
[23] Triton: A Dynamic Symbolic Execution Framework, SSTIC, 2015, pp. 31–54.
[24] Volatility: Open Source Memory Forensics Framework, https://www.volatilityfoundation.org, 2019.
[25] Y. Li, Z. Su, L. Wang, and X. Li, “Steering symbolic execution to less traveled paths,” in Proceedings of the 2013 Annual ACM SIGPLAN International Conference on Object Oriented Programming, Systems, Languages & Applications (OOPSLA), Indianapolis, IN, Oct. 2013.
[26] V. Kuznetsov, J. Kinder, S. Bucur, and G. Candea, “Efficient state merging in symbolic execution,” ACM SigPlan Notices, vol. 47, no. 6, pp. 193–204, 2012.
[27] S. K. Cha, T. Avgerinos, A. Rebert, and D. Brumley, “Unleashing MAYHEM on Binary Code,” in Proceedings of the 33rd Symposium on Security and Privacy (Oakland), San Francisco, CA, May 2012.
[28] T. Avgerinos, A. Rebert, S. K. Cha, and D. Brumley, “Enhancing symbolic execution with veritesting,” in Proceedings of the 36th International Conference on Software Engineering (ICSE), Hyderabad, India, May 2014.
[29] C. Kolbitsch, P. M. Comparetti, C. Kruegel, E. Kirda, X. Zhou, and X. Wang, “Effective and Efficient Malware Detection at the End Host,” in Proceedings of the 18th USENIX Security Symposium (Security), Montreal, Canada, Aug. 2009.
[30] H. Lim, “Detecting Malicious Behaviors of Software through Analysis of API Sequence k-grams,” Computer Science and Information Technology, vol. 4, no. 3, pp. 85–91, 2016.
[31] Malpedia: Free and Open Malware Reverse Engineering Resource offered by Fraunhofer FKIE, https://malpedia.caad.fkie.fraunhofer.de, [Accessed: 2019-01-28].
[32] Malware Archaeology: Malware Discovery, Education, Training, Active Defense, Detection and Response, https://www.malwarearchaeology.com/analysis, [Accessed: 2019-01-28].
[33] MITRE ATT&CK Framework: A globally-accessible knowledge base of adversary tactics and techniques based on real-world observations. https://attack.mitre.org/software/, [Accessed: 2019-04-20].
[34] X. Ugarte-Pedrero, D. Balzarotti, I. Santos, and P. G. Bringas, “SoK: Deep Packer Inspection: A Longitudinal Study of the Complexity of Run-Time Packers,” in Proceedings of the 36th Symposium on Security and Privacy (Oakland), San Jose, CA, May 2015.
[35] S. Banescu, C. Collberg, V. Ganesh, Z. Newsham, and A. Pretschner, “Code obfuscation against symbolic execution attacks,” in Proceedings of the 32nd Annual Computer Security Applications Conference (ACSAC), 2016.
[36] S. Banescu, C. Collberg, V. Ganesh, Z. Newsham, and A. Pretschner, Obfuscation benchmarks, 2016. [Online]. Available: https://github.com/tum-i22/obfuscation-benchmarks.
[37] J. C. King, “Symbolic execution and program testing,” Communications of the ACM, vol. 19, no. 7, 1976.
[38] R. S. Boyer, B. Elspas, and K. N. Levitt, “SELECT — a formal system for testing and debugging programs by symbolic execution,” ACM SigPlan Notices, vol. 10, no. 6, pp. 234–245, 1975.
[39] W. E. Howden, “DISSECT — A symbolic evaluation and program testing system,” IEEE Transactions on Software Engineering, no. 4, pp. 266–278, 1978.
[40] C. Cadar and D. Engler, “Execution generated test cases: How to make systems code crash itself,” in Proceedings of the International SPIN Workshop on Model Checking of Software, San Francisco, CA, Aug. 2005.



[41] C. Cadar, P. Godefroid, S. Khurshid, C. S. Păsăreanu, K. Sen, N. Tillmann, and W. Visser, “Symbolic execution for software testing in practice: Preliminary assessment,” in Proceedings of the 33rd International Conference on Software Engineering (ICSE), Honolulu, HI, May 2011.
[42] V. Chipounov, V. Georgescu, C. Zamfir, and G. Candea, “Selective Symbolic Execution,” in Proceedings of the 5th Workshop on Hot Topics in System Dependability (HotDep), Estoril, Portugal, Jun. 2009.
[43] D. Brumley, C. Hartwig, Z. Liang, J. Newsome, D. Song, and H. Yin, “Automatically identifying trigger-based behavior in malware,” in Botnet Detection, Springer, 2008, pp. 65–88.
[44] L. Martignoni, S. McCamant, P. Poosankam, D. Song, and P. Maniatis, “Path-exploration lifting: Hi-fi tests for lo-fi emulators,” in Proceedings of the 17th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), London, UK, Mar. 2012.
[45] F. Peng, Z. Deng, X. Zhang, D. Xu, Z. Lin, and Z. Su, “X-Force: Force-Executing Binary Programs for Security Applications,” in Proceedings of the 23rd USENIX Security Symposium (Security), San Diego, CA, Aug. 2014.
[46] R. Baldoni, E. Coppa, D. C. D’Elia, and C. Demetrescu, “Assisting Malware Analysis with Symbolic Execution: A Case Study,” in Proceedings of the International Conference on Cyber Security Cryptography and Machine Learning (CSCML), Israel, Jun. 2017.
[47] B. Yadegari and S. Debray, “Symbolic Execution of Obfuscated Code,” in Proceedings of the 22nd ACM Conference on Computer and Communications Security (CCS), Denver, Colorado, Oct. 2015.
[48] M. Carbone, W. Cui, L. Lu, W. Lee, M. Peinado, and X. Jiang, “Mapping kernel objects to enable systematic integrity checking,” in Proceedings of the 16th ACM Conference on Computer and Communications Security (CCS), Chicago, Illinois, Nov. 2009.
[49] W. Cui, M. Peinado, Z. Xu, and E. Chan, “Tracking Rootkit Footprints with a Practical Memory Analysis System,” in Proceedings of the 21st USENIX Security Symposium (Security), Bellevue, WA, Aug. 2012.
[50] J. Rhee, R. Riley, Z. Lin, X. Jiang, and D. Xu, “Data-Centric OS kernel malware characterization,” IEEE Transactions on Information Forensics and Security, vol. 9, 2014.
[51] B. Dolan-Gavitt, T. Leek, J. Hodosh, and W. Lee, “Tappan zee (north) bridge: Mining memory accesses for introspection,” in Proceedings of the 20th ACM Conference on Computer and Communications Security (CCS), Berlin, Germany, Oct. 2013.
[52] Z. Lin, X. Zhang, and D. Xu, “Automatic reverse engineering of data structures from binary execution,” in Proceedings of the 17th Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb. 2010.
[53] A. Slowinska, T. Stancescu, and H. Bos, “Howard: a Dynamic Excavator for Reverse Engineering Data Structures,” in Proceedings of the 18th Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb. 2011.
[54] Q. Feng, A. Prakash, H. Yin, and Z. Lin, “Mace: High-coverage and robust memory analysis for commodity operating systems,” in Proceedings of the 30th Annual Computer Security Applications Conference (ACSAC), 2014.
[55] M. Polino, A. Scorti, F. Maggi, and S. Zanero, “Jackdaw: Towards Automatic Reverse Engineering of Large Datasets of Binaries,” in Proceedings of the Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA), Milan, IT, Jul. 2015.
[56] Z. Xu, J. Zhang, G. Gu, and Z. Lin, “Autovac: Automatically extracting system resource constraints and generating vaccines for malware immunization,” in Proceedings of the 33rd International Conference on Distributed Computing Systems (ICDCS), 2013.
[57] B. Saltaformaggio, Z. Gu, X. Zhang, and D. Xu, “DSCRETE: Automatic Rendering of Forensic Information from Memory Images via Application Logic Reuse,” in Proceedings of the 23rd USENIX Security Symposium (Security), San Diego, CA, Aug. 2014.
[58] B. Saltaformaggio, R. Bhatia, X. Zhang, D. Xu, and G. G. Richard III, “Screen after previous screens: Spatial-temporal recreation of android app displays from memory images,” in Proceedings of the 25th USENIX Security Symposium (Security), Austin, TX, Aug. 2016.
[59] R. Bhatia, B. Saltaformaggio, S. J. Yang, A. Ali-Gombe, X. Zhang, D. Xu, and G. G. Richard III, “"Tipped Off by Your Memory Allocator": Device-Wide User Activity Sequencing from Android Memory Images,” in Proceedings of the 2018 Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb. 2018.
[60] B. Saltaformaggio, R. Bhatia, Z. Gu, X. Zhang, and D. Xu, “GUITAR: Piecing Together Android App GUIs from Memory Images,” in Proceedings of the 22nd ACM Conference on Computer and Communications Security (CCS), Denver, Colorado, Oct. 2015.
[61] B. Saltaformaggio, R. Bhatia, Z. Gu, X. Zhang, and D. Xu, “VCR: App-Agnostic Recovery of Photographic Evidence from Android Device Memory Images,” in Proceedings of the 22nd ACM Conference on Computer and Communications Security (CCS), Denver, Colorado, Oct. 2015.
[62] W. U. Hassan, S. Guo, D. Li, Z. Chen, K. Jee, Z. Li, and A. Bates, “NoDoze: Combatting Threat Alert Fatigue with Automated Provenance Triage,” in Proceedings of the 2019 Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb. 2019.
[63] W. U. Hassan, A. Bates, and D. Marino, “Tactical provenance analysis for endpoint detection and response systems,” in Proceedings of the 41st Symposium on Security and Privacy (Oakland), Online Conference, May 2020.
[64] S. M. Milajerdi, R. Gjomemo, B. Eshete, R. Sekar, and V. Venkatakrishnan, “Holmes: real-time APT detection through correlation of suspicious information flows,” in Proceedings of the 40th Symposium on Security and Privacy (Oakland), San Francisco, CA, May 2019.
[65] R. P. Kasturi, Y. Sun, R. Duan, O. Alrawi, E. Asdar, V. Zhu, Y. Kwon, and B. Saltaformaggio, “TARDIS: Rolling back the clock on CMS-targeting cyber attacks,” in Proceedings of the 41st Symposium on Security and Privacy (Oakland), Online Conference, May 2020.
[66] Y. Shen and G. Stringhini, “Attack2vec: Leveraging temporal word embeddings to understand the evolution of cyberattacks,” in Proceedings of the 28th USENIX Security Symposium (Security), Santa Clara, CA, Aug. 2019.
[67] R. Rolles, Rolfrolles/hexraysdeob, https://github.com/RolfRolles/HexRaysDeob, Jun. 2018.



A Appendix: Additional Technical Material

Capability Plugin: File Exfiltration
  Tracked APIs (Reverse Order): Send(socket, buf); Socket(socket); ReadFile(hFile, buf); OpenFile(lpFname)
  Tracked Parameters: socket <- Socket(socket); buf <- ReadFile(hFile, buf); hFile <- OpenFile(lpFname)
  Description: Exfiltration functionality tracks back from the Send function by tracking Socket creation, file access (OpenFile and ReadFile), and the parameters associated with each API.

Capability Plugin: Code Injection
  Tracked APIs (Reverse Order): SetThreadContext(pContext); WriteProcessMemory(hProcess); VirtualAllocEx(hProcess); ZWUnMapViewOfSection(pHandle); GetThreadContext(pContext); CreateProcess(appName)
  Tracked Parameters: pContext <- GetThreadContext(pContext); hProcess <- CreateProcess(appName); pHandle <- hProcess
  Description: This code injection technique is known as process hollowing. The plugin tracks from SetThreadContext with WriteProcessMemory and VirtualAllocEx to identify parameter constraints tying back to pContext, hProcess, and pHandle.

Capability Plugin: Dropper
  Tracked APIs (Reverse Order): CreateProcess(lpApplicationName); SetFileAttribute(lpFileName); WriteFile(hFile); CreateFile(lpFileName)
  Tracked Parameters: hFile <- CreateFile(lpFileName); lpFileName <- lpFileName; lpApplicationName <- lpFileName
  Description: This plugin tracks code that writes a file to disk by creating a file handle based on a filename. It then tracks an attribute modification that sets the property for execution. Finally, it tracks the filename and path used in the process creation to make up the dropper capability.

Capability Plugin: Key & Screen Spy
  Tracked APIs (Reverse Order): FromHbitmap(hBitmap); SelectObject(hDCDest); CreateCompatibleDC(hDCSource); CreateCompatibleBitmap(hDCSource); GetWindowDC(Whandle); GetDesktopWindow()
  Tracked Parameters: hBitmap <- CreateCompatibleBitmap(hDCSource); hDCDest <- CreateCompatibleDC(hDCSource); hDCSource <- GetWindowDC(Whandle); Whandle <- GetDesktopWindow()
  Description: This plugin tracks screen capture capability by identifying a handle to a bitmap object that is constrained on a handle to a device context object. It then constrains that handle to a window handle object that is created by referencing the user window.

Capability Plugin: Persistence
  Tracked APIs (Reverse Order): GetFullPathNameA(lpFileName); RegOpenKeyEx(lpSubKey); RegSetValueEx(hKey, lpData)
  Tracked Parameters: lpData <- GetFullPathNameA(lpFileName); hKey <- RegOpenKeyEx(lpSubKey:str-match)
  Description: This plugin tracks a persistence method that relies on registry keys. Specifically, we track constraints on the file path value set to a key, and on the subkey value, by matching HKLM, HKCU, ControlSet, and Run against the registry key handle.

Capability Plugin: Anti-Analysis
  Tracked APIs (Reverse Order): InternetGetConnectedStates; GetConnectedProfiles; GetConnectivity; InternetAttemptConnect; OutputDebugString; IsDebuggerPresent; CheckRemoteDebuggerPresent; CreateToolhelp32Snapshot; EnumProcesses; cpuid (instruction)
  Tracked Parameters: None specified
  Description: This plugin applies no parameter constraints to identify and track anti-analysis capability. Since Forecast assumes the memory image under analysis is suspicious or malicious (detected by a HIDS), Forecast simply searches for any invocation of these Windows API functions to track anti-analysis capability.

Capability Plugin: C&C Comm.
  Tracked APIs (Reverse Order): InternetConnectA(lpszServerName); InternetCheckConnectionA(lpszUrl); IWinHttpRequest::Open(Url)
  Tracked Parameters: lpszServerName - IP/Domain regex-match; lpszUrl - IP/Domain regex-match; Url - IP/Domain regex-match
  Description: This plugin applies a regex-match constraint to the parameters of select network-based APIs to identify and track C&C communication. Specifically, the plugin tracks Internet-routable IP addresses and valid domain names.

Table 7: Capability identification and tracking is a modular component of Forecast. Analysts can build additional
capability plugins to help in future investigations by identifying APIs and parameter constraints that make up the
capability. The parameter constraints are tracked through data flow analysis and backward slicing.
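
To make the plugin structure in Table 7 concrete, the following is a minimal Python sketch, not Forecast's implementation. It assumes a hypothetical, already-recovered API trace (the `ApiCall` records) and replaces data flow analysis and backward slicing with simple value equality between calls. The names `ApiCall`, `Plugin`, `match`, and the `IP_OR_DOMAIN` pattern are illustrative assumptions, and the C&C plugin here tracks only one of the three APIs listed in the table.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ApiCall:
    """One call in a hypothetical API trace; the key 'ret' holds the return value."""
    name: str
    values: dict


@dataclass
class Plugin:
    capability: str
    apis_reverse: list                           # tracked APIs, last call first
    links: list = field(default_factory=list)    # (consumer, param, producer, key)
    regexes: dict = field(default_factory=dict)  # (api, param) -> compiled pattern


def match(plugin, trace):
    """Walk the trace backward; return True if every tracked API and constraint holds."""
    found = {}
    pos = len(trace)
    for api in plugin.apis_reverse:
        # Only consider calls that happened before the previously matched call.
        idx = next((i for i in range(pos - 1, -1, -1) if trace[i].name == api), None)
        if idx is None:
            return False
        found[api] = trace[idx]
        pos = idx
    # Stand-in for data flow tracking: the consumer's parameter must equal
    # the value produced by the earlier call.
    for consumer, param, producer, key in plugin.links:
        value = found[consumer].values.get(param)
        if value is None or value != found[producer].values.get(key):
            return False
    for (api, param), pattern in plugin.regexes.items():
        if not pattern.search(str(found[api].values.get(param, ""))):
            return False
    return True


# File Exfiltration row of Table 7.
FILE_EXFILTRATION = Plugin(
    capability="File Exfiltration",
    apis_reverse=["Send", "Socket", "ReadFile", "OpenFile"],
    links=[
        ("Send", "socket", "Socket", "ret"),
        ("Send", "buf", "ReadFile", "buf"),
        ("ReadFile", "hFile", "OpenFile", "ret"),
    ],
)

# Simplified IPv4-or-domain pattern; an assumption, not the paper's regex.
IP_OR_DOMAIN = re.compile(
    r"^(?:\d{1,3}(?:\.\d{1,3}){3}|(?:[a-z0-9-]+\.)+[a-z]{2,})$", re.I
)

CNC_COMM = Plugin(
    capability="C&C Comm.",
    apis_reverse=["InternetConnectA"],
    regexes={("InternetConnectA", "lpszServerName"): IP_OR_DOMAIN},
)

# Hypothetical trace in execution order.
trace = [
    ApiCall("OpenFile", {"lpFname": "C:\\secret.txt", "ret": 0x10}),
    ApiCall("ReadFile", {"hFile": 0x10, "buf": 0x5000}),
    ApiCall("Socket", {"ret": 0x20}),
    ApiCall("Send", {"socket": 0x20, "buf": 0x5000}),
]

print(match(FILE_EXFILTRATION, trace))  # True
```

Keeping the tracked APIs in reverse order mirrors the table: matching starts from the final, capability-defining call (for example, Send) and works back to the calls that produced its arguments. A new capability would be expressed in this sketch by adding another `Plugin` value rather than changing `match`.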

Sample Year Reported Hash (SHA 256)


rokrat 2018 4d37f80da97845129debf3244e1f731d2c93a02519f9fdaa059f5f124cf7c26f
7honest 2016 575e6fa02a54b9e3cd5977a66d09cf0e841d6efbe59be334056cf8fe8613194a
bokbot 2019 62b7fbffd000a8d747c55260f0b867d09bc4ad19b2b657fb9ee3744c12b87257
AcridRain 2018 7b045eec693e5598b0bb83d21931e9259c8e4825c24ac3d052254e4925738b43
AthenaGo 2016 af385c983832273390bb8e72a9617e89becff2809a24a3c76646544375f21d14
AdamLocker 2016 0fb2e4bdd84c3ae8af8fb255ad4f5d093bc10544684bff739ccc985ebd4e64cb
Marap 2018 5859a21be4ca9243f6adf70779e6986f518c3748d26c427a385efcd3529d8792
Abaddon 2015 7cfc340ed0bd2af138c4b2b85c19693755a9c9ea798028d1a17d0cfcc61b5a3a
ATI 2016 b101cd29e18a515753409ae86ce68a4cedbe0d640d385eb24b9bbb69cf8186ae
TeslaAgent 2018 c2cae82e01d954e3a50feaebcd3f75de7416a851ea855d6f0e8aaac84a507ca3
Andromeda 2013 f20355d0e3689bf7e8540c6881cb5299e36c5342a3679dd54d206c4ff4f8b979
AVCrypt 2018 58c7c883785ad27434ca8c9fc20b02885c9c24e884d7f6f1c0cc2908a3e111f2
AveMaria 2018 81043261988c8d85ca005f23c14cf098552960ae4899fc95f54bcae6c5cb35f1
Aveo 2016 9dccfdd2a503ef8614189225bbbac11ee6027590c577afcaada7e042e18625e2
Table 8: Malware Samples Used In The Forecasting Evaluation (§4.1).



Sample Hash Capture IP Capability End Address(es)
f9c6db5331051aa487b706f0616f3287a40a27606bfddc804b3c4684d4203717 0x140005057 0x140008060
59b9d061ff78c240e1e0e8135d9be482e0fe788186b6cb940f56c67798a862df 0x14000515b 0x140008152
1eed6b168c2cd7701bf3a2aa6a30cf014cae9bc6ae813ef7356c5c6bc8ad6d18 0x1400050e7 0x1400080ec
471de9132673ec513b5c7c06a4bc1f67a7e91c6c8c7def55e9e03131ac5fb400 0x40109d 0x401374, 0x77244bb4
153fb1b9cd5dabffa3d123c4ac91abae46546db7447140df7b4aa1f2d3e8f59e 0x1400010e6 0x1400010f1, 0x77994bb4
ffec8e4a80182eb507489bcabd368d42489bf1ec871542c131df04c068d01a76 0x14000221f 0x140002516, 0x78204bb4
baa0f9e799a3d46ccb04c9d4520a69e58383b2d88aad8746f9214eaa8d3a06f3 0x14000f2b1 0x140011380, 0x14000f29e
ff64690b250faa9b1902b945f543a7b4ff9560cb562c0b18f3798538cc28178c 0x14000c2ac 0x14000fa10, 0x14000c299
dc616f2f6b1856454412ea608b96d3d6d7ab719684b6d04f0a79cf9228477d4f 0x140001408 0x1400013fd, 0x140001054
f2f9696ffea5b8cf3c1bf860a3d0704033b7693199cf097367a052144b0c350f 0x14000b089 0x1400234a9, 0x77314bb4
3025bf51ac1f1571e3f49ee1836d44f0cfd9bfcf6e39731f6fea0ddde33925a1 0x1400012d4 0x140007690, 0x78714bb4
e82b6a27a1aec373983f189cd422f1eeb336f1f493db341df5d090a4946feae8 0x14000159a 0x140012987, 0x7fefef911fd, 0x77984bb4, 0x14000de82, 0x14000de5f, 0x14000de8d
51c5668f052bbfb4ca9670413a240c8214264839211119543b28f90f86504edc 0x14000136c 0x1400043b5
f055f75abb82c9500b3f2cf64f6b546105177599b718304b3fc569e932533087 0x14000be19 0x14000e360, 0x14000be06
c06b359921a385efbf8ce33bd875a797d89f88c575fe640173429ce5a10b45ae 0x140001c56 0x140002e0e, 0x77ba4bb4
54b49a2faef8b8a6b8ef9bd96a44575403025e8c422ef8817d8cba6ea0344945 0x14000ec06 0x140010bc7, 0x14000ebf3
a785bc5be1fd3e9f6997f558a4e613b973769cc43c6e7b738158354b66390d06 0x1400012b5 0x140004daa, 0x78114bb4
fee18f402375b210fc7b89e29084fb8e478d5ee0f0cdb85d4618d14abb2e5197 0x14000faa9 0x140011e80, 0x14000fa96
f85abdfa7e8931686bbbb9bb0dd2e12ca10f28b8b1b7be2890eb19023c52232a 0x1400242b8 0x1400242c3, 0x77314bb4
ec72f1af9119754195a77cd890cc9e5ee1e555e9ef89fe2e535ee3e4ce2132cf 0x140011f10 0x140016387, 0x140011efd
a5c8d9df73b2ff360d22e879b678d323bbccd81cb9e0ef45cce4aaf4e37c7f27 0x140011516 0x14001163a, 0x77644bb4
58f9504b59b40dfbff5e3093af0a39def00b449c499ef3e7c0880ac986575f76 0x14000fda9 0x140012180, 0x14000fd96
6130a8c7595f6d9abc3dba157e8bd7596b11c9903296060e52d764a8719d7b84 0x140023d5a 0x77d0a358, 0x140023156, 0x77ba4bb4, 0x77d03e18, 0x140025d80
c9b27cbdc1b4258cd4103b3847e7de9c52985289ce4bd61323d69bf9c1e2a8c0 0x1400025bc 0x14000343c, 0x77874bb4
1edfad978a9e4beb24c2f51e9cf12424d415f5e9b5292279ac47b9f650495b31 0x140001041 0x14000174e, 0x140001045, 0x77e1a358, 0x1400016a0, 0x140001437
cab869f98ba3fe1948d2b48fa76fa4767fa7f31e28f3be2b34572ab0c63f942a 0x14000f8cd 0x140011ca0, 0x14000f8ba
600845916e82b6de80f9ff1d6a0553ff98bce6f41dc6029343821f095072fcee 0x14000a6c9 0x14001bbb9, 0x77644bb4
c2bda34d3ac4844ea377aa87b115b94019b98919de7d153029865efc969fd46d 0x140002a5e 0x76f53e18, 0x140002a62
b59c3d14968a9d7d90baa0df624339aa977dc98e5de1c7f6b71bef23606db769 0x14000d20d 0x14000f1a0, 0x14000d1fa
2dbd5d77540a1470459d74906d1668ae49fb275d834976fae1f31bbe74d8e168 0x140003df6 0x140010ba0, 0x76cd4bb4
e47b4147f8a51511b087f90ae07a4d0650b17a6ca2be5a7b19ad1c3f058fb15f 0x4038db 0x406a8b, 0x7fefbb1580a, 0x405b7a, 0x405b7f, 0x406a7f
6dabcf4ce36360826b381a80a7bd34d0df6612f37528e0086009a87bbc16ee57 0x401724 0x7fefdaa99e2, 0x7fefdad811e
b1dee4864ee0d67afb4889cdb0efe1ea54e1005debeb9ef4b4541848c23750c8 0x14000b079 0x14001f7e9, 0x77984bb4
f3fb1b8bd66a67e9f5e00895fb1fee886764c1fc65def4b0104eb7408973ee40 0x140004750 0x778c3e18, 0x77644bb4
b3f91bd440d63ff0b3a28e3fc444714088dc8f30160a6e5f8073594f7d9a6aa6 0x1400016ae 0x1400172b0, 0x7feff5d11fd, 0x77244bb4, 0x140010cdb, 0x140010ce9, 0x140010cb2
d12899958f7adc1be6a3f540f5a25a6ea5eb024dba018d7d3d0a1808df970323 0x140004ac2 0x140004c00
ae210c336cdfec7f7f523fa5b910981e2896f53184b3863621629e81cc0607ed 0x14000a747 0x78263e18, 0x14000a732
35b8a197bd6642f62af2b809ba72d8d7cc4ac18879f10cffeb8f2df66db93746 0x14000a6cd 0x14001f869, 0x140005d7e, 0x14000370e
c9ee386c3d2b8230d95870ce3391aa8a4890169a0fe021a5562d3735f2466160 0x140001238 0x140001243
db8caaf17e1e9afa4a64b7e6a57d07a2eb6669edaed70daced81295ea183da9f 0x14000233c 0x140002347
355341b710fe7f121df4c5fcfc32de9da5a5e2003f0869fcbb7a47f92f2471f2 0x40369d 0x403a64, 0x40146e, 0x4014f8
fba0cc427658445f0ca78d6a263c5b9a9714e99e733ffe25ba719c9b39b98664 0x14000a685 0x140019af9, 0x77ba4bb4
23c7eee980ca21ac8597bd6eb2147e4bfc1941490db87f276a13146914ea5637 0x140003957 0x1400075ac, 0x140003945, 0x1400074d6, 0x1400074fe, 0x14000721f
4f998e4290bdf67dc4a1e75ed739eb57defda3c329b6b07f29b3b6c771a8b3ea 0x1400010ae 0x1400032d9, 0x77ba4bb4
a238ccc209980719927c777fc9f16866403cb9d58c0c847b9cd92ece0d46e725 0x14000226c 0x14000ab18, 0x14000225a, 0x14000aa42, 0x14000aa6a, 0x14000a79f
17ecabd73e1eb5f7a7f6b35b0c48d3fcf5f73f65aef34993726439d7d27da849 0x14000254b 0x140002556
5c9e92f6b45b0cb098838e5db6623067396f066704f9c909b31d234bfaf74458 0x100005642 0x10000c259
697256960cdded3229b0f2f99b593751d3862774dc7c5cabdbbf769beadd263f 0x2000032cb 0x20000da50
c0be7a344a863894890127e61851838037bd9d076423bfc8296cfd6e01d66f6b 0x14000f939 0x140011d10, 0x14000f926
656ac5ec110c5f8ce68ce1962d6b2cbd47ee6ce20a181c88bb1e5481793f0578 0x140001c70 0x140001c81, 0x14000133a

Table 9: Malware Samples And Parameters Used In The Empirical Evaluation (§4.6).

