Automatic Detection and Analysis of Keystroke Loggers Using Memory Forensics
Journal Pre-proof
PII: S0167-4048(20)30145-0
DOI: https://doi.org/10.1016/j.cose.2020.101872
Reference: COSE 101872
Please cite this article as: Andrew Case, Ryan D. Maggio, Md Firoz-Ul-Amin, Mohammad M. Jalalzai,
Aisha Ali-Gombe, Mingxuan Sun, Golden G. Richard III, Hooktracer: Automatic Detection and Analysis
of Keystroke Loggers Using Memory Forensics, Computers & Security (2020), doi:
https://doi.org/10.1016/j.cose.2020.101872
Andrew Case
Volatility Foundation
Ryan D. Maggio
Division of Computer Science and Engineering, Louisiana State University
Md Firoz-Ul-Amin
Division of Computer Science and Engineering, Louisiana State University
Mohammad M. Jalalzai
Division of Computer Science and Engineering, Louisiana State University
Aisha Ali-Gombe
Department of Computer Science, Towson University
Mingxuan Sun
Division of Computer Science and Engineering, Louisiana State University
Abstract
Advances in malware development have led to the widespread use of attacker
toolkits that do not leave any trace in the local filesystem. This negatively im-
pacts traditional investigative procedures that rely on filesystem analysis to
reconstruct attacker activities. As a solution, memory forensics has replaced
filesystem analysis in these scenarios. Unfortunately, existing memory foren-
sics tools leave many capabilities inaccessible to all but the most experienced
investigators, who are well versed in operating systems internals and reverse
1. Introduction
The rise of memory-only malware and attack payloads has led to the nearly
ubiquitous use of volatile memory analysis in incident response. Volatile mem-
ory analysis, also known as memory forensics, is the technique of acquiring and
then analyzing a sample of volatile memory (RAM) obtained from a running
computer system or virtual machine. Whereas traditional filesystem forensics
can recover only those artifacts that the operating system and running applica-
tions choose to record to disk, memory analysis techniques allow an investigator
to fully examine and reconstruct the entire state of a system. This state includes
all of the in-kernel and userland data structures, code, user-generated input and
output, and more. When focused against malware, these capabilities allow an
investigator to detect and analyze all of a malware sample’s actions, regardless
of the techniques employed by the malware to become resident within memory.
As recently documented by Microsoft [42], nearly all modern malware and
attacker toolkits have at least one memory-only component and many of them
reside solely in memory. One of the most infamous of these was Duqu, which
was used to compromise a significant portion of Kaspersky’s corporate network
environment [10]. Duqu leveraged memory-only rootkits that were installed by
exploiting then zero-day vulnerabilities in Microsoft Windows. Duqu utilized
no persistence mechanisms, meaning a reboot fully removed it from a system.
In Kaspersky’s environment, this provided little relief, however, as other Duqu-
infected systems would probe the network for rebooted systems and then re-
infect them. As documented by Kaspersky in their post-mortem report, full
detection and understanding of Duqu was only achieved after memory foren-
sics was used by its incident response team. Careto [39], Skeleton Key [19],
and Poison Ivy [22] are other examples of powerful malware that execute in a
memory-only or near memory-only manner and that require memory forensics
to detect and analyze. The popular open source Metasploit [63] and PowerShell
Empire [1] attack frameworks also run memory-only unless the user chooses to
store files within the filesystem. Combined, these malware samples and frame-
works provide attackers with complete control of a system in a manner that
requires no storage of any data anywhere on the local filesystem or within its
contained stores, such as the registry.
As the trend of attackers leveraging memory-only toolkits continues to grow,
the need for memory forensic tools and techniques that are accessible to foren-
sic investigators with a wide range of skill levels has become essential. Unfortu-
nately, current memory forensics tools do not meet this need. Volatility [2] is the
most widely used and powerful memory forensic framework currently available.
It is open source and contains over 200 plugins that support deep inspection of
memory-resident artifacts contained within volatile memory captures of Win-
dows, Linux, and Mac systems. While extremely powerful, a major shortcoming
of Volatility is that many of its capabilities are only accessible to expert digital
forensics investigators. This is particularly true when malware analysis tasks are
involved, as they often require manual reverse engineering within Volatility by
the analyst. This is primarily a result of Volatility being able to correctly iden-
tify malware hooks in memory, but not providing post-processing algorithms
capable of successfully differentiating hooks placed by legitimate software from
malicious ones.
The goal of the research presented in this paper is to automate and make
accessible to investigators of all skill levels one of Volatility’s most powerful ca-
pabilities - detection of userland keyloggers. Keyloggers are one of the most
dangerous threats facing users [74] as they allow recording and exfiltration of
keystrokes entered by users, contents of copy and paste buffers, and data dis-
played within applications. Sophisticated malware often bundles additional ca-
pabilities within its keylogger modules, including the ability to take screenshots
and record audio from microphones and video from web cameras [36]. As dis-
cussed in the next section, the Windows API commonly abused by keyloggers,
SetWindowsHookEx [53], is one type of hook that Volatility can correctly find,
but it offers no post-processing to judge whether a discovered hook is malicious.
As documented on the Volatility Labs blog [44], making this determination
currently requires a labor-intensive mix of reverse engineering in conjunction
with running multiple Volatility plugins.
This process also assumes deep knowledge of the Windows API and systems
internals. These requirements make the technique inaccessible to all but the
most experienced investigators. Furthermore, even for subject matter experts,
the process is still labor intensive, manual, and time consuming. Given the
substantial amount of evidence that investigators must sift through in modern
investigations [56], any portion of the workflow that requires manual exami-
nation by senior staff members causes a severe bottleneck in an organization’s
incident response capabilities. Our work, embodied in a new Volatility plugin
called hooktracer_messagehooks, leverages the significant additions that we made
to the HookTracer engine [13] to enable automated and scalable detection and
analysis of this threat.
2. The SetWindowsHookEx API
The first parameter, idHook, dictates which event the hook wants to mon-
itor, such as for keystrokes or mouse clicks. The second parameter, lpfn, is
the callback function within the hooking module that receives each triggered
message. The third parameter, hmod, is a handle to the DLL containing the
hooking procedure or NULL. The fourth parameter, dwThreadId, is either zero
or the thread ID of the thread for which the hook should be active. The specific
process(es) in which a hook will be active and where the hook code will reside
is fully dependent on the values of the last two arguments to the function. If
dwThreadId is non-zero, then it specifies the thread ID within the calling pro-
cess to hook. Otherwise, if dwThreadId is zero, then all threads within the same
desktop will be hooked. This also affects the behaviour of hmod. If dwThreadId
is zero and hmod is a handle to a DLL, then that DLL will get loaded (injected)
into every hooked process once a thread of the process generates a registered
hook (e.g., receives keyboard input). If dwThreadId is zero and hmod is NULL,
then the callback at the offset specified by lpfn inside the calling executable will
be triggered from the context of the hooked threads.
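The interaction between dwThreadId and hmod described above can be summarized as a small decision function. The following is only an illustrative sketch: the function name and the returned strings are hypothetical and are not part of Volatility or HookTracer.

```python
from typing import Optional


def describe_hook_scope(dw_thread_id: int, hmod: Optional[int]) -> str:
    """Summarize where a SetWindowsHookEx hook is active and where its code runs.

    Hypothetical helper for illustration; it mirrors the rules in the text only.
    """
    if dw_thread_id != 0:
        # A non-zero thread ID restricts the hook to that single thread.
        return "thread-specific hook on thread %d" % dw_thread_id
    if hmod:
        # Zero thread ID plus a DLL handle: the DLL is loaded into each hooked
        # process once one of its threads generates the registered event.
        return "global hook; DLL injected into hooked processes"
    # Zero thread ID and a NULL hmod: the callback stays in the calling executable.
    return "global hook; callback inside the calling executable"
```

For example, describe_hook_scope(0, 0x10000000) corresponds to the DLL-injecting global hooks that are the focus of the rest of the paper.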
On the other hand, userland keyloggers that utilize SetWindowsHookEx have
much of the hard work done for them by the operating system. For example, if
the keylogger wishes to monitor keystrokes across all processes within a desk-
top, then all it needs to do is set dwThreadId to zero. Similarly, if the keylogger
wants its malicious DLL injected into all of its victim processes automatically,
then it can simply pass a handle to the DLL in the hmod parameter. All of this
can be done without suspicious and noisy API calls, such as AdjustTokenPriv-
ileges, WriteProcessMemory, and CreateRemoteThread. Volatility’s apihooks
and malfind plugins will not detect DLLs loaded through SetWindowsHookEx
as maliciously injected since they are loaded through normal operating system
procedures.
As seen in Figure 2, this message hook is active inside the Default desktop
of logon session 0. We can tell it is a global hook, since its thread is identi-
fied as “<any>”. This is Volatility’s method for signifying that the hook has
been set in all threads associated with the same desktop. We also see that the
WH_CALLWNDPROC message type is being monitored. Finally, we see the
registered callback is the code beginning at offset 0x1160 inside of
C:\Windows\system32\wls0wndh.dll. Examining Figure 3, we see it is reporting that the
global hook is active inside of thread 3180 of the spoolsv process with process
ID 1360. messagehooks will report one of these blocks for each thread of the
desktop.
Figure 3: A local message hook as displayed by messagehooks.
Figure 4: Analyzing message hooks using Volatility.
4. HookTracer
The amount of work performed by message hooks compared to API hooks
necessitated a significant amount of new research and development to integrate
new functionality into HookTracer. To add these missing capabilities, our team
developed a new set of features and APIs for HookTracer, totaling nearly 2,000
lines of new Python code. The main purpose of these additions was to 1) add
full support for 32-bit executables and libraries, as the existing implementation
was focused on 64-bit code, and 2) provide function call interception capabilities.
Our new function call interception APIs allow HookTracer to internally monitor
when particular Windows APIs are about to be called by emulated code and
then provide callbacks for the following purposes:
• Stable emulation
• Supporting system resource access
• Faking return values of called functions
• Faking parameters sent to functions
• Recording parameters passed to emulated functions
The remainder of this section describes each of these issues in more detail.
• Lock access - Since no other processes or threads are active within the
emulated environment, the state of a lock is “stuck” at its value at the
time of memory capture
• Memory region allocations - Handling of memory region allocation re-
quests, such as through the VirtualAlloc API, requires in-kernel code that
is not supported by the emulator
• Debugging APIs - These APIs allow reading and writing memory of other
processes, none of which are present within the emulated environment
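To make the interception idea concrete, the sketch below shows one way a table of per-API callbacks could record parameters and fake return values for two of the cases listed above (lock waits and memory allocation). It is an assumption-laden illustration, not HookTracer's implementation: the class name, the fake heap base, and the choice of WaitForSingleObject as the lock API are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

PAGE_SIZE = 0x1000
WAIT_OBJECT_0 = 0  # Windows return value meaning the wait completed


class InterceptTable:
    """Maps API names to callbacks that run instead of the real function."""

    def __init__(self, heap_base: int = 0x10000000) -> None:
        self._handlers: Dict[str, Callable[..., int]] = {}
        self.calls: List[Tuple[str, Tuple[int, ...]]] = []  # recorded parameters
        self._next_alloc = heap_base  # hypothetical base of the fake heap
        self.register("WaitForSingleObject", self._fake_wait)
        self.register("VirtualAlloc", self._fake_virtual_alloc)

    def register(self, api_name: str, handler: Callable[..., int]) -> None:
        self._handlers[api_name] = handler

    def dispatch(self, api_name: str, *args: int) -> int:
        # Record the call for later reporting, then return the faked value.
        self.calls.append((api_name, args))
        return self._handlers[api_name](*args)

    def _fake_wait(self, handle: int, timeout_ms: int) -> int:
        # No other thread can ever release a lock, so pretend it was acquired.
        return WAIT_OBJECT_0

    def _fake_virtual_alloc(self, address: int, size: int,
                            alloc_type: int, protect: int) -> int:
        # Hand out page-aligned regions from a simple bump allocator.
        base = self._next_alloc
        pages = max((size + PAGE_SIZE - 1) // PAGE_SIZE, 1)
        self._next_alloc += pages * PAGE_SIZE
        return base
```

The recorded calls list is what would later feed an API trace, while the faked return values keep the emulated code on its normal execution path.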
4.2. Supporting System Resource Access
Of particular importance to forensic investigators is malware’s access and
activity related to the filesystem, the registry, and the network. While our goal
is to detect malware that leaves absolutely no traces in the local filesystem, we
certainly do not want to miss detecting malware that does. Since these resources
provide persistence, lateral movement, data gathering, and data exfiltration, we
added extensive support for each.
network connectivity checks by malware will often thwart automated and dy-
namic analysis. The widespread use of network connectivity checks by malware
for this purpose led to the creation of projects such as FakeNet [31] and FakeNet-
NG [23]. These software projects present fake instances of network services, such
as DNS and HTTP, to malware running inside of automated analysis systems to
trick the malware into thinking it has real internet access. Use of these projects
is now common in the industry.
HookTracer implements a strategy similar to FakeNet in that it intercepts
calls to network functions and returns fake data. Returning faked data is
discussed in full in the next subsection. For DNS resolution attempts,
HookTracer simply returns the public IP address of Google. For calls to func-
tions that perform HTTP requests, HookTracer will construct a reply that
matches the requested protocol and file (or mime) type. For calls to receive
raw data, such as recv, HookTracer returns the English alphabet repeating for
the length requested. For network APIs that don’t require generating fake data,
such as bind, accept, and socket, HookTracer will fake return values that indicate
success. Combined, this capability allows HookTracer to successfully emulate a
significant amount of network activity generated by emulated code.
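The faked network responses described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the fake socket value are hypothetical, and the choice of uppercase letters for the repeating alphabet is a guess, since the text does not specify the case.

```python
import string

FAKE_SOCKET = 0x100  # hypothetical handle value returned for new sockets


def fake_recv(length: int) -> bytes:
    """Return the English alphabet repeated to fill the requested length."""
    alphabet = string.ascii_uppercase.encode("ascii")  # case is an assumption
    repeats = length // len(alphabet) + 1
    return (alphabet * repeats)[:length]


def fake_network_return(api_name: str) -> int:
    """Return a value that signals success for APIs needing no fake data."""
    if api_name == "bind":
        return 0  # Winsock bind returns zero on success
    if api_name in ("socket", "accept"):
        return FAKE_SOCKET  # any valid-looking socket descriptor
    raise ValueError("no fake return value defined for %s" % api_name)
```

A caller emulating recv with a 30-byte buffer would receive the 26 letters followed by ABCD, which makes faked network data easy to spot in later output.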
To accomplish this, malware will call GetFileTime with a handle to a common
Windows file, such as kernel32.dll, to get its timestamps. The malware will then
use SetFileTime to copy these timestamps for the malicious file. This allows
malicious files to blend in with other files during timeline analysis and filesystem
anomaly detection. HookTracer handles this call sequence by returning times-
tamps matching the time when the memory sample was taken and allowing
SetFileTime calls to succeed.
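As a rough sketch of this timestamp handling, the acquisition time of the memory sample can be converted to the Windows FILETIME format (100-nanosecond intervals since 1601-01-01 UTC) and returned for every timestamp. The function names and the dictionary layout are hypothetical, and how HookTracer obtains the capture time is not shown.

```python
from datetime import datetime, timezone
from typing import Dict

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def to_filetime(moment: datetime) -> int:
    """Convert a timezone-aware datetime to 100 ns ticks since 1601-01-01 UTC."""
    delta = moment - FILETIME_EPOCH
    whole_seconds = delta.days * 86400 + delta.seconds
    return whole_seconds * 10_000_000 + delta.microseconds * 10


def fake_get_file_time(capture_time: datetime) -> Dict[str, int]:
    """Report the capture time for all three timestamps GetFileTime returns."""
    ticks = to_filetime(capture_time)
    return {"creation": ticks, "last_access": ticks, "last_write": ticks}


def fake_set_file_time(handle: int, creation: int,
                       last_access: int, last_write: int) -> int:
    """Always report success (non-zero), as no real file is being modified."""
    return 1
```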
Filesystem Enumeration: The FindFirstFile and FindNextFile APIs al-
low programs to enumerate all files and sub-directories of a given directory. This
functionality is often abused by malware to find files to infect, encrypt, delete,
or exfiltrate. Since there is no actual filesystem present, HookTracer fakes a
realistic looking directory structure.
Process Enumeration: The list of running processes is queried by mal-
ware for many purposes, such as finding processes to inject code into, searching
for security monitoring software, or gathering a list of process names for exfil-
tration. Enumerating processes is accomplished through a set of calls to Cre-
ateToolhelp32Snapshot, Process32First, and Process32Next. HookTracer fakes
the corresponding data structures for these calls and uses the actual process
list generated by Volatility’s APIs to return the processes active in the memory
sample.
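The snapshot-style enumeration can be modelled as a cursor over a process list. In the sketch below the list is passed in directly; in the plugin it would come from Volatility's process listing, which is not reproduced here. The class and method names are hypothetical stand-ins for the faked API behaviour.

```python
from typing import List, Optional, Tuple

ProcessEntry = Tuple[int, str]  # (process ID, executable name)


class FakeProcessSnapshot:
    """Stand-in for the handle returned by CreateToolhelp32Snapshot."""

    def __init__(self, processes: List[ProcessEntry]) -> None:
        self._processes = list(processes)
        self._index = 0

    def process32_first(self) -> Optional[ProcessEntry]:
        # Restart the walk, as Process32First does on a real snapshot.
        self._index = 0
        return self.process32_next()

    def process32_next(self) -> Optional[ProcessEntry]:
        # None models the FALSE return that ends the enumeration loop.
        if self._index >= len(self._processes):
            return None
        entry = self._processes[self._index]
        self._index += 1
        return entry
```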
5. Message Hooks Analysis
The first parameter, nCode, takes one of two values: HC_ACTION or
HC_NOREMOVE. Keyloggers verify that the value is HC_ACTION to ensure
that they have received a current keystroke. The second parameter, wParam,
is the action that generated the keystroke event. Keyloggers generally filter
for WM_KEYDOWN events, but can also monitor for a WM_KEYDOWN
followed by a WM_KEYUP. The third parameter, lParam, describes the key’s
value and associated attributes (e.g., whether it is an extended key).
Figure 6 shows the initial code of the keyboard event handler of the Gozi
malware. Gozi steals keystrokes and passwords, captures screenshots, and has
infected systems throughout the world. Its source code was eventually leaked
online and is still accessible on GitHub [12]. As shown, before processing the
keystroke and executing its malicious payload, Gozi first checks that the action
matches HC_ACTION and that the event is WM_KEYDOWN. Analysis of
Gozi’s message hook handler is described in Section 7.3.
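The gating check that an analyst should expect at the top of such a handler can be expressed in a few lines. The constants below are the values defined in the Windows SDK headers; the function name is hypothetical, and the sketch models only the two comparisons described above, not any particular malware sample.

```python
HC_ACTION = 0        # wParam and lParam describe a keystroke message
HC_NOREMOVE = 3      # message was only peeked at, not removed from the queue
WM_KEYDOWN = 0x0100
WM_KEYUP = 0x0101


def handler_would_process(n_code: int, w_param: int) -> bool:
    """Return True when a handler gated as described would act on the event."""
    return n_code == HC_ACTION and w_param == WM_KEYDOWN
```

An emulator that wants to reach the interesting code paths therefore has to supply HC_ACTION and WM_KEYDOWN when it invokes the handler.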
chain can be called. Malware will often abuse SetWindowsHookEx to inject a
malicious DLL into many processes, but then not actually analyze the requested
hook. This allows the injected DLL to perform other malicious actions with-
out concern for specific keystrokes or other actions by the victim users. Laqma
[44] was one of the first advanced malware samples to use this technique. The
infamous Carberp malware, which was used to steal an estimated $250M USD
from victims [38], also utilized this technique, as illustrated in its leaked source
code [57].
monitors for transformations of gathered keystrokes. In situations where a trans-
formation occurred, the plugin will attempt to automatically generate a script
capable of decrypting a given key log file.
hooktracer_messagehooks currently supports automated decoding and script
generation against keyloggers that leverage the XOR and ROL operations with
a static key (shift value). To accomplish this, the plugin monitors for transformations
that utilize these instructions and, when one is detected, records the source
value. For XOR, this is the integer key used to transform the destination value.
For ROL, this is the integer that specifies by how many bits to shift the data.
When a static key (shift) is used for every offset of transformed data, then
hooktracer_messagehooks generates a simple Python script that takes a file path
from the command line, transforms every byte of the file with the monitored
operation and static key, and then writes it to an output file. When used in
conjunction with the plugin’s automatic component extraction, this allows an
investigator to decrypt all previously logged data on infected systems.
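A decoding script of the kind described might look like the sketch below. It is not the script emitted by the plugin: the function names are hypothetical, and the rotation is assumed to be byte-wide. Note that XOR is its own inverse, whereas undoing a ROL requires rotating right by the same amount.

```python
def xor_decode(data: bytes, key: int) -> bytes:
    """Undo a single-byte XOR; applying the same key twice restores the input."""
    return bytes(b ^ (key & 0xFF) for b in data)


def rol8(value: int, shift: int) -> int:
    """Rotate an 8-bit value left (the operation the keylogger applied)."""
    shift %= 8
    return ((value << shift) | (value >> (8 - shift))) & 0xFF


def ror8(value: int, shift: int) -> int:
    """Rotate an 8-bit value right (the inverse of rol8)."""
    shift %= 8
    return ((value >> shift) | (value << (8 - shift))) & 0xFF


def rol_decode(data: bytes, shift: int) -> bytes:
    """Undo a byte-wise ROL by rotating every byte right."""
    return bytes(ror8(b, shift) for b in data)


def decode_file(src_path: str, dst_path: str, operation: str, key: int) -> None:
    """Read an encoded key log, decode every byte, and write the result."""
    with open(src_path, "rb") as src:
        data = src.read()
    decoded = xor_decode(data, key) if operation == "xor" else rol_decode(data, key)
    with open(dst_path, "wb") as dst:
        dst.write(decoded)
```

An investigator would call decode_file with the recovered key log, an output path, the monitored operation, and the static key reported by the plugin.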
For transformations beyond XOR and ROL, the plugin currently reports
the address ranges of the instructions that transformed the data. This tells
an investigator where to begin analyzing the encryption routine. This process
does then require reverse engineering, but our plugin pinpoints where this effort
should begin and provides the associated data that must be analyzed.
6. Case Study: Turla
6.1. The Turla Malware
Turla is both the name of a high-profile advanced persistent threat (APT)
group as well as this group’s digital espionage platform. As documented by MITRE
[16] and Kaspersky [40], the Turla group is responsible for compromising victims
in over 45 countries, with the majority of the victims belonging to government
agencies, military departments, and embassy operators. It is widely believed
that the Turla team is Russia’s most advanced hacking group inside of its intel-
ligence agencies, and its past attack campaigns have involved hacking satellites
to target victims in remote areas and compromising entire ISPs to deliver tar-
geted malware to a single victim [14].
Of the many capabilities provided by the Turla espionage platform, log-
ging of keystrokes and environmental data is a central focus. As part of its
payload, Turla leverages SetWindowsHookEx to gather and record keystrokes
along with other system data. As documented in two lengthy blog posts by
malware.news [48, 49], Turla’s hook handler performs a substantial number of
operations per keystroke, and manually uncovering the actions taken requires
days of expert-level reverse engineering. To showcase our HookTracer engine
and our hooktracer_messagehooks plugin, we now present each feature of the
plugin as it analyzes Turla’s malicious message hooks.
59b57bdabee2ce1fb566de51dd92ec94
6.3. Hook Handler Analysis
When run in its default mode, hooktracer_messagehooks lists only the message
hooks that it determines to be malicious based on the criteria previously
listed. This allows an investigator to quickly determine, with confidence, if mal-
ware utilizing malicious hooks is present on the system. If an investigator wishes
to observe the behaviour of a hook in detail, they can then run the plugin with
the --list-apis option set. This will instruct the plugin to list, in order, all
exported functions called by an emulated hook. The plugin can also be run with
the --list-apis-condensed option set, which instructs it to list only functions that
match the built-in filter of suspicious and malicious functions. In both modes,
functions whose parameters are of interest to investigators are listed. Figure
7 shows the output of hooktracer_messagehooks’s condensed API listing mode
against an instance of Turla’s hook. Note that line numbers have been added
in the figure to aid the discussion.
By simply reading the output, an investigator can determine the hook’s func-
tionality and make the same determination as the plugin’s automated engine,
namely, that the hook is malicious. Lines 1-5 illustrate building a string that
includes the handle of the current window and the system time. HookTracer
reports sprintf related functions by showing both the format specifier sent to
the function as well as the buffer filled in by the function. Lines 6-7 show gath-
ering of the victim process’ process ID (1084) and associated formatting. Lines
8-10 show the filename of the process being gathered and saved. Lines 11-12
show the window text as returned by HookTracer. All of the gathered values
are then concatenated together by the memmove call on line 13. Lines 14-17
show the common API sequence used to convert a keyboard input to a Unicode
character (ToUnicodeEx). On line 18, our fake keystroke (’A’) is then sent to
swprintf. Lines 19-26 show the file path of the keylogger file being built, and line
27 shows the CreateFile call to open a handle to the file. Note that the 41424344
in the output is the fake handle value assigned by HookTracer. After the file
is opened, successive calls to memmove are used to concatenate a header-type
value (KSL0T Ver=21.0), the Windows computer or domain name, username
of the logged on user, the previously gathered environmental data, and finally
the ’A’ keystroke. The hook then writes to the previously opened file, closes the
file handle, and calls CallNextHookEx, as its work is completed for the current
keystroke.
All of the previous analysis was accomplished without any reverse engineer-
ing effort by the plugin’s user. Results are also obtained quickly as the plugin’s
analysis time for each hook is less than a minute in our test Debian virtual
machine, to which we assigned a mere 2 CPU cores and 2GB of RAM.
C:/Users/bob/Desktop/SPUNINST/msimm.dat.
to our emulator. Turla is an example of malware that writes its log file to
the directory that the malware is launched from, which obviously changes per
infection. Other keyloggers write to hardcoded paths though, which is why we
still include the full path. Second, we include any sub-directories created by
the malware as experience tells us that some malware will create a hardcoded
directory name, but then vary the name of the file inside of it. Finally, we
keep the name of the keylogger file itself for malware, such as Turla, that use
hardcoded names. By keeping all components, the generated IOCs are as flexible
and broad as possible. Once generated, an incident response team member
can then feed the IOC into any enterprise-level endpoint security monitoring
(EDR) product to determine every system in the environment that contains
file(s) matching the IOC. When using industry-standard EDRs, thousands of
endpoints can be checked in under a minute, and the use of IOCs in this
manner is standard practice in the industry [18, 37, 68].
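The decision to keep every component of a logged path can be illustrated with a short sketch. The dictionary layout and function name are hypothetical and do not reflect the plugin's actual IOC format; in particular, treating the immediate parent directory as the malware-created sub-directory is a simplifying assumption.

```python
from pathlib import PureWindowsPath
from typing import Dict


def path_components_for_ioc(logged_path: str) -> Dict[str, str]:
    """Split a keylogger file path into the components retained for an IOC."""
    path = PureWindowsPath(logged_path)
    return {
        "full_path": str(path),          # useful when the path is hardcoded
        "directory": path.parent.name,   # assumed malware-created sub-directory
        "file_name": path.name,          # useful when only the name is hardcoded
    }
```

Matching on any one of the three components lets the same IOC cover malware that hardcodes the whole path, only a directory, or only the file name.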
7.1.1. Setup
Our analysis was performed against a 32-bit executable sample of Loki that
has an MD5 hash value of:
eccad903b4c27d149e159338f58481a9
7.1.2. Analysis
Figure 10 shows the output of our plugin against the message hooks handler
of Loki, and the malicious nature of the handler is clear. After the handler
locates and calls APIs to retrieve the keystroke value (lines 1-16), it then calls
GetWindowTextW on line 18. Next it opens a handle to a strangely named
file (line 22) and writes the return value of GetWindowTextW, whose value was
faked by HookTracer (MyWindowName), to the file. It then reads the contents
of the clipboard (line 30), opens a new handle to the same file (line 35), and
writes out the clipboard contents to the file (line 38). Loki prefixes clipboard
contents with CB: and HookTracer fakes GetClipboardData to return a value
of BBBBCCCCDDDD. Finally, on lines 43-46 we see the handler writing the
faked keystroke A to the file.
Loki’s message hook handler triggers several of our malicious criteria, in-
cluding clipboard access and writing clipboard and keystroke data to a file, and
is automatically flagged as malicious by the plugin.
7.2.1. Setup
Our analysis was performed against a 32-bit Epic Turla keylogger sample
that has an MD5 hash value of:
a3cbf6179d437909eb532b7319b3dafe
7.2.2. Analysis
Figure 11 shows the output of our plugin against the message hook handler of
the Epic Turla keylogger. Unlike the previously discussed malware, this handler
performs only one task, which is to append the current keystroke to the key
log file. For unknown reasons, the keylogger exports its message hook function
LowLevelKeyboardProc@12, which causes it to appear as line 1 in the output.
Lines 2-4 show the zeroing out of a buffer at address 0xcfffffe8 followed by the
copying of a lowercase a into it.
This a is actually our fake keystroke of A in lowercase form. The case is
inverted as, instead of using the system APIs to translate the keyboard code to
a character, the hook does an unusual combination of checking for special keys
(shift, caps lock, etc.). Since these are not faked by HookTracer, the keylogger
calculates the keystroke as being in lowercase. The keystroke is then written to
a file on line 5 using an already opened file handle, the file buffer is flushed to disk
on line 6, and the hook terminates on line 7. This behaviour triggers our criterion
of writing keystrokes to disk and is automatically flagged as malicious.
7.3. Gozi
Gozi, which also goes by Ursnif or ISFB, is a banking trojan that has been
around since the mid-2000s [64] and is still actively used in attack campaigns
today [67]. It has undergone significant changes during this period and also
inspired related malware, such as GozNym [65]. Combined, the Gozi family of
malware is responsible for the theft of hundreds of millions of dollars.
7.3.1. Setup
Our analysis was performed against a 32-bit Gozi sample that has an MD5
hash value of:
e6d118192fc848797e15dc0600834783
7.3.2. Analysis
Figure 13 shows the output of hooktracer_messagehooks against the system
infected with Gozi. This hook operates by attaching the infected thread’s input
queue to its own (lines 6, 10, and 44), gathering the name of the module it is
executing inside of (lines 4, 11, 14, 16, 38, and 39), gathering the system time
(line 41), and gathering the name of the window in which it is executing (line 42).
The use of the debug APIs by this handler meets our criteria and is automat-
ically flagged as suspicious. Manual review of the output also shows behaviour
very consistent with a keylogger and would trigger an analyst to perform further
analysis of the malware.
7.4. Telebots Keylogger
Telebots is an APT group believed to be based out of Russia. The group was
previously linked to attacks against the Ukrainian power grid as well as the
NotPetya ransomware outbreak [52, 21]. The keylogger analyzed in this section
was used in the second wave of attacks against the Ukrainian infrastructure. It
was part of a toolchain that ended with the KillDisk malware, which deletes
important user and system files and renders victim systems unbootable.
7.4.1. Setup
Our analysis was performed against a 64-bit sample that has an MD5 hash
value of:
4919569cd19164c1f123f97c5b44b03b
7.4.2. Analysis
Figure 14 shows the output of hooktracer_messagehooks against a system
infected with the Telebots keylogger. On lines 3-10, the output shows that the
process ID of the host process is gathered and written to a log file. This log
is stored in a suspiciously named file under the user’s temp folder, which is
a common location for malware to store data. Lines 11-20 show the malware
writing the window name to the log file.
Lines 18-27 show the malware gathering the name of the executable it is
running as and writing it to the log file. This is accomplished through the use
of CreateToolhelp32Snapshot and Process32FirstW. These functions are used
to begin walking the process list. The malware walks the list in order to find
the process that it is running as so that it can extract the name. HookTracer
fakes the name of the first process returned as fake_process.exe, which can be
seen in the output on line 25. HookTracer also returns the same PID in calls
to GetWindowThreadProcessId as it does for the fake process it returns from
Process32FirstW. The combination of these two functions used together occurs
often in malware so this increases the chances that malware will find “itself”
during emulation. Lines 31-36 show the malware converting the faked A key to
a Unicode character (ToUnicodeEx) and then writing it to the log file.
In summary, this hook gathers the name and PID of the process it is running
as, the name of the active window when the latest key was pressed, and the
Unicode value for the last key pressed. It then writes these to a log file. These
actions violate several of our criteria, including the use of debug APIs and
writing keystroke data to disk. This automatically triggers the hook being
marked as suspicious by the plugin.
7.5. Limitations of Automated Message Hook Analysis
In order for hooktracer_messagehooks to identify a hook as malicious, the
hook must violate at least one of the criteria described previously as being
suspicious. The yty malware framework leveraged by the Donot Team APT
group is an example of keylogger malware that leverages SetWindowsHookEx
for keylogging, but does not violate any of the criteria [69].
Figure 15 shows the output of hooktracer_messagehooks against the yty keylogger.
As can be seen, the hook’s only operations are to convert the pressed key
to its character equivalent and to allocate and de-allocate a few memory regions.
Reverse engineering of the handler showed that it was storing the converted key-
press inside of a custom data structure, and only once a certain number of keys
were pressed did the hook write the stored keys to disk. Since HookTracer
emulates only one key press, it did not trigger this extended behaviour.
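The effect of this buffering on single-keystroke emulation can be shown with a toy model. The class below is not derived from yty's code; the threshold value used in the example is hypothetical, since the actual number of buffered keys is not stated here.

```python
from typing import List


class BufferedKeyHandler:
    """Toy model of a handler that writes only after a threshold of keys."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.pending: List[str] = []   # keys held in memory
        self.written: List[str] = []   # stands in for writes to disk

    def on_key(self, character: str) -> None:
        self.pending.append(character)
        if len(self.pending) >= self.threshold:
            # Only now would file APIs appear in an emulation trace.
            self.written.append("".join(self.pending))
            self.pending.clear()


# Emulating a single key press, as HookTracer does, never reaches the write.
handler = BufferedKeyHandler(threshold=16)  # 16 is a hypothetical threshold
handler.on_key("A")
```

After the single emulated call, handler.written is still empty, so no file-related API calls are observed and none of the disk-writing criteria can be triggered.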
Although our plugin does miss the malicious activity of this particular hook,
we still strongly believe that our research is highly practical and of great real-
world use. To start, the approach taken by yty to only record keystrokes once
a certain number is reached is very rarely seen in real world malware as it can
lead to lost keystrokes. For example, if the hosting process is terminated be-
tween a set of keystrokes and the threshold being reached, then none of them
will be logged. Similarly, if the user logs off or shuts down the system in the
gap time, keystrokes will again be lost. Further driving our belief is that, even
in rare occurrences such as yty’s approach, our plugin still tells an analyst that
the executable processes the keystroke, and that further reversing is needed to
figure out what is done with it. This itself is a clue that points investigators
in the right direction. We believe the powerful automation provided by
hooktracer_messagehooks to accurately describe the actions of message hooks is a
significant and novel memory forensic capability.
8. Related Work
The use of emulation to evaluate malware has a long history in the field
of computer security. The most common form of emulation for this purpose
is whole system emulation. In this model, the entire operating system as well
as all running applications are emulated. This allows fine-grained inspection
and control of the running system by monitoring applications. QEMU and
Bochs [41] are the most commonly used emulators for this purpose. TEMU
[77], built on top of QEMU, is one of the first mature security analysis projects
to use whole system emulation. HookFinder [76] was built on top of TEMU to
monitor for malicious hooks installed by rootkits in kernel memory. Panorama
[78], MAVMM [55], Lares [61], and Ether [20] are other foundational projects
in this area. Besides direct emulation, there are also other areas of significant
research aimed at allowing analysis and monitoring of malware outside of the
environment the malware is executing in. Virtual machine introspection (VMI)
is a widely used technology for this goal as it allows monitoring of guest virtual
machines from the host. This has the benefit of the security monitor executing
from a “safe” environment where the malware being observed would ideally not
be able to attack it. Due to this advantage, there has been significant virtual
machine introspection research in both academia [54, 25, 11, 34, 7, 6, 5, 8] and
industry [60], including the libvmi project that allows running Volatility plugins
against live virtual machine guests [59]. The use of malware sandboxes is also
very popular and driven by virtual machine technology. Cuckoo Sandbox is
the most widely used of these [17] and can produce detailed reports of system
activity by malware.
While whole system emulation, virtual machine introspection, and sandboxes
are mature technologies that are widely used for malware analysis, they do
not fully meet the needs of real-world incident response teams, nor do they
fit well within memory analysis-based workflows. To use these technologies
during incident response, an analyst must first accomplish two tasks. The first
is actually locating the malicious code in memory. As documented previously,
this is a labor-intensive task when using currently available Volatility plugins.
Second, the analyst must then extract the module (DLL or EXE file) hosting
the code. This presents a few problems in itself. For Volatility to automatically
extract an executable module, it needs the metadata contained in the file’s
header. Since anti-virus engines and EDRs also process this metadata on live
systems, malware will often zero out the headers of its modules after initialization.
This forces the analyst to manually rebuild
the PE header from scratch to make the file understandable by analysis tools,
such as IDA Pro [30]. Furthermore, only in extremely rare circumstances can
a file extracted from process memory be later executed on a different system.
This is due to the substantial changes that occur during loading, including
global variable initialization, selective section loading, and IAT patching [46].
This means that executables extracted from memory cannot be reliably executed
in a virtual machine and the analyst must attempt to recover the module from
the filesystem of the infected system, assuming it is present.
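To make the header problem concrete, the following Python sketch checks whether a memory region still begins with the metadata that extraction tools rely on. It is an illustration rather than Volatility's implementation: the function and helper names are hypothetical, and the check is limited to the DOS magic bytes, the e_lfanew field at offset 0x3C, and the PE signature it points to.

```python
import struct


def has_valid_pe_header(region):
    """Return True if the buffer starts with an MZ header that points at a PE signature."""
    if len(region) < 0x40 or bytes(region[:2]) != b"MZ":
        return False
    # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", region, 0x3C)
    if e_lfanew + 4 > len(region):
        return False
    return bytes(region[e_lfanew:e_lfanew + 4]) == b"PE\0\0"


def make_minimal_image(size=0x200, pe_offset=0x80):
    """Build a synthetic buffer with only the fields the check inspects (test data)."""
    image = bytearray(size)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, pe_offset)
    image[pe_offset:pe_offset + 4] = b"PE\0\0"
    return image


intact = make_minimal_image()

# Simulate malware zeroing out its module header after initialization.
wiped = bytearray(intact)
wiped[0:0x100] = bytes(0x100)

intact_ok = has_valid_pe_header(intact)
wiped_ok = has_valid_pe_header(wiped)
```

A region that fails this check can still contain executable code, which is why header-dependent extraction fails on such modules even though the code itself remains in memory.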
Memory-only loading of DLLs, commonly referred to as reflective injection,
is an extremely popular attack technique, and as the name implies, the loaded
DLL is never written to disk at any point [66, 72]. This technique is generally
accomplished by malware reading encrypted DLLs over the network or from
encrypted stores within the main executable already in memory. The buffer
containing the DLL is then decrypted and directly initialized by the malware’s
loader. The buffer is then usually zeroed out to prevent direct recovery. While
Volatility can find and extract the sections of such DLLs, they will never be
directly executable on any other system.
HookTracer solves all of these issues and makes powerful, automatic emula-
tion of memory resident code accessible to even novice incident response team
members. Instead of requiring the analyst to reverse engineer in-memory code,
HookTracer can automatically determine if a hook is malicious, locate the host-
ing code, and extract it to disk. For situations where malicious code is not
backed by a file on disk, HookTracer will determine the memory region hosting
the code and extract it. HookTracer also eliminates the need to attempt to make
malicious code executable on a separate system to then leverage technologies
such as whole system emulation. Instead, in-memory code can be directly emu-
lated and detailed reports can be produced of the code’s behaviour. This greatly
streamlines memory analysis processing and allows full automation of the entire
workflow. Besides HookTracer, there have been two recent projects that leverage
unicorn in conjunction with Volatility. The first, ROPMEMU [27, 26], automatically
detects ROP chains [50] within memory. ROP is used by system-level
exploits to perform code-reuse attacks. Such attacks are necessarily memory-
only and can be difficult to detect with traditional Volatility plugins. The second
project [29] also hunts for ROP chains and was specifically developed to detect
the “Gargoyle” attack [47] that hides executable code using permission changes
and timers. Detection of Gargoyle is implemented by emulating the handler of
each registered timer found by Volatility and checking if calls are made to the
Windows API functions leveraged by the Gargoyle attack. Although neither
of these projects overlaps with our efforts, we consider them to be important related
work, as they both leverage unicorn in conjunction with Volatility and help
showcase the growing realization by the memory forensics community that cur-
rent incident response workflows are incompatible with traditional techniques
and technology.
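The Gargoyle check described above, which emulates each timer handler and inspects the API calls it makes, can be sketched in Python as follows. This is not the code of the cited project: the watched API names are assumptions made for illustration, and the emulator is replaced by a stand-in callable that returns canned traces, where a real tool would emulate the handler code recovered from the memory sample.

```python
# Assumed for illustration: API names treated as indicators of permission changes.
WATCHED_APIS = {"VirtualProtect", "VirtualProtectEx"}


def find_suspicious_timers(timers, emulate_handler):
    """Return (timer_id, hits) pairs for timers whose handler calls a watched API.

    `timers` maps a timer identifier to its handler address. `emulate_handler`
    is any callable that returns the list of API names the handler invoked.
    """
    flagged = []
    for timer_id, handler_address in sorted(timers.items()):
        called = set(emulate_handler(handler_address))
        hits = sorted(called & WATCHED_APIS)
        if hits:
            flagged.append((timer_id, hits))
    return flagged


# Stand-in for a real emulator: hypothetical traces keyed by handler address.
FAKE_TRACES = {
    0x401000: ["GetTickCount", "Sleep"],
    0x7FFE1000: ["VirtualProtectEx", "WaitForSingleObjectEx"],
}

result = find_suspicious_timers(
    {"timer_a": 0x401000, "timer_b": 0x7FFE1000},
    lambda address: FAKE_TRACES.get(address, []),
)
```

Keeping the emulator behind a plain callable separates the detection rule from the emulation backend, so the same check could be driven by any component that produces a list of called API names.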
9. Conclusion
The rise of memory-only malware and attack payloads has led to significant
research and development efforts in the field of memory forensics. The largest
downside of these efforts has been the general inaccessibility of many of the
techniques to all but expert investigators. This causes significant bottlenecks
within the incident response workflow of organizations and leads to inconsistent
analysis results that are heavily dependent on the skill level of the investiga-
tor. Our research efforts with HookTracer and hooktracer messagehooks have
bridged this gap in a key area of incident response: the detection and analysis
of userland keyloggers. When keyloggers are active on a system, a wide range
of data, including keystrokes, clipboard contents, and more, are vulnerable to
recording and exfiltration. By leveraging hooktracer messagehooks, even novice
investigators can automatically determine the presence of keyloggers as well as
generate detailed records of the keylogger’s behaviour. These records can then
be used in automated detection and remediation at enterprise scale.
10. Acknowledgements
References
[2] 2017. The Volatility Framework: Volatile Memory Artifact Extraction
Utility Framework.
https://github.com/volatilityfoundation/volatility.
[3] 2018. Unicorn Showcase. http://www.unicorn-engine.org/showcase/.
[8] Irfan Ahmed, Aleksandar Zoranic, Salman Javaid, Golden G. Richard III,
and Vassil Roussev. 2013. IDTchecker: Rule-based Integrity Checking of
Interrupt Descriptor Tables in Cloud Environments. Proceedings of the 9th
IFIP WG 11.9 International Conference on Digital Forensics (2013).
[9] Fabrice Bellard. 2005. QEMU, A Fast and Portable Dynamic Translator.
In USENIX Annual Technical Conference, FREENIX Track, Vol. 41. 46.
[10] Boldizsár Bencsáth, Gábor Pék, Levente Buttyán, and Márk
Félegyházi. 2011. Duqu: A Stuxnet-like Malware Found in the Wild.
CrySyS Lab Technical Report 14.
[11] Brendan Dolan-Gavitt, Tim Leek, Michael Zhivich, Jonathon
Giffin, and Wenke Lee. 2011. Virtuoso: Narrowing the Semantic Gap in
Virtual Machine Introspection. 2011 IEEE Symposium on Security and
Privacy. 297–312.
[12] Gianluca Brindisi. 2016. gozi-isfb.
https://github.com/gbrindisi/malware/tree/master/windows/gozi-isfb.
[14] Catalin Cimpanu. 2018. Russia’s Elite Hacking Unit Has Been Silent, But
Busy.
https://www.zdnet.com/article/russias-elite-hacking-unit-has-been-silent-but-busy/.
[15] The MITRE Corporation. 2018. Technique: Timestomp.
https://attack.mitre.org/techniques/T1099/.
[18] Jessica DeCianno. 2014. IOC Security: Indicators of Attack vs. Indicators
of Compromise.
https://www.crowdstrike.com/blog/indicators-attack-vs-indicators-compromise/.
[19] Dell SecureWorks Counter Threat Unit Threat Intelligence. 2015. Skeleton
Key Malware Analysis.
https://www.secureworks.com/research/skeleton-key-malware-analysis.
[20] Artem Dinaburg, Paul Royal, Monirul Sharif, and Wenke Lee. 2008. Ether:
Malware Analysis via Hardware Virtualization Extensions. In Proceedings
of the 15th ACM Conference on Computer and Communications Security.
ACM, 51–62.
[27] Mariano Graziano, Davide Balzarotti, and Alain Zidouemba. 2016.
ROPMEMU: A Framework for the Analysis of Complex Code-reuse Attacks. In
Proceedings of the 11th ACM Conference on Computer and Communica-
tions Security. ACM, 47–58.
[28] Nikolay Grebennikov. 2011. Keyloggers: Implementing Keyloggers in
Windows. Part Two.
https://securelist.com/keyloggers-implementing-keyloggers-in-windows-part-two/36358.
[29] Aliz Hammond. 2018. Hunting for Gargoyle Memory Scanning Evasion.
https://countercept.com/blog/hunting-for-gargoyle/.
[34] Salman Javaid, Aleksandar Zoranic, Irfan Ahmed, and Golden G. Richard
III. 2012. Atomizer: A Fast, Scalable and Lightweight Heap Analyzer for
Virtual Machines in a Cloud Environment. Proceedings of the 6th Layered
Assurance Workshop (LAW’12) (2012).
[35] Kaspersky. 2014. The Epic Turla Operation: Solving some of the mysteries
of Snake/Uroboros.
https://media.kasperskycontenthub.com/wp-content/uploads/sites/43/2018/03/08080105/KL_Epic_Turla_Technical_Appendix_20140806.pdf.
[36] Takashi Katsuki. 2013. Crisis: The Advanced Malware. 2013 Symantec
Internet Security Threat Report.
[41] Kevin P Lawton. 1996. Bochs: A Portable PC Emulator for Unix/X. Linux
Journal (1996).
[42] Andrea Lelli. 2018. Out of Sight But Not Invisible: Defeating
Fileless Malware with Behavior Monitoring, AMSI, and Next-gen AV.
https://cloudblogs.microsoft.com/microsoftsecure/2018/09/27/out-of-sight-but-not-invisible-defeating-fileless-malware-with-behavior-monitoring-amsi-and-next-gen-av/.
[43] Michael Ligh, Steven Adair, Blake Hartstein, and Matthew Richard. 2010.
Malware Analyst’s Cookbook and DVD: Tools and Techniques for Fighting
Malicious Code. Wiley Publishing.
[44] Michael Hale Ligh. 2012. MoVP 3.1 Detecting Malware Hooks in the
Windows GUI Subsystem.
https://volatility-labs.blogspot.com/2012/09/movp-31-detecting-malware-hooks-in.html.
[45] Michael Hale Ligh. 2012. Reverse Engineering Poison Ivy’s Injected Code
Fragments.
https://volatility-labs.blogspot.com/2012/10/reverse-engineering-poison-ivys.html.
[46] Michael Hale Ligh, Andrew Case, Jamie Levy, and AAron Walters. 2014.
The Art of Memory Forensics: Detecting Malware and Threats in Windows,
Linux, and Mac Memory. Wiley, New York.
[54] Matthew Muscat and Mark Vella. 2018. Enhancing Virtual Machine
Introspection-Based Memory Analysis with Event Triggers. 2018 IEEE
International Conference on Cloud Computing Technology and Science
(CloudCom).
[55] Anh M. Nguyen, Nabil Schear, HeeDong Jung, Apeksha Godiyal, Samuel T.
King, and Hai D. Nguyen. 2009. Mavmm: Lightweight and Purpose Built
VMM for Malware Analysis. 2009 Annual Computer Security Applications
Conference. 441–450.
[56] Martin Novak, Jonathan Grier, and Daniel Gonzales. 2018.
New Approaches to Digital Evidence Acquisition and Analysis.
https://www.nij.gov/journals/280/pages/new-approaches-to-digital-evidence-acquisition-and-analysis.aspx.
[57] nyx0. 2015. Carberp Banking Trojan. https://github.com/nyx0/Carberp.
[62] Nguyen Anh Quynh and Dang Hoang Vu. 2015. Unicorn: Next Generation
CPU Emulator Framework. Black Hat USA (2015).
[63] Rapid7. 2019. Metasploit. https://www.metasploit.com/.
[64] SecureWorks. 2007. Gozi Trojan.
https://www.secureworks.com/research/gozi.
[65] SentinelOne. 2019. GozNym Banking Malware: Gang Busted, But Is That
The End?
https://www.sentinelone.com/blog/goznym-banking-malware-gang-busted/.
[66] Skape and Jarkko Turkulainen. 2004. Remote Library Injection.
http://www.hick.org/code/skape/papers/remote-library-injection.pdf.
[67] Sophos. 2019. Gozi V3: Tracked by Their Own Stealth.
https://news.sophos.com/en-us/2019/12/24/gozi-v3-tracked-by-their-own-stealth/.
[68] Tanium. 2017. Tanium IOC Detect UserGuide. https://docs.tanium.com/.
[69] ASERT Team. 2018. Donot Team Leverages New Modular Malware
Framework in South Asia.
https://de.netscout.com/blog/asert/donot-team-leverages-new-modular-malware-framework-south-asia.
[78] Heng Yin, Dawn Song, Manuel Egele, Christopher Kruegel, and Engin
Kirda. 2007. Panorama: Capturing System-wide Information Flow for Mal-
ware Detection and Analysis. In Proceedings of the 14th ACM Conference
on Computer and Communications Security. ACM, 116–127.
Figure 6: Leaked Gozi source code.
Figure 7: hooktracer messagehooks against Turla.
Figure 8: Analysis of Turla XOR loop in IDA Pro.
Figure 10: hooktracer messagehooks against Loki Bot.
Figure 11: hooktracer messagehooks against Epic Turla.
Figure 13: hooktracer messagehooks against Gozi.
Figure 14: hooktracer messagehooks against Telebots Keylogger.
Figure 15: hooktracer messagehooks against Donot Team’s Keylogger.
Declaration of interests
☒ The authors declare that they have no known competing financial interests or personal relationships
that could have appeared to influence the work reported in this paper.
☐The authors declare the following financial interests/personal relationships which may be considered
as potential competing interests:
Andrew Case: Writing- Original draft preparation, Conceptualization, Methodology,
Investigation, Software
Andrew Case is a core Volatility developer and a memory forensics researcher and practitioner.
Ryan D. Maggio is a Ph.D. student studying in the Division of Computer Science and Engineering
and conducts research in the Applied Cybersecurity Laboratory at the Center for Computation
and Technology at Louisiana State University.
Md Firoz-Ul-Amin was a Ph.D. student studying in the Division of Computer Science and
Engineering and conducted research in the Applied Cybersecurity Laboratory at the Center for
Computation and Technology at Louisiana State University. Firoz died shortly after the research
described in this paper was concluded and is sadly missed by his friends and collaborators.
Mohammad M. Jalalzai was a Ph.D. student at LSU studying in the Division of Computer Science
and Engineering and conducted research in the Applied Cybersecurity Laboratory at the Center
for Computation and Technology at Louisiana State University. He recently defended his
dissertation and graduated from LSU.
Mingxuan Sun is an Assistant Professor in the Division of Computer Science and Engineering at
Louisiana State University.
Golden G. Richard III is a Professor in the Division of Computer Science and Engineering and
Associate Director for Cybersecurity at the Center for Computation and Technology at Louisiana
State University. He also directs the Applied Cybersecurity Laboratory at the Center for
Computation and Technology.