
LLNL-CONF-830179

Machine Learning Analysis of Memory Images for Process Characterization and Malware Detection

S. Lyles, M. DeSantis, M. Gallegos, H. Nyholm, C. Taylor, J. Donaldson, K. Monteith

December 16, 2021

Digital Networking Security Conference, Artificial Intelligence to Security Workshop
Baltimore, MD, United States
June 27, 2022 through June 30, 2022
Disclaimer

This document was prepared as an account of work sponsored by an agency of the United States
government. Neither the United States government nor Lawrence Livermore National Security, LLC,
nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or
responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or
process disclosed, or represents that its use would not infringe privately owned rights. Reference herein
to any specific commercial product, process, or service by trade name, trademark, manufacturer, or
otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the
United States government or Lawrence Livermore National Security, LLC. The views and opinions of
authors expressed herein do not necessarily state or reflect those of the United States government or
Lawrence Livermore National Security, LLC, and shall not be used for advertising or product
endorsement purposes.
Machine Learning Analysis of Memory Images for Process Characterization and Malware Detection

Seth Lyles, Mark DeSantis, John Donaldson, Micaela Gallegos, Hannah Nyholm, Claire Taylor, and Kristine Monteith
Lawrence Livermore National Laboratory
Livermore, CA

Abstract—As signature-based malware detection techniques mature, malware authors have been forced to leave fewer footprints on target machines. Malicious activity can be conducted by chaining together benign, built-in functions in subversive ways. Because the functions are native to the host system, attackers can slip under the radar of signature filtering tools such as YARA. To address this challenge, we utilize the Volatility memory forensics framework to measure and characterize typical in-memory behavior, then observe the deviations from normal use that may indicate a compromise. We demonstrate that processes have characteristic memory footprints, and that machine learning models can flag malicious behavior as anomalous.

I. INTRODUCTION

A robust cyber defense requires an in-depth understanding of how malware uses memory. The ability to characterize normal memory behavior is essential to developing abnormal and malicious activity detectors. The prevalence of "fileless" or "memory-based" attacks has increased more than 300% since 2016 [1]. These attacks work by causing native programs or libraries to act in unexpected ways [2]. For example, an attacker could use the Microsoft Teams updater to download a file from an arbitrary host, or take advantage of a URL-parsing library to launch a local program. As a result, antivirus companies are being forced to evolve; many have added, or are in the process of adding, memory scanning capabilities.

The specifics of in-memory behavior are not always well understood or well documented, particularly when dealing with proprietary operating systems and software. Open-source Linux-based systems offer a degree of transparency given their robust code formatting and documentation standards [3]. Windows, on the other hand, is a closed-source commercial operating system where information about internal workings is more opaque. Furthermore, security features such as ASLR pose obstacles to researchers looking to develop tools that aid in the process of detecting malicious activity.

To address this challenge, our research focuses on empirically characterizing the in-memory behavior of processes in a Windows 10 environment. We use random forests to model behaviors based on statistics extracted using the Volatility memory forensics framework [4]. We then train these models to detect maliciousness with a reasonably high degree of accuracy and precision. This capability may allow for more streamlined analysis of forensic memory images, with further potential applicability to live memory-based detection of malicious behavior.

We introduce the state of the field in Section II. In Section III, we describe our experimental setup, including the observed statistics and malware techniques targeted. Section IV presents our results and analysis, showing that we can identify suspicious processes when compared against their benign counterparts. Conclusions and future work are presented in Section V.

II. RELATED WORK

This work deals with memory forensics coupled with machine learning in order to detect cyber threats such as fileless malware.

Malicious actors write and distribute malware in order to compromise users' devices. The exact goals vary; examples include implementing rootkits to enable remote admin access, exposing sensitive resources to malicious access, and deploying ransomware, which encrypts data and holds it hostage for financial gain [5], [6]. These dangerous pieces of code can take a variety of forms and contexts, from compiled and executable binaries to scripts embedded in files (such as DOCs or PDFs) or webpages [7], [8]. Malware employs mechanisms in the target environment to accomplish its goals.

A particularly insidious threat, fileless malware, lives in random access memory (RAM) and leaves no trace on a computer's file system. This makes it difficult for antivirus (AV) software to detect and trace, as such software generally relies on scanning files and comparing their hash signatures to a database of known-malicious hashes.

A. Malware Detection Techniques

Practitioners and researchers in the cyber security industry attempt to detect malware in order to stop it in real time or analyze and respond to it later [9]. Traditional detection techniques generate signatures which can be used to identify known types of malware based on their file contents. The most basic of these involve hashing the entire executable, but this technique can be easily evaded by obfuscation of the binary [10]. Additional techniques include signature-based n-gram models [11] and other anomaly detection methods [12].

More recent detection techniques have also integrated dynamic analysis. Research also demonstrates the effectiveness of both symbolic and concolic execution. Such methods blend static and dynamic execution by modeling code execution with symbolic values to explore dynamic behaviors [13], observing
code as it executes to generate more sophisticated, behavior-based classifiers [14].

B. Memory Analysis

Memory analysis can be done in both a static and a dynamic context [15], [16]. Static memory images can be analyzed offline [17]. This is often the approach taken in incident response, where analysts try to recreate what happened after a compromise. Other efforts investigate memory access and use patterns during runtime [18]. Tooling such as debuggers can record memory use in real time [19], while Volatility and similar tools can analyze memory images generated by dumping copies of system memory into files [20].

Memory analysis techniques also necessarily differ depending on the environment in question: Windows, Macintosh, and Linux all structure memory use in different ways, which makes unified analysis difficult [4]. Non-OS environments such as firmware present additional challenges [21]. Even the choice of programming language can have an impact on memory forensics [22]. Unifying memory analysis for these environments remains an active area of research [23]. The work here focuses solely on the Windows environment.

C. Machine Learning Aided Malware Detection

Given the size of memory data, manual analysis alone proves difficult, if not impossible, in memory forensics. Effective analysis generally requires the use of automated techniques. Frameworks such as Volatility apply rule-based algorithms in order to extract data from known structures based on operating system type [20]. This enables analysts to explore memory in a structured manner, iterating over memory allocations for different programs and components. Even so, the sheer size of memory images and the diversity of malicious behavior present a challenge in pinpointing the programs and program regions where maliciousness resides.

Agh and associates [24] added data structure tracking to the VAD (Virtual Address Descriptor) tree, files mapped to memory, and the Windows registry configuration. They used models based on this information to effectively detect malware samples. Mosli and associates [25] used information about registry activity, imported libraries, and API function calls to separate benign and malicious processes. They compared the effectiveness of models based on these three different feature types. Developing such feature sets can require more subject matter expertise than n-gram and/or deep learning approaches, but the resulting models are often more explainable. In this work, we similarly employ a more curated feature set and evaluate the effectiveness of individual features in characterizing process behavior and indicating potential malicious compromise.

III. METHODOLOGY AND EXPERIMENTAL SETUP

We hosted a virtual machine (VM) on a hypervisor to provide a sandbox environment for our experiments. We ran applications on the VM and captured static memory images while running those applications. We then introduced simulated malicious activity into the applications and again captured static memory images. We analyzed the collected images with Volatility and then trained random forest classifier models on the extracted measurements. Figure 1 illustrates this basic methodology.

Fig. 1. We captured static memory images, analyzed them with Volatility, then trained a classifier from the results. The virtual machine provided a sandbox environment in which we ran code exhibiting malware-like activity.

We focused our experiments on Windows 10 because its widespread use makes it a common attack target [26]. Our test environment consisted of Windows 10 virtual machines running in VirtualBox, with 4 cores and 4 GB RAM. We executed a variety of default Windows programs, such as Notepad, PowerShell, Task Manager, and Paint. For our baseline, we collected 165 full (4 GB) RAM dumps from which we extracted features using the Volatility3 framework. For the malicious tests, we executed the same programs but with the addition of sample malware demonstrated in prior work [27]. We collected 200 memory dumps with malware.

A. Infrastructure

Our host machine had two 4-core, 2.5GHz Xeon processors and 260GB RAM. We used VirtualBox version 6.1.22r144080, and Windows 10 Build 10586 (version 1511) with 4 cores and 4 GB of RAM. We chose an earlier build of Windows because it is more easily recognized by Volatility and has fewer mitigation techniques against the memory violations we are attempting to detect; we expect that Volatility will better support later versions of Windows as it receives updates.

B. Volatility

Volatility is a well-known Python-based memory forensics tool. At the time of this research, Volatility3 was the most current version. Volatility uses a heuristic to identify the build version of Windows. Once it recognizes the build version, Volatility loads the correct symbols and maps them onto the memory dump. Some of these symbols contain data necessary to reconstruct the environment in which a process ran. Our analysis focused on the EPROCESS, ETHREAD, PEB, and VAD tree structures.
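Although the plugin itself must run inside Volatility, the statistics it collects (described in Section III-D) reduce to simple arithmetic and hashing over values that Volatility surfaces. A minimal sketch in plain Python, with hypothetical numbers standing in for real ETHREAD, module-list, and VAD values:

```python
import hashlib
import statistics

# Hypothetical values standing in for what a Volatility plugin would
# report for one process; a real run reads these from ETHREAD, the PEB
# module list, and the VAD tree.
stack_bounds = [(0x00D00000, 0x00CFC000), (0x02A00000, 0x029F8000)]  # (base, limit)
dll_paths = [r"C:\Windows\System32\ntdll.dll",
             r"C:\Windows\System32\kernel32.dll",
             r"C:\Windows\System32\user32.dll"]
vad_protections = ["READ", "READ|WRITE", "READ|EXECUTE", "READ|WRITE"]

# Stack size is the difference between base and limit.
stack_sizes = [base - limit for base, limit in stack_bounds]

# Order-independent fingerprint: sort, concatenate, then SHA256.
def sha256_of_sorted(items):
    joined = "".join(sorted(items)).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()

features = {
    "num_modules": len(dll_paths),
    "module_hash": sha256_of_sorted(dll_paths),
    "protection_hash": sha256_of_sorted(vad_protections),
    "distinct_protection_hash": sha256_of_sorted(set(vad_protections)),
    "num_distinct_protections": len(set(vad_protections)),
    "stack_size_sum": sum(stack_sizes),
    "stack_size_mean": statistics.mean(stack_sizes),
    "stack_size_stdev": statistics.stdev(stack_sizes),
}
```

Sorting before hashing makes the fingerprint independent of enumeration order, so two processes that load the same module set map to the same hash regardless of load order.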
C. Windows Memory Structures

The EPROCESS data structure contains information about process instances, such as image name and ProcessID; the resources allocated in terms of memory allocations (how much and where), types (private, mapped, shareable, etc.), and memory protections (combinations of read, write, execute, and reserved); modules loaded; and pointers to ETHREADs and the process environment block.

Both EPROCESS and ETHREAD are considered opaque objects by Microsoft [28], inhibiting analysis; fortunately, third-party work has been done to understand these structures [29], [30]. Microsoft does provide symbol files¹, which help communicate the layout of data structures [31]. Indeed, Volatility uses these symbols for its own processing.

¹https://msdl.microsoft.com/download/symbols

Included in EPROCESS, the ETHREAD object is an opaque structure which contains useful information about the stack. We calculated the size of a stack from the difference between its limit and base, both of which are attached to the ETHREAD.

Another member of the EPROCESS structure, the VAD tree, maps out the virtually allocated memory for a process [32]. VAD nodes refer to loaded modules (in the allocations in which they were referenced) and also have unique permission flags per node.

The PEB (process environment block) contains data about the number of heaps, which modules have been loaded into memory, and the command-line string that invoked the process [33]. The module list may not match the VAD tree's list exactly, the difference of these two sets indicating images of interest.

D. Custom Plugin

Volatility is built to be modular; plugins are easy to write using a common base template which accesses important data structures. For our research, we wrote a plugin to collect the following attributes:
• Stack frame size: stack base − stack limit
• Statistics on module loads:
  – DLL sizes
  – number of loaded modules
  – SHA256 of sorted list of path names
• Statistics on the VAD node file system protections:
  – Size of VAD tree
  – SHA256 of all protections (sorted)
  – SHA256 of sorted distinct protections
  – Cardinality of distinct protection set
• Sum, min, max, mean, and standard deviation of: {image size, stack size, memory mapped segment sizes, VAD sizes}

We compiled a list of names of the loaded DLLs, sorted, concatenated, and fed the string into the SHA256 algorithm. VAD protections can vary over some combination of READ, WRITE, EXECUTE, Guard, or No Access. Similar to the DLLs, we sorted a list of the permissions assigned to various VAD nodes, concatenated, and hashed.

E. Observed Malware Techniques

We tested our methods on several typical and often fileless malware techniques: DLL injection, PE injection, process hollowing, and shellcode injection.

DLL (Dynamic Link Library) injection involves a process loading a DLL into the allocated memory for a different process [34]. This method is used by developers to interface with third-party software for which they do not have source code. Microsoft enables DLL injection [35] for occasional benevolent use, such as fixing bugs in applications. However, malevolent actors often leverage DLL injection for nefarious purposes, as it readily grants access to otherwise restricted resources and is difficult to detect. The loaded DLL shows its activity under the host process, not the process which injected the DLL. This masks the malware files with benign signatures, thus making detection difficult [36].

PE injection loads a payload executable directly into the victim's virtual memory space, modifying headers and protections to match those of the victim. A new thread is created in the victim and set to execute the payload's main function.

Process hollowing occurs when one process starts another, assigning it virtual memory, then switches the entry point to a malicious piece of code instead [37]. Malicious actors employ process hollowing to stealthily gain access to resources; similar to DLL injection, the malicious code will appear to be the host process.

Malicious actors also perform shellcode injection attacks to avoid detection. These involve compiling malicious code into machine language, placing the instructions in a location readable and executable by a program, then overwriting the return address in the host process to instead point to the malicious code [38]. Though operating system and compiler vendors have developed counters and protections against shellcode injection over the years, sophisticated injection techniques and legacy software maintain this vector's potency [39].

All of these strategies are implemented in demonstration modules provided by the HollowsHunter open-source project [40], which we use to simulate maliciousness in our experiments.

IV. RESULTS

A. Differentiating Between Common Processes

Using the features extracted by our custom Volatility plugin, a random forest model was able to perfectly differentiate between the eight most common processes in our sandbox environment, as illustrated in the confusion matrix in Table I.

The svchost.exe process performs various tasks depending on its command line arguments, and this correlates to different memory-based statistics. For example, Figure 2 provides a histogram of Average Image Size for eight common processes in the sandbox environment. Note that measurements for most processes tended to fall in the same bin for each process run. However, measurements for the runs of svchost.exe had more variation.
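A confusion matrix like Table I is, mechanically, a tally of (actual, predicted) pairs over held-out process instances. A toy sketch of that bookkeeping (the label pairs below are invented for illustration, not the paper's data):

```python
from collections import Counter

# Hypothetical (actual, predicted) label pairs; a real run would take
# them from a trained classifier evaluated on held-out memory dumps.
pairs = [
    ("svchost.exe", "svchost.exe"),
    ("svchost.exe", "svchost.exe"),
    ("csrss.exe", "csrss.exe"),
    ("winlogon.exe", "csrss.exe"),   # one misclassification
]

# Each (actual, predicted) cell of the confusion matrix holds a count.
confusion = Counter(pairs)

def accuracy(confusion):
    total = sum(confusion.values())
    correct = sum(n for (actual, predicted), n in confusion.items()
                  if actual == predicted)
    return correct / total
```

A perfectly diagonal matrix, as reported for the eight most common processes, corresponds to an accuracy of 1.0; the toy data above deliberately includes one off-diagonal entry.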
TABLE I
DIFFERENTIATING BETWEEN COMMON PROCESSES USING MEMORY-BASED STATISTICS. THE MODEL CLASSIFIED EVERY INSTANCE CORRECTLY; CORRECT COUNTS PER PROCESS ARE SHOWN.

  svchost.exe     645
  csrss.exe       109
  taskhostw.exe    64
  sihost.exe       59
  winlogon.exe     51
  spoolsv.exe      48
  wininit.exe      47
  services.exe     44

TABLE II
DIFFERENTIATING BETWEEN SVCHOST.EXE PROCESSES BASED ON COMMAND-LINE ARGUMENTS USING MEMORY-BASED STATISTICS. THE MODEL CLASSIFIED EVERY INSTANCE CORRECTLY; CORRECT COUNTS PER ARGUMENT ARE SHOWN.

  -k NetworkService                    60
  -k netsvcs                           60
  -k LocalServiceNetworkRestricted     60
  -k LocalServiceAndNoImpersonation    59
  -k WbioSvcGroup                      58
  -k LocalService                      58
  -k LocalSystemNetworkRestricted      55
  -k appmodel                          53
  -k utcsvc                            47
  -k LocalServiceNoNetwork             47
  -k DcomLaunch                        46
  -k RPCSS                             38
  -k UnistackSvcGroup                   5
  -k imgsvc                             4
  -k wsappx                             3
  -k WerSvcGroup                        1

Fig. 2. Histogram of values for Average Image Size for common processes.

We conducted a similar set of experiments to determine if a random forest model could separate these processes based on these arguments. As shown in Table II, the model was perfectly able to differentiate between svchost.exe processes based on command-line arguments in our sandbox environment.

A random forest classifier based on our memory features was also able to perform well in binary classification experiments (e.g., determining whether a process was svchost.exe or not based on the entire set of running processes). Table III shows the results of such experiments for the top 50 most common processes in the sandbox environment.

In some cases, even models based on single features effectively differentiated between a given process and the background. Table IV shows the f1 scores for models based on Average Image Size, Average Stack Size, Average Heap Size, and Average Size of Mapped Region. Models often achieved high classification accuracy using Average Image Size. Not surprisingly, Average Stack Size was not as predictive of process name.

Note that these experiments were conducted in a clean sandbox environment. We would not expect nearly such tidy results in an operational environment, where variation in process behavior is introduced by users instead of scripts and more processes are being run. However, the ability to build coherent models in this simpler environment is a necessary first step in creating models that would generalize in a more complicated setting.

B. Detecting Malicious Behavior

In order to simulate a malicious attack where we would expect artifacts to show up in volatile memory, we used the demonstration code from the Hollows Hunter project [40]. Four different types of attacks are simulated by their demos:
• ChimeraPE: Process mapping/PE injection
• DLLInject: Dynamically linked library injection
• Injection: Shellcode injection
• RunPE: Process hollowing

These demos are designed to attack Calculator.exe by default. For each demo, we built a model based on the normal Calculator.exe processes run in our environment, and then used that model to determine if "infected" Calculator.exe processes appeared anomalous.

For three of the four scenarios, infected processes could easily be detected as anomalous based on single features. For example, Figure 3 provides an ROC curve for the detectability of Calculator.exe processes that had been targeted by the ChimeraPE demo code based solely on a Gaussian model of the Average Image Size feature. All of the infected Calculator.exe processes appeared more anomalous than all of the normal Calculator.exe processes. Figure 4 provides a histogram of values for this feature. As illustrated, values for the normal processes cleanly separate from the infected processes.
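The single-feature detection described above can be sketched with a Gaussian model fit to the normal runs and a rank-based AUC over the resulting anomaly scores. The feature values below are invented stand-ins, not the paper's measurements:

```python
import statistics

# Hypothetical Average Image Size values; a real experiment would use
# the plugin's measurements for normal and infected Calculator.exe runs.
normal = [4096.0, 4100.0, 4090.0, 4105.0, 4098.0]
infected = [8200.0, 8150.0, 8300.0]

# Fit a Gaussian to the normal runs; score = |z|, larger is more anomalous.
mu = statistics.mean(normal)
sigma = statistics.stdev(normal)

def anomaly_score(x):
    return abs(x - mu) / sigma

# AUC via the rank statistic: the probability that a randomly chosen
# infected sample scores higher than a randomly chosen normal sample
# (ties count half).
def auc(neg_scores, pos_scores):
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

scores_normal = [anomaly_score(x) for x in normal]
scores_infected = [anomaly_score(x) for x in infected]
```

An AUC of 1.0 corresponds to the clean separation reported for the ChimeraPE, Injection, and RunPE demos; the partial overlap seen with DLLInject would yield values below 1.0.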
TABLE III
PRECISION, RECALL, AND F1 SCORES FOR MODELS DIFFERENTIATING BETWEEN A PROCESS AND BACKGROUND FOR THE TOP 50 MOST COMMON PROCESSES. FOR MOST PROCESSES, RANDOM FORESTS BASED ON MEMORY STATISTICS FEATURES WERE ABLE TO SUCCESSFULLY DIFFERENTIATE BETWEEN A GIVEN PROCESS AND THE REMAINING BACKGROUND PROCESSES.

                  precision  recall  f1 score  support for positive class
  svchost.exe        1.00     1.00     1.00     1306
  csrss.exe          1.00     1.00     1.00      224
  winlogon.exe       1.00     1.00     1.00      116
  wininit.exe        1.00     1.00     1.00      108
  taskhostw.exe      1.00     1.00     1.00      107
  spoolsv.exe        1.00     1.00     1.00      106
  sihost.exe         1.00     0.96     0.98      105
  services.exe       1.00     1.00     1.00      110
  lsass.exe          1.00     1.00     1.00      109
  explorer.exe       1.00     1.00     1.00      113
  dwm.exe            1.00     1.00     1.00      114
  VBoxTray.exe       1.00     1.00     1.00      115
  VBoxService.ex     1.00     1.00     1.00      104
  System             1.00     1.00     1.00      106
  SkypeHost.exe      1.00     1.00     1.00      110
  ShellExperienc     1.00     1.00     1.00      109
  SearchUI.exe       1.00     1.00     1.00      113
  SearchIndexer.     1.00     1.00     1.00      109
  RuntimeBroker.     1.00     1.00     1.00      105
  OneDrive.exe       1.00     1.00     1.00      108
  MsMpEng.exe        1.00     1.00     1.00      113
  backgroundTask     1.00     0.50     0.67      109
  NisSrv.exe         1.00     1.00     1.00      103
  cmd.exe            1.00     1.00     1.00       48
  ApplicationFra     1.00     1.00     1.00       52
  conhost.exe        1.00     1.00     1.00       45
  SearchProtocol     1.00     1.00     1.00       54
  SearchFilterHo     1.00     1.00     1.00       41
  BackgroundTran     1.00     0.85     0.92       48
  WmiPrvSE.exe       1.00     1.00     1.00        5
  SystemSettings     1.00     0.67     0.80        7
  AtBroker.exe       1.00     1.00     1.00        8
  Microsoft.Msn.     1.00     1.00     1.00        7
  wmplayer.exe       1.00     1.00     1.00        3
  strings64.exe      1.00     1.00     1.00        2
  procexp64.exe      1.00     1.00     1.00        3
  powershell.exe     0.00     0.00     0.00        1
  osk.exe            1.00     1.00     1.00        3
  notepad.exe        1.00     1.00     1.00        2
  mspaint.exe        1.00     1.00     1.00        4
  msinfo32.exe       1.00     1.00     1.00        3
  msconfig.exe       1.00     0.67     0.80        2
  browser broker     1.00     0.67     0.80        2
  WinStore.Mobil     1.00     0.50     0.67        3
  WWAHost.exe        1.00     1.00     1.00        3
  Taskmgr.exe        1.00     1.00     1.00        4
  PeopleApp.exe      1.00     1.00     1.00        3
  PING.EXE           0.00     0.00     0.00        3

TABLE IV
F1 SCORES FOR MODELS DIFFERENTIATING BETWEEN A PROCESS AND BACKGROUND FOR THE TOP 50 MOST COMMON PROCESSES BASED ON SINGLE FEATURES: AVERAGE IMAGE SIZE, AVERAGE HEAP SIZE, AVERAGE STACK SIZE, AND AVERAGE SIZE OF MAPPED REGION.

                  Avg Image  Avg Heap  Avg Stack  Avg Mapped Region
  svchost.exe        1.00      0.00      0.69       0.96
  csrss.exe          1.00      0.00      1.00       0.99
  winlogon.exe       0.97      0.00      0.37       0.99
  wininit.exe        1.00      0.00      0.12       1.00
  taskhostw.exe      1.00      0.00      0.64       0.92
  spoolsv.exe        1.00      0.00      0.27       1.00
  sihost.exe         1.00      0.00      0.56       0.95
  services.exe       1.00      0.00      0.00       1.00
  lsass.exe          1.00      0.00      0.03       1.00
  explorer.exe       0.97      0.00      0.98       0.99
  dwm.exe            1.00      0.00      0.37       0.78
  VBoxTray.exe       1.00      0.00      0.61       1.00
  VBoxService.ex     1.00      0.00      0.24       1.00
  System             0.00      1.00      0.00       1.00
  SkypeHost.exe      1.00      0.00      0.54       1.00
  ShellExperienc     1.00      0.00      0.28       0.98
  SearchUI.exe       1.00      0.00      0.99       0.98
  SearchIndexer.     1.00      0.00      0.04       0.85
  RuntimeBroker.     0.90      0.00      0.30       0.71
  OneDrive.exe       1.00      0.00      0.57       1.00
  MsMpEng.exe        1.00      0.00      0.86       0.99
  backgroundTask     0.64      0.00      0.28       0.59
  NisSrv.exe         1.00      0.00      0.37       1.00
  cmd.exe            1.00      0.00      0.12       1.00
  ApplicationFra     0.95      0.00      0.46       0.92
  conhost.exe        0.94      0.00      0.33       1.00
  SearchProtocol     1.00      0.00      0.00       0.97
  SearchFilterHo     0.98      0.00      0.00       1.00
  BackgroundTran     0.90      0.00      0.06       0.65
  WmiPrvSE.exe       1.00      0.00      0.00       1.00
  SystemSettings     0.80      0.00      0.00       0.50
  AtBroker.exe       1.00      0.00      0.00       1.00
  Microsoft.Msn.     0.80      0.00      0.00       1.00
  wmplayer.exe       0.67      0.00      0.80       1.00
  strings64.exe      1.00      0.00      0.00       1.00
  procexp64.exe      1.00      0.00      0.00       1.00
  powershell.exe     1.00      0.00      0.00       1.00
  osk.exe            1.00      0.00      0.00       1.00
  notepad.exe        1.00      0.00      0.00       1.00
  mspaint.exe        1.00      0.00      0.67       1.00
  msinfo32.exe       1.00      0.00      0.00       0.80
  msconfig.exe       1.00      0.00      0.00       1.00
  browser broker     1.00      0.00      0.00       0.50
  WinStore.Mobil     1.00      0.00      0.00       1.00
  WWAHost.exe        1.00      0.00      1.00       1.00
  Taskmgr.exe        1.00      0.00      0.00       0.00
  PeopleApp.exe      1.00      0.00      1.00       0.57
  PING.EXE           0.80      0.00      0.00       1.00
Fig. 3. ROC curve plotting detection rate vs. false alarm rate for a model using Average Size of Mapped Region to differentiate between normal Calculator.exe processes and those infected with the ChimeraPE Hollows Hunter demo code.

TABLE V
AUC STATISTICS FOR MODELS DIFFERENTIATING BETWEEN NORMAL AND CHIMERAPE-INFECTED PROCESSES BASED ON SINGLE FEATURES.

                             AUC
  Total VAD Size             1.00
  Total Stack Size           1.00
  Total Mapped Region Size   1.00
  Total Image Size           1.00
  Total Pointer Difference   0.85
  Total Heap Size            0.33

TABLE VI
AUC STATISTICS FOR MODELS DIFFERENTIATING BETWEEN NORMAL AND INJECTION-INFECTED PROCESSES BASED ON SINGLE FEATURES.

                             AUC
  Total VAD Size             1.00
  Total Stack Size           1.00
  Total Mapped Region Size   1.00
  Total Image Size           1.00
  Total Pointer Difference   0.96
  Total Heap Size            0.00

Fig. 4. Histogram of values for Average Size of Mapped Region for normal Calculator.exe processes and those infected with the ChimeraPE Hollows Hunter demo code.

TABLE VII
AUC STATISTICS FOR MODELS DIFFERENTIATING BETWEEN NORMAL AND RUNPE-INFECTED PROCESSES BASED ON SINGLE FEATURES.

                             AUC
  Total VAD Size             1.00
  Total Stack Size           1.00
  Total Mapped Region Size   1.00
  Total Image Size           1.00
  Total Pointer Difference   0.96
  Total Heap Size            0.00

TABLE VIII
AUC STATISTICS FOR MODELS DIFFERENTIATING BETWEEN NORMAL AND DLLINJECT-INFECTED PROCESSES BASED ON SINGLE FEATURES.

                             AUC
  Total Pointer Difference   0.87
  Total VAD Size             0.83
  Total Stack Size           0.80
  Total Mapped Region Size   0.78
  Total Image Size           0.75
  Total Heap Size            0.08

Fig. 5. ROC curve plotting detection rate vs. false alarm rate for a model using Total VAD Size to differentiate between normal and DLLInject-infected processes.
Tables V through VIII report “area under the curve” for such
models based on individual features. For Chimera, Injection,
and RunPE, single-feature models were easily able to flag all
infected processes as anomalous.
For DLLInject, single-feature models were often able to
detect many of the infected processes, but not all. Figure 5
shows the ROC curve for detecting DLLInject infections using
the Total VAD Size.

C. Generalization Concerns

All of these experiments were conducted in a sandbox environment where the number of running processes was limited, and there was not the variation in process behavior that would be seen in an operational environment. To begin to
address the "lack of variation" issue, we ran an additional set of experiments to determine if all Calculator.exe processes, both normal runs and those infected by the Hollows Hunter demos, could be separated from the other processes in our environment. In this case, the Hollows Hunter code was used simply to introduce variation in process behavior.

As with our previous binary classification experiments, we used random forests based on the features extracted by our custom Volatility plugin.

Figure 6 shows values for Average Size of Mapped Region for Calculator.exe (both normal and infected) and non-Calculator.exe processes. Figure 7 provides a ROC curve for differentiating between Calculator.exe and non-Calculator.exe based on this single feature.

Fig. 6. Histogram of values for Average Size of Mapped Region for Calculator.exe (both normal and infected) vs. non-Calculator.exe processes.

Fig. 7. ROC curve plotting detection rate vs. false alarm rate for a model using Average Size of Mapped Region to differentiate between Calculator.exe (both normal and infected) and non-Calculator.exe processes.

This still does not address the problem of a relatively limited number of processes in our sandbox. However, it does suggest that our models could be capable of discriminating between processes when more variation is introduced.

V. CONCLUSION AND FUTURE WORK

Windows processes appear to have characteristic memory footprints, at least in the sandbox environment where we conducted our experiments. Random forests are able to perfectly differentiate between the top ten most common processes in our sandbox environment when processes with permission errors in the Volatility analysis are excluded from consideration. Models are also easily able to differentiate between a modeled process and background processes, even for the less-common processes. In many cases, they are able to differentiate based on a single feature.

Models are also able to detect when simulated maliciousness is introduced. For three of the Hollows Hunter demos, single-feature models are able to identify all of the infected processes as anomalous. For the remaining demo, single-feature models are able to identify many of the infected processes.

As previously mentioned, these experiments were run in a sandbox environment. We expect many more processes and much more process variation in an operational environment. While experiments like these are a necessary first step, they are not sufficient to demonstrate that maliciousness can be effectively detected using these types of features and machine learning models in a practical setting. Nonetheless, these experiments show promise in automating malware detection in memory forensics.

Further research may focus on live machines operated by real users to determine the generalizability of our results. Additionally, future work may apply this research to real-time memory scanning techniques, thereby enabling a novel avenue of malware detection.

ACKNOWLEDGMENT

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

REFERENCES

[1] S. Saad, F. Mahmood, W. Briguglio, and H. Elmiligi, "JSLess: A tale of a fileless JavaScript memory-resident malware," in International Conference on Information Security Practice and Experience. Springer, 2019, pp. 113–131.
[2] O. Moe, Jun. 2018. [Online]. Available: https://github.com/LOLBAS-Project/LOLBAS
[3] L. Torvalds et al., "Linux kernel coding style," https://www.kernel.org/doc/Documentation/CodingStyle, 2001.
[4] M. H. Ligh, A. Case, J. Levy, and A. Walters, The Art of Memory Forensics: Detecting Malware and Threats in Windows, Linux, and Mac Memory. John Wiley & Sons, 2014.
[5] S. Embleton, S. Sparks, and C. C. Zou, "SMM rootkit: A new breed of OS independent malware," Security and Communication Networks, vol. 6, no. 12, pp. 1590–1605, 2013.
[6] S. Mohurle and M. Patil, "A brief study of WannaCry threat: Ransomware attack 2017," International Journal of Advanced Research in Computer Science, vol. 8, no. 5, pp. 1938–1940, 2017.
[7] S. M. Tabish, M. Z. Shafiq, and M. Farooq, "Malware detection using statistical analysis of byte-level file content," in Proceedings of the ACM SIGKDD Workshop on CyberSecurity and Intelligence Informatics, 2009, pp. 23–31.
[8] M. Rajab, L. Ballard, N. Jagpal, P. Mavrommatis, D. Nojiri, N. Provos, and L. Schmidt, "Trends in circumventing web-malware detection," Google Technical Report, 2011.
[9] C. H. Malin, E. Casey, and J. M. Aquilina, Malware Forensics: Investigating and Analyzing Malicious Code. Syngress, 2008.
[10] I. You and K. Yim, "Malware obfuscation techniques: A brief survey," in 2010 International Conference on Broadband, Wireless Computing, Communication and Applications. IEEE, 2010, pp. 297–300.
[11] I. Santos, Y. K. Penya, J. Devesa, and P. G. Bringas, "N-grams-based file signatures for malware detection," ICEIS (2), vol. 9, pp. 317–320, 2009.
[12] P. Faruki, V. Ganmoor, V. Laxmi, M. S. Gaur, and A. Bharmal, "Androsimilar: robust statistical feature signature for android malware detection," in Proceedings of the 6th International Conference on Security of Information and Networks, 2013, pp. 152–159.
[13] B. Yadegari and S. Debray, "Symbolic execution of obfuscated code," in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015, pp. 732–744.
[14] P. Shijo and A. Salim, "Integrated static and dynamic analysis for malware detection," Procedia Computer Science, vol. 46, pp. 804–811, 2015.
[15] A. Aljaedi, D. Lindskog, P. Zavarsky, R. Ruhl, and F. Almari, "Comparative analysis of volatile memory forensics: live response vs. memory imaging," in 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third International Conference on Social Computing. IEEE, 2011, pp. 1253–1258.
[16] R. Sihwail, K. Omar, and K. Z. Ariffin, "A survey on malware analysis techniques: Static, dynamic, hybrid and memory analysis," Int. J. Adv. Sci. Eng. Inf. Technol., vol. 8, no. 4-2, pp. 1662–1671, 2018.
[17] O. Alrawi, M. Ike, M. Pruett, R. P. Kasturi, S. Barua, T. Hirani, B. Hill, and B. Saltaformaggio, "Forecasting malware capabilities from cyber attack memory images," in 30th USENIX Security Symposium (USENIX Security 21), 2021, pp. 3523–3540.
[18] S. Banin, A. Shalaginov, and K. Franke, "Memory access patterns for malware detection," NISK, 2016.
[19] Y. Kawakoya, M. Iwamura, and M. Itoh, "Memory behavior-based automatic malware unpacking in stealth debugging environment," in 2010 5th International Conference on Malicious and Unwanted Software. IEEE, 2010, pp. 39–46.
[20] A. Schuster, "The impact of microsoft windows pool allocation strategies on memory forensics," Digital Investigation, vol. 5, pp. S58–S64, 2008.
[21] J. Stüttgen, S. Vömel, and M. Denzel, "Acquisition and analysis of compromised firmware using memory forensics," Digital Investigation, vol. 12, pp. S50–S60, 2015.
[22] A. Case and G. G. Richard III, "Detecting objective-c malware through memory forensics," Digital Investigation, vol. 18, pp. S3–S10, 2016.
[23] R. Petrik, B. Arik, and J. M. Smith, "Towards architecture and os-independent malware detection via memory forensics," in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018, pp. 2267–2269.
[24] M. Aghaeikheirabady, S. M. R. Farshchi, and H. Shirazi, "A new approach to malware detection by comparative analysis of data structures in a memory image," in 2014 International Congress on Technology, Communication and Knowledge (ICTCK), 2014, pp. 1–4.
[25] R. Mosli, R. Li, B. Yuan, and Y. Pan, "Automated malware detection using artifacts in forensic memory images," in 2016 IEEE Symposium on Technologies for Homeland Security (HST). IEEE, 2016, pp. 1–6.
[26] S. Liu, "Operating systems market share of desktop pcs 2013-2021, by month," https://www.statista.com/statistics/218089/global-market-share-of-windows-7/, 2021.
[27] hasherezade, "Demos of various injection techniques found in malware," https://github.com/hasherezade/demos, 2020.
[28] Microsoft, "Windows kernel opaque structures," https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/eprocess, 2021.
[29] S. Storchak and S. Podobry, "Vergilius project," https://www.vergiliusproject.com/, 2018–2021.
[30] G. Chappell, "Geoff chappell, software analyst," https://www.geoffchappell.com/, 1997–2021.
[31] Microsoft, "Symbols for windows debugging (windbg, kd, cdb, ntsd)," https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/symbols, 2021.
[32] B. Dolan-Gavitt, "The vad tree: A process-eye view of physical memory," Digital Investigation, vol. 4, pp. 62–64, 2007. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1742287607000503
[33] Microsoft, "Peb structure (winternl.h)," https://docs.microsoft.com/en-us/windows/win32/api/winternl/ns-winternl-peb, 2021.
[34] G. Erdélyi, "Hide'n'seek? anatomy of stealth malware," Proceedings of the 2004 Black Hat Europe, pp. 147–167, 2004.
[35] J. Berdajs and Z. Bosnić, "Extending applications using an advanced approach to dll injection and api hooking," Software: Practice and Experience, vol. 40, no. 7, pp. 567–584, 2010.
[36] S. Fewer, "Reflective dll injection," 2008.
[37] J. Leitch, "Process hollowing," https://www.autosectools.com/Process-Hollowing.html, 2018.
[38] D. Kapil, "Shellcode injection," https://dhavalkapil.com/blogs/Shellcode-Injection/, 2015.
[39] T. Bao, R. Wang, Y. Shoshitaishvili, and D. Brumley, "Your exploit is mine: Automatic shellcode transplant for remote exploits," in 2017 IEEE Symposium on Security and Privacy (SP). IEEE, 2017, pp. 824–839.
[40] hasherezade, "Hollows Hunter," https://github.com/hasherezade/hollows_hunter.git, 2021.