A Memory-Resident Malware Detection Framework Based On Memory Forensics and Deep Neural Network
Abstract
Cyber attackers have constantly updated their attack techniques to evade antivirus software detection in recent years. One popular evasion method is to execute malicious code and perform malicious actions only in memory. Malicious programs that use this attack method are called memory-resident malware; they have excellent evasion capability and pose huge threats to cyber security. Traditional static and dynamic methods are not effective in detecting memory-resident malware. In addition, existing memory forensics detection solutions perform unsatisfactorily in detection rate and depend on massive expert knowledge in memory analysis. This paper proposes MRm-DLDet, a state-of-the-art memory-resident malware detection framework, to overcome these drawbacks. MRm-DLDet first builds a virtual machine environment and captures memory dumps, then processes the memory dumps into RGB images using a pre-processing technique that combines deduplication and ultra-high resolution image cropping. Our neural network MRmNet then extracts high-dimensional features from the memory dump files and classifies them. MRmNet feeds the labeled sub-images cropped from the high-resolution RGB images into ResNet-18, which extracts the features of the sub-images, and then trains a network of gated recurrent units with an attention mechanism. Finally, it determines whether a program is memory-resident malware based on the detection results of each sub-image through a specially designed voting layer. We created a high-quality dataset consisting of 2,060 benign and memory-resident programs, which yields 1,287,500 labeled sub-images cut from the ultra-high resolution RGB images produced by MRm-DLDet. We implement MRm-DLDet for Windows 10, and it performs better than the latest methods, with a detection accuracy of up to 98.34%. Moreover, we measured the effects of mimicry and adversarial attacks on MRm-DLDet, and the experimental results demonstrated the robustness of MRm-DLDet.
Keywords Memory-resident malware, Memory forensics, Malware detection, Deep learning, Ultra-high resolution
image
*Correspondence: Qixu Liu, [email protected]
1 Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100085, China
2 School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China
© The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Introduction
Over the years, artificial intelligence (AI) techniques have significantly promoted the efficiency and ability of file-based malware detection engines. Yet, at the same time, cyber-attackers also keep exploring advanced methods to evade or compromise antivirus software. One such method is In-memory Code Execution (ICE) (Fewer 2008; Team 2021; Paschen 2020; Malik 2019; odzhan 2019; Microsoft 2018; Kumar et al. 2020). ICE attacks only execute malicious operations in memory and leave
Liu et al. Cybersecurity (2023) 6:21 Page 2 of 22
almost no evidence on the disk, making it challenging for traditional static and dynamic analysis methods to detect (Arefi et al. 2018; Alrawi et al. 2021). For instance, in cyberattacks against the National Bank of Malawi (CERT 2018), attackers rewrote and recompiled multiple open-source programs, embedding the encrypted DarkComet (Lesueur 2020) remote access trojan (RAT) into the relevant code. At run time, the encrypted data is loaded, decrypted, and expanded into a complete DarkComet RAT portable executable (PE) file in memory. In this way, the payload implements an ICE attack to evade and bypass anti-virus solutions. Advanced Persistent Threat (APT) groups and malware families such as Lazarus Group (MITRE 2021), Poweliks (O’Murchu and Gutierrez 2015), and Zeus (Binsalleeh et al. 2010) have all employed ICE attacks.
Kumar et al. (2020) briefly present memory-resident malware. In this work, we define memory-resident malware as malware that executes shellcode and PE files in memory using ICE techniques.
With the spread of memory-resident malware, memory forensics has become more critical. Memory forensics (MF) is a technique that captures volatile memory data from computers as memory dumps and analyzes them. Memory dumps contain the processes, network connections, open files, and registry modifications created during the malware’s runtime, which are significant traces for identifying memory-resident malware. There have been many memory forensics studies incorporating machine learning (Barabosch et al. 2017; Bozkir et al. 2021; Wang et al. 2020; Sihwail et al. 2021). These works have significantly improved the accuracy and efficiency of memory-resident malware detection. Unfortunately, variants of ICE attacks may inject into different processes and use different methods to load shellcode or PE files. Moreover, attackers always look for never-detected vulnerable processes or methods to construct advanced attacks. Thus, manual feature engineering requires analysts to possess extensive domain knowledge to distinguish features of high and low discriminative power.
Challenges
Based on the above discussions, existing memory-resident malware detection methods face two challenges: (1) The accuracy of detection frameworks relies on various hand-crafted features of memory-resident malware, which requires massive expert knowledge in the field of memory analysis and is somewhat subjective and not generalizable. (2) Existing detection tools do not take full advantage of the information in memory data.
In the malware detection field, many studies have used computer vision to convert malware into images and then classify malware programs by specific image features (Nataraj et al. 2011; Ni et al. 2018; Bozkir et al. 2021; O’Shaughnessy and Sheridan 2022). These studies obtained great detection results. However, existing vision-based malware detection techniques usually analyze PE files directly. They still face the drawbacks of existing static and dynamic malware detection methods, i.e., they cannot effectively detect malware that runs only in memory. Moreover, a memory dump can be reshaped into an RGB image, but the size of a memory dump file is the same as the virtual machine’s memory, which is 2GB or above. Therefore, the generated memory dump images are of ultra-high resolution, and their minimum size is about 6000 × 6000 pixels after our processing method (more details can be found in the “MRm-DLDet” section). However, existing vision-based malware detection methods only accept regular-size pictures as input. For example, Ni et al. (2018) used images of at most 32 × 32 pixels for their model; the detection model of Bozkir et al. (2021) extracts features from images of 256 × 256 pixels converted from malware; O’Shaughnessy and Sheridan (2022) proposed a hybrid malware classification model that extracts visual features from images with a maximum size of 512 × 512 pixels. None of these can handle our ultra-high resolution images. Thus, it can be inferred that existing vision-based detection methods cannot efficiently handle ultra-high resolution images.
Motivation
Therefore, we propose a novel approach that combines the information in the malware’s whole memory dump, such as memory pages, processes, and other related data, with a deep neural network for detection. The approach addresses the difficulties that traditional static and dynamic analysis methods face in detecting memory-resident malware, as well as the two challenges above. Our work can better use the malware-specific execution data to detect memory-resident malware by converting memory dumps to pictures, without extensive and complex expert knowledge. A memory dump file can be converted into an RGB image in which every pixel is associated with memory data, and the difference between images can help separate benign from malicious memory dumps. Moreover, this paper designs a memory dump file preprocessing method to relieve the storage space pressure caused by the size of memory dump files and to solve the problem that existing vision-based malware detection methods cannot handle ultra-high resolution images.
In order to further discuss whether visualization can help detect memory-resident malware, we analyzed the memory dump of a Lazarus Group sample. An MS-DOS header is found at address 0xB14D000 of this memory dump, which is the start of a PE file. Lazarus
implements the ICE attack by decrypting the payload and loading it into its own memory space, so the PE file found at 0xB14D000 is the payload of the malware from Lazarus Group. We analyzed a benign sample for comparison, and Fig. 1 shows a motivating example. The two images on top are data from the same location in the two memory dump files. We converted the two memory dumps to RGB images with our framework. The bottom of Fig. 1 shows the part of the two RGB images that corresponds to the code fragments of the two dumps at addresses from 0xB14D000 to 0xB1589DF. These images differ significantly in color, texture, and structure. That leads us to a driving thesis of our work: The semantic and structural differences between malicious memory-resident activities and benign memory dumps can be effectively identified by visually comparing memory dump RGB images. A detailed description of the memory dump files and their visualization can be found in the “MRm-DLDet” section.
Our work
We propose a state-of-the-art memory forensic framework based on deep learning called MRm-DLDet (Memory-Resident malware Deep Learning Detector). MRm-DLDet first captures memory dumps and reduces their size by deleting duplicate memory pages. Then MRm-DLDet converts the memory dumps into ultra-high resolution RGB images and uses a non-overlapping sliding window to crop the images into sub-images, which serve as inputs to MRm-DLDet’s MRmNet neural network. MRmNet combines ResNet-18 (He et al. 2016), gated recurrent units (Cho et al. 2014), and an attention mechanism (Zhou et al. 2016). Our framework overcomes the two challenges faced by existing memory-resident malware detection methods and solves the problem that current image detection methods cannot handle ultra-high resolution images. Experiments show that MRm-DLDet has a high detection accuracy (98.34%).
To summarize, in this paper, we make the following contributions:
• We comb through the latest ICE methods from malware families and APT groups and are the first to define malware that uses ICE methods to execute shellcode or malicious PE files in memory as memory-resident malware.
• We propose the first memory-resident malware detection framework that combines memory forensics and deep learning, named MRm-DLDet, which focuses on capturing and analyzing memory dumps. MRm-DLDet has a virtual machine environment for capturing memory dumps, a novel memory dump preprocessing method combining data deduplication and ultra-high resolution image cropping, and a neural network named MRmNet.
• Because of the lack of publicly available open source datasets for memory-resident malware detection, we collected a dataset with 2,060 benign and malicious programs. The memory dumps of the programs are converted into ultra-high resolution images and cropped into 1,287,500 sub-images. Our dataset is available publicly for non-commercial use.
• We studied the influence of different image sizes on the performance of MRm-DLDet and compared MRm-DLDet with the most advanced methods. Compared to the latest methods, our framework is better in all experimental evaluation metrics. Specifically, MRm-DLDet has a detection accuracy of up to 98.34%.
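The preprocessing steps outlined in the "Our work" subsection (delete duplicate memory pages, convert the remaining bytes to an RGB image, crop with a non-overlapping sliding window) can be illustrated with a minimal sketch. It is not the authors' implementation. The 4 KiB page size, exact-match deduplication, the row-major mapping of three bytes to one pixel, zero padding of the last row, and dropping partial tiles at the image edges are all assumptions made here for illustration, and the function names are hypothetical.

```python
import math
from typing import List, Tuple

PAGE_SIZE = 4096  # assumption: 4 KiB memory pages

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def deduplicate_pages(dump: bytes, page_size: int = PAGE_SIZE) -> bytes:
    """Keep only the first occurrence of each page (exact byte match)."""
    seen = set()
    kept = []
    for offset in range(0, len(dump), page_size):
        page = dump[offset:offset + page_size]
        if page not in seen:
            seen.add(page)
            kept.append(page)
    return b"".join(kept)


def bytes_to_rgb_rows(data: bytes, width: int) -> Image:
    """Map every 3 bytes to one (R, G, B) pixel in row-major order.

    The last row is zero-padded so that every row has `width` pixels.
    """
    bytes_per_row = 3 * width
    padded_len = math.ceil(len(data) / bytes_per_row) * bytes_per_row
    data = data.ljust(padded_len, b"\x00")
    return [
        [(data[i], data[i + 1], data[i + 2])
         for i in range(start, start + bytes_per_row, 3)]
        for start in range(0, padded_len, bytes_per_row)
    ]


def crop_tiles(image: Image, tile: int) -> List[Image]:
    """Non-overlapping sliding window: the stride equals the tile size.

    Partial tiles at the right and bottom edges are dropped.
    """
    if not image:
        return []
    height, width = len(image), len(image[0])
    return [
        [row[left:left + tile] for row in image[top:top + tile]]
        for top in range(0, height - tile + 1, tile)
        for left in range(0, width - tile + 1, tile)
    ]


if __name__ == "__main__":
    # Three identical synthetic pages stand in for a real memory dump.
    dump = bytes(range(256)) * 16 * 3
    image = bytes_to_rgb_rows(deduplicate_pages(dump), width=32)
    tiles = crop_tiles(image, tile=16)
    print(len(image), len(image[0]), len(tiles))
```

Because the stride equals the tile size, every pixel belongs to at most one sub-image, so no memory content is counted twice when the per-tile results are later combined.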
Fig. 1 A motivating example of memory dump visualization
Related work
Memory forensics‑based memory‑resident malware detection
Since Malik (2019) first demonstrated how to manually load and run portable executables entirely from memory, in-memory malware execution has gradually become a prevalent attack method for cyber attackers. In this paper, we divide MF-based malicious code detection methods into two types according to their technical basis: methods based on the characteristics of the operating system and memory pages (OS Characteristics Based), and detection methods combining artificial intelligence
(AI Based). Figure 2 succinctly shows the latest MF-based malware detection techniques.
OS characteristics‑based
Volatility (Foundation 2020) is the most widely used and authoritative open-source MF-based framework, and it includes a plugin called malfind. Malfind determines whether a process is suspicious by checking the virtual address descriptors (VADs) of its memory pages; a process’ VAD tree describes the layout of its memory segments. In addition to information about the mapped object and several other flags, the VAD tree records the type and level of protection (read, write, execute) of each memory page (Ligh et al. 2014). For example, if the protection field of a memory page is set to “PAGE_EXECUTE_READWRITE”, then malfind will initially determine that it is a malicious process. However, malfind can be easily bypassed by malware developers. For instance, the “Bypass Malfind” method proposed by Block and Dewald (2019) allocates the memory for memory-resident malware with READONLY protection and then changes the protection of all contained pages to EXECUTE_READWRITE with VirtualProtectEx. Moreover, malfind does not provide a post-processing algorithm to distinguish benign software from malicious software, which means that extensive expert knowledge of memory forensics is required to analyze Volatility’s output and determine whether a program is malicious.
Arefi et al. (2018) reported a reverse engineering tool named FAROS to detect in-memory-only malware injection attacks. FAROS focused on only three in-memory code injection attack techniques and was implemented only on a Windows 7 VM, without considering new attack methods on Windows 10 systems, which are now more widely used.
A very recent effort by Alrawi et al. (2021) presented a post-detection technique named FORECAST to automatically predict capabilities that malware has staged for execution. FORECAST guides a symbolic analysis of the malware’s code by leveraging the execution context of the ongoing attack from the malware’s memory image.
AI‑based
Wang et al. (2020) proposed PROVDETECTOR, a provenance-based approach that detects malware with steganography. PROVDETECTOR first uses a novel selection algorithm to identify potentially malicious parts of the process’ OS-level provenance data. Then it applies neural embedding and machine learning. In another study, Quincy (Barabosch et al. 2017) extracted 38 features from volatile memory and used Random Forests and Extremely Randomized Trees to classify memory regions, which achieved an AUC score of 93.8% on Windows XP, but only 84.4% on Windows 10.
Bozkir et al. (2021) represented the memory dumps of suspicious processes as RGB images and reported 96.39% prediction accuracy by combining the RBF kernel-based SMO algorithm with GIST+HOG feature vectors. Still, this study only investigated malware processes and did not consider malware that hides in benign processes to execute, such as UUID Shellcode (Team 2021) and Earlybird (spotheplanet 2020). We assume that this limits the accuracy of the study by Bozkir et al. (2021) to some extent.
Sihwail et al. (2021) applied memory forensics to extract memory-based features from malware memory images to expose the actual behavior of malware. They used feature engineering and the SVM algorithm, converted the features into binary vectors, and obtained a classification accuracy of 98.5% in Windows 7 OS.
However, this task is time-consuming for manual feature extraction and performs poorly on Windows 10.
Deep learning‑based malware detection vision methods
In recent years, computer vision has been applied to malware detection with good results. The idea of converting files into images before detection inspired our research. Nataraj et al. (2011) first visualized malware binaries as grayscale images based on the observation that the images belonging to the same malware family appear very similar in layout and texture. In their solution, Bozkir et al. (2019) employed several convolutional neural networks to classify persistent malware files. They converted the binary bytes of PE files into images and reported 97.48% detection accuracy in experiments.
Pinhero et al. (2021) used three malware visualization methods (grayscale maps, RGB maps, and Markov images) and then extracted features of the three types of images using Gabor filters. Twelve different neural networks were trained, and the F-measure reached up to 99.97%. Tekerek and Yapici (2022) proposed a new CNN-based method that converts byte files to gray and RGB image formats respectively for malicious code classification. O’Shaughnessy and Sheridan (2022) proposed a hybrid framework for malware classification that sets an entropy threshold to quickly determine whether a sample is packed and then analyzes the samples using static and dynamic methods respectively. Static PE files or memory dump files of processes are mapped into images by space-filling curves, and the model then extracts visual features from the images, reporting an accuracy of 97.6%.
However, most of the existing vision-based malware detection methods directly analyze the binary files. O’Shaughnessy and Sheridan (2022) converted memory dumps to images while the malicious program is running, but they do not analyze the complete memory data. None of the existing vision-based malware detection efforts can deal with ultra-high resolution images. Our work solves this problem by deduplicating complete memory dumps and using non-overlapping sliding windows to cut images into multiple sub-images.
Framework overview
We first discuss the threat model, then introduce the overall framework of MRm-DLDet. Finally, we describe in detail the background of ultra-high resolution image classification (one of the essential techniques used in this study).
Threat model
MRm-DLDet is a framework for memory-resident malware detection in Windows 10. It takes PE files as input, converts the runtime memory dumps of the PE programs to RGB images, and then uses a deep neural network to detect memory-resident malware. In this paper, memory-resident malware is primarily used as an attack payload that is executed directly in the victim computer’s memory instead of being written to the hard drive, in order to evade increasingly capable malware detection and remain invisible on the target device.
We assume that the memory-resident malware is executing the attack when we capture memory dumps. Recent research (Wang et al. 2020) can help alleviate this assumption. Moreover, it is assumed that all ICE attacks leave traces in in-memory data.
Our framework
MRm-DLDet is a memory-resident malware detection tool that integrates computer vision and deep learning techniques with memory forensics to model ICE attacks. Figure 3 gives an overview of the MRm-DLDet architecture, which consists of three main parts. We outline the approach here, and the following two sections provide complete information.
First, MRm-DLDet captures memory dump files and removes duplicate memory pages (A in Fig. 3). Next, we convert the deduplicated memory dumps to RGB images. The RGB images are ultra-high resolution since the memory dumps still contain too much data after removing duplicates. Inspired by ultra-high resolution image processing methods in remote sensing image recognition, we propose a vision-based solution for processing enormous images. To avoid the loss of important information caused by traditional image scaling with multiple downsampling layers, we crop the enormous images into sub-images (B in Fig. 3). After that, the sub-images are fed into MRmNet. MRmNet extracts the feature vector of each sub-image with a pre-trained ResNet-18 network. The feature vectors of the sub-images form the feature of the whole memory dump file and are later fed into the gated recurrent units (GRU) model. Then, we add an attention layer to retain important details and prevent information loss. Finally, we design a voting layer to output the memory-resident malware detection results (C in Fig. 3).
Background on ultra‑high resolution image classification
Our MRm-DLDet framework first visualizes binary files as RGB images. Then we want to obtain the features of these images (i.e., features of the memory dump files) with the ResNet-18 network. However, the images converted from the deduplicated dumps still have a minimum size of 6000 × 6000. Limited by the GPU memory of general devices at this stage, it is not possible to handle the computation of ultra-high resolution images, and
exists. If any of them is present, Zeus aborts execution and removes itself. Screen resolution is also commonly used by malware as a virtual machine indicator. For example, the banking Trojan TrickBot (Abrams 2020) checks the target device’s screen resolution to detect virtual machines. If the screen resolution is 800 × 600 or 1024 × 768, the machine will be considered a virtual machine. In addition, malware will look for user activity on the device, such as whether the mouse is moved or clicked or whether keys are pressed, to determine if it is being analyzed within a virtual machine (Miramirkhani et al. 2017; Yokoyama et al. 2016; Bulazel and Yener 2017). According to Malware Behavior Catalog (2022), samples from the DarkComet family will check if the mouse is moving. The Darkhotel and Ursnif malware (MITRE 2021; Ionut Arghire 2017) check whether the mouse cursor position has changed to determine whether they are running on a real device. Therefore, we mitigate these anti-VM detections by performing actual user actions while the malicious sample is running, including moving and clicking the mouse and typing characters on the keyboard.
Therefore, we modified the configuration of our Windows 10 virtual machine by the following steps to prevent it from being detected by memory-resident malware.
• Uninstall VMware Tools.
• Modify the .vmx file, for example so that the virtual machine uses the same BIOS serial number as the physical machine.
• Change the MAC address to a random one that does not use a default VMware prefix (e.g., 00:0c:29, 00:50:56, 00:05:69).
• Set the screen resolution to any value except 800 × 600 and 1024 × 768. In our virtual machine, we set it to 1152 × 864.
• Mimic normal user behavior by clicking or moving the mouse and typing random characters on the keyboard when running samples in the virtual machine.
off, and create a new snapshot. Finally, we use vmrun’s ‘createSnap’ operation to capture a snapshot that dumps the VM’s memory state to a file. We created a snapshot of the Windows 10 VM before any operations and named it ‘Initial State’. Before running each sample, the VM is rolled back to the ‘Initial State’. Küchler et al. (2021) suggest that most malicious behavior can be observed within the first two minutes of execution. Each malicious sample was therefore given two minutes to initialize and execute.
To further analyze the memory dump file, we present the structure of the captured memory dump files. Windows memory management can be summarized into three mechanisms: (1) virtual address space management, (2) physical page management, and (3) address translation and page swapping (Yu et al. 2015). MRm-DLDet analyzes the entire data of the memory dump, including the data of the physical and virtual memory space, where each process runs in its own virtual address space. In Fig. 4, we briefly show the layout of a typical process (Yosifovich et al. 2017), with each part of it described as follows.
• Kernel address space: Users do not have access to this part of the memory, which is managed by the operating system and used for paging pools, system cache, device drivers, etc.
• User address space: Programs running in user mode have no access to the kernel address space but are
denotes each sub-image generated after sliding window partitioning of a $D_i$ image, and $j$ denotes the serial number of the sub-image, 625 in total.
ResNet‑18 layer
The second layer is the network that extracts the features of $d_{i,j}$. In this study, we use the ResNet-18 network (He et al. 2016). To get over the difficulty that deep networks are not easily optimized, the ResNet-18 network uses a residual structure. Each residual block is a multilayer neural network consisting of a convolutional layer, a batch normalization layer, and an activation layer. The new technique introduced by the ResNet model provides shortcut connections between non-contiguous convolutional layers. This technique allows the model to skip layers, which mitigates vanishing gradients and achieves lower losses and better results.
To obtain the image features, we extract the output of the ResNet-18 model’s avgpool layer as the result of one sub-image feature extraction, which is a vector with a length of 512. In detail, the cropped sub-images obtained from the input layer, represented as $d_{i,j}$, enter the ResNet-18 layer as the input. After going through the pre-trained ResNet network, the avgpool layer outputs a [1, 512] vector $V_{i,j}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, 625$). Here $i$ denotes that the sample is the $i$-th program of our dataset, and $j$ denotes the serial number of the sub-image. The output of the ResNet-18 layer is formalized as:
$$V_{i,j} = \mathrm{ResNet\text{-}18}(d_{i,j}) \tag{1}$$
GRU layer
The third layer of MRmNet is the GRU layer. It is well known that long short-term memory (LSTM) (Hochreiter and Schmidhuber 1997) addresses the RNN’s difficulty in learning long-term dependencies by adding a gating mechanism and a memory cell. However, the LSTM network has many parameters and converges slowly. GRU (Cho et al. 2014) is an improved version of standard LSTM that simplifies the LSTM network. GRU only has an update gate and a reset gate, while LSTM has three gates (forget gate, input gate, and output gate). GRU has fewer training parameters, so it saves much time when the training data is enormous.
This layer divides the feature-extracted sub-image vectors from the previous layer into the training set, validation set, and test set. Then, for example, the memory-resident malware in the training set can be obtained as the vector sequence shown in Eq. (2). $\mathrm{Train}_c$ ($c = 0, 1$) represents the class of samples in the training set, i.e., 0 or 1.
$$\mathrm{Train}_c = V_{1,1}, V_{1,2}, \ldots, V_{2,1}, V_{2,2}, \ldots, V_{n,625} \tag{2}$$
To further describe the GRU principle, let $V_t$ represent the input at the current moment, $z_t$ the update gate, $r_t$ the reset gate, $h_t$ the hidden state that passes to the next moment, $h_{t-1}$ the old state, and $\tilde{h}_t$ the candidate hidden state. The specific implementation of a single gated recurrent unit is as follows:
$$\begin{aligned} z_t &= \sigma(W_z \cdot [h_{t-1}, V_t]) \\ r_t &= \sigma(W_r \cdot [h_{t-1}, V_t]) \\ \tilde{h}_t &= \tanh(W \cdot [r_t * h_{t-1}, V_t]) \\ h_t &= (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t \end{aligned} \tag{3}$$
In addition, we applied the dropout technique on our GRU layer to reduce the risk of overfitting. Overall, the output of the GRU layer at moment $t$ is represented as:
$$G_t = \mathrm{GRU}(V_{i,j}) \tag{4}$$
Attention layer
The attention mechanism (Zhou et al. 2016) assigns weights to data and computes a weighted sum, and it is highly interpretable. The attention mechanism effectively retains important details and prevents critical information from being lost. For that reason, we added an attention layer after the GRU layer and let it assign weights to the output vectors so that it could further improve the detection accuracy of the MRm-DLDet framework. In our neural network, the attention mechanism estimates the association level between features. $G_t$ is the vector output by the GRU layer at moment $t$. $W$ is set as the weighted summation of the output vectors of the GRU layer. Let $a_t$ represent the weight of the hidden layer of the attention module and $b$ represent the bias.
$$W = a_t G_t + b \tag{5}$$
Set $L_{i,j}$ as the final output label.
$$L_{i,j} = \mathrm{softmax}(a_t W) \tag{6}$$
In the end, the output $L_{i,j}$ is the classification result of a sub-image. The output of MRmNet has two cases: if the source binary of the sub-image is memory-resident malware, $L_{i,j}$ is set to 1; if the program that generated the sub-image is a benign file, $L_{i,j}$ is set to 0.
Voting layer
The final layer of MRmNet is the voting layer, a layer that we designed. In the previous layers, the ResNet-18 features pass through the
GRU, and then the attention layer assigns weights and gives detection results.
The detection result $L_{i,j}$ from the attention layer is only the classification result of one sub-image. In MRm-DLDet, a memory dump image is cropped into 625 sub-images. Therefore, the combined result of the 625 sub-images gives the classification result of the whole memory dump image, which is the detection result of the tested sample.
To detect memory-resident malware effectively, the challenge is how best to organize the classification results of the 625 sub-images from each memory dump image. We designed a voting layer that takes the attention layer’s output $L_{i,j}$ as input. Every 625 sub-images represent one memory dump file and can be considered as a group; for example, when $i = 1$, $\mathrm{Group}_1 = L_{(1,1)}, L_{(1,2)}, \ldots, L_{(1,625)}$. Calculate the arithme-
byte files and asm files. EMBER only includes features extracted after parsing the PE file, and SOREL-20M offers malicious samples with the PE header set to 0. Since memory-based detection methods require memory data from a running program, these samples cannot be used for memory forensics. In addition, these datasets provide few benign executables. Therefore, the existing datasets do not apply to our study. We therefore constructed a dataset that meets the following requirements:
• Malicious samples come from memory-resident malware families.
• Both malicious samples and benign programs are complete PE files and can be run on Windows 10.
• All the samples were built recently.
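The voting step described earlier, in which the 625 sub-image labels of one memory dump are combined into a single verdict, can be sketched as follows. The description of the voting layer breaks off after "Calculate the arithme-" in this text, so the rule used here (compare the arithmetic mean of the labels against a threshold of 0.5) is an assumption for illustration and may differ from the authors' voting layer. The function name `vote` and the `threshold` parameter are hypothetical.

```python
from typing import Sequence

# From the text: each memory dump image is cropped into 625 sub-images.
SUB_IMAGES_PER_DUMP = 625


def vote(sub_image_labels: Sequence[int], threshold: float = 0.5) -> int:
    """Combine per-sub-image labels (1 = malware, 0 = benign) into one verdict.

    Assumption: the dump is flagged as memory-resident malware when the
    arithmetic mean of its sub-image labels is strictly above `threshold`.
    """
    if len(sub_image_labels) != SUB_IMAGES_PER_DUMP:
        raise ValueError(f"expected {SUB_IMAGES_PER_DUMP} labels per memory dump")
    if any(label not in (0, 1) for label in sub_image_labels):
        raise ValueError("labels must be 0 (benign) or 1 (malware)")
    mean_label = sum(sub_image_labels) / len(sub_image_labels)
    return 1 if mean_label > threshold else 0


if __name__ == "__main__":
    group_1 = [1] * 400 + [0] * 225  # hypothetical labels L(1,1) ... L(1,625)
    print(vote(group_1))
```

As a consistency check on the figures in the text, 2,060 programs with 625 sub-images each give 2,060 × 625 = 1,287,500 sub-images, which matches the reported dataset size.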
VGG16   Simonyan and Zisserman (2014)   The VGG16 network has a strong fitting ability. It is often used as a benchmark for malware identification, and its core design idea is to use smaller convolutional kernels and build deeper network layers.
Inception V3   Vasan et al. (2020b)   Proposed by Google, the highlight is the addition of decomposition techniques to decompose the convolutional kernel.
ResNet-18   He et al. (2016)   ResNet-18 is one of the ResNet network family, its network structure balances training efficiency and accuracy well, and it has achieved excellent results in visual malware classification.
Table 6 Comparison of the MRm-DLDet framework evaluation using different neural networks
Method                           Model          Accuracy (%)   Precision   Recall   F1-score
Directly resize                  VGG16          87.76          0.8842      0.8733   0.8787
                                 ResNet-18      91.59          0.9124      0.9062   0.9093
                                 Inception V3   85.79          0.8649      0.8813   0.8730
                                 MRm-DLDet      90.55          0.8824      0.9002   0.8912
Non-overlapping sliding window   VGG16          92.36          0.9192      0.9210   0.9201
                                 ResNet-18      94.20          0.9276      0.9524   0.9398
                                 Inception V3   91.77          0.9193      0.9128   0.9160
                                 MRm-DLDet      98.34          0.9896      0.9777   0.9836
Bold values indicate the best detection result

method, using a non-overlapping sliding window could significantly improve the memory-resident malware detection performance of MRm-DLDet.
Furthermore, to compare the effect of different neural networks with different image processing methods on the performance of visual detection of memory-resident malware, we selected three neural network models that are widely used in malware visualization detection methods as comparison baselines. We then trained and tested the models using the processed memory dump data. The models are briefly described in Table 5.
Table 6 shows the results of training and testing the four neural network models separately with images processed by the two dimensionality reduction methods. MRm-DLDet using sub-images cropped by non-overlapping sliding windows as input has the best detection accuracy of 98.34% and an F1-score > 0.98, and the models using images processed with the non-overlapping sliding window method as input perform better than the models trained with directly resized images. This also indicates that it is reasonable to apply non-overlapping sliding windows to ultra-high resolution memory images, which preserves the features of memory dumps better than directly resized images and yields higher detection accuracy.
We analyzed the misreported programs. A malicious sample of the DarkComet family was misreported as benign. We found that when the memory dump of the sample was obtained, the running sample had not successfully connected to C&C and therefore did not perform the subsequent attack behavior, resulting in characteristics similar to those of a benign program. Another observation is that memory-resident malware samples are falsely reported at a higher rate, probably because hackers are constantly improving ICE attacks to make their actions increasingly subtle, for example by making them more similar to the APIs used by benign programs, to obtain a higher evasion capability.

Different memory dump image sizes and neural networks evaluation
In MRm-DLDet's Ultra-High Resolution Image Preprocessing Module, when using the non-overlapping sliding window to crop memory dump RGB images, the number of sub-images varies depending on the resolution of the memory dumps. MRm-DLDet solves this problem by resizing all RGB images to the same size by bicubic interpolation. We evaluate the effect of three different image sizes, 3360 × 3360, 4480 × 4480, and 5600 × 5600, on the detection performance of the model. Additionally, to choose the best CNN-RNN combination for MRmNet, we selected three CNN models (VGG16, Inception V3, ResNet-18) and three RNN models (RNN, LSTM, GRU) and cross-combined them. Each CNN-RNN combination uses the three sizes of memory dump images as input; the CNN models extract image features, and the features are then transferred to the RNN models that are combined with the attention mechanism.
In total, 27 detection models were generated, 9 for each memory dump image size. Each sub-image is 224 × 224; with the non-overlapping sliding window, one RGB image of size 5600 × 5600, 4480 × 4480, or 3360 × 3360 produces 625, 400, or 225 sub-images, respectively. Table 7 shows the training and detection results of each model. Besides the four
Table 7 Comparison of different memory dump image sizes and different neural networks
Memory dump image size MRmNet’s deep learning model Accuracy Precision Recall F1-score Feature extraction time (minutes)
5600 × 5600(625 Sub-Images) VGG16 + RNN + Attention 96.49% 0.9474 0.9671 0.9572 0.5970
VGG16 + LSTM + Attention 97.51% 0.9554 0.9676 0.9614 0.5970
VGG16 + GRU + Attention 97.62% 0.9546 0.9680 0.9613 0.5970
Inception V3 + RNN + Attention 94.27% 0.9328 0.9351 0.9340 3.9532
Inception V3 + LSTM + Attention 94.90% 0.9450 0.9526 0.9488 3.9532
Inception V3 + GRU + Attention 95.22% 0.9309 0.9674 0.9488 3.9532
ResNet-18 + RNN + Attention 97.66% 0.9817 0.9642 0.9729 0.4779
ResNet-18 + LSTM + Attention 98.11% 0.9828 0.9810 0.9819 0.4779
ResNet-18 + GRU + Attention 98.34% 0.9896 0.9777 0.9836 0.4779
4480 × 4480(400 Sub-Images) VGG16 + RNN + Attention 95.67% 0.9529 0.9524 0.9527 0.5837
VGG16 + LSTM + Attention 95.97% 0.9500 0.9575 0.9537 0.5837
VGG16 + GRU + Attention 96.05% 0.9581 0.9534 0.9557 0.5837
Inception V3 + RNN + Attention 93.28% 0.9251 0.9388 0.9319 1.1931
Inception V3 + LSTM + Attention 94.01% 0.9468 0.9418 0.9442 1.1931
Inception V3 + GRU + Attention 93.78% 0.9321 0.9375 0.9348 1.1931
ResNet-18 + RNN + Attention 97.81% 0.9808 0.9787 0.9797 0.4598
ResNet-18 + LSTM + Attention 97.66% 0.9821 0.9733 0.9777 0.4598
ResNet-18 + GRU + Attention 97.41% 0.9808 0.9714 0.9761 0.4598
3360 × 3360(225 Sub-Images) VGG16 + RNN + Attention 95.28% 0.9427 0.9531 0.9479 0.2674
VGG16 + LSTM + Attention 94.81% 0.9345 0.9463 0.9403 0.2674
VGG16 + GRU + Attention 95.59% 0.9464 0.9644 0.9553 0.2674
Inception V3 + RNN + Attention 94.98% 0.9316 0.9536 0.9425 1.4769
Inception V3 + LSTM + Attention 94.61% 0.9439 0.9404 0.9422 1.4769
Inception V3 + GRU + Attention 95.02% 0.9405 0.9562 0.9483 1.4769
ResNet-18 + RNN + Attention 95.58% 0.9807 0.9432 0.9616 0.3964
ResNet-18 + LSTM + Attention 96.00% 0.9794 0.9528 0.9659 0.3964
ResNet-18 + GRU + Attention 96.03% 0.9784 0.9643 0.9713 0.3964
Bold values indicate the best experimental results
evaluation metrics, we also consider the feature extraction time, which shows the time taken by different CNN models to extract features from memory dump images of different sizes.
Table 7 shows that, in terms of image size, the accuracy of the neural networks decreases as the images are compressed more. The evaluation metrics of the model gradually decrease from 5600 × 5600 to 3360 × 3360. For example, the ResNet-18 + GRU + Attention model has 98.34% accuracy, 0.9896 precision, 0.9777 recall, and an F1-score of 0.9836 when the image size is 5600 × 5600, while the accuracy drops to 97.41% and the precision to 0.9808 when the image size is 4480 × 4480. When the size is reduced to 3360 × 3360, the detection accuracy is only 96.03% and the precision is only 0.9784. A similar trend was found in the other combinations of neural network models. This may be because a 5600 × 5600 image yields more sub-images under the non-overlapping sliding window, so the model obtains relatively better detection results. Therefore, we choose 5600 × 5600 for MRm-DLDet as the size of the memory dump image after bicubic interpolation in order to get the best detection results.
For the CNN models, five evaluation metrics are considered: accuracy, precision, recall, F1-score, and feature extraction time. In MRm-DLDet, the pre-trained CNN model extracts features for each memory dump file's sub-images, which are then transferred to the RNN model for training and testing. The feature extraction time is affected by two factors:

• The number of sub-images of each memory dump file differs because the image sizes differ.
• Differences in the structure of the neural networks result in different feature vector lengths extracted by each model.
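The first factor can be illustrated with a short sketch of non-overlapping tiling. The helper name `tile_boxes` and the variable names are illustrative and are not taken from the MRm-DLDet code; the sketch assumes square images whose side length is an exact multiple of the 224-pixel sub-image size, and it leaves out the bicubic resizing step.

```python
from itertools import product

TILE = 224  # sub-image side length given in the text


def tile_boxes(side: int, tile: int = TILE) -> list[tuple[int, int, int, int]]:
    """Return (left, upper, right, lower) boxes that tile a side x side image.

    Boxes are listed row by row and never overlap. The side length is
    assumed to be an exact multiple of the tile size.
    """
    if side % tile != 0:
        raise ValueError("side must be a multiple of the tile size")
    steps = range(0, side, tile)
    return [(left, upper, left + tile, upper + tile)
            for upper in steps for left in steps]


# Sub-image counts for the three evaluated image sizes.
counts = {side: len(tile_boxes(side)) for side in (5600, 4480, 3360)}

# Crossing image sizes, CNNs, and RNNs gives the evaluated model variants.
cnns = ("VGG16", "Inception V3", "ResNet-18")
rnns = ("RNN", "LSTM", "GRU")
variants = list(product(counts, cnns, rnns))

print(counts)         # {5600: 625, 4480: 400, 3360: 225}
print(len(variants))  # 27
```

The boxes use the (left, upper, right, lower) layout accepted by Pillow's `Image.crop`, so they could be applied to a resized memory dump image to produce the sub-images.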
UUID Shellcode (Team 2021): The UuidFromStringA API takes a string-based UUID and converts it to its binary representation. Given a pointer to a heap address, UuidFromStringA decodes data and writes it to memory without using common functions such as memcpy or WriteProcessMemory. The EnumSystemLocales function executes the shellcode.
Earlybird (spotheplanet 2020): The Earlybird method creates a new legitimate process in a suspended state and allocates memory for shellcode in the new process's memory space. It declares an APC routine pointing to the shellcode, and the shellcode is written to the previously allocated memory. It then queues the APC to the main thread and resumes the thread, which executes the shellcode.
Phantom DLL Hollowing (orr 2021): Phantom DLL Hollowing first opens a TxF handle to a Microsoft-signed DLL file on disk and infects its .text section with shellcode. It generates a phantom section from this malware-implanted image and maps a view of it to the address space of a process of the attacker's choice. The shellcode is hidden and then executed in the .text section with +RX permissions.
Module Stomping (ired.team 2020): Module Stomping first injects a benign Windows DLL into a remote (target) process. It overwrites the AddressOfEntryPoint of the DLL loaded in step one with shellcode. It then starts a new thread in the target process at the benign DLL's entry point, where the shellcode has been written.
Table 10 Comparison of the detection results for the latest ICE attacks by different antivirus solutions
AV engines      UUID Shellcode   Earlybird   PhantomDLL hollowing   Module stomping
McAfee          U*               U           U                      M
Bitdefender     M**              U           U                      M
Webroot         U                U           U                      U
Malwarebytes    U                U           U                      U
ESET-NOD32      U                U           U                      M
Sophos          U                U           U                      U
Microsoft       M                M           U                      M
MRm-DLDet       M                M           M                      U
* U means Undetected
** M means Memory-Resident Malware

We believe that attacks against deep neural networks are not feasible in our environment. On the one hand, our method takes PE files as input. It is deployed on the client system, so attackers theoretically cannot obtain the feature data during detection for data poisoning attacks and similar attacks. On the other hand, existing black-box attack methods (Hu and Tan 2017; Anderson et al. 2018) are aimed at static PE anti-malware, such as modifying malware by adding irrelevant junk characters to bypass detection. Since most of these methods are intended to alter malware without affecting the program's regular function, the malware's operations in memory will not change. The existing adversarial attacks that modify PE files therefore have minimal impact on our detection framework.

Discussion
Our experimental results indicate that the MRm-DLDet framework is superior to the state-of-the-art memory-resident malware detection methods. Measurements of MRm-DLDet's runtime have shown that it could also be used in real-world malware detection. We detect ICE attacks with memory forensics, which takes the memory dumps of running programs as information sources. MRm-DLDet takes full advantage of the fact that memory forensics can directly analyze RAM data and detect malware operations in memory to capture memory-resident malware. Some challenges remain in memory-resident malware detection.

The risk of overfitting
Our dataset consists of 1010 benign and 1050 memory-resident samples, which seems insufficient for a deep learning model, and the model may be at risk of overfitting. In this paper, the risk of overfitting is concentrated in the neural network MRmNet of MRm-DLDet, where ResNet-18 is primarily used for extracting image features and the attention-based GRU network is used to complete the classification task. Each high-resolution image is split into multiple sub-images by the ultra-high resolution image preprocessing module, so the samples for training and testing the GRU network are 631,250 benign sub-images and 656,250 malicious sub-images, and the sample size is enough to avoid overfitting. We have also added dropout and early stopping to the GRU network, which help the network generalize and reduce overfitting to the training data.
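The sub-image totals quoted above can be reproduced with simple arithmetic. The sketch below assumes that each program contributes exactly one memory dump image and that every image is cut into 625 sub-images; the variable names are illustrative.

```python
SUB_IMAGES_PER_DUMP = 625  # one 5600 x 5600 image cut into 224 x 224 tiles

benign_programs = 1010
malicious_programs = 1050

# Assumption: one memory dump image per program.
benign_sub_images = benign_programs * SUB_IMAGES_PER_DUMP
malicious_sub_images = malicious_programs * SUB_IMAGES_PER_DUMP
total_sub_images = benign_sub_images + malicious_sub_images

print(f"{benign_sub_images:,}")     # 631,250
print(f"{malicious_sub_images:,}")  # 656,250
print(f"{total_sub_images:,}")      # 1,287,500
```

The total matches the 1,287,500 labeled sub-images reported for the dataset of 2,060 programs.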
Block F, Dewald A (2019) Windows memory forensics: detecting (un)intentionally hidden injected code by examining page table entries. Digit Investig 29:3–12
Bozkir AS, Tahillioglu E, Aydos M, Kara I (2021) Catch them alive: a malware detection approach through memory forensics, manifold learning and computer vision. Comput Sec 103:102166
Bozkir AS, Cankaya AO, Aydos M (2019) Utilization and comparision of convolutional neural networks in malware recognition. In: 2019 27th signal processing and communications applications conference (SIU), pp 1–4. IEEE
Brengel M, Rossow C (2018) Memscrimper: Time-and space-efficient storage of malware sandbox memory dumps. In: international conference on detection of intrusions and malware, and vulnerability assessment, pp 24–45. Springer
Bulazel A, Yener B (2017) A survey on automated dynamic malware analysis evasion and counter-evasion: Pc, mobile, and web. In: proceedings of the 1st reversing and offensive-oriented trends symposium, pp 1–21
C1air3: MRm-DLDet (2023). https://github.com/C1air3/MRm-DLDet
CERT A (2018) Analysis of cyberattacks against the national bank of Malawi. https://www.antiy.com/response/20181127.html
Cho K, Van Merriënboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder-decoder approaches. arXiv preprint arXiv:1409.1259
Ebach L (2017) Analysis Results of Zeus. Variant Panda G DATA, G DATA
Fewer S (2008) Reflective DLL injection
Foundation V (2020) The volatility framework. http://www.volatilityfoundation.org
Grosse K, Papernot N, Manoharan P, Backes M, McDaniel P (2017) Adversarial examples for malware detection. In: European symposium on research in computer security, pp 62–79. Springer
Harang R, Rudd EM (2020) Sorel-20m: A large scale benchmark dataset for malicious pe detection. arXiv preprint arXiv:2012.07634
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
Hu W, Tan Y (2017) Generating adversarial malware examples for black-box attacks based on gan. arXiv preprint arXiv:1702.05983
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Ionut Arghire: Ursnif banking Trojan gets mouse-based anti-sandboxing (2017). https://www.securityweek.com/ursnif-banking-trojan-gets-mouse-based-anti-sandboxing/
ired.team: Module stomping for shellcode injection (2020). https://www.ired.team/offensive-security/code-injection-process-injection/modulestomping-dll-hollowing-shellcode-injection
Küchler A, Mantovani A, Han Y, Bilge L, Balzarotti D (2021) Does every second count? time-based evolution of malware behavior in sandboxes. In: proceedings of the network and distributed system security symposium, NDSS. The Internet Society
Kumar S et al (2020) An emerging threat fileless malware: a survey and research challenges. Cybersecurity 3(1):1–12
Lesueur J-P (2020) Darkcomet: remote administration tool. https://www.darkcomet-rat.com/
Ligh MH, Case A, Levy J, Walters A (2014) The art of memory forensics: detecting malware and threats in windows, Linux, and Mac memory. John Wiley, USA
Malik A (2019) In-memory execution of an executable. https://securityxploded.com/memory-execution-of-executable.php
Malware Behavior Catalog: Dark Comet (2022). https://github.com/MBCProject/mbc-markdown/blob/master/xample-malware/dark-comet.md#4
Microsoft: Out of sight but not invisible: Defeating fileless malware with behavior monitoring, AMSI, and next-gen AV - microsoft security (2018). https://www.microsoft.com/security/blog/2018/09/27/out-of-sight-but-not-invisibledefeating-fileless-malware-with-behavior-monitoring-amsi-and-next-gen-av
Miramirkhani N, Appini MP, Nikiforakis N, Polychronakis M (2017) Spotless sandboxes: evading malware analysis systems using wear-and-tear artifacts. In: 2017 IEEE symposium on security and privacy (SP), pp 1009–1024. IEEE
MITRE: virtualization/sandbox evasion: user activity based checks (2021). https://attack.mitre.org/techniques/T1497/002/
Mitre: mitre attck. https://attack.mitre.org/
Mitre: lazarus group (2021). https://attack.mitre.org/groups/G0032/
Nataraj L, Karthikeyan S, Jacob G, Manjunath BS (2011) Malware images: visualization and automatic classification. In: proceedings of the 8th international symposium on visualization for cyber security, pp 1–7
Ni S, Qian Q, Zhang R (2018) Malware identification using visualization images and deep learning. Comput Sec 77:871–885
odzhan: Shellcode: in-memory execution of DLL (2019). https://modexp.wordpress.com/2019/06/24/inmem-exec-dll/
orr F (2021) Phantom DLL hollowing. https://github.com/forrest-orr/phantom-dll-hollower-poc
O’Murchu L, Gutierrez FP (2015) The evolution of the fileless click-fraud malware poweliks. Symantec Corp
O’Shaughnessy S, Sheridan S (2022) Image-based malware classification hybrid framework based on space-filling curves. Comput Sec 116:102660
Paschen C (2020) Avoiding get-injectedthread for internal thread creation. https://www.trustedsec.com/blog/avoiding-get-injectedthread-for-internal-thread-creation/
PCmag: the best antivirus protection for 2022 (2022). https://www.pcmag.com/picks/the-best-antivirus-protection
Pinhero A, Anupama M, Vinod P, Visaggio CA, Aneesh N, Abhijith S, AnanthaKrishnan S (2021) Malware detection employed by visualization and deep neural network. Comput Sec 105:102247
Reza AM (2004) Realization of the contrast limited adaptive histogram equalization (clahe) for real-time image enhancement. J VLSI Signal Proc Syst Signal, Image Video Technol 38(1):35–44
Ronen R, Radu M, Feuerstein C, Yom-Tov E, Ahmadi M (2018) Microsoft malware classification challenge. arXiv preprint arXiv:1802.10135
Sihwail R, Omar K, Ariffin KAZ (2021) An effective memory analysis for malware detection and classification. CMC-Comput Mater Continua 67(2):2301–2320
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
spotheplanet: Early Bird APC Queue Code Injection (2020). https://www.ired.team/offensive-security/code-injection-process-injection/early-bird-apc-queue-code-injection
Suciu O, Coull SE, Johns J (2019) Exploring adversarial examples in malware detection. In: 2019 IEEE security and privacy workshops (SPW), pp 8–14. IEEE
Team R (2021) RIFT: analysing a lazarus shellcode execution method. https://research.nccgroup.com/2021/01/23/rift-analysing-a-lazarus-shellcode-execution-method/ Accessed 23 January 2021
Tekerek A, Yapici MM (2022) A novel malware classification and augmentation model based on convolutional neural network. Comput Sec 112:102515
Van Etten A (2018) You only look twice: Rapid multi-scale object detection in satellite imagery. arXiv preprint arXiv:1805.09512
Vasan D, Alazab M, Wassan S, Naeem H, Safaei B, Zheng Q (2020) Imcfn: Image-based malware classification using fine-tuned convolutional neural network architecture. Comput Netw 171:107138
Vasan D, Alazab M, Wassan S, Naeem H, Safaei B, Zheng Q (2020) Imcfn: Image-based malware classification using fine-tuned convolutional neural network architecture. Comput Netw 171:107138
Ventures R (2022) Download.com. https://download.cnet.com/
VirusShare: VirusShare. https://virusshare.com/
VirusTotal: virustotal. https://www.virustotal.com/gui/home/upload
VMware I (2022) VMware. https://www.vmware.com/
Wagner D, Soto P (2002) Mimicry attacks on host-based intrusion detection systems. In: proceedings of the 9th ACM conference on computer and communications security, pp 255–264
Wang Q, Hassan WU, Li D, Jee K, Yu X, Zou K, Rhee J, Chen Z, Cheng W, Gunter CA et al (2020) You are what you do: hunting stealthy malware via data provenance analysis. In: NDSS
Wang L, Tao D, Wang R, Wang R, Li H (2019) Big map r-cnn for object detection in large-scale remote sensing images. Mathemat Foundations Comput 2(4):299
Yokoyama A, Ishii K, Tanabe R, Papa Y, Yoshioka K, Matsumoto T, Kasama T, Inoue D, Brengel M, Backes M et al (2016) Sandprint: Fingerprinting malware sandboxes to provide intelligence for sandbox evasion. In: research in attacks, intrusions, and defenses: 19th international symposium, RAID 2016, Paris, France, September 19-21, 2016, Proceedings 19, pp 165–187. Springer
Yosifovich P, Solomon DA, Ionescu A (2017) Windows internals, part 1: system architecture, processes, threads, memory management, and more. Microsoft Press, USA, pp 113–202
Yu Z, Qing-Zhong L, Tao L, Li-Hua W, Chun S (2015) Research and development of memory forensics. J Software 26(5):1151–1172